www.ClassicTW.com
https://mail.black-squirrel.com/

Sorting in twx (alphabetically)
https://mail.black-squirrel.com/viewtopic.php?f=15&t=34136
Page 1 of 1

Author:  Astrochimp [ Thu Oct 03, 2013 2:00 pm ]
Post subject:  Sorting in twx (alphabetically)

I'm wondering if we should add some sortArray command in twxProxy, but until then,
I thought I'd share this sorting routine I just wrote for sorting lists of bot commands.
It uses a type of inefficient bucket sort; it's inefficient because I only use buckets on the first character of the strings; sorting the contents of those buckets is brute force, which is slow, but since the number of strings in each bucket will be small (e.g. if you're sorting a list of two or three hundred bot commands) it's much faster than say pure bubble sort or insertion sort.

Edit: Did some testing; it's pretty much instant on N<200. At N=500 it takes about four seconds.

It would sure be a lot easier to write if I could use recursion... not sure I'm ready to implement recursion-capable gosubs yet in twx though.

If you have something faster than this, feel free to share!

AstroSortAlpha subroutine (using as an include would require changes):
Code:
#+++ AstroSortAlpha
# Sorts input array into output array, ascending alphabetically; case is ignored
# Input:  $AstroSortAlpha_Input - the array to read, and the size of that array
# Output:  $AstroSortAlpha - corresponding sorted array, with $AstroSortAlpha = array size
#
# Algorithm:
# function bucketSort(array, n) is
  # buckets ← new array of n empty lists
  # for i = 0 to (length(array)-1) do
    # insert array[i] into buckets[msbits(array[i], k)]
  # for i = 0 to n - 1 do
    # nextSort(buckets[i]);
  # return the concatenation of buckets[0], ...., buckets[n-1]

:AstroSortAlpha 
  getTimer $AstroSortAlpha_startTicks
 
  setArray $buckets 255
  setVar $buckets 255

  setArray $bucketDrops 255
  #$bucketDrops array elements are each initialized to 0
 
  echo "*Sorting"
  setVar $i 0
#This is the part that saves all the time; we throw each string
# into a bucket, depending on the charCode of the first char in each.
# Since we can index an array to add a string to a bucket, it doesn't cost 
# much time to do this.
  while ($i < $AstroSortAlpha_Input)
    add $i 1
    echo "."
    setVar $input $AstroSortAlpha_Input[$i]
    lowercase $input
    cutText $input $firstChar 1 1
    getCharCode $firstChar $charCode
    if ($charCode > 19) AND ($charCode < 255)
      add $bucketDrops[$charCode] 1
      setVar $bucketDropsIndex $bucketDrops[$charCode]
      setVar $buckets[$charCode][$bucketDropsIndex] $input
    else
      echo "*Input error in AstrosortAlpha on $charCode:* " & $charCode
    end
  end
   
  #+++ sort each bucket - would be great if we could do this recursively but I guess we can't as of yet
   # This part is actually slow algorithmically, except that we only need
   # to sort the contents of each bucket.
   # It could be improved in similar fashion as with the first character
   # in the strings, but benefits of that aren't really seen until
   # input sizes go above three or four hundred.
  setVar $i 0
  while ($i < $buckets)
    add $i 1
   
    echo ","
    setVar $drops $bucketDrops[$i]
   
    setVar $j 0
    while ($j < $drops)
      add $j 1
     
      setVar $k ($j + 1)
      while ($k <= $drops)
        setVar $swap FALSE
       
        setVar $compPos 1
       
        setVar $string1 $buckets[$i][$j]
        getLength $string1 $lenString1
     
        setVar $string2 $buckets[$i][$k]
        getLength $string2 $lenString2

        setVar $stop FALSE
        while ((($compPos < $lenString1) OR ($compPos < $lenString2)) AND ($swap = FALSE) AND ($stop = FALSE))

#edit don't really need to check for $swap above. 

          add $compPos 1
         
          if ($lenString1 < $compPos)
            setVar $swap FALSE
            setVar $stop TRUE
          else
            if ($lenString2 < $compPos)
              setVar $swap TRUE
              setVar $stop TRUE
            else
              cutText $string1 $cp1 $compPos 1
              cutText $string2 $cp2 $compPos 1
             
              getCharCode $cp1 $cc1
              getCharCode $cp2 $cc2
   
              if ($cc1 < $cc2)
                setVar $swap FALSE
                setVar $stop TRUE
              elseif ($cc2 < $cc1)
                setVar $swap TRUE
                setVar $stop TRUE
              end
            end
          end
          if ($swap = TRUE)
            setVar $temp $buckets[$i][$j]
            setVar $buckets[$i][$j] $buckets[$i][$k]
            setVar $buckets[$i][$k] $temp
          end
         
        end
         
        add $k 1
      end
    end
  end
  #---
 
  # move buckets to output array
  setVar $i 0
  setVar $index 0
  while ($i < $buckets)
    add $i 1
    setVar $j 0
    while ($j < $bucketDrops[$i])
      add $j 1
      add $index 1
      setVar $AstroSortAlpha[$index] $buckets[$i][$j]
    end
  end

 
  getTimer $AstroSortAlpha_endTicks
  setPrecision 2
  setVar $AstroSortAlpha_ticks (($AstroSortAlpha_endTicks - $AstroSortAlpha_startTicks) / 1000000)
  setPrecision 0
  setVar $AstroSortAlpha $index
 
  echo "*Done sorting.  N= " $AstroSortAlpha " CPU tick delta/1M: " $AstroSortAlpha_ticks & "*"
return
#---


example usage for sorting your array called $List,
where the values to be sorted are the first words of each $list[] entry
(can be modified to include entire string instead of just first word):
Code:
# would a copyArray command be useful in twx?
setVar $i 0
while ($i < $list)
   add $i 1
   setVar $AstroSortAlpha_Input[$i] $list[$i]
end
setVar $AstroSortAlpha_Input $i
gosub :AstroSortAlpha
setVar $i 0
while ($i < $AstroSortAlpha)
  add $i 1
  setVar $message ($message & $AstroSortAlpha[$i] & "*")
end
echo $message

Author:  ElderProphet [ Thu Oct 03, 2013 10:36 pm ]
Post subject:  Re: Sorting in twx (alphabetically)

I enjoy logic problems like this, and would love to spend some time optimizing with you, but clearly this is hard to do without built-in array sorting. Let me throw out a couple tidbits though.

First, while it's ugly, you can handle that second dimension index as follows:
setVar $buckets[$charCode][buckets[$charCode]]

Secondly, it looks like your 2nd dimension in the $buckets array is dynamic, and could likely benefit if static. You can create a multidimension static array as follows:
setArray $buckets 256 10

A 2nd dimension can be added to any given element as well, as in:
setArray $buckets[$charCode] 10

You can also use a dynamic array as the first dimension, and add a static 2nd dimension to an element, as above. This makes sense, for example, if the first dimension has a huge range of possibilities, but you expect results to fall into just a few buckets. I assert that small dynamic arrays perform better than huge static ones, so you can likely extract the best performance by combining them where sensible.

Author:  Astrochimp [ Thu Oct 03, 2013 11:09 pm ]
Post subject:  Re: Sorting in twx (alphabetically)

Thanks, I didn't know about the multidimensional array declarations, that could be useful. And your comments regarding dynamic/static arrays are interesting.

Regarding the nested array indexing, I find it more readable to use another variable.. otherwise like you say it's messy. I guess in this algorithm you're probably right, I should be more interested in optimization than readability.

Page 1 of 1 All times are UTC - 5 hours
Powered by phpBB © 2000, 2002, 2005, 2007 phpBB Group
http://www.phpbb.com/