
-----------------------------------
md
Wed Nov 07, 2007 8:37 pm

Sorting Contest
-----------------------------------
As inspired by rdrake's crappy assignment I am going to run a simple contest.

The Rules
* You can use any language you like, provided I can compile the source on linux. This unfortunately limits Turing unless someone wishes to make an effort to get OpenT running for me.
* You must provide source code
* My workstation will be used for judging.
* I shall post my solution at the end of the contest, as well as the winning entry.
* The contest shall start at 9:00 tonight and run until 12:00 tomorrow night. After that I shall test all programs and announce the winner (most likely friday night/saturday morning).
* The winner(s) shall be the person who can sort each data set fastest. There could be two winners.

The Goal
* You must provide one program capable of sorting single-byte values and one capable of sorting 2-byte integers (shorts, words, whatever).
* Your program must take a filename as the first argument. That file will contain bytes/shorts that are to be sorted. Your program should overwrite this file with sorted data.
* Your program must not use excessive amounts of memory. Reading the entire file into memory to sort it is cheating (and not practical).
* Your program must sort 256MB (give or take) worth of bytes/shorts, and then 1GB worth of bytes/shorts.


The Data
* Data will be generated for the final test with dd and /dev/urandom. Each program will be run with the same file; and the resulting file will be tested to ensure it contains the same numbers as the original. I will not provide such files for download as it's faster for people to generate their own test files. 

The Prize
The winner will receive bits equal to the number of seconds faster then the next fastest program they are. This shall be limited to 360 bits. If anyone manages to beat my time they shall receive double bits.

Go Nuts!

Clarifications
* Data files are in binary, not text. One byte of the file is one byte to be sorted.
* There are four data sets to be sorted, a 256mb file containing single-byte values (unsigned char), a 256mb file containing 2-byte values(unsigned short), a 1gb file containing single-byte values (unsigned char) and a 1gb file containing 2-byte values (unsigned short).
* They are to be sorted from least to greatest.
* They are unsigned. So no negative numbers.

-----------------------------------
Nick
Wed Nov 07, 2007 8:49 pm

RE:Sorting Contest
-----------------------------------
lmao i like the limitation on turing due to linux doesnt matter anyways because turing's max intake is 256 chars (even with multiple varibles asigned it would take longer to do making it harder to win)

-----------------------------------
CodeMonkey2000
Wed Nov 07, 2007 8:52 pm

RE:Sorting Contest
-----------------------------------
momop, do you even know what you are talking about?

-----------------------------------
Nick
Wed Nov 07, 2007 8:55 pm

RE:Sorting Contest
-----------------------------------
* Your program must sort 256MB (give or take) worth of bytes/shorts, and then 1GB worth of bytes/shorts. 

that means that the intake must be longer than 256 chars

and iunno do u know what im talking about?

-----------------------------------
CodeMonkey2000
Wed Nov 07, 2007 9:00 pm

RE:Sorting Contest
-----------------------------------
You are sorting an array. Turing can definitely handle more than 256 elements.

-----------------------------------
Nick
Wed Nov 07, 2007 9:31 pm

RE:Sorting Contest
-----------------------------------
OH!!!!!!!!!!!!!
i thought it gave u numbers and u sorted them :p geuss i didnt know what i was talking about

-----------------------------------
md
Wed Nov 07, 2007 9:59 pm

RE:Sorting Contest
-----------------------------------
Technically you are sorting a file (or an array stored in a file if you prefer). Sorting a file is harder then sorting an array in memory; especially since loading the entire file into memory is both infeasible and against the rules.

I may also extend this to include Saturday if there are enough people who express an interest.

-----------------------------------
CodeMonkey2000
Wed Nov 07, 2007 10:12 pm

RE:Sorting Contest
-----------------------------------
So you would be continuously opening and closing the file for every iteration you go through?

-----------------------------------
Saad
Wed Nov 07, 2007 10:16 pm

RE:Sorting Contest
-----------------------------------
Can you please extend the deadline to saturday? as it gives me more time to think

-----------------------------------
md
Wed Nov 07, 2007 11:29 pm

Re: RE:Sorting Contest
-----------------------------------
So you would be continuously opening and closing the file for every iteration you go through?

Sure, how you do it doesn't really matter to me.

I give you a file containing 1073741824 bytes. You must sort that file so it's in order. You can do it in any way you want; so long as it ends up sorted from smallest to largest. Likewise for half as many shorts.

-----------------------------------
Clayton
Wed Nov 07, 2007 11:45 pm

RE:Sorting Contest
-----------------------------------
Please, for the love of god, don't tell me you're going to use 5GB files too.

-----------------------------------
rdrake
Wed Nov 07, 2007 11:50 pm

Re: RE:Sorting Contest
-----------------------------------
Please, for the love of god, don't tell me you're going to use 5GB files too.He said 256 MB and 1 GB in the original post.  I'd be surprised if he bothered running all the submissions on 5 GB files.

-----------------------------------
Clayton
Thu Nov 08, 2007 12:00 am

RE:Sorting Contest
-----------------------------------
Yeah, well...


(11:53:28 PM) Clayton{}: I'm going to use "I'm tired" as an excuse

-----------------------------------
md
Thu Nov 08, 2007 8:38 am

RE:Sorting Contest
-----------------------------------
I'll extend this to Saturday at midnight; now you got no excuse clayton ;-)

-----------------------------------
Saad
Thu Nov 08, 2007 6:03 pm

Re: Sorting Contest
-----------------------------------

The Goal
* You must provide one program capable of sorting bytes and one capable of sorting 2-byte integers (shorts, words, whatever).

The part underlined i don't understand, how many bytes in each value do you want?

-----------------------------------
md
Thu Nov 08, 2007 6:13 pm

RE:Sorting Contest
-----------------------------------
Sorry, I have clarified the original post.

There are four data sets to be sorted, a 256mb file containing single-byte values (unsigned char), a 256mb file containing 2-byte values(unsigned short), a 1gb file containing single-byte values (unsigned char) and a 1gb file containing 2-byte values (unsigned short).

They are to be sorted from least to greatest.

-----------------------------------
zylum
Thu Nov 08, 2007 10:54 pm

Re: Sorting Contest
-----------------------------------
http://compsci.ca/v3/viewtopic.php?t=10521  :lol:

-----------------------------------
md
Fri Nov 09, 2007 8:32 am

RE:Sorting Contest
-----------------------------------
I should point out that the files to be sorted are binary, not text. And there has been one submission already.

-----------------------------------
bugzpodder
Fri Nov 09, 2007 10:18 am

Re: Sorting Contest
-----------------------------------
[deletia]

md says: What part of "Contest" did you miss? Try not to give out solutions for others to copy; it makes the whole thing more fun for all involved.

-----------------------------------
klopyrev
Fri Nov 09, 2007 1:28 pm

Re: Sorting Contest
-----------------------------------
I bet Knuth has a faster way to sort without reading the tapes more than once:P

-----------------------------------
md
Fri Nov 09, 2007 4:14 pm

RE:Sorting Contest
-----------------------------------
Thanks bugz... where before it was left to people to figure that out on their own, now there is a solution they can just copy.

Helping people is good and all, but giving away most of one of the better solutions is just ass-hat-ery.

-----------------------------------
zylum
Sat Nov 10, 2007 4:28 am

RE:Sorting Contest
-----------------------------------
well its the most obvious solution :P

and i dont think my solution will work as it reads text not binary :S

-----------------------------------
md
Sat Nov 10, 2007 9:14 am

RE:Sorting Contest
-----------------------------------
feel free to fix that part ;-)

Also, it's only the obvious solution if you think of it. There are more then a few people on IRC that missed it.

-----------------------------------
md
Sun Nov 11, 2007 1:04 pm

Re: Sorting Contest
-----------------------------------
Results

There were two entries, one from saad and one from zylum. I of course have written my own sorts to compare against.

results for sorting bytes:

                        256MB                        1GB
md                      5.0s                         32.0s
saad                    10.1s                        48.1s
zylum                   SF                           SF



results for sorting shorts:

                        256MB                        1GB
md                        -                           -
saad                      -                           -
zylum                    SF                           SF


I didn't run the sorts for shorts, since in prevous testing I've shown that my sort is faster then saads, and because zylums segfaulted again. Therefor... the winner is...

Saad

+360bits

-----------------------------------
wtd
Sun Nov 11, 2007 1:11 pm

RE:Sorting Contest
-----------------------------------
The winning submission should be posted in source form.

-----------------------------------
Saad
Sun Nov 11, 2007 1:20 pm

Re: Sorting Contest
-----------------------------------
Done in C++, compiled via g++ compiler

/* For this sort we will be calculating the number of each 
byte occurs and then output each byte in the number of times it occurs
Data type changes this is for short int
*/

#include 
using std::ifstream;
using std::ofstream;
using std::ios;
int main(int argc, char **args){
    unsigned int count

I dont know which source forum to post it to.

md asked me to post his so [url=http://exabit.org/mirror/sort2.c]here  it is

-----------------------------------
A.J
Thu Jan 31, 2008 1:53 am

Re: Sorting Contest
-----------------------------------
erm.....................saad, is this similar to what i did in turing (at school)? cause i sorted it using your algorithm but i did not know what you mean by sorting bytes (and I never heard about compsci.ca until about a week ago)

plus I started turing recently so I needed to ask you if this is sort of what your algorithm was ?
 
var stream, x : int
open : stream, "ENTER TEXT FILE NAME HERE.txt", get
var sort : array 1 .. (max value being tested) of int
% initialize the 'sort' array to 0 for all values
loop
    exit when eof (stream)
    get : stream, x
    sort (x) += 1
end loop
for i : 1 .. (max value being tested)
    for j : 1 .. sort (i)
        put i, " " ..
    end for
end for


-----------------------------------
Saad
Thu Jan 31, 2008 5:38 pm

RE:Sorting Contest
-----------------------------------
Yes it is, although you should change the code so that it supports binary files. Its read/write in Turing.

-----------------------------------
A.J
Thu Jan 31, 2008 7:15 pm

Re: Sorting Contest
-----------------------------------
could you give man example saad?

-----------------------------------
Saad
Thu Jan 31, 2008 7:35 pm

RE:Sorting Contest
-----------------------------------
var file : int
open : file, "foo.dat", write
for i : 1 .. 10
    write : file, i
end for
close : file
open : file, "foo.dat", read
for i : 1 .. 10
    var number : int
    read : file, number
    put number
end for
close : file

One advantage is to binary files is that binary files can be accessed at random location vs an ASCII (text) file which you need to access sequentially.
