CCC 05
Author 
Message 
gvc

Posted: Sat Mar 19, 2005 12:41 pm Post subject: Re: S5 


thegoose wrote: gvc wrote:
There are a number of O(n log n) algorithms that don't explicitly depend on m. But if you want to be really precise, most of them will be O(n log n log m) .
Depends on whether you consider addition/subtraction/comparison to be log m time
Precisely.
Many people ignore the fact that it is reasonable to assume that arithmetic operations are constant time only if the size of the operands is bounded by some constant. O() notation considers the runtime (or space or whatever is being measured) as the parameter's values tend to infinity. And it ignores constant factors.
If there really were no bound on m (or a bound larger than, say, 2^64) you would definitely have to consider the fact that arithmetic operations are not constant time. Addition and subtraction are linear time in the size of the integer (i.e. the log of its magnitude), while naive multiplication and division algorithms are quadratic. Better multiplication and division algorithms exist, but they're hard to implement. Naive comparison is linear time, but if the values are unequal and you know something about them, you might do better than linear time.
The trie implementation that I gave doesn't actually do integer comparisons on its way to finding the right value to increment. But it does do additions to compute the cumulative ranks, so I guess I should've said that my implementation was O(n log^2 m): looking up the value only takes O(n log m) single-bit comparisons, but computing the ranks requires full log m-bit arithmetic.








thegoose

Posted: Sat Mar 19, 2005 3:57 pm Post subject: Re: S5 


gvc wrote:
Many people ignore the fact that it is reasonable to assume that arithmetic operations are constant time only if the size of the operands is bounded by some constant. O() notation considers the runtime (or space or whatever is being measured) as the parameter's values tend to infinity. And it ignores constant factors.
Addition and subtraction are linear time in the size of the integer (i.e. the log of its magnitude), while naive multiplication and division algorithms are quadratic. Better multiplication and division algorithms exist, but they're hard to implement. Naive comparison is linear time, but if the values are unequal and you know something about them, you might do better than linear time.
Mine would have been O(N log N log M) then.
I think I brought this up a while ago (sometime during camp last year), but I never got a conclusive answer. In terms of the CPU, what are the relative costs of addition/multiplication/comparison/memory access for, say, a 32-bit int in C? I.e., how many of them can you do in a second on something like a P4 processor?





gvc

Posted: Sat Mar 19, 2005 4:10 pm Post subject: (No subject) 


The short answer is that the relative costs usually don't matter: by and large the running time of your program is dominated by RAM access and the number of instructions executed. But in a very tight loop, you'll find that addition and subtraction are essentially free. Multiplication is relatively slow, maybe the equivalent of 10 additions. Division is 2-5 times slower still.
Floating point is yet another factor of 2-10 slower again. So it is entirely possible that floating-point algorithms are dominated by the number of floating-point operations.

One last counterpoint about O() notation. O() notation is just a tool. In competition what matters is how much you can get done in the time available. If your algorithm uses a hundred million operations, it'll probably run in time. If more, you have to consider the possibility that it won't. In this situation, you should be estimating actual running time, not asymptotic complexity.
So the "log m" discussion is of academic interest; it really has little relevance in deciding whether or not a solution to S5 is fast enough.





thegoose

Posted: Sat Mar 19, 2005 8:31 pm Post subject: (No subject) 


gvc wrote: by and large the running time of your program is dominated by RAM access
Can you please go into some detail of why this is the case? (I.e., why does RAM access take so much time?)





gvc

Posted: Sat Mar 19, 2005 10:21 pm Post subject: (No subject) 


thegoose wrote: gvc wrote: by and large the running time of your program is dominated by RAM access
Can you please go into some detail of why this is the case? (I.e., why does RAM access take so much time?)
A typical CPU will have an internal clock speed of perhaps 2 GHz. That's 0.5 nanoseconds per cycle. The CPU will be able to do at least one, and probably several, simple integer arithmetic operations in one cycle. RAM might be 200 MHz: that's 5 nanoseconds. So when you access RAM you have to waste about 10 cycles waiting for it. The CPU tries really hard to find something else to do while waiting for the RAM, but it isn't all that easy.
Also, the RAM is typically wider than 32 bits or whatever the word size is on your computer. So the CPU can actually read in a few words at a time, provided they are adjacent. It saves them in cache, which won't slow the CPU down. So if you access RAM sequentially, you get a big saving for this and other reasons. But if you wander all over RAM, as you might while traversing a tree or using a hash table, you'll pay full price for each access.
But perhaps I haven't really answered your question. RAM has vast storage capacity compared to what's in the CPU/cache. To keep it small and cheap and not too power hungry, different technology is used and that is slower. Also, there's a reasonably complex protocol that the CPU uses to talk to the RAM. It sends the address over the bus, waits for the RAM to pick up the address, then waits for the data to be transmitted to or from the RAM. It simply isn't possible to do this as quickly as a CPU cycle. More generally, there are time-distance tradeoffs. In 0.5 nanoseconds you can't send a signal very far, and you also get all sorts of weird high-frequency electromagnetic effects.
Not only the data, but the program itself, is stored in RAM. So if you have a large amount of code, especially if it branches all over the place, the time to read the instructions from RAM will be a big factor in the execution time of your program. But if you have a small loop with only a few instructions, it will likely all be read into cache, so will run pretty quickly.
You can try a few experiments. Write a loop that runs a billion times. Then put a simple addition instruction into it and see if it slows it down. Then try unrolling the loop: repeat the body 1000 times and have the loop run only a million times. I'll leave it to your imagination to try different combinations. Trying to measure the time consumed by any particular instruction is really tough to do on a modern computer.





bugzpodder

Posted: Tue Mar 22, 2005 5:26 pm Post subject: (No subject) 


woah, we were just doing this in my CS class today... exact same info....
while we are on the subject of CCC, I am not sure if this information has been announced publicly or not, but stage 2 this year will be held in the last week of April, I believe... and I'll decide if I want to volunteer for this and "babysit" some of you today or tomorrow =.=; after I make sure there's no conflict with my exams, that is. It is always amusing to listen to Troy/JP lecture.





thegoose

Posted: Wed Mar 23, 2005 11:54 am Post subject: (No subject) 


bugzpodder wrote: I'll decide if I want to volunteer for this and "babysit" some of you today or tomorrow =.=; after I make sure there's no conflict with my exams, that is
Are you gonna do a talk?





Aidin Kashigar

Posted: Wed Mar 23, 2005 9:59 pm Post subject: (No subject) 


And also, should we expect a question from you at the 2nd stage? 








bugzpodder

Posted: Thu Mar 24, 2005 4:21 pm Post subject: (No subject) 


I thought about it last night, and turned down the option of meeting you guys due to some personal reasons.
I submitted a question to Troy but I have no idea if he'll use it or not... Prof Cormack (gvc?) will usually announce who made the problems, so you'll know whether it's mine or not. And I always use 'Jack' in the problem statement, so if they keep the original problem statement then it will possibly be my question. I made the problem just for you two (and Simon), so let me know if it gets used or not...
Anyways, best of luck to you two for stage 2 






