Computer Science Canada

Wanna make a program....

Author:  CHUTHAN20 [ Tue Nov 29, 2005 6:08 pm ]
Post subject:  Wanna make a program....

hey guys.... I have no idea about programming but i wanna make a program. i wanna make it like a dictonary.... like when i enter a word it shows the meaning... so which programming should i use.. and a good program that i could use to make that application.

Author:  Andy [ Tue Nov 29, 2005 6:31 pm ]
Post subject: 

if you know absolutely nothing about programming, then i dont think you should even try to make such a program, but from your previous posts, it seems like you've bee looking at turing and php, i'd start with turing just because its easier. go through the turing tutorials and try to understand strings, arrays, records, and file io

Author:  Tony [ Wed Nov 30, 2005 9:45 am ]
Post subject: 

the first question to ask yourself - how would you locate a word and its definition from a list of 100 000 in a reasonable time?

Author:  pavol [ Wed Nov 30, 2005 10:44 am ]
Post subject: 

with visual basic you could probably do the same thing as turing but you don't have to worry about coding the user interface.

Author:  do_pete [ Wed Nov 30, 2005 12:03 pm ]
Post subject: 

The code wouldn't be the biggest problem. It's the database that would be the hardest. It's not that it's hard, it's just that the English language has more than three million words in it
http://www.wordorigins.org/number.htm

Author:  Thuged_Out_G [ Sat Dec 03, 2005 1:49 am ]
Post subject: 

maybe im slow ... but since when does turing have access to DB's ...to my knowledge youyou;d have to use a text file with some I/O

Author:  md [ Sat Dec 03, 2005 3:13 am ]
Post subject: 

A database is simply a way of accessing and storing information. Most (of not all) databases store their data in files... so turing does indeed have access to DBs, you just need to write the DB yourself Wink

Author:  MysticVegeta [ Thu Dec 08, 2005 4:11 pm ]
Post subject: 

Tony wrote:
the first question to ask yourself - how would you locate a word and its definition from a list of 100 000 in a reasonable time?


Well I would say, it wouldn't take that big a time if you do some of the following things,
1) Make 26 files, each file is responisble for 1 alphabet beginning for example: a.txt, b.txt
2) get user input,
3) open the file with the first letter of the input.

Now we have just reduced our time by 26X.

Author:  MysticVegeta [ Thu Dec 08, 2005 4:13 pm ]
Post subject: 

MysticVegeta wrote:
Tony wrote:
the first question to ask yourself - how would you locate a word and its definition from a list of 100 000 in a reasonable time?


Well I would say, it wouldn't take that big a time if you do some of the following things,
1) Make 26 files, each file is responisble for 1 alphabet beginning for example: a.txt, b.txt
2) get user input,
3) open the file with the first letter of the input.

Now we have just reduced our time by 26X.


Edit: 600th post, 900 bits, is this a coincidence lol ROFL

Author:  Tony [ Thu Dec 08, 2005 5:38 pm ]
Post subject: 

MysticVegeta wrote:
Now we have just reduced our time by 26X.

I request a word that is not in the dictionary (misspelling). You still have to read nearly 4000 entries (on average) just to tell me you got no match.

Assuming a 280 byte definition, that's just about 1 MB per file to read and scan.

I guess it's not so bad if you're going for 1 line definitions.

Author:  GlobeTrotter [ Thu Dec 08, 2005 6:42 pm ]
Post subject: 

Why not do a binary search?

Author:  md [ Thu Dec 08, 2005 8:48 pm ]
Post subject: 

The problem is that there are more then 100000 words that realisitcally would be in a dictionary. Storing them all in memory would be impossible, as would be a binary search tree (as you'd still need them in memory). The best you coudld do is some sort of hash based lookup table with pointers to the correct location in the database file.

Author:  CHUTHAN20 [ Thu Dec 08, 2005 10:02 pm ]
Post subject:  hmmmmmmmmmmmmmm.....

but how does other dictionary works cuz... i have installed wordweb dictionary in my computer so how does that work??...

Author:  Martin [ Thu Dec 08, 2005 11:01 pm ]
Post subject: 

It's like this.

Pick an integer from 1 to 100, and I can guess it in at most 7 tries if all you tell me is if my guess is too high or too low (or correct). It's called a binary search - binary, because you eliminate half of the possibilities each time.

For example, suppose you are thinking of 78.

My first guess is 50. You say that my guess is too low. So I've just eliminated 50 numbers
Quote:

1 2 3 4 5 6 7 8 9 10
11 12 13 14 15 16 17 18 19 20
21 22 23 24 25 26 27 28 29 30
31 32 33 34 35 36 37 38 39 40
41 42 43 44 45 46 47 48 49 50

51 52 53 54 55 56 57 58 59 60
61 62 63 64 65 66 67 68 69 70
71 72 73 74 75 76 77 78 79 80
81 82 83 84 85 86 87 88 89 90
91 92 93 94 95 96 97 98 99 100


1 guess down, now I have 50 numbers that are possible.

I guess 75. Again, this is too low. So I eliminate 25 (half of the remaining numbers) numbers. Down to 25 possible numbers in two guesses.

Quote:

1 2 3 4 5 6 7 8 9 10
11 12 13 14 15 16 17 18 19 20
21 22 23 24 25 26 27 28 29 30
31 32 33 34 35 36 37 38 39 40
41 42 43 44 45 46 47 48 49 50
51 52 53 54 55 56 57 58 59 60
61 62 63 64 65 66 67 68 69 70
71 72 73 74 75
76 77 78 79 80
81 82 83 84 85 86 87 88 89 90
91 92 93 94 95 96 97 98 99 100


Now I guess 87. This time my guess is too high.

Quote:

1 2 3 4 5 6 7 8 9 10
11 12 13 14 15 16 17 18 19 20
21 22 23 24 25 26 27 28 29 30
31 32 33 34 35 36 37 38 39 40
41 42 43 44 45 46 47 48 49 50
51 52 53 54 55 56 57 58 59 60
61 62 63 64 65 66 67 68 69 70
71 72 73 74 75
76 77 78 79 80
81 82 83 84 85 86 87 88 89 90
91 92 93 94 95 96 97 98 99 100


Now I guess 81. Still too high.

Quote:

1 2 3 4 5 6 7 8 9 10
11 12 13 14 15 16 17 18 19 20
21 22 23 24 25 26 27 28 29 30
31 32 33 34 35 36 37 38 39 40
41 42 43 44 45 46 47 48 49 50
51 52 53 54 55 56 57 58 59 60
61 62 63 64 65 66 67 68 69 70
71 72 73 74 75
76 77 78 79 80
81 82 83 84 85 86 87 88 89 90
91 92 93 94 95 96 97 98 99 100


78. Dead on. 4 guesses.

The pattern is like this: first, guess the middle number. Then guess the middle number of the remaining numbers, and so on.

Works with words too (since you can tell if a word is greater than another word). To illustrate how effective this technique is: imagine having 1,000,000 words. A binary search of the list would give you the answer in 20 guesses at most (since 2^20 > 1,000,000).

Author:  MysticVegeta [ Fri Dec 09, 2005 12:11 pm ]
Post subject: 

what the? I edited the post.. why did it post as the new one, anyways,
How about using a faster language that could do the search within seconds and i am sure wtf would be able to post a lot of good tips on how to do that. Wink

Author:  codemage [ Fri Dec 09, 2005 1:12 pm ]
Post subject: 

Binary search is blazingly fast in every language, including turing.

If an extra few cycles mean that much to you, you can use a better method in Turing. The trinary sort isn't much more complex than binary, and will cut down your log2N search time to log3N. Rolling Eyes

Author:  bugzpodder [ Fri Dec 09, 2005 9:31 pm ]
Post subject: 

GlobeTrotter wrote:
Why not do a binary search?


lets say you look up word "abracadabra" in the dictionary. Would you flip to the middle of the dictionary, then check if the middle word is 'abracadabra', if not, then look in the first half and repeat the process?

no you wouldnt. you would instead flip to the really beginning of the dictionary, and check! so, no, you shouldnt use binary search in this case.

Author:  bugzpodder [ Fri Dec 09, 2005 9:34 pm ]
Post subject: 

Cornflake wrote:
The problem is that there are more then 100000 words that realisitcally would be in a dictionary. Storing them all in memory would be impossible, as would be a binary search tree (as you'd still need them in memory). The best you coudld do is some sort of hash based lookup table with pointers to the correct location in the database file.


binary search tree can be represented in a tight array. 100000 strings isnt that big of a deal. thats merely a couple of MB of RAM, hardly comparable to machines with GBs of ram. Assuming you are talking about english words, which has average length <10.

Author:  bugzpodder [ Fri Dec 09, 2005 9:36 pm ]
Post subject: 

codemage wrote:
Binary search is blazingly fast in every language, including turing.

If an extra few cycles mean that much to you, you can use a better method in Turing. The trinary sort isn't much more complex than binary, and will cut down your log2N search time to log3N. Rolling Eyes


trinary sort? first of all, sorting takes O(N) time (you need to look at each element once). if you mean "trinary search", i dont think it is possible or that it even exists. but i might be wrong. for starters, why dont u tell me what this "trinary sort" is? i dont see how you can consistently cut down the array by 2/3 each term hence achieving log3N time.

Author:  codemage [ Mon Dec 12, 2005 10:10 am ]
Post subject: 

Freudian slip there.

Referring to the searching algorithm. Trinary search. Trinary search is similar to binary, except it cuts the target group in thirds, instead of in half.

I have the code kicking around somewhere... I'll post it if I can find it.

Re - searching: if you're using a logN type search like binary, if your dictionary is 1 million words long, you're only spending 20 operations to find the word. There's not going to be a humanly noticeable difference between that and a more efficient search algorithm - if everything is stored in a form of quick-access memory.

EDIT: Trinary looks more or less like this:
I won't put the whole test case, 'cuz this isn't the turing source section...

code:
loop
    lowmid := low + ((high - low) div 3)
    highmid := high - ((high - low) div 3)

    counter := counter + 1

    if sortedarray (lowmid) = target then
        put "found it in ", counter, " tries."
        put "@ location ", lowmid
        exit

    elsif sortedarray (highmid) = target then
        put "found it in ", counter, " tries."
        put "@ location ", highmid
        exit

    elsif sortedarray (lowmid) > target then
        high := lowmid - 1

    elsif sortedarray (highmid) < target then
        low := highmid + 1

    else
        high := highmid - 1
        low := lowmid + 1

    end if
end loop

Author:  bugzpodder [ Mon Dec 12, 2005 6:16 pm ]
Post subject: 

However, if you for some reason need to perform a million of these operations every minute, which would be better? 3 million or 20 million? if you could do it faster, why not? As I said, as humans you wouldnt want to start searching for english words in a dictionary from middle, would you? you'd definitely start from beginning. this is because we know the data is uniformly distributed and we can take advantage of this to obtain faster times. so why not let our algorithm do that? this is called interpolation search (TAOCP V2, Knuth, p419). This takes log log N time rather than log N with uniform data.

trinary search maybe be only slightly faster than binary search. You think that its log3N, but clearly we are doing more work per iteration. We are doing up to 5 extra steps per iteration. each binary search does about 3 steps.

so each three iteration of trinary seach corresponds to about five iterations of binary search. thats (1/3)^3=1/27 and (1/2)^5=1/32
can you see which one is better now?

Author:  McKenzie [ Tue Dec 13, 2005 9:29 pm ]
Post subject: 

I'd use VB. VB allows you to access Access databases very simply. All you would need is a simple SQL statement to find the record. If you insist on doing it by hand I'd use a Hash table.

Author:  wtd [ Tue Dec 13, 2005 9:51 pm ]
Post subject: 

McKenzie wrote:
I'd use VB. VB allows you to access Access databases very simply. All you would need is a simple SQL statement to find the record. If you insist on doing it by hand I'd use a Hash table.


Many languages make it easy to deal with databases, and databases other than Access. Smile

Author:  md [ Wed Dec 14, 2005 2:33 pm ]
Post subject: 

McKenzie wrote:
I'd use VB. VB allows you to access Access databases very simply. All you would need is a simple SQL statement to find the record. If you insist on doing it by hand I'd use a Hash table.


Or you could use any number of languages and SQLite; as wtd pointed out there are any number of databases for any number of lanuages.

Author:  McKenzie [ Wed Dec 14, 2005 9:25 pm ]
Post subject: 

Of course, but what was being lost in the argument over which search algorithm is the most efficient was the fact that the original poster was not some programming guru. He asked for simple. VB is simple. It's not limited to Access Databases, that was just an example.

Author:  Martin [ Thu Dec 15, 2005 3:29 am ]
Post subject: 

McKenzie wrote:
Of course, but what was being lost in the argument over which search algorithm is the most efficient was the fact that the original poster was not some programming guru. He asked for simple. VB is simple. It's not limited to Access Databases, that was just an example.


Good point. The problem with programmers eh?

Visual Basic Express can be downloaded here: http://msdn.microsoft.com/vstudio/express/vb/default.aspx

Author:  rizzix [ Thu Dec 15, 2005 10:20 am ]
Post subject: 

McKenzie wrote:
Of course, but what was being lost in the argument over which search algorithm is the most efficient was the fact that the original poster was not some programming guru. He asked for simple. VB is simple. It's not limited to Access Databases, that was just an example.
I believe Java is much easier than VB when it comes to database programming -- JDBC


: