Computer Science Canada Spell Check |
Author: | BigBear [ Thu Apr 23, 2009 7:33 am ] | ||
Post subject: | Spell Check | ||
So I made a spell check program which cheks to see if the word is in the dictionary of 50 000 words and it very slow. Both for checking if it is spelled right then going through and seeing if words are close to it. I am trying to start in the middle of the file and see if the word is greater or less than the word in the file. If it is less then it will find a word half of that, and if it is greater it will start on the other side the middle + the middle div 2 and do the check untill you find the word. |
Author: | DemonWasp [ Thu Apr 23, 2009 8:20 am ] |
Post subject: | RE:Spell Check |
...what do you need help with? If you're just wanting to make it faster, then it's probably all about having a better data structure (seeking files on the disk is incredibly slow; for the reasons why you can read up on The Memory Hierarchy). There are lots of options for a better data structure (even an array in RAM is going to be orders of magnitude faster than seeking in a dictionary file). However, since you know your exact application for the data (finding matches; suggesting similar words), you can probably think of something a little better that might let you do a lot of the work up-front. |
Author: | BigBear [ Thu Apr 23, 2009 3:11 pm ] |
Post subject: | RE:Spell Check |
My problem is it never finds the word if I put Middle it jumps around never finding the word or I thought it might become out of the range of searching <0 but it doesn't |
Author: | Tony [ Thu Apr 23, 2009 3:21 pm ] |
Post subject: | RE:Spell Check |
I'm not sure what the variable count is, but considering that I'm seeing count += 1, I suspect that is it not the file-size in bytes. seek |
Author: | BigBear [ Thu Apr 23, 2009 3:40 pm ] |
Post subject: | RE:Spell Check |
count was to see how many checks it does before the word is found. To compare it to 53 000, which is does when when it checks every word in the text file untill it finds it or eof so 53 000 would be worst case and the word is zulus. So you can ignore count and Middle should probably be named location |
Author: | DemonWasp [ Thu Apr 23, 2009 4:01 pm ] |
Post subject: | RE:Spell Check |
Try outputting the "check" string at each iteration. Compare with Tony's post. Check the documentation for the seek command. |
Author: | metachief [ Thu Apr 23, 2009 9:27 pm ] | ||||||||||||
Post subject: | RE:Spell Check | ||||||||||||
The first thing is that you need to have variable that will keep track of your lower boundry and your higher boundry.
Then your variable "middle" needs to take the value of your "high" plus "low" divided by two. Let's say that your high is 100 and low is 0 then middle would be (100+0) div 2.
After every time you chop the range in half (or change the low and high boundries) you should re-assign your middle variable to (high+low) div 2. So instead of having:
You would have:
Although this works, you migh still run into a problem. Doing it like this:
can end up reading past the text file where your reading from... Look into it and try it on paper using appropriate numbers for comparison. Mod Edit: Remember to use syntax tags! Thanks ![]()
|
Author: | BigBear [ Thu May 28, 2009 5:35 pm ] | ||
Post subject: | RE:Spell Check | ||
The problem is unless I have an array containing every single element which is too large for turing if it is the list of words firefox uses to spell check Instead of reference element number whatever I have to get an element then say it is less than get the word that is in half of the position number as the first but i find the total number of words but i can't seek to the middle word because you seek by character and if i seek by character i get a position in the middle of the word
This is my current method that I am trying to get working. Attached is the array method which becomes to large, it also contains the text file. |