Computer Science Canada

Spell Check

Author:

BigBear [ Thu Apr 23, 2009 7:33 am ]

Post subject:

Spell Check

Turing:

var Middle := count div 2
var sent := "computer"
var correct := false %Assumes it isn't in text file (spelled worng)
var check : string %reads from the file
seek : fn, count
get : fn, check
put check
loop
seek : fn, Middle
get : fn, check
if check < sent then
Middle := Middle + (Middle div 2)
elsif check > sent then
Middle := Middle div 2
elsif sent = check then
correct := true
exit
end if
exit when Middle <= 0
put Middle
count += 1
end loop

So I made a spell check program which cheks to see if the word is in the dictionary of 50 000 words and it very slow. Both for checking if it is spelled right then going through and seeing if words are close to it.

I am trying to start in the middle of the file and see if the word is greater or less than the word in the file.

If it is less then it will find a word half of that, and if it is greater it will start on the other side the middle + the middle div 2 and do the check untill you find the word.

Author:	DemonWasp [ Thu Apr 23, 2009 8:20 am ]
Post subject:	RE:Spell Check
...what do you need help with? If you're just wanting to make it faster, then it's probably all about having a better data structure (seeking files on the disk is incredibly slow; for the reasons why you can read up on The Memory Hierarchy). There are lots of options for a better data structure (even an array in RAM is going to be orders of magnitude faster than seeking in a dictionary file). However, since you know your exact application for the data (finding matches; suggesting similar words), you can probably think of something a little better that might let you do a lot of the work up-front.

Author:	BigBear [ Thu Apr 23, 2009 3:11 pm ]
Post subject:	RE:Spell Check
My problem is it never finds the word if I put Middle it jumps around never finding the word or I thought it might become out of the range of searching <0 but it doesn't

Author:	Tony [ Thu Apr 23, 2009 3:21 pm ]
Post subject:	RE:Spell Check
I'm not sure what the variable count is, but considering that I'm seeing count += 1, I suspect that is it not the file-size in bytes. seek

Author:	BigBear [ Thu Apr 23, 2009 3:40 pm ]
Post subject:	RE:Spell Check
count was to see how many checks it does before the word is found. To compare it to 53 000, which is does when when it checks every word in the text file untill it finds it or eof so 53 000 would be worst case and the word is zulus. So you can ignore count and Middle should probably be named location

Author:	DemonWasp [ Thu Apr 23, 2009 4:01 pm ]
Post subject:	RE:Spell Check
Try outputting the "check" string at each iteration. Compare with Tony's post. Check the documentation for the seek command.

Author:

metachief [ Thu Apr 23, 2009 9:27 pm ]

Post subject:

RE:Spell Check

The first thing is that you need to have variable that will keep track of your lower boundry and your higher boundry.

Turing:

var low:int
var high:int

Then your variable "middle" needs to take the value of your "high" plus "low" divided by two.
Let's say that your high is 100 and low is 0 then middle would be (100+0) div 2.

Turing:

var middle:int:=(high+low) div 2

After every time you chop the range in half (or change the low and high boundries) you should re-assign your middle variable to (high+low) div 2.

So instead of having:

Turing:

if check < sent then
Middle := Middle + (Middle div 2)
elsif check > sent then
Middle := Middle div 2
elsif sent = check then
correct := true
exit
end if

You would have:

Turing:

if check < sent then
high := (low + high) div 2
Middle := (low + high) div 2
elsif check > sent then
low := (low + high) div 2
Middle := (low + high) div 2
elsif sent = check then
correct := true
exit
end if

Although this works, you migh still run into a problem. Doing it like this:

Turing:

high := (low + high) div 2

can end up reading past the text file where your reading from...

Look into it and try it on paper using appropriate numbers for comparison.

Mod Edit: Remember to use syntax tags! Thanks Smile

code:

[syntax="turing"]Code Here[/syntax]

Author:

BigBear [ Thu May 28, 2009 5:35 pm ]

Post subject:

RE:Spell Check

The problem is unless I have an array containing every single element which is too large for turing if it is the list of words firefox uses to spell check

Instead of reference element number whatever I have to get an element then say it is less than get the word that is in half of the position number as the first

but i find the total number of words but i can't seek to the middle word because you seek by character

and if i seek by character i get a position in the middle of the word

Turing:

fcn FileSize (filename : string) : int
var fn : int
var count : int := 0
var word : string
open : fn, filename, get
loop
exit when eof (fn)
get : fn, word : 1
count += 1
end loop
result count
end FileSize

proc Load (var filesize : int)
filesize := FileSize ("English.txt")
end Load

proc Drawi
View.Set ("graphics:300;300,position:bottom;left")
put "hi"
end Drawi

fcn Search (Word, FileName : string, FileSize : int) : int
var fn : int
var sent : string := ""
open : fn, FileName, get, seek
var Target := Word
var first := 1
var last := FileSize
var Middle : int
loop
Middle := (first + last) div 2
seek : fn, Middle
get : fn, sent
if sent >= Target then
last := Middle
else
first := Middle + 1
end if
exit when first >= last
end loop
if sent = Target then
result first
else
result 0
end if
close : fn
end Search

proc Main
var input : string
var FileSize : int %:= 10000
Load (FileSize)
Drawi
put FileSize
loop
get input
put Search (input, "English.txt", FileSize)
end loop
end Main

Main

This is my current method that I am trying to get working.

Attached is the array method which becomes to large, it also contains the text file.

You can syndicate this boards posts using the file backend.php or view the topic map using sitemap.php.

Terms of Use | Privacy Policy