Computer Science Canada

Miranda Character lists

Author:  jamonathin [ Wed Nov 08, 2006 12:23 pm ]
Post subject:  Miranda Character lists

Hey all, I'm having trouble with a problem where I have a phrase and need to break it down into a list of words.

I am able to remove punctuation and spaces so I'm just left with words, but I can't figure out how to put the words into a list, which would be a list of lists of characters.

I've spent endless hours manipulating, trying different strategies, everything.

Here is my code, which removes the punctuation. I know where each word is isolated, but I'm not sure how to do something with it.

Miranda:

word "" = ""
word (x:xs) = [x] ++ word xs, if is_letter [x] = True
                  = word xs, otherwise

is_letter (x:xs) = False, if x = ' ' \/ x = '.' \/ x = ',' \/ x = ''' \/ x = '!'
                       = True, otherwise


If you can't run it, here's a sample input and output:

Input:

"When I was a kid my favorite relative was Uncle Caveman. After school we'd all go play in his cave, and every once in a while he would eat one of us. It wasn't until later that I found out that Uncle Caveman was a bear."

Output:

WhenIwasakidmyfavoriterelativewasUncleCavemanAfterschoolwedallgoplayinhiscaveandeveryonceinawhilehewouldeatoneofusItwasntuntillaterthatIfoundoutthatUncleCavemanwasabear


Any help would be appreciated :)

Author:  Lazy [ Thu Nov 09, 2006 6:24 am ]
Post subject: 

I don't know Miranda... but here's a simple solution in Haskell. It can be vastly improved, but as it is, I think it demonstrates the concept well enough. I'm still only learning Haskell, so any constructive criticisms would be greatly appreciated.

code:
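import Data.Char (isLetter)

-- Split a string into words. Any character that is not a letter or an
-- apostrophe acts as a separator.
getWords :: String -> [String]
getWords [] = []
getWords s
  | null first = getWords rest          -- s starts with a separator: skip it
  | otherwise  = first : getWords rest
  where
    first = takeWhile isWordChar s
    -- Drop the word, then the single separator character that follows it.
    rest  = drop 1 (dropWhile isWordChar s)

    isWordChar c = isLetter c || c == '\''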



EDIT: Haskell actually comes with a similar function, words, which splits [Char] into [[Char]] using whitespace as the boundary.
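Since words only splits on whitespace, punctuation would stay attached to the words. A minimal sketch of one way around that, assuming punctuation can simply be thrown away before splitting (wordsNoPunct is a made-up name for illustration, not a library function):

```haskell
import Data.Char (isLetter, isSpace)

-- Hypothetical helper: keep only letters, apostrophes and whitespace,
-- then let the Prelude's `words` do the splitting.
wordsNoPunct :: String -> [String]
wordsNoPunct = words . filter keep
  where
    keep c = isLetter c || isSpace c || c == '\''
```

For example, wordsNoPunct "After school we'd all go play in his cave, and" should give ["After","school","we'd","all","go","play","in","his","cave","and"]. Note that this differs from splitting on punctuation: two words joined only by a comma, with no space between them, would end up fused together.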

Author:  jamonathin [ Thu Nov 09, 2006 12:04 pm ]
Post subject: 

And I don't know Haskell, but I did end up figuring it out, and it is similar to the solution you gave me (which I appreciate you doing).

Here's the solution:

Miranda:

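|| base case: no input left, so no words (also reached when the phrase ends in punctuation)
get_words [] = []
get_words (x:xs) = []:get_words xs, if is_Letter [x] = False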
                 = (x:ys):yrest, otherwise
                   where
                   (ys:yrest) = get_words xs, if xs~=[]
                              = []:[], otherwise

is_Letter (x:xs) = False, if x = ' ' \/ x = '.' \/ x = ',' \/ x = '!'
                 = True, otherwise
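For anyone following along in Haskell, here is a rough sketch of the same recursive idea: each letter is consed onto the front of the first word of the recursive result, and each separator starts a fresh empty word. The names isSep, splitRaw and splitWords are made up for this illustration. As an assumption on my part, this version also covers the empty string and filters out the empty words that consecutive separators (such as ". ") would otherwise leave behind.

```haskell
-- Same separator set as is_Letter above: space, full stop, comma, exclamation mark.
isSep :: Char -> Bool
isSep c = c `elem` " .,!"

-- Returns the (possibly empty) current word followed by the remaining words.
-- The result is never the empty list, so the pattern in `where` always matches.
splitRaw :: String -> [String]
splitRaw [] = [[]]
splitRaw (x:xs)
  | isSep x   = [] : splitRaw xs   -- a separator closes the current word
  | otherwise = (x : w) : ws       -- a letter joins the current word
  where
    (w : ws) = splitRaw xs

-- Drop the empty words produced by leading, trailing or repeated separators.
splitWords :: String -> [String]
splitWords = filter (not . null) . splitRaw
```

For example, splitWords "Uncle Caveman was a bear." should give ["Uncle","Caveman","was","a","bear"].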

