String Manipulation
Cervantes  Posted: Mon Mar 28, 2005 9:06 am   Post subject: String Manipulation

Hello thrill-seekers! So you've decided to delve into the terrible and mystifying world of string manipulation, aye? For the next half-hour, you will be my slave. Do what I tell you, and you will learn well. Do otherwise, and you shall... not learn well!
Alright, enough of that. Let's start.
What is String Manipulation and why do we care?
String manipulation means, quite simply, manipulating strings. For example, if we take a string that the user inputs (such as Paul Simon, as their user name) we might want to greet the user by saying "Hello ", usersFirstName (in this case, Paul). Or what if we want to prevent the program from crashing when the user does something stupid like enter "qr7" when we want them to input an integer?

Search for a space
Strings allow us to do stuff with a specific character from them. For example,
 code: put "Paul Simon" (1 .. 4)

would output Paul. Or I could do stuff like
 code:
if myString (5) = " " then
    put "I found a space!"
end if

But how do we know that the 5th character of myString is a space? We don't. If we want to find one, we can run through a for loop:
 code:
for i : 1 .. length (myString)
    if myString (i) = " " then
        put "Space found at position ", i
    end if
end for

Notice the use of the length function. Length returns the length of the string, quite simply. In other words, it returns the number of characters in the string. So length ("a") would be 1, and length ("12345") would be 5.

Find the user's first name
Excellent. So now let's say hello to Mr. Simon, only a little less formally:
 code:
var name := "Paul Simon"
var firstName : string
for i : 1 .. length (name)
    if name (i) = " " then
        firstName := name (1 .. i - 1)
    end if
end for
put "Hello ", firstName, "!"

We go through the whole string searching for a space. Whenever we find one, we set firstName to be everything before that space.
But what if the user enters his/her full name, i.e. Paul Frederic Simon? The loop keeps going after the first space, so the last space wins and the program will output "Hello Paul Frederic!" That's not what we want. This is easily fixed, though. We'll just add an exit after we've found the first space.
What if the user enters only his first name (i.e. "Paul")? The program will crash, because firstName was never assigned anything. We can fix this problem by setting firstName to equal name at the beginning of the program. So, if name contains any spaces, it will chomp off everything from the first space onward, otherwise it will just (effectively) use name as the output.
Final code looks like this:
 Turing:
var name := "Paul Simon"
var firstName := name
for i : 1 .. length (name)
    if name (i) = " " then
        firstName := name (1 .. i - 1)
        exit
    end if
end for
put "Hello ", firstName, "!"

Find the user's last name
This is easy enough: it's just a small modification on the last code.
 Turing:
var name := "Paul Simon"
var lastName := name
for i : 1 .. length (name)
    if name (i) = " " then
        lastName := name (i + 1 .. *)
    end if
end for
put "Hello Mr. ", lastName, "!"

Note the use of the asterisk (*). The asterisk represents the last character of the string.
 code:
var name := "Paul Simon"
put name (*)
put name (length (name))

Both of these methods output the same thing. The * is just shorter. Also, note that I removed the exit. This way, if the user's name is set to "Paul Frederic Simon", it will output "Simon" not "Frederic Simon".

Finding the middle name
This one's a little bit trickier.
 Turing:
var name := "Paul Frederic Simon"
var middleName := name
var firstSpace := 0
for i : 1 .. length (name)
    if name (i) = " " and firstSpace ~= 0 then
        middleName := name (firstSpace + 1 .. i - 1)
    end if
    if name (i) = " " then
        firstSpace := i
    end if
end for
put "Hello Mr. ", middleName, "!"

In this one, I've used a variable to store the position of the first space. Then, we continue on looking for the next space. When we find it, we set middleName to everything after the first space but before the current space. Also, note the positions of the if statements. Were they to be in reverse order, as soon as a space is found, firstSpace would be given a new value and then the next if statement would be true because firstSpace is no longer 0. So we would end up with an error.

Note that all these things could also be done using the index function.
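For the curious, here is a rough sketch of the first-name example redone with index. It assumes index (s, patt) returns the position of the first occurrence of patt in s, or 0 if there is none; spacePos is just a name I made up for this example.

```turing
var name := "Paul Simon"
var firstName := name
% index gives the position of the first space, or 0 if there is no space
var spacePos := index (name, " ")
if spacePos > 0 then
    firstName := name (1 .. spacePos - 1)
end if
put "Hello ", firstName, "!"
```

No for loop and no exit needed here, since index stops at the first match on its own.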

Converting variable types
This next part is really useful for error-trapping. But it's also useful for other things, such as drawing text on the screen using Font.Draw (where you must use a string).
The Functions
 code:
strint (s : string [, base : int]) : int
intstr (i : int [, width : int [, base : int]]) : string
strreal (s : string) : real
realstr (r : real, width : int) : string

A couple of things you need to know. First, these are functions. They return a value. That's what the last : typeSpec means: it tells us what kind of value the function will return. Thus, the strint function returns an integer, whereas the intstr function returns a string. Next, the things inside parentheses ( () ) are your parameters. These are the things that you pass into the function. In the strreal function, s : string means that I must pass a string into the function. Anything inside square brackets ( [] ) is optional.
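To see the optional parameters in action, here is a small sketch. It only assumes the signatures listed above behave as written: width pads the result out to that many characters, and base says which number base the string is written in.

```turing
% No optional parameters: just the digits
put "[", intstr (42), "]"
% With the optional width: the number is padded out to 6 characters
put "[", intstr (42, 6), "]"
% With the optional base: read "101" as a base 2 number
put strint ("101", 2)
```

The square brackets in the put statements are only there so you can see the padding.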

Error Proofing
We'll error proof our integer input. The first and basic way to do integer input is like so:
 code:
var num : int
get num

But that crashes if the user enters "y". So let's fix it.
 Turing:
var input : string
var num : int
get input
num := strint (input)
put num

Whoo-whee! It runs! Sure, but we've still got the same problem. The program will still crash if the user enters "y". Why? Basically, strint cannot turn a "y" into a number. strint must return an integer (we know this because of the last : int). If it can't, it halts the program.
So, how do we fix this? Well, here are some more nice functions!
 code:
strintok (s : string [, base : int]) : boolean
strrealok (s : string) : boolean

These functions return boolean values: true or false. They return true if the string can be successfully changed into an integer (or real, in the case of the second function) and false if they cannot. No halting. So let's use them.
 Turing:
var input : string
var num : int
get input
if strintok (input) then
    num := strint (input)
    put num
end if

And there we have it. It's error proofed. (Well, close enough. The user can still crash the program by inputting lots and lots (255, is it?) of characters.)
The next thing to do would be to force the user to input an integer. We can use a loop and an exit statement for this:
 Turing:
var input : string
var num : int
loop
    cls
    locate (1, 1)
    put "Enter an integer: " ..
    get input
    if strintok (input) then
        num := strint (input)
        put num
        exit
    else
        put "That's not a number, silly!"
        delay (1000)
    end if
end loop

We only want to exit when the user enters an integer, like they were told.

Font.Drawing an integer
After error-proofing, this should be really easy.
If we want to use Font.Draw to put a number on the screen, we have to first convert it to a string. Why? Because Font.Draw expects its first parameter to be a string. We know this because:
 code: Font.Draw (textStr : string, x, y, fontID, Colour : int)

Alright, so say we want to draw a number on the screen.
 Turing:
var font := Font.New ("Garamond:26:bold")
var num : int
get num %error proof this!
Font.Draw (intstr (num), 100, 100, font, black)

Easy enough. We could also have simply done
 code: Font.Draw (intstr (10), 100, 100, font, black)

But the first way gets you to error proof it for me. Mwahaha!

ord and chr
Next up, we learn about two new functions.
 code:
ord (ch : char) : int
chr (i : int) : char

ord takes a single character and returns an integer (that is specific to that character).
chr takes an integer and returns a single character (that is specific to that integer).
To use these, we need to haul out our ASCII chart. So, open Turing, open the Turing Help Manual (that's code for "press F10"), expand Turing Language, select Keystroke Codes.
Find the letter "A". Its ordinal value is 65. So
 code: put ord ("A")
would output 65. Similarly,
 code: put chr (65)
would output the letter "A" (without the quotes).
 Turing:
var s : char := "A"
put chr (ord (s))
var i : int := 65
put ord (chr (i))

For any character, s, chr (ord(s)) = s. Also, for any integer, i, ord (chr (i)) = i.
These functions are useful for a variety of things. For example, say you want to loop until the user presses Ctrl + Backspace:
 Turing:
var input : string (1)
loop
    getch (input)
    exit when ord (input) = 127 %ctrl + backspace
end loop

Or, say you want to output the alphabet to the screen, in lower case letters:
 code:
for i : 97 .. 122
    put chr (i), " " ..
end for

How about converting a string from upper case to lower case?
 code:
var upperCaseString := "ROOOOAAAR"
var lowerCaseString := ""
for i : 1 .. length (upperCaseString)
    lowerCaseString += chr (ord (upperCaseString (i)) + 32)
end for
put lowerCaseString

If you don't already know what += means, it simply increments the variable by whatever is after it.
 code: num += 1
is the same as
 code: num := num + 1
When using strings, it just adds the string (or character) to the end of my string.
Note that the ordinal value of any lower case letter is equal to the ordinal value of the corresponding upper case letter + 32.
Note that this doesn't quite work if your string contains things other than upper case letters. To fix this:
 Turing:
var upperCaseString := "RAWR!!  HERE ME ROAR!!"
var lowerCaseString := ""
for i : 1 .. length (upperCaseString)
    if ord (upperCaseString (i)) >= 65 and ord (upperCaseString (i)) <= 90 then %if it is a capital letter
        lowerCaseString += chr (ord (upperCaseString (i)) + 32)
    else
        lowerCaseString += upperCaseString (i)
    end if
end for
put lowerCaseString

Yay! Next up, let's try outputting "Aa Bb Cc ... Yy Zz"
 code:
for i : 65 .. 90
    put chr (i), chr (i + 32), " " ..
end for
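Going the other way (lower case to upper case) should work the same, just subtracting 32 instead of adding it. This is only a sketch mirroring the conversion loop from earlier; the test string is made up for the example, and the range 97 .. 122 covers the lower case letters a to z.

```turing
var lowerCaseString := "hear me roar!!"
var upperCaseString := ""
for i : 1 .. length (lowerCaseString)
    if ord (lowerCaseString (i)) >= 97 and ord (lowerCaseString (i)) <= 122 then %if it is a lower case letter
        upperCaseString += chr (ord (lowerCaseString (i)) - 32)
    else
        %spaces, punctuation and so on are copied over unchanged
        upperCaseString += lowerCaseString (i)
    end if
end for
put upperCaseString
```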

Conclusion
So there you have it, string manipulation in a really big nutshell that took a long time to write. Aah, let's let Asian sum things up for me, my fingers are tired:
AsianSensation wrote:

Know this though, string manipulation comes with practice, and problem solving is a big part of this. So don't assume knowing all about index will make sure you will be able to solve a question. Index is merely a tool, not the solution.

Replace the word "index" with the words "the stuff covered in this tutorial, whatever that may be" and heed his advice (or wisdom).

Happy string manipulating,
-Cervantes

jamonathin  Posted: Mon Mar 28, 2005 1:42 pm   Post subject: (No subject)

Wow. Very, very nice Cervantes, if extra bits meant anything to you, i'd give some, but they don't and the post on the bits system says it's useless givin em to mods, so i'll pass. But good job on the tutorial, that'll help lots of people out.

Oh, i liked the "HERE ME ROAR", i could tell you've been workin on the tutorial for a long time. GJ!

+ bits.

Naveg Posted: Mon Mar 28, 2005 6:04 pm   Post subject: (No subject)

awesome tutorial, thanks a lot man

Flikerator Posted: Tue Mar 29, 2005 4:41 pm   Post subject: (No subject)

zylum  Posted: Tue Mar 29, 2005 11:01 pm   Post subject: (No subject)

nice job

Naveg Posted: Wed Mar 30, 2005 12:10 am   Post subject: (No subject)

i think you need to take the speed reading course flikerator

gnarky Posted: Mon Apr 04, 2005 5:25 pm   Post subject: (No subject)

Good tut...Can someone help me out with this then please.

I have:

How do I get only '1' and 'admin' from that?
Thanks

Cervantes  Posted: Mon Apr 04, 2005 6:01 pm   Post subject: (No subject)

There are lots of ways to do it. Which way is best depends largely on how you are getting the "User 1 : admin" information. If you know it is going to be of a certain form, (such as, "User", userNumber, " : ", userType), you could do something like this:

 Turing:
var theString := "User 1 : admin"
var numbers := "0123456789"
var userNumber : int
var temp := -1
for i : 1 .. length (theString)
    if temp = -1 and index (numbers, theString (i)) ~= 0 then %if the character is a number and temp has not yet been assigned a value
        temp := i
    end if
    if temp ~= -1 and index (numbers, theString (i)) = 0 then %if the character is not a number and temp has already been assigned a value
        userNumber := strint (theString (temp .. i - 1))
        exit
    end if
end for
put "User Number: ", userNumber
var userType : string
for i : 1 .. length (theString)
    if theString (i) = ":" then
        userType := theString (i + 2 .. *) %+1 to get to the space after the :, then another one to get to the first letter of the user type
        exit
    end if
end for
put "User Type: ", userType

That searches for numbers using the index method and returns the first set of digits it finds, no others (fortunately your data only has one number). However, the same thing could also be achieved using ord:

 Turing:
var theString := "User 1002 : admin"
var userNumber : int
var temp := -1
for i : 1 .. length (theString)
    if temp = -1 then
        if ord (theString (i)) >= 48 and ord (theString (i)) <= 57 then %if the character is a number and temp has not yet been assigned a value
            %all numbers (0123456789) have ascii values between 48 and 57
            temp := i
        end if
    else
        if ord (theString (i)) < 48 or ord (theString (i)) > 57 then %if the character is not a number and temp has already been assigned a value
            userNumber := strint (theString (temp .. i - 1))
            exit
        end if
    end if
end for
put "User Number: ", userNumber

I changed around the if statements a bit, but other than that the only change is index to ord.

If you need any further help, just ask.

gnarky Posted: Mon Apr 04, 2005 6:28 pm   Post subject: (No subject)

Sounds way too confusing for me. Here's my situation.

I have a database full of users. The memberlist (text file) reads like so:

User 2 : name

With each new user, the file is automatically updated with the newest member on top. ("User 3 : Matt" would be added to the top)

I want to be able to get the usernumber and name through the open and get statements. Thanks

Cervantes  Posted: Mon Apr 04, 2005 6:46 pm   Post subject: (No subject)

So the first task is to open and get the information. Since this is not a files thread, I'll assume you're okay with that. The second task is to find the user's number. How are you going to do that? You can't say that the UserNumber = theString (6) because that only works if the number has only one digit. You could beef up the user number with leading 0's (so instead of 1, it might be 001). If you know you won't have more than a certain number of users, that might be fine. But if you don't know, you need to do what I referred to as a "number search".
The concept is not that difficult. Search through the string, one character at a time, and keep track of the position of the first number that appears. Continuing through the string, as soon as a character is found that is not a number, that is one place after the position of the last digit of the number.
Grasping these things takes time, though. I do not expect you to get it right away. But neither do I expect you to say:
gnarky wrote:

Sounds way too confusing for me.

and give up. Give it some effort!
-Cervantes

Geminias Posted: Wed Oct 19, 2005 5:03 pm   Post subject: (No subject)

how would you use the index function to parse through code looking for specific characters like 'n' or ' ' (space), and record more than just one of them.

Cervantes  Posted: Thu Oct 20, 2005 5:28 pm   Post subject: (No subject)

You might consider going through a loop, and exiting the loop when the index function returns 0. Truncate the string to be everything after the previous find of the certain character.

So if you are searching for all occurrences of the letter 'n' in the word "newton" (all lower case, because index cares about case), you would find the first 'n' easily enough. The new word becomes "ewton". You find the next n at the end. The new word is now a null string, "". Index returns 0, you exit the loop.
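Here is a rough sketch of that loop. The names rest, offset and pos are made up for this example, and the word is written in lower case because index is case-sensitive. It also assumes a substring like rest (pos + 1 .. *) is allowed to come out as a null string when pos is the last character.

```turing
var word := "newton"
var rest := word  % the part of the word we have not searched yet
var offset := 0   % how many characters have been chopped off the front so far
var pos : int
loop
    pos := index (rest, "n")
    exit when pos = 0
    % pos is relative to rest, so add the offset to get the position in word
    put "Found an n at position ", offset + pos
    offset := offset + pos
    rest := rest (pos + 1 .. *)
end loop
```

The offset is the one thing to watch: without it, every position after the first would be measured from the start of the truncated string instead of the original word.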

Recording them would be a matter of storing integers that represent the position of the character (i.e. the value returned by index) in a flexible array.

theguru Posted: Wed Nov 09, 2005 5:56 pm   Post subject: (No subject)

AWESOME TUTORIAL!! Teaching such a complex concept simply is a tough task and you did it. Great job.

But this coding got me messed up (probably only me cause i'm such a n00b).
 code:
var name := "Paul Frederic Simon"
var middleName := name
var firstSpace := 0
for i : 1 .. length (name)
    if name (i) = " " and firstSpace ~= 0 then
        middleName := name (firstSpace + 1 .. i - 1)
    end if
    if name (i) = " " then
        firstSpace := i
    end if
end for
put "Hello Mr. ", middleName, "!"

in that code what does
 code: firstSpace ~= 0
the ~= mean. I have a vague idea but I want to clear up my thoughts.

GlobeTrotter Posted: Wed Nov 09, 2005 6:35 pm   Post subject: (No subject)

"~=" means "not ="

theguru Posted: Wed Nov 09, 2005 8:18 pm   Post subject: (No subject)

oo. kk, thanks a lot!