Posted: Mon Mar 28, 2005 9:06 am Post subject: String Manipulation
Hello thrill-seekers! So you've decided to delve into the terrible and mystifying world of string manipulation, aye? For the next half-hour, you will be my slave. Do what I tell you, and you will learn well. Do otherwise, and you shall... not learn well!
Alright, enough of that. Let's start.
What is String Manipulation and why do we care?
String manipulation means, quite simply, manipulating strings. For example, if we take a string that the user inputs (such as Paul Simon, as their user name) we might want to greet the user by saying "Hello ", usersFirstName (in this case, Paul). Or what if we want to prevent the program from crashing when the user does something stupid like enter "qr7" when we want them to input an integer?
Search for a space
Strings let us pick out a specific character, or a range of characters. For example,
code:
put "Paul Simon" (1 .. 4)
would output Paul. Or I could do stuff like
code:
if myString (5) = " " then
put "I found a space!"
end if
But how do we know that the 5th character of myString is a space? We don't. If we want to find one, we can run through a for loop:
code:
for i : 1 .. length (myString)
if myString (i) = " " then
put "Space found at position ", i
end if
end for
Notice the use of the length function. Length returns the length of the string, quite simply. In other words, it returns the number of characters in the string. So length ("a") would be 1, and length ("12345") would be 5.
Find the user's first name
Excellent. So now let's say hello to Mr. Simon, only a little less formally:
code:
var name := "Paul Simon"
var firstName : string
for i : 1 .. length (name)
if name (i) = " " then
firstName := name (1 .. i - 1)
end if
end for
put "Hello ", firstName, "!"
We go through the whole string searching for a space. Each time we find one, we set firstName to be everything before that space.
But what if the user enters their full name, e.g. Paul Frederic Simon? The program will output "Hello Paul Frederic!" That's not what we want. This is easily fixed, though. We'll just add an exit after we've found the first space.
What if the user enters only a first name (e.g. "Paul")? The program will crash, because firstName was never assigned anything. We can fix this problem by setting firstName equal to name at the beginning of the program. Then, if name contains a space, firstName is cut down to everything before it; otherwise the whole of name is (effectively) used as the output.
Final code looks like this:
Turing:
var name := "Paul Simon"
var firstName := name
for i : 1 .. length (name)
    if name (i) = " " then
        firstName := name (1 .. i - 1)
        exit
    end if
end for
put "Hello ", firstName, "!"
Find the user's last name
This is easy enough: it's just a small modification of the previous code.
Turing:
var name := "Paul Simon"
var lastName := name
for i : 1 .. length (name)
    if name (i) = " " then
        lastName := name (i + 1 .. *)
    end if
end for
put "Hello Mr. ", lastName, "!"
Note the use of the asterisk (*). The asterisk represents the last character of the string.
code:
var name := "Paul Simon"
put name (*)
put name (length(name))
Both of these methods output the same thing. The * is just shorter.
Also, note that I removed the exit. This way, if the user's name is set to "Paul Frederic Simon", it will output "Simon" not "Frederic Simon".
Finding the middle name
This one's a little bit trickier.
Turing:
var name := "Paul Frederic Simon"
var middleName := name
var firstSpace := 0
for i : 1 .. length (name)
    if name (i) = " " and firstSpace ~= 0 then
        middleName := name (firstSpace + 1 .. i - 1)
    end if
    if name (i) = " " then
        firstSpace := i
    end if
end for
put "Hello Mr. ", middleName, "!"
In this one, I've used a variable to store the position of the first space. Then we continue looking for the next space. When we find it, we set middleName to everything after the first space but before the current space. Also, note the order of the if statements. Were they in reverse order, as soon as a space is found, firstSpace would be given a new value, and then the next if statement would be true because firstSpace is no longer 0. So we would end up with an error.
Converting variable types
This next part is really useful for error-trapping. But it's also useful for other things, such as drawing text on the screen using Font.Draw (where you must use a string).
The Functions
code:
strint (s : string [, base : int]) : int
intstr (i : int [, width : int [, base : int]]) : string
strreal (s : string) : real
realstr (r : real, width : int) : string
A couple of things you need to know. First, these are functions: they return a value. That's what the last : typeSpec means; it tells us what kind of value the function will return. Thus, the strint function returns an integer, whereas the intstr function returns a string. Next, anything inside parentheses ( () ) is a parameter. These are the things that you pass into the function. In the strreal function, s : string means that I must pass a string into the function. Anything inside square brackets ( [] ) is optional.
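As a rough illustration of how these four functions are called, here is a minimal sketch. The variable names are made up for this example, and the exact spacing of the realstr output depends on the width you pass, so no output is shown.

```turing
% Hypothetical variable names, for illustration only.
var n : int := strint ("42")        % string -> int
var s : string := intstr (n + 1)    % int -> string
var r : real := strreal ("3.5")     % string -> real
var t : string := realstr (r, 1)    % real -> string, with a minimum width of 1

put n   % an integer we can do arithmetic with
put s   % a string that happens to hold digits
put r   % a real number
put t   % a string that happens to hold a real number
```

The optional parameters (width and base) are simply left off here, which is what the square brackets in the signatures allow.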
Error Proofing
We'll error proof our integer input. The first and basic way to do integer input is like so:
code:
var num : int
get num
But that crashes if the user enters "y". So let's fix it.
Turing:
var input : string
var num : int
get input
num := strint (input)
put num
Whoo-whee! It runs! Sure, but we've still got the same problem. The program will still crash if the user enters "y". Why? Basically, strint cannot turn a "y" into a number. strint must return an integer (we know this because of the last : int). If it can't, it halts the program.
So, how do we fix this? Well, here are some more nice functions!
code:
strintok (s : string [, base : int]) : boolean
strrealok (s : string) : boolean
These functions return boolean values: true or false. They return true if the string can be successfully changed into an integer (or real, in the case of the second function) and false if they cannot. No halting. So let's use them.
Turing:
var input : string
var num : int
get input
if strintok (input) then
    num := strint (input)
    put num
end if
And there we have it. It's error proofed. (Well, close enough. The user can still crash the program by entering more than 255 characters, the default maximum length of a Turing string.)
The next thing to do would be to force the user to input an integer. We can use a loop and an exit statement for this:
Turing:
var input : string
var num : int
loop
    cls
    locate (1, 1)
    put "Enter an integer: " ..
    get input
    if strintok (input) then
        num := strint (input)
        put num
        exit
    else
        put "That's not a number, silly!"
        delay (1000)
    end if
end loop
We only want to exit when the user enters an integer, like they were told.
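If you need this in more than one place, the same loop could be wrapped in a procedure. This is only a sketch: getInt is a made-up name (not a built-in), and it skips the cls and delay calls to keep things short.

```turing
% getInt is a hypothetical helper, not part of Turing itself.
% It keeps asking until the input can be converted to an integer.
procedure getInt (prompt : string, var num : int)
    var input : string
    loop
        put prompt ..
        get input
        exit when strintok (input)
        put "That's not a number, silly!"
    end loop
    num := strint (input)
end getInt

var age : int
getInt ("Enter your age: ", age)
put "You entered ", age
```

Passing num as a var parameter lets the procedure hand the converted value back to the caller.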
Font.Drawing an integer
After error-proofing, this should be really easy.
If we want to use Font.Draw to put a number on the screen, we have to first convert it to a string. Why? Because Font.Draw expects its first parameter to be a string. We know this because:
code:
Font.Draw (textStr : string, x, y, fontID, Colour : int)
Alright, so say we want to draw a number on the screen.
Turing:
var font := Font.New ("Garamond:26:bold")
var num : int
get num % error proof this!
Font.Draw (intstr (num), 100, 100, font, black)
Easy enough. We could also have simply done
code:
Font.Draw (intstr (10), 100, 100, font, black)
But the first way gets you to error proof it for me. Mwahaha!
ord and chr
Next up, we learn about two new functions.
code:
ord (ch : char) : int
chr (i : int) : char
ord takes a single character and returns an integer (that is specific to that character).
chr takes an integer and returns a single character (that is specific to that integer).
To use these, we need to haul out our ASCII chart. So, open Turing, open the Turing Help Manual (that's code for "press F10"), expand Turing Language, select Keystroke Codes.
Find the letter "A". Its ordinal value is 65. So ord ("A") returns 65, and chr (65) returns "A".
For any character, s, chr (ord(s)) = s. Also, for any integer, i, ord (chr (i)) = i.
These functions are useful for a variety of things. For example, say you want to loop until the user presses Ctrl + Backspace:
Turing:
var input : string (1)
loop
    getch (input)
    exit when ord (input) = 127 % ctrl + backspace
end loop
Or, say you want to output the alphabet to the screen, in lower case letters:
code:
for i : 97 .. 122
put chr (i), " " ..
end for
How about converting a string from upper case to lower case?
code:
var upperCaseString := "ROOOOAAAR"
var lowerCaseString := ""
for i : 1 .. length (upperCaseString)
lowerCaseString += chr (ord (upperCaseString (i)) + 32)
end for
put lowerCaseString
If you don't already know what += means, it simply increments the variable by whatever is after it.
code:
num += 1
is the same as
code:
num := num + 1
When using strings, it just adds the string (or character) to the end of my string.
Note that the ordinal value of any lower case letter is equal to the ordinal value of the corresponding upper case letter + 32.
Note that this doesn't quite work if your string contains things other than upper case letters. To fix this:
Turing:
var upperCaseString := "RAWR!! HERE ME ROAR!!"
var lowerCaseString := ""
for i : 1 .. length (upperCaseString)
    if ord (upperCaseString (i)) >= 65 and ord (upperCaseString (i)) <= 90 then % if it is a capital letter
        lowerCaseString += chr (ord (upperCaseString (i)) + 32)
    else
        lowerCaseString += upperCaseString (i)
    end if
end for
put lowerCaseString
Yay! Next up, let's try outputting "Aa Bb Cc ... Yy Zz"
code:
for i : 65 .. 90
put chr (i), chr (i + 32), " " ..
end for
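The upper-to-lower conversion from a moment ago could also be packaged as a function, so it can be reused on any string. A minimal sketch, assuming the made-up name toLower (it is not a built-in):

```turing
% toLower is a hypothetical helper, not part of Turing itself.
function toLower (s : string) : string
    var lowered := ""
    for i : 1 .. length (s)
        var charCode := ord (s (i))
        if charCode >= 65 and charCode <= 90 then % capital letter
            lowered += chr (charCode + 32)
        else
            lowered += s (i) % leave everything else alone
        end if
    end for
    result lowered
end toLower

put toLower ("RAWR!! HERE ME ROAR!!") % rawr!! here me roar!!
```

Storing the ordinal value in charCode just saves calling ord three times per character.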
Conclusion
So there you have it, string manipulation in a really big nutshell that took a long time to write. Aah, let's let Asian sum things up for me, my fingers are tired:
AsianSensation wrote:
Know this though, string manipulation comes with practice, and problem solving is a big part of this. So don't assume knowing all about index will make sure you will be able to solve a question. Index is merely a tool, not the solution.
Replace the word "index" with the words "the stuff covered in this tutorial, whatever that may be" and heed his advice (or wisdom).
Happy string manipulating,
-Cervantes
jamonathin
Posted: Mon Mar 28, 2005 1:42 pm Post subject: (No subject)
Wow. Very, very nice, Cervantes. If extra bits meant anything to you, I'd give some, but they don't, and the post on the bits system says it's useless giving them to mods, so I'll pass. But good job on the tutorial, that'll help lots of people out.
Oh, I liked the "HERE ME ROAR". I could tell you've been working on the tutorial for a long time. GJ!
+ bits.
Naveg
Posted: Mon Mar 28, 2005 6:04 pm Post subject: (No subject)
awesome tutorial, thanks a lot man
Flikerator
Posted: Tue Mar 29, 2005 4:41 pm Post subject: (No subject)
I'm still reading through it...
zylum
Posted: Tue Mar 29, 2005 11:01 pm Post subject: (No subject)
nice job
Naveg
Posted: Wed Mar 30, 2005 12:10 am Post subject: (No subject)
i think you need to take the speed reading course flikerator
gnarky
Posted: Mon Apr 04, 2005 5:25 pm Post subject: (No subject)
Good tut...Can someone help me out with this then please.
I have:
User 1 : admin
How do I get only '1' and 'admin' from that?
Thanks
Cervantes
Posted: Mon Apr 04, 2005 6:01 pm Post subject: (No subject)
There are lots of ways to do it. Which way is best depends largely on how you are getting the "User 1 : admin" information. If you know it is going to be of a certain form, (such as, "User", userNumber, " : ", userType), you could do something like this:
Turing:
var theString := "User 1 : admin"
var numbers := "0123456789"
var userNumber : int
var temp := -1
for i : 1 .. length (theString)
    if temp = -1 and index (numbers, theString (i)) ~= 0 then % if the character is a number and temp has not yet been assigned a value
        temp := i
    end if
    if temp ~= -1 and index (numbers, theString (i)) = 0 then % if the character is not a number and temp has already been assigned a value
        userNumber := strint (theString (temp .. i - 1))
        exit
    end if
end for
put "User Number: ", userNumber

var userType : string
for i : 1 .. length (theString)
    if theString (i) = ":" then
        userType := theString (i + 2 .. *) % +1 to get to the space after the :, then another one to get to the first letter of the user type
        exit
    end if
end for
put "User Type: ", userType
That uses searching for numbers (and returning the first set of numbers, no others [fortunately your data only has one number]) using the index method. However, it could also be achieved using ord
Turing:
var theString := "User 1002 : admin"
var userNumber : int
var temp := -1
for i : 1 .. length (theString)
    if temp = -1 then
        if ord (theString (i)) >= 48 and ord (theString (i)) <= 57 then % if the character is a number and temp has not yet been assigned a value
            % all numbers (0123456789) have ascii values between 48 and 57
            temp := i
        end if
    else
        if ord (theString (i)) < 48 or ord (theString (i)) > 57 then % if the character is not a number and temp has already been assigned a value
            userNumber := strint (theString (temp .. i - 1))
            exit
        end if
    end if
end for
put "User Number: ", userNumber
I changed around the if statements a bit, but other than that the only change is index to ord.
If you need any further help, just ask.
gnarky
Posted: Mon Apr 04, 2005 6:28 pm Post subject: (No subject)
Sounds way too confusing for me. Here's my situation.
I have a database full of users. The memberlist (text file) reads like so:
User 2 : name
User 1 : admin
With each new user, the file is automatically updated with the newest member on top. ("User 3 : Matt" would be added to the top)
I want to be able to get the usernumber and name through the open and get statements. Thanks
Cervantes
Posted: Mon Apr 04, 2005 6:46 pm Post subject: (No subject)
So the first task is to open and get the information. Since this is not a files thread, I'll assume you're okay with that. The second task is to find the user's number. How are you going to do that? You can't say that the UserNumber = theString (6) because that only works if the number has only one digit. You could beef up the user number with leading 0's (so instead of 1, it might be 001). If you know you won't have more than a certain number of users, that might be fine. But if you don't know, you need to do what I referred to as a "number search".
The concept is not that difficult. Search through the string, one character at a time, and keep track of the position of the first number that appears. Continuing through the string, as soon as a character is found that is not a number, that is one place after the position of the last digit of the number.
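As a rough sketch of that idea (one way to write it, not the only one), the version below also copes with a number that runs right to the end of the string. The names firstDigit and lastDigit are made up for this example.

```turing
var theString := "User 1002 : admin"
var numbers := "0123456789"
var firstDigit := 0 % position of the first digit, 0 until one is found
var lastDigit := 0  % position of the last digit of that number

for i : 1 .. length (theString)
    if index (numbers, theString (i)) ~= 0 then
        if firstDigit = 0 then
            firstDigit := i % remember where the number starts
        end if
        lastDigit := i
    elsif firstDigit ~= 0 then
        exit % first non-digit after the number: we are done
    end if
end for

if firstDigit ~= 0 then
    put "User Number: ", strint (theString (firstDigit .. lastDigit))
else
    put "No number found"
end if
```

With the sample string this prints "User Number: 1002".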
Grasping these things takes time, though. I do not expect you to get it right away. But neither do I expect you to say:
gnarky wrote:
Sounds way too confusing for me.
and give up. Give it some effort!
-Cervantes
Geminias
Posted: Wed Oct 19, 2005 5:03 pm Post subject: (No subject)
how would you use the index function to parse through code looking for specific characters like 'n' or ' ' (space), and record more than just one of them.
Cervantes
Posted: Thu Oct 20, 2005 5:28 pm Post subject: (No subject)
You might consider going through a loop, and exiting the loop when the index function returns 0. Truncate the string to be everything after the previous find of the certain character.
So if you are searching for all occurrences of the letter 'n' in the word "Newton" (ignoring case for the sake of the example), you would find the first 'n' easily enough. The new word becomes "ewton". You find the next n at the end. The new word is now a null string, "". Index returns 0, you exit the loop.
Recording them would be a matter of storing integers that represent the position of the character (i.e. the value returned by index, plus the number of characters already truncated) in a flexible array.
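A minimal sketch of that loop, assuming we search the made-up word "banana" for the letter "n". The names word, rest, offset and positions are illustrative only.

```turing
var word := "banana"
var rest := word % the part of the word not yet searched
var offset := 0  % how many characters have been truncated so far
var positions : flexible array 1 .. 0 of int % starts out empty

loop
    var p := index (rest, "n")
    exit when p = 0 % no more matches
    new positions, upper (positions) + 1 % grow the array by one
    positions (upper (positions)) := offset + p % position in the original word
    offset += p
    rest := rest (p + 1 .. *) % everything after the match
end loop

for i : 1 .. upper (positions)
    put "Found at position ", positions (i)
end for
```

For "banana" this reports positions 3 and 5. The offset is what turns a position in the truncated string back into a position in the original word.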
theguru
Posted: Wed Nov 09, 2005 5:56 pm Post subject: (No subject)
AWESOME TUTORIAL!! Teaching such a complex concept simply is a tough task and you did it. Great job.
But this coding got me messed up (probably only me cause i'm such a n00b).
code:
var name := "Paul Frederic Simon"
var middleName := name
var firstSpace := 0
for i : 1 .. length (name)
if name (i) = " " and firstSpace ~= 0 then
middleName := name (firstSpace + 1 .. i - 1)
end if
if name (i) = " " then
firstSpace := i
end if
end for
put "Hello Mr. ", middleName, "!"
in that code what does
code:
firstSpace ~= 0
the ~= mean? I have a vague idea but I want to clear up my thoughts.
GlobeTrotter
Posted: Wed Nov 09, 2005 6:35 pm Post subject: (No subject)
"~=" means "not ="
theguru
Posted: Wed Nov 09, 2005 8:18 pm Post subject: (No subject)