Computer Science Canada

String Manipulation

Author:  Cervantes [ Mon Mar 28, 2005 9:06 am ]
Post subject:  String Manipulation

Hello thrill-seekers! So you've decided to delve into the terrible and mystifying world of string manipulation, aye? For the next half-hour, you will be my slave. Do what I tell you, and you will learn well. Do otherwise, and you shall... not learn well!
Alright, enough of that. Let's start.
What is String Manipulation and why do we care?
String manipulation means, quite simply, manipulating strings. For example, if we take a string that the user inputs (such as Paul Simon, as their user name) we might want to greet the user by saying "Hello ", usersFirstName (in this case, Paul). Or what if we want to prevent the program from crashing when the user does something stupid like enter "qr7" when we want them to input an integer?

Search for a space
Strings let us pick out a specific character, or a range of characters, by position. For example,
code:

put "Paul Simon" (1 .. 4)

would output Paul. Or I could do stuff like
code:

if myString (5) = " " then
      put "I found a space!"
end if

But how do we know that the 5th character of myString is a space? We don't. If we want to find one, we can run through a for loop:
code:

for i : 1 .. length (myString)
      if myString (i) = " " then
            put "Space found at position ", i
      end if
end for

Notice the use of the length function. Length returns the length of the string, quite simply. In other words, it returns the number of characters in the string. So length ("a") would be 1, and length ("12345") would be 5.
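To see how length and character positions fit together, here's a small sketch that counts the spaces in a string. The variable names (myString, spaceCount) are just picked for illustration.
Turing:

var myString := "Paul Frederic Simon"
var spaceCount := 0
for i : 1 .. length (myString)
    if myString (i) = " " then
        spaceCount += 1 % one more space found
    end if
end for
put "Length: ", length (myString) % 19 characters
put "Spaces: ", spaceCount % 2 spaces
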

Find the user's first name
Excellent. So now let's say hello to Mr. Simon, only a little less formally:
code:

var name := "Paul Simon"
var firstName : string
for i : 1 .. length (name)
    if name (i) = " " then
        firstName := name (1 .. i - 1)
    end if
end for
put "Hello ", firstName, "!"

We go through the whole string searching for a space. Whenever we find one, we set firstName to be everything before that space.
But what if the user enters his/her full name, e.g. Paul Frederic Simon? Because the loop keeps going after the first space, the last space wins and the program will output "Hello Paul Frederic!" That's not what we want. This is easily fixed, though. We'll just add an exit right after we've found the first space.
What if the user enters only his first name (e.g. "Paul")? The program will crash, because firstName was never assigned anything. We can fix this problem by setting firstName to equal name at the beginning of the program. So, if name contains any spaces, the program will chomp off everything from the first space on; otherwise it will just (effectively) use name as the output.
Final code looks like this:
Turing:

var name := "Paul Simon"
var firstName := name
for i : 1 .. length (name)
    if name (i) = " " then
        firstName := name (1 .. i - 1)
        exit
    end if
end for
put "Hello ", firstName, "!"


Find the user's last name
This is easy enough: it's just a small modification on the last code.
Turing:

var name := "Paul Simon"
var lastName := name
for i : 1 .. length (name)
    if name (i) = " " then
        lastName := name (i + 1 .. *)
    end if
end for
put "Hello Mr. ", lastName, "!"

Note the use of the asterisk (*). The asterisk represents the last character of the string.
code:

var name := "Paul Simon"
put name (*)
put name (length(name))

Both of these methods output the same thing. The * is just shorter. :)
Also, note that I removed the exit. This way, if the user's name is set to "Paul Frederic Simon", it will output "Simon" not "Frederic Simon".

Finding the middle name
This one's a little bit trickier.
Turing:

var name := "Paul Frederic Simon"
var middleName := name
var firstSpace := 0
for i : 1 .. length (name)
    if name (i) = " " and firstSpace ~= 0 then
        middleName := name (firstSpace + 1 .. i - 1)
    end if
    if name (i) = " " then
        firstSpace := i
    end if
end for
put "Hello Mr. ", middleName, "!"

In this one, I've used a variable to store the position of the first space. Then, we continue on looking for the next space. When we find it, we set middleName to everything after the first space but before the current space. Also, note the positions of the if statements. Were they to be in reverse order, as soon as a space is found, firstSpace would be given a new value and then the next if statement would be true because firstSpace is no longer 0. We would then be asking for a substring that starts after the position where it ends, so we would end up with an error.

Note that all these things could also be done using the index function.
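For instance, here is a rough sketch of the first-name search done with index instead of a for loop. index (s, patt) returns the position where patt first appears in s, or 0 if it doesn't appear at all. The variable spacePos is just a name I made up for this example.
Turing:

var name := "Paul Frederic Simon"
var firstName := name % fall back to the whole name if there is no space
var spacePos := index (name, " ") % position of the first space, or 0
if spacePos > 0 then
    firstName := name (1 .. spacePos - 1)
end if
put "Hello ", firstName, "!"

Because index only ever reports the first match, there's no need for an exit here. If name is just "Paul", index returns 0 and firstName stays as it was.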

Converting variable types
This next part is really useful for error-trapping. But it's also useful for other things, such as drawing text on the screen using Font.Draw (where you must use a string).
The Functions
code:

strint (s : string [, base : int]) : int
intstr (i : int [, width : int [, base : int]]) : string
strreal (s : string) : real
realstr (r : real, width : int) : string

A couple of things you need to know. First, these are functions. They return a value. That's what the last : typeSpec means: it tells us what kind of value the function will return. Thus, the strint function returns an integer, whereas the intstr function returns a string. Next, the things inside the parentheses ( () ) are your parameters. These are the things that you pass into the function. In the strreal function, s : string means that I must pass a string into the function. Anything inside square brackets ( [] ) is optional.
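Here's a quick sketch of what those optional parameters do, going by the signatures above. The values are arbitrary, and the square brackets in the output are only there to make the padding visible.
Turing:

% base is optional: without it, strint reads the string as base 10
put strint ("11") % outputs 11
put strint ("11", 2) % "11" read as a binary number is 3

% width is optional too: with it, the result is padded on the left with spaces
put "[", intstr (42), "]" % outputs [42]
put "[", intstr (42, 6), "]" % outputs [    42]

% to give a base, you must also give a width
put intstr (5, 1, 2) % 5 written in base 2 is 101
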

Error Proofing
We'll error proof our integer input. The first and basic way to do integer input is like so:
code:

var num : int
get num

But that crashes if the user enters "y". So let's fix it.
Turing:

var input : string
var num : int
get input
num := strint (input)
put num

Whoo-whee! It runs! Sure, but we've still got the same problem. The program will still crash if the user enters "y". Why? Basically, strint cannot turn a "y" into a number. strint must return an integer (we know this because of the last : int). If it can't, it halts the program.
So, how do we fix this? Well, here are some more nice functions!
code:

strintok (s : string [, base : int]) : boolean
strrealok (s : string) : boolean

These functions return boolean values: true or false. They return true if the string can be successfully changed into an integer (or real, in the case of the second function) and false if they cannot. No halting. So let's use them.
Turing:

var input : string
var num : int
get input
if strintok (input) then
    num := strint (input)
    put num
end if

And there we have it. It's error proofed. (Well, close enough. The user can still crash the program by inputting more characters than a string can hold, which is 255.)
The next thing to do would be to force the user to input an integer. We can use a loop and an exit statement for this:
Turing:

var input : string
var num : int
loop
    cls
    locate (1, 1)
    put "Enter an integer: " ..
    get input
    if strintok (input) then
        num := strint (input)
        put num
        exit
    else
        put "That's not a number, silly!"
        delay (1000)
    end if
end loop

We only want to exit when the user enters an integer, like they were told.
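The same pattern should work for real numbers, just by swapping in strrealok and strreal from the list above. This is only a sketch along the lines of the integer version, without the cls and delay:
Turing:

var input : string
var num : real
loop
    put "Enter a number: " ..
    get input
    if strrealok (input) then
        num := strreal (input) % safe: we already know the conversion will work
        exit
    end if
    put "That's not a number, silly!"
end loop
put "You entered ", num
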

Font.Drawing an integer
After error-proofing, this should be really easy.
If we want to use Font.Draw to put a number on the screen, we have to first convert it to a string. Why? Because Font.Draw expects its first parameter to be a string. We know this because:
code:

Font.Draw (textStr : string, x, y, fontID, Colour : int)

Alright, so say we want to draw a number on the screen.
Turing:

var font := Font.New ("Garamond:26:bold")
var num : int
get num %error proof this!
Font.Draw (intstr (num), 100, 100, font, black)

Easy enough. We could also have simply done
code:

Font.Draw (intstr (10), 100, 100, font, black)

But the first way gets you to error proof it for me. Mwahaha!

ord and chr
Next up, we learn about two new functions.
code:

ord (ch : char) : int
chr (i : int) : char

ord takes a single character and returns an integer (that is specific to that character).
chr takes an integer and returns a single character (that is specific to that integer).
To use these, we need to haul out our ASCII chart. So, open Turing, open the Turing Help Manual (that's code for "press F10"), expand Turing Language, select Keystroke Codes.
Find the letter "A". Its ordinal value is 65. So
code:
put ord ("A")
would output 65. Similarly,
code:
put chr (65)
would output the letter "A" (without the quotes).
Turing:

var s : char := "A"
put chr (ord (s))

var i : int := 65
put ord (chr (i))

For any character, s, chr (ord(s)) = s. Also, for any integer, i, ord (chr (i)) = i.
These functions are useful for a variety of things. For example, say you want to loop until the user presses Ctrl + Backspace:
Turing:

var input : string (1)
loop
    getch (input)
    exit when ord (input) = 127 %ctrl + backspace
end loop

Or, say you want to output the alphabet to the screen, in lower case letters:
code:

for i : 97 .. 122
    put chr (i), " " ..
end for

How about converting a string from upper case to lower case?
code:

var upperCaseString := "ROOOOAAAR"
var lowerCaseString := ""
for i : 1 .. length (upperCaseString)
    lowerCaseString += chr (ord (upperCaseString (i)) + 32)
end for
put lowerCaseString

If you don't already know what += means, it simply increments the variable by whatever is after it.
code:
num += 1
is the same as
code:
num := num + 1
When using strings, it just adds the string (or character) to the end of my string.
Note that the ordinal value of any lower case letter is equal to the ordinal value of the corresponding upper case letter + 32.
Note that this doesn't quite work if your string contains things other than upper case letters. To fix this:
Turing:

var upperCaseString := "RAWR!!  HERE ME ROAR!!"
var lowerCaseString := ""
for i : 1 .. length (upperCaseString)
    if ord (upperCaseString (i)) >= 65 and ord (upperCaseString (i)) <= 90 then %if it is a capital letter
        lowerCaseString += chr (ord (upperCaseString (i)) + 32)
    else
        lowerCaseString += upperCaseString (i)
    end if
end for
put lowerCaseString

Yay! Next up, let's try outputting "Aa Bb Cc ... Yy Zz"
code:

for i : 65 .. 90
    put chr (i), chr (i + 32), " " ..
end for
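
Going the other way (lower case to upper case) is the same idea in reverse: check for the range 97 to 122 and subtract 32 instead of adding it. A sketch, with variable names mirroring the ones above:
Turing:

var lowerCaseString := "rawr!!  hear me roar!!"
var upperCaseString := ""
for i : 1 .. length (lowerCaseString)
    if ord (lowerCaseString (i)) >= 97 and ord (lowerCaseString (i)) <= 122 then %if it is a lower case letter
        upperCaseString += chr (ord (lowerCaseString (i)) - 32)
    else
        upperCaseString += lowerCaseString (i) %leave everything else alone
    end if
end for
put upperCaseString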



Conclusion
So there you have it, string manipulation in a really big nutshell that took a long time to write. Aah, let's let Asian sum things up for me, my fingers are tired:
AsianSensation wrote:

Know this though, string manipulation comes with practice, and problem solving is a big part of this. So don't assume knowing all about index will make sure you will be able to solve a question. Index is merely a tool, not the solution.

Replace the word "index" with the words "the stuff covered in this tutorial, whatever that may be" and heed his advice (or wisdom).

Happy string manipulating,
-Cervantes

Author:  jamonathin [ Mon Mar 28, 2005 1:42 pm ]
Post subject: 

Wow. Very, very nice Cervantes. If extra bits meant anything to you, I'd give some, but they don't, and the post on the bits system says it's useless givin em to mods, so I'll pass. But good job on the tutorial, that'll help lots of people out.

Oh, I liked the "HERE ME ROAR", I could tell you've been workin on the tutorial for a long time. :P GJ!

+ Very Happy bits.

Author:  Naveg [ Mon Mar 28, 2005 6:04 pm ]
Post subject: 

awesome tutorial, thanks a lot man

Author:  Flikerator [ Tue Mar 29, 2005 4:41 pm ]
Post subject: 

Im still reading through it...

Author:  zylum [ Tue Mar 29, 2005 11:01 pm ]
Post subject: 

nice job :P

Author:  Naveg [ Wed Mar 30, 2005 12:10 am ]
Post subject: 

i think you need to take the speed reading course flikerator

Author:  gnarky [ Mon Apr 04, 2005 5:25 pm ]
Post subject: 

Good tut...Can someone help me out with this then please.

I have:

User 1 : admin

How do I get only '1' and 'admin' from that?
Thanks

Author:  Cervantes [ Mon Apr 04, 2005 6:01 pm ]
Post subject: 

There are lots of ways to do it. Which way is best depends largely on how you are getting the "User 1 : admin" information. If you know it is going to be of a certain form, (such as, "User", userNumber, " : ", userType), you could do something like this:

Turing:

var theString := "User 1 : admin"
var numbers := "0123456789"
var userNumber : int
var temp := -1
for i : 1 .. length (theString)
    if temp = -1 and index (numbers, theString (i)) ~= 0 then  %if the character is a number and temp has not yet been assigned a value
        temp := i
    end if
    if temp ~= -1 and index (numbers, theString (i)) = 0 then  %if the character is not a number and temp has already been assigned a value
        userNumber := strint (theString (temp .. i - 1))
        exit
    end if
end for
put "User Number: ", userNumber

var userType : string
for i : 1 .. length (theString)
    if theString (i) = ":" then
        userType := theString (i + 2 .. *)  %+1 to get to the space after the :, then another one to get to the first letter of the user type
        exit
    end if
end for
put "User Type: ", userType


That searches for digits using the index method and returns the first run of digits it finds, ignoring any later ones (fortunately your data only has one number). However, the same thing could also be achieved using ord:

Turing:

var theString := "User 1002 : admin"
var userNumber : int
var temp := -1
for i : 1 .. length (theString)
    if temp = -1 then
        if ord (theString (i)) >= 48 and ord (theString (i)) <= 57 then %if the character is a number and temp has not yet been assigned a value
            %all numbers (0123456789) have ascii values between 48 and 57
            temp := i
        end if
    else
        if ord (theString (i)) < 48 or ord (theString (i)) > 57 then %if the character is not a number and temp has already been assigned a value
            userNumber := strint (theString (temp .. i - 1))
            exit
        end if
    end if
end for
put "User Number: ", userNumber

I changed around the if statements a bit, but other than that the only change is index to ord.
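
If you can count on every line having exactly the form "User", a space, the number, then " : " and the user type, a shorter route might be to let index find the " : " separator and slice around it. This is only a sketch under that assumption; separator and numberPart are names made up for the example.
Turing:

var theString := "User 1002 : admin"
var numberPart := ""
var separator := index (theString, " : ") %position of the space before the colon, or 0
if separator > 5 then %there must be room for "User " before the separator
    numberPart := theString (6 .. separator - 1) %"User " takes up the first 5 characters
    if strintok (numberPart) then
        put "User Number: ", strint (numberPart)
        put "User Type: ", theString (separator + 3 .. *) %skip over the " : "
    end if
end if

If the line doesn't match the expected form, this version simply outputs nothing rather than crashing.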

If you need any further help, just ask. :)

Author:  gnarky [ Mon Apr 04, 2005 6:28 pm ]
Post subject: 

Sounds way too confusing for me. Heres my situation.

I have a database full of users. The memberlist (text file) read as so:

User 2 : name
User 1 : admin

With each new user, the file is automatically updated with the newest member on top. ("User 3 : Matt" would be added to the top)

I want to be able to get the usernumber and name through the open and get statements. Thanks

Author:  Cervantes [ Mon Apr 04, 2005 6:46 pm ]
Post subject: 

So the first task is to open and get the information. Since this is not a files thread, I'll assume you're okay with that. The second task is to find the user's number. How are you going to do that? You can't say that the UserNumber = theString (6) because that only works if the number has only one digit. You could beef up the user number with leading 0's (so instead of 1, it might be 001). If you know you won't have more than a certain number of users, that might be fine. But if you don't know, you need to do what I referred to as a "number search".
The concept is not that difficult. Search through the string, one character at a time, and keep track of the position of the first number that appears. Continuing through the string, as soon as a character is found that is not a number, that is one place after the position of the last digit of the number.
Grasping these things takes time, though. I do not expect you to get it right away. But neither do I expect you to say:
gnarky wrote:

Sounds way too confusing for me.

and give up. Give it some effort! :)

-Cervantes

Author:  Geminias [ Wed Oct 19, 2005 5:03 pm ]
Post subject: 

how would you use the index function to parse through code looking for specific characters like 'n' or ' ' (space), and record more than just one of them.

Author:  Cervantes [ Thu Oct 20, 2005 5:28 pm ]
Post subject: 

You might consider going through a loop, and exiting the loop when the index function returns 0. Truncate the string to be everything after the previous find of the certain character.

So if you are searching for all occurrences of the letter 'n' in the word "Newton" (ignoring case for the moment), you would find the first 'n' easily enough. The new word becomes "ewton". You find the next n at the end. The new word is now a null string, "". Index returns 0, you exit the loop.

Recording them would be a matter of storing integers that represent the position of the character (ie. the value returned by index) in a flexible array.
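
A rough sketch of that idea follows. It uses "newton" in lower case, since index is case sensitive, and it assumes the thing you're searching for is a single character. The names rest, offset and positions are made up for the example; offset keeps track of how many characters have been chopped off so far, so the stored positions refer to the original word.
Turing:

var word := "newton"
var target := "n"
var positions : flexible array 1 .. 0 of int %starts out empty
var rest := word %the part of the word not yet searched
var offset := 0 %how many characters have been chopped off so far
var found : int
loop
    found := index (rest, target)
    exit when found = 0
    new positions, upper (positions) + 1 %make room for one more position
    positions (upper (positions)) := offset + found
    offset += found
    rest := rest (found + 1 .. *) %keep only what comes after the match
end loop
for i : 1 .. upper (positions)
    put positions (i), " " ..
end for

For "newton" this should output 1 and 6.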

Author:  theguru [ Wed Nov 09, 2005 5:56 pm ]
Post subject: 

AWESOME TUTORIAL!! Teaching such a complex concept simply is a tough task and you did it. Great job.

But this coding got me messed up (probably only me cause i'm such a n00b).
code:
var name := "Paul Frederic Simon"
var middleName := name
var firstSpace := 0
for i : 1 .. length (name)
    if name (i) = " " and firstSpace ~= 0 then
        middleName := name (firstSpace + 1 .. i - 1)
    end if
    if name (i) = " " then
        firstSpace := i
    end if
end for
put "Hello Mr. ", middleName, "!"


in that code what does
code:
firstSpace ~= 0
the ~= mean? I have a vague idea but I want to clear up my thoughts.

Author:  GlobeTrotter [ Wed Nov 09, 2005 6:35 pm ]
Post subject: 

"~=" means "not ="

Author:  theguru [ Wed Nov 09, 2005 8:18 pm ]
Post subject: 

oo. kk, thanks a lot!

Author:  Saad [ Sun Oct 12, 2008 9:24 am ]
Post subject:  RE:String Manipulation

Added to Wiki
Turing String Manipulation

Author:  Lekegolo killer [ Tue Dec 09, 2008 3:56 pm ]
Post subject:  Re: String Manipulation

i took a bit of the code i found here and morphed it a bit so it looks like this:

var day2 : int
var day3 : string
loop
    cls
    locate (1, 1)
    put "Requesting day of current month"
    get day3
    if strintok (day3) then
        day2 := strint (day3)
        exit
    else
        put "That is not a valid number"
        delay (1000)
    end if
end loop
delay (1000)
cls


how would i make it so that it will only accept numbers up to 31 as a valid number? (i am trying to get the day of the month without it crashing if they put in an invalid number).

Author:  gitoxa [ Tue Dec 09, 2008 4:24 pm ]
Post subject:  RE:String Manipulation

You have to test the number to make sure it is in range before assigning it to your variable. (Testing after works too, but you don't want to assign bad data.)

You'll want to use the strint function to check that your number is 31 or less ( < 32 ) before assigning it to your variable.
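
Something along these lines might do it. It's only a sketch: the variable names are placeholders, and it also rejects anything below 1, which goes a bit beyond what was asked.
Turing:

var input : string
var dayOfMonth : int
loop
    put "Enter the day of the current month: " ..
    get input
    if strintok (input) then
        %only look at the value once we know it converts cleanly
        if strint (input) >= 1 and strint (input) <= 31 then
            dayOfMonth := strint (input)
            exit
        end if
    end if
    put "That is not a valid day"
end loop
put "Day: ", dayOfMonth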

Author:  Lekegolo killer [ Wed Dec 10, 2008 1:26 pm ]
Post subject:  Re: String Manipulation

ohhhh kk i got it...i think... i put a if in a if.

Author:  Flipmc [ Fri Jan 30, 2009 6:37 pm ]
Post subject:  RE:String Manipulation

Thanks a lot for this!

Author:  Draconis [ Tue May 05, 2009 9:30 pm ]
Post subject:  RE:String Manipulation

Thanks sooo much man, i've been looking for something that gave out this info since foreverrrr! THANKS! =)

Author:  stas054 [ Sat Jun 13, 2009 3:18 pm ]
Post subject:  RE:String Manipulation

I <3 pie

Author:  DaBigOne [ Sun Dec 30, 2012 5:17 pm ]
Post subject:  Re: String Manipulation

I am making a calculator, and this is what I got so far.

var num1 : real
var num2 : real
var reply : string


loop
    put " what kind of function would you like to perform? type in '+', '-', '*', or '/'"
    get reply
    if reply = "/" or reply = "-" or reply = "*" or reply = "+" then
        if reply = "+" then
            put " What is the first number?"
            get num1
            put " What is the second number?"
            get num2
            put num1 + num2
        end if
        if reply = "-" then
            put " What is the first number?"
            get num1
            put " What is the second number?"
            get num2
            put num1 - num2
        end if
        if reply = "*" then
            put " What is the first number?"
            get num1
            put " What is the second number? "
            get num2
            put num1 * num2
        end if
        if reply = "/" then
            put " What is the first number?"
            get num1
            put " What is the second number?"
            get num2
            put num1 / num2
        end if
    else
        put " Invalid Function "
        delay (1000)
    end if
end loop




I managed to fix the problem at the beginning if someone types in something other than the required symbols, as it makes a message pop up saying "Invalid Function"
However, after they put in the math symbol, the program asks them for a number.
I want to make sure they input a number (including real and integers) and nothing else, but I cannot seem to find out how to do that.
This tutorial only works for integers, but I want the user to be able to input real numbers as well; it is a calculator after all.
Any advice????

Author:  Insectoid [ Sun Dec 30, 2012 6:04 pm ]
Post subject:  RE:String Manipulation

strreal() and strrealok()

Author:  DaBigOne [ Mon Dec 31, 2012 4:16 pm ]
Post subject:  RE:String Manipulation

so basically, strreal makes it so only numbers can be entered i take it, and strrealok does what?

Author:  Insectoid [ Mon Dec 31, 2012 4:24 pm ]
Post subject:  RE:String Manipulation

Did you read the first post in this thread? If you still don't get it after that, you can check the Turing Documentation, which can be found by clicking the 'Turing' button at the top of the page.

Author:  DaBigOne [ Mon Dec 31, 2012 6:08 pm ]
Post subject:  RE:String Manipulation

OK, yes I found that part. That helped a lot, thanks!

