Computer Science Canada
|Author:||Clayton [ Sat Dec 02, 2006 2:36 pm ]|
|Post subject:||[Tutorial] Arrays|
NOTE: When going through this tutorial, all headers in red are problems that you should do in order to fully understand the concepts of arrays. Finish them before you continue reading the tutorial. It is highly recommended that you complete these problems. The solutions will follow under the orange headers. Please don't be tempted to look at the solutions unless you are completely stuck.
Write a program to get 15 people's names. After you have gotten all of the names, display all of the names in the order that you got them.
The code that you got should look something like this:
Whew! That took a long time to write, and it doesn't even do anything for us except regurgitate some names for us. Either way, you've got to be thinking, "There's got to be a better way!". Have no fear, there is!
What is an Array?
An array is a linear collection of values of the same type. In other words, its a group of variables referred to by the same name, but each value has a unique index. "But what's an idex?". Well, let me show you an example of an array:
Don't worry, it's really not as difficult as it looks. Lets go over it part-by-part.
Simple really, we have now created some sort of variable called "my_array".
This here is the meat of the array. What is going on is the variable my_array is being typed as an array, with 15 elements (0 is an element, plus the other 14) and all of the elements of that array are of the type of string.
What's an Element?
Think of an array as a bunch of boxes in a line. Each box can hold a value, we can then access this value by using its index reference (the "box number", if you will) and use it like a normal variable.
Great, how do I use arrays?
Quite simply really. After you have declared your array, you can then use any of its elements independently of each other as you could any other variable. Ex:
Pretty nifty eh? The number in the parentheses is the index for the element you want to access. Just be sure to call that element with an integer, nothing else, otherwise Turing will get angry.
Indexing your Array
In Turing, arrays are not automatically zero-indexed; that is, that any array's index does not by default start at zero, like it does in other languages such as C++. When you declare your array, you set the lower and upper bounds (in other words, you set the index for the first box and set the index for the last box, and all the boxes in between get created as well). That being said, the following is perfectly legal:
However, it is generally discouraged to do such a thing unless there is a good reason to do so. Generally, your arrays should start at either 0 or 1. Starting at 1 is intuitive, because when we talk about the first element in our array, we index it with the number 1. When we talk about the sixth element, we index it with the number 6. However, starting an array with a 0 index is also useful. It may seem counter-intuitive because we must use the index 4 to access the fifth element of the array. However, there are many times that starting from 0 makes the arithmetic we do easier.
Modify Problem 1 so that it uses an array instead of mass amounts of variables.
NOTE: If you're not sure about for loops, go here
Wow! How much faster was that to write? Don't arrays make your life so much easier?
Now I've done a couple of interesting things here. First, I've refrenced an element in an array through a variable integer (i in my for loops). Remember what I said above about referencing arrays, all you need was an integer? An integer variable is still an integer.
Second, I've used the function upper(). What it does is take an array for an argument (names in this case), and returns the upper bounds of the array that was passed to it. This function is accompanied (although I didn't use it) with lower(), which returns the lower bounds of an array.
Having a user defined upper bound
When declaring your array, you can substitute integer variables into your declaration of your lower and upper bounds. Eg:
There's no magic here. The value for UPPER_BOUNDS is substituted into the expression to make the variable, and we get what we expect: an array of booleans with lower bound 1 and upper bound 6.
Modify your program from Problem 2 so that, instead of requiring that the user always input exactly 15 names, the user is first prompted to enter how many names (s)he intends to enter.
Only a slight modification to the previous program; this time we ask for how many names the user's going to input, then we declare our array. Then we proceed like last time for the rest of it.
Passing Arrays as Arguments
So now you have learned the basics of arrays; you should feel fairly comfortable with them. But what if for some reason you have to pass an array as an argument to a function or procedure? Say, to sort data? You declare a parameter as normal, with it's type.
Captain James T. Kirk
However, this procedure, print_array, is highly specialized. It prints an array of strings of exactly 5 elements. Not only that, the lower bounds has to be 1 and the upper bounds must be 5. If we gave it a 5 element array of strings, starting with index 0, what would it do? It would output the last four names in the array and then give an error, because we are trying to access the element at index 5, but there is none.
We can generalize this somewhat. We can modify our procedure so that it takes an array of unknown length. To do this, when declaring the parameters to our procedure, we replace the upper bound (previously 5) with an asterisk (*).
Here's an important example: sorting. The algorithm we will look at is called an insertion sort. We will write a procedure that takes in an array and modifies it so that it is sorted. Here's the code (taken from zylum's sorting topic which can be found here):
The code is rather complex. Give it a good look and try to understand how it works, but don't sweat too much if you don't follow it. There's plenty of time to learn sorting algorithms later.
Returning an Array from a Function
What if you have an array and you want to apply a specific function on each element (for simplicity's sake we'll say we want to add 1 to each of the elements of the array)? One thing you could do is use a procedure and modify the contents of the array within, but that's not good coding practice. Instead, we should create a function that applies a function on each of the elements of the array, and returns a new array. However, there's a limitation in Turing that only allows you to return an array from a function that has known bounds at compile time. In other words, you can only return a static array from a function.
This kind of behaviour also shows us something else: an array is a first-class value, just like an integer or a string. We know this because we are able to pass arrays around as values and return them from functions just as we would strings or integers.
Just remember that when you return an array from a function, that you have to return a static array.
End of Section 1
|Author:||Silent Avenger [ Sat Dec 02, 2006 3:46 pm ]|
Very nice tutorial Freakman, good job!
|Author:||Clayton [ Mon Dec 11, 2006 9:47 pm ]|
Have the user input the names of 5 people, along with their nicknames.
Multidimensional arrays is, in essence, an array of an array.
Yes, an array of an array. Take our example from part 1 with the refrence of arrays being a line of boxes. A multidimensional array however (more specifically for now a two dimensional array), is more like a matrix (a grid with rows and columns). It is refrenced in the manner as if it did in fact have rows and columns.
Fancy eh? So a multidimensional array is in fact an array of an array. How would you call an element of such a beast you say? Easy!
Now, that way of calling an array is kind of laborious--who wants to use that many brackets? I know I don't. So Turing has a handy little feature built in. Regarde:
Not much of a change here, just a bit easier on the eyes. Just remember that the first bounds you declare is your "rows" and the second your "columns".
Modify problem 4 so that it uses 2D arrays instead of parallel arrays to get the users names and nicknames (assume they have 3 this time).
Much easier with a multidimensional array eh?
Just remember that an array can have as many dimensions as you want, the additions of dimensions are easy to add, if you have any questions please post them in [Turing Help] and I'm sure someone would be glad to answer your question.
|Author:||Cervantes [ Mon Dec 11, 2006 10:26 pm ]|
The examples you've given for multidimensional arrays are better suited to using a linear array of a record:
I think a more appropriate example would be creating a grid for a board game, or some such.
Also, if you're going to be introducing multidimensional arrays as arrays of arrays, why not explain that this is entirely legal because of the following argument:
An array is simply a list of values. An array is itself a value. Thus, an array can be a list of arrays. Then we get the following syntax:
The more usual syntax,
is simply syntactic sugar for the first one.
This is looking nice, though. How many parts are you planning?
|Author:||Clayton [ Mon Dec 11, 2006 10:34 pm ]|
I was kind of planning to have 3 parts (I can't remember what I was going to write part 3 on though , any suggestions?)
I was thinking about doing the grid for a game, but I didn't want to bring graphics into this, so I kept it at names (and yes, I realized records were the way to go when I was writing this too, but I figured for all intents and purposes, this would do for now,) but I guess I could change it, or make that part 3 for now.
I will also be adding in your points about why the 2D arrays are legal (with the argument you provided)
Thanks for the feedback!
|Author:||Cervantes [ Mon Dec 11, 2006 10:45 pm ]|
You don't need to introduce heavy graphics. Just do it in text!
In the Turing Walkthrough, we've learned about subroutines by the time we get to arrays, so we can write subroutines to keep the code for this fairly light and understandable. Not too tough for tutorial material.
Part 3 of arrays could be flexible arrays. You're welcome to clean up my tutorial, if you like.
|Author:||Clayton [ Mon Dec 11, 2006 10:46 pm ]|
Alright, I like the sound of that, that way, all the arrays are all here in one tutorial (although, you may have to rearrange the posts to get them together). Also, for text, could I just say, make a chessboard, then have them output the color numbers (just to makes sure that they have it) or something simple like that?
|Author:||Cervantes [ Mon Dec 11, 2006 11:08 pm ]|
That would be a very easy example. I'd suggest doing two examples. That could be the first. Perhaps the second could be the foundations of a tile-based video game like Zelda. Each element of the 2D array is an integer: 0 for grass, 1 for tree, 2 for rock... Then have an array of people; each need an x and y coordinate. Hmm, this is really needing records. Perhaps records should just go before arrays in the Turing Walkthrough. Thoughts? That's how it was taught in my course in Scheme: structures were taught before lists. It worked well.
|Author:||wtd [ Mon Dec 11, 2006 11:10 pm ]|
Sounds good to me.
|Author:||Clayton [ Mon Dec 11, 2006 11:11 pm ]|
I think that records really should go before arrays. The thing is, if ever you need more than a single dimension in your array, you should proably be using a record anyways. Then, we can work that into the tutorial too.
|Author:||[Gandalf] [ Tue Dec 12, 2006 9:18 pm ]|
Not really. Again, in the example of a grid, say you were making a chessboard... What makes more sense, a 8x8 grid of squares, or an array of 8 rows/columns which are 8 squares long?
Also, good work on the tutorial, but the colours could generally use some improvement, and some extra use of bold/larger font tags would be nifty. (What is it with these ugly rainbow coloured tutorials these days? People trying to mimic Cervantes', only doing it terribly wrong?)
*edit* Must've been tired when I wrote this... It's an 8x8 grid not 10x10.
|Author:||Cervantes [ Tue Dec 12, 2006 9:45 pm ]|
The thing is, if ever you need more than a single dimension in your array, you should proably be using a record anyways.
As Gandalf said, no. Take the chessboard. Using your advice, we would represent it as an array of a record:
which might as well be
which is essentially
because the record only has one field.
|Author:||Clayton [ Sat Dec 30, 2006 9:02 pm ]|
So you now know what an array is, how to work with it, and what they can do. But, did you know that arrays can become even better? Well, they can. The problem with the arrays we've been using is that they are static--they are always exaclty the same size. Why can this be a problem, you ask? Because you never really know how many names the user is going to input, or how many bullets are currently on the screen.
So when you get down to the nitty gritty: A flexible array is an array with a dynamic upper bound.
How do I use one?
Excellent question. The declaration of a flexible array is fairly straightforward:
Fairly simple right? You have your variable name, which is a flexible array, going from lower to upper (which are both integers) and are of the type typeSpec. Now something very interesting about flexible arrays is their ability to have an upper bound one less than the lower bound. Now this may not sound like it makes sense, but it does. Take the following example into account:
Now that may seem strange, but all it means is that currently there are zero elements in the array. That is as it should be as there is no information to fill that array yet.
Wait, you said I could change the upper bound!
And you can, using the new keyword. Simply reference your flexible array you want to change the upper bound of, along with the new upper bound you want. Take this for an example:
After running that, you will have created (at the start) an array called foo with zero elements. The next line will take the array foo, and reallocate it to have 5 elements in it instead. Simple really. You can make changes as much as you want to your upper bound, however, make sure that you never try to make the upper bound less than one below the lower bound.
Now how does this help me?
Flexible arrays are incredibly useful. Before long, you'll find yourself using them all the time. Here's an example. Take inputing information from a file; chances are, until now, you've had to know exactly how many lines were in your file so that you could read it into an array. You need not know that any more! Now you can use what you know of flexible arrays, the upper() function, and loops to solve your problems (if you are not sure about File I/O don't worry so much about the file stuff, just concentrate on the code itself).
Fairly simple here. We declare a flexible array to hold all of our file's contents. Then we open the file and enter the loop. If we're not at the end of the file, then we add a new element to our array (by taking the current upper bound and adding one to it; this is a common thing with flexible arrays) and get the file content for that line.
Another use for flexible arrays comes in shooting games, where you have to deal with multiple bullets. You have a record to hold all of the bullets' information (such as x and y coordinates, velocities, etc.), and create a flexible array of that type. Remember, any type, whether you create it, or it's predefined, can be assigned to your flexible array. So take a look at this example to see what I mean:
The next portion of this tutorial was written by Cervantes, as I believe he can explain it better than I can:
Multi-Dimensional Flexible Arrays
Turing will sometimes tell you, "Complex multi-dimensioned flexible array reallocation not implemented yet - sorry."
Bullsh*t! Contrary to popular belief, we can change the upper bounds of multi-dimensional flexible arrays. However, there are limitations.
I am working with Turing 4.0.5, so everything I say is valid only for this version. What I am about to say may not apply to your particular version of Turing. You'll have to try it out for yourself.
So, to change the upper bound of a multi-dimensional flexible array, it would seem that we first have to set the upper bounds of at least one of the dimensions to zero or leave one of the dimensions alone. Let's have some code:
This is all valid, Turing-loving code (again, for 4.0.5, not necessarily previous versions). Let's try leaving one dimension as it is while changing the other:
Part of it works. We can change the first dimension, but not the second, provided we leave the other dimension alone. If you must change both dimensions, there's a roundabout method:
There is a bit of a way around this, takes up a bit of space though. Just create a new array (doesn't have to be flexible) and store the orginal array in that then change the dimension to 0 and back to what you want before restoring the original array's values.
Now, let's try 3 dimensions.
This also works. So, let's get some rules going on here.
And so ends Part III of this tutorial. This section introduced the topic of arrays whose upper bound is flexible. This new power gives great potential. We're moving towards manipulating data that is dynamic, not static. Dynamic data is what we encounter in the real world.
End of Part III
|Author:||Clayton [ Sat Dec 30, 2006 10:32 pm ]|
Removing Elements from a Flexible Array
You know from Part III that you can simply remove the last element from an array by reallocating that array with the new keyword, and supplying a new upper bound that is lower than the current one. However, what if you don't want to remove the last element? What if you want to remove, oh say, the 3rd element in a 7 element array? The simplest way is to copy the value of the 7th (last) element into the 3rd position. It would look like this:
Note that you should only use this method if you do not care about the order of the information stored in the array, because it will get screwed up.
What if you do care about the order though? If you need to keep everything in alphabetical order, then you need to keep all the elements together properly. This one isn't so simple. We'll need to take all the elements above the third and move them down one position and then we can lower the upper bound. How might we do this? (Note I'm using integers for simplicities sake, but it is exaclty the same for strings and alphabetical order) Like so:
So lets walk through this to make sure you get it. The first part is straightforward: declare the flexible array foo and assign each of its seven elements the values of 1, 2, 3, ..., 7. Next I ask the user to input an element, between 1 and 7, to be removed. The last for loop is where everything goes down. We start at the element to be removed and work our way up to the second last element. In the for loop (on the first iteration), the element that we want to remove gets written over by the element above it. This happens all the way up to the second last element (so that the old last element becomes the second last element), and then we exit the for loop and shrink the array by one element, resulting in the deletion of an element.
Removing an Element from an Array Within a For Loop:
What if we are iterating through our array, checking for a condition when we need to remove an element? Take the example of a shooting game. We have a flexible array of bullets. In our main loop, we have to iterate through the array of bullets checking for collisions. If there is a collision, we have to remove that bullet from our array. So we're inside a for loop iterating from 1 up to the upper bounds of our array, and we find that we need to remove the second element. If we simply do what we did above, then after the second element is removed, our array will be one element smaller. However, there's a problem. The for loop will still iterate up to the old upper bounds, not the new upper bounds. This is because the for loop special form evaluates its "arguments" once, at the start of its execution. So if we remove an element from our array, the for loop will eventually reach the old upper bounds, which is one greater than the new upper bounds, so we will get an error.
There are several solutions to this problem. The easiest is simply to exit the for loop after an element has been removed. This is a bit unsatisfying, though, because it means that, at most, one element can be removed per iteration of the main loop, even if two or more elements should be removed. This may not be much of a problem, or it could be a huge problem.
A Flawed Attempt
Here's one possible solution. The idea here is that we keep track of the number of elements we are going to remove. This is stored in number_to_remove. As we iterate through our array, if we find an element we want to remove, shift the elements above it down one (as we have done above) and incriment number_to_remove. However, we don't shrink the array until after we exit this for loop. Here's the code:
This seems to work. However, what if instead of removing all multiples of 4, we want to remove all multiples of 4 and 5? Instead, we would use the following code:
There's two problems. The first is that 5 is in the list. That happened because when we removed 4, we shifted 5, 6, 7, ... down by one position. So now 5 is the fourth element. Our for loop has already tested the fourth element. It was 4. So now it moves onto the fifth element, which is now 6. We totally skipped 5! The second problem is that we somehow removed 14, 17, 18, and 19, but none of these are multiples of 4 or 5. To understand this, let's look at what our array looks like before we issued the new command:
Our Array wrote:
You'll also notice that 16 was not removed, for the same reasons that 5 was not removed. 16 is immediately after 15, which is a multiple of 5.
Now, see all the 20's at the end? It makes sense that they are there, right? When we shift elements down, we leave the top alone. As a result, the top element, 20, gets repeated. Our for loop iterates all the way up to the 20th element, however. Once it reaches the start of that block of 20's, it thinks it needs to remove that element. In reality, we don't care about that element, as it is going to be removed when we issue the new command. However, 20 is a multiple of both 4 and 5, so the condition of our if statement is true. This causes number_to_remove to be incrimented. This is why our array ended so much sooner that it should have: at 13 rather than 19.
As you can plainly see, there are a lot of problems with this approach. Describing this failed approach is not a waste of time, however. It's important to understand these pitfalls. They are not the kind of errors that are easily recognizable, and they take a fair amount of work to understand why the error is happening.
A Working Method
The second approach we shall try it to use another flexible array to store the indeces of the elements we want to remove. We'll see that this approach works. I'll present the code and disect it after.
One thing to notice is that I used 0 for a lower bounds to indeces_to_remove. It made things easier. If I had started it at 1, I would have had to subtract 1 in some places.
You'll also notice that I lied to you, a little bit. I said we were going to use an array to store the indeces of the elements we want to remove. I have, however, used an array to store the elements we want to remove with some additional arithmentic done. Each time I found an element (at position i) I wanted to remove, I added a new element to my indeces_to_remove array and shoved i - upper (indeces_to_remove in there. The reason for the subtraction will become apparent later.
The next step is to remove the elements that indeces_to_remove instructs us to. We do this in the usual way, by shifting elements down and removing the top element. Now let's think about what would have happened if I hadn't done that very important subtraction I mentioned in the above paragraph. The first element to be removed is 4, at index 4. I shift everything above 4 down by one position, so now my array looks like this: [1, 2, 3, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 20]. Then I remove the top element. Next, I am instructed to remove the fifth element. That corresponded to element 5 before, but now, look! The fifth element is 6! I remove it, getting my new array: [1, 2, 3, 5, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 20]. And again, I cut off the top element. Next I'm instructed to remove the eighth element, which originially corresponded to the value 8. Now, however, it corresponds to the value 10! I got lucky here, because 10 is actually something I want to remove. However, it's still a problem, because later I'll be asked to remove the tenth element.
So, the reason for the subtraction is to remedy this problem. Every time we remove an element like indeces_to_remove instructs us to, we shift the array by one. Thus, when we go to remove the second element that we are to remove, it's actually found at an index one lower than it was originally! When we go to remove the second element that we are to remove, it's actually found at an index two lower than it was originally! Thus, I've pre-planned for this otherwise disasterous complication by performing that subtraction.
Again, this may seem complicated, but understanding these quirks is dreadfully important. It's very painful to write a good program and have one flaw in it that takes hours to locate. The error we caught here is the sort of error that could cause much hair-pulling and teeth grinding.
You have now learned how to deal with dynamic data and how to handle, manipulate, and remove such data from your dynamic arrays. Keep these things in mind when you write code, and apply it as much as you can. The more dynamic you can make a program, the better it will be. It's always a great thing to write a generalized, dynamic program and see it handle all sorts of situations, some of which you might not have even considered deeply.
End of Part IV and Tutorial