Computer Science Canada
An Introduction to Java: Second Edition
|Author:||wtd [ Thu Feb 07, 2008 4:59 pm ]|
|Post subject:||An Introduction to Java: Second Edition|
Note: this edition features revised code formatting and is locked. If you have Java questions, ask elsewhere. Thanks to all those who replied to the original and caught mistakes I missed.
What is Java?
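Java is really three things: a programming language, a virtual machine that runs compiled programs, and a standard library of ready-made classes.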
How do I get Java?
Head on over to http://java.sun.com and download the Java SE 5.0 Software Development Kit. This includes the compiler, which turns Java code you've written into bytecode, the machine code for the Java Virtual Machine. The Java Virtual Machine and standard library are also installed so you can run programs.
What else do I need?
You'll also need a plain text editor. On Windows, Notepad will work, but I'd recommend something like Textpad. Line numbers in particular help.
What does "Hello world!" look like?
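Here is a minimal sketch of the program, assuming the conventional form: a HelloWorld class whose static "main" method prints the greeting. The original listing isn't reproduced here, so treat the exact layout as an assumption.

```java
// Save this as HelloWorld.java
class HelloWorld {
    // Entry point: the Java Virtual Machine calls this method when the program starts.
    public static void main(String[] args) {
        System.out.println("Hello world!");
    }
}
```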
How do I compile and run this?
You need to save the code example in a file called "HelloWorld.java". Navigate to where this file is stored in a command prompt window (or terminal window, if you're not using Windows).
We then use the "javac" program to compile the code to something that can be run: javac HelloWorld.java
Once this completes, you can run it by using the "java" program: java HelloWorld
A bit of theory...
Now you know how to create a "Hello world!" program, compile it, and run it. On its own, this doesn't do you much good.
It's important to understand how programs in Java go together.
All executable code in Java is organized into methods. In our "Hello world!" example, "main" is a method. Methods themselves, however, are further organized into classes.
But... what is a class?
A class lays the groundwork for an object.
But... what is an object?
An object is a combination of data, and operations on that data. By "encapsulating" data within an object, and hiding it from the outside world, we can control the ways in which that data can be changed. Instead of the direct ability to manipulate that data, we provide only certain abilities, via methods.
Back to classes. A class describes the data an object is composed of, and provides the methods for it. Methods are typically "public" and thus accessible to the rest of the world, while data is typically "private", and thus hidden from the rest of the world.
Of course, it doesn't always make sense for methods to be directly attached to objects. In this case, we have "static" methods, like our "main" method. Static methods are associated with the class itself, rather than an object of that class.
The "main" method serves as the entry point for our program. It contains the actual code that is executed when the program is run. Since it doesn't return anything to the program, we say that it returns "void". In contrast, a method called "multiply" that takes two numbers and multiplies them might be specified as returning "int" (short for "integer").
A final note: the entry point always accepts an array of strings. This represents the arguments given to the program when it's run from the command prompt.
Creating our own method
Understanding methods is critical, and you can't really understand methods until you write them yourself.
Instead of directly printing "Hello world!", let's have a method do that, and then call that method from the main method.
The new method is also static, because it's not associated with any particular object but rather just the HelloWorld class.
Parameters and arguments
Of course that example is still pretty boring. Our sayHelloWorld method can only ever do one thing.
Of course, this is as it should be. To modify the behavior of a method we need to give it a parameter. You can think of a parameter as being an input. Methods may in fact have numerous parameters. Let's give a sayHelloTo method a string parameter containing a name.
Now the string we actually gave to the sayHelloTo method when we called the method is called an "argument".
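To make the parameter/argument distinction concrete, here is a sketch of what sayHelloTo might look like. The exact greeting text is an assumption.

```java
class HelloWorld {
    public static void main(String[] args) {
        // "Bob" is the argument: the actual value handed to the method.
        sayHelloTo("Bob");
    }

    // "name" is the parameter: the input the method expects.
    static void sayHelloTo(String name) {
        System.out.println("Hello " + name + "!");
    }
}
```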
Getting a bit more selective with more parameters and "if" statements
At some point we need to be able to make choices based on parameters. Let's refine our sayHelloTo method so that it can offer more than one type of greeting.
Now that previous example looks a bit messy. Clearly, calling sayHelloTo with just the name "Bob" should print "Hello Bob!". But the method needs two arguments. We can't just call it with one.
Not yet, but it's not impossible. What we need to do is overload the method. That is, we provide another method with the same name, but which has a different number, or different types of parameters. The compiler can then simply choose the right one for the job, as necessary.
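A sketch of how the two versions could sit side by side. The second parameter here, a boolean called "formal", is purely an assumption for illustration, as is the wording of the formal greeting.

```java
class HelloWorld {
    public static void main(String[] args) {
        sayHelloTo("Bob");        // one argument: the compiler picks the overload
        sayHelloTo("Bob", true);  // two arguments: the compiler picks the original
    }

    // Two-parameter version: the "if" statement chooses the greeting.
    static void sayHelloTo(String name, boolean formal) {
        if (formal) {
            System.out.println("Good day, " + name + ".");
        } else {
            System.out.println("Hello " + name + "!");
        }
    }

    // Overload: same name, different number of parameters.
    static void sayHelloTo(String name) {
        sayHelloTo(name, false);
    }
}
```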
Methods that return useful values
So far, all of the methods we've seen have simply taken an action. As such, they've had a return type of "void". Of course, it's very possible to have methods that return values, which the program can then use in whatever way it wishes.
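For instance, a method can build the greeting and hand it back rather than printing it. The method name "greeting" is my own choice for this sketch.

```java
class HelloWorld {
    public static void main(String[] args) {
        // The returned string is used as the argument to println.
        System.out.println(greeting("Bob"));
    }

    // Return type is String instead of void.
    static String greeting(String name) {
        return "Hello " + name + "!";
    }
}
```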
A few quick notes between example-heavy sections.
You'll note that I use a capital letter at the beginning of class names, including "String", when I use that. This is a convention used by Java programmers. Learn to love it, because trying to be different will just make things harder. There are lots of other places to be creative.
Perhaps more importantly you'll notice that aside from "main", which is standard, all of my method names follow a specific convention. Methods that "do" something have verb names. Methods that return new values have noun names.
Also, method names begin with a lowercase letter.
Thus far when I've defined a new method, I've called it from another by simply writing the name of the method, with parentheses and any required arguments. This is fine, since it's all been within a single class. Java can assume that you qualified the method with the name of the class for static methods. We can be more explicit, though. The previous example can be changed to incorporate this more explicit notation.
Of course this is unnecessary in this case and thus silly. It is necessary, though, if we have another class. We can have multiple classes in a Java source file so long as only one is public.
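A sketch with two classes in one source file. The Greeter class and its method are hypothetical names made up for this illustration.

```java
class Greeter {
    static String greeting(String name) {
        return "Hello " + name + "!";
    }
}

class HelloWorld {
    public static void main(String[] args) {
        // Different class: the call must be qualified with the class name.
        System.out.println(Greeter.greeting("Bob"));
        // Same class: qualifying is allowed, but not required.
        HelloWorld.sayGoodbye();
    }

    static void sayGoodbye() {
        System.out.println("Goodbye!");
    }
}
```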
We've seen how static methods can be used, and how they can be, and sometimes must be, qualified with the name of the class they are associated with.
But there is still the question of how non-static methods work. They aren't associated with the class directly, but rather with an object of that class. We can create an object of a class easily enough.
We can assign that object to a variable.
But what does this mean for how non-static methods work?
Well, to understand that, we need to discuss a term called context. In the case of the static methods we've seen, that context is the class itself. We signify this in the code by qualifying the name of the methods with the name of the class. Or we can avoid qualifying names at all, so long as the two methods in question exist within the same context. When they exist in different contexts, though, we must qualify the method name.
For non-static methods, our context is something different. It's the object itself. Thus, even within the same class, static and non-static methods exist in different contexts.
Let's look at a simple example.
Clearly when we use a non-static method inside a static method, we need to qualify the method name, to explicitly state the context for that method.
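A sketch of the idea: main is static, so it has to create an object and call the non-static method through it. The variable name "hw" follows the text; the method itself is an assumption.

```java
class HelloWorld {
    public static void main(String[] args) {
        // Create an object of the class and assign it to a variable.
        HelloWorld hw = new HelloWorld();
        // The object "hw" is the context for the non-static method.
        System.out.println(hw.greeting("Bob"));
    }

    // No "static": this method belongs to each HelloWorld object.
    String greeting(String name) {
        return "Hello " + name + "!";
    }
}
```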
What happens if we turn this around and try to use a static method from within a non-static method?
Objects belong to classes. In the previous example "hw" is an object of the "HelloWorld" class. As a result, an object carries with it the context of its class. The practical result is that static methods can be called without qualification from within a non-static method. That is, unless there are static and non-static methods with the same name.
Calling One Non-static Method From Another and "this"
Within another non-static method, we can either allow the context to remain implicit, or we can use the special "this" variable, which refers to the current object, to explicitly state the proper context.
Why Have a Difference Between Static and Non-static?
From what we've seen thus far, there seems to be no point in having both static and non-static methods. It creates a confusing set of different contexts with different effects on how you can call other methods.
The need for this doesn't become clear until you consider data. We've dealt with data already. Strings, integers, floating point numbers... these are all pieces of data. Thus far we have not given them names, though.
To do so, we need to make use of variables. Variables are called that precisely because the value they hold can be changed. In Java variables have to be declared before they can be used. Declaring a variable is a simple matter of specifying the type of data and the name. Then you can assign a value to the variable.
Alternatively you can do both in one statement.
Variables in methods are simple. Consider a method which takes two components of a name and joins them into a single string, then prints the whole thing.
More complex is when we use variables outside of methods. Most commonly these are in the non-static context.
Classes with State
An object with only methods isn't terribly useful, since it will do the same thing every time, and might as well be a set of static methods.
Where objects do become useful is when we introduce data outside of methods. Let's consider a simple example building on the previous examples.
The variables outside of any method are available to any non-static method, such as fullName.
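A sketch of such a class. The constructor and the NameDemo class used to try it out are assumptions on my part.

```java
class Name {
    // State lives outside any method and is hidden from other classes.
    private String firstName;
    private String lastName;

    // The constructor sets up the initial state of the object.
    public Name(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }

    // Any non-static method can read the variables above.
    public String fullName() {
        return firstName + " " + lastName;
    }
}

class NameDemo {
    public static void main(String[] args) {
        Name bob = new Name("Bob", "Smith");
        System.out.println(bob.fullName());
    }
}
```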
So, why did I declare those variables "private"?
Well, if I try to do the following, an error will be produced.
Why is this a good thing?
We can make firstName accessible from outside the class by declaring it as "public".
However, now we're faced with another problem. We can not only read that variable from outside the class, but we can also write to it.
I don't think we ever meant to change Bob's name to "Jane", but there it is, and with that variable public, it's perfectly legal, and it'll compile just fine. Bob's gonna be ticked.
The original way we had this written was correct. Access to the variables themselves was only possible from within the class, and thus that data could only be changed in ways we specifically allowed. Specifically, in this case, there was no way to change the name once it was created.
This is very important, because it ensures that data will remain in a sensible state.
That said, what if I really want to be able to get the first and last names?
Well, then we have to write methods which can do that for us.
Now we have a way to read those pieces of data, but fortunately not to change them, since we have no way of tracking those changes.
What if we did have a way, though? What if we could change the name and keep track of the changes?
Let's not modify our existing class (much), since having a nice constant Name class could very well be handy. Instead, we'll create a new class called MutableName, which can change. We already have a good bit of the work done with the Name class, so what should we do?
Well, we could copy and paste it into a new file easily enough, but that's working too hard, and then we just have two completely unrelated classes.
Instead, let's create a new class which inherits from (or "extends") our existing class. This means it gets everything in Name for free, and can add extras as it sees fit. We also create a relationship between the two classes. MutableName, under this arrangement, will be seen as a Name by any other method. A MutableName will be able to go anywhere a Name can.
That's it. We recreated the constructor which sets up the initial state of the name, but instead of assigning the variables initial values ourselves, we simply called the parent (or "super") class' constructor to do it for us. Still, this doesn't do much for us. It only has the capabilities of the Name class.
Now, let's try to add a changeName method.
That seems pretty straightforward and reasonable, right? Yes, it does, but it also doesn't work.
When we inherit a class, the new "child" class does not have access to the "private" variables and methods in the parent class. As such, our MutableName class cannot access the firstName and lastName variables in the Name class.
Fortunately we don't have to choose exclusively between public and private. There is a third qualifier called "protected". A protected variable or method is hidden from unrelated code outside the class's package, but is accessible from child classes.
Thus we need to make a small change to the Name class.
And now the MutableName class from above will work just fine.
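Putting the pieces together, a sketch of both classes might look like this. I'm assuming changeName takes a Name object and that Name has the accessor methods written earlier.

```java
class Name {
    // "protected" lets child classes reach these variables.
    protected String firstName;
    protected String lastName;

    public Name(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }

    public String getFirstName() { return firstName; }
    public String getLastName() { return lastName; }

    public String fullName() {
        return firstName + " " + lastName;
    }
}

class MutableName extends Name {
    public MutableName(String firstName, String lastName) {
        // Let the parent ("super") class set up the initial state.
        super(firstName, lastName);
    }

    public void changeName(Name newName) {
        firstName = newName.getFirstName();
        lastName = newName.getLastName();
    }
}
```

Since a MutableName is a Name, it can be assigned to a Name variable and still reports the changed name.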
That's just peachy. We can change a name, but still we have no way of tracking those changes. We need something that can store the previous names. There will be more than one, so we'll want an array.
Arrays are used to hold multiple values of a given type and are signified relatively simply syntactically. We'll use an array of Names to hold the previous names. We'll also need to add a line to the constructor to initialize the array. Let's have room for 10 names.
Of course, we'll also want to keep track of how many Names we're currently storing.
And then, let's write a method for adding a name to the previousNames array.
Now we can modify changeName so it automatically backs up.
We'll probably also want an accessor method for getting to the previousNames array.
And of course we can also use this array to restore the previous name. We simply access that previous name in the array, then set our counter back by one.
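The steps above can be sketched as follows. Method names follow the text where it gives them; the guard in restorePreviousName and the accessor names on Name are assumptions.

```java
class Name {
    protected String firstName;
    protected String lastName;

    public Name(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }

    public String getFirstName() { return firstName; }
    public String getLastName() { return lastName; }
    public String fullName() { return firstName + " " + lastName; }
}

class MutableName extends Name {
    private Name[] previousNames;    // storage for earlier names
    private int previousNamesCount;  // how many slots are in use

    public MutableName(String firstName, String lastName) {
        super(firstName, lastName);
        previousNames = new Name[10];  // room for 10 names
        previousNamesCount = 0;
    }

    // Puts a name in the next free slot. Fails once all 10 slots are used.
    private void addNameToPrevious(Name name) {
        previousNames[previousNamesCount] = name;
        previousNamesCount++;
    }

    public void changeName(Name newName) {
        addNameToPrevious(new Name(firstName, lastName));  // back up first
        firstName = newName.getFirstName();
        lastName = newName.getLastName();
    }

    public Name[] getPreviousNames() {
        return previousNames;
    }

    public void restorePreviousName() {
        if (previousNamesCount > 0) {
            Name previous = previousNames[previousNamesCount - 1];
            firstName = previous.getFirstName();
            lastName = previous.getLastName();
            previousNamesCount--;  // set the counter back by one
        }
    }
}
```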
Now we can create a MutableName in a test program and change names, and we can even access those names. Doing anything with them could be very tedious if we had to access each name manually.
Fortunately, loops make this easy to do automatically.
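Every loop is made up of three components: an initial state, a condition that is tested to decide whether to keep going, and an update that moves the loop toward the point where that condition fails.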
The Java language's "for" loop provides a convenient syntactic form for these three components.
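A small sketch of a "for" loop walking over an array. I'm using plain strings rather than Name objects to keep it short, and the LoopDemo class is a made-up name.

```java
class LoopDemo {
    // Builds one line of text per name in the array.
    static String listNames(String[] names) {
        String result = "";
        // initial state; condition to keep going; update after each pass
        for (int i = 0; i < names.length; i++) {
            result = result + names[i] + "\n";
        }
        return result;
    }

    public static void main(String[] args) {
        String[] names = { "Bob Smith", "Robert Smith" };
        System.out.print(listNames(names));
    }
}
```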
Now, there's just one thing we're missing from the MutableName class. We need a way to figure out how many previous names have been saved. This means an added accessor method.
Now, we can print all of the previous names, as well as the current name.
Now here's a question: what happens if there are no previous names?
Well, we'd get "Previous names:" printed, followed by nothing, and that wouldn't look very professional.
How do we prevent this?
We need to be able to tell our program to only run certain code if a condition is met. Fortunately this is pretty easy.
Conditionals Part Two: Alternatives, or "The Elsening"
We have a program that prints an individual's current name, and if there are previous names, prints those. If it doesn't have previous names, it does nothing after printing the current name.
What if we want it to do something else? What if I want it to print "No previous names to display."?
That was easy enough, but something's still lacking. If there's only one name, it makes little sense to print "Previous names:". Fortunately we can create a conditional to deal with this.
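The three cases can be sketched with "if", "else if" and "else". The singular heading "Previous name:" and the helper's name are assumptions.

```java
class ConditionalDemo {
    // Chooses the line to print before the list of previous names.
    static String previousNamesHeader(int previousNamesCount) {
        if (previousNamesCount == 0) {
            return "No previous names to display.";
        } else if (previousNamesCount == 1) {
            return "Previous name:";
        } else {
            return "Previous names:";
        }
    }

    public static void main(String[] args) {
        System.out.println(previousNamesHeader(0));
        System.out.println(previousNamesHeader(1));
        System.out.println(previousNamesHeader(3));
    }
}
```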
Polymorphism and Working Smart
So far our uses of the System.out.println method have been quite mundane. We've simply fed it a string. However, System.out.println can also work directly with objects that are not strings. If it receives such an object as its argument, it will call that object's toString method.
We can, therefore, simplify our code by providing a toString method in the Name class.
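A sketch of a Name class with toString. The output format (first name, space, last name) is assumed to match what fullName produced earlier.

```java
class Name {
    protected String firstName;
    protected String lastName;

    public Name(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }

    // println and string concatenation both call this automatically.
    @Override
    public String toString() {
        return firstName + " " + lastName;
    }
}

class ToStringDemo {
    public static void main(String[] args) {
        Name bob = new Name("Bob", "Smith");
        System.out.println(bob);  // no explicit method call needed
    }
}
```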
Our Test class can now be as simple as the following.
A Goal and Some Math
When we print out the previous names and there is more than one, what if we decide to number them?
Well, we could easily see output like:
Notice the problem?
The numbers aren't lined up. To get them lined up, I'd need to insert spaces in front of the numbers as necessary. Of course, to know how many spaces are necessary, I have to first know how many spaces wide the largest number will be.
We can get that number with the getPreviousNamesCount method of any given MutableName object.
Let's write a general purpose static method in our Test class which finds the length of an integer when it's converted to a string.
Let's analyze what we know. We know the width of a number less than ten is one character.
And if it's not less than ten? Well, we can divide by ten and find the width of that number. That plus one will give us the width of the original number.
Here we see demonstrated a concept known as recursion. Recursion involves calling a method from itself to accomplish a looping behavior. As with the "for" loop, a condition has to be present to cease looping. In this case, when the input number is less than ten, the return value is simply one, and the method is not called again.
If we run this method on a number like 11243, we can see exactly how it would be evaluated.
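A sketch of the recursive method, with the evaluation of 11243 traced in the comments. It assumes the input is not negative; WidthDemo is a stand-in for the Test class.

```java
class WidthDemo {
    // Number of characters a non-negative integer takes up when printed.
    static int stringWidthOfInt(int number) {
        if (number < 10) {
            return 1;  // a single digit: stop recursing
        } else {
            return 1 + stringWidthOfInt(number / 10);  // drop the last digit
        }
    }

    // stringWidthOfInt(11243)
    //   = 1 + stringWidthOfInt(1124)
    //   = 1 + 1 + stringWidthOfInt(112)
    //   = 1 + 1 + 1 + stringWidthOfInt(11)
    //   = 1 + 1 + 1 + 1 + stringWidthOfInt(1)
    //   = 1 + 1 + 1 + 1 + 1
    //   = 5
    public static void main(String[] args) {
        System.out.println(stringWidthOfInt(11243));
    }
}
```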
So now we can determine how many spaces an integer will take up when it's printed.
This means we can determine how many spaces have to be added at the beginning of the line so that everything will line up correctly. We can use a for loop to accomplish this.
The Ternary Operator: Embrace the Ugly
In my previous example, I had a simple recursive method stringWidthOfInt.
That's a lot of lines for so simple a decision.
The ternary operator can make this a single line of code. Now, this operator can easily be abused, and should not be used without serious consideration, because it can easily make code unreadable.
However, it does get used, so it's essential that you understand it. It takes a boolean condition, and two values. If the boolean condition is true, it returns the first value. Otherwise it returns the second value.
A ? and : separate the three components of this operator.
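Applied to stringWidthOfInt, a sketch of the single-line version looks like this. As before, it assumes a non-negative input, and TernaryDemo is a made-up class name.

```java
class TernaryDemo {
    // condition ? value if true : value if false
    static int stringWidthOfInt(int number) {
        return number < 10 ? 1 : 1 + stringWidthOfInt(number / 10);
    }

    public static void main(String[] args) {
        System.out.println(stringWidthOfInt(11243));
    }
}
```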
How can this get ugly? Well, let's look at a quick sample from the interactive Ruby interpreter. Ruby also makes use of the ternary operator.
Without actually seeing the result, could you have easily predicted it by looking at that expression?
Certainly the following equivalent expression is easier to decipher.
Taking a few extra lines to express something isn't necessarily a sin.
Let's go back for a second, and look at how the MutableName class saves previous names.
It uses an array of ten Name objects. Each time it adds one name, it increments a count of the previous names stored in the array. The effect is that each new name goes onto the end of the array.
However, the array only stores at most ten previous names.
So what happens if I try to store eleven previous names?
An exception indicates that some exceptional event has taken place as the program runs. In this case we've tried to assign to the eleventh element of an array that only has room for ten elements. The "at" comments indicate where the exception took place in the code. In this case the exception took place in "addNameToPrevious", which was called from "changeName", which was called from "main".
The type of the exception object that was "thrown" is "ArrayIndexOutOfBoundsException".
Before we talk about how we can avoid this particular exception, let's look at how we can address it once it occurs.
First we start out by wrapping the offending code in a "try" block. We'll use a for loop to change a name 11 times.
Of course, this does nothing to deal with the exception that will occur. For that, we need a "catch" block.
Of course, more than one kind of exception may occur in a "try" block, and they may all require different actions. Fortunately we can match different exceptions to different types of exception objects.
The generic "catch" block will catch any exception not otherwise handled.
The end result of the above is that the exception is caught, an error message is printed, and then the program continues on as though nothing happened. We could also rethrow the exception, allowing it to be handled elsewhere.
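A stripped-down sketch of the same situation, using a plain array of strings instead of the MutableName class. OverflowDemo and storeNames are hypothetical names, and the messages are my own.

```java
class OverflowDemo {
    // Tries to store "attempts" names in an array with room for 10.
    // Returns how many were actually stored.
    static int storeNames(int attempts) {
        String[] previousNames = new String[10];
        int stored = 0;
        try {
            for (int i = 0; i < attempts; i++) {
                previousNames[i] = "Name " + i;  // throws when i reaches 10
                stored++;
            }
        } catch (ArrayIndexOutOfBoundsException e) {
            // Handles this one specific kind of exception.
            System.out.println("No room left for another name.");
        } catch (Exception e) {
            // The generic block catches anything not handled above.
            System.out.println("Something unexpected happened: " + e);
        }
        // Execution continues here whether or not an exception occurred.
        return stored;
    }

    public static void main(String[] args) {
        System.out.println(storeNames(11) + " names stored");
    }
}
```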
ArrayLists: Avoiding the Exception
The exception that occurs when we change names more than ten times only occurs because the array that's holding those names is limited to storing ten names.
We could create a bigger array in the constructor, but that would eventually overflow anyway.
We could resize the array when such an error occurs, and copy the old array into the new array.
That's tremendously tedious. Instead, let's simply store the previous names in something that can be as large as it needs to be. This would also mean that we don't have to keep track of the number of previous names.
The ArrayList class can provide this.
An ArrayList object is just that: an object. It is like any other object. There is no syntactic sugar as there is with arrays. There is one change, though. ArrayList is a parameterized type. Just as methods can have parameters, classes can have other classes as parameters. The effect of this is to make sure that an ArrayList can only hold one type of object.
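A sketch of MutableName rebuilt on an ArrayList of Name objects. The method names carry over from the array version; the details are assumptions rather than the original listing.

```java
import java.util.ArrayList;

class Name {
    protected String firstName;
    protected String lastName;

    public Name(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }

    public String getFirstName() { return firstName; }
    public String getLastName() { return lastName; }
    public String fullName() { return firstName + " " + lastName; }
}

class MutableName extends Name {
    // Grows as needed, and can only hold Name objects.
    private ArrayList<Name> previousNames = new ArrayList<Name>();

    public MutableName(String firstName, String lastName) {
        super(firstName, lastName);
    }

    public void changeName(Name newName) {
        previousNames.add(new Name(firstName, lastName));
        firstName = newName.getFirstName();
        lastName = newName.getLastName();
    }

    public void restorePreviousName() {
        if (!previousNames.isEmpty()) {
            // remove(index) hands back the element it removed.
            Name previous = previousNames.remove(previousNames.size() - 1);
            firstName = previous.getFirstName();
            lastName = previous.getLastName();
        }
    }

    // The list knows its own size, so no separate counter is needed.
    public int getPreviousNamesCount() {
        return previousNames.size();
    }
}
```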
So, enough with the talk, let's see some code. MutableName did look like:
Our Test class should now be altered to:
We've seen how we can use inheritance to extend the Name class into the MutableName class, where we can change the name and keep a record of previous names. Now, what if we want a FormalName class with room for a title as well as a first name and surname?
First, I'm going to rename Name to BasicName.
Looks pretty familiar. Now, let's extend it to have a FormalName class.
The next task is to modify both of these classes to create mutable versions of them. This would be easy enough, except for one problem. If I create MutableBasicName and MutableFormalName classes, I won't ever be able to specify MutableName as a type.
Java only allows a class to inherit from a single other class. MutableBasicName cannot extend both BasicName and a MutableName class. Being able to do that would establish an "is a" relationship, allowing us to classify both MutableBasicName and MutableFormalName as MutableName classes.
Fortunately, this kind of relationship is possible via an interface. Interfaces in Java do not specify any method implementations. They only specify methods the implementing class must have.
We can boil MutableName down to four common methods.
That is, any MutableName should be able to change the name to some other BasicName, restore the previous name, give you a list of previous names, and tell you how many previous names there are. How we accomplish this is up to the implementing class.
Now, to create a MutableBasicName.
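A sketch of the interface with its four methods, and one class that implements it. The method names are carried over from the earlier MutableName class; the rest is an assumption.

```java
import java.util.ArrayList;

class BasicName {
    protected String firstName;
    protected String lastName;

    public BasicName(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }

    public String getFirstName() { return firstName; }
    public String getLastName() { return lastName; }

    @Override
    public String toString() { return firstName + " " + lastName; }
}

// Only method signatures: what a mutable name can do, not how.
interface MutableName {
    void changeName(BasicName newName);
    void restorePreviousName();
    ArrayList<BasicName> getPreviousNames();
    int getPreviousNamesCount();
}

// Inherits from one class, and additionally "is a" MutableName.
class MutableBasicName extends BasicName implements MutableName {
    private ArrayList<BasicName> previousNames = new ArrayList<BasicName>();

    public MutableBasicName(String firstName, String lastName) {
        super(firstName, lastName);
    }

    public void changeName(BasicName newName) {
        previousNames.add(new BasicName(firstName, lastName));
        firstName = newName.getFirstName();
        lastName = newName.getLastName();
    }

    public void restorePreviousName() {
        if (!previousNames.isEmpty()) {
            BasicName previous = previousNames.remove(previousNames.size() - 1);
            firstName = previous.getFirstName();
            lastName = previous.getLastName();
        }
    }

    public ArrayList<BasicName> getPreviousNames() { return previousNames; }
    public int getPreviousNamesCount() { return previousNames.size(); }
}
```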
And of course we can do the same for MutableFormalName, but there's a complication. The MutableName interface specifies that we're dealing with BasicName objects. We can use FormalName objects because a FormalName is a BasicName.
Wherever a method expects a BasicName, we can give it a FormalName. However, when we do that, the FormalName will now be treated as a BasicName, without the added capabilities of a FormalName, such as the ability to access the title.
To deal with this, we need to add a few new ideas to our toolbox.
Casts allow us to change one type of data to another, related type of data. We can upcast, by turning a FormalName into a BasicName. This presents no great problem. A FormalName inherits all of a BasicName object's capabilities, and so it can function perfectly well as a BasicName.
Going the other way can be troublesome. A FormalName is a BasicName, but a BasicName need not be a FormalName. Trying to downcast a BasicName to a FormalName will not work unless the object being cast is actually a FormalName.
To test for this, we have the "instanceof" operator, which you'll see put to use in the following example.
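A small sketch of the check-then-cast pattern. The CastDemo class and its titleOf method are hypothetical, as is the order of the FormalName constructor's parameters.

```java
class BasicName {
    protected String firstName;
    protected String lastName;

    public BasicName(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }
}

class FormalName extends BasicName {
    protected String title;

    public FormalName(String title, String firstName, String lastName) {
        super(firstName, lastName);
        this.title = title;
    }

    public String getTitle() { return title; }
}

class CastDemo {
    // Accepts any BasicName, but only a FormalName has a title.
    static String titleOf(BasicName name) {
        if (name instanceof FormalName) {
            // Safe: we just checked what the object really is.
            FormalName formal = (FormalName) name;
            return formal.getTitle();
        }
        return "";  // a plain BasicName has no title
    }
}
```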
Interfaces provide us a way to establish an "is a" relationship between classes which only share certain important capabilities, rather than sharing a common lineage.
Let's look at one particular method from our previous example.
What happens if there aren't any previous names to restore?
Well, as it stands, nothing. We just silently continue to use the current name.
This is widely considered to be bad practice. We asked the object to restore the previous name, and it didn't do so. This is exceptional behavior. It's not what should happen. We should signal this in some way.
Fortunately Java provides us with exactly the tools we need to do so. Exceptions are objects which can be "thrown" to indicate that some unexpected thing has happened. There are many classes of exceptions. The class of the exception can go a long way toward telling us exactly what kind of exceptional behavior occurred.
As a result, it would behoove us to have our own NoPreviousNamesException class.
That was easy, wasn't it?
Now, we need to modify our restorePreviousName method to throw the exception.
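A sketch of the exception class and a method that throws it. To keep it short I'm using a hypothetical NameHistory class that stores names as plain strings, and I'm assuming the exception extends Exception.

```java
import java.util.ArrayList;

// Our own class of exception. The message text is an assumption.
class NoPreviousNamesException extends Exception {
    public NoPreviousNamesException() {
        super("There are no previous names to restore.");
    }
}

class NameHistory {
    private String current;
    private ArrayList<String> previousNames = new ArrayList<String>();

    public NameHistory(String initial) {
        current = initial;
    }

    public String getCurrent() { return current; }

    public void changeName(String newName) {
        previousNames.add(current);
        current = newName;
    }

    // "throws" warns callers that this method may signal a problem.
    public void restorePreviousName() throws NoPreviousNamesException {
        if (previousNames.isEmpty()) {
            throw new NoPreviousNamesException();
        }
        current = previousNames.remove(previousNames.size() - 1);
    }
}
```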
|Author:||wtd [ Thu Feb 07, 2008 5:04 pm ]|
|Post subject:||RE:Java Intro, in one post|
So we can throw an exception. Whoop-de-doo.
What does this do? Well, it interrupts the execution of the program entirely. Everything grinds to a screeching halt.
This is bad, right? Of course it is. This is why we want to provide code to deal with the exception. When we do this, we can fix things up, and let the program continue on its merry way.
So, let's see this in action.
This throws an exception and we never see the second output.
So let's try to restore the previous name, but catch a NoPreviousNamesException and handle it by doing nothing at all.
The "e" in the "catch" is the NoPreviousNamesException object that was thrown.
Of course, we could also effectively do the same thing by specifying no particular class of exception, and catching all exceptions.
This is, in fact, probably a lot nicer to look at. The problem is that it now catches any exception, including those I might not have foreseen. It's better to be specific about which class of exception we're actually handling.
Now, I'd mentioned that "e" was the exception object the restorePreviousName method threw. What if we wanted to print an error message, but still throw the exception so that some other level of exception handling could deal with it?
Knowing this, let's add a restoreOriginalName method to the MutableName interface, and implement it in MutableFormalName. This will skip back to the very first name used.
One approach would simply be to grab the first name from the previousNames ArrayList and use that to restore the name, then clear the list.
Of course, an exception may be thrown, so we need to keep it from wreaking havoc.
Of course, what if we just kept restoring the previous name until an exception was thrown? That would accomplish the goal too. In this way we can use exceptions as a form of control flow.
It should be noted, though, that while this is possible, it is not often done. The former version involves much less work. Nevertheless, the latter is possible, and any Java programmer should be prepared to run into it.
As a side note, the while loop performs a block of code repeatedly for as long as its condition is met. Since we gave it "true", it will loop forever. In this case it only stops when an exception is thrown, breaking the flow of execution.
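A sketch of the loop-until-exception approach, again on a hypothetical string-based NameHistory class rather than the real MutableFormalName.

```java
import java.util.ArrayList;

class NoPreviousNamesException extends Exception {
    public NoPreviousNamesException() {
        super("There are no previous names to restore.");
    }
}

class NameHistory {
    private String current;
    private ArrayList<String> previousNames = new ArrayList<String>();

    public NameHistory(String initial) { current = initial; }

    public String getCurrent() { return current; }

    public void changeName(String newName) {
        previousNames.add(current);
        current = newName;
    }

    public void restorePreviousName() throws NoPreviousNamesException {
        if (previousNames.isEmpty()) {
            throw new NoPreviousNamesException();
        }
        current = previousNames.remove(previousNames.size() - 1);
    }

    // Keeps stepping back until the exception breaks out of the loop.
    public void restoreOriginalName() {
        try {
            while (true) {
                restorePreviousName();
            }
        } catch (NoPreviousNamesException e) {
            // Nothing left to restore: we're back at the original name.
        }
    }
}
```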
At this point it's important to recap all of the code we've written so far.
One More Thing...
We've discussed output extensively thus far. Unfortunately, we haven't discussed input.
For output, we've used the "System.out" object. There is a corresponding "System.in" object, but it's relatively primitive, and provides little functionality. To gain useful functionality, we create a BufferedReader object from "System.in", using an InputStreamReader object as an intermediary.
The most common use of this is almost certainly reading a line the user has entered.
This string can then be used as we would use any other string.
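A sketch of reading one line and using it. The InputDemo class, the prompt, and the helper methods are assumptions for illustration. Note that readLine can throw an IOException, which has to be dealt with somewhere.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

class InputDemo {
    // Reads one line, or returns null if reading fails or input has ended.
    static String readLineFrom(BufferedReader reader) {
        try {
            return reader.readLine();
        } catch (IOException e) {
            return null;
        }
    }

    static String greeting(String name) {
        return "Hello " + name + "!";
    }

    public static void main(String[] args) {
        // System.in -> InputStreamReader -> BufferedReader
        BufferedReader keyboard =
            new BufferedReader(new InputStreamReader(System.in));
        System.out.print("What is your name? ");
        String name = readLineFrom(keyboard);
        System.out.println(greeting(name));
    }
}
```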
If we look at the MutableFormalName class, we can see a problem.
Even though we want to internally deal with FormalName objects, we have to implement it in terms of the BasicName object. Why is this?
The MutableName interface specifies that we must deal with the BasicName class. This includes the FormalName class, due to inheritance, so this approach mostly works. However, it means that when we want to change the name stored in a MutableFormalName object, we can give it a BasicName and that's perfectly fine.
It also means that we have to insert casts to turn BasicName objects into FormalName objects. The problem with this arises when a BasicName object really is a BasicName. In this case, it cannot be cast to a FormalName. A ClassCastException will be thrown.
What we need is a way to specify exactly which kind of name the MutableName interface is acting on. Generics provide this.
The generic type parameters reside between the < and >. By saying that NameType extends BasicName, we constrain the type used here to being either BasicName or some class which inherits BasicName (such as FormalName). We can then use NameType wherever we had previously used BasicName.
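A sketch of the generic interface and a MutableFormalName built on it. NameType comes from the text; the class bodies and the copyFrom helper are assumptions.

```java
import java.util.ArrayList;

class BasicName {
    protected String firstName;
    protected String lastName;

    public BasicName(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }

    public String getFirstName() { return firstName; }
    public String getLastName() { return lastName; }
}

class FormalName extends BasicName {
    protected String title;

    public FormalName(String title, String firstName, String lastName) {
        super(firstName, lastName);
        this.title = title;
    }

    public String getTitle() { return title; }
}

// NameType must be BasicName or something that inherits from it.
interface MutableName<NameType extends BasicName> {
    void changeName(NameType newName);
    void restorePreviousName();
    ArrayList<NameType> getPreviousNames();
    int getPreviousNamesCount();
}

class MutableFormalName extends FormalName implements MutableName<FormalName> {
    private ArrayList<FormalName> previousNames = new ArrayList<FormalName>();

    public MutableFormalName(String title, String firstName, String lastName) {
        super(title, firstName, lastName);
    }

    // No casts needed: the compiler knows these are FormalName objects.
    private void copyFrom(FormalName other) {
        title = other.getTitle();
        firstName = other.getFirstName();
        lastName = other.getLastName();
    }

    public void changeName(FormalName newName) {
        previousNames.add(new FormalName(title, firstName, lastName));
        copyFrom(newName);
    }

    public void restorePreviousName() {
        if (!previousNames.isEmpty()) {
            copyFrom(previousNames.remove(previousNames.size() - 1));
        }
    }

    public ArrayList<FormalName> getPreviousNames() { return previousNames; }
    public int getPreviousNamesCount() { return previousNames.size(); }
}
```

With this arrangement, handing changeName a plain BasicName is rejected by the compiler instead of failing when the program runs.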
The changes to the MutableFormalName class are subtle, but important. The most important is that all casts have been removed. When casts are removed, the opportunity for casts to fail is also removed. This is the great benefit of generics.
Anonymous Inner Classes
One of the last few things we saw was interfaces. As a recap, interfaces provide a means to specify what functionality related classes should implement. Two classes which are not directly related through inheritance can both implement an interface and be considered related.
Normally, to use an interface, we need a class which implements it. But there could be a huge number of classes which implement this interface, all in slightly different ways. How do we give them all names?
Fortunately we don't have to.
We can directly create an instance of an anonymous class which implements a particular interface.
Let's look at an example.
I have an array of ten names input by the user. For this purpose we'll use the simple BasicName class.
How do we sort these names? That's a good question, and I could provide a lengthy tutorial on sorting algorithms.
However, the essence of being a good Java programmer is learning how to write code that takes advantage of the libraries Java provides.
One of those classes Java provides is called Arrays. The Arrays class contains numerous static methods which perform useful operations on arrays. One such method is sort, which sorts an array. So, we pass the "names" array to Arrays.sort, and they get sorted.
But we want a very specific behavior that this method has no way to know about. We want to sort first by the last name, and then by the first name. A bit of research shows, however, that there is another sort method which takes a Comparator object as an argument. Further research shows that Comparator is an interface which specifies a "compare" method.
The compare method takes two arguments and does a comparison of them. If they're equal it returns zero. If the first argument is greater than the second, it returns a positive number. Otherwise, it returns a negative number. By creating an anonymous Comparator object which defines that method in such a way as to do the proper comparison of names, we can do our sort.
As you may have noticed, Comparator is also a generic interface.
Within the method we've defined, we use the compareTo method of the String class to compare the individual parts of a name. Only if the last names are equal do we compare the first names.
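A sketch of the sort with an anonymous Comparator. SortDemo and sortByLastThenFirst are made-up names, and the array here has three names rather than ten to keep it short.

```java
import java.util.Arrays;
import java.util.Comparator;

class BasicName {
    private String firstName;
    private String lastName;

    public BasicName(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }

    public String getFirstName() { return firstName; }
    public String getLastName() { return lastName; }
}

class SortDemo {
    static void sortByLastThenFirst(BasicName[] names) {
        // The class between the braces has no name of its own.
        Arrays.sort(names, new Comparator<BasicName>() {
            public int compare(BasicName a, BasicName b) {
                int byLast = a.getLastName().compareTo(b.getLastName());
                if (byLast != 0) {
                    return byLast;
                }
                // Last names are equal, so the first names decide.
                return a.getFirstName().compareTo(b.getFirstName());
            }
        });
    }

    public static void main(String[] args) {
        BasicName[] names = {
            new BasicName("Bob", "Smith"),
            new BasicName("Alice", "Smith"),
            new BasicName("Carol", "Jones")
        };
        sortByLastThenFirst(names);
        for (int i = 0; i < names.length; i++) {
            System.out.println(names[i].getLastName() + ", " + names[i].getFirstName());
        }
    }
}
```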
Creating the anonymous Comparator gave us a convenient way to have a custom behavior without having to actually write code to manage the sorting of the array.