Computer Science Canada

How much operator overloading should be done?

Author:  DtY [ Thu Dec 23, 2010 12:29 pm ]
Post subject:  How much operator overloading should be done?

There's a lot of debate on whether or not operator overloading should be available in languages, but I think it would be more meaningful to establish how much operator overloading should be done.

The case against operator overloading is that operators are often overloaded to do something the operator was never meant to mean. For example, what does it mean to add two strings? This might seem obvious to anyone who has used a language that supports operator overloading (and for some reason Java allows it for Strings), but in reality concatenating strings is very different from arithmetic addition.

But does that mean we should not be allowed to overload the addition operator? I say no, because there are cases where you would want to use the addition operator on your own class and it really would behave like arithmetic addition: arbitrary precision integer classes, a rational number class, a complex number class. In all of these, addition does mean addition.
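As a minimal sketch of the "addition really is addition" case, here is a small rational number class in Python. The class name Rational and its fields are made up for illustration (Python already ships fractions.Fraction for real use); the point is only that + keeps its arithmetic meaning.

```python
from math import gcd


class Rational:
    """Hypothetical rational number type; the name and fields are illustrative."""

    def __init__(self, num, den=1):
        if den == 0:
            raise ZeroDivisionError("denominator must be non-zero")
        if den < 0:  # keep the sign in the numerator
            num, den = -num, -den
        g = gcd(num, den)
        self.num, self.den = num // g, den // g

    def __add__(self, other):
        # a/b + c/d = (a*d + c*b) / (b*d), the ordinary arithmetic rule
        return Rational(self.num * other.den + other.num * self.den,
                        self.den * other.den)

    def __eq__(self, other):
        return (self.num, self.den) == (other.num, other.den)

    def __repr__(self):
        return f"Rational({self.num}, {self.den})"


half, third, sixth = Rational(1, 2), Rational(1, 3), Rational(1, 6)
total = half + third + sixth   # 1/2 + 1/3 + 1/6
```

Here the overloaded + is still commutative and associative, which is what a reader of `half + third + sixth` would assume.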

And then there's the equality operator. You will likely want to test if two objects of any class are equal (and how often will you want to check if they point to the same place?). It's not a stretch to say that "hello" is equal to "hello" but is not equal to "world", where it would be to say that "hello"+"world" is "helloworld". But then, we get into more problems with the inequality operators (not equal to is obvious, of course). Is "hello" more than "world"? Most languages with operator overloading will compare the characters' ordinal values one by one until they find a pair that's not equal. And how useful is this really? It's not even guaranteed to be the same across platforms.
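For a concrete look at that ordinal-by-ordinal comparison, here is how it plays out in Python 3, used here only as one example language. Python 3 defines string ordering on Unicode code points, so in that language the result does not vary by platform, but it is still not dictionary order.

```python
# Equality compares contents, which is the uncontroversial part.
same = "hello" == "hello"
different = "hello" == "world"

# Ordering compares code points at the first position where the strings differ.
before = "hello" < "world"      # ord("h") is 104, ord("w") is 119
surprise = "Zebra" < "apple"    # ord("Z") is 90, ord("a") is 97
```

The last line is the kind of result that makes string ordering feel less natural than string equality: every uppercase ASCII letter sorts before every lowercase one.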

I'm actually okay with overloading the arithmetic operators. I think it's pretty natural that adding two strings is the same as concatenating them, but this may be because my first language is Python. I can't say what I'd think if my first language was Perl or PHP, which have a different string concatenation operator. So maybe we can decide that + is actually the addition and concatenation operator, and then we can use it on all sorts of objects.

And then we have unordered types such as sets and hashmaps (dictionaries). What does it mean to add two sets together? Maybe this is just weird because in maths we made up new symbols for union and intersection. What we want to do is find all the objects that are in a OR in b, and this sounds like the or operator. Want everything that is in a AND in b? That's the and operator. But is this really natural? The boolean and and or operators act a lot differently than the set union and intersection operators. It might just seem natural to me because I've never used a language that has its own set union and intersection operators.
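To make the difference concrete, here is a short Python sketch. Python's built-in sets use the bitwise operators | and & for union and intersection, while the boolean keywords `and` and `or` do something else entirely with the same operands.

```python
a = {1, 2, 3}
b = {3, 4}

union = a | b          # everything in a OR in b
intersection = a & b   # everything in a AND in b

# The boolean keywords just return one of their operands unchanged.
boolean_and = a and b  # b, because a is non-empty and therefore truthy
boolean_or = a or b    # a, for the same reason
```

So the "or means union" reading only works for the bitwise spelling, not for the boolean one.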

But these operators are not the reason I made this topic; I think all the aforementioned operators are natural enough that they should be used like that. There are a few more, though, that I'm not so sure about.

The biggest culprits here are the bit shift operators (<< and >>). C++ famously overloaded these so that they are now both the bit shift and the stream extraction/insertion operators. Wait, what? Those aren't even similar. So why is << the stream insertion operator? Because it looks like that's what it should do.

code:
std::cout << "Hello world";


So you're putting "Hello world" into stdout. Looks like that's what it does. But, how much sense does it make to overload operators because it looks like it could do that? Say I do this with strings "hello" & "world", and it returns "helloworld". Does that make sense? We read that in English as "hello and world", so it makes a natural concatenation operator. Or this "hello"! (where ! is normally the factorial operator), this will give me "HELLO", because the exclamation mark makes it look like it should be an exclamation and so does making it all capitals.

And now, Ruby (and C++ might even do this) goes one step further and uses the left bit shift operator to append items to a list. I hated this at first, but lately I've found myself using it, because it's convenient. The reason that this use is so wrong to me is that the bitshift operator is a pure operator: it does not change the value of either operand. But if you shift a "hello" bits to the left, you've suddenly added a new item to the array a.
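A rough Python imitation shows the contrast. The Bag class below is hypothetical and only mimics the append-in-place behaviour of Ruby's Array#<<; Python's own lists do not define <<.

```python
class Bag:
    """Hypothetical list wrapper; << appends in place, roughly like Ruby's Array#<<."""

    def __init__(self):
        self.items = []

    def __lshift__(self, item):
        self.items.append(item)  # mutates the left operand
        return self              # returning self allows chaining: bag << 1 << 2


n = 1
shifted = n << 3             # pure: n is still 1 afterwards

bag = Bag()
bag << "hello" << "world"    # no assignment anywhere, yet bag has changed
```

On integers the expression statement `n << 3` would be a no-op; on the Bag it is the whole point of the line.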

And then, Ruby takes it even further by overloading the array element operator (I'm sure that name's wrong) on the Proc (procedure) class to call it. If you do myproc["hello", "world"] it looks like you're accessing elements "hello" through "world" of the array myproc, but you're actually calling a procedure with the arguments "hello" and "world".
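The same trick can be sketched in Python by overloading __getitem__. The Proc class here is a made-up stand-in, not Ruby's actual implementation.

```python
class Proc:
    """Hypothetical wrapper that makes square brackets call the wrapped function."""

    def __init__(self, func):
        self.func = func

    def __getitem__(self, args):
        # myproc["a", "b"] passes the tuple ("a", "b"); a single argument arrives bare
        if not isinstance(args, tuple):
            args = (args,)
        return self.func(*args)


myproc = Proc(lambda first, second: first + " " + second)
result = myproc["hello", "world"]   # looks like indexing, is actually a call
```

Nothing at the call site tells the reader that arbitrary code runs, which is exactly the complaint.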

---

It seems that so far the only response from object oriented languages is to completely remove operator overloading. This is an awful solution because you end up doing stuff like if (a.equals(b.add(c))), and if I wanted to do that I'd use Lisp.

So how much should we stretch the meaning of an operator?

Author:  wtd [ Thu Dec 23, 2010 2:46 pm ]
Post subject:  Re: How much operator overloading should be done?

DtY @ Fri Dec 24, 2010 1:29 am wrote:
It seems that so far the only response from object oriented languages is to completely remove operator overloading.


Hardly. Ruby is a thoroughly object-oriented language, and it supports operator overloading all over the place, as you noted. Java supports operator overloading as well; it just does so only for special cases. Consider strings and the addition operator.

Author:  DemonWasp [ Thu Dec 23, 2010 4:14 pm ]
Post subject:  RE:How much operator overloading should be done?

Operator overloading is syntactic sugar that's beautiful when it works and awful when it fails. The most obvious applications of operator overloading are non-primitive algebraic objects: vectors, matrices, polynomials.

Polynomials are well-behaved: it makes sense to define the standard add/subtract/multiply/divide as well as the +=, -=, *= and /= operators. It makes a reasonable amount of sense to overload the array-access operator, [], to get coefficients for the given exponent. You could even make an argument for using the >> and << operators to shift coefficients left or right (5x^2 >> 1 becomes 5x, for example), though I wouldn't.

Vectors are pretty well-behaved: you can define +, - and +=, -=. You run into problems when you define * or *= (dot product? cross product?) and / or /= (how do you divide vectors? divide individual components?). There's the bigger problem, though, of handling cases where the vectors have different sizes or bases (one in R^2, one in C^3?), though this can be alleviated with templates.

Matrices introduce the complication that they have two dimensions, and that there are at least 3 different ways to multiply two matrices (though, admittedly, only one really common one). Again, you can handle this with templates to some extent but you'll rapidly find that the syntax sugar provided is overrun by the extra work templates require, not to mention your poor compile times.


One of the worst aspects of operator overloading, in my opinion, is the fact that what appear to be primitive operations (addition, subtraction) are now implemented in methods, which has three major downsides:

First, your method for addition could throw an exception that proves difficult to hunt down because it doesn't look like the actual source line could even throw an exception.
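A small Python sketch of that first downside, using a hypothetical Vector class that rejects mismatched lengths:

```python
class Vector:
    """Hypothetical fixed-length vector; not taken from any library."""

    def __init__(self, *components):
        self.components = components

    def __add__(self, other):
        if len(self.components) != len(other.components):
            raise ValueError("cannot add vectors of different lengths")
        return Vector(*(a + b for a, b in zip(self.components, other.components)))


u = Vector(1, 2)
v = Vector(3, 4, 5)

try:
    w = u + v          # nothing on this line looks like it can fail
except ValueError as err:
    message = str(err)
```

In a stack trace the failing frame is __add__, but the line a reader will stare at is the innocent-looking `u + v`.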

Second, this can imply some substantial overhead to operations that seem like they should be trivial. Suppose you have an Iterator that defines operator++ to be a call to next(). This lets you have the nice idiom:
code:
for ( Entry e = NULL; ( e = iterator.getEntry() ) != NULL; iterator++ ) { ... }

in languages that lack a for-each construct. However, there's a problem: because of the semantics of the postfix ++ operator, the above implies a copy of the iterator object. The correct way is then:
code:
for ( Entry e = NULL; ( e = iterator.getEntry() ) != NULL; ++iterator ) { ... }

...and although that's the form I prefer anyway, it's non-obvious and easy to overlook.

Third, you now have to be extremely careful that you know how and whether the semantics of your operators compare to the semantics of those operators on primitives. An example:

code:
A + B + C


If A, B, C are ints, then this is obviously the sum of A, B and C. You have associativity ( ( A + B ) + C == A + ( B + C ) ) and commutativity ( A + B == B + A ).

If A, B and C are lists, then you could probably suppose that it's a list containing [ (elements of A), (elements of B), (elements of C) ]. Note that this preserves associativity. However, note that we no longer have commutativity, as ( A + B != B + A ).
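Python lists happen to overload + as concatenation, so the claim can be checked directly:

```python
A, B, C = [1, 2], [3], [4, 5]

left_grouped = (A + B) + C
right_grouped = A + (B + C)
associative = left_grouped == right_grouped   # grouping does not matter

commutative = (A + B) == (B + A)              # [1, 2, 3] versus [3, 1, 2]
```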

Consider an example from a real language. Python allows you to take "hello" * 3 and get "hellohellohello", an overload I'll call the String*int operator. Now, distributivity cannot apply:
code:
( "hello" + "world" ) * 3

will result in "helloworldhelloworldhelloworld", which is different from the distributed version: "hellohellohelloworldworldworld".
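Spelled out as runnable Python, next to the integer case where distributivity does hold:

```python
combined = ("hello" + "world") * 3
distributed = "hello" * 3 + "world" * 3

# With ints the two forms agree, which is the contract being assumed.
int_combined = (2 + 5) * 3
int_distributed = 2 * 3 + 5 * 3
```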


Maybe these particular examples aren't a huge issue to you, but they're a violation of the contract of the + and * signs and I have a problem with code that does that. That said, sometimes I'd really like my Vector3f classes to implement +, -, *, +=, -= and so forth.


TL;DR: Operator overloading violates assumed contracts (associativity, distributivity, commutativity) and is basically a lie that can come back to haunt you. When used properly, it's quite nice, but the applications are rare. Use sparingly.

Author:  DtY [ Thu Dec 23, 2010 11:26 pm ]
Post subject:  RE:How much operator overloading should be done?

wtd: I meant the response from OOP languages that didn't like operator overloading. And speaking of those special cases in Java; how did those happen? Why is it that you're allowed to add strings, but not compare them?

DemonWasp: Those are also good examples against overloading, I'd not thought about any of those.

Author:  TheGuardian001 [ Fri Dec 24, 2010 12:26 am ]
Post subject:  Re: How much operator overloading should be done?

You can't use the == operator because Strings are objects. Well, technically you could use ==, however it will almost always be false.

== looks for equality in the Objects, while String.equals() looks for equality in the contents. For example:
code:

String s1 = "Hello";
String s2 = new String(s1);

s1 and s2 both hold the value "Hello". However since s2 was declared as a new String object, it will return false when compared to s1 with ==. Even though their contents are the same, they refer to distinct objects.

Unless s2 refers to the very same object as s1 (for example after s2 = s1, or when both come from the same string literal, which Java interns), the == operator returns false.

To sum that up:
s1 == s2:
reference of s1 == reference of s2

s1.equals(s2):
value indicated by reference of s1 == value indicated by reference of s2
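The same two kinds of comparison exist in Python under different spellings, which may help as an analogy: `is` compares references and `==` compares contents. This is only a parallel, not Java code, and it uses lists rather than strings to keep interning out of the picture.

```python
s1 = ["H", "e", "l", "l", "o"]
s2 = list(s1)   # a new object with the same contents
s3 = s1         # a second name for the same object

same_reference = s1 is s2   # compares references, like Java's == on objects
same_value = s1 == s2       # compares contents, like Java's equals()
aliased = s1 is s3          # both names refer to one object
```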

Author:  wtd [ Fri Dec 24, 2010 12:38 am ]
Post subject:  RE:How much operator overloading should be done?

Addition of strings is just a special case. No good explanation can exist. Apparently someone thought it was critically important that this one operator should be overloaded, and that such a case would never again ever exist.

Author:  DemonWasp [ Fri Dec 24, 2010 9:18 am ]
Post subject:  RE:How much operator overloading should be done?

The rationale probably went along the lines of "we have to make the language easy to get into somehow".

On the other hand, I'm inclined to think that operator overloading, made selectively available to a very few types, is a good thing. First, this allows you the flexibility of operator overloading when dealing with extremely common problems (everyone concatenates strings every day; it needs to be simple, preferably trivial). Second, this prevents idiots from introducing operator overloading to your codebase. While I'd prefer to just not work with idiots, I find that a language that "gently" "suggests" you not do things the wrong way can be helpful. Not that there isn't plenty of Java done the wrong way, but at least nobody's subclassed List and overloaded the division operator yet.

I'm inclined to say that this isn't uncommon, either. There are plenty of languages that allow easy string concatenation without operator overloading.

Notably, this isn't a weakness in the JVM, this is just a choice made by the Java language specification people. The JVM also hosts Groovy, which can apparently handle operator overloading.

Author:  OneOffDriveByPoster [ Sun Dec 26, 2010 11:56 am ]
Post subject:  Re: How much operator overloading should be done?

So my question here is, what is better? Letting people make mistakes but empowering them to express themselves more freely, or not giving them the tools to make the mistakes in the first place.

Author:  chrisbrown [ Sun Dec 26, 2010 12:50 pm ]
Post subject:  Re: How much operator overloading should be done?

OneOffDriveByPoster @ Sun Dec 26, 2010 11:56 am wrote:
So my question here is, what is better? Letting people make mistakes but empowering them to express themselves more freely, or not giving them the tools to make the mistakes in the first place.


Increased functionality and flexibility become bad things only through misuse. A shovel, an excavator, and some dynamite will all make holes in the ground, but only one of them will do it both cleanly and efficiently. However, I'd like to be able to use the others when the situation calls for them.

Author:  DemonWasp [ Sun Dec 26, 2010 2:38 pm ]
Post subject:  RE:How much operator overloading should be done?

I'm perfectly happy to live in languages without operator overloading. I don't have a problem with typing A.add ( B ) instead of A + B a few times to avoid the headache that would be induced when I have to use operator-overloaded code developed by other people.

Author:  DtY [ Sun Dec 26, 2010 7:08 pm ]
Post subject:  Re: How much operator overloading should be done?

TheGuardian001 @ Fri Dec 24, 2010 12:26 am wrote:
You can't use the == operator because Strings are objects. Well, technically you could use ==, however it will almost always be false.

== looks for equality in the Objects, while String.equals() looks for equality in the contents. For example:
code:

String s1 = "Hello";
String s2 = new String(s1);

s1 and s2 both hold the value "Hello". However since s2 was declared as a new String object, it will return false when compared to s1 with ==. Even though their contents are the same, they refer to distinct objects.

Unless s2 refers to the very same object as s1 (for example after s2 = s1, or when both come from the same string literal, which Java interns), the == operator returns false.

To sum that up:
s1 == s2:
reference of s1 == reference of s2

s1.equals(s2):
value indicated by reference of s1 == value indicated by reference of s2

I know why it doesn't work; it's just completely counterintuitive to allow + to concatenate strings but not == to compare them (by the same logic, the plus should do pointer arithmetic, if such a thing can be done in Java).

wtd @ Fri Dec 24, 2010 12:38 am wrote:
Addition of strings is just a special case. No good explanation can exist. Apparently someone thought it was critically important that operator should be overloaded, and that such a case would never again ever exist.

I was worried that would be the case. It's too bad they didn't extend it to classes for the various numeric types.

OneOffDriveByPoster @ Sun Dec 26, 2010 11:56 am wrote:
So my question here is, what is better? Letting people make mistakes but empowering them to express themselves more freely, or not giving them the tools to make the mistakes in the first place.

I'd prefer having them everywhere. They're definitely very convenient, and I don't think there should be any features in a language that are reserved only for the language designers.

Author:  wtd [ Sun Dec 26, 2010 10:41 pm ]
Post subject:  RE:How much operator overloading should be done?

General rule to work by: the most common cases should be the shortest.

How often do you compare strings for equality vs. comparing them for object equality?

Author:  mirhagk [ Mon Dec 27, 2010 12:15 am ]
Post subject:  RE:How much operator overloading should be done?

And it'd be easy to compare the actual pointers (at least in most languages). It'd be much less convenient to check if the values are the same, requiring a function call each time.

Author:  DemonWasp [ Mon Dec 27, 2010 12:52 pm ]
Post subject:  RE:How much operator overloading should be done?

At least it's consistent: equality is always determined by public boolean equals ( Object other ), never by ==. This means that collections and other classes using those objects can uniformly employ the .equals() method. If you had both approaches, those classes couldn't do that.
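For comparison, Python takes the opposite spelling but has the same uniformity: collections rely on one value-equality hook (__eq__, paired with __hash__), and identity is a separate `is` check. The Point class below is a hypothetical example, not part of the discussion's Java code.

```python
class Point:
    """Hypothetical value type; equality is defined by contents."""

    def __init__(self, x, y):
        self.x, self.y = x, y

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        # must agree with __eq__, just as Java's hashCode() must agree with equals()
        return hash((self.x, self.y))


p, q = Point(1, 2), Point(1, 2)
found_in_list = q in [p]   # membership falls back to __eq__ for distinct objects
unique = {p, q}            # sets deduplicate using __hash__ and __eq__
same_object = p is q       # identity is a different question
```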

Pointer comparisons are avoided in Java because Java avoids pointers in general. There are relatively few good reasons to do pointer arithmetic in the types of application Java is suited for, so this makes a lot of sense. In Java, however, == is an actual "pointer comparison" in the sense that it determines whether the objects are actually the same. This, combined with the above, makes it clear that for Objects, == means "is the same thing" and .equals() means "represents the same thing".

Author:  wtd [ Mon Dec 27, 2010 1:09 pm ]
Post subject:  RE:How much operator overloading should be done?

No one would dispute that, at least where objects are concerned, == and equals() are consistent. But... did they make the right choice, or did they make the more common case more awkward?

Author:  DemonWasp [ Tue Dec 28, 2010 6:57 pm ]
Post subject:  RE:How much operator overloading should be done?

The common case may have become more awkward, but that's never really been a concern for Java's engineers, or else they'd have had autoboxing/unboxing from version 1.0.

I'm actually in favour of the way it is now over vice-versa; my reasoning is that == is a primitive operation, while .equals() is obviously an object-oriented operation. If you reversed it, then == suddenly becomes an object operation and .isSameObject, or perhaps ===, is now a primitive operation. If anything, what they should do is extend the language so that another operator (perhaps ===) translates to the equals() method in the same way that certain cases translate to the toString() method. The problem with this is additional language complexity and additional confusion for newcomers, which I don't think is worth the improvement in the "common", but admittedly still rare, case of equals().

