Following the publication of the source code for Forces, a widely popular high school Computer Science project (which I wrote about before), its author raises the question of plagiarism in programming assignments.
“Do you think posting source code (especially in turing) is a bad idea because so many people may steal it and claim it as their own?”
The question implies theft for the purposes of academic dishonesty. A student could have a number of reasons to cheat:
And while plagiarism occurs in every kind of class, it is especially common in computer science for two reasons:
It is easy to copy because assigned programming projects tend to be very open-ended (“make a game”), and source code is far easier to find than an amateur English paper on the book your class has read. Code is found in Open Source projects, code is posted for peer review, and oftentimes people just like to show off. I share and read code to teach and learn.
The problem, as I see it, is that students don’t think that they will get caught. High schools probably need to implement a stricter policy; universities are very strict, and suspend students for repeat academic offenses.
Though what it comes down to is that those who cheat are often those who don’t understand the finer points of the art of programming. Having met such students, I found that they generally view code as a very impersonal, machine syntax. The common misconception is that there is just one “correct” way of programming a function, which leads some to believe that their own work would have come out identical to someone else’s, if they were to put in the effort. They are very mistaken.
A program’s source code carries as much unique style as your handwriting does. Visually, variable naming conventions and indentation structure are the most obvious signs. Code organization preferences are just as apparent: how does one package functions, or order variable declarations? Finer details include a preference for certain function calls over others, and the order in which lists are arranged.
For example, I always write conditions as if ( 1 == variable ) then, which is counter-intuitive and will almost always raise the question of “why?”. On the flip side, if I were to suddenly submit an assignment that is functionally identical but uses the reverse order, questions would be raised just as well.
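To make the idea of a comparison-order habit concrete, here is a minimal sketch of how it could be counted automatically. It is written in Python rather than Turing purely for illustration, and the function name comparison_order_counts is my own invention, not part of any grading tool. It only looks at the first two operands of each comparison, so treat it as a rough indicator rather than a verdict.

```python
import ast


def comparison_order_counts(source):
    """Count comparisons written literal-first (1 == x) versus name-first (x == 1).

    Hypothetical helper for illustration: it inspects only the left operand
    and the first right-hand operand of each comparison.
    """
    literal_first = 0
    name_first = 0
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Compare):
            continue
        left_is_literal = isinstance(node.left, ast.Constant)
        right_is_literal = isinstance(node.comparators[0], ast.Constant)
        if left_is_literal and not right_is_literal:
            literal_first += 1
        elif right_is_literal and not left_is_literal:
            name_first += 1
    return literal_first, name_first


if __name__ == "__main__":
    early_assignment = "if 1 == lives:\n    pass\nif 0 == score:\n    pass\n"
    final_project = "if lives == 1:\n    pass\nif score == 0:\n    pass\n"
    print(comparison_order_counts(early_assignment))
    print(comparison_order_counts(final_project))
```

A student whose small assignments are all literal-first and whose final project is all name-first has not necessarily cheated, but it is the kind of inconsistency that would prompt a closer look.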
There are many such code signatures, and if they are not consistent across all the little assignments and written tests leading up to the big final project, it is pretty easy to tell whether the work was done by the same person. A common way to test English papers for plagiarism is to perform a Google search on a unique-sounding sentence. Similarly, unique-looking code snippets can be identified very easily, and with the ever-expanding tools offered by Google (such as Code Search) it is so much easier to look at code similarities beyond variable names.
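Looking at similarities "beyond variable names" can be sketched in a few lines. The example below is an assumption on my part about how such a comparison might work, not a description of how Google's tools or any real plagiarism checker operate. It uses Python for illustration: every identifier is replaced with the placeholder ID, comments and layout are dropped, and the two resulting token sequences are compared. The names normalized_tokens and similarity are hypothetical.

```python
import difflib
import io
import keyword
import tokenize

# Token types that carry layout or commentary rather than program structure.
_SKIPPED = {
    tokenize.COMMENT,
    tokenize.NL,
    tokenize.NEWLINE,
    tokenize.INDENT,
    tokenize.DEDENT,
    tokenize.ENDMARKER,
}


def normalized_tokens(source):
    """Return the token strings of `source` with every identifier renamed to 'ID'."""
    tokens = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in _SKIPPED:
            continue
        if tok.type == tokenize.NAME and not keyword.iskeyword(tok.string):
            tokens.append("ID")  # variable, function, and builtin names all collapse
        else:
            tokens.append(tok.string)  # keywords, operators, and literals kept as-is
    return tokens


def similarity(source_a, source_b):
    """Ratio between 0.0 and 1.0 of how closely the normalized token streams match."""
    matcher = difflib.SequenceMatcher(
        None, normalized_tokens(source_a), normalized_tokens(source_b), autojunk=False
    )
    return matcher.ratio()


if __name__ == "__main__":
    original = "def area(width, height):\n    return width * height\n"
    renamed = "def f(a, b):\n    # totally my own work\n    return a * b\n"
    print(similarity(original, renamed))
```

Renaming every variable and adding a comment leaves the normalized token streams identical, which is exactly why a rename alone does not disguise copied code. A high score is still only a reason to ask questions, since short, simple functions will often look alike by coincidence.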
Ultimately it’s up to the teachers to make the call.
An excellent check would be to have students explain the code, not just through comments, but with a very quick verbal presentation. It should be pretty obvious what a student is capable of.
I would love to hear what kind of experience people have had with code plagiarism (be honest if you’ve ever cheated). Post a comment, and then take a look at an archive of such stories.