So, I was watching a first-year Harvard lecture on C programming the other day. One of the assignments given to the students involved writing a program that takes a bmp file and a number n as input, and outputs the image multiplied in area by n to a bmp file. It was then that I realized that I had never done I/O on anything but text files. I didn't have a clue how other file types worked.
My challenge for compsci.ca users (primarily students, but other people too, if you want) this summer is to write a program that does something with a common file format. A bitmap is a good place to start, as it's a fairly basic format. Add a watermark, change the colors, be creative! After that, move on to more involved formats such as ZIP files or other compression formats, MS Word documents, etc. Maybe even design your own format for something.
This isn't a contest. It's a challenge. Everybody wins, unless you don't learn anything. Then only everyone who learned something wins.
Sounds fun, I might take a stab at it.
I haven't worked with anything other than text files either (besides my own encrypted "text" files for program saves).
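long i = (b[3] & 0xFFL) << 24 | (b[2] & 0xFFL) << 16 | (b[1] & 0xFFL) << 8 | (b[0] & 0xFFL);
The & 0xFFL masks are there because Java bytes are signed; without them, any byte of 0x80 or above gets sign-extended and corrupts the result. You may have to fiddle with the indices, depending on whether the source is big or little endian and whether or not I've screwed them up.
Alternatively, use DataInputStream, which you can extend to fill in the missing unsigned-int and unsigned-long logic (why they didn't include those is beyond me).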
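For anyone following along in Python instead, here is a minimal sketch of the same byte-assembly idea. The helper name `le_u32` and the sample bytes are made up for illustration; `int.from_bytes` is the standard-library shortcut that does the same job.

```python
def le_u32(b):
    """Assemble four bytes (least significant first) into an unsigned 32-bit int."""
    return b[0] | (b[1] << 8) | (b[2] << 16) | (b[3] << 24)

# Hypothetical sample: the value 0x000C0036 stored little-endian.
raw = bytes([0x36, 0x00, 0x0C, 0x00])

value = le_u32(raw)

# The built-in conversion agrees with the manual shifts.
assert value == int.from_bytes(raw, "little")

# Same four bytes read in the other order give a very different number.
big = int.from_bytes(raw, "big")

print(hex(value), hex(big))
```

Python ints are unbounded and indexing a bytes object already yields values from 0 to 255, so no masking is needed here.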
input-output section.
file-control.
    select optional indexing
        assign to "indexing.dat"
        organization is indexed
        access mode is dynamic
        record key is keyfield of indexing-record
        alternate record key is splitkey of indexing-record
            with duplicates
        .
*> ** OpenCOBOL does not yet support split keys **
*>      alternate record key is newkey
*>          source is first-part of indexing-record
*>                    last-part of indexing-record
*>          with duplicates
Learn some of this. If you ever want to work at a bank, insurance company, or government agency, understanding the above code can get you a great big resume++.
Cheers
Ah, I got it, thanks!
All the bytes were in reverse order
I suppose it's a good thing I never want to work for a bank, insurance company, or government agency, because that code is terrifyingly ugly. Not that these organizations generally do any better in any other language.
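That's endianness, this great computer science concept covering the order in which the bytes of a multi-byte value are stored. The computer doesn't care which order is used as long as the reader and the writer agree, so different architectures and file formats picked different ones. You never put the individual bits in the byte backward, though.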
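To make that concrete, here is a small sketch using Python's `struct` module to lay out one 32-bit value both ways. The value `0x12345678` is just an arbitrary example.

```python
import struct

n = 0x12345678

little = struct.pack("<I", n)  # least significant byte first
big = struct.pack(">I", n)     # most significant byte first

print(little.hex())  # 78563412
print(big.hex())     # 12345678

# Whole bytes swap places; the bits inside each byte stay put.
assert little == big[::-1]

# Reading with the wrong byte order gives a different number, not an error.
(wrong,) = struct.unpack(">I", little)
print(hex(wrong))  # 0x78563412
```

That last point is why a mismatched byte order tends to show up as nonsense values rather than a crash.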
I might be missing something, but is there any useful reason to use little endian?
ultimatebuster
Posted: Sat Apr 23, 2011 11:40 am Post subject: RE:Insectoid's Summer Programming Challenge
isn't this just non-text binary formats?
Insectoid
Posted: Sat Apr 23, 2011 11:52 am Post subject: RE:Insectoid's Summer Programming Challenge
Pretty much. But you have to deal with things you usually don't with text files. Most languages have high-level functions for dealing with text (fscanf and getchar in C, for example). With non-text formats you're limited to raw byte reading and writing* (fread and fwrite in C), and you have to do all the formatting yourself. Doing this also forces you to learn how files are laid out on disk, which isn't really taught in school until 2nd-year uni or later (at the earliest; I still haven't been taught it).
*Yes, I know libraries exist to read popular formats, but that would defeat the purpose of this challenge.
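As a rough illustration of "doing the formatting yourself", here is a sketch in Python that pulls a few fields out of a BMP header by hand. It assumes the common 40-byte BITMAPINFOHEADER variant with little-endian fields; the function name `parse_bmp_header` and the tiny 2x2 sample header are invented for the example, so check the layout against the format documentation before relying on it.

```python
import struct

def parse_bmp_header(data):
    """Extract a few fields from the first 30 bytes of a BMP file.

    Assumes the 40-byte BITMAPINFOHEADER variant; other header versions
    may lay out the fields after the first 14 bytes differently.
    """
    # 14-byte file header: magic, file size, two reserved words, pixel data offset.
    magic, file_size, _res1, _res2, pixel_offset = struct.unpack_from("<2sIHHI", data, 0)
    if magic != b"BM":
        raise ValueError("not a BMP file")
    # Start of the info header: its own size, width, height, planes, bits per pixel.
    dib_size, width, height, _planes, bpp = struct.unpack_from("<IiiHH", data, 14)
    return {
        "file_size": file_size,
        "pixel_offset": pixel_offset,
        "dib_size": dib_size,
        "width": width,
        "height": height,
        "bits_per_pixel": bpp,
    }

# Build a made-up header for a 2x2, 24-bit image so the sketch runs without a file.
sample = struct.pack("<2sIHHI", b"BM", 70, 0, 0, 54)
sample += struct.pack("<IiiHHIIiiII", 40, 2, 2, 1, 24, 0, 16, 2835, 2835, 0, 0)

info = parse_bmp_header(sample)
print(info)
```

With a real file, the same function could be fed the result of reading the first 30 bytes from a file opened in "rb" mode.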
ultimatebuster
Posted: Sat Apr 23, 2011 1:17 pm Post subject: RE:Insectoid's Summer Programming Challenge
I'm not sure if Python can write non-text formats via file.write.
Never mind, it can, though I think it has to be raw hex escapes or go through struct.
The following writes the bytes of a text string directly in binary mode.
code:
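# The b prefix makes this a bytes literal, which Python 3 requires in "wb" mode.
with open("test.dat", "wb") as f:
    f.write(b'\x75\x6C\x74\x69\x6D\x61\x74\x65\x62\x75\x73\x74\x65\x72')  # spells "ultimatebuster"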
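For the struct route, here is a minimal sketch that writes a small binary record and reads it back. The record layout (a 4-byte tag, an unsigned 16-bit version, a signed 32-bit score) is entirely hypothetical, and the file goes into a temporary directory so nothing in the working directory is touched.

```python
import os
import struct
import tempfile

# Hypothetical record layout: 4-byte tag, unsigned 16-bit version,
# signed 32-bit score, all little-endian with no padding.
RECORD = struct.Struct("<4sHi")

path = os.path.join(tempfile.mkdtemp(), "test.dat")

# Pack the values into raw bytes and write them out.
with open(path, "wb") as f:
    f.write(RECORD.pack(b"SAVE", 1, -42))

# Read exactly one record's worth of bytes and unpack them again.
with open(path, "rb") as f:
    tag, version, score = RECORD.unpack(f.read(RECORD.size))

print(tag, version, score)
```

The format string does the job of the hand-written hex escapes: it decides how many bytes each value takes and in which order they are stored.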