Posted: Tue Dec 08, 2009 8:41 pm Post subject: looking for an introduction to compilers
just out of curiousity, could [OpenT] be made in c++, (is there a reason why its not) and when is the estimated release date??
Mod edit: Split out of OpenT discussion
Sponsor Sponsor
Tony
Posted: Tue Dec 08, 2009 8:47 pm Post subject: RE:The Project: OpenT
This could have been made in any language. Though Java offers tools for building compilers, and JVM is cross-platform, which is one of the project's goals.
Posted: Tue Dec 08, 2009 10:31 pm Post subject: RE:The Project: OpenT
alright, just wondering, im going to talk to my compsci teacher about putting together a little team of students to actually make a compiler, and maybe make it count as a course for us, cuz I know with the motivation of a credit my team would finish this by the end of the year... but can someone point me to a tutorial or something about the basics of making a compiler. And yes we'd of course share the code with all of you.
Tony
Posted: Tue Dec 08, 2009 10:57 pm Post subject: Re: RE:The Project: OpenT
mirhagk @ Tue Dec 08, 2009 10:31 pm wrote:
im going to talk to my compsci teacher about putting together a little team of students to actually make a compiler
For Turing?
Keep in mind that Compilers can be found as a 4th/grad level course at Universities. Yes, that's a group project course.
- You'd need language's grammar (I see that you saw Dan's post regarding that).
- You need to be able to parse that language; SLR(1)/LR(1)/LALR(1)
- Use the parser to build an Abstract Syntax Tree
- Then you start building a binary (or bytecode) from the AST.
Point is -- it's a very interesting project, but you need to be aware of the scope of work you are prepared to take on. It might be much more realistic to pick a smaller language (perhaps a subset of Turing), but even then, there's quite a bit of theory overhead to just get started.
Posted: Tue Dec 08, 2009 11:15 pm Post subject: RE:The Project: OpenT
alright but can you point me to a tutorial or something, anything to get me started... If i can make a semi-compiler that simply compiles a hello world program by next week then I can get my teacher to allow this course. And it probably wouldn't contain all of the language, i'd be happy if by the end of the course it did:
variables
loops
functions
procedures
input
output (even drawing a single pixel, cuz everything else could be created using procedures using that)
i've created a pseudo-compiler in turing that ran the previous code fragments so i know some of the theories of how to do it
Tony
Posted: Tue Dec 08, 2009 11:31 pm Post subject: RE:looking for an introduction to compilers
You'd need to consider both ends of the problem.
One one hand, you can start reading up on parser theory. Would LR(0) be sufficient for what you want to do, or would you have to build LR(1)?
On the other, you should decide to what the compiler will compile to. Machine code (what architecture)? Assembly? JVM/.Net? Some language that already has a compiler?
You might also want to pick up a copy of the Dragon Book.
Posted: Wed Dec 09, 2009 8:20 am Post subject: RE:looking for an introduction to compilers
I have an idea, it's probably pretty dumb but I think it could kinda work.
What if we built a Turing -> Machine Compiler in Turing. I could spend less time working with syntax, and more time actually working on the Optimization. Then once a semi-optimized version is complete, I'd pass the Compiler through the compiler giving us an .exe of the compiler to use on our turing programs.
Since it'd be done 100% in Turing, it could easily be upgraded and added to, and anyone could learn to understand it.
It's probably a dumb idea, but why not (other than lack of compiler tools, but I don't want to learn Java anyways)? I mean it wouldn't be decreased by Turing's speed since it'd be it's own compiler so we've basically got rid of the disadvantages of Turing.
I don't know, it's just an idea.... I either create that or a compiler in C++ and personally I'd rather see something that anyone can contribute to.
DemonWasp
Posted: Wed Dec 09, 2009 10:43 am Post subject: RE:looking for an introduction to compilers
Compiling your compiler with itself isn't an uncommon strategy, it just requires that you have an existing compiler for your code, which you do in Turing's case. However...
- You really don't seem to understand the complexity of parsing and lexing Turing (which has a fairly basic syntax compared to C++). Compiler tools are indispensable. My own experience with compilers comes from CS251, a 4-month course focused entirely on writing a compiler for a subset of C called WL. WL is a TINY language: we implemented loops, int variables and if statements, and that took 4 months of 2nd year university level CS.
- You seem to think that Turing or C++ is necessarily easier than Java. This isn't really the case. In my experience, writing code is easiest in Turing and hardest in C++, with Java very close to Turing. In debugging, however, there's no contest: Java trounces both competitors.
- Unless I'm very much mistaken, you seem to have no prior experience in x86, x64 or any other assembly language. These are surprisingly complex, and without excellent knowledge of these languages and their environment, you'll probably end up with a slow or incorrect (or both) result. This is why OpenT went with compiling to JVM, as the JVM uses a relatively small, simple set of instructions.
- Anyone can learn to read Java. It's really not that hard. The language isn't the prettiest, but it's similar enough to C's syntax that it should prove relatively readable. It also has some of the best IDEs in existence (see: Eclipse) to help handle the larger projects, like a compiler. Readability hinges more on design choices, comments and naming conventions than on language choice - though some languages, such as C++, are actively bad for readability.
- The disadvantages of Turing are far, far greater than you know. It's not the speed (which is a disaster already) so much as how restrictive the language is, and how poorly it implements basic concepts like OOP and function pointers. While it has some surprising library functionality, it also has a very small, largely useless set of libraries (make a list in Java? List<Whatever> = new ArrayList<Whatever>(); make a list in Turing...uhm...hm). Turing also entirely fails to implement method overloading, and makes a mockery of public / protected / private with its export / export var variation.
- Almost everyone here who would contribute to your project can write in Java or C++ anyway, so if that's your second choice, I'd go for that instead. Well, actually, if you're thinking C++ then I'd recommend you write in C instead, but that's probably just me loathing C++ for all of its massive failures.
Just be aware that you're biting off an awful lot of work building a compiler, particularly in Turing.
Edit: Clarification of last point.
Sponsor Sponsor
mirhagk
Posted: Wed Dec 09, 2009 11:06 am Post subject: RE:looking for an introduction to compilers
I don't have a lot of experience with assembly, but I have some experience and a huge array of books in front of me to help me do this. I have already made a Turing to Turing pseudo compiler (yeah pointless mostly but I did it to figure out how to do things like functions and loops)
I got that dragon book, and I'm working with it right now. I am certainly not thinking I'm going to completely create a full compiler. I'd be happy with loops procedures, variables (with simple manipulation) and put commands (don't know how to do that, but I'm going to pick up another book on x86 or some assembly language)
I'm going to be working all semester with a team of 3-4 people and I'd be satisfied if we got that far, but I'm learning about compilers before I even start that course (we're going to get it into a course for us)
apomb
Posted: Wed Dec 09, 2009 12:36 pm Post subject: RE:looking for an introduction to compilers
I'm curious as to what level of schooling you are in. Is this a university level research class?
Prabhakar Ragde
Posted: Wed Dec 09, 2009 2:02 pm Post subject: RE:looking for an introduction to compilers
May I suggest another reference: Appel's "Modern Compiler Implementation in X", where X is one of {Java, C, ML}. I'm personally partial to the ML one, but I know that's a minority view. --PR
(PS Of course the whole project is insane, but you may learn something despite that.)
mirhagk
Posted: Wed Dec 09, 2009 4:33 pm Post subject: RE:looking for an introduction to compilers
the group is now 6 people, and possibly going to be bigger. It's a mix of senior high school students, and we figure even if we dont finish it, we'll still learn alot.
oh and we're debating about one of several different things
compiler built in Turing vs. C++ (figure building in turing would allow for greater ease of modification, but c++ just seems more right for the job)
compile to machine vs assembly vs C++ (if we compile to C++ then we can use a C++ compiler to compile to an .exe file)
let me know your thoughts okay. The compiler is going to be a step by step compiler with the following steps:
turing to code tree
code tree debugger
debugged to pseudocode
pseudocode to optimized code
optimized code to assembly (or w/e)
assembly to optional optimizer
if there's anything I'm missing/ should change about the flow then let me know okay?
DemonWasp
Posted: Wed Dec 09, 2009 5:50 pm Post subject: RE:looking for an introduction to compilers
What do you mean by "code tree debugger"? You mean a step where you assess the program for simple syntax errors and variable scope?
Also, if you compile to C(++) code, how exactly are you going to report any compiler errors / linker errors generated by C? Turing's reporting of errors is really really nice compared to typical C / C++.
Compiling to "assembly" is a single translation step away from compiling to machine code, so it would probably be better to go to C(++) or machine.
Why the pseudocode step? What does that accomplish (given as you already have an AST)?
Tony
Posted: Wed Dec 09, 2009 5:59 pm Post subject: Re: RE:looking for an introduction to compilers
mirhagk @ Wed Dec 09, 2009 4:33 pm wrote:
if there's anything I ... should change
Don't bother with any optimization. Seriously, you can optimize it after you'll have everything working.
Also, think about how you are going to test this. When you make a change to how AST is generated, or *gasp* optimized... how would you make sure that all the valid code is still compiled correctly?
Posted: Wed Dec 09, 2009 6:07 pm Post subject: RE:looking for an introduction to compilers
If you guys are actually serious about doing this, then I hope you stick to Java as the language. It's close to C++ so it's not too much to learn and it would be good for the people with non-Windows operating systems. But if not, please keep it open-source so that it can be ported/compiled for other platforms.