
-----------------------------------
Zeroth
Mon Jul 14, 2008 3:52 pm

Lexer/Parser experience?
-----------------------------------
I have some basic experience with constructing a lexer/parser; however, I wondered if there was anyone here with way more knowledge than I.

I'm planning to work on a project that requires a fully-working lexer/parser for Python. I've been looking at PyPy, and it's very capable... except not very clean, nor easy to understand.
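
For comparison, CPython's standard library exposes its own lexer and parser through the tokenize and ast modules. A minimal sketch, assuming those modules cover what the project needs (the sample source string is made up for illustration):

```python
import ast
import io
import tokenize

# Made-up one-line program to lex and parse.
source = "total = price * (1 + tax)\n"

# Lexing: generate_tokens takes a readline callable and yields TokenInfo tuples.
tokens = [
    (tokenize.tok_name[tok.type], tok.string)
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
]

# Parsing: ast.parse builds the abstract syntax tree for the same source.
tree = ast.parse(source)
assign = tree.body[0]

print(tokens[:3])
print(ast.dump(assign))
```

Whether this is cleaner than PyPy's parser for a given project is a judgment call; it only shows what is available out of the box.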

-----------------------------------
Tony
Mon Jul 14, 2008 4:37 pm

RE:Lexer/Parser experience?
-----------------------------------
Likely not "way more". I'm just finishing up with the "baby compilers" course here at UW (CS241) -- scanner / parser / compiler, for a really simple made up language.

-----------------------------------
rizzix
Mon Jul 14, 2008 8:19 pm

RE:Lexer/Parser experience?
-----------------------------------
Don't know about PyPy, but SableCC is what I use for [url=http://code.google.com/p/opent/]OpenT[/url], and it's pretty darn good.

Lately I've written a parser for a Lisp dialect with Haskell's Parsec, which can be quite a bit of fun.

Scala's parser combinators are also pretty cool, as demonstrated in this post: [url=http://compsci.ca/v3/viewtopic.php?p=161376#161376]here[/url].
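
To give a rough idea of the parser-combinator style, here is a minimal sketch in Python rather than Haskell or Scala. It is not Parsec's or Scala's API: every helper name (regex, fmap, choice, many, between, lexeme, parse_sexpr) is made up for this example. Each parser is a function from (text, pos) to (value, new_pos), or None on failure, and bigger parsers are built by combining smaller ones.

```python
import re

def regex(pattern):
    """Parser that matches a regular expression at the current position."""
    compiled = re.compile(pattern)
    def parse(text, pos):
        m = compiled.match(text, pos)
        return (m.group(0), m.end()) if m else None
    return parse

def fmap(fn, parser):
    """Apply fn to the value produced by parser."""
    def parse(text, pos):
        result = parser(text, pos)
        return None if result is None else (fn(result[0]), result[1])
    return parse

def choice(*parsers):
    """Try each parser in order and return the first success."""
    def parse(text, pos):
        for p in parsers:
            result = p(text, pos)
            if result is not None:
                return result
        return None
    return parse

def many(parser):
    """Apply parser zero or more times, collecting the values in a list."""
    def parse(text, pos):
        values = []
        while True:
            result = parser(text, pos)
            if result is None:
                return values, pos
            value, pos = result
            values.append(value)
    return parse

def between(open_p, body, close_p):
    """Parse open_p, body, close_p in sequence and keep only body's value."""
    def parse(text, pos):
        opened = open_p(text, pos)
        if opened is None:
            return None
        inner = body(text, opened[1])
        if inner is None:
            return None
        closed = close_p(text, inner[1])
        return None if closed is None else (inner[0], closed[1])
    return parse

def lexeme(parser):
    """Run parser, then skip any trailing whitespace."""
    skip = regex(r"\s*")
    def parse(text, pos):
        result = parser(text, pos)
        if result is None:
            return None
        return result[0], skip(text, result[1])[1]
    return parse

number = lexeme(fmap(int, regex(r"-?\d+")))
symbol = lexeme(regex(r"[^\s()]+"))

def expr(text, pos):
    # A plain function so that the list parser can refer to expr recursively.
    return choice(number, symbol, lst)(text, pos)

lst = between(lexeme(regex(r"\(")), many(expr), lexeme(regex(r"\)")))

def parse_sexpr(text):
    """Parse one S-expression that spans the whole input."""
    result = expr(text, 0)
    if result is None or result[1] != len(text):
        raise SyntaxError("could not parse: " + text)
    return result[0]

print(parse_sexpr("(+ 1 (* 2 3))"))
```

The grammar itself is the three lines defining number, symbol and lst; everything above them is reusable plumbing, which is most of the appeal of the approach.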

-----------------------------------
Prabhakar Ragde
Mon Jul 14, 2008 9:07 pm

Re: RE:Lexer/Parser experience?
-----------------------------------
[quote="Tony"]Likely not "way more". I'm just finishing up with the "baby compilers" course here at UW (CS241) -- scanner / parser / compiler, for a really simple made up language.[/quote]

That language is actually the intersection of Java and C++! --PR

-----------------------------------
Tony
Mon Jul 14, 2008 10:38 pm

RE:Lexer/Parser experience?
-----------------------------------
@PR -- I'm not sure what part Java contributes to the design, as I'm pretty sure that (after substituting "wain" for "main") it's a valid (very small) subset of C.

-----------------------------------
DemonWasp
Mon Jul 14, 2008 11:09 pm

RE:Lexer/Parser experience?
-----------------------------------
Bah! Who needs more than int and int*? That's all you get in brainf*ck and that's all you need!

/jokes

I finished the 200-level course in compilers (CS241?) last Spring term, but to be honest the JavaCC stuff is throwing me for a loop (I haven't put in much time, but what I have put in has been consumed entirely by confusion). Perhaps more will become clear after the 300/400-level compilers course...?

-----------------------------------
jernst
Tue Jul 15, 2008 9:30 pm

Re: Lexer/Parser experience?
-----------------------------------
I did a course at Laurier that used yacc and lex, but I found it very hard lol. Good luck to you :P

-----------------------------------
Zeroth
Tue Jul 15, 2008 10:11 pm

Re: Lexer/Parser experience?
-----------------------------------
Well, a few more details: Umm, I found a project that intrigued my professor (I'm doing an Undergraduate Research Student thing, NSERC-USRA), but he doesn't have time to study the concept. It's about using AI, or rather machine learning, which is his specialty, to speed up compilation time. I, uh, don't really have much background in compiler theory, but thanks to MIT OpenCourseWare, I've been learning. What I had planned to ask was: is there any place in the compilation stack where the currently accepted solutions appear to be a bit obtuse? I.e., places where a better "understanding" of the parse tree would make it easier to optimize/compile? Just random ideas out there.

A few I've come up with, and am examining, are:

Machine learning methods that actually fix the very basic bugs that even the best programmers still make, like missing semi-colons, etc.
Neural networks trained to optimize seemingly complex/long loops much better than current compilers do. (They do best on short loops that have most of the variables nearby, or in a local scope.)

I was just curious what ideas/frustrations others may have had in the process, and so, maybe that's a place to apply ML methods. Also, I'm very bored.

-----------------------------------
Tony
Tue Jul 15, 2008 10:27 pm

Re: Lexer/Parser experience?
-----------------------------------
[quote="Zeroth"]Machine learning methods that actually fix the very basic bugs that even the best programmers still make, like missing semi-colons, etc.[/quote]
I'm not sure if such "fixes" would be welcomed. If a machine can definitively say that a semi-colon was omitted by accident, rather than left out deliberately (perhaps in an overly clever way), then all of the semi-colons become obsolete, and might as well be taken out of the language. Semicolons are like spaces, in that they allow the programmer to direct the parser.


i+++j


Did I leave out a space by accident? Yes. The machine has a 50% chance of getting it right, though. I would prefer to see an error over having my program compile in an unintended way.

(I realize that a lexer following the "maximal munch" rule, as in C, will always parse this as i++ + j, but the point was that the intent might have been i + ++j)
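
The i++ + j reading comes from the lexer, not the parser: a lexer that always takes the longest operator it can (often called "maximal munch") commits to ++ before it ever sees the third +. A minimal sketch of such a lexer, assuming a toy language with only identifiers, + and ++ (the OPERATORS table and function name are made up for this example):

```python
# Operators the toy lexer knows, longest first so "++" wins over "+".
OPERATORS = ["++", "+"]

def tokenize_maximal_munch(text):
    """Split text into identifiers and operators, always taking the longest operator."""
    tokens = []
    pos = 0
    while pos < len(text):
        ch = text[pos]
        if ch.isspace():
            # Whitespace only separates tokens; it produces none itself.
            pos += 1
            continue
        if ch.isalpha() or ch == "_":
            end = pos
            while end < len(text) and (text[end].isalnum() or text[end] == "_"):
                end += 1
            tokens.append(text[pos:end])
            pos = end
            continue
        for op in OPERATORS:
            if text.startswith(op, pos):
                tokens.append(op)
                pos += len(op)
                break
        else:
            raise ValueError(f"unexpected character {ch!r} at position {pos}")
    return tokens

print(tokenize_maximal_munch("i+++j"))    # ['i', '++', '+', 'j']
print(tokenize_maximal_munch("i + ++j"))  # ['i', '+', '++', 'j']
```

Only the explicit space lets the programmer get the second reading, which is the sense in which spaces direct the parser.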

-----------------------------------
Zeroth
Tue Jul 15, 2008 10:31 pm

Re: Lexer/Parser experience?
-----------------------------------
Of course, the compiler would tell you. But the most annoying part is that the compiler is unable to guess and continue compiling, while saying it made such and such decision: "Reverse this?" It's still something I'm looking at. Of course, the errors I'm talking about are those hideously basic ones that end up stopping compilation and making you waste time. It takes less time for the compiler to ask if that's what you intended.
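
A rough sketch of the "guess, continue, and report the decision" behaviour, assuming a toy language with one statement per line where every statement must end in ';'. The guess here is a fixed heuristic standing in for whatever learned model would make it, and the function name is made up:

```python
def recover_missing_semicolons(source):
    """Return (repaired_source, notes), where notes lists every guess that was made."""
    repaired = []
    notes = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        stripped = line.rstrip()
        # Blank lines and lines that open or close a block need no terminator.
        if stripped and not stripped.endswith((";", "{", "}")):
            notes.append(f"line {lineno}: assumed missing ';' after {stripped.strip()!r}")
            stripped += ";"
        repaired.append(stripped)
    return "\n".join(repaired), notes

fixed, notes = recover_missing_semicolons("int x = 1\nint y = 2;\n")
print(fixed)
for note in notes:
    print("note:", note)
```

Compilation carries on with the repaired text, and the notes are what the programmer would be asked to confirm or reverse afterwards.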

-----------------------------------
Prabhakar Ragde
Thu Jul 17, 2008 8:09 am

Re: Lexer/Parser experience?
-----------------------------------
It's not my area of research, but by coincidence I'm sitting in on a grad course on compiler optimization. Most of the problems that need to be solved are NP-hard, so there are many heuristics used, and one promising approach (particularly for JIT compilation) involves adjusting things on the fly using run-time profiling. That's an area where machine learning techniques might work. PyPy has a JIT compiler, right?
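
As a hedged illustration of "adjusting things on the fly using run-time profiling", here is a minimal sketch that picks between optimization heuristics with an epsilon-greedy strategy, one simple learning technique chosen only for illustration. Nothing here comes from PyPy or the course: the variant names, the simulated costs and every function name are assumptions, and a real JIT would measure actual execution time instead of calling simulated_cost.

```python
import random

def mean_cost(stats, name):
    runs, total = stats[name]
    return total / runs

def choose_variant(stats, epsilon, rng):
    """Epsilon-greedy: usually pick the variant with the lowest mean cost so far."""
    untried = [name for name, (runs, _) in stats.items() if runs == 0]
    if untried:
        return untried[0]                  # profile every variant at least once
    if rng.random() < epsilon:
        return rng.choice(sorted(stats))   # occasionally explore
    return min(stats, key=lambda name: mean_cost(stats, name))

def tune(measure, variants, rounds=200, epsilon=0.1, seed=0):
    """Repeatedly run a variant, record its measured cost, and return the best one."""
    rng = random.Random(seed)
    stats = {name: (0, 0.0) for name in variants}  # name -> (runs, total cost)
    for _ in range(rounds):
        name = choose_variant(stats, epsilon, rng)
        runs, total = stats[name]
        stats[name] = (runs + 1, total + measure(name, rng))
    return min(stats, key=lambda name: mean_cost(stats, name))

# Hypothetical variants with made-up average costs; lower is better.
SIMULATED_COSTS = {"no_unroll": 1.0, "unroll_4": 0.7, "unroll_8": 0.9}

def simulated_cost(name, rng):
    # Stand-in for a real profiling measurement: true cost plus bounded noise.
    return SIMULATED_COSTS[name] + rng.uniform(-0.05, 0.05)

best = tune(simulated_cost, list(SIMULATED_COSTS))
print(best)
```

The learning happens across runs of the same code, which is why this fits a JIT better than an ahead-of-time compiler.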

On the other hand, if you want to do the lexing/parsing, then you're probably talking about compilation to byte code, and it's much less clear how learning could be useful. It would have to be across programs compiled, rather than across runs or during a single run of a program, and that corpus is less likely to have useful information on which a learning program could act. It's also going to be more difficult to put together a system of your own (even assembling components) in the duration of a URA, as opposed to tweaking PyPy. --PR

Edit: removed spurious tag caused by my fingers automatically using Emacs keystrokes.

-----------------------------------
Zeroth
Thu Jul 17, 2008 10:04 am

Re: Lexer/Parser experience?
-----------------------------------
Thanks PR. That's an excellent avenue to look at. :D I'll make sure to look at that.
