So I wrote a tool to parse .thrift files and churn out specialized JSON serialization and deserialization code for those Thrift objects.
Antlr
Antlr-3 has tons and tons of users and lots of documentation... but most of the docs seem to refer to Antlr-2. It took me a long time to figure out how to build a parse tree and walk it (you have to pass Antlr-3 the debug flag so that it emits events about the nodes it's visiting, and then use the ParseTreeBuilder class). I thought there might be some way to transform the productions into a semantic-looking tree via Antlr's "tree" grammar, but I couldn't figure it out.
SableCC
SableCC-4 has much less documentation. The best documentation was an example grammar file in the source checkout. But once I started generating the compiler... everything was easy. To walk the tree, I just override generated methods like "in_ThriftFieldDefinition" and "out_ThriftStruct". Super-easy.
I figured out everything I needed to know in fifteen minutes by looking at the --help output of the tool, the generated code and one example file.
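To make the override-based walk concrete, here is a minimal, self-contained sketch of the pattern. Only the hook names ("in_ThriftFieldDefinition", "out_ThriftStruct") come from my tool. The node classes, the Walker base class and the emitter below are hypothetical stand-ins I wrote for illustration, not SableCC's actual generated API, and the one-line output is just a placeholder for real serialization code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class WalkerSketch {
    // Hypothetical stand-ins for the node classes a parser generator would emit.
    static final class FieldNode {
        final String type;
        final String name;
        FieldNode(String type, String name) { this.type = type; this.name = name; }
    }

    static final class StructNode {
        final String name;
        final List<FieldNode> fields;
        StructNode(String name, List<FieldNode> fields) { this.name = name; this.fields = fields; }
    }

    // Stand-in for a generated walker: the traversal order is fixed here and
    // every hook is a no-op, so subclasses override only what they care about.
    static class Walker {
        public void in_ThriftStruct(StructNode node) {}
        public void out_ThriftStruct(StructNode node) {}
        public void in_ThriftFieldDefinition(FieldNode node) {}
        public void out_ThriftFieldDefinition(FieldNode node) {}

        public final void visit(StructNode node) {
            in_ThriftStruct(node);
            for (FieldNode field : node.fields) {
                in_ThriftFieldDefinition(field);
                out_ThriftFieldDefinition(field);
            }
            out_ThriftStruct(node);
        }
    }

    // Tool-specific part: collect field names on the way in, emit on the way out.
    static class JsonKeyEmitter extends Walker {
        private final StringBuilder out = new StringBuilder();
        private final List<String> fieldNames = new ArrayList<>();

        @Override
        public void in_ThriftFieldDefinition(FieldNode node) {
            fieldNames.add(node.name);
        }

        @Override
        public void out_ThriftStruct(StructNode node) {
            out.append("// JSON keys for ").append(node.name).append(": ")
               .append(String.join(", ", fieldNames)).append('\n');
            fieldNames.clear();
        }

        String result() { return out.toString(); }
    }

    // Walks one struct and returns whatever the emitter produced.
    static String emit(StructNode node) {
        JsonKeyEmitter emitter = new JsonKeyEmitter();
        emitter.visit(node);
        return emitter.result();
    }

    public static void main(String[] args) {
        StructNode user = new StructNode("User", Arrays.asList(
                new FieldNode("i32", "id"),
                new FieldNode("string", "name")));
        System.out.print(emit(user));
    }
}
```

The point of the design is that the traversal lives in the base class, so the tool-specific code is nothing but a handful of small overrides.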
Conclusion
If you're not a compiler-compiler expert or don't know exactly what "AST transforms" are, I suggest SableCC. With it, I finished in two hours the tool I had wasted four hours trying to get working in Antlr.
Footnote
Someone will probably modify the Thrift compiler to produce the code I want in a few months (which is the right way to approach a problem like this). The two reasons I didn't go that route were that I'm so much faster in Java than in C++, and that I found a thread from someone who tried to do just that (for purposes of integrating Thrift with GWT) and gave up.