
Python code generator


You may want to take a look at the 2to3 tool, developed by the Python core devs to automatically convert Python 2 code to Python 3 code. The tool first parses the code into a tree, and then emits "fixed" Python 3 code from that tree.

This may be a good place to start because this is an "official" Python tool endorsed by the core developers, and part of the recommended Python 2 to 3 migration path.

Alternatively, check out the codegen.py module, which generates Python code back from Python's ast.
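
As a rough illustration of the "code back from the ast" idea: on Python 3.9 or later, the standard library's ast.unparse does a similar job. This is my substitution, not codegen.py itself, and the `area` function is just a made-up input.

```python
import ast

# Made-up input source, used only for illustration.
source = "def area(w, h):\n    return w * h\n"

tree = ast.parse(source)          # text -> AST
regenerated = ast.unparse(tree)   # AST -> text (requires Python 3.9+)

# Original formatting and comments are not preserved, so compare the
# trees rather than the text to check that nothing was lost.
same_tree = ast.dump(ast.parse(regenerated)) == ast.dump(tree)

print(regenerated)
print(same_tree)
```

Note that the regenerated text is equivalent code, not necessarily the same characters you started with.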

See also this SO question, which may be relevant to yours (I'm not marking it as a duplicate because I'm not sure the scopes of the questions overlap 100%).


Automatic code generation is commonly done in the following ways:

  • Print statements containing code fragments
  • Text templates with placeholders (think macros)
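
For concreteness, here is roughly what the text-template style looks like, using string.Template from the standard library. The GETTER template and the generate_getters helper are names I invented for the sketch.

```python
from string import Template

# Hypothetical template: a getter method with one placeholder, ${name}.
GETTER = Template(
    "def get_${name}(self):\n"
    "    return self._${name}\n"
)

def generate_getters(field_names):
    """Fill the template once per field and join the fragments."""
    return "\n".join(GETTER.substitute(name=n) for n in field_names)

code = generate_getters(["width", "height"])
print(code)
```

Nothing in this scheme knows the output is Python: a field name such as "not valid" would be substituted without complaint, and the problem would only show up when the generated text is compiled.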

IMHO, better practice is:

  • Build an AST for the target fragment, and then prettyprint it

Hardly anybody does the latter, because the tools are mostly not there.
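
When the target language is Python, the standard ast module is one way to try the AST-then-prettyprint style (ast.unparse needs Python 3.9+). A minimal sketch, assuming a made-up make_getter helper that builds one getter function per field:

```python
import ast

def make_getter(field_name):
    """Build the AST for: def get_<field>(self): return self._<field>"""
    return ast.FunctionDef(
        name=f"get_{field_name}",
        args=ast.arguments(
            posonlyargs=[],
            args=[ast.arg(arg="self")],
            kwonlyargs=[],
            kw_defaults=[],
            defaults=[],
        ),
        body=[
            ast.Return(
                value=ast.Attribute(
                    value=ast.Name(id="self", ctx=ast.Load()),
                    attr=f"_{field_name}",
                    ctx=ast.Load(),
                )
            )
        ],
        decorator_list=[],
        returns=None,
    )

# Assemble a module from the fragments, then prettyprint it.
module = ast.Module(
    body=[make_getter("width"), make_getter("height")],
    type_ignores=[],
)
ast.fix_missing_locations(module)  # fill in the line/column info unparse expects
code = ast.unparse(module)
print(code)
```

It is more verbose than a template, but indentation, nesting and parenthesization are the prettyprinter's job rather than something each fragment has to get right by hand.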

Python's 2to3 tool provides (I think) the target AST and prettyprinting.

But a question you didn't ask is "generate from what?" Somehow you have to specify abstractly what you want generated (or it isn't a win). And your tool has to be able to read that specification somehow.

Many code generation schemes consist of writing procedural code that calls the above generation mechanisms; the procedural code acts as an implicit specification. It is "easy" to read the specification; it is just code in the language used by the code generator.

Some code generation schemes use some kind of graph structure to provide a frame on which fragments of specification are hung, and these drive the code generation. UML class diagrams are a classic example. These schemes aren't so easy: you need a "specification reader" (e.g., a UML diagram reader, aka XMI or some such, or if you aren't using UML, some kind of specification parser). Then you need something to climb over the just-read specification in some useful order (UML is a graph, so there are many different ways it can be visited), making calls on code generation steps as it goes.
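
To make the "climb over the specification in some useful order" part concrete, here is a toy sketch. The SPEC dictionary stands in for an already-read class diagram (its shape is entirely my invention, not XMI), and the traversal order chosen is "base classes before subclasses", via graphlib from the standard library (Python 3.9+).

```python
import ast
from graphlib import TopologicalSorter

# Hypothetical, already-parsed spec: class name -> bases and field names.
SPEC = {
    "Square": {"bases": ["Shape"], "fields": ["side"]},
    "Shape": {"bases": [], "fields": ["name"]},
}

def class_node(name, info):
    """Code generation step: build a ClassDef AST for one spec node."""
    body = [
        ast.AnnAssign(
            target=ast.Name(id=field, ctx=ast.Store()),
            annotation=ast.Name(id="object", ctx=ast.Load()),
            value=None,
            simple=1,
        )
        for field in info["fields"]
    ] or [ast.Pass()]
    return ast.ClassDef(
        name=name,
        bases=[ast.Name(id=b, ctx=ast.Load()) for b in info["bases"]],
        keywords=[],
        body=body,
        decorator_list=[],
    )

def generate(spec):
    # Visit order: every base class before the classes that inherit from it.
    graph = {name: info["bases"] for name, info in spec.items()}
    order = TopologicalSorter(graph).static_order()
    module = ast.Module(
        body=[class_node(name, spec[name]) for name in order],
        type_ignores=[],
    )
    ast.fix_missing_locations(module)
    return ast.unparse(module)

code = generate(SPEC)
print(code)
```

A different generation goal (say, emitting associations before classes) would want a different walk over the same graph, which is why the traversal is a separate concern from the reader and from the generation steps.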

The Python 2to3 tool uses a Python2 parser to read the "spec". If you want to generate code from Python2, that will be fine. I suspect you don't want to do that.

A best practice approach is one that unifies the ability to read/analyze/traverse specifications, with the ability to produce ASTs for the target language.

Our DMS Software Reengineering Toolkit is a general purpose program analysis and transformation system. It parses "specifications" (instances of grammars that you can define to it) into ASTs; it will also let you build arbitrary ASTs for any of those grammars, using either procedural code [as sketched above] or pattern-match/replacement (pretty much unique to DMS). Part of a DMS language front end is a prettyprinter that can regenerate text from ASTs (these are tested by roundtripping code: parse to AST, prettyprint the AST, and the result had better be the same text).

In case your grammar isn't known to DMS, it has extremely good parser and prettyprinter generators, as well as other support mechanisms for analyzing programs. All that additional machinery is usually not available with classic parser generators, or with just a plain "AST" package. (I don't know what is in 2to3).

The relevance of this to Python is that DMS has a Python front end as well as grammars for many other languages.

So, you can use DMS to parse your specification, and to generate Python code using ASTs, finally followed by prettyprinting.