Code Generation

We write code; that is our job, and it is also the art we make. We put our thoughts, our patience, and our soul into making it ideal, or close to ideal. Sometimes we just write the code as fast as possible, because there is no time, the code is simple, or we have written similar code hundreds of times. This short article is about writing similar code again and again.

When we write similar code for the second or fourth time, it is no longer art; it is tedious. Still, the repetition is a chance to polish the earlier code: we can find and fix bugs, add missing logging, improve performance, and make the code cleaner and easier to read and change. Often we go back and improve the previously written code as well.
But what if you need to write similar code tens or hundreds of times? Will you change the previous 20 classes because you have now found a better algorithm or fixed an annoying performance problem? Will you be glad to fix 50 classes across the project because you have only just found a bug (which must be fixed, right)?
That takes time you could spend coding something interesting, studying a popular new technology, or having a beer with your friends. And what about the customer, for whom this software is a business?
That is why I am writing about code generation, something we all know and use often. Remember: when you create a new project or add a new class in your favorite IDE, you get initial code, so you can write the important part rather than spend time writing the same code again and again.
Code generation tools can be divided into external and internal. External tools are provided by the IDE or by other vendors (remember the famous XDoclet toolkit?). Internal tools are written by your team and used within one or a few projects.
An internal code generator can be simple, generating only basic template source code or UI that will then be changed and extended. It can also be large and complex, generating whole layers of your software (domain types, DAOs, repositories, etc.).

Let's review the following types of code generators:
1. template code generators
2. partial code generators

Template code generators run once, at the beginning. They generate source code that the developer will then rewrite or extend. It is important that the generated code is clean, with comments that describe it and mark the places where your own code goes. Such generators work well if the code to be generated is already ideal, meaning that you will never need to change the template and regenerate the same code. Regenerating would remove the code developers have written by hand.
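As an illustration, here is a minimal sketch of a run-once template generator in Python. The standard-library string.Template stands in for a real template engine, and the names scaffold, SCAFFOLD and the generated OrderService-style class are hypothetical, chosen only for this example. The sketch assumes the safest policy for this kind of generator: never overwrite a file that already exists, because it may contain hand-written code.

```python
from pathlib import Path
from string import Template

# Hypothetical scaffold template; the Java class and method names are illustrative only.
SCAFFOLD = Template('''\
// Generated once as a starting point. This file is yours to edit.
public class ${name}Service {

    // TODO: put your business logic for ${name} here.
    public void process(${name} item) {
    }
}
''')


def scaffold(name: str, out_dir: Path) -> bool:
    """Write the starting file for `name`; return False if it already exists."""
    target = out_dir / f"{name}Service.java"
    if target.exists():
        # Never overwrite: the file may already contain hand-written code.
        return False
    target.write_text(SCAFFOLD.substitute(name=name), encoding="utf-8")
    return True
```

Calling scaffold("Order", Path("src")) a second time is a no-op, which is exactly the limitation described above: later template fixes never reach files that were already generated.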

Partial code generators can be run as often as you need. Their output is kept separate from the code written by developers. In C#, the partial keyword is a good fit (as I remember, partial class declarations were added to keep generated code apart from developer-written code in WinForms and ASP.NET); in Java, inheritance can be used for the same purpose. The generator produces only the repetitive code and puts it in a separate file, e.g. a partial class definition or an abstract base class. The specific code is written in the derived class. When we find new bugs or improvements, we only need to change the templates or the generator configuration and regenerate the code.
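The split between regenerated and hand-written files could look roughly like the following Python sketch. It is only an illustration under assumed names: generate, BASE, STUB and the OrderDaoBase / OrderDao style classes are hypothetical, and string.Template again stands in for a real template engine. The abstract base class is rewritten on every run, while the derived class is created once and then left alone.

```python
from pathlib import Path
from string import Template

# Repetitive code: regenerated on every run (hypothetical template).
BASE = Template('''\
// GENERATED CODE - DO NOT EDIT. Change the template and regenerate instead.
public abstract class ${name}DaoBase {

    /** Table that stores ${name} rows. */
    protected String tableName() {
        return "${table}";
    }
}
''')

# Hand-written part: created once as an empty stub (hypothetical template).
STUB = Template('''\
public class ${name}Dao extends ${name}DaoBase {
    // Hand-written, entity-specific code goes here.
}
''')


def generate(name: str, table: str, out_dir: Path) -> None:
    """Regenerate the base class; create the derived class only if missing."""
    base = out_dir / f"{name}DaoBase.java"
    # Always overwritten, so a template fix reaches every generated class.
    base.write_text(BASE.substitute(name=name, table=table), encoding="utf-8")

    derived = out_dir / f"{name}Dao.java"
    if not derived.exists():
        # Left untouched on later runs: this file belongs to the developer.
        derived.write_text(STUB.substitute(name=name), encoding="utf-8")
```

Running generate again after a template or configuration change updates every base class without touching what developers wrote in the derived classes.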

Even though the code is generated, it must still be easy to read. Document it: you only need to do that once, in the templates, so don't be lazy or stingy! For both partial and template code generators this is doubly important, because you work directly with the generated code.
Don't forget to add a comment noting that the code is generated automatically and may be regenerated, so other developers will think twice before changing it by hand!
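One possible way to apply this consistently is to let the generator stamp every output file itself. The Python sketch below is an assumption of mine, not a fixed convention: the marker text, the helper names with_generated_header and is_generated, and the "dao-gen" / "dao_base.tpl" names in the usage note are all hypothetical. It deliberately leaves out a timestamp so that regenerating unchanged code produces an identical file.

```python
# Hypothetical marker text; any stable, searchable phrase would do.
MARKER = "GENERATED CODE - DO NOT EDIT"


def with_generated_header(body: str, generator: str, template: str) -> str:
    """Prepend a comment telling readers where the file came from."""
    header = (
        f"// {MARKER}\n"
        f"// Produced by {generator} from template {template}.\n"
        "// Manual changes will be lost the next time the generator runs.\n"
        "\n"
    )
    return header + body


def is_generated(source: str) -> bool:
    """Return True if the first line of `source` carries the marker."""
    first_line = source.splitlines()[0] if source else ""
    return MARKER in first_line
```

For example, with_generated_header(code, "dao-gen", "dao_base.tpl") tells a reader which template to change, and is_generated lets a generator or a review script tell generated files apart from hand-written ones.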

Also remember that it is fine to introduce code generation into a project step by step. For example, I am the only one using the new generator for now. I need this period to find and fix defects, identify the parts of the code that could and should be generated, and improve the configuration and the way it is used. In this case the generated base code (from the partial code generator) is committed too.

Use the best tools to write templates. For example, I use the FreeMarker template engine to describe templates and generate source code. There are a few code generators that read a configuration file and generate the corresponding code. For such tools, performance is less important than the flexibility of the templates and configuration.
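To make the configuration-plus-template idea concrete, here is a small Python sketch. It does not use FreeMarker; the standard-library string.Template is swapped in so the example stays self-contained. The JSON layout ("entities", "fields", "name", "type") and the names EXAMPLE_CONFIG and render are assumptions made up for this illustration.

```python
import json
from string import Template

# Hypothetical configuration format: one entry per class to generate.
EXAMPLE_CONFIG = '''
{
  "entities": [
    {"name": "User",
     "fields": [{"name": "id", "type": "long"},
                {"name": "email", "type": "String"}]}
  ]
}
'''

CLASS_TEMPLATE = Template('''\
public class ${name} {
${fields}
}
''')

FIELD_TEMPLATE = Template("    private ${type} ${field};")


def render(config_text: str) -> dict:
    """Return a mapping of file name to generated Java source."""
    config = json.loads(config_text)
    sources = {}
    for entity in config["entities"]:
        # Render one line per configured field, then the enclosing class.
        fields = "\n".join(
            FIELD_TEMPLATE.substitute(type=f["type"], field=f["name"])
            for f in entity["fields"]
        )
        sources[entity["name"] + ".java"] = CLASS_TEMPLATE.substitute(
            name=entity["name"], fields=fields
        )
    return sources
```

With this shape, adding a class or a field is a configuration change, and fixing the generated code is a template change; neither requires touching the generator logic.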

Document the generator. Share its source code within the team. Put the generator in the source code repository near the project, but not inside it. Also build and tag the latest stable version as runnable scripts or a program, so other developers can update and use it without spending extra time on it.

1 comment:

Oleksandr Pavlyshak said...

Totally agree about keeping your generated code documented and human-readable.

You may find it interesting to take a look at T4 template code generation in .NET.