When applied effectively, code generation can increase code quality and shorten development and test cycles. Here are some tricks and tools I've picked up along the way that you can use to lay waste to repetitive coding tasks, or generate things that aren't really possible to do efficiently by hand.
This article has no downloads, but several tools are linked to from here.
Introduction
I've noticed developers tend to have a love/hate relationship with code generation. Everyone has their opinions, but hopefully I can win over some detractors here, while providing some tools to ease the process.
Enter the Pre-Build Step
Why Not Visual Studio Integration?
I no longer use Visual Studio integration for most of my tools because it's just not as powerful and flexible as the pre-build step. It also takes longer to set up and debug, and makes your dev environment snowflakey and hard to replicate across workstations (installing half a dozen VSIX packages is no fun). Worse, you can't easily deploy a project to something like GitHub and expect that other users can just clone the repo and go. That last bit is critical for team development, or even for you when you download the project onto a different machine two years from now and forget what tools you had installed. Visual Studio extensions are also harder to build than simple command line tools, and they won't work with VS Code (and JetBrains' offering?). You can't put them in makefiles or batch files, and the list goes on.
Adding a Build Step to Your Project
With pre-build steps, you deploy the tools with your project. I usually put them in the root solution folder for easy pathing in the pre-build steps, but you do you. That way, when someone clones your repo, they are ready to go.
While it varies by project type, you can usually get to the pre-build steps under Visual Studio by going to the project's properties page, and selecting Build Events. Under Visual Basic, it's under some "Advanced" setting. I don't use Visual Studio for C++ development, but in VS Code you can edit the relevant .json files, or better yet, add a makefile.
Know how to use command lines. You're a developer. The CLI should be comfortable enough by now. If not, you're about to get good at it. Fortunately, setting this up is a one-and-done situation so you don't have to mess with it every time you go to build. Your IDE/dev environment will do it for you.
When you set up a build step, remember to quote filenames, or any arguments with potential spaces in them. You can escape quotes with two quotes.
Visual Studio has helpful macros you can use to locate binaries, the solution folder, the project folder, etc. Use those when you make your build steps so that they are robust. In Visual Studio, just separate multiple command lines with enter.
Here's an example of pre-build events from a real world application build I have for an upcoming update to Reggie.
"$(SolutionDir)csppg.exe" "$(ProjectDir)Export\Common.cs.template" /output "$(ProjectDir)Export.CommonGenerator.cs" /class CommonGenerator /namespace Reggie /internal
"$(SolutionDir)csppg.exe" "$(ProjectDir)Export\TableMatcher.cs.template" /output "$(ProjectDir)Export.TableMatcherGenerator.cs" /class TableMatcherGenerator /namespace Reggie /internal
"$(SolutionDir)csppg.exe" "$(ProjectDir)Export\TableTokenizer.cs.template" /output "$(ProjectDir)Export.TableTokenizerGenerator.cs" /class TableTokenizerGenerator /namespace Reggie /internal
"$(SolutionDir)csbrick.exe" "$(SolutionDir)LexContext\LexContext.csproj" /output "$(ProjectDir)LexContext.brick.cs"
"$(SolutionDir)csbrick.exe" "$(SolutionDir)FastFA\FastFA.csproj" /output "$(ProjectDir)FastFA.brick.cs"
All of my tools use the same basic syntax to make it easier to work with them. I'm going to give you some boilerplate code later when we cover quickly making your own tools.
These commands will execute any time the project is built. If you don't want them to run every time, for example, if your process takes a long time to execute, you can add an /ifstale switch to your executable that skips the process unless the input file is newer than the output file.
Skinning Cats
In the .NET realm, there are a couple of different ways to do code generation. You can use the CodeDOM or you can use some sort of text templating system, depending on how you want to go about it.
The CodeDOM Approach
Advantages of the CodeDOM
- Your generated code is language independent. You create an abstract syntax tree which is then rendered into a target language using a CodeDOM provider. Microsoft provides one for C# and one for Visual Basic but 3rd parties can provide their own. As a rule, if you can use it in an ASP.NET page for server side code, you can render to it using the CodeDOM. These are limited to .NET languages of course. You can't render to SQL/DDL using this method, for example.
- You can "macroize" and analyze the code since it is represented as an in-memory tree. This can allow you to do powerful code transformations.**
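To make the first advantage concrete, here is a minimal sketch that builds one small tree and renders it with both of Microsoft's providers. The names CodeDomSketch, BuildTree, Render, Demo and Greeter are made up for illustration. It assumes the .NET Framework, where these types ship in System.dll; on newer .NET versions, I believe you need to reference the System.CodeDom package instead.

```csharp
using System;
using System.CodeDom;
using System.CodeDom.Compiler;
using System.IO;
using Microsoft.CSharp;
using Microsoft.VisualBasic;

static class CodeDomSketch
{
    // Builds a language neutral tree equivalent to:
    // namespace Demo { public class Greeter { public string Greet() { return "Hello"; } } }
    public static CodeCompileUnit BuildTree()
    {
        var method = new CodeMemberMethod
        {
            Name = "Greet",
            // Final keeps the C# provider from emitting "virtual"
            Attributes = MemberAttributes.Public | MemberAttributes.Final,
            ReturnType = new CodeTypeReference(typeof(string))
        };
        method.Statements.Add(
            new CodeMethodReturnStatement(new CodePrimitiveExpression("Hello")));

        var type = new CodeTypeDeclaration("Greeter") { IsClass = true };
        type.Members.Add(method);

        var ns = new CodeNamespace("Demo");
        ns.Types.Add(type);

        var unit = new CodeCompileUnit();
        unit.Namespaces.Add(ns);
        return unit;
    }

    // Renders the same tree with whichever provider is passed in
    public static string Render(CodeDomProvider provider, CodeCompileUnit unit)
    {
        using (var writer = new StringWriter())
        {
            provider.GenerateCodeFromCompileUnit(unit, writer, new CodeGeneratorOptions());
            return writer.ToString();
        }
    }

    static void Main()
    {
        var unit = BuildTree();
        using (var cs = new CSharpCodeProvider())
            Console.WriteLine(Render(cs, unit));
        using (var vb = new VBCodeProvider())
            Console.WriteLine(Render(vb, unit));
    }
}
```

The tree is built once and only the provider changes, which is the whole appeal. Because the tree is just objects in memory, you can also walk or modify it before rendering.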
Disadvantages of the CodeDOM
- "Language independent" is relative in practice, and usually requires some "massaging" of your tree before it will work with a given target language. For example, you may need to explicitly derive classes from System.Object in order to get the Visual Basic code provider to render them properly when the object also inherits from interfaces.
- You are limited to .NET languages for which you have a CodeDOM provider, and not all providers are created equal. The one for F# for example, might not work with most code generators because of all the things it doesn't support or supports badly.
- The code tree is very sparse. There's nothing fancy, like no yield keyword. Even some common operators are missing.**
- The tree syntax is very verbose, with names like CodePrimitiveExpression and CodeIterationStatement and involves a lot of nesting of objects so building them out in code requires a lot of code.**
- There is no parser.** You must explicitly construct the objects in code.
** Deslang is a powerful tool that employs Slang - a CodeDOM compliant subset of C# which parses into a CodeDOM tree and creates graphs in code you can include in your code generation projects. Using Deslang, you can write code in a C# subset and render it to Visual Basic, for example. You can also do complex operations like visiting each element of a CodeDOM tree in order to do analysis and transformation. Furthermore, you can do things like reflection, including type and method resolution, plus constant expression evaluation (folding) over an in-memory code model. Using Deslang as a pre-build step, you can generate really complicated code very easily and in a language independent manner, and then take the resulting generated code, include it in your code generator project, and do transformations and rendering to the language of your choice with the resulting tree. See the articles linked to prior for more. Even though it is somewhat experimental, and you may have to tinker with it to get it to parse sometimes, once you've got it going it makes blasting out wicked code pretty easy. I highly recommend using Deslang if you choose to go the CodeDOM route. It is orders of magnitude more powerful and efficient than using the CodeDOM by hand.
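To give a feel for the verbosity and the missing operators when building a tree by hand, here is a small sketch of a single for loop. The names VerbositySketch, BuildLoop and RenderCSharp are hypothetical, and the same .NET Framework assumption applies as for any direct use of the CodeDOM providers. Note that the CodeDOM has no increment operator, so i++ has to be spelled out as an assignment of i + 1.

```csharp
using System;
using System.CodeDom;
using System.CodeDom.Compiler;
using System.IO;
using Microsoft.CSharp;

static class VerbositySketch
{
    // Tree for roughly: for (int i = 0; i < 10; i = i + 1) { total = total + i; }
    public static CodeIterationStatement BuildLoop()
    {
        var i = new CodeVariableReferenceExpression("i");
        var total = new CodeVariableReferenceExpression("total");
        return new CodeIterationStatement(
            // init: int i = 0
            new CodeVariableDeclarationStatement(
                typeof(int), "i", new CodePrimitiveExpression(0)),
            // test: i < 10
            new CodeBinaryOperatorExpression(
                i, CodeBinaryOperatorType.LessThan, new CodePrimitiveExpression(10)),
            // increment: i = i + 1 (there is no unary ++ in the CodeDOM)
            new CodeAssignStatement(i,
                new CodeBinaryOperatorExpression(
                    i, CodeBinaryOperatorType.Add, new CodePrimitiveExpression(1))),
            // body: total = total + i
            new CodeAssignStatement(total,
                new CodeBinaryOperatorExpression(
                    total, CodeBinaryOperatorType.Add, i)));
    }

    // Renders a single statement as C# so the result can be inspected
    public static string RenderCSharp(CodeStatement statement)
    {
        using (var provider = new CSharpCodeProvider())
        using (var writer = new StringWriter())
        {
            provider.GenerateCodeFromStatement(statement, writer, new CodeGeneratorOptions());
            return writer.ToString();
        }
    }

    static void Main()
    {
        Console.WriteLine(RenderCSharp(BuildLoop()));
    }
}
```

That is a dozen or so lines of object construction for one line of C#, which is the kind of thing a parser front end saves you from writing.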
The Text Templating Approach
Advantages of the Text Templating Approach
- You can generate any text based output you need, code or otherwise and you can take advantage of the latest features of your favorite language.
- You can generate to multiple target languages by using multiple templates, which gives you a lot of flexibility.
- It's simpler to understand than the CodeDOM.
- It typically generates much faster than using the CodeDOM approach, especially compared to using Slang and Deslang and doing analysis on complicated code or a lot of code.
Disadvantages of the Text Templating Approach
- You have to write one template for each target you want to render to.
- You cannot do analysis on the generated code. You must do your analysis before the code is generated, which is severely limiting.
- You'll typically lose Intellisense in your templates.
- It can be difficult to get your output to format properly.
Using csppg, you can create text based templates using ASP-like or T4-like syntax. The templates themselves are generated as C# files. You can then include the C# files in your code generator project to run them any time you need output from them.
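Conceptually, a template that has been turned into C# is just a method that writes text to a TextWriter. The following hand-written sketch shows that idea only. It is not csppg's template syntax or its actual generated output, and the names TemplateSketch and Generate and the parameters are invented for illustration.

```csharp
using System;
using System.IO;

static class TemplateSketch
{
    // Hypothetical stand-in for a compiled template: literal text is written
    // as-is, and the "holes" in the template are filled from the arguments.
    public static void Generate(TextWriter output, string ns,
        string className, string[] properties)
    {
        output.WriteLine("namespace " + ns);
        output.WriteLine("{");
        output.WriteLine("    public partial class " + className);
        output.WriteLine("    {");
        foreach (var property in properties)
        {
            output.WriteLine("        public string " + property + " { get; set; }");
        }
        output.WriteLine("    }");
        output.WriteLine("}");
    }

    static void Main()
    {
        // Write to the console here; a generator tool would pass its output file instead
        Generate(Console.Out, "Demo", "Person", new[] { "FirstName", "LastName" });
    }
}
```

Because the output is plain text, the same shape works for C++, SQL/DDL or anything else, at the cost of one such method (or template) per target.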
Combining Both
You can use text templates to render to Slang, which can then be parsed into code to build the CodeDOM tree the code represents, using Deslang. The tree can then be analyzed and reflected upon before being rendered to any desired language. The only real disadvantage to this is that it's more complicated than the other approaches, and incorporates some of the disadvantages of using the CodeDOM.
My Preference, From Experience
I write a lot of code generation tools. You might say I'm a code generation aficionada/o. My father was a toolmaker, so in a way I think I've got some of the same drive he did. With it comes some hard-won experience.
I used to use the CodeDOM approach because of the advantages to it, but these days I'm about simplifying. I also am more interested in targeting things like C++ and even SQL/DDL.
I would still revert to the CodeDOM model, particularly if I want to use Slang at runtime like Parsley does, but now I tend to want to avoid it mostly due to the added complexity of it, even using tools like Deslang.
Making Tools
Many times, you won't find a tool that does precisely what you need. When that happens, you'll need to make one, which the aforementioned tools can help with. After you've chosen which direction to go with your generator (either CodeDOM or text templates or both) you need to throw together a file to process command line arguments:
using System;
using System.IO;
using System.Reflection;

namespace Tool
{
    static class Program
    {
        // information about this tool, gathered once from the assembly
        static readonly string CodeBase = _GetCodeBase();
        static readonly string Filename = Path.GetFileName(CodeBase);
        static readonly string Name = _GetName();
        static readonly Version Version = _GetVersion();
        static readonly string Description = _GetDescription();

        static int Main(string[] args) => Run(args, Console.In, Console.Out, Console.Error);

        // The real entry point. It takes the streams as arguments so that
        // it can also be called from another tool.
        public static int Run(string[] args, TextReader stdin,
            TextWriter stdout, TextWriter stderr)
        {
            string inputfile = null;
            string outputfile = null;
            bool ifstale = false;
            var result = 0;
            TextReader input = null;
            TextWriter output = null;
            try
            {
                if (0 == args.Length)
                {
                    result = -1;
                    _PrintUsage(stderr);
                }
                else if (args[0].StartsWith("/"))
                {
                    throw new ArgumentException("Missing input file.");
                }
                else
                {
                    // the first argument is always the input file
                    inputfile = args[0];
                    // parse the remaining switches; add new ones as cases here
                    for (var i = 1; i < args.Length; ++i)
                    {
                        switch (args[i].ToLowerInvariant())
                        {
                            case "/output":
                                if (args.Length - 1 == i)
                                    throw new ArgumentException(string.Format
                                        ("The parameter \"{0}\" is missing an argument",
                                        args[i].Substring(1)));
                                ++i;
                                outputfile = args[i];
                                break;
                            case "/ifstale":
                                ifstale = true;
                                break;
                            default:
                                throw new ArgumentException(string.Format
                                    ("Unknown switch {0}", args[i]));
                        }
                    }
                    if (string.IsNullOrWhiteSpace(inputfile))
                        throw new ArgumentException("inputfile");
                    var cwd = Environment.CurrentDirectory;
                    if (!ifstale || _IsStale(inputfile, outputfile))
                    {
                        if (null != outputfile)
                        {
                            stderr.WriteLine("{0} is building file: {1}", Name, outputfile);
                            cwd = Path.GetDirectoryName(outputfile);
                            output = new StreamWriter(outputfile);
                        }
                        else
                        {
                            stderr.WriteLine("{0} is building preprocessor.", Name);
                            output = stdout;
                        }
                        input = new StreamReader(inputfile);
                        // TODO: Do work here
                    }
                    else
                    {
                        stderr.WriteLine("{0} skipped building of {1} because it was not stale.", Name, outputfile);
                    }
                }
            }
            // in debug builds, exceptions are left unhandled so the debugger can break on them
#if !DEBUG
            catch (Exception ex)
            {
                result = _ReportError(ex, stderr);
            }
#endif
            finally
            {
                if (null != input)
                    input.Close();
                // don't close stdout, only a file we opened ourselves
                if (null != outputfile && null != output)
                    output.Close();
            }
            return result;
        }

        // Prints the usage screen, including the tool's name, version and description
        static void _PrintUsage(TextWriter w)
        {
            w.Write("Usage: " + Filename + " ");
            w.WriteLine("<inputfile> [/output <outputfile>] [/ifstale]");
            w.WriteLine();
            w.Write(Name);
            w.Write(" ");
            w.Write(Version.ToString());
            if (!string.IsNullOrWhiteSpace(Description))
            {
                w.Write(" - ");
                w.WriteLine(Description);
            }
            else
            {
                w.WriteLine(" - <No description>");
            }
            w.WriteLine();
            w.WriteLine("   <inputfile>   The input template");
            w.WriteLine("   <outputfile>  The generated file - defaults to STDOUT");
            w.WriteLine("   /ifstale      Only generate if the input is newer than the output");
            w.WriteLine();
        }

        // Reports true unless the output file is at least as new as the input file
        static bool _IsStale(string inputfile, string outputfile)
        {
            if (string.IsNullOrEmpty(outputfile) || string.IsNullOrEmpty(inputfile))
                return true;
            var result = true;
            try
            {
                if (File.GetLastWriteTimeUtc(outputfile) >= File.GetLastWriteTimeUtc(inputfile))
                    result = false;
            }
            catch { }
            return result;
        }

        static string _GetCodeBase()
        {
            try { return Assembly.GetExecutingAssembly().GetModules()[0].FullyQualifiedName; }
            catch
            {
                return Path.Combine(Environment.CurrentDirectory,
                    typeof(Program).Namespace + ".exe");
            }
        }

        // Uses the assembly title if there is one, otherwise the executable's name
        static string _GetName()
        {
            try
            {
                foreach (var attr in Assembly.GetExecutingAssembly().CustomAttributes)
                {
                    if (typeof(AssemblyTitleAttribute) == attr.AttributeType)
                    {
                        return attr.ConstructorArguments[0].Value as string;
                    }
                }
            }
            catch { }
            return Path.GetFileNameWithoutExtension(Filename);
        }

        static Version _GetVersion()
        {
            return Assembly.GetExecutingAssembly().GetName().Version;
        }

        static string _GetDescription()
        {
            string result = null;
            foreach (Attribute ca in Assembly.GetExecutingAssembly().GetCustomAttributes())
            {
                var ada = ca as AssemblyDescriptionAttribute;
                if (null != ada && !string.IsNullOrWhiteSpace(ada.Description))
                {
                    result = ada.Description;
                    break;
                }
            }
            return result;
        }

        // Prints the usage screen followed by the error, and returns the exit code
        static int _ReportError(Exception ex, TextWriter stderr)
        {
            _PrintUsage(stderr);
            stderr.WriteLine("Error: {0}", ex.Message);
            return -1;
        }
    }
}
You can paste the above into Program.cs in your command line project. It is boilerplate for a robust CLI application that's easy to add new parsed arguments to, has a usage screen, can detect if the output is stale, and is made such that it can be referenced as a library from another tool, like a Visual Studio Custom File Generator.
You simply put your main code where the TODO: Do work here comment is.
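As a rough idea of what that main code might look like, here is a hypothetical work step that reads from a TextReader and writes generated C# to a TextWriter, which are the two things the boilerplate hands you. The names WorkSketch and DoWork, the className parameter and the shape of the generated class are all made up for illustration.

```csharp
using System;
using System.IO;

static class WorkSketch
{
    // Hypothetical work step: embeds each line of the input as an entry
    // in a string array inside a generated static class.
    public static void DoWork(TextReader input, TextWriter output, string className)
    {
        output.WriteLine("// <auto-generated/>");
        output.WriteLine("static class " + className);
        output.WriteLine("{");
        output.WriteLine("    public static readonly string[] Lines = {");
        string line;
        while (null != (line = input.ReadLine()))
        {
            // verbatim string literals only need embedded quotes doubled
            output.WriteLine("        @\"" + line.Replace("\"", "\"\"") + "\",");
        }
        output.WriteLine("    };");
        output.WriteLine("}");
    }

    static void Main()
    {
        // Stand-ins for the input and output the boilerplate would open
        DoWork(new StringReader("first line\nsay \"hi\""), Console.Out, "Embedded");
    }
}
```

In the boilerplate, a call along the lines of WorkSketch.DoWork(input, output, "Embedded") would take the place of the comment, and the finally block takes care of closing the files.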
One thing you'll want to avoid when you build a tool like this is using external libraries unless absolutely necessary. The reason is that, as a pre-build step, requiring extra dependencies for the tool complicates its use. I know it's common practice with .NET to have half a dozen .dlls for even the smallest project, but here you'll want to depart from that because a bunch of .dlls in your build tree is just a mess.
You can use csbrick to crunch all of the source files for a dependent project into a single file "code brick" that can be included into your project in lieu of using a .dll. Just remove the dependency from your references and add a pre-build step that runs csbrick with all your source files from the project. Then add the generated output to your project.
Hopefully, this article helps you improve and streamline your development.
History
- 28th October, 2021 - Initial submission