Patching Generated Files

Code generators are often a smart choice, and it pays to have a fallback strategy for the cases where they fall short.

Code generators can be very useful tools. In the DirectX Shader Compiler project, for example, the hct database instruction helper script generates reference documentation from metadata tables. It also updates tables in the specification, as well as language tables such as overloads and various enumerations directly in the C++ source code.

For more complex code generators that start to encroach on transpiler territory, you might run into cases where the output isn't quite what you'd like, but updating the tool might take some additional work that you can't afford to do immediately.

In these cases, one approach that can save you is the use of patch files.

Patching step by step

First, make sure you have diff and patch on your PATH. These are typically available on Linux systems, and if you have git installed on Windows, you'll also have them on hand. See my post on finding programs to learn how to automate finding and calling them.
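As a quick sanity check, something like the following can confirm that both tools are reachable. This is only a sketch for a POSIX-style shell (on Windows, the Git Bash prompt); have_tool is a helper name I made up for illustration.

```shell
# have_tool is a hypothetical helper: it succeeds if the named program is on PATH.
have_tool() {
  command -v "$1" >/dev/null 2>&1
}

# Report where each tool lives, or complain if it can't be found.
for tool in diff patch; do
  if have_tool "$tool"; then
    echo "found $tool at $(command -v "$tool")"
  else
    echo "missing $tool; install it or add it to PATH" >&2
  fi
done
```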

Next, run the code generation tool and make sure that you have the "bad" state you want to fix. I'll assume it's in file.txt.

Now, create a copy of that file, updated.txt, and make the modifications you want there. Try to keep them small: this works best for small tweaks rather than global changes.

Once you're happy with your changes, you can run diff -c file.txt updated.txt > file.patch. This will capture the changes that you've made as a context diff.

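Run patch file.txt file.patch. Now file.txt has the new contents. Depending on your version of patch, the original may be kept as file.txt.orig; GNU patch only makes that backup automatically when the patch doesn't apply exactly, so pass -b if you want it every time. If you're happy with your changes, commit file.patch and add the patching command to your code generation scripts, so the changes get re-applied whenever the generator overwrites the file.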
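Putting the steps together, the whole round trip might look like the sketch below. The generate function is a stand-in I invented for whatever your real code generator is, and the file contents are made up; only the diff and patch invocations come from the steps above.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Hypothetical stand-in for the real code generator: it always emits the "bad" output.
generate() {
  printf 'int a = 1;\nint b = 2;\nint c = 3;\n' > file.txt
}

# One-time setup: make the tweak in a copy and capture it as a context diff.
generate
sed 's/int b = 2;/long b = 2;/' file.txt > updated.txt
# diff exits with status 1 when the files differ, which is the expected case here.
diff -c file.txt updated.txt > file.patch || [ $? -eq 1 ]

# Every later run: regenerate (which clobbers the tweak), then re-apply the patch.
generate
patch file.txt file.patch
cat file.txt
```

After the last step, file.txt should match updated.txt again, even though the generator overwrote it in between.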

Committing Generated Files

I recommend committing the generated file.txt file as well. If your build process consumes it and its contents end up in your binaries, you'll want your source indexing to point to this file.

To make sure that you don't miss updates, you probably want to have your build process generate files on the side, then compare them with the committed files, and fail the build if they've diverged. At that point you can have a human decide whether they should be updated, or whether a problem was introduced with code generation.
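Here's a rough sketch of what that check could look like as a build step. The check_generated helper and the file names are hypothetical, and the two printf calls stand in for a real generator run; the point is just that a nonzero exit status is what stops the build.

```shell
# check_generated is a hypothetical helper: compare a committed file with a
# freshly generated one, show the differences, and return nonzero if they diverge.
check_generated() {
  committed=$1
  fresh=$2
  if diff -u "$committed" "$fresh"; then
    echo "up to date: $committed"
  else
    echo "error: $committed differs from freshly generated output" >&2
    return 1
  fi
}

work=$(mktemp -d)
printf 'token A\ntoken B\n' > "$work/committed.txt"

# Case 1: the generator reproduces the committed file, so the check passes.
printf 'token A\ntoken B\n' > "$work/fresh.txt"
check_generated "$work/committed.txt" "$work/fresh.txt"

# Case 2: the generator output changed; a real build would stop here
# and leave the decision to a human.
printf 'token A\ntoken C\n' > "$work/fresh.txt"
check_generated "$work/committed.txt" "$work/fresh.txt" || echo "build would fail"
```

Printing the diff rather than just failing is deliberate: it gives whoever looks at the broken build enough context to decide whether to update the committed file or go fix the generator.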

I remember working on a product, which will go unnamed here, that ran yacc and lex as part of its build and didn't commit the generated files. Looking at crash dumps without being able to see any of that code, or the comments near it, made debugging so, so much harder.

Happy patching!

Tags: coding, powershell
