Relentless Build Automation

How far is going to far when you are doing build automation? My answer to that question, is that it's basically not possible to go to far. Any task that you need to do more than three times deserves automation. Three times is my heuristic. The longer and more complex the task, the smaller the number you should allow yourself to do it by hand. Compiling for example is a pain if you don't automate it, so you automate it the first time. Computers are good at one thing, running algorithms. If you are doing those algorithms by hand, you're wasting your computer's potential.

Why Automate?

There are two major reasons to automate:

Reduce errors
Being lazy

Reducing errors is easy. We want to have a repeatable, dependable process that anyone can use to achieve the same results. If your deploy master has to know 15 steps and the ins and outs of some complex process, what happens when she goes on vacation? You don't do deployment. By automating you allow other people to perform the same task and you keep your computer working for you.

But being lazy? Yes! A truly great computer programmer is lazy and does everything in their power to avoid having to work that they've already done before. Likewise, if you've done it 10 times, doesn't it get boring to do the same thing over and over again? Wouldn't you rather be getting coffee or reading blogs while your computer works at what it does best? In addition to general laziness, automation will make your life easier when you want to do other things that you know you should be doing like unit testing, integration testing and functional testing.

What to Automate

Going back to my introduction, I say automate everything that you or someone new coming onto a project will need to do more than a few times. A little bit of effort up front can save every member of a team a lot of effort later on. So, as a punch list to get you started:

Compiling code
Packaging code into JARs, WARs, etc
Running unit tests

Most people doing Java development will automate code compilation and packaging all of the time. Some of the times, they'll automate unit tests, sometimes deployment, and rarely a few other things. We usually use some excellent script-based tools for that kind of thing like Ant or Maven. Even if you do all of these things, is that going far enough? I think that for most projects the answer is no.

If you are slightly more ambitious you will also automate:

Integration testing
Functional testing
Deployment

There, now you've got all of your day-to-day activities automated. So you must be done right?

Generating Databases

Databases are the bane of automation. People don't take the time to script their database creation or to script their default or testing data. When you don't script your database, you make integration and functional testing very difficult. It's really easy to write functional tests and integration tests if they rely on the data being in a known state. If they do not, then running them is very difficult because the data underneath is always changing. That usually means that integration and automated functional testing are deemed to hard and are not done. But don't blame the tests. You're avoiding a little bit of work, that if you did it, would allow you to avoid A LOT of work. (I think little jobs that prevent you from having to do big jobs are a good tradeoff.)

Generating your database will make new team members lives a breeze. They should be able to check out a project from source control and with one simple command, generate a database, compile the code, run the tests and know with confidence that everything is working for them. When they make any changes to the project they can repeat this process to know with confidence that they haven't broken anything. Likewise in day-to-day development, being able to get back to a known state is going to give you the ability to develop with more confidence and test with ease.

Ant has a core task for running SQL queries, so you don't even need any other tools to accomplish this. One problem that you might run into is that running DDL commands in JDBC can cause some problems. I found a great article on executing PL/SQL with Ant that you can probably apply to other databases as well. It allows you to run more database specific queries to build your schema from a blank database. This helped me overcome the last hurdle in a recent project to get it so I had the entire database scripted from scratch. So check it out.

I find that having a few standard Ant tasks makes this a breeze:

A task to create tablespaces (if you have separate tablespaces for each database)
A task to create a user
A task to generate the full schema
A task to delete the user and drop the schema

Together these tasks can be woven together to fully to create the full database. To support development and testing, it is also handy to have loadable datasets around. Either maintain a list of SQL INSERT statements or use a tool like DBUnit to load the data into your database. Often it can be nice to have a few sets of base data:

Unit testing data
Functional testing data
Demo data for showing of to clients and manager

The only thing lacking is an easy way to dump a schema from a working database so that you can use nice GUI tools to modify the database and then dump it out using an easy Ant target. Even with this, you'll likely need to maintain change scripts so that you can apply them to working production systems without wiping out all of the data. I guess since I'm complaining, that will have to be my next side project. Leave a comment if you know how to do this.

Conclusion

With a little bit of effort in going all the way with automation, you can make it easy for new people to come onto a project, you can make it easier to create and run all kinds of tests, and you can make the entire development process easier and more confident. The tools exist to help make automation easy, so why not?

What things do you like to automate that you see people skipping on projects?

Resources

Java Unit Testing: JUnit, TestNG Web Functional Testing: Selenium Java Build Automation: Ant

Update

Looks like someone is a couple of steps ahead of me. While not a final release, the Apache DdlUtils contains a set of Ant tasks that give you the ability to read and write database schemas as well as loading data sets defined in XML documents. Something to keep an eye on...