Better test data generation with Benerator

Part of the development process for enterprise applications involves performance and load testing, but it’s not always easy to generate realistic, production-like data, especially in the volumes needed to properly stress the system. It’s even harder to do so early enough in the process to avoid extensive redesigns should bottlenecks be found.

Benerator addresses these issues. It makes it easy and fast to configure data generation in early project stages, so you can start performance testing as soon as you have running code. You can run Benerator from a nightly build system and trigger nightly automatic performance tests. As your project evolves, you can fine-tune data generation, or extract and anonymize real production data.

Benerator is the brainchild of German developer Volker Bergmann, who began working on it for his own use almost four years ago. “I decided to write my own tool because I found no tool on the market that was versatile and powerful enough for real performance-testing tasks, and that offered scalability, production-like data variations, invocation from a build tool, and support for Windows (dev machines), Linux (test machines), and Solaris (production).

“Benerator’s biggest difference from other tools is its abstractness and extendibility to different data formats, software systems, uniqueness algorithms, distribution methods, and even logical domains. So, for instance, users can create financial transactions based on the population density of a geographic region, with a defined distribution concentrated between 9 and 5 o’clock on working days, with valid credit card numbers, and export the data to databases or batch files or feed them directly into an application – all with only a few lines of Benerator descriptor code and appropriate data files.”
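To make that scenario concrete, here is a minimal sketch in plain Python of the kind of data Bergmann describes. It is not Benerator descriptor code and does not use Benerator's API. The function names, region names, weights, date range, and amount range are all hypothetical values chosen for illustration. It also simplifies the scenario: timestamps are drawn uniformly from 9:00 to 17:00 on weekdays rather than merely concentrated there, and region weights stand in for population density.

```python
import random
from datetime import date, datetime, time, timedelta


def luhn_check_digit(partial: str) -> int:
    """Return the digit that makes partial + digit pass the Luhn checksum."""
    total = 0
    # Walk right to left; the rightmost digit of `partial` ends up next to
    # the check digit, so it is the first one to be doubled.
    for i, ch in enumerate(reversed(partial)):
        d = int(ch)
        if i % 2 == 0:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return (10 - total % 10) % 10


def luhn_valid(number: str) -> bool:
    """Check a complete number (including check digit) against Luhn."""
    total = 0
    for i, ch in enumerate(reversed(number)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def random_card_number(rng: random.Random, prefix: str = "4", length: int = 16) -> str:
    """Build a Luhn-valid number; it passes the checksum but is not a real card."""
    body = prefix + "".join(
        str(rng.randrange(10)) for _ in range(length - len(prefix) - 1)
    )
    return body + str(luhn_check_digit(body))


def random_business_timestamp(rng: random.Random, start: date, days: int) -> datetime:
    """Pick a weekday within `days` of `start` and a time in 09:00-17:00.

    Assumes the range contains at least one weekday (days >= 3 guarantees it).
    """
    while True:
        day = start + timedelta(days=rng.randrange(days))
        if day.weekday() < 5:  # Monday=0 .. Friday=4
            break
    offset = rng.randrange(8 * 3600)  # seconds after 09:00, before 17:00
    return datetime.combine(day, time(9, 0)) + timedelta(seconds=offset)


def generate_transactions(count: int, seed: int = 42) -> list:
    """Generate `count` illustrative transaction records."""
    rng = random.Random(seed)
    regions = ["north", "south", "east"]  # hypothetical regions
    weights = [5, 3, 2]                   # made-up stand-in for population density
    rows = []
    for _ in range(count):
        rows.append({
            "region": rng.choices(regions, weights=weights)[0],
            "timestamp": random_business_timestamp(rng, date(2024, 1, 1), 28),
            "card_number": random_card_number(rng),
            "amount": round(rng.uniform(1, 500), 2),
        })
    return rows


if __name__ == "__main__":
    for row in generate_transactions(3):
        print(row)
```

Even this stripped-down version takes several dozen lines of hand-written code, which gives a sense of the effort a declarative descriptor is meant to replace.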

Bergmann began developing Benerator with Maven 2 as the build tool and is “quite happy with its reliability and structuredness, compared to Ant and Maven 1. For an IDE, I used Idea in the beginning, then switched to Eclipse because of weaknesses in my Idea version’s (4 or so) Subversion support, but I am now considering going back to a new version of Idea.”

Benerator can be licensed under both open and closed source licenses. Bergmann says, “I chose the open source approach in part to contribute a tool to the open source world that has provided so many good tools to me as a developer. This approach also allowed the product to mature over time and with user feedback, reducing effort and the risks of closed source software development. On the other hand, it was clear from the beginning that the definition of a tool like this would be hard work and take a lot of time. So far I have created 70,000 lines of Benerator code, and I want to avoid either having a larger company take over the source code, repackage it, and resell it, leaving me to go away empty-handed, or having to give up Benerator development because I need to do other work to pay my costs of living.”

Despite the already broad scope of the application, Bergmann has a big to-do list:
“- One of the more challenging tasks is to make Benerator scale on multicore systems.
- I am considering supporting Ant tasks in Benerator.
- While implementing test data generation, I have created almost all the pieces I need for using Benerator as a load driver – a system that simulates user activity on a system for inducing the test load. I am not yet sure whether to plug it into Grinder or JMeter or have some standalone functionality. Comments on this topic are welcome.”

He also welcomes help on some specific tasks:
“- Benclipse, the Benerator Eclipse plugin, needs to be upgraded to the new Benerator version and could be improved, but I am a bit scared of the task. Sorry, Eclipse guys – I know you do hard work, but it is a pain to assemble the bits of knowledge needed to make anything work under Eclipse. I really need help there.
- I’d like to be involved in organizations’ projects for generating test data, either remotely or on-site. I’ve picked up some very valuable features from project experience.
- I also would appreciate hearing from people who manage to pay their rent from open source software development.”

And of course Bergmann would like feedback from Benerator users, and help from happy users to spread the word about what a great, powerful, and unique tool he has created!