From: Steve Howell <showell@zi...> - 2002-08-24 01:43:35
Curious phenomenon with YAML in Python.
I wanted to write a YAML loader, so:
1) I wrote acceptance tests in Python.
2) Soon a subset of YAML emerged.
3) I got tired of Python's verbose way of specifying data.
4) So I started writing acceptance tests in YAML.
Now I am discovering another strange loop. I want to make a schema-driven YAML
loader that runs on top of an event-based parser. This type of two-layer
architecture is described in Chapter 3 of the YAML spec. I'm not interested in
implementing the parser quite yet--I am more interested in experimenting with
the loader interface and the schema language. So, I decide to create a simple
mock parser, which will behave like a true YAML parser from the loader's
perspective. Here is how that cycle goes:
1) I write a simple mock parser that you preload with a bunch of canned
parser events.
2) I write a test for a loader that uses a particular set of canned parser
events, corresponding to a simple array. Basically, all I'm checking is that
the loader can take the canned parser events and build a data structure from them.
3) I write enough of the loader to make that test pass.
4) I write a test for a loader that uses another set of canned parser
events, corresponding to a simple hash.
5) I write enough of the loader to make that test pass as well.
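
Here is a minimal sketch of that cycle, under stated assumptions. The event
names ("seq_start", "scalar", "map_start" and so on), the tuple layout, and the
names MockParser, next_event and load are all hypothetical, made up for
illustration; they are not taken from the YAML spec or from any existing
implementation.

```python
class MockParser:
    """Stands in for a real pull parser: returns preloaded events in order."""

    def __init__(self, events):
        self._events = iter(events)

    def next_event(self):
        # The loader pulls one event at a time, as it would from a real parser.
        return next(self._events)


def load(parser):
    """Pull events from the parser and build the matching Python object."""
    return _build(parser, parser.next_event())


def _build(parser, event):
    kind = event[0]
    if kind == "scalar":
        return event[1]
    if kind == "seq_start":
        items = []
        event = parser.next_event()
        while event[0] != "seq_end":
            items.append(_build(parser, event))
            event = parser.next_event()
        return items
    if kind == "map_start":
        result = {}
        event = parser.next_event()
        while event[0] != "map_end":
            key = _build(parser, event)
            result[key] = _build(parser, parser.next_event())
            event = parser.next_event()
        return result
    raise ValueError("unexpected event: %r" % (event,))


# Canned events for a simple array and a simple hash (hypothetical format).
array_events = [("seq_start",), ("scalar", "a"), ("scalar", "b"), ("seq_end",)]
hash_events = [("map_start",), ("scalar", "k"), ("scalar", "v"), ("map_end",)]

assert load(MockParser(array_events)) == ["a", "b"]
assert load(MockParser(hash_events)) == {"k": "v"}
```

The loader only ever calls next_event(), so a real parser could presumably be
swapped in later without touching the loader, provided it produced the same
kind of events.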
But then I want to mock what a parser would generate for more complex Python
data structures, and hand-generating the parser events begins to get a little
tedious. Then it occurs to me--just as I want a pull-based parser, I
also want a push-based dumper/emitter. (Again I refer you to Chapter 3 of the
YAML spec.) And since I'm gonna write a push-based dumper/emitter anyway, I
might as well write the dumper now, and have it generate the series of events
for the mock pull parser. So now I work on the dumper:
1) I write a mock emitter that just captures dump events, without actually
emitting them to YAML.
2) I write a real event-pushing dumper, but one which only works on simple
Python arrays.
3) I write a test that verifies this dumper, which will use the mock emitter
to capture the dump events and then assertEquals the mock emitter's list of
events with the events that I want.
4) I repeat this process for a simple hash.
5) Then I want to make sure the real event-pushing dumper can dump a complex
data structure.
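
A rough sketch of the dumper side, again under assumptions: MockEmitter, emit,
dump, and the event tuples are hypothetical names chosen for illustration, not
an existing API. The dumper pushes events into whatever emitter it is given,
and the mock emitter just records them so a test can compare the list.

```python
class MockEmitter:
    """Captures dump events in a list instead of writing YAML text."""

    def __init__(self):
        self.events = []

    def emit(self, event):
        self.events.append(event)


def dump(obj, emitter):
    """Walk a Python object and push one event per node to the emitter."""
    if isinstance(obj, list):
        emitter.emit(("seq_start",))
        for item in obj:
            dump(item, emitter)
        emitter.emit(("seq_end",))
    elif isinstance(obj, dict):
        emitter.emit(("map_start",))
        for key, value in obj.items():
            dump(key, emitter)
            dump(value, emitter)
        emitter.emit(("map_end",))
    else:
        # Anything else is treated as a scalar in this sketch.
        emitter.emit(("scalar", obj))


emitter = MockEmitter()
dump(["a", {"k": "v"}], emitter)

expected = [
    ("seq_start",),
    ("scalar", "a"),
    ("map_start",),
    ("scalar", "k"),
    ("scalar", "v"),
    ("map_end",),
    ("seq_end",),
]
assert emitter.events == expected
```

The captured list has the same shape as the canned events a mock pull parser
would be preloaded with, which is what lets the dumper generate test input for
the loader.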
But writing all these tests in Python is cumbersome. I want to represent all
this data--whether it be the Python data structures themselves, or the data
structures representing the list of parser/dumper events--in YAML. And I can!
I am bootstrapped for a two-layer YAML architecture with my existing one-layer
implementation. Writing the monolithic YAML enables me to write the more
modular YAML later.
I think this whole bootstrapping process is cool. I probably made a mistake in
not writing the two-layer architecture up front, but I like how all of the
mocking and scaffolding elegantly works together, with a little bit of thought.