On Mon, Aug 28, 2006 at 12:24:22PM -0400, Chris Ross wrote:
> Hello there, folks. Okay, so I've sat down and am trying to write
> a bit of code that will parse a YAML configuration file, and read it
> into a structure in memory that I can then inquire of to get the
> desired configuration information. However, the issues I'm having
> are really all "intro to YAML and libyaml" type questions, I think. :-)
>
> So, I'm using the parser, as suggested. I have looked at the API
> synopsis on the PyYAML page for libyaml, and am getting results
> similar to those I'd expect. However, I think I made an erroneous
> presumption. I guess I thought that the library would return a
> SCALAR event with both the key and value, when the key/value was for
> a scalar. But, it seems to return something more like tokens. If
> this is right, and I'm seeing this correctly, how do I know when I
> get a SCALAR event whether it's for a key or a value? Does this all
> only work inside of a mapping? When a SCALAR exists outside of a
> mapping, is it always just a "thing", and cannot have a key/value
> type identity? If inside of a mapping, is the first (the third, etc)
> event always the KEY, and the next (be it a scalar, or a start-thru-
> end of seq or mapping) is the VALUE? I can keep track of this in my
> code, but honestly I expected the library to do that for me. There's
> a parser, so it should parse and comprehend to a limited degree. I
> think I'm seeing something more like a tokenizer that doesn't do much
> of anything else.
First, I'm not sure you are using the correct API. Are you calling
yaml_parser_parse()? yaml_parser_parse() is the parser, while
yaml_parser_scan() is the tokenizer.
The events produced by the parser satisfy the following grammar:
    stream   ::= STREAM-START document* STREAM-END
    document ::= DOCUMENT-START node DOCUMENT-END
    node     ::= ALIAS | SCALAR | sequence | mapping
    sequence ::= SEQUENCE-START node* SEQUENCE-END
    mapping  ::= MAPPING-START (node node)* MAPPING-END
The first (and 3rd, 5th, etc) nodes in the mapping production are keys
while the second (and 4th, 6th, etc) nodes are the corresponding values.
Note that in YAML, sequence items, mapping values, and even mapping keys
can be complex objects like sequences or mappings. Therefore you shouldn't
expect, say, that the second event after MAPPING-START is a mapping
value: if the key is itself a sequence or a mapping, the value comes only
after that key's matching END event.
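If you would rather track the key/value role yourself in a flat loop, one possible approach is a small stack of open collections. The sketch below does not use libyaml at all: the ev_kind and role enums, MAX_DEPTH, and assign_roles() are hypothetical names invented for illustration, standing in for the node-level events of the grammar above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the node-level events in the grammar. */
typedef enum { EV_SCALAR, EV_ALIAS, EV_SEQ_START, EV_SEQ_END,
               EV_MAP_START, EV_MAP_END } ev_kind;

/* The role of the node an event starts; ROLE_NONE for END events. */
typedef enum { ROLE_NONE, ROLE_ROOT, ROLE_ITEM,
               ROLE_KEY, ROLE_VALUE } role;

#define MAX_DEPTH 64   /* arbitrary nesting limit for this sketch */

void assign_roles(const ev_kind *ev, size_t n, role *out)
{
    /* One frame per open collection. */
    struct { int is_mapping; int expect_key; } stack[MAX_DEPTH];
    size_t depth = 0;

    for (size_t i = 0; i < n; i++) {
        if (ev[i] == EV_SEQ_END || ev[i] == EV_MAP_END) {
            assert(depth > 0);
            depth--;                  /* close the innermost collection */
            out[i] = ROLE_NONE;
            continue;
        }
        /* This event starts a node; its role depends on the parent. */
        if (depth == 0) {
            out[i] = ROLE_ROOT;
        } else if (!stack[depth - 1].is_mapping) {
            out[i] = ROLE_ITEM;
        } else {
            out[i] = stack[depth - 1].expect_key ? ROLE_KEY : ROLE_VALUE;
            /* Keys and values alternate within one mapping. */
            stack[depth - 1].expect_key = !stack[depth - 1].expect_key;
        }
        if (ev[i] == EV_SEQ_START || ev[i] == EV_MAP_START) {
            assert(depth < MAX_DEPTH);
            stack[depth].is_mapping = (ev[i] == EV_MAP_START);
            stack[depth].expect_key = 1;
            depth++;
        }
    }
}
```

For the document { [a, b]: c, d: e }, the SEQUENCE-START event is the first key, the scalars a and b are items of that key, and c only becomes the value once the key's SEQUENCE-END has been seen.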
The code processing YAML events with libyaml should look like this (sans
error handling):
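#include <yaml.h>

yaml_parser_t parser;   // Initialized elsewhere, with its input set

void process_node(yaml_event_t *event);
void process_sequence(yaml_event_t *event);
void process_mapping(yaml_event_t *event);

void process_stream(void)
{
    yaml_event_t event;

    yaml_parser_parse(&parser, &event);     // Eat STREAM-START
    yaml_event_delete(&event);
    while (1) {
        yaml_parser_parse(&parser, &event); // Eat STREAM-END or
                                            // DOCUMENT-START
        if (event.type == YAML_STREAM_END_EVENT) break;
        yaml_event_delete(&event);
        yaml_parser_parse(&parser, &event); // Eat the first event of
                                            // the document's root node
        process_node(&event);
        yaml_parser_parse(&parser, &event); // Eat DOCUMENT-END
        yaml_event_delete(&event);
    }
    yaml_event_delete(&event);              // Delete STREAM-END
}

// Each process_* function takes the first event of a node and
// deletes every event it consumes, including that first one.
void process_node(yaml_event_t *event)
{
    if (event->type == YAML_SEQUENCE_START_EVENT) {
        process_sequence(event);
    }
    else if (event->type == YAML_MAPPING_START_EVENT) {
        process_mapping(event);
    }
    else if (event->type == YAML_ALIAS_EVENT) {
        // Do something with the alias or produce an error message
        yaml_event_delete(event);
    }
    else {
        // Process a scalar event
        yaml_event_delete(event);
    }
}

void process_sequence(yaml_event_t *event)
{
    yaml_event_delete(event);               // Delete SEQUENCE-START
    while (1) {
        yaml_parser_parse(&parser, event);  // Eat the first event of
                                            // the next sequence item
        if (event->type == YAML_SEQUENCE_END_EVENT) break;
        process_node(event);                // Process a sequence item
    }
    yaml_event_delete(event);               // Delete SEQUENCE-END
}

void process_mapping(yaml_event_t *event)
{
    yaml_event_delete(event);               // Delete MAPPING-START
    while (1) {
        yaml_parser_parse(&parser, event);  // Eat the first event of
                                            // the next mapping key
        if (event->type == YAML_MAPPING_END_EVENT) break;
        process_node(event);                // Process a mapping key
        yaml_parser_parse(&parser, event);  // Eat the first event of
                                            // the corresponding value
        process_node(event);                // Process the value
    }
    yaml_event_delete(event);               // Delete MAPPING-END
}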
The code can become really complicated once you add error handling and
the processing of your configuration format. I think I'll add a
node-based API (like DOM), which should make this easier.
--
xi