Hi Trans,

Good timing. I'm not sure where to start. Maybe with this: https://github.com/ingydotnet/yaml-pgx/blob/master/yaml.pgx

That's a working Pegex grammar. Pegex is a parsing framework that can turn that grammar into a working parser in any language that has regular expression support. There's a lot of work yet to do, but I'm working on it. Please join me.

Here's a complete JSON grammar: https://github.com/ingydotnet/json-pgx/blob/master/json.pgx. You can find other working grammars in repositories ending in -pgx under my GitHub account.

My vision for Pegex is to have a yaml.pgx-based implementation in every language, so that bugs can be fixed in exactly one place.


Regarding your grammar, a couple things:


Perl has had a YAML subset called YAML::Tiny (https://github.com/Perl-Toolchain-Gang/YAML-Tiny#readme), and it is used heavily in CPAN tooling.

It seems like a good idea at first, but it's not YAML, and yet people think it is, so they eventually get bad surprises. I'm not a fan, and I am working to make a full YAML parser in less code than YAML::Tiny. (Note: even though I'm not a fan, I am a major contributor, because it's a big part of YAML in Perl and I want it steered correctly.)


TOML is not very impressive. It's a slightly modified INI. That said, I use INI quite often for config stuff these days. It's made for configuration. YAML is a data serialization language that just happens to get (mis)used for config. I say this because Oren and Clark and I always considered that YAML was not the best thing for config.

TOML has no Dumper. It can't be used for anything but config. So I don't think we need to worry about TOML replacing YAML except in places where INI might have been better in the first place.


Back to the Pegex grammar. This is my most exciting work to date. It goes deeper than I could possibly express in this email. I'd love it if you'd join forces with me. #pegex on irc.freenode.net.

Cheers, Ingy

On Thu, Jun 26, 2014 at 4:02 AM, Trans <transfire@gmail.com> wrote:
I have been thinking more and more about the need for a light-weight version of YAML, especially after reading https://github.com/toml-lang/toml#but-why, and learning that Rust's new package manager apparently will be using TOML. There are some things to like about TOML; after all, it is essentially the old INI format with a couple of extra features. But it's those extra features that cause me rage --they are downright ugly and confusing.

Anyway, I wanted to see if I could put together an EBNF for a "diet YAML" --basically a YAML without types, complex map keys, etc. Just the basics people use to hand write configuration files and such. How does this look so far:

  Yaml ::= Data*
  Data ::= (Scalar | Sequence | Mapping )
  Scalar ::= (Number | String | Date | Boolean | Nil)
  Sequence ::= (InlineSequence | IndentedSequence)
  InlineSequence ::= "[" Data ("," Data)* "]"
  IndentedSequence ::= OptionalTab "-" Data ("\n" OptionalTab "-" Data)*
  Mapping ::= ( InlineMapping | IndentedMapping )
  InlineMapping ::= "{" Key ":" Data ("," Key ":" Data)* "}"
  IndentedMapping ::= Tab Key ":" Data ("\n" Tab Key ":" Data)*
  OptionalTab ::= Space*
  Tab ::= Space+
  Key ::= Scalar
  String ::= '"' [A-Za-z0-9]* '"' | [A-Za-z0-9]+
  Number ::= ("+" | "-")? [0-9]* ("." [0-9]+)?
  Date ::= [0-9][0-9][0-9][0-9] "-" [0-1][0-9] "-" [0-3][0-9] ( [0-2][0-9] ":" [0-5][0-9] ":" [0-5][0-9] )?
  Boolean ::= "true" | "false"
  Nil ::= "~"
  Space ::= " "

What am I missing? What needs fixing?
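
One rough way to probe the scalar rules is to transcribe them literally into regular expressions and see which rule claims a given token. The following is a minimal Python sketch, not a parser for the whole grammar. It assumes ordered (first-match) choice in the order Scalar lists its alternatives, and it assumes each token must match a rule in full; the EBNF itself specifies neither. The names RULES, classify, and all_matches are made up for illustration.

```python
import re

# Literal transcriptions of the scalar rules from the draft grammar,
# kept in the order that Scalar lists its alternatives.
RULES = [
    ("Number", r'[+-]?[0-9]*(?:\.[0-9]+)?'),
    ("String", r'"[A-Za-z0-9]*"|[A-Za-z0-9]+'),
    ("Date", r'[0-9]{4}-[0-1][0-9]-[0-3][0-9]'
             r'(?:[0-2][0-9]:[0-5][0-9]:[0-5][0-9])?'),
    ("Boolean", r'true|false'),
    ("Nil", r'~'),
]


def classify(text):
    """Return the first rule whose pattern matches the whole token (assumed ordered choice)."""
    for name, pattern in RULES:
        if re.fullmatch(pattern, text):
            return name
    return None


def all_matches(text):
    """Return every rule that matches the whole token, to expose overlaps between rules."""
    return [name for name, pattern in RULES if re.fullmatch(pattern, text)]


if __name__ == "__main__":
    for token in ["42", "true", "", "-", "2014-06-26", "~", "hello world"]:
        print(repr(token), classify(token), all_matches(token))
```

Under those assumptions, the sketch suggests a few things to look at: "true" is claimed by String before Boolean is ever tried, the empty string and a bare "-" both satisfy Number because every part of that rule is optional, and a string containing a space matches no rule at all.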

Note, you can put this code into http://bottlecaps.de/rr/ui and get a nice picture of it all. Anyone else have a good tool for drawing these?


Yaml-core mailing list