From: Brian I. <in...@tt...> - 2002-09-08 01:03:07
On 07/09/02 21:43 +0200, Armin Roehrl wrote:
> Hi all,
>
> this is a 1st draft for the Linux journal article from Stefan
> and me. Please tell us what is wrong/missing/, etc.
>
> Thanks
>
> Armin & Stefan
> P.S. I will be away till Wed. evening.
> -----------------------
>
> YAML Ain't Markup Language
> ==========================
>
> Armin Röhrl and Stefan Schmiedl
>
> YAML (rhyming with "camel") is a new language suitable for
> encoding data. Its data format is easily parseable by
> machine and human and especially meant to be used with
> scripting languages such as Perl, Python or Ruby. YAML is
> optimized for data serialization, keeping configuration settings,
> log files, and realizing Internet messaging and filtering.
>
> If you think that XML is too verbose and made for computers
> and not for humans, then YAML is for you. YAML emerged from
> the union of two efforts: Brian Ingerson needed a
> serialization format for Inline [Inline] while Clark Evans

(I would downplay Inline as most readers will not be familiar with it)

Brian Ingerson had created Data::Denter, a human readable and safely parsable
serialization format for Perl while Clark ....

> and Oren Ben-Kiki worked on simplifying XML.
>
> YAML is not going to replace the terabytes of XML out there
> on the internet, but complement it as a new language for
> data as used in config files, logging, reporting and
> data-driven programming. If you are stuck with lots of XML,
> there is a project going on to convert between XML and YAML
> [XMLYAML].
>
> YAML was created according to the following design goals:
>
> - YAML documents are very easily readable by humans.
> - Good interaction with scripting languages, as scripting
>   languages are not only good for small hacks, but the rise
>   of Perl, Python and Ruby shows the need for good tools
>   where YAML can help.
> - YAML uses host languages' native data structures, which is
>   a big advantage over dumb string-based XML.
> - YAML has a consistent information model.
> - YAML enables stream-based processing, which is the typical
>   way one works with files or data coming over the network.
> - YAML is expressive and extensible.
> - YAML is easy to implement. This is key to get acceptance
>   quickly.

If you indent the above bullets one space then the previous paragraph becomes
YAML. Of course, another printing format may knock it out of whack.

>
> Learning YAML
>
> Keep the highly detailed official specification [spec] for
> later and trype the YAML-Cookbook [Cookbook] first. If you
^^^^^^^^^^^^try
> haven't already, you're also going to pick up a bit of Ruby,
> which won't hurt, either.
>
> Below is an example of an invoice expressed in YAML. Structure
> is shown through indentation (one or more spaces). Sequence
> items are denoted by a dash, and key value pairs within a map
> are separated by a colon.
>
> --- !clarkevans.com/^invoice
> invoice: 34843
> date   : 2001-01-23
> bill-to: &id001
>     given  : Chris
>     family : Dumars
>     address:
>         lines: |
>             458 Walkman Dr.
>             Suite #292
>         city    : Royal Oak
>         state   : MI
>         postal  : 48046
> ship-to: *id001
> product:
>     - sku         : BL394D
>       quantity    : 4
>       description : Basketball
>       price       : 450.00
>     - sku         : BL4438H
>       quantity    : 1
>       description : Super Hoop
>       price       : 2392.00
> tax  : 251.42
> total: 4443.52
> comments: >
>     Late afternoon is best.
>     Backup contact is Nancy
>     Billsmer @ 338-4338.
>
> There are only a few things to note in addition to what we
> said above: The billing address is given a label &id001 so
> that we can easily refer to it for shipping with *id001. The

Please change all occurrences of id001 to addr. This is an old example and no
implementation uses id### anymore.

> ">" in the comments-field denotes unformatted text, which
> will be line wrapped upon output.
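
For readers who think in code, here is a rough Python 3 sketch of what the anchor/alias pair and the ">" scalar in the invoice amount to once loaded. It uses no YAML library at all: the dict is a hand-built stand-in, the `fold` helper is hypothetical and much simpler than real folding, and it assumes a loader that resolves an alias to the same in-memory object as its anchor (an implementation could copy instead).

```python
# Sketch only: plain Python stand-ins for what a YAML loader might hand back.
# No YAML library is used; the dict layout mirrors part of the invoice above.

def fold(text):
    """Simplified folding for a '>' scalar: join wrapped lines with spaces.

    Assumption: real folding also treats blank and more-indented lines
    specially; this helper ignores those cases.
    """
    return " ".join(line.strip() for line in text.splitlines() if line.strip())

# The anchor ('&id001' in the draft) labels this mapping once ...
bill_to = {"given": "Chris", "family": "Dumars"}

invoice = {
    "invoice": 34843,
    "bill-to": bill_to,
    # ... and the alias ('*id001') refers back to it, so both keys
    # can point at one shared object instead of two copies.
    "ship-to": bill_to,
    "comments": fold(
        "Late afternoon is best.\n"
        "Backup contact is Nancy\n"
        "Billsmer @ 338-4338.\n"
    ),
}

print(invoice["ship-to"] is invoice["bill-to"])
print(invoice["comments"])
```

The identity check is the point: with an alias, a change made through one key is visible through the other, which is not true of two mappings that merely look alike.
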
>
> Base components
>
> Maps in YAML are like hashes in Perl and Ruby, or
> dictionaries in Python.
>
> first name: Julia
> last name: Smith
> salary: 44

44,000

If the data seems dumb, please feel free to change. We have some dumb
examples.

>
> Sequences in YAML represent things like arrays, lists,
> tuples, and vectors.
>
> - SuSE
> - Debian
> - Black Cat
>
> Of course we can combine these as deeply as we want, so we
> build a mapping between strings and sequences:
>
> Countries:
>   - Germany
>   - USA
>   - France
> Languages:
>   - German
>   - English
>   - French
>
> YAML can also represent multi-line scalars. The two
> different styles determine whether line feeds get preserved.
>
> Python code: |
>   def sum(a,b,c):
>     return a+b+c
> HTML: >
>   This data is presented on
>   multiple lines in YAML, but
>   it will get folded into one
>   line when it's loaded.
>
> On the other hand, you can use an efficient inline syntax
> for maps and sequences, too. So
>
> Verbose:
>   - name: Steve
>     languages:
>       - Python
>       - Ruby
>   - name: Clark
>     languages:
>       - Python
>       - C
>   - name: Neil
>     languages:
>       - Perl
>       - C
>
> becomes
>
> Compact:
>   - {name: Steve, languages: [Python, Ruby]}
>   - {name: Clark, languages: [Python, C]}
>   - {name: Neil, languages: [Perl, C]}

Again, I'm not sure these data items will be meaningful to a general
audience.

>
> Aliases (defined with & and used with *) allow you to repeat
> data elements:
>
> Steve: &ScriptingLanguages
>   - Perl
>   - Python
>   - Ruby
> Ryan: *ScriptingLanguages
> Claven: [COBOL, Fortran]
>
> To top things off, it is very easy to include multiple YAML
> documents in one file or string using "---" as document
> separator. So you can use a stream presenting different YAML
> documents and let the parser sort them out.
>
> ---
> town: Baltimore
> nickname: Orioles
> ---
> town: New York
> teams:
>   - Mets
>   - Yankees
> ---
> city: Washington
> goal: >
>   Washington wants to get a baseball
>   team, so that people in Northern
>   Virginia don't have to drive an
>   hour north to get to the ballpark.
> --
  ^^ what is that?
>
> A condensed reference card is shown in the following
> textbox. Note that the card is a complete YAML-document
> itself.
>
> -------------textbox: reference-------------------------
> # A YAML(tm) reference card [Reference]
> --- #YAML:1.0
>
> Collection indicators:
>     '? ' : Key indicator.
>     ': ' : Key / value separator.
>     '- ' : Nested series entry indicator.
>     ', ' : Separate in-line branch entries.
>     '[]' : Surround in-line series branch.
>     '{}' : Surround in-line keyed branch.
>
> Scalar indicators:
>     '''' : Surround in-line unescaped scalar ('' escaped ').
>     '"'  : Surround in-line escaped scalar (see escape codes below).
>     '|'  : Block scalar indicator.
>     '>'  : Folded scalar indicator.
>     '-'  : Strip chomp modifier ('|-' or '>-').
>     '+'  : Keep chomp modifier ('|+' or '>+').
>     int  : Explicit indentation modifier ('|10' or '>2').
>          # Modifiers can be combined ('|2-', '>+10').
>
> Alias indicators:
>     '&'  : Anchor property.
>     '*'  : Alias indicator.
>
> Transfer indicators:
>     '!'  : Transfer method indicator.
>     '!!' : Transfer method with private type family.
>     '^'  : Establish/use public type family prefix.
>     '|'  : Separate public type family from format.
>
> Document indicators:
>     '#'   : Directive indicator.
>     '---' : Document separator.
>
> Misc indicators:
>     ' # ' : Throwaway comment indicator.
>     '//'  : Note (preserved comment) map key.

Leave this one out. It's not definite.

>     '='   : Default value map key.

This one is less risky, but...

>
> Collection types: ### Almost never given explicitly
>     '!map' : [ Hash table, dictionary, mapping ]
>     '!seq' : [ List, array, tuple, vector, sequence ]

Confusing for a general reader. Please omit.
>
> Scalar types:
>     `anything : Private implicit type.

This is dubious.

>     foo? bar! : String
>     [ ~, (null), (Nil) ] : Null (no value).
>     2002-12-31T18:59:59-05:00 : ISO8601 timestamp (EST)
>     2002-12-31 18:59:59 Z : Space separated timestamp (UTC)
>     [ 1234, 0x4D2, 02333 ] : [ Decimal int, Hexadecimal int, Octal int ]
>     [ 1,230.15, 12.3015e+02 ] : [ Fixed float, Exponential float ]
>     [ (inf), (-Inf), (NAN) ] : [ Infinity (float), Negative, Not a number ]
>     [ +, (true), (Yes), (ON) ] : Boolean true
>     [ -, (false), (No), (OFF) ] : Boolean false
>     ? !binary >
>         R0lG...BADS=
>     : >-
>         Base 64 binary value.

Actually, you'd best leave out this whole section. This is a hot topic.

>
> Escape codes:
>     Numeric : { "\xXX": 8-bit, "\uXXXX": 16-bit, "\UXXXXXXXX": 32-bit }
>     Protective: { "\\": '\', "\"": '"', "\ ": ' ' }
>     C: { "\a": BEL, "\b": BS, "\f": FF, "\n": LF, "\r": CR, "\t": TAB, "\v": VTAB }
>     Additional: { "\e": ESC, "\z": NUL, "\_": NBSP, "\N": NEL, "\L": LS, "\P": PS }

A bit esoteric, but OK

> --------------------------------------------------------
>
> YAML versus XML
>
> We translated a little example from YAML to XML and did a
> little bit of counting: Our "typical" YAML document is 536
> bytes long, which could be reduced to 462 bytes by using
> compact notation without losing readability. On the XML
> side we have 783 bytes for the same data, which can be
> taken down to a 683 bytes long line, which is quite
> unreadable.
>
> # Plain YAML
> title: Escape of the Unicorn
> animations:
>   - title: background sky
>     author: Justyna
>     frames:
>       - file: bg_sky_1.png
>         ms: 500
>       - file: bg_sky_2.png
>         ms: 500
>   - title: background water
>     author: Jacek
>     frames:
>       - file: bg_water.png
>         ms: 300
>       - file: bg_water1.png
>         ms: 200
>       - file: bg_water2.png
>         ms: 200
>       - file: bg_water3.png
>         ms: 300
>       - file: bg_water2.png
>         ms: 200
>       - file: bg_water1.png
>         ms: 200
>

# Compact YAML
title: Escape of the Unicorn
animations:
  - title: background sky
    author: Justyna
    frames:
      - [file: bg_sky_1.png, ms: 500]
      - [file: bg_sky_2.png, ms: 500]
  - title: background water
    author: Jacek
    frames:
      - [file: bg_water.png, ms: 300]
      - [file: bg_water1.png, ms: 200]
      - [file: bg_water2.png, ms: 200]
      - [file: bg_water3.png, ms: 300]
      - [file: bg_water2.png, ms: 200]
      - [file: bg_water1.png, ms: 200]

The above was wrong, and not fully compacted. I fixed it for you. Note that
if you say

- foo: bar

you must use two space indentation.

- foo: bar
  bar: foo

This compaction actually adds to your byte count.
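
If you want to re-check the byte counts before they go to print, a few lines of Python 3 will do it. This is only a sketch: `byte_count` is a made-up helper, and the two strings are one frame entry typed in by hand at zero indentation, which is an assumption. The difference shifts with nesting depth, so run it on the real documents rather than trusting this fragment.

```python
# Sketch only: measure the encoded size of two spellings of one frame entry.
# The snippets are typed in by hand at zero indentation (an assumption); the
# result shifts with nesting depth, so measure the real documents before
# quoting figures in the article.

def byte_count(text):
    """Return the number of bytes in text when encoded as UTF-8."""
    return len(text.encode("utf-8"))

# Block style: one key per line, second key indented under the dash.
block_entry = "- file: bg_sky_1.png\n  ms: 500\n"
# Inline style: both keys on one line inside brackets.
inline_entry = "- [file: bg_sky_1.png, ms: 500]\n"

block_size = byte_count(block_entry)
inline_size = byte_count(inline_entry)
print("block: ", block_size)
print("inline:", inline_size)
print("difference:", inline_size - block_size)
```

Feeding the full plain, compact and XML texts through the same helper would give numbers that can be compared directly with the 536/462/783/683 figures in the draft.
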
>
> # XML
> <title>Escape of the Unicorn</title>
> <animations>
>   <animation>
>     <title>0 background sky</title>
>     <author>Justyna</author>
>     <frames>
>       <frame><file>bg_sky_1.png</file><ms>500</ms></frame>
>       <frame><file>bg_sky_2.png</file><ms>500</ms></frame>
>     </frames>
>   </animation>
>   <animation>
>     <title>1 background water</title>
>     <author>Jacek</author>
>     <frames>
>       <frame><file>bg_water.png</file><ms>300</ms></frame>
>       <frame><file>bg_water1.png</file><ms>200</ms></frame>
>       <frame><file>bg_water2.png</file><ms>200</ms></frame>
>       <frame><file>bg_water3.png</file><ms>300</ms></frame>
>       <frame><file>bg_water2.png</file><ms>200</ms></frame>
>       <frame><file>bg_water1.png</file><ms>200</ms></frame>
>     </frames>
>   </animation>
> </animations>
>
>
> Interface to scripting languages
>
> There's definitely cooperation among the YAML authors to
> make a similar interface, but they also try to create
> interfaces that feel native to the language at hand. So, you
> are going to find "dump" and "load" everywhere, but the
> Python version returns an iterator, while the Perl
> implementation returns a list.
>
> Due to a language-neutral testing suite the various
> implementations are benchmarked to determine
> interoperability between implementations.
>
>
> The following example originates with the Python bindings
> written by Steve Howell.
>
> #demo.py
>
> import yaml, string
>
> ####### RUNNING YAML AGAINST THE README AND CHANGELOG
>
> readme = yaml.loadFile("README")
> print "README"
> for item in readme.next():
>     print item
> print "\n\nCONTRIBUTORS"
> for person in readme.next()['contributors']:
>     print "===%s===" % person['who']
>     print person['why?']
>     print
> print "\n\n"
> print "CHANGELOG:\n"
> print list(yaml.loadFile("CHANGELOG"))
> print "\n\n"
>
> ######## USING YAML INSIDE YOUR PROGRAM
>
> testData = \
> """
> program: PyYaml
> author: Steve Howell
> ---
> shopping list:
>   - apple
>   - banana
> todo:
>   - eat more fruit:
>      - especially bananas!
>      - good for you
>   - write a better demo
> """
>
> print "YAML INSIDE YOUR PROGRAM"
> for x in yaml.load(testData):
>     print repr(x)
> print "\n\n"
>
>
> ######### YPATH STUFF
>
> try:
>     print "YPATH EXPERIMENTATION"
>     data = yaml.load(testData)
>     print yaml.ypath("/author",data.next()).next()
>     print yaml.dump(yaml.ypath("/todo/0",data.next()).next())
> except NotImplementedError:
>     print "Experimental YPATH requires Python 2.2"
>
> ######### YAML DUMPER
>
> class Person:
>     def __init__(self, fname, lname, salary, children):
>         self.fname = fname
>         self.lname = lname
>         self.salary = salary
>         self.children = children
>         # private variables
>         self._fullname = fname + ' ' + lname
>         if salary:
>             self._sal_per_month = salary / 12.0
>         self._num_children = len(children)
>     def to_yaml(self):
>         return ({
>             'first name': self.fname,
>             'last name': self.lname,
>             'salary': self.salary
>         }, '!!Person')
>
> mrBarson = Person('Foo', 'Barson', 20, ['ex', 'theomatic'])
> mrDoe = Person('John', 'Doe', None, [])
> print yaml.dump([mrBarson, mrDoe])
>
> print "\n\nANOTHER WAY TO STDOUT:\n"
> import sys
> yaml.dumpToFile(sys.stdout, [mrBarson, mrDoe])
>
> print "\n\nDUMP MULTIPLE DOCS TO A FILE:\n"
> file = open('DEMO_OUTPUT.TXT', 'w')
> yaml.dumpToFile(file,
>     {'source': "Demo output from demo.py"},
>     [
>         'apple',
>         'banana',
>     ],
>     'Third document'
> )
> file.close()
>
>
> The Ruby implementation
>
> For Ruby developers, YAML is a natural fit for object
> serialization and general data storage, as their semantics
> are similar. YAML4R is a fully-featured YAML parser and
> emitter for Ruby. Use it as a drop-in replacement for
> PStore, or use one of its several APIs to store object data
> in the friendly and readable YAML style. The implementation
> is done by "why the lucky stiff".
>
> YAML4R requires a current version of Racc, which in
> turn requires Ruby (>=1.4) and a C compiler. To enable
> Unicode support, you must have a current version of the
> Iconv module for Ruby.
>
> Ruby encourages objects to have their own exporting methods.
> Hence, YAML.rb adds #to_yaml methods for built-in types:
> The NilClass, FalseClass, TrueClass, Symbol, Range,
> Numeric, Date, Time, Regexp, String, Array, and Hash all
> get their implementation of the #to_yaml method. And using
> it is just a breeze:
>
> require 'yaml'
> h = { 'test' => 12, 'another' => 13 }
> puts h.to_yaml
>
> Although you'll often want to store multiple YAML documents
> in a single file, YAML.rb has a simplified mechanism for
> loading and storing a single document in a single file.
>
> require 'yaml'
> obj = YAML::load(File::open("/tmp/yaml.store.1"))
>
> It does not matter where the data originates, as the Parser
> also accepts String objects through the same function. So
> you can even do:
>
> require 'yaml'
> obj = YAML::load( <<EOY
> --- #YAML:1.0
> - armless
> - falling
> - birds
> EOY
> )
> p obj
> #=> [ 'armless', 'falling', 'birds' ]
> --
>
> Parsing multiple documents from a YAML stream
>
> When reading YAML from a socket or a pipe, you should
> consider using the event-based parser, which will parse
> documents one at a time.
>
> require 'yaml'
> log = File.open( "/var/log/apache.yaml" )
> yp = YAML::Parser.new.parse_documents( log ) { |doc|
>   puts "#{doc['at']} #{doc['type']} #{doc['url']}"
> }
>
> Right now there are at least three active Ruby-YAML
> projects: Yod (the equivalent-to-be of pod), the Endertromb
> web server (to be released soonish) and ONI [ONI], which
> uses YAML to serialize objects to pass them between Ruby
> apps.
>
> "why" also mentioned that a good YAML application would be
> a messaging protocol similar to Jabber. The handy thing
> about implementing a protocol in Jabber is that you can pass
> object instances around simply. So an invite message might
> be:
>
> --- !jabber.org,2003/^message
> to: yam...@ja...
> from: wh...@wh...
> action: invite:yo...@yo...
> body: >
>   Hey, this is just an example,
>   but at least it illustrates the
>   point.
>
> Not only is it completely readable, but it will load as an
> object in each of the implementations. In Ruby, to handle
> this message you could just load it:
>
> msg = YAML::load(message_str)
> puts msg.to
> #=> 'yam...@ja...'
>
> In Python, there are similar ways to use handlers to yield
> a class:
>
> class MessageLoader:
>     def resolveType(self, value, url):
>         if url == "!jabber.org,2003/message":
>             return Message(value)
> msg = yaml.load(message_str, MessageLoader()).next()
> print msg.to
>
> The point is: the YAML team is working toward YAML as a
> central means of data-sharing. Not only would the protocol
> be completely portable, but the interfaces to use the
> protocol have no barrier to entry. Once you have learned
> how to load and emit, then you're strictly dealing with
> objects which are native to the language you are using.
>
>
> The Java implementation
>
> The Java parser is still in an early stage; it is not in any
> usable state, but Rolf Veen is working on a new release.
> He will soon inform the world of a new version.
>
>
> Conclusion
>
> YAML is very young and moving ahead virtually every day. We
> recommend that you join the mailing list [MailingList] and ask
> your questions to the yaml-core team directly. It is a
> young team of enthusiasts that want to create something
> different than XML.
>
> While XML is produced by big committees that pretend to be
> think-tanks, YAML can be adapted to new ideas at break-neck
> speed within hours. In open source projects it is great fun
> to watch the dynamics of new good ideas.
>
> YAML might achieve a good degree of acceptance and make it
> into the business world, as was the case with PHP, a tool
> that at first just made sense to developers. And when we
> show YAML to developers, they can see it sparkles.

One thing you omitted, which is very cool, is how easily you can embed
YAML (or XML for that matter) content in other YAML documents, without any
escaping whatsoever. Try that in XML. You may want to at least mention
that in the section that talks about the YAML TestSuite, which is itself
packaged in a simple subset of YAML.

> Acknowledgements
>
> Many thanks go to the YAML team. Material from Rolf Veen,
> Steve Howell, why, Clark Evans and many others was used for
> or included in this article.

My apologies for not getting to you sooner.
If there's still time, I'd be glad to talk about YAML.pm

Cheers, Brian

>
>
> Links
> =====
> - [yaml] http://www.yaml.org/
> - [Inline] http://inline.perl.org/inline/home.html
> - [wiki] http://wiki.yaml.org/yamlwiki/
> - [spec] http://www.yaml.org/spec/
> - [java] http://helide.com/g/yaml/
> - [Reference] http://www.yaml.org/refcard.html
> - [Cookbook] http://yaml4r.sf.net/cookbook/
> - [MailingList] http://lists.sourceforge.net/lists/listinfo/yaml-core
> - [SlideShow ] http://mountainwebtools.com/SlideShowell/rSlide-0.html
> - [XMLYAML] http://wiki.yaml.org/yamlwiki/XmlYaml
> - [ONI] http://www.ruby-lang.org/en/raa-list.rhtml?name=TomsLib