From: Brian I. <in...@tt...> - 2002-08-17 22:06:18
|
Where to start? YAML.pm has a dead simple interface for dumping memory as a YAML serialization and for doing the reverse. It's like this:

    use YAML;
    $yaml_string = Dump(@object_refs);
    @object_refs = Load($yaml_string);

Even though this is simple, I think there's a lot to say about it.

First is the names 'dump' and 'load'. The main reason I chose these is because the spec talks about the concepts of a Dumper and a Loader. I actually used to use Store() instead of Dump(), but Store() is now deprecated. I also chose Dump() because it is in line with the other Perl serializing modules:

    Module         Serialize   Deserialize
    ----------------------------------------
    Data::Dumper   Dumper      eval
    Data::Dump     dump        eval
    Data::Denter   Indent      Undent
    Data::DumpXML  dump_xml    $p->parsefile
    FreezeThaw     freeze      thaw
    Storable       freeze      thaw
    YAML           Dump        Load

I would be happy if we could at least all agree to support 'dump' and 'load' in our interfaces. I think it will give new users a good point of reference. The actual specifics of the calls will probably be different from language to language. I'll describe how mine work in a moment.

I export Load() and Dump() by default. This is typical in Perl because it allows the smallest syntax for the simplest case. For instance, a one-liner to dump the symbol table can look like this:

    perl -MYAML -e 'print Dump *::'

In some languages it might be considered rude to export symbols by default. In Perl, it's ok as long as you don't go overboard. A cautious user can defeat the exporting with:

    use YAML();

which specifies an empty export list.

I also use title case (Dump instead of dump) for the exported things. This is simply a matter of style. I save lowercase for OO calls like:

    YAML->new->dump(@objects);

Now I'll talk a bit about how the calls work. In Perl, you have list and scalar context. Dump() returns a single YAML string in either case.
Load() returns all objects in the YAML stream in list context, and the last object in the stream in scalar context. I'm actually not satisfied with Load() in scalar context. I think I might change it to return the next object in the stream. Or maybe an iterator. But that's talk for another day...

These calls only deal with YAML as a string, not as a file or a filehandle. I'd like to start talks about a clean way of doing that. Currently I support:

    LoadFile(filename);
    DumpFile(filename, @objects);

Load and Dump also do operations in one shot. There is not yet an iterative interface for adding to or parsing from a stream. In other words, they are atomic at the stream level. I think that's actually best for the basic interface. But I'd like to add loading and dumping at the document level.

One point: it is important to remember that YAML documents in a stream cannot know anything about each other. They have separate tab policies, anchor/aliasing, etc. Be sure to reset your configurations between documents.

That's all I have for now.

Cheers, Brian |
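[Editor's note: Brian's symmetric Dump/Load pair can be made concrete with a deliberately tiny sketch in modern Python. The function bodies below are mine, not YAML.pm's: they handle only flat string-to-string mappings, but they show the contract he describes -- Dump takes any number of objects and returns one stream string, Load gives all the documents back.]

```python
def Dump(*objects):
    """Serialize each flat mapping as its own '---'-delimited document.

    Toy emitter: only handles flat string-to-string mappings.
    """
    out = []
    for obj in objects:
        out.append('---')
        for key, value in obj.items():
            out.append('%s: %s' % (key, value))
    return '\n'.join(out) + '\n'

def Load(stream_text):
    """Inverse of Dump: return a list with one mapping per document."""
    docs = []
    for line in stream_text.splitlines():
        if line == '---':
            docs.append({})          # each '---' starts a fresh document
        elif line:
            key, _, value = line.partition(': ')
            docs[-1][key] = value
    return docs

objects = [{'name': 'brian'}, {'name': 'why'}]
assert Load(Dump(*objects)) == objects
```

The round trip `Load(Dump(*objects)) == objects` is the symmetry property; a real implementation layers quoting, nesting, and aliasing on top of the same shape.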
From: why t. l. s. <yam...@wh...> - 2002-08-18 07:04:22
|
Show and tell, then? Or truth or dare?

Brian Ingerson (in...@tt...) wrote:
> use YAML;
> $yaml_string = Dump(@object_refs);
> @object_refs = Load($yaml_string);

YAML.rb has load() as well. I could add a dump() method quite easily, as it will just run the to_yaml method of the object passed to it. Here's your code in Perl's much prettier younger sister ;) ..

    require 'yaml'
    yaml_string = object.to_yaml
    object = YAML::load( yaml_string )

> I would be happy if we could at least all agree to support 'dump' and
> 'load' in our interfaces. I think it will give new users a good point of
> reference. The actual specifics of the calls will probably be different
> from language to language. I'll describe how mine work in a moment.

I can appreciate the background on 'dump' and 'load' and I think it's a good convention. I see good reason for some identical semantics between implementations.

> I'm actually not satisfied with Load() in scalar context. I think I might
> change it to return the next object in the stream. Or maybe an iterator. But
> that's talk for another day...

YAML.rb has three methods for loading YAML, all of which I am quite fond of. The first is the simple YAML::load illustrated above. The second is a YAML::load_document call, which loads an entire stream at once into an object. This is useful if you want to load the entire stream, alter it, and spit it back out, keeping many of the conventions found in the original YAML file:

    require 'yaml'
    ydoc = YAML::load_document( File.open( 'EMPLOYEES.yml' ) )
    ydoc.add( { 'name' => 'Why', 'salary' => 1.0/00 } )
    File.open( 'EMPLOYEES.yml', 'w' ).write( ydoc.emit )

Perhaps 'emit' could be exchanged for 'dump' if it's decided. I'm not totally in love with the load_document method, but it's convenient.

The last method, the iterator, is great. The only problem is that it handles document separators and pauses identically. But it will read from a stream (TCPSocket, File, etc.) and run the Proc for each document.
    require 'yaml'
    YAML::Parser.new.parse_documents( File.open( 'EMPLOYEES.yml' ) ) { |ydoc|
        puts "Employee found: " + ydoc['name']
    }

I'd like to do more with the above and I'm sure I will when I get more real usage from it. All of my loading methods can take a String or any other IO object.

> One point; it is important to remember that YAML documents in a stream
> cannot know anything about each other. They have separate tab policies,
> anchor/aliasing, etc. Be sure to reset your configurations between
> documents.

Good point. I think YAML.rb is on about that, but I need to double check.

Another thing: much of what I've done in YAML.rb is based on the spec's description of the native model, generic model, and serial model. I've noticed that information takes up much of the spec, too. Would that be more useful in an implementor's document? It's very useful, but seems geared toward implementors rather than users.

_why |
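[Editor's note: the block-based parse_documents iterator _why shows has a close analogue in a Python callback. This is a toy sketch in modern Python -- the 'key: value' parsing stands in for a real YAML parser -- not YAML.rb's actual implementation.]

```python
import io

def parse_documents(fileobj, callback):
    """Run `callback(doc)` once per document in a '---'-separated stream.

    Each document here is a flat 'key: value' mapping -- a toy stand-in
    for a real YAML parser, mirroring YAML.rb's block-based iterator.
    Lines are consumed lazily, so only one document is buffered at a time.
    """
    def load_one(lines):
        doc = {}
        for line in lines:
            key, _, value = line.partition(':')
            doc[key.strip()] = value.strip()
        return doc

    current = []
    for line in fileobj:
        line = line.rstrip('\n')
        if line == '---':
            if current:
                callback(load_one(current))
            current = []
        else:
            current.append(line)
    if current:                   # final document has no trailing '---'
        callback(load_one(current))

stream = io.StringIO("---\nname: why\n---\nname: brian\n")
parse_documents(stream, lambda ydoc: print("Employee found:", ydoc['name']))
```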
From: Steve H. <sh...@zi...> - 2002-08-18 18:05:08
|
From: "Brian Ingerson" <in...@tt...>
> [...]
>
> I would be happy if we could at least all agree to support 'dump' and
> 'load' in our interfaces. I think it will give new users a good point of
> reference. The actual specifics of the calls will probably be different
> from language to language. I'll describe how mine work in a moment.
> [...]

Well, I've now used YAML in three different languages--Perl, Python, and Ruby. There is enough consistency between the implementations that you feel like you're on familiar ground, but you also get the sense that yaml.pm is Perlish, that yaml.py is Pythonic, and that yaml.rb is too damn clever and well thought out to ever catch on in the mainstream.

Here is how the Python interface works for loading:

    import yaml
    readme = yaml.loadFile("README")

The variable "readme" is a Python iterator, which returns a data structure for each YAML document in the README file. An iterator basically is an object that supports a next() method:

    first_doc_in_readme = readme.next()
    second_doc_in_readme = readme.next()

Like all Python iterators, readme throws a StopIteration exception when there are no more documents.

If Python iterators make you feel queasy, there is no reason to fear. Iterators actually make your life easier:

1) You can turn an iterator into a list:

    readme = yaml.loadFile("README")
    readme_docs = list(readme)  # calls iterator repeatedly to build a list

2) Not surprisingly, you can iterate with an iterator:

    readme = yaml.loadFile("README")
    for doc in readme:
        print doc

Of course, you don't have to load from a file. Quoting directly from demo.py in the PyYaml distribution, you can use the load method as follows:

    testData = \
    """
    program: PyYaml
    author: Steve Howell
    ---
    shopping list:
        - apple
        - banana
    todo:
        - eat more fruit:
            - especially bananas!
            - good for you
        - write a better demo
    """

    print "YAML INSIDE YOUR PROGRAM"
    for x in yaml.load(testData):
        print repr(x)
    print "\n\n"

Of course, there are times when your YAML only includes one document.

    testdata = \
    """
    name: steve
    language: python
    """
    info = yaml.load(testdata).next()

I suppose having to use the next() method here is a bit clunky, but right now I leave it up to the user to wrap this.

Both yaml.load and yaml.loadFile take an optional extra argument for resolving classes. See my 8/16/2002 post "new PyYaml--better support for serializing objects" for more detail on that.

Then, there's dumping. Right now there's only one method in the interface:

    print yaml.dump({'foo': [1,2,3]})

Pass in the data structure, get back a string. Soon there will be a richer interface for dumping, including a dumpFile method, and optional arguments, but the simple "dump" is all you have for now.

Cheers, Steve |
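[Editor's note: for readers unfamiliar with the iterator protocol Steve is describing, here is a minimal self-contained class in modern Python, where the next() method is spelled __next__. The class name and the toy '---' splitting are mine; only the protocol -- next() per document, StopIteration at the end, compatibility with list() and for-loops -- matches what yaml.loadFile returns.]

```python
class DocIterator:
    """Minimal iterator over '---'-separated documents in a string.

    Demonstrates the protocol Steve describes: each call to next()
    returns one document, StopIteration signals the end, and because
    it is a real Python iterator it also works with list() and for.
    """
    def __init__(self, text):
        # toy splitting: a real loader would parse lazily
        self._docs = [d.strip() for d in text.split('---') if d.strip()]
        self._pos = 0

    def __iter__(self):
        return self

    def __next__(self):           # spelled plain next() in 2002 Python
        if self._pos >= len(self._docs):
            raise StopIteration
        doc = self._docs[self._pos]
        self._pos += 1
        return doc

readme = DocIterator("---\ndoc one\n---\ndoc two\n")
print(list(readme))
```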
From: Brian I. <in...@tt...> - 2002-08-18 22:25:20
|
On 18/08/02 14:04 -0400, Steve Howell wrote:
> From: "Brian Ingerson" <in...@tt...>
> > [...]
> > I would be happy if we could at least all agree to support 'dump' and
> > 'load' in our interfaces. [...]
>
> Well, I've now used YAML in three different languages--Perl, Python, and Ruby.
> There is enough consistency between the implementations that you feel like
> you're on familiar ground, but you also get the sense that yaml.pm is Perlish,
> that yaml.py is Pythonic, and that yaml.rb is too damn clever and well thought
> out to ever catch on in the mainstream.
>
> Here is how the Python interface works for loading:

It's interesting how you two both have fancy loaders and simple dumpers. There's something asymmetrical about that. I like:

    stream = dump(list_of_objects)
    list_of_objects = load(stream)

Nice symmetry. I guess the Perl language makes that more natural since it supports lists. I can say for sure that Perl helped make the decision that YAML documents would be completely independent of each other.

I guess I'd want to have that same symmetry on an iterative interface:

    stream = yaml->new
    document = stream->load_next
    stream->dump_next(document)

> [... Python loading examples snipped ...]
>
> Then, there's dumping. Right now there's only one method in the interface:
>
>     print yaml.dump({'foo': [1,2,3]})

How do you dump multiple documents? Does dump produce a YAML separator by default?

> Pass in the data structure, get back a string. Soon there will be a richer
> interface for dumping, including a dumpFile method, and optional arguments, but
> the simple "dump" is all you have for now.

Gotta go have fun. More later.

Cheers, Brian |
From: Steve H. <sh...@zi...> - 2002-08-18 23:35:49
|
Brian Ingerson wrote:
> [...]
> It's interesting how you two both have fancy loaders and simple dumpers.

Well, I think Why and I have both spent more time on the loader than the dumper, for whatever reason. As we turn our attention to the dumpers, I am sure we will find ways to cruft up the dumper interface. ;)

There is some inherent asymmetry between the loader and dumper interfaces, though. With a loader, you really need the yaml library to manage the stream for you, because it needs to detect headers. With a dumper, though, the application layer can just as easily append to a file as the yaml layer can.

> How do you dump multiple documents? Does dump produce a YAML separator by
> default?

Yes, dump produces a separator by default, and you can keep appending new documents to a file. |
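[Editor's note: Steve's answer -- dump emits its own '---', so the application just appends documents -- can be sketched like this in modern Python. dump_doc is a hypothetical name, and the emitter handles only flat mappings.]

```python
import io

def dump_doc(fileobj, data):
    """Write one flat mapping as a document, leading '---' included."""
    fileobj.write('---\n')
    for key, value in data.items():
        fileobj.write('%s: %s\n' % (key, value))

f = io.StringIO()
dump_doc(f, {'name': 'steve'})
dump_doc(f, {'name': 'why'})   # appending is all it takes to grow the stream
print(f.getvalue())
```

Because every document carries its own separator, there is no stream-level state to manage: the application can intersperse its own comments, write to any file handle, or stop and resume at will.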
From: why t. l. s. <yam...@wh...> - 2002-08-19 05:53:22
|
Brian Ingerson (in...@tt...) wrote:
> It's interesting how you two both have fancy loaders and simple dumpers.
> There's something asymmetrical about that. I like:
>
>     stream = dump(list_of_objects)
>     list_of_objects = load(stream)
>
> Nice symmetry. I guess the Perl language makes that more natural since it
> supports lists. I can say for sure that Perl helped make the decision that
> YAML documents would be completely independent of each other.
>
> I guess I'd want to have that same symmetry on an iterative interface.

My feelings exactly. I would like to work toward symmetry with each of the three techniques for Ruby:

1. Simple single-document, single-object dump'n'load:

    stream = YAML::dump( obj )
    obj = YAML::load( stream )

2. Multi-document dump'n'load:

    # where 'ydoc' is a YAML::Document object
    stream = ydoc.dump
    ydoc = YAML::load_document( stream )

3. Iterative is currently read-only, but I'd like to make it so you could iterate through each object and provide a means for providing an output stream, so that any object returned from the block would be written to the output. In other words, to halve all the salaries in the employee list:

    emplist = YAML::Stream.new( File.open( 'EMPLOYEES.yml' ) )
    emplist2 = YAML::Stream.new( File.open( 'EMPLOYEES.yml~', 'w' ) )
    emplist.each_output_to( emplist2 ) { |e|
        e['salary'] /= 2
    }

I have some problems with my Document class, as it's actually a class that contains many YAML documents. The problem is that I think most people think of a Stream as some form of IO (thanks to streaming audio or the like) and I don't want to confuse users. Any ideas?

_why |
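[Editor's note: _why's each_output_to filter has a natural shape in Python too: read documents from one stream, transform each, and write it to another, with only one document in memory at a time. Everything below is a toy sketch -- flat 'key: value' documents, integer salaries -- not YAML.rb's API.]

```python
import io

def filter_stream(infile, outfile, transform):
    """Read '---'-separated docs, apply transform, write them back out.

    A toy Python analogue of YAML.rb's proposed each_output_to. The
    parsing handles only flat 'key: value' lines, coercing values to
    int where possible.
    """
    def load_one(lines):
        doc = {}
        for line in lines:
            key, _, value = line.partition(': ')
            try:
                doc[key] = int(value)
            except ValueError:
                doc[key] = value
        return doc

    def dump_one(doc):
        outfile.write('---\n')
        for key, value in doc.items():
            outfile.write('%s: %s\n' % (key, value))

    current = []
    for line in infile:
        line = line.rstrip('\n')
        if line == '---':
            if current:
                dump_one(transform(load_one(current)))
            current = []
        else:
            current.append(line)
    if current:
        dump_one(transform(load_one(current)))

def halve(emp):
    emp['salary'] //= 2      # integer halving, for the toy example
    return emp

src = io.StringIO("---\nname: why\nsalary: 100\n")
dst = io.StringIO()
filter_stream(src, dst, halve)
print(dst.getvalue())
```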
From: Steve H. <sh...@zi...> - 2002-08-19 13:55:10
|
From: "why the lucky stiff" <yam...@wh...>
> Brian Ingerson (in...@tt...) wrote:
> > [...]
> > I guess I'd want to have that same symmetry on an iterative interface.
>
> My feelings exactly. I would like to work toward symmetry with each of
> the three techniques for Ruby:
>
> 1. Simple single-document, single-object dump'n'load:
>
>     stream = YAML::dump( obj )
>     obj = YAML::load( stream )

I like making the simple case simple, so this makes sense. Instead of saying "stream" in your example, though, I would say "str," as in string data.

    str = yaml.dump(obj)
    obj = yaml.load(str)

Here would be the proposed interface in more context:

    print yaml.dump({'name': 'steve'})

    data = \
    """
    name: steve
    phone: 555
    """
    print yaml.load(data)['phone']  # should print 555

> 2. Multi-document dump'n'load:
>
>     # where 'ydoc' is a YAML::Document object
>     stream = ydoc.dump
>     ydoc = YAML::load_document( stream )

LOADING

Let's talk about the loading case first. If you have a file with multiple YAML documents, then you want yaml to return some kind of an object that will let you get one document at a time. (In Python this would happen to be an iterator, but it really just needs to be an object that supports a next() method.) I think you are basically getting a loader back.

    str = \
    """
    ---
    name: why
    port: ruby
    ---
    name: brian
    port: perl
    """
    loader = yaml.loader(str)
    the_stiff = loader.next()
    ingy = loader.next()

    # or...
    loader = yaml.loader(str)
    yaml_porters = list(yaml.loader(str))

    # or...
    for yaml_porter in yaml.loader(str):
        pass

    # or...
    loader = yaml.fileLoader("yaml_porters.txt")
    for yaml_porter in loader:
        print yaml_porter['name']

Of course, load() and loadFile() are just convenience methods, so you might implement them like this in your yaml implementation:

    def load(str):
        return loader(str).next()

    def loadFile(str):
        return fileLoader(str).next()

DUMPING

As for dumping, I stand by my original comment, which is that yaml doesn't need to manage the writing of multiple documents. It's better handled at the application layer. Start with a simple method:

    str = yaml.dump(['apple', 'banana', 'carrot', 'dog'])
    assertEquals(str, """
    ---
    - apple
    - banana
    - carrot
    - dog
    """)

Then, let the application do its own I/O. For example, this little Python program writes 3 yaml docs to stdout, interspersing its own comments:

    import yaml

    porters = [
        ('steve', {'name': 'showell', 'lang': 'python'}),
        ('rolf',  {'name': 'Veen',    'lang': 'java'}),
        ('clark', {'name': 'cce',     'lang': 'python'})
    ]

    for (name, data) in porters:
        print "# YAML output for %s" % name
        print yaml.dump(data)

Here is the code adapted to use a file for output:

    f = open('porters.txt', 'w')
    for (name, data) in porters:
        f.write("# YAML output for %s\n" % name)
        f.write(yaml.dump(data))
    f.close()

Keeping the dump() interface simple gives the application developer ultimate flexibility.

> 3. Iterative is currently read-only but I'd like to make it so
> you could iterate through each object and provide a means
> for providing an output stream so that any object returned
> from the block would be written to the output. In other
> words, to halve all the salaries in the employee list:
>
>     emplist = YAML::Stream.new( File.open( 'EMPLOYEES.yml' ) )
>     emplist2 = YAML::Stream.new( File.open( 'EMPLOYEES.yml~' ) )
>     emplist.each_output_to( emplist2 ) { |e|
>         e['salary'] /= 2
>     }

I think your example makes sense in Ruby, but I would do it like this in Python:

    f = open('pissed_employees.yml', 'w')
    f.write('# All these employees got their salaries cut in half.\n')
    for emp in yaml.fileLoader('employees.yml'):
        emp['salary'] /= 2
        f.write(yaml.dump(emp))
    f.close()

Cheers, Steve |
From: why t. l. s. <yam...@wh...> - 2002-08-19 15:39:50
|
Steve Howell (sh...@zi...) wrote:
>
> Let's talk about the loading case first. If you have a file with multiple YAML
> documents, then you want yaml to return some kind of an object that will let you
> get one document at a time. (In Python this would happen to be an iterator, but
> it really just needs to be an object that supports a next() method.) I think
> you are basically getting a loader back.
>
>     str = \
>     """
>     ---
>     name: why
>     port: ruby
>     ---
>     name: brian
>     port: perl
>     """
>     loader = yaml.loader(str)
>     the_stiff = loader.next()
>     ingy = loader.next()

Tell me how this would work in Python. Is the stream loaded entirely into memory? Or does it read a single document before polling for more?

> DUMPING
>
> As for dumping, I stand by my original comment, which is that yaml doesn't need to
> manage the writing of multiple documents. It's better handled at the
> application layer. Start with a simple method:

Don't you think that the interface you described above needs a dumper?

    str = \
    """
    ---
    name: why
    port: ruby
    ---
    name: brian
    port: perl
    """
    loader = yaml.loader(str)
    loader.add( { 'name': 'steve', 'port': 'python' } )
    print loader.dump()

My reasoning for the above idiom is that the Loader class (or whatever class you use to store many documents) can store configuration information about the stream. The indentation level can be saved, the version number, the type of foldings used. Since YAML is meant to be read and edited by hand, perhaps allowing this information to be reused in the dump could help improve the readability and friendliness of generated documents.

I personally find it convenient, but I don't know if it's just a personal ideal rather than a common need that we should have in all implementations. Probably. :P

> I think your example makes sense in Ruby, but I would do it like this in Python:
>
>     f = open('pissed_employees.yml', 'w')
>     f.write('# All these employees got their salaries cut in half.\n')
>     for emp in yaml.fileLoader('employees.yml'):
>         emp['salary'] /= 2
>         f.write(yaml.dump(emp))
>     f.close()

Cool. I totally see your point on having a single dumper. I have a single dumper. But I can see convenience in having some classes that provide a single interface to building a YAML stream, either by parsing or dumping.

Check ya later,

_why |
From: Steve H. <sh...@zi...> - 2002-08-19 16:11:02
|
From: "why the lucky stiff" <yam...@wh...>
> Steve Howell (sh...@zi...) wrote:
> >     loader = yaml.loader(str)
> >     the_stiff = loader.next()
> >     ingy = loader.next()
>
> Tell me how this would work in Python. Is the stream loaded entirely into
> memory? Or does it read a single document before polling for more?

Currently, the string is read entirely into memory up front, but only to split the lines. The actual parsing of each document does not occur until the calls to loader.next().

> Don't you think that the interface you described above needs a dumper?
>
>     loader = yaml.loader(str)
>     loader.add( { 'name': 'steve', 'port': 'python' } )
>     print loader.dump()
>
> My reasoning for the above idiom is that the Loader class (or whatever
> class you use to store many documents) can store configuration information
> about the stream. The indentation level can be saved, the version number,
> the type of foldings used. Since YAML is meant to be read and edited by
> hand, perhaps allowing this information to be reused in the dump could
> help improve the readability and friendliness of generated documents.
>
> I personally find it convenient, but I don't know if it's just a personal
> ideal rather than a common need that we should have in all implementations.
> Probably. :P

Well, yeah, I agree with everything you say here. I just haven't gotten there yet with my implementation. In particular, I'm way behind the Perl and Ruby implementations in terms of being able to customize the dump formatting.

> Cool. I totally see your point on having a single dumper. I have a single
> dumper. But I can see convenience in having some classes that provide a
> single interface to building a YAML stream, either by parsing or dumping.

Well, I am probably a little extreme in my philosophy, but I believe libraries should avoid "convenience" interfaces and focus on providing only the essential building blocks. If somebody does not have my yaml.dump(), then they have to write their own equivalent implementation of what has taken Clark and me 234 lines of Python so far. On the other hand, if somebody did not have access to my dumpFile(), then they would have to write their own 5 lines of Python, but they would get complete control over the interface. I've already suggested one variation on dumpFile()--interspersing comments between documents--and I am sure there are other variations that I haven't anticipated.

My history with using libraries is that I resist them for two reasons. Either they're too difficult to install and deploy, or they force me into a paradigm that I don't have control over. I am not saying that would be the case with YAML, but that's where my prejudices come from.

Cheers, Steve |
From: Neil W. <neilw@ActiveState.com> - 2002-08-19 17:49:27
|
Hi Steve,

I had a bad experience with YAML(.pm) this weekend, so I'm going to argue with you just 'cause I'm feeling nasty. :)

Steve Howell [19/08/02 12:10 -0400]:
> Well, I am probably a little extreme in my philosophy, but I believe
> libraries should avoid "convenience" interfaces and focus on providing
> only the essential building blocks. If somebody does not have my
> yaml.dump(), then they have to write their own equivalent
> implementation of what has taken Clark and me 234 lines of Python so
> far. On the other hand, if somebody did not have access to my
> dumpFile(), then they have to write their own 5 lines of Python, but
> they get complete control over the interface. I've already suggested
> one variation on dumpFile()--interspersing comments between
> documents--and I am sure there are other variations that I haven't
> anticipated.

This weekend I wrote a very interesting spam-checking backend for PerlMx (a sendmail milter engine, http://www.ActiveState.com/PerlMx). The current backend is based on SpamAssassin, and has hundreds of rules, basically regular expressions. Mine is based on an excellent article posted on slashdot last week: http://www.paulgraham.com/spam.html.

The basic idea is you start with a large corpus of "good" and "bad" email, and you search for the frequency of every word in both. This allows you to calculate the probability that any word is contained in a "bad" email. You can use this technique to predict whether an incoming email is "good" or "bad", and you can assign a probability to it.

I generated an enormous list of words, about 10MB, and I wanted to dump the data structure using YAML. Unfortunately, the current YAML.pm dumps to a string, and then writes to a file, which consumed all available memory (200MB) and then crashed. An iterative version would have finished merrily.

Conclusion? YAML libraries (in any language) need to offer much more than simple load/dump. At the very least, they must provide iterative or callback APIs, so I can dump the structure to a file a little bit at a time. I definitely don't want to consume 200MB to print out some debugging information.

My next optimization was to use word pairs, not single words, to guess whether an email was bad. This time my dataset was over 50MB, and I didn't even bother with YAML. In fact, I haven't yet found a *single* dumper of *any* kind on CPAN that can handle dumping this data structure[1]. People writing Perl modules are obviously leaning towards "laziness" as the only virtue.

So I wrote my own. It's not YAML, but it showed me what I needed to see, and I fixed some bugs. Now I'm happy... but not as happy as I'd have been if YAML had solved my problem for me :)

> My history with using libraries is that I resist them for two reasons.
> Either they're too difficult to install and deploy, or they force me
> into a paradigm that I don't have control over. I am not saying that
> would be the case with YAML, but that's where my prejudices come from.

Mmmm. You're writing the library, so you can expose both a simple API that won't scare off the do-it-yourself-ers, and the more complicated one needed for "power users" like me.

Later,
Neil

--
Footnotes:
[1] Actually Storable could handle it -- but it takes 180MB RAM and 22 minutes, and it's a binary format anyway. :( |
From: Brian I. <in...@tt...> - 2002-08-19 18:13:49
|
On 19/08/02 10:46 -0700, Neil Watkiss wrote:
> Hi Steve,
>
> I had a bad experience with YAML(.pm) this weekend, so I'm going to argue
> with you just 'cause I'm feeling nasty. :)
>
> [...]
>
> I generated an enormous list of words, about 10MB, and I wanted to dump the
> data structure using YAML. Unfortunately, the current YAML.pm dumps to a
> string, and then writes to a file, which consumed all available memory
> (200MB) and then crashed. An iterative version would have finished merrily.

I totally agree with you here. And now we have a good failing test! If I write an iterative interface, will you test it out for me? I'd love to get your feedback as well.

BTW, if you can share the dataset, I'm sure Steve and Why would like to use it as well. Heck, can you post the data somewhere?

> Conclusion? YAML libraries (in any language) need to offer much more than
> simple load/dump. At the very least, they must provide iterative or callback
> APIs, so I can dump the structure to a file a little bit at a time. I
> definitely don't want to consume 200MB to print out some debugging
> information.

Obviously, the load/dump is most important to start with. But as Neil points out, it only works with small datasets.

I think the next most important API is load_next/dump_next, where each document needs to be completely in memory, but not the entire stream. This is a little bit of work for a rather large benefit.

Much later, we can look at truly streaming APIs. These get tricky due to aliasing. You almost *need* to have an entire document in memory to know where the aliases are. A truly streaming application filter could just use the parser and emitter and never load the document at all. But it would need to keep track of aliases itself, and preserve them on output.

> My next optimization was to use word pairs, not single words, to guess
> whether an email was bad. This time my dataset was over 50MB, and I didn't
> even bother with YAML. In fact, I haven't yet found a *single* dumper of
> *any* kind on CPAN that can handle dumping this data structure[1].
>
> So I wrote my own. It's not YAML, but it showed me what I needed to see, and
> I fixed some bugs. Now I'm happy... but not as happy as I'd have been if YAML
> had solved my problem for me :)

Let's fix that!

Cheers, Brian |
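[Editor's note: Brian's load_next/dump_next idea -- whole document in memory, never the whole stream -- might look like the following sketch in modern Python. The method names come from his pseudocode; the class name and toy flat-mapping format are mine.]

```python
import io

class YamlStream:
    """Document-at-a-time access to a stream, per Brian's sketch.

    dump_next writes one document and forgets it, so memory use is
    bounded by the largest single document, not the whole stream.
    load_next returns the next document's raw text, or None at EOF.
    (Toy code: documents are flat 'key: value' text, not real YAML.)
    """
    def __init__(self, fileobj):
        self.fileobj = fileobj

    def dump_next(self, doc):
        self.fileobj.write('---\n')
        for key, value in doc.items():
            self.fileobj.write('%s: %s\n' % (key, value))

    def load_next(self):
        lines = []
        for line in self.fileobj:
            line = line.rstrip('\n')
            if line == '---' and lines:
                break             # next document's separator reached
            if line != '---':
                lines.append(line)
        return '\n'.join(lines) if lines else None

f = io.StringIO()
stream = YamlStream(f)
for i in range(3):
    stream.dump_next({'doc': i})   # one document at a time, then forgotten
print(f.getvalue())
```

As Brian notes, a truly streaming API would also have to track anchors and aliases across events; this per-document version sidesteps that, since each document is complete when it is handled.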
From: Neil W. <neilw@ActiveState.com> - 2002-08-19 19:12:59
|
Brian Ingerson [19/08/02 11:13 -0700]: > > I generated an enormous list of words, about 10MB, and I wanted to > > dump the data structure using YAML. Unfortunately, the current > > YAML.pm dumps to a string, and then writes to a file, which consumed > > all available memory (200MB) and then crashed. An iterative version > > would have finished merrily. > > I totally agree with you here. And now we have a good failing test! If I > write an iterative interface, will you test it out for me? I'd love to get > your feedback as well. Okay. Deal. > BTW, If you can share the dataset, I'm sure Steve and Why would like to use > it as well. Heck, can you post the data somewhere? The data currently is a *large* Storable file, which isn't any good to anyone, unfortunately. The script which generates it is also tightly integrated with PerlMx's support libraries, which aren't public. The format I invented for myself is very simple, and somewhat yamlish: you,see: good=1 bad=0 prob=0.01 Here's what the structure expands to in memory: $stats = { good => { 'you,see' => 15, }, bad => { 'you,see' => 145, }, prob => { 'you,see' => 0.8267, }, } This means if you see the words "you" followed by "see", the email has an 82% likelihood of being spam (based on my collection of spam). Of course, you have to weigh all the other word pairs too. I call "word pairs" "chains", since they're basically Markov Chains. [ During the training stage, I can specify the size of the chains to use. Size 1 means to simply count the frequency of every word and use it. Size 2 (shown here) does the frequency of word pairs. Increasing the size means your data set gets much bigger :) ] The size 2 results I currently have result in 407325 chains, which means each hash (good, bad, and prob) has 407325 entries in it. The file containing lots of rules is here [16MB]. 
http://ttul.org/~nwatkiss/misc/rules This might just work (not tested): my %stats; while (my $line = <>) { my ($chain, @rest) = split /[:\s]+/, $line; for my $part (@rest) { my ($key, $value) = split /=/, $part; $stats{$key}{$chain} = $value; } } Have fun! Neil |
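A rough Python translation of the same parsing idea, shown against a single sample line (like the Perl above, this is a sketch that has not been run against the real rules file):

```python
# Parse lines of the form "chain: key1=v1 key2=v2 ..." into
# stats[key][chain] = value, mirroring the Perl sketch above.
stats = {}
sample = ["you,see: good=15 bad=145 prob=0.8267"]
for line in sample:
    chain, rest = line.split(":", 1)       # chain is the word pair
    for part in rest.split():              # each part looks like key=value
        key, value = part.split("=")
        stats.setdefault(key, {})[chain] = float(value)
# stats == {'good': {'you,see': 15.0}, 'bad': {'you,see': 145.0},
#           'prob': {'you,see': 0.8267}}
```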
From: Steve H. <sh...@zi...> - 2002-08-19 20:27:53
|
From: "Neil Watkiss" <neilw@ActiveState.com> > [...] > Conclusion? YAML libraries (in any language) need to offer much more than > simple load/dump. At the very least, they must provide iterative or callback > APIs, so I can dump the structure to a file a little bit at a time. I > definitely don't want to consume 200MB to print out some debugging > information. > Just to clarify, you had enough memory to hold your data structure, but not enough memory for YAML to hold its intermediate strings while it was dumping? Any benchmark on what the ratio of memory usage was there? Or, were you literally trying to emit YAML while building the data structure itself? If that's the case, then you have a tricky scenario indeed. FWIW the Python implementation isn't too far from being able to do what you want. Mostly at Clark's insistence (thank you, Clark), the Python library basically does a totally sequential emit, although it doesn't currently emit directly to a stream; it instead emits to an array of tokens that it concatenates at the end. This code would be easy to refactor. Ingy made a good point about aliases--they really do force you to process your entire data structure up front. If you have data that you know has no self-recursive loops, and if you don't care about compressing duplicate data structures, then a yaml library should allow you to suppress aliases, at your own risk, of course. This is on my todo list for PyYaml. Until I hear otherwise, I'm assuming that most Python users are not dealing with huge data sets, but are rather using it for small config files, debugging medium-sized data structures, etc. When I start getting power users, I will start optimizing more for performance. Actually, someone like Neil could probably make good use of the YamlDump class in the PyYaml distribution. The YamlDump class takes an argument for the indentation level, so you could hand-emit YAML for the top part of your tree, and then call into the library to emit the branches. 
A great feature of YAML is that YAML fits nicely inside other YAML, but you do need for your library to let you control the indent level if you're gonna hand-embed the sub-YAML. > > > My history with using libraries is that I resist them for two reasons. > > Either they're too difficult to install and deploy, or they force me > > into a paradigm that I don't have control over. I am not saying that > > would be the case with YAML, but that's where my prejudices come from. > > Mmmm. You're writing the library, so you can expose both a simple API that > won't scare off the do-it-yourself-ers, and the more complicated one needed > for "power users" like me. > Okay, point well taken. I am starting to reconsider. Cheers, Steve |
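A sketch of the hand-embedding idea: because indented YAML nests cleanly inside YAML, you can hand-write the top of the tree and delegate a branch to an emitter that accepts a starting indent. The `emit_branch()` helper here is a stand-in, not PyYaml's real YamlDump API:

```python
# Hand-emit the top of the document, let a library-style helper emit a
# branch at a chosen indent level, then resume hand-written output.

def emit_branch(mapping, indent):
    pad = " " * indent
    return "".join("%s%s: %s\n" % (pad, k, v) for k, v in mapping.items())

doc = "---\n"
doc += "config:\n"
doc += emit_branch({"host": "localhost", "port": 80}, indent=4)
doc += "notes: hand-written part\n"
# doc is now one well-formed YAML document
```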
From: Neil W. <neilw@ActiveState.com> - 2002-08-19 20:42:23
|
Steve Howell [19/08/02 16:27 -0400]: > Just to clarify, you had enough memory to hold your data structure, but not > enough memory for YAML to hold its intermediate strings while it was dumping? > Any benchmark on what the ratio of memory usage was there? Or, were you > literally trying to emit YAML while building the data structure itself? If > that's the case, then you have a tricky scenario indeed. The former. Loading the data structure into memory takes 65MB. I tried the following dumpers: Data::Denter Data::Dump Data::Dumper Data::DumpXML YAML Data::Dumper was the first, and it died with an "Out of memory!" error at 250MB vmsize. I killed off the rest when they got to 200MB with no output seen. One thing I can try is suppressing aliases, since I know there are none. Later, Neil |
From: Rolf V. <rol...@he...> - 2002-08-19 16:28:25
|
why the lucky stiff wrote: > My reasoning for the above idiom is that the Loader class (or whatever > class you use to store many documents) can store configuration information > about the stream. The indentation level can be saved, the version number, > the type of foldings used. Since YAML is meant to be read and edited by > hand, perhaps allowing this information to be reused in the dump could > help improve the readability and friendliness of generated documents. A question I ask myself also. Do we need or is it convenient to have a class that holds more information about the YAML stream than the pure data structure? At the moment, the connection between the stream and the loader in my case is the lexer or event generator and it does not supply this information. Not that it isn't possible to generate. But then you need a parallel structure with the additional information. Maybe in this case it is better to use a specific nested node structure that can hold not only the data but also other attributes. Not sure about this. Cheers. Rolf. |
From: Brian I. <in...@tt...> - 2002-08-19 18:27:42
|
On 19/08/02 18:23 +0200, Rolf Veen wrote: > why the lucky stiff wrote: > > > My reasoning for the above idiom is that the Loader class (or whatever > > class you use to store many documents) can store configuration information > > about the stream. The indentation level can be saved, the version number, > > the type of foldings used. Since YAML is meant to be read and edited by > > hand, perhaps allowing this information to be reused in the dump could > > help improve the readability and friendliness of generated documents. > > A question I ask myself also. Do we need or is it convenient to have > a class that holds more information about the YAML stream than the pure > data structure? At the moment, the connection between the stream and > the loader in my case is the lexer or event generator and it does not > supply this information. Not that it isn't possible to generate. But > then you need a parallel structure with the additional information. > Maybe in this case it is better to use a specific nested node structure > that can hold not only the data but also other attributes. Rolf, YAML.pm uses a technique (which Clark informed me is) called shadowing. Using this technique, you have a master index that maps each node in a graph to another special "shadow" node. The shadow can contain information like mapping key order, YAML type family, indentation level to use, etc. I leverage this technique for all kinds of benefit in YAML.pm: - Formatting: An application can give hints on how it wants a node to be dumped. The great benefit of shadowing is that it doesn't disturb the original in-memory graph. - Transformation: To dump an opaque object, its class must support a yaml_dump method that returns a shadow. - Round Tripping: In the next release of YAML a user will have the option to shadow all mappings at load time. Then when they dump the resulting graph, the original key order (and possibly other attributes) will be preserved. 
One thing I like about this technique is that I only need to use it when and where I like. If I have a huge graph and only want to shadow one node, the rest of the graph does not pay a performance penalty. So it's not 'all or nothing'. Cheers, Brian |
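The shadowing technique described above can be sketched in Python with a side index keyed on node identity (hypothetical helpers, not YAML.pm's implementation): the original graph is never disturbed, and only the nodes you actually shadow pay any cost.

```python
# A "shadow" index: id(node) -> dict of dump-time hints. The hints live
# beside the graph, never inside it.

shadows = {}

def shadow(node, **hints):
    """Attach dump-time hints (key order, type family, indent...) to a node."""
    shadows.setdefault(id(node), {}).update(hints)

def hints_for(node):
    return shadows.get(id(node), {})

config = {"b": 2, "a": 1}
graph = {"config": config, "big_blob": list(range(1000))}

shadow(config, key_order=["a", "b"])   # only this one node is shadowed

# hints_for(config) == {'key_order': ['a', 'b']}; hints_for(graph) == {}
```

Note the caveat with id()-based indexing: the shadowed nodes must stay alive for as long as the index is used, or a recycled id could alias a new object.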
From: Brian I. <in...@tt...> - 2002-08-19 17:40:34
|
On 19/08/02 09:54 -0400, Steve Howell wrote: > From: "why the lucky stiff" <yam...@wh...> > > Brian Ingerson (in...@tt...) wrote: > > > It's interesting how you two both have fancy loaders and simple dumpers. > > > There's something asymmetrical about that. I like: > > > > > > stream = dump(list_of_objects) > > > list_of_objects = load(stream) > > > > > > Nice symmetry. I guess the Perl language makes that more natural since it > > > supports lists. I can say for sure that Perl helped make the decision that > > > YAML documents would be completely independent of each other. > > > > > > I guess I'd want to have that same symmetry on an iterative interface. > > > > My feelings exactly. I would like to work toward symmetry with each of > > the three techniques for Ruby: > > > > 1. Simple single-document, single-object dump'n'load: > > > > stream = YAML::dump( obj ) > > obj = YAML::load( stream ) > > I like making the simple case simple, so this makes sense. Instead of saying > "stream" in your example, though, I would say "str," as in string data. > > str = yaml.dump(obj) > obj = yaml.load(str) Well, I was talking in terms of a YAML "stream", not a Unix stream. A YAML stream is zero or more YAML documents. Instead of "str", I would say "doc", since your interface seems to be handling a single document. yaml_document = YAML::dump(single_object) single_object = YAML::load(yaml_document) Perl can use this particular interface a bit more powerfully because it is list oriented: yaml_stream = YAML::dump(object_list) object_list = YAML::load(yaml_stream) Cheers, Brian |
From: Steve H. <sh...@zi...> - 2002-08-19 18:27:47
|
From: "Brian Ingerson" <in...@tt...> > > Perl can use this particular interface a bit more powerfully because it > is list oriented: > > yaml_stream = YAML::dump(object_list) > object_list = YAML::load(yaml_stream) > This was actually a source of confusion, not convenience, for me in using the yaml.pm interface, although I blame Perl culture, not yaml.pm, for it. How do you distinguish these cases when using YAML::dump? 1) You have a list of objects that should be written as separate documents. 2) You have an object that happens to be a list, but which should be written as a single document. |
From: Brian I. <in...@tt...> - 2002-08-19 18:38:41
|
On 19/08/02 14:27 -0400, Steve Howell wrote: > From: "Brian Ingerson" <in...@tt...> > > > > Perl can use this particular interface a bit more powerfully because it > > is list oriented: > > > > yaml_stream = YAML::dump(object_list) > > object_list = YAML::load(yaml_stream) > > > > This was actually a source of confusion, not convenience, for me in using the > yaml.pm interface, although I blame Perl culture, not yaml.pm, for it. > > How do you distinguish these cases when using YAML::dump? > > 1) You have a list of objects that should be written as separate documents. > 2) You have an object that happens to be a list, but which should be written as > a single document. That's the beauty of Perl lists: they're not a data structure, they're a series of zero or more disjoint objects. I leverage this in YAML, because in YAML (not coincidentally) a stream is a series of zero or more disjoint documents. This doesn't apply as well to Python, because to emulate Perl list behaviour, you need to use a data structure, namely an array (which to make matters ultimately confusing, Python calls a "list"). I'm not sure if Ruby has a Perl-list concept. Basically a list is really just a direct mapping of the call stack. Cheers, Brian |
From: Steve H. <sh...@zi...> - 2002-08-19 19:31:25
|
From: "Brian Ingerson" <in...@tt...> > On 19/08/02 14:27 -0400, Steve Howell wrote: > > From: "Brian Ingerson" <in...@tt...> > > > > > > Perl can use this particular interface a bit more powerfully because it > > > is list oriented: > > > > > > yaml_stream = YAML::dump(object_list) > > > object_list = YAML::load(yaml_stream) > > > > > > > This was actually a source of confusion, not convenience, for me in using the > > yaml.pm interface, although I blame Perl culture, not yaml.pm, for it. > > > > How do you distinguish these case when using YAML::dump? > > > > 1) You have a list of objects that should be written as separate documents. > > 2) You have an object that happens to be a list, but which should be written as > > a single document. > > That's the beauty of Perl lists: they're not a data structure, they're a > series of zero or more disjoint objects. I leverage this in YAML, > because in YAML (not coincidentally) a stream is a series of zero or more > disjoint documents. You didn't really answer my question, but I guess you're saying that the answer is #1. It's coming back to me now, though. If you have a Perl array and you want to dump it (use case #2), you pass in a reference to the array, not the array itself, which makes sense. How do you intend to add additional arguments to the dump method? Where would the list of thingies to dump end, and where would the formatting arguments start? It would be pretty ambiguous, and I think once you add more arguments to dump(), you're back to my original conundrum. > This doesn't apply as well to Python, because to emulate Perl list behaviour, > you need to use a data structure, namely an array, (which to make matters > ultimately confusing, Python calls a "list"). > My gosh, the only thing worse than YAML FUD is Python FUD. There is almost a direct correspondence between Perl lists and Python tuples, and likewise between Perl arrays and Python lists. 
Python's lists are actual objects, though, which is perhaps why Guido didn't call them arrays, although I agree it's confusing to call them lists. After you upgrade from Python 0.0003 to the latest version, try running this simple program, which gives a flavor for how tuples and lists are handled in Python: def hello(*people): for person in people: print "hello " + str(person) hello('ingy', 'steve') folks = ['neil', 'why', 'clark'] folks.sort() hello(folks) Cheers, Steve |
From: Brian I. <in...@tt...> - 2002-08-19 20:19:29
|
On 19/08/02 15:30 -0400, Steve Howell wrote: > From: "Brian Ingerson" <in...@tt...> > > On 19/08/02 14:27 -0400, Steve Howell wrote: > > > From: "Brian Ingerson" <in...@tt...> > > > > > > > > Perl can use this particular interface a bit more powerfully because it > > > > is list oriented: > > > > > > > > yaml_stream = YAML::dump(object_list) > > > > object_list = YAML::load(yaml_stream) > > > > > > > > > > This was actually a source of confusion, not convenience, for me in using > the > > > yaml.pm interface, although I blame Perl culture, not yaml.pm, for it. > > > > > > How do you distinguish these case when using YAML::dump? > > > > > > 1) You have a list of objects that should be written as separate documents. > > > 2) You have an object that happens to be a list, but which should be written > as > > > a single document. > > > > That's the beauty of Perl lists: they're not a data structure, they're a > > series of zero or more disjoint objects. I leverage this in YAML, > > because in YAML (not coincidentally) a stream is a series of zero or more > > disjoint documents. > > You didn't really answer my question, but I guess you're saying that the answer > is #1. It's coming back to me now, though. If you have a Perl array and you > want to dump it (use case #2), you pass in a reference to the array, not the > array itself, which makes sense. OK. The answer is that #2 doesn't apply to Perl. Lists aren't objects or anything else as far as YAML.pm is concerned. > > How do you intend to add additional arguments to the dump method? Where would > the list of thingies to dump end, and where would the formatting arguments > start? It would be pretty ambiguous, and I think once you add more arguments to > dump(), you're back to my original conundrum. > > > This doesn't apply as well to Python, because to emulate Perl list behaviour, > > you need to use a data structure, namely an array, (which to make matters > > ultimately confusing, Python calls a "list"). 
> > > > My gosh, the only thing worse than YAML FUD is Python FUD. There is > almost a direct correspondence between Perl lists and Python tuples, > and likewise between Perl arrays and Python lists. Python's lists are > actual objects, though, which is perhaps why Guido didn't call them > arrays, although I agree it's confusing to call them lists. No FUD intended. I didn't expect the knee jerk reaction. I wasn't bashing Python. Merely pointing out what I perceived to be the semantic differences between the two languages, and how I used lists in my API. I thought that tuples were immutable Python-lists. If Python people really think of them as Perl-lists, then you could use this API: tuple = load(stream) stream = dump(tuple) > After you upgrade from Python 0.0003 to the latest version, try running this > simple program, which gives a flavor for how tuples and lists are handled in > Python: You can drop the snideness. If this is going to turn into a flamewar, then let's forget the discussion. Sorry if I seemed to be on a Pro-Perl rant. I wasn't. I was just pointing out the list<=>stream<=>list API, that didn't seem to be showing up in other implementations. Cheers, Brian |
From: Steve H. <sh...@zi...> - 2002-08-19 21:48:10
|
From: "Brian Ingerson" <in...@tt...> > No FUD intended. I didn't expect the knee jerk reaction. I wasn't bashing > Python. Merely pointing out what I perceived to be the semantic differences > between the two languages, and how I used lists in my API. > Sorry if my defense of Python went a little overboard. I don't think there's anything wrong with pointing out differences between the languages to explain YAML concepts, but your characterizations of Python seemed a bit imprecise. > I thought that tuples were immutable Python-lists. If Python people > really think of them as Perl-lists, then you could use this API: > > tuple = load(stream) > stream = dump(tuple) > I think all of this discussion of list/tuple/array mutable/immutable semantics confuses the issue. Basically, I think it comes down to Perl having "wantarray" and Python not having it (to my knowledge). Python can't have a method change its behavior according to the lvalue of the caller (again, to my knowledge), because that just wouldn't be Pythonic. Perl can overload its dump() method to return either one of the documents, or all the documents, because it has wantarray. PyYaml solves the problem in a completely different way. It returns an iterator, and if you only want the first item, well, just call the iterator only once--it's only five extra characters. If you want to save those five characters, then you need to have a second method, because really, you're asking for a second behavior. Like Perl, Python doesn't make you pay for the unwanted items in the list. ~o/ You say laughter and I say lawfter, You say after and I say awfter; Laughter, lawfter, after, awfter - Let's call the whole thing off! / > > After you upgrade from Python 0.0003 to the latest version, try running this > > simple program, which gives a flavor for how tuples and lists are handled in > > Python: > Okay, that comment was DEFINITELY uncalled for on my part. The key thing is that we agree that YAML rocks! 
~o/ So, if you go for oysters and I go for ersters, I'll order oysters and cancel the ersters. For we know we Need each other, so we Better call the calling off off. Let's call the whole thing off! / Cheers, Steve |
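The iterator-returning load() described above can be sketched like this (toy "---" splitter, not PyYaml's real parser): callers who want every document iterate the whole thing; callers who want only the first call next() once.

```python
# A generator-based load: documents are produced lazily, so you never pay
# for the ones you don't consume.

def load_iter(stream):
    for chunk in stream.split("---"):
        chunk = chunk.strip()
        if chunk:
            yield chunk            # a real loader would build objects here

stream = "--- one\n--- two\n--- three\n"
all_docs = list(load_iter(stream))     # ['one', 'two', 'three']
first = next(load_iter(stream))        # 'one'
```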
From: Brian I. <in...@tt...> - 2002-08-20 04:22:04
|
On 19/08/02 15:30 -0400, Steve Howell wrote: > From: "Brian Ingerson" <in...@tt...> > > On 19/08/02 14:27 -0400, Steve Howell wrote: > > > From: "Brian Ingerson" <in...@tt...> > > > > > > > > Perl can use this particular interface a bit more powerfully because it > > > > is list oriented: > > > > > > > > yaml_stream = YAML::dump(object_list) > > > > object_list = YAML::load(yaml_stream) > > > > > > > > > > This was actually a source of confusion, not convenience, for me > > > in using the yaml.pm interface, although I blame Perl culture, not > > > yaml.pm, for it. > > > > > > How do you distinguish these cases when using YAML::dump? > > > > > > 1) You have a list of objects that should be written as separate > > > documents. > > > 2) You have an object that happens to be a list, but which should > > > be written as a single document. > > > > That's the beauty of Perl lists: they're not a data structure, > > they're a series of zero or more disjoint objects. I leverage this > > in YAML, because in YAML (not coincidentally) a stream is a series > > of zero or more disjoint documents. > > You didn't really answer my question, but I guess you're saying that > the answer is #1. It's coming back to me now, though. If you have a > Perl array and you want to dump it (use case #2), you pass in a > reference to the array, not the array itself, which makes sense. Right. YAML can't (and shouldn't) be able to do Perl arrays and hashes. It can only deal with (Perl)refs to those. This is due to the screwed up and convoluted semantics of Perl data structures. (See, I'm not defending Perl at all here.) Perl sucks in this sense because it listifies its data structures when they get put on the call stack. BUT... If one were to (and I've always done this) call hashrefs "hashes" and arrayrefs "arrays", then everything works out semantically correct for YAML, like it would in a sane language like Python or Ruby or hopefully Perl6. 
A Perl hash can be thought of as a Perl hashref that is stuck into the Perl symbol table. So it can use a '%' in front of it, and work in the builtins that manipulate hashes and have its crazy listify behaviour. Perl lists never exist as a data structure in memory. They are merely a concept relating to the call stack. This concept can be exploited nicely in YAML, since I don't need to serialize the listness. It seems incorrect usage to me to use tuples (which (I think) are a data structure), since people will want to preserve the tupleness. --- I just bounced this off of Jon Prettyman, the hacker whose house I'm staying with. He knows Perl a bit, but mostly does Python. I think this all boils down to the fact that in Perl you have arbitrarily variable length argument lists, but not in Python. I simply point this out to show that we'll have different APIs for some things. > How do you intend to add additional arguments to the dump method? > Where would the list of thingies to dump end, and where would the > formatting arguments start? It would be pretty ambiguous, and I think > once you add more arguments to dump(), you're back to my original > conundrum. OK. My interface for extra parameters is to specify them separately. In the simplest case, I use global variables. In the more complicated case I use an OO interface. $YAML::Indent = 1; YAML::Dump($foo); vs YAML->new->Indent(1)->dump($foo); This is the interface that Data::Dumper uses. It's familiar to Perl programmers. In a sense, the global variables are like attributes of the global object. Just like Ruby. Well, perhaps that's a bit of a stretch. Anyway, to answer your conundrum, with Dump(), what you pass in is what you get. That's the way I always wanted it. It gives me the symmetry of: $foo = Dump(@bar); @bar = Load($foo); Remember, I'm not serializing @bar here. Merely the nodes of its listification. Cheers, Brian |
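A Python analog of the list<=>stream<=>list symmetry above, using hypothetical toy functions (repr/eval stands in for real YAML emission and loading): dump() takes any number of documents via *args, and load() returns them as a list, so load(dump(*docs)) round-trips a series of disjoint documents.

```python
# dump() accepts zero or more disjoint documents; load() returns the same
# series back as a list.

def dump(*documents):
    return "".join("--- %r\n" % (d,) for d in documents)

def load(stream):
    return [eval(line[4:]) for line in stream.splitlines()
            if line.startswith("--- ")]

s = dump({"x": 1}, [2, 3], "z")
# load(s) == [{'x': 1}, [2, 3], 'z']
```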
From: Steve H. <sh...@zi...> - 2002-08-20 13:18:50
|
From: "Brian Ingerson" <in...@tt...> > > Right. YAML can't (and shouldn't) be able to do Perl arrays and hashes. It > can only deal with (Perl)refs to those. This is due to the screwed up and > convuluted semantics of Perl data structures. (See, I'm not defending Perl at > all here.) Perl sucks in this sense because it listifies its data structures > when they get put on the call stack. > > BUT... > > I one were to (and I've always done this) call hashrefs "hashes" and > arrayrefs "arrays", then everything works out semantically correct for YAML, > like it would in a sane language like Python or Ruby or hopefully Perl6. A > Perl hash can be thought of as a Perl hashref that is stuck into the Perl > symbol table. So it can use a '%' in front of it, and work in the builtins > that manipulate hashes and have its crazy listify behaviour. Ok, makes sense. > Perl lists never exist as a data structure in memory. They are merely a > concept relating to the call stack. This concept can be exploited nicely in > YAML, since I don't need to serialize the listness. It seems incorrect usage > to me to use tuples (which (I think) are a data structure), since people will > want to preserve the tupleness. > > --- > > I just bounced this off of Jon Prettyman, the hacker whose house I'm staying > with. He knows Perl a bit, but mostly does Python. > > I think this all boils down to the fact that in Perl you have arbitrarily > variable length argument lists, but not in Python. I simply point this out to > show that we'll have different APIs for some things. > Sorry to be pedantic, but Python does have arbitrarily variable length argument lists: def greet(greeting, *people): for person in people: print greeting + ' ' + str(person) greet('hello', 'ingy', 'steve') greet('aloha', 'jeff', 'guido', 'alex') The only substantive difference between Perl and Python in terms of calling sequence, that I see in terms of YAML anyway, is that Perl has wantarray, and Python doesn't. 
This is the only place where we'll be forced into different APIs. This variable argument list discussion is just a big red herring. If you were to forgo the use of wantarray in Perl, then I think we could have virtually identical APIs for yaml. On the other hand, I have also said that I like how our Perl library is Perlish, and our Python library is Pythonic, etc., so you could make a good argument for wantarray and I'd accept that. Consistency and hobgoblins... > > How do you intend to add additional arguments to the dump method? > > Where would the list of thingies to dump end, and where would the > > formatting arguments start? It would be pretty ambiguous, and I think > > once you add more arguments to dump(), you're back to my original > > conundrum. > > OK. My interface for extra parameters is to specify them separately. In the > simplest case, I use global variables. In the more complicated case I use > an OO interface. > > $YAML::Indent = 1; > YAML::Dump($foo); > > vs > > YAML->new->Indent(1)->dump($foo); > > This is the interface that Data::Dumper uses. It's familiar to Perl > programmers. In a sense, the global variables are like attributes of the > global object. Just like Ruby. Well, perhaps that's a bit of a stretch. > Yep, but global objects do defeat reentrancy, correct? I can guess you'll say YAGNI, but what if under some future version of Perl folks want to use YAML in two different threads? Won't the formatting variables conflict? Or suppose you're doing a YAML dump of a large structure, and then one of your objects has a to_yaml() method, and then, to help you debug the to_yaml() method, you make another call to YAML in there? Won't the inner call mess up the args for the outer call? If I want to avoid global variables in Python's dump protocol, I think the code below is my best option, which is inspired by your comments above: def _dump(options, *documents): # ... 
class Options: def __init__(self, indent=2): self.indent = indent def setIndent(self, indent): self.indent = indent return self def dump(self, *documents): return _dump(self, *documents) def dump(*documents): return _dump(Options(), *documents) Then you'd say: print Options().setIndent(4).dump({'foo': 'bar'}) print Options().setIndent(3).dump('Steve', [3, 4, 5], {'foo': 'bar'}) # three different documents print dump(['apple', 'banana', 'carrot']) # normal formatting So, I guess the way out of my conundrum was the same way you get out of it in Perl. So, despite all the flames (which were intended to be lighthearted, but they never read that way, I guess), we got something useful out of this discussion. :) On a related issue, Dave Kulhman has submitted a patch for PyYaml that adds a dumpToFile() method to the emitter. Minor issue here, but I think Dave and I both prefer dumpToFile() to dumpFile(), because that extra preposition makes it clear that the file's the target, not the source. |
From: Ned K. <ne...@bi...> - 2002-08-20 15:44:45
|
On Tuesday 20 August 2002 06:18 am, Steve Howell wrote: > Sorry to be pedantic, but Python does have arbitrarily variable > length argument lists: > > def greet(greeting, *people): > for person in people: > print greeting + ' ' + str(person) If it matters, Ruby is identical here (that is, you can arrange to have trailing arguments passed as an array). And since Ruby has a rather nice threading system, it should (and I believe _why has done this) not use any globals. -- Ned Konz http://bike-nomad.com GPG key ID: BEEA7EFE |