yaml-core Mailing List for YAML Ain't Markup Language (Page 321)

yaml-core — YAML Core Development List

Re: [Yaml-core] Changes to folding mechanism and elimination of "quoted strings"

From: Brian I. <briani@ActiveState.com> - 2001-05-19 15:30:35

"Clark C . Evans" wrote:
> There are three syntax forms for scalar values,
> "block", "stream", and "mime" which are used to
> accommodate differing needs for expressing content.

Sounds like a good idea...

> 
> Examples of the block form include:

These are my ideas from earlier, so I like them, of course.

> 
> Examples of stream form include:
> 
>   a: This is a simple one liner
ok
>   b: This has     intermediate spaces.
ok i think.
>   c: This is a multi line
>      value that wraps\taround and\n
>      uses an end of line marker.
uh huh
>   d:
>    \n      X
>    \n     X X
>    \n    X X X
>    \n
>    \n This one is just like "c" above.
>    \n Except that \\ escaping is required.

Why doesn't whitespace get folded between \n and X? I'm just not
perfectly clear on this.

And no emitter would ever *produce* this by default. Right? In other
words:

    \n      X\n     X X\n    X X X\n\n 
    This one is just like "c" above.\n 
    Except that \\ escaping is required.

Also, you haven't answered how to escape these simple values.

    e: "%"
    f: "@"
    g: "~"
    h: "% = 1.2"

I have a solution. See my next post...

> 
> An example of the mime form is...
> 

No changes there. Good.
-- 
perl -le 'use Inline C=>q{SV*JAxH(char*x){return newSVpvf
("Just Another %s Hacker",x);}};print JAxH+Perl'

RE: [Yaml-core] To YAML or not to YAML

From: Oren Ben-K. <or...@ri...> - 2001-05-19 08:30:42

Brian Ingerson wrote:
> I was chatting on IRC to some Perl gurus who suggested that YAML is no
> more a Markup Language than MIDI. Or than a kayak is an automobile. (To
> quote them)

I agree. But everyone knows YAML stands for

YAML Ain't a Markup Language

:-)

We could also switch the name... though we had better do that fast. Clark's
hat again, I guess.

Oren.

RE: [Yaml-core] Classes

From: Oren Ben-K. <or...@ri...> - 2001-05-19 08:25:15

Brian Ingerson wrote:
> > ... this makes it difficult to provide an extra
> > "class" property to list/scalar.
> 
> But you don't need a class name to deserialize into these three. They're
> automatic. Just for other classes. Right?

I'd agree for lists, but for scalars you may want the type to be "float"
instead of "string", or some such.

> I need to go to sleep 3 hours ago.

I just woke up, full of vim and vigor :-)

> I'm trying to follow this, and I
> think I'm agreeing with the logic. But I'm puzzled as to why the above
> code represents a point object. I don't see a class name of "point". Is
> this a typo? Or am I just as spent as I feel?

Probably :-) A 'point' class is rather useful if you are dealing with
geometry. You can pick another example - would "invoice" be better (id,
price, description, date, ...)?

Happy dreams ;-)

    Oren Ben-Kiki

[Yaml-core] YASP (Yet Another Syntax Proposal)

From: Oren Ben-K. <or...@ri...> - 2001-05-19 08:18:25

"There's always one more"...

I got a way around the need to "further indent" the second line in a list
block value, given Clark's new white space handling rules (which I like):

    list @
        value1
        value2
        |block
        |value3
        \
        |value4 - block with an empty
        |
        \line followed by a longer on
        |e which was broken
        \
        |value5  -  verbatim single line
        value6

That is, in a list, a block ends by an empty \ line, at least if followed by
another block. It may be simpler/more readable to require "a block ends with
an empty \" everywhere in lists:

        |verbatim  single line value5
        \
        value6

What do you think?
       
At any rate, no scalar marker required, ever. How about it, Brian? Can we
agree on the 'minimal' option, based on this?

Also, speaking of Clark's latest E-mails about whitespace handling, maybe we
should re-think the whole RFC822 issue.

One benefit is that a YAML parser won't have to implement the strange RFC822
text parsing rules, ever. I'm catching Brian's headache when trying to
figure these out, and I have no confidence at all in being able to define a
subset which is (1) simple and (2) will work on most existing such headers.
Clark is being optimistic here, I think; definitely his current white space
rules aren't such a subset.

If we give up on it we could also use '=' for separating keys and values.
Using Clark's notion that the default key is the empty one:

    point %
        x %
            = 1.2
            accuracy = 0.1

Maybe even allow a shorthand:

    point %
        x % = 1.2
            accuracy = 0.1

Everyone wins: Brian gets his '=', I get my '= default', Clark gets his
"default key is empty", and we all gain from a simpler implementation.

That's the neatest solution so far, I think...

I'm signing out for the next day or so - flying away to my XP conference.
I'll try to catch up on my E-mail Sunday evening (my time).

Until then,

    Oren Ben-Kiki

Re: [Yaml-core] Classes

From: Brian I. <briani@ActiveState.com> - 2001-05-19 08:13:28

Oren Ben-Kiki wrote:
> 
> Brian Ingerson wrote:
> > I can live without class names for Scalars and Arrays, but Hashes are a
> > must. Most perl objects map into Hashes.
> 
> Yes.
> 
> > Even Array and Scalar ones can
> > be serialized into Hashes.
> 
> Sounds interesting. How?

Hacked in at least:

    key : #class %
        __type__ : ARRAY
        0 : first elem
        1 : 2nd elem
        2 : 3rd elem

Or something like that.

> 
> > Hashes work for all Python objects and
> > Javascript too, AFAIK. I'll bet they could work for Java and C++ as
> > well.
> 
> In C++ (or Java) you'd want to de-serialize map/list/scalar as standard
> library hash/vector/string; this makes it difficult to provide an extra
> "class" property to list/scalar.

But you don't need a class name to deserialize into these three. They're
automatic. Just for other classes. Right?

> 
> Clark wrote:
> > Classes will be in.  I think we should allow these
> > permutations since the syntax seems to allow them.
> > We will need a disclaimer stating that not every
> > YAML system will be able to preserve the round-trip
> > information for some combinations, namely:
> >
> >   a) Having & and * on the same line.
> >   b) Using # with @ or with a scalar
> 
> I'd rather say - "NO strict YAML 1.0 system" instead of "Not every YAML
> system". These constructs simply aren't YAML 1.0, period. Otherwise, people
> will be very surprised when they write:
> 
>     point: %
>         x: #float 1.2
>         y: #float 3.4
> 
> And notice the YAML pretty printer strips away the '#float's.
> 
> Let's think a sec about *why* we want to add classes. It seems they are
> intended to allow a de-serialization into application-specific native data
> types (that is, something other than Hash/Vector/String).
> 
> I think that 99% of the time, the application specific data type is an
> object class (like "point", above). In which case it is represented in YAML
> as a map, with a key for each data member. It follows that the data type for
> each key is already determined by the map's class; e.g., class "point"
> expects float coordinate values. So there's no need to explicitly declare
> it.

I need to go to sleep 3 hours ago. I'm trying to follow this, and I
think I'm agreeing with the logic. But I'm puzzled as to why the above
code represents a point object. I don't see a class name of "point". Is
this a typo? Or am I just as spent as I feel?

Good night, Brian

[Yaml-core] Classes

From: Oren Ben-K. <or...@ri...> - 2001-05-19 07:36:42

Brian Ingerson wrote:
> I can live without class names for Scalars and Arrays, but Hashes are a
> must. Most perl objects map into Hashes.

Yes.

> Even Array and Scalar ones can
> be serialized into Hashes.

Sounds interesting. How?

> Hashes work for all Python objects and
> Javascript too, AFAIK. I'll bet they could work for Java and C++ as
> well. 

In C++ (or Java) you'd want to de-serialize map/list/scalar as standard
library hash/vector/string; this makes it difficult to provide an extra
"class" property to list/scalar.

Clark wrote:
> Classes will be in.  I think we should allow these 
> permutations since the syntax seems to allow them. 
> We will need a disclaimer stating that not every 
> YAML system will be able to preserve the round-trip 
> information for some combinations, namely: 
>   
>   a) Having & and * on the same line. 
>   b) Using # with @ or with a scalar 

I'd rather say - "NO strict YAML 1.0 system" instead of "Not every YAML
system". These constructs simply aren't YAML 1.0, period. Otherwise, people
will be very surprised when they write:

    point: %
        x: #float 1.2   
        y: #float 3.4

And notice the YAML pretty printer strips away the '#float's.

Let's think a sec about *why* we want to add classes. It seems they are
intended to allow a de-serialization into application-specific native data
types (that is, something other than Hash/Vector/String).

I think that 99% of the time, the application specific data type is an
object class (like "point", above). In which case it is represented in YAML
as a map, with a key for each data member. It follows that the data type for
each key is already determined by the map's class; e.g., class "point"
expects float coordinate values. So there's no need to explicitly declare
it.

Can anyone come up with a use case where it makes sense to assign a class to
something which isn't a data member of an object, and isn't an object by
itself? The only thing I can think of is top-level keys - when the whole
YAML file is de-serialized into a single object and there's no top-level
element to declare the type in. I think that declaring a top-level element
is good form in such a case, and that it isn't onerous to require that.

In the remaining 1% of the cases, there is a workaround. Use the good old
color idiom - that's exactly the sort of problem we invented it for, after
all. So, taking my "# :" syntax, using the "default value" concept Clarl put
in YAML for this explicit purpose (:-), and spicing it with using '=' as the
key for the "default value" (pretty intuitive), you get:

    point: %
        x: %
            =: 1.2
            #: float
        y: %
            =: 3.4
            #: float

OK, that's more verbose, but remember we are talking about 1% of the cases.
Besides, we should eat our own dog food - either we believe in the color
idiom, or we don't.

A word on the color idiom: This is a concept which was raised a while back
in SML-DEV, in relation to schema evolution, meta-data vs. data, etc. Take
the simple 'point' example:

    point: %
        x: 1.2
        y: 3.4

That's a simple 'object' with two data members, and you can write YAML code
handling it. Then, one day, you decide you want to add an accuracy
designation to each. Or maybe you round-trip data from XML people, and you
want to round-trip the information of whether 'x' and 'y' were attributes or
sub-elements. At any rate, you want to add meta-data to 'x' and 'y'.

The color idiom says: "color" the 'x' and 'y' elements with sub-elements
containing this meta data. Take the original value and place it in a
sub-element as well (this would be the "default" sub-element Clark talked
about. If you are stumped for a name for it, just use '=' by convention).

You get:

    point: %
        x: %
            =: 1.2
            accuracy: 0.1
            xml-syntax: attribute
        y: %
            =: 3.4
            accuracy: 0.2
            xml-syntax: attribute
        xml-syntax: empty-tag

The point is that using YAML's "default" rules, old code will continue to
work unbroken on this new data structure. The value of 'x' is still '1.2'.
If you aren't interested in accuracy, you just ignore it. 
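
A minimal Python sketch of the "default" lookup described above. Only the '=' key convention comes from this message; the helper name `default_value` and the use of plain dicts for maps are assumptions made for illustration, not part of any YAML API.

```python
# Assumption: maps are plain dicts and "=" is the conventional default key.
DEFAULT_KEY = "="

def default_value(node):
    """Return the plain value of a node, whether or not it has been colored."""
    if isinstance(node, dict) and DEFAULT_KEY in node:
        return node[DEFAULT_KEY]
    return node

# The original 'point' and its colored version from the message above.
plain = {"point": {"x": 1.2, "y": 3.4}}
colored = {
    "point": {
        "x": {"=": 1.2, "accuracy": 0.1, "xml-syntax": "attribute"},
        "y": {"=": 3.4, "accuracy": 0.2, "xml-syntax": "attribute"},
        "xml-syntax": "empty-tag",
    }
}

# Old code that only wants the coordinates reads both structures the same way.
for doc in (plain, colored):
    point = doc["point"]
    print(default_value(point["x"]), default_value(point["y"]))
```

Code that does care about the meta-data reads `point["x"]["accuracy"]` directly; code that does not never has to know it exists.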

This may affect the API in languages such as Perl/Python/JavaScript (I'll
have to think of this a bit). In Perl, for example, you'll have to define a
conversion to scalar context which retrieves the default value (this is
possible, right? It is a bit beyond my Perl mastery).

It turns out the "color idiom" solves a great many problems - schema
evolution, layering of processing modules, round-tripping to other syntax
forms, versioning, you name it. So Clark and I are rather fond of it :-)

Have fun,

    Oren Ben-Kiki

[Yaml-core] To YAML or not to YAML

From: Brian I. <briani@ActiveState.com> - 2001-05-19 07:23:13

I was chatting on IRC to some Perl gurus who suggested that YAML is no
more a Markup Language than MIDI. Or than a kayak is an automobile. (To
quote them)

Their point is that a Markup Language marks up text documents. XML is a
data language that can reasonably mark up text as well. YAML is not.
(This was their assertion.)

For instance, how would YAML mark up:

<QUOTE>I went to <A HREF="http://www.webvan.com">Webvan</A> to 
get my groceries</QUOTE>

I guess their point is that YAML is a misnomer.

I would probably YAML it like:

    QUOTE : 
        I went to <A HREF="http://www.webvan.com">Webvan
        </A> to get my groceries

But that still doesn't invalidate their point.

Just a thought.

Brian

-- 
perl -le 'use Inline C=>q{SV*JAxH(char*x){return newSVpvf
("Just Another %s Hacker",x);}};print JAxH+Perl'

[Yaml-core] Classes and Namespaces

From: Clark C . E. <cc...@cl...> - 2001-05-19 07:11:26

Since we seem to be making such wonderful progress
on the language, perhaps it is time to address
the requirements of class names.  I see a few goals:

  0. To have a brief mechanism.

  1. To allow the unpacking/packing mechanism 
     to associate particular structures with
     a validation technique.

  2. To allow the unpacking/packing mechanism
     to associate operational code with a 
     given structure in addition to loading
     the data directly.

  3. To enable both of these features to work in 
     the same language environment but perhaps with 
     different environment or "classpath" settings.

  4. Ideally, to allow these same identifiers 
     to operate across multiple languages.

Items 3 and 4 imply that a class name must be
globally unique, in other words, it must
be accompanied by a namespace mechanism.

Before we go into solutions, I want to quickly
say that XML's namespace mechanism is seriously
broken for a few reasons.  First, they let the
namespace be anything without providing any
discovery or resolution mechanism... the W3C
was far too flexible here.  Second, the namespace
mechanism is coupled with a prefix abbreviation
system; originally it was specified that this
prefix is not informational, but XSLT used the
prefixes as information and thus pushed the
prefixes into the canonical form... Ouch!
Third, they had several classes of how names
relate to elements.  Three strikes.  Bad W3C.

Anyway, as I'm tired, I'm not going to enumerate
the options.  Oren and I beat this one up one
side and down the other before we came up with
pretty much what Sun had already implemented
with Java, only our version was more restrictive
and more like RFC 882's solution.  The core 
problem is that goal 0 (briefness) conflicts
with the need for global uniqueness.   Here
is the proposal as I remember...

There are three classes of names.

1. The first class is globally unique
   based on reverse DNS packages.  Thus,
   "com.clarkevans.timesheet" is a valid
   class name which the user of the domain
   clarkevans.com has authority and control.
   Note that all of these names have at
   least one period.

2. The second class is globally unique
   names (without periods) as reserved 
   through a central registry... perhaps
   a semi-automated mechanism at yaml.org?
   Thus "date" may be an example of a class
   in this arena.

3. The third class is not globally unique
   but can be used for temporary development
   purposes.  These names start with "x-"
   and can be safely used *within* an  
   organization.  YAML texts using "x-" are
   said to be "local YAML" and should be
   discouraged beyond development.
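
A rough Python sketch of how a loader might tell the three classes of names apart. The function name `classify_class_name` and the set of characters allowed in a name segment are assumptions; the proposal above only fixes the "x-" prefix and the "at least one period" rule.

```python
import re

# Assumption: a name segment is letters, digits, "-" or "_".
# The proposal does not define the allowed characters.
_SEGMENT = r"[A-Za-z0-9_-]+"

def classify_class_name(name):
    """Return 'local', 'dns', or 'registry' for the three classes above."""
    if name.startswith("x-"):
        return "local"       # class 3: not globally unique, development only
    if re.fullmatch(rf"{_SEGMENT}(\.{_SEGMENT})+", name):
        return "dns"         # class 1: reverse DNS, at least one period
    if re.fullmatch(_SEGMENT, name):
        return "registry"    # class 2: no periods, reserved centrally
    raise ValueError(f"not a recognizable class name: {name!r}")

for example in ("com.clarkevans.timesheet", "date", "x-scratch"):
    print(example, "->", classify_class_name(example))
```

Checking the "x-" prefix first keeps local names out of the other two classes even when they happen to contain a period.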

Notes:


A. This mechanism is very well supported
   by Java's package mechanism.

B. This technique is possible through Python's
   package mechanism, but won't be perfect
   since they don't follow the religious
   DNS based system.

C. I'm not sure how this works with Perl,
   however, "org.cpan." prepended to any
   CPAN module may be a very good start.

D. In any case, this isn't perfect, so
   each YAML system will have to have some
   sort of "catalogue" system which associates
   the global identifiers with a local
   identifier.  This is common in SGML land.
   (XML unsuccessfully tried to duck it)

   This may seem ugly, but the local catalog
   can be a simple YAML config file!
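
A sketch of what such a local catalog lookup could look like in Python, with the catalog held as a plain dict (in practice it might be loaded from the YAML config file mentioned above). The names `CATALOG` and `resolve`, and the local targets chosen for each global identifier, are hypothetical.

```python
from datetime import date

# Hypothetical catalog: global class name -> local constructor.
# The targets are stand-ins for whatever the local system provides.
CATALOG = {
    "date": date.fromisoformat,
    "com.clarkevans.timesheet": dict,
}

def resolve(global_name):
    """Look up the local constructor registered for a global class name."""
    try:
        return CATALOG[global_name]
    except KeyError:
        raise LookupError(f"no local mapping for class {global_name!r}") from None

print(resolve("date")("2001-05-19"))
```

A missing entry is reported rather than guessed at, so an unknown global identifier never silently maps to the wrong local class.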

Anyway, it is definitely something to
chew on for a while.  However, I think
it's probably as good as we can get.
Perhaps Brian may have some new insights
into the process given that he's not 
from SGML/XML land.

Best,

Clark

[Yaml-core] The indenting method

From: Clark C . E. <cc...@cl...> - 2001-05-19 06:46:32

Brian Ingerson wrote:
| This is correct:
| 
|     text:
|         |this is my
|         |multiline?
| 
| this is not:
| 
|     text:
|        |this is my
|        |multiline?

And he wrote...
| >         key4 @
| >             * id
| >             : value
| >             %
| >               key : value
|   Why switch indent width?
| >             @
| >               : value


Ok. Here are the indenting rules as I 
understand them (and have demonstrated
through example, but as of yet have not
laid out as a spec).

Call the whitespace leading up to the first
printable character of a node its "indent".
For one node c to be considered a child of
another node p, the following must hold:

  1. The child must be indented further
     than its parent.  Specifically...

     Node c must have an indent i formed
     from the concatenation of the indent
     for p plus at least one additional
     whitespace character.

  2. There are not any intermediate nodes
     in the hierarchy.  Specifically...

     There does not exist another node b,
     such that b occurs after p and
     b occurs before c and the indent for
     p occurs in the indent for b and
     the indent for b occurs in the 
     indent for c and the indent of b and
     c are not equal and the indent for
     p and b are not equal.

  3. Indents compare literally.
      
     An indent with a tab in position n
     can only be equivalent to another
     indent with a tab in position n.

  4. The indentation is consistent
     among siblings.

     If a and b are both children of
     p, then the indent for a is
     equal to the indent for b.
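
The four rules above can be sketched in Python roughly as follows. This is one reading of the rules, not a spec: the function names are made up for illustration, and the sketch assumes one node per line with no blank lines.

```python
def indent_of(line):
    """Leading whitespace of a line, kept literally (rule 3: tabs != spaces)."""
    return line[: len(line) - len(line.lstrip(" \t"))]

def is_deeper(child_indent, parent_indent):
    """Rule 1: the child's indent is the parent's indent plus at least one char."""
    return child_indent.startswith(parent_indent) and len(child_indent) > len(parent_indent)

def parents(lines):
    """Return, for each line, the index of its parent line (None at top level)."""
    root = {"indent": None, "index": None, "child_indent": None}
    stack = [root]              # chain of currently open ancestors
    result = []
    for i, line in enumerate(lines):
        indent = indent_of(line)
        # Rule 2: the parent is the nearest open node that is strictly shallower.
        while len(stack) > 1 and not is_deeper(indent, stack[-1]["indent"]):
            stack.pop()
        parent = stack[-1]
        # Rule 4: all children of one parent share the same indent.
        if parent["child_indent"] is None:
            parent["child_indent"] = indent
        elif parent["child_indent"] != indent:
            raise ValueError(f"line {i}: indent differs from its siblings")
        result.append(parent["index"])
        stack.append({"indent": indent, "index": i, "child_indent": None})
    return result

example = [
    "key4 @",
    "    * id",
    "    : value",
    "    %",
    "      key : value",
    "    @",
    "      : value",
]
print(parents(example))
```

On the quoted example the indent width switches from four to two extra spaces, and that is accepted because each set of siblings is consistent; a sibling indented differently from its predecessors raises an error instead.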


So to your question "why switch indent width"
I answer: "because you can as long as it
is consistent".

And to your correct/incorrect item above...
I don't understand.  Both are legal.
In particular you want to allow more
than one indentation style so that
fragments can be concatenated as
needed and/or indented. 

I hope this makes sense.  Based on my
experiences, this is the Python method.

Best,

Clark

[Yaml-core] What to do with multi-line scalars in the context of a list?

From: Clark C . E. <cc...@cl...> - 2001-05-19 06:21:46

It seems that the *only* place we may now need
a scalar indicator is when the scalar has
multiple lines and occurs within a list.
I must say... this should be rather rare...

| key: @
|  Harry the flea
|   is not really hairy
|  |Harry the flea
|   |is not really hairy

As Oren and Brian agree, possible, but UGLY.

| 0) ':' is a key/value separator (We could change to '=' if it weren't
|    for self-imposed 822 restrictions)
| 1) ':' also is the scalar marker (Could be still be '$')
| 2) The scalar marker is optional, except to resolve ambiguity. (Lists of
|    single/multi-lines mixed together is the only example I can think of. If
|    you can think of a way out of that one <without shifting the '|'
|    paragraphs> then we can get rid of the scalar marker)
| 3) Any instance of ': :' can be collapsed to ':' without loss of
|    meaning.

I see four options:

  a) We can deal with the ugliness.
  b) We can re-introduce the scalar marker ($)
     that is optional pretty much everywhere
     excepting this case.
  c) We introduce a special marker only
     for this case (no optional stuff).
  d) We introduce indexes (did Oren propose this?)

I think "a" is out because it is ugly.  I think
"d" is out since adding a new item in the list
would require re-numbering... or nasty BASIC
10, 20, 30 style ugliness.

I'd also eliminate "b".  If we re-introduced
a scalar indicator, I strongly feel it should
not be the same as the key/value separator.
Also, I'm not fond of optional stuff... as
it causes exception handling in people's head.
Thus, if we did option "b", I'd like the
indicator to be mandatory for consistency.
But this is ugly.  Therefore, I think "b" is out.

This leaves us "c".  I say we use ":" to indicate
a multi-line scalar in this case and make it
a documented exception.  No use in trying to
rationalize it...

Best,

Clark

[Yaml-core] Changes to folding mechanism and elimination of "quoted strings"

From: Clark C . E. <cc...@cl...> - 2001-05-19 06:00:40

Earlier I had thought that RFC822 forced condensation of 
consecutive white spaces into a single space like HTML.  
Since I was only 1/2 right (true for structured headers)
and since the 1/2 that I was right about matters little 
to us, this requires some re-thinking.  Furthermore, I've
been unhappy about the "quoted  string" thingy.  Yucko.
Therefore, below is the new scalar value proposal.

...

There are three syntax forms for scalar values, 
"block", "stream", and "mime" which are used to
accommodate differing needs for expressing content.
The block form specializes in pre-formatted or
raw text, the stream form works well for 
unformatted content, the mime form is ideal 
for binary values or very large chunks of 
formatted text.

In the block form:
  a) trailing whitespace is significant
  b) end of line marker is also significant
  c) there is not a quoting or escaping mechanism
  d) each line must begin with the block symbol,
     which is currently "|"
  e) a backslash character can be used in place
     of the block symbol for the very last
     line to indicate that the block does not
     terminate with an end-of-line marker.

In the stream form:
  a) leading/trailing whitespace is not significant
  b) the end of line marker is also not significant
  c) consecutive whitespaces bounded by printable
     characters are significant
  d) standard backslash-style escaping (\n, \t, \\)
     is provided
  e) a trailing backslash is used to indicate that
     the last character of a line and the first
     character of the next line are adjacent
     without an intermediate space

In the mime form:
  a) the entire content is included as a separate
     MIME section and may be transfer-encoded
     using base64 or quoted-printable
  b) the content is assigned an identifier
  c) this identifier can be used to provide
     a reference to the content within the 
     YAML proper

Examples of the block form include:

  a: | This line does not need \ escaping.

  b: 
  |This is a block form
  |with two end of line markers.

  c: |
     |     X
     |    X X
     |   X X X
     |
     | This one has a leading
     | carriage return and
     | does not have a trailing
     \ end of line marker.
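
A small Python sketch of how the block form rules could be decoded. It assumes the indentation has already been stripped so each line starts with its marker, and that everything after the marker is taken verbatim (the proposal does not say whether a space after "|" belongs to the content). The name `decode_block` is hypothetical.

```python
def decode_block(lines):
    """Decode block-form lines (marker in column one) into a single string."""
    parts = []
    for n, line in enumerate(lines):
        marker, text = line[:1], line[1:]
        is_last = n == len(lines) - 1
        if marker == "|":
            # Rules a-c: text is kept as is, and the end of line is significant.
            parts.append(text + "\n")
        elif marker == "\\" and is_last:
            # Rule e: a backslash marker on the last line drops the final newline.
            parts.append(text)
        else:
            raise ValueError(
                "line %d: expected '|', or a backslash on the last line" % n
            )
    return "".join(parts)

print(repr(decode_block(["|This is a block form", "|with two end of line markers."])))
print(repr(decode_block(["|does not have a trailing", "\\end of line marker."])))
```

Because there is no escaping in this form, decoding is a single pass that never inspects the text after the marker.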

Examples of stream form include:

  a: This is a simple one liner
  b: This has     intermediate spaces.
  c: This is a multi line
     value that wraps\taround and\n
     uses an end of line marker.
  d:
   \n      X
   \n     X X
   \n    X X X
   \n
   \n This one is just like "c" above.
   \n Except that \\ escaping is required.
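
One possible reading of the stream form rules, sketched in Python. The thread leaves some details open (in particular how whitespace next to an escaped \n folds), so the choices here are assumptions: each line is trimmed, lines are joined with a single space unless the previous line ends in a lone backslash, and escapes are decoded last. The name `decode_stream` is hypothetical.

```python
import re

_ESCAPES = {"n": "\n", "t": "\t", "\\": "\\"}

def _ends_with_lone_backslash(text):
    """True if the text ends in an odd number of backslashes (rule e)."""
    return (len(text) - len(text.rstrip("\\"))) % 2 == 1

def decode_stream(lines):
    """Fold stream-form lines into one string and decode \\n, \\t and \\\\."""
    pieces = []
    glue = ""
    for line in lines:
        text = line.strip(" \t")         # rule a: edge whitespace is dropped
        tight = _ends_with_lone_backslash(text)
        if tight:
            text = text[:-1]             # rule e: join without a space
        pieces.append(glue + text)       # rule c: interior spaces are untouched
        glue = "" if tight else " "      # rule b: the line break itself is not kept
    folded = "".join(pieces)
    # Rule d: decode escapes left to right; unknown escapes are left alone.
    return re.sub(r"\\(.)", lambda m: _ESCAPES.get(m.group(1), m.group(0)), folded)

print(repr(decode_stream([
    "This is a multi line",
    "   value that wraps\\taround and\\n",
    "   uses an end of line marker.",
])))
```

Under these assumptions a space still follows each decoded \n when the value continues on the next line; a different folding rule would change that detail.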


An example of the mime form is...


Date: Sun, 13 May 2001 23:48:04 -0400
MIME-Version: 1.0
Content-Type: multipart/related; 
    boundary="================================"
X-YAML-Version: 1.0

--================================
Content-Type: text/plain; id="0001"

  XX XXX    XXXXX   XX   XX
   XXX XX       X   XX X XX
   XX      XXXXXX   XX X XX
   XX      X   XX   XXXXXXX
  XXXX     XXXXX X   XX XX

--================================
Content-Type: text/x-yaml

  raw: *(0001)

--================================--

Re: [Yaml-core] (backfill) YAML Meeting: Oren's Feedback]

From: Brian I. <briani@ActiveState.com> - 2001-05-19 05:39:07

"Clark C . Evans" wrote:
> 
> On Fri, May 18, 2001 at 04:45:06PM -0700, Brian Ingerson wrote:
> | >   text:
> | >    |this is my
> | >    |multi-line
> |
> | Hopefully, you saw my last comment on this. I now think the '|' should
> | be in the first column *after* indentation.
> 
> Thus allowing for...
> 
>    text:
>    |this is my
>    |multiline?

No.

This is correct:

    text:
        |this is my
        |multiline?

this is not:

    text:
       |this is my
       |multiline?

> | I can live without class names for Scalars and Arrays
> 
> Classes will be in.  I think we should allow these
> permutations since the syntax seems to allow them.
> We will need a disclaimer stating that not every
> YAML system will be able to preserve the round-trip
> information for some combinations, namely:
> 
>   a) Having & and * on the same line.
>   b) Using # with @ or with a scalar
> 
> I don't like the idea of forbidding them, however.
> These constructs just are not supported in every
> language... oh well.  If someone *has* to make use
> of these constructs, they will be available at the
> parser level.

Sounds great.

, Brian

-- 
perl -le 'use Inline C=>q{SV*JAxH(char*x){return newSVpvf
("Just Another %s Hacker",x);}};print JAxH+Perl'

Re: [Yaml-core] (backfill) YAML Meeting: Oren's Feedback]

From: Clark C . E. <cc...@cl...> - 2001-05-19 05:24:38

On Fri, May 18, 2001 at 04:45:06PM -0700, Brian Ingerson wrote:
| >   text:
| >    |this is my
| >    |multi-line
| 
| Hopefully, you saw my last comment on this. I now think the '|' should
| be in the first column *after* indentation.

Thus allowing for...

   text:
   |this is my
   |multiline?

Ok.  I can rationalize this...

| Actually there's no restriction. It would look like this:
| 
|    text: |does X align vertically?
|          |          X

Right.


| > Also, to enable concatenation, I think we should
| > allow IDs to be shadowed.  In other words,
| 
| I support this.

Good.

| I can live without class names for Scalars and Arrays

Classes will be in.  I think we should allow these
permutations since the syntax seems to allow them.
We will need a disclaimer stating that not every
YAML system will be able to preserve the round-trip
information for some combinations, namely:
  
  a) Having & and * on the same line.
  b) Using # with @ or with a scalar

I don't like the idea of forbidding them, however.
These constructs just are not supported in every
language... oh well.  If someone *has* to make use
of these constructs, they will be available at the
parser level.

| Now we could make a distinction between Objects and Hashes.

I'd like the distinctions to reflect differences
in the YAML syntax... not in how a host language
treats the distinctions.  Hopefully this makes sense.

| Don't leave me without an object model or I'll go Postal %

We won't leave you without objects.  It is an 
explicit goal not to have our own object model, 
like XML has DOM.  What a mess.  *smile*

Best,

Clark



----- End forwarded message -----

[Yaml-core] RFC 822 and YAML

From: Clark C . E. <cc...@cl...> - 2001-05-19 05:04:57

I've been reading quite a bit more on RFC822, and some
of the information that I stated earlier was only partially
correct.  This is an attempt to clarify.

1.  There are two types of header fields in RFC822, 
    structured and unstructured. 

2.  In both cases, the header name may only include 
    printable characters (33-126) excluding the colon. 
    Thus, a space is not permitted between the end of 
    the name and the colon.

3.  In both cases, a header value may contain any
    sequence of ASCII characters (1-127), although
    CR and LF are not significant.

4.  In both cases, before any whitespace character,
    a CRLF pair can be inserted or removed as 
    necessary to wrap the line to the required number
    of columns (76 in old systems, 250 in newer systems)

5.  In both cases, each header line is delimited by
    a CRLF followed by a printable ascii character
    in line one.  Note that this is consistent.

6.  Structured fields add significantly more 
    constraints:
   
    a. They introduce the notion of "comments" by
       using parentheses
    b. They have the concept of a domain literal,
       which is specific to e-mail requirements.
    c. They have "quoted strings" used to allow
       the usage of special characters without
       escaping.
    d. Read the spec... it's tedious.

    Note: In particular (I mis-informed earlier...)
    consecutive whitespace in quotes are not preserved
    within structured fields!  
   
7.  Structured fields also treat multiple linear space
    characters (tabs and spaces) as a single space.

8.  The header terminates with a CRLFCRLF sequence,
    otherwise known as a blank line.

9.  For RFC 822 compliance, mandatory headers must
    include: Date, From and either To or BCC.

10. Interesting to note that vCard uses the
    semi-colon in names to indicate a type
    hierarchy, it also includes key=value
    parameters, for example...
      TEL;WORK;FORMAT=X:

11. Also interestingly, mbox is concatenated
    RFC822 (header+body) messages, where a "From"
    line (without the colon) containing e-mail
    address and a date is used to indicate the
    start of a new message
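
Points 2, 4 and 8 above can be sketched in a few lines of Python. This is only a minimal illustration of unstructured header parsing, not a complete RFC822 parser; the function name parse_headers and the sample addresses are made up for the example.

```python
import re

# Header names: printable ASCII 33-126, excluding the colon (point 2).
# '!' is 33, '9' is 57, ';' is 59 and '~' is 126, so ':' (58) is skipped.
NAME_RE = re.compile(r"^([!-9;-~]+):(.*)$", re.DOTALL)


def parse_headers(text):
    """Parse unstructured headers up to the first blank line (point 8)."""
    head = text.split("\r\n\r\n", 1)[0]
    # Unfold: a CRLF directly before a space or tab is not significant (point 4).
    unfolded = re.sub(r"\r\n(?=[ \t])", "", head)
    headers = []
    for line in unfolded.split("\r\n"):
        if not line:
            continue
        match = NAME_RE.match(line)
        if match is None:
            raise ValueError("not a header line: %r" % line)
        # Unstructured values keep their internal whitespace as-is.
        headers.append((match.group(1), match.group(2).strip()))
    return headers
```

Note that a name followed by a space and then a colon is rejected, which is the behaviour point 2 describes.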

Impacts:

A. To remain "consistent", we should use the 
   colon as the magic separator between the 
   key and value in a map.  Further, we should
   allow the key to be flush against the colon.

B. Due to the folding constraints (#4 above), 
   YAML will not be valid RFC 822.  No way 
   around this without significant changes,
   and significant changes are not possible.
  
C. The previous idea that we needed "quoting" 
   to follow RFC 822 was incorrect.  We are
   free to design whatever meaning we require
   for "quoted strings".

D. The multiple space condensation rules only 
   apply to *Structured* RFC822 headers.  Thus,
   this process may have to be suspended when
   in the RFC822 headers.

E. The mandatory headers can be used as a 
   guide as to whether the section is RFC822 or YAML.

Summary:

  A minimal RFC822 support (unstructured only) 
  is going to be a cake walk to implement and
  will be included in the spec.

  We may want to re-consider our "a  b" 
  technique... although I still like it.

I apologize for any misconceptions that I
may have propagated earlier.

Kind Regards,

Clark

Re: [Yaml-core] RE: Meeting Minutes

From: Brian I. <briani@ActiveState.com> - 2001-05-19 00:32:58

Oren Ben-Kiki wrote:
> 
> > > I assume the trailing ':' is a typo?
> >
> > No. See earlier post message for the reasoning.
> 
> > > That leaves "class" as the only problematic issue...
> >
> > I've read through this briefly, but don't have time to comment yet.
> > Let's stick with the original syntax for now.
> 
> I don't know about that; it is easier not putting something in than taking
> it out later.

It *is* optional. All ':' are optional except to distinguish a
multi-line from several single lines.

> 
> > In general, keep in mind that YAML 1.0 will *not* be the final YAML
> > spec. It will evolve to YAML 2.0 and so on. For now, let's strive for
> > maximum syntactic simplicity.
> 
> That's why I'd rather leave #class out of it until proven necessary.

* ingy is trying to suppress postal tendencies * ;)

> > While your comment on aesthetics may be true, there is a major
> > distinction between what you think a ':' means and my intent.
> >
> > 1) A ':' is always a key value separator. We agree on that, but each
> > want it to have one other meaning.
> > 2) You want colon to be a "list bullet" in list context.
> > 3) I want ':' to mean '$' for scalar values. And I want it to almost
> > always be optional (unless there is ambiguity)
> > 4) That said. We can make it the canonical/default form for emitters if
> > we wish.
> 
> Actually, I thought of it differently, since I started with the RFC822 frame of
> mind. The idea was to combine two concepts:

Ahh. Whacking everything with the RFC822 hammer ;)

> - Unify Clark's Python-like indentation with RFC822 concept that (more)
> indented lines continue a value;
> - Make each YAML "element" have an RFC822 header line of its own.
> 
> > 3) Minimal
> 
> Nope. Minimal is:
> 
> >     key7 : #class3 @
> >         Tom the flea
> >         Dick the flea
>           Harry the flea
>            is not really hairy
> >         %
> >             foo : bar
> >         #class4 %
> >             FOO : BAR
> >         #class5 A very classy flea
> >         #class6
> >             |My favorite fleas:
> >             |     Jim
> >             |     Bob
> 
> You don't really need it. Since the next line is "more indented", it is a
> continuation line. It also works for aligned text:
> 
>     |Harry the flea
>      |is not really hairy

Big ouch.

> Pretty it isn't, but it is minimal :-)

Agree. On both counts.

> Here's another option (call it 5):
> 
> If we want to think of ':' as a "scalar marker", we could say that the
> syntax is:
> 
>     map %
>         key1 % (id1)
>             key : value
>         key2 : value
>         key3 * (id2) id1
  Boggle %l
>         key4 @
>             * id
>             : value
>             %
>               key : value
  Why switch indent width? 
>             @
>               : value
> 
> This is consistent; a value is always prefixed by its marker (@, %, : or *).
> No need to write ": %" or for that matter "map: %"; in a map, the syntax is
> <key> <value> where the ':' is just one of the options. RFC822 is simply a
> top level map with keys having only text values.

This might have a chance of general acceptance if you replaced ':' with
'$' in all cases. You won't accept that because of the 822 thing. And I
couldn't accept it because a key/value separator is visually important
if nothing else.

> 
> BTW, I tried switching back to (id) instead of &id - that's consistent with
> RFC822's "comment" concept, emphasising the id is not part of the data
> model. We could keep it as &id, it doesn't matter much.

I can go either way.

> Aesthetic is important, or S-expressions would rule over the world.

Agreed, but it's not a common aesthetic. It's your personal one. My
feeling is that it would bother a lot of people, especially if it was
required.

> 
> Besides, aesthetics aside, being consistent is also important. This rules
> out option 4 for me, even though it looks nice (consistency over aesthetics
> :-). Options 1 and 2 are too noisy for me... That leaves us with option 3
> (the corrected one, without the ':'), option 5, and my original option 6.

I see no inconsistencies in #4, if you accept the premises I laid out:

0) ':' is a key/value separator (We could change to '=' if it weren't
for self-imposed 822 restrictions)
1) ':' also is the scalar marker (Could still be '$')
2) The scalar marker is optional, except to resolve ambiguity. (Lists of
single/multi-lines mixed together is the only example I can think of. If
you can think of a way out of that one <without shifting the '|'
paragraphs> then we can get rid of the scalar marker)
3) Any instance of ': :' can be collapsed to ':' without loss of
meaning.

> Have fun,

I cut myself shaving today. That's about all the fun I've had so far...

Brian

-- 
perl -le 'use Inline C=>q{SV*JAxH(char*x){return newSVpvf
("Just Another %s Hacker",x);}};print JAxH+Perl'

Re: [Yaml-core] (backfill) YAML Meeting: Oren's Feedback

From: Brian I. <briani@ActiveState.com> - 2001-05-18 23:48:19

"Clark C . Evans" wrote:
> How about we define indentation as leading whitespace?  In this
> case, the minimal indentation for this style would be two.
> 
>   text:
>    |this is my
>    |multi-line
> 

Hopefully, you saw my last comment on this. I now think the '|' should
be in the first column *after* indentation.

> |
> | > F) I'd like to push for this always starting on the next line if it is a
> | > map value. It has no relation to RFC822.
> |
> | What's the harm in allowing:
> |
> |     text: | Spaces   and " and \n, oh my!
> |
> | ('"' and '\' meaning themselves in this text).
> |
> | I don't see why we need to make this distinction between multi- and single-
> | line text at all. It is bad enough we provide two different quoting
> | mechanisms...
> 
> Because then we have to answer questions like...
> 
>   text: |does X align vertically?
>    |          X
> 
> I like Brian's restriction on this form.

Actually there's no restriction. It would look like this:

   text: |does X align vertically?
         |          X

But this would *not* be the default.

> Yes. Option C.  For the sequential API, we will have
> to expose this, but not in the in-memory representation.
> And in the sequential API, it can be wrapped as an
> opaque handle (without a string representation).
> Under no circumstances do we want this ID to become
> "data" as the XML prefix has... lest we not have a
> reasonable canonical form.
> 
> Also, to enable concatenation, I think we should
> allow IDs to be shadowed.  In other words,
> 
>   a : &0001  "This is a value"
>   b : *0001
>   c : &0001  "This is another value"
>   d : *0001
> 
> In this case, d->c and b->a.  After c, there is no
> way to access a by reference.  Simple solution,
> and this way concatenation is still well defined.
> 

I support this.
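
For the record, the shadowing rule is cheap to implement. Here is a rough Python sketch, assuming the parser has already reduced each line to a (key, kind, ident, value) tuple; that tuple layout and the name resolve are made up for illustration.

```python
def resolve(entries):
    """Resolve aliases with shadowing.

    entries: list of (key, kind, ident, value) tuples, where kind is
    'anchor' for '&ident value' and 'alias' for '*ident'.
    """
    anchors = {}
    result = {}
    for key, kind, ident, value in entries:
        if kind == "anchor":
            # A later definition of the same ID shadows the earlier one.
            anchors[ident] = value
            result[key] = value
        else:
            # An alias sees whichever definition is current at this point.
            result[key] = anchors[ident]
    return result
```

Because an alias only ever looks backwards, concatenating two documents that reuse the same IDs stays well defined.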

> | That leaves "class" as the only problematic issue. We explicitly
> | decided not to talk about it in the conference call. It seems to
> | me like there's no way around requiring that this data will survive
> | round-trips, but I also don't see how it is possible to de-serialize
> | "scalar value" into a normal "Java String" if someone attached an
> | "unknown" class to it.
> 
> If classes aren't available in the target environment,
> or if the class requested can't be found, then we
> have a slight problem.  A reasonable solution is
> to notify the user via a warning, and then create
> an auxiliary yaml-class-map which maps lists/strings/maps
> that have been created by the load with their
> corresponding class name.  In this way we keep the
> native structures, but preserve the class names through
> a "coloring archive" on the side.
> 
> | So, the idea is to bite the bullet and remove "class"
> | as something specified in the "label line" (BTW, we need
> | to define some terminology here; I'm using "label lines"
> | and "text lines" - or maybe it should be "content lines"?).
> | It turns out that we can still achieve most of the goals
> | of the "class" construct by making the key "#" magical in maps:
> |
> |     center: %
> |         #: point
> |         x: 35.3
> |         y: 42.1
> 
> Interesting.  However, this prevents class names for
> Strings or Lists.  Very interesting.  What do we do about
> Strings and Lists?  Move this into the category of "non-portable"
> constructs, like a & and * on the same line?  I'm not sure.
> The "coloring on the side" may be more painful (esp garbage
> collecting), but it at least does not get in the way.  Hmm.

I can live without class names for Scalars and Arrays, but Hashes are a
must. Most Perl objects map into Hashes. Even Array and Scalar ones can
be serialized into Hashes. Hashes work for all Python objects and
Javascript too, AFAIK. I'll bet they could work for Java and C++ as
well. 
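
To make the hash case concrete, here is a small Python sketch of the magic "#" key quoted above. It is only an illustration; the helper name split_class is hypothetical, and what a loader does with the class name afterwards is left open.

```python
def split_class(node):
    """Return (class_name, plain_map) for a map that may carry a '#' key.

    Nodes that are not maps, or maps without '#', come back unchanged
    with a class name of None.
    """
    if isinstance(node, dict) and "#" in node:
        plain = {k: v for k, v in node.items() if k != "#"}
        return node["#"], plain
    return None, node
```

With the "center" example, this yields the class name "point" plus an ordinary hash holding x and y.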

Now we could make a distinction between Objects and Hashes. Just have
Objects be separate regardless of their implementation. Just pick
another symbol. '&' is available if we go back to %(0001) notation.

Don't leave me without an object model or I'll go Postal %\


Still smiling, Brian

-- 
perl -le 'use Inline C=>q{SV*JAxH(char*x){return newSVpvf
("Just Another %s Hacker",x);}};print JAxH+Perl'

[Yaml-core] RE: Meeting Minutes

From: Oren Ben-K. <or...@ri...> - 2001-05-18 22:52:52

> > I assume the trailing ':' is a typo?
> 
> No. See earlier post message for the reasoning.

> > That leaves "class" as the only problematic issue...
> 
> I've read through this briefly, but don't have time to comment yet.
> Let's stick with the original syntax for now. 

I don't know about that; it is easier not putting something in than taking
it out later.

> In general, keep in mind that YAML 1.0 will *not* be the final YAML
> spec. It will evolve to YAML 2.0 and so on. For now, let's strive for
> maximum syntactic simplicity.

That's why I'd rather leave #class out of it until proven necessary.

> Emailed Damian last night. He's preparing for an 11-week world speaking
> tour. I'll see him in June at the YAPC (Yet Another Perl Conference) in
> Montreal and I'll be sure to pin him down about YAML. BTW, I mentioned
> to Clark that I'll probably be speaking about YAML at YAPC :)

Now, if that was early July, I might have been able to attend. Oh well.

> While your comment on aesthetics may be true, there is a major
> distinction between what you think a ':' means and my intent. 
> 
> 1) A ':' is always a key value separator. We agree on that, but each
> want it to have one other meaning.
> 2) You want colon to be a "list bullet" in list context.
> 3) I want ':' to mean '$' for scalar values. And I want it to almost
> always be optional (unless there is ambiguity)
> 4) That said. We can make it the canonical/default form for emitters if
> we wish.

Actually, I thought of it differently, since I started with the RFC822 frame of
mind. The idea was to combine two concepts:

- Unify Clark's Python-like indentation with RFC822 concept that (more)
indented lines continue a value;
- Make each YAML "element" have an RFC822 header line of its own.

That's why I suggested the syntax:

map: %
  key: value
list: @
  : text
  : %
    key: multi line
         value
  : @
    : value

Etc.; Every element is logically a single RFC822 header line, with
indentation playing a dual role (both "continue a text value" and "continue
a structured value"). It just happens that in some lines the "field name" is
empty, that's all.

The ':' isn't a "scalar marker", it is a "YAML element marker". I still
don't see why you need a "scalar marker" as such, be it $, : or whatever. I
realize that Perl makes such a marker a natural thing to think of, but all
it does is clutter the text. 

(BTW, maybe one day we'd want to allow:

   list: @
      0 : 0th position
      10 : 10th position

Etc. Admittedly sparse lists aren't that much of a use case... but the above
sure beats having to specify 9 null values, doesn't it? :-)

> Consider the following four examples.
> 
> 1) Fully qualified with '$'.

Rather verbose...

> 2) Fully qualified with ':'. The only real gain here is no " for $40.00.

Right. Not much of an improvement.

> 3) Minimal

Nope. Minimal is:

>     key7 : #class3 @
>         Tom the flea
>         Dick the flea
          Harry the flea
           is not really hairy
>         %
>             foo : bar
>         #class4 %
>             FOO : BAR
>         #class5 A very classy flea
>         #class6
>             |My favorite fleas:
>             |     Jim
>             |     Bob
> 
> Note that the only required ':' (besides the key/value ones) is for ':
> Harry the flea'

You don't really need it. Since the next line is "more indented", it is a
continuation line. It also works for aligned text:

    |Harry the flea
     |is not really hairy

(To Clark's question about this: All the indentation, up to and including
the '|' starting the line, is removed, and the result is the verbatim text.
So it is well defined even if the '|' in the first line isn't on the same
column as in the rest of the lines).
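
In code, that rule is roughly the following Python sketch. The function name verbatim is made up, and joining the lines with "\n" (with no trailing newline) is an assumption, not something settled here.

```python
def verbatim(lines):
    """Strip indentation up to and including the leading '|' on each line."""
    out = []
    for line in lines:
        stripped = line.lstrip(" \t")
        if not stripped.startswith("|"):
            raise ValueError("missing '|' marker: %r" % line)
        # Everything after the marker is kept exactly as written.
        out.append(stripped[1:])
    return "\n".join(out)
```

The column of the '|' never enters into it, so the two "Harry the flea" lines above come out flush even though their markers are not aligned.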

Pretty it isn't, but it is minimal :-)

> 4) Suggested canonical form:

The difference between it and my original proposal is that ": %" and ": @"
were collapsed into "%" and "@", as a shorthand. I'd still put an ID, say,
*after* the ':', because it isn't a scalar marker...

> The things I don't allow are:
> ...
>         : : a scalar
>         : #class : a scalar

Obviously the second ':' is unnecessary, so I agree with you here.

>         : %
>         : @
>         : #class %
>         : #class @

Here's another option (call it 5):

If we want to think of ':' as a "scalar marker", we could say that the
syntax is:

    map %
        key1 % (id1)
            key : value
        key2 : value
        key3 * (id2) id1
        key4 @
            * id
            : value
            %
              key : value
            @
              : value

This is consistent; a value is always prefixed by its marker (@, %, : or *).
No need to write ": %" or for that matter "map: %"; in a map, the syntax is
<key> <value> where the ':' is just one of the options. RFC822 is simply a
top level map with keys having only text values.

BTW, I tried switching back to (id) instead of &id - that's consistent with
RFC822's "comment" concept, emphasising the id is not part of the data
model. We could keep it as &id, it doesn't matter much.

Just so we'll have numbers for all of these, call my original proposal
option 6:

    map : %
        sub map : % &id1
            key : value
        text key : value
        ref key : * &id2 id1
        list key : @
            : *id
            : value
            : %
              key : value
            : @
              : value

> The problem with ':' as a "list bullet" is that it could not be
> optional. And that's too restrictive just to satisfy a personal
> aesthetic.

Aesthetic is important, or S-expressions would rule over the world.

Besides, aesthetics aside, being consistent is also important. This rules
out option 4 for me, even though it looks nice (consistency over aesthetics
:-). Options 1 and 2 are too noisy for me... That leaves us with option 3
(the corrected one, without the ':'), option 5, and my original option 6.

I think I like 5 the most...

Have fun,

    Oren Ben-Kiki

Re: [Yaml-core] Re: Meeting Minutes

From: Clark C . E. <cc...@cl...> - 2001-05-18 21:55:55

On Fri, May 18, 2001 at 01:54:24PM -0700, Brian Ingerson wrote:
| While your comment on aesthetics may be true, there is a major
| distinction between what you think a ':' means and my intent. 

Not to make things worse, but I don't particularly
like ":" being used for two different meanings.
I'd rather have the $ back, people can always
quote currency amounts:

  total : "$540.00"

No problem here.

| 1) Fully qualified with '$'.
| 
|     key1 : $ my dog has fleas
|     key2 : $ "$40.00 for veterinarian exam"
|     key3 : $
|         The vet said, "Yes Ingy,
|         Your dog has fleas."
|     key4 : $ Ingy said, "Wow,
|              my dog has fleas!"
|     key5 : #class1 $ I hate fleas
|     key6 : #class2 $
|         What is your viewpoint
|         about fleas?
|     key7 : #class3 @
|         $ Tom the flea
|         $ Dick the flea
|         $ Harry the flea
|           is not really hairy
|         %
|             foo : bar
|         #class4 %
|             FOO : BAR
|         #class5 $ A very classy flea
|         #class6 $
|             |My favorite fleas:
|             |     Jim
|             |     Bob

I like the above.  It's clean, despite 
being a bit "noisy".

5) Minimal

    key1 : my dog has fleas
    key2 : $40.00 for veterinarian exam
    key3 :
        The vet said, "Yes Ingy,
        Your dog has fleas."
    key4 : Ingy said, "Wow,
           my dog has fleas!"
    key5 : #class1 I hate fleas
    key6 : #class2
        What is your viewpoint
        about fleas?
    key7 : #class3 @
        Tom the flea
        Dick the flea
        $ Harry the flea
          is not really hairy
        %
            foo : bar
        #class4 %
            FOO : BAR
        #class5 A very classy flea
        #class6
            |My favorite fleas:
            |     Jim
            |     Bob

6) Canonical

    key1 : my dog has fleas
    key2 : $40.00 for veterinarian exam
    key3 :
        The vet said, "Yes Ingy,
        Your dog has fleas."
    key4 : Ingy said, "Wow,
           my dog has fleas!"
    key5 : #class1 I hate fleas
    key6 : #class2
        What is your viewpoint
        about fleas?
    key7 : #class3 @
        $ Tom the flea
        $ Dick the flea
        $ Harry the flea
          is not really hairy
        %
            foo : bar
        #class4 %
            FOO : BAR
        #class5 A very classy flea
        #class6
            |My favorite fleas:
            |     Jim
            |     Bob


Thoughts?  I don't mind escaping "$382.00", as 
this is a small use case anyway, and I'd rather
stick with Perl's type indicators.

Best,

Clark

[Yaml-core] Re: Meeting Minutes

From: Brian I. <briani@ActiveState.com> - 2001-05-18 20:57:39

Oren Ben-Kiki wrote:
> 
> > > 1. Brian stated that he would investigate Oren's Syntax
> > >    and get back with us if it meets Perl's serialization
> > >    requirements for hard references.  If not, specify
> > >    what alternatives we can use.
> >
> > I don't think it's that important to investigate. It will probably
> > always be a moot point. I will let Data::Denter use its current scheme
> > to deterministically round-trip all Perl data structures. YAML.pm
> > probably will have no need for this. It's all academic and I have no
> > spare time for academics for three more months. (My guess is, yes it
> > could be made to work, but would be suboptimal for Perl people) Let's
> > leave it at that for now.
> 
> Does that mean we are giving up on Denter using YAML syntax (extended to
> handle pointer-to-pointer)?

Just for the record, the Perl component for YAML is called YAML.pm.
Data::Denter is only of interest to Perl programmers from this point on.
It may fondly be remembered as the catalyst for YAML 1.0. And it may
keep a greedy eye on the YAML project's treasures, but that's of no
concern here.

> 
> I'm going to go over it with a fine-tooth comb, just to see what is involved
> in making YAML a superset of it. I guess I'll also have to look at MIME
> while I'm at it, with the same comb :-)

Beware of the nits! Nasty buggers. ;)

> 
> > On 4 & 5. I don't really like the blank line at the beginning thing
> > because people will mess it up or not understand it. And we have many
> > heuristic options.
> >
> > A) Parse lookahead for X-YAML-Version
> > B) Option-A rarely needed because as soon as we see a key that is *not*
> > RFC822 compliant, we assume YAML. 99% of the time this is the first
> > line!
> > C) If there is no whitespace allowed before the colon in RFC822, we
> > simply make it a requirement in YAML. Or does this break your RFC
> > compatibility rules?
> >
> > Just for my own edification, would you please explain the rationale
> > behind making YAML RFC822 compliant. And do so with one or more specific
> > examples. Thanks :)
> 
> Well, for example, suppose that YAML was a "good enough" superset of RFC822.
> Then we could just adopt my idea that "blank lines separate top-level maps"
> and we wouldn't have to say anything further about RFC822 headers, period.
> If one wants to read/write a mail message as a YAML document, then it will
> simply work (as long as he sticks to the "safe" constructs there). If one
> wants to have a YAML document that has nothing to do with RFC822, that also
> works. No need for any special statement about them. I like this approach
> best.

I think that sounds right, if I understand it correctly. 

My only contention above was the very first blank line, not the ones
separating documents.
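
Heuristic B quoted above (combined with the whitespace-before-colon rule from C) could look something like this in Python. It is a sketch only; guess_format is a hypothetical name, and treating "ran out of lines" as RFC822 is an arbitrary choice for the example.

```python
import re

# An RFC822 header name is printable ASCII (33-126, no colon) flush against ':'.
RFC822_KEY = re.compile(r"^[!-9;-~]+:")


def guess_format(lines):
    """Return 'yaml' as soon as a key is not RFC822 compliant, else 'rfc822'."""
    for line in lines:
        if line == "":
            break  # a blank line ends the header section
        if line[0] in " \t":
            continue  # continuation line, carries no key
        if RFC822_KEY.match(line) is None:
            return "yaml"
    return "rfc822"
```

A line such as "key1 : my dog has fleas" is classified as YAML on sight because of the space before the colon, which is the "99% of the time this is the first line" case.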

> >
> >     " this is the hash\n key for this example  :-) " : #class :
> 
> I assume the trailing ':' is a typo?
> 

No. See earlier post message for the reasoning.

> >        |# My Perl Subroutine
> >        |
> >        |  sub version {
> >        |      if ($_[0] =~ /\n/) {
> >        |          return \ "\to sender";
> >        |      }
> >        |  }
> >
> > Sorry for overloading this example with so many weird things. I'll just
> > comment on the multiline semantics:
> >
> > A) Trailing whitespace is preserved if the transporter preserves it.
> > B) The content can always be encoded before transport anyway.
> > C) Nothing is escaped. The content is truly verbatim. A '\' is a '\'.
> > D) An implicit newline is assumed to be at the end of every line.
> 
> We have to decide what our position is about them, BTW. Is a newline a "\n"
> or a "\r\n" - the answer may be different in-memory and in the text file
> (and thank you, O nameless DOS/CPM programmer, for inflicting this on us :-)

Bastard of Bastards. :(

But I think the heuristic is quite simple. Since the newline is
implicit, just replace whatever is there with the system's native
choice.
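
Something along these lines, sketched in Python. The name normalize_newlines is made up, and the native parameter exists only so the behaviour can be shown independently of the platform.

```python
import os
import re


def normalize_newlines(text, native=os.linesep):
    """Replace every CRLF, lone CR or lone LF with the native line ending."""
    # CRLF is listed first so that it is consumed as a single line ending.
    return re.sub(r"\r\n|\r|\n", lambda _match: native, text)
```

For instance, normalize_newlines("a\r\nb\nc", native="\n") gives "a\nb\nc".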

> 
> > E) Note that the '|' is one column back from the actual indentation
> > level. This is intentional. And it will work even if the indent width
> > is set to one character wide. (not mandatory, but I like it.)
> 
> Under Python indentation rules, there's no problem indenting the "label"
> line by 4 characters and the text lines by 7, or whatever. What you say
> about one character indentation, however, implies that the following would
> be legal:

Yes. It would be legal.

> 
>     text:
>     |multi-line
>     |text
> 
> I'm not certain I like it. I think Clark should make the call here -
> indentation is his baby.

I actually don't like it for another subtle reason. Tabs. You couldn't
use them properly with this scheme. So let's scrap the backing up one
space requirement. And yes, that's my final answer ;)

> 
> I started thinking about it and hit on an issue which Brian may already have
> thought about - or will have to very soon, if he's covering YAML.pm :-) The
> problem is we haven't defined the data model (or, viewing it differently,
> the round-tripping issue).
> 
> In "dynamic" languages such as Perl, JavaScript, Python (and to some extent,
> Java), it is natural to map a YAML map to the native hash, a list to a
> vector/array, and a scalar value to a simple string. That works admirably
> well, as long as the YAML entity hasn't been annotated with an ID or a class
> name.
> 
> If one wants to provide a stable-round tripping utility (e.g., suppose I
> want to write a YAML pretty printer), where am I to store the ID of a scalar
> value? The class of a map? For this use case, it seems my best course of
> action is to wrap the native construct (map/list/scalar) in an object which
> has an "id", a "class", and a "value".
> 
> There are several options:
> 
> A) Use the native constructs when possible, and only use "wrapper" objects
> when there's a need. That makes access pattern unpredictable: do I write
> map{key} or map{key}.value?

That's my idea.
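
A rough Python sketch of option A, just to show the access pattern it implies. The names Annotated, load_node and unwrap are hypothetical, as is the choice of a dataclass for the wrapper.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Annotated:
    """Wrapper used only when a node carries an ID or a class name."""
    value: Any
    id: Optional[str] = None
    cls: Optional[str] = None


def load_node(value, id=None, cls=None):
    """Return the native value unless there is something to annotate."""
    if id is None and cls is None:
        return value
    return Annotated(value, id, cls)


def unwrap(node):
    """Give uniform access to the value, wrapped or not."""
    return node.value if isinstance(node, Annotated) else node
```

A helper like unwrap is one way to soften the "map{key} or map{key}.value?" unpredictability, at the cost of an extra call at each access.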

> 
> B) Always use wrapper objects, and give up on de-serializing YAML into
> arbitrary native data structures. Big hit on usefulness - if we do this,
> Brian will just give up on us :-)

You're getting to know me pretty well ;)

> 
> C) Declare that IDs may be re-written arbitrarily, even by pretty printers.
> That is, banish them from the data model.

I think I agree...

> 
> That leaves "class" as the only problematic issue. We explicitly decided not
> to talk about it in the conference call. It seems to me like there's no way
> around requiring that this data will survive round-trips, but I also don't
> see how it is possible to de-serialize "scalar value" into a normal "Java
> String" if someone attached an "unknown" class to it.

I've read through this briefly, but don't have time to comment yet.
Let's stick with the original syntax for now. 

In general, keep in mind that YAML 1.0 will *not* be the final YAML
spec. It will evolve to YAML 2.0 and so on. For now, let's strive for
maximum syntactic simplicity. I think we can special case the semantics
of 1.0 without needing to change the current syntax.

> > >
> > > 12. Brian mentioned that he'd show YAML to one of
> > >     his Perl friends.  (sorry I didn't catch his name)
> >
> > Damian Conway http://www.csse.monash.edu.au/~damian/
> 
> His input will be greatly appreciated.

Emailed Damian last night. He's preparing for an 11-week world speaking
tour. I'll see him in June at the YAPC (Yet Another Perl Conference) in
Montreal and I'll be sure to pin him down about YAML. BTW, I mentioned
to Clark that I'll probably be speaking about YAML at YAPC :)

> > > 15. Clark agreed to write up the "single vs multi"
> > >     line controversy and post to the list so that
> > >     it is clearly understood.
> 
> I thought we settled this... Every scalar value is potentially multi-line.
> It doesn't seem to cost us anything, or does it?

I agree but see below.

> 
> > > 16. We made little progress on the scalar indicator
> > >     for lists, to colon or not to colon.  It wasn't
> > >     agreed, but Clark thinks this is someone else's
> > >     monkey.  If Oren and Brian can't agree within
> > >     7 days, Clark will put on the dictator cap.
> >
> > We traded in the '$' for the ':'. '$' as the last character in a line
> 
> I thought ':' was the first one; it is "as if" it is a normal header, with
> the key "just happening" to be empty. This seems more consistent.
> 
> > meant a multiline scalar was to follow. Converting this semantic to the
> > ':' leaves us with these representations:
> >
> >     key1 : @
> >         single line
> >         :
> >             classless folded
> >             multi line
> >         another single line
> >         and another
> >         #class &0001 :
> 
>           : #class &0001

No, not a mistake.

> 
> >             classed multi
> >             line
> >         #class &0002 classed single line
> >         %
> >             key : value
> >         @
> 
> This is an empty list, right?

Yup. Just to keep you on your toes :)

> 
> >         ~
> 
> And this is a null?

Indeed.

> 
> >         #classy %
> >             key : value
> >         : even this multi line on the same line
> >           as a colon thingy works because there
> >           a little bit of indentation imposed by
> >           colon. (Although I don't love it)
> 
> This means the following:
> 
>           : single line
> 
> Will also work, even though you *really* dislike it. I like them :-)

Noted :-)

> 
> >         : "Another thingy like above that meets"
> >           "RFC822 wackiness"
> >         :
> >            |    1
> >            |   1 1
> >            |  1 1 1
> >            |Just for completeness :-)
> 
> I think we've said everything there's to be said about this, and whether or
> not you find either:
> 
>     list:
>       : One
>       : Two
>       : Three
>         and Four
> 
> Or:
> 
>     list:
>         One
>         Two
>         :
>             Three
>             and Four
> 
> To be beautiful or ugly is, when all is said and done, a matter of taste. To
> you, the extra ':'s are an eyesore; to me it seems strange that the
> multi-line value is "more indented"; it seems as though there's structure
> involved, when there isn't. I also like being able to do /^:/ in VI to get
> to the next entry.

While your comment on aesthetics may be true, there is a major
distinction between what you think a ':' means and my intent. 

1) A ':' is always a key value separator. We agree on that, but each
want it to have one other meaning.
2) You want colon to be a "list bullet" in list context.
3) I want ':' to mean '$' for scalar values. And I want it to almost
always be optional (unless there is ambiguity)
4) That said. We can make it the canonical/default form for emitters if
we wish.

Consider the following four examples.

1) Fully qualified with '$'.

    key1 : $ my dog has fleas
    key2 : $ "$40.00 for veterinarian exam"
    key3 : $
        The vet said, "Yes Ingy,
        Your dog has fleas."
    key4 : $ Ingy said, "Wow,
             my dog has fleas!"
    key5 : #class1 $ I hate fleas
    key6 : #class2 $
        What is your viewpoint
        about fleas?
    key7 : #class3 @
        $ Tom the flea
        $ Dick the flea
        $ Harry the flea
          is not really hairy
        %
            foo : bar
        #class4 %
            FOO : BAR
        #class5 $ A very classy flea
        #class6 $
            |My favorite fleas:
            |     Jim
            |     Bob

2) Fully qualified with ':'. The only real gain here is no " for $40.00.

    key1 : : my dog has fleas
    key2 : : $40.00 for veterinarian exam
    key3 : :
        The vet said, "Yes Ingy,
        Your dog has fleas."
    key4 : : Ingy said, "Wow,
             my dog has fleas!"
    key5 : #class1 : I hate fleas
    key6 : #class2 :
        What is your viewpoint
        about fleas?
    key7 : #class3 @
        : Tom the flea
        : Dick the flea
        : Harry the flea
          is not really hairy
        %
            foo : bar
        #class4 %
            FOO : BAR
        #class5 : A very classy flea
        #class6 :
            |My favorite fleas:
            |     Jim
            |     Bob


3) Minimal

    key1 : my dog has fleas
    key2 : $40.00 for veterinarian exam
    key3 :
        The vet said, "Yes Ingy,
        Your dog has fleas."
    key4 : Ingy said, "Wow,
           my dog has fleas!"
    key5 : #class1 I hate fleas
    key6 : #class2
        What is your viewpoint
        about fleas?
    key7 : #class3 @
        Tom the flea
        Dick the flea
        : Harry the flea
          is not really hairy
        %
            foo : bar
        #class4 %
            FOO : BAR
        #class5 A very classy flea
        #class6
            |My favorite fleas:
            |     Jim
            |     Bob

Note that the only required ':' (besides the key/value ones) is for ':
Harry the flea'

4) Suggested canonical form:

    key1 : my dog has fleas
    key2 : $40.00 for veterinarian exam
    key3 :
        The vet said, "Yes Ingy,
        Your dog has fleas."
    key4 : Ingy said, "Wow,
           my dog has fleas!"
    key5 : #class1 : I hate fleas
    key6 : #class2 :
        What is your viewpoint
        about fleas?
    key7 : #class3 @
        : Tom the flea
        : Dick the flea
        : Harry the flea
          is not really hairy
        %
            foo : bar
        #class4 %
            FOO : BAR
        #class5 : A very classy flea
        #class6 :
            |My favorite fleas:
            |     Jim
            |     Bob

So in this last example we always use the optional scalar indicator ':'
for all scalars in a list (by default). Note that a #class or &id
*always* comes before a %, @, or :. It's just that the ':' is usually
optional.

The things I don't allow are:

    key1 : @
        : %
        : @
        : : a scalar
        : #class %
        : #class @
        : #class : a scalar

The problem with ':' as a "list bullet" is that it could not be
optional. And that's too restrictive just to satisfy a personal
aesthetic.


, Brian

-- 
perl -le 'use Inline C=>q{SV*JAxH(char*x){return newSVpvf
("Just Another %s Hacker",x);}};print JAxH+Perl'

Re: [Yaml-core] (backfill) YAML Meeting: Oren's Feedback

From: Clark C . E. <cc...@cl...> - 2001-05-18 20:32:33

| > Sorry for overloading this example with so many weird things. I'll just
| > comment on the multiline semantics:
| > 
| > A) Trailing whitespace is preserved if the transporter preserves it.
| > B) The content can always be encoded before transport anyway.
| > C) Nothing is escaped. The content is truly verbatim. A '\' is a '\'.
| > D) An implicit newline is assumed to be at the end of every line.
| 
| We have to decide what our position is about them, BTW. Is a newline a "\n"
| or a "\r\n" - the answer may be different in-memory and in the text file
| (and thank you, O nameless DOS/CPM programmer, for inflicting this on us :-)

FYI, XML chose the option to convert, at parse time, "\r\n"
and "\r" to "\n".  In this way, anyone writing XML tools
could use \n and not worry about the platform's specific
line ending.  XML does not specify how these must be written
out, so that any of the three common line endings can be
used on serialization depending on the platform.
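
A minimal sketch of that parse-time normalization, in Python. The
function name is illustrative only and does not come from any XML or
YAML library; it assumes the DOS ending is a carriage return followed
by a line feed.

```python
def normalize_newlines(text: str) -> str:
    """Map DOS (CR LF) and old Mac (CR) line endings to a single LF."""
    # Replace the two-character ending first, so that it does not
    # turn into two separate newlines in the second pass.
    return text.replace("\r\n", "\n").replace("\r", "\n")
```

A reader built this way only ever sees "\n" in memory; the writer stays
free to emit whichever ending suits the platform.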


| > E) Note that the '|' is one column back from the actual indentation
| > level. This is intentional. And it will work even if the indent width
| > is set to one character wide. (not mandatory, but I like it.)
| 
| Under Python indentation rules, there's no problem indenting the "label"
| line by 4 characters and the text lines by 7, or whatever. What you say
| about one character indentation, however, implies that the following would
| be legal:
| 
|     text:
|     |multi-line
|     |text
| 
| I'm not certain I like it. I think Clark should make the call here -
| indentation is his baby.

How about we define indentation as leading whitespace?  In this
case, the minimal indentation for this style would be two.

  text:
   |this is my
   |multi-line

| 
| > F) I'd like to push for this always starting on the next line if it is a
| > map value. It has no relation to RFC822.
| 
| What's the harm in allowing:
| 
|     text: | Spaces   and " and \n, oh my!
| 
| ('"' and '\' meaning themselves in this text).
| 
| I don't see why we need to make this distinction between multi- and single-
| line text at all. It is bad enough we provide two different quoting
| mechanisms...

Because then we have to answer questions like...

  text: |does X align vertically?
   |          X

I like Brian's restriction on this form.


| Perhaps we should require that a YAML processor must *accept* UTF-16 files.
| That goes well with the "superset" idea. If one wants to write a YAML
| document which is also a mail message, he'll just write it in the default
| UTF-8 encoding (or at least the first MIME part - I still have to read the
| RFC properly).


So we must accept both UTF-8 and UTF-16.  Does UTF-16
support then imply that the RFC 822 headers are not present?
Or, do we upgrade the spec and allow the headers as
long as the BOM is there?

| > > 10. Oren mentioned that he was thinking of doing
| > >     a Java or Javascript implementation.
| 
| I started thinking about it and hit on an issue which Brian may already have
| thought about - or will have to very soon, if he's covering YAML.pm :-) The
| problem is we haven't defined the data model (or, viewing it differently,
| the round-tripping issue).
| 
| In "dynamic" languages such as Perl, JavaScript, Python (and to some extent,
| Java), it is natural to map a YAML map to the native hash, a list to a
| vector/array, and a scalar value to a simple string. That works admirably
| well, as long as the YAML entity hasn't been annotated with an ID or a class
| name.

We will officially decree that the ID is NON INFORMATIONAL.
We must be very careful about this ID.  It smells a lot like
an XML prefix tar baby... which completely destroyed any
hopes at an XML canonical form.

| If one wants to provide a stable-round tripping utility (e.g., suppose I
| want to write a YAML pretty printer), where am I to store the ID of a scalar
| value? The class of a map? For this use case, it seems my best course of
| action is to wrap the native construct (map/list/scalar) in an object which
| has an "id", a "class", and a "value".

I must say, I don't like this option.

| A) Use the native constructs when possible, and only use "wrapper" objects
| when there's a need. That makes access pattern unpredictable: do I write
| map{key} or map{key}.value?
| 
| B) Always use wrapper objects, and give up on de-serializing YAML into
| arbitrary native data structures. Big hit on usefulness - if we do this,
| Brian will just give up on us :-)
| 
| C) Declare that IDs may be re-written arbitrarily, even by pretty printers.
| That is, banish them from the data model.

Yes. Option C.  For the sequential API, we will have
to expose this, but not in the in-memory representation.
And in the sequential API, it can be wrapped as an 
opaque handle (without a string representation).  
Under no circumstances do we want this ID to become
"data" as the XML prefix has... lest we not have a
reasonable canonical form.

Also, to enable concatenation, I think we should
allow IDs to be shadowed.  In other words,

  a : &0001  "This is a value"
  b : *0001 
  c : &0001  "This is another value"
  d : *0001  

In this case, d->c and b->a.  After c, there is no
way to access a by reference.  Simple solution,
and this way concatenation is still well defined.
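
The shadowing rule above can be sketched in Python as follows. The
tuple layout and the function name are assumptions made for this
illustration, not part of any YAML implementation: each entry is
(key, kind, id, value), where kind is "anchor" for &id and "alias"
for *id.

```python
def resolve_aliases(entries):
    """Resolve *id aliases against the most recent &id anchor seen so far."""
    anchors = {}   # id -> value of the latest anchor with that id
    resolved = {}  # key -> final value
    for key, kind, ident, value in entries:
        if kind == "anchor":
            # A later anchor with the same id shadows the earlier one.
            anchors[ident] = value
            resolved[key] = value
        elif kind == "alias":
            # An alias always refers to the anchor currently in scope.
            resolved[key] = anchors[ident]
        else:
            raise ValueError(f"unknown entry kind: {kind!r}")
    return resolved
```

Because lookups only ever go backwards, two documents that reuse the
same ids can be concatenated without renumbering.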

| That leaves "class" as the only problematic issue. We explicitly 
| decided not to talk about it in the conference call. It seems to 
| me like there's no way around requiring that this data will survive 
| round-trips, but I also don't see how it is possible to de-serialize 
| "scalar value" into a normal "Java String" if someone attached an 
| "unknown" class to it. 

If classes aren't available in the target environment,
or if the class requested can't be found, then we
have a slight problem.  A reasonable solution is
to notify the user via a warning, and then create
an auxiliary yaml-class-map which maps lists/strings/maps
that have been created by the load with their 
corresponding class name.  In this way we keep the
native structures, but preserve the class names through
a "coloring archive" on the side.
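
One possible shape for such a side table, sketched in Python. The
ClassMap name and its methods are hypothetical. The table is keyed by
object identity and holds a reference to each object, so entries stay
valid but must be dropped explicitly when the data is discarded.

```python
class ClassMap:
    """Side table recording the YAML class name of loaded native objects."""

    def __init__(self):
        # id(obj) -> (obj, class_name); keeping obj alive stops its id
        # from being reused by an unrelated object.
        self._entries = {}

    def color(self, obj, class_name):
        """Remember that obj was loaded with the given class name."""
        self._entries[id(obj)] = (obj, class_name)

    def class_of(self, obj):
        """Return the recorded class name, or None if obj was never colored."""
        entry = self._entries.get(id(obj))
        if entry is not None and entry[0] is obj:
            return entry[1]
        return None
```

The loaded map itself stays an ordinary dict; only a dumper that wants
to preserve class names needs to consult the table. Identity keys work
best for maps and lists, since equal strings may share one object.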

| So, the idea is to bite the bullet and remove "class" 
| as something specified in the "label line" (BTW, we need
| to define some terminology here; I'm using "label lines"
| and "text lines" - or maybe it should be "content lines"?).
| It turns out that we can still achieve most of the goals 
| of the "class" construct by making the key "#" magical in maps:
| 
|     center: %
|         #: point
|         x: 35.3
|         y: 42.1

Interesting.  However, this prevents class names for 
Strings or Lists.  Very interesting.  What do we do about 
Strings and Lists?  Move this into the category of "non-portable" 
constructs, like a & and * on the same line?  I'm not sure.  
The "coloring on the side" may be more painful (esp garbage 
collecting), but it at least does not get in the way.  Hmm.

| The idea is that the "point" class knows it has members "x" 
| and "y" and that their values must be floating-point numbers;
| so not being able to declare their type is acceptable.
| 
| When serializing this to a language/system which doesn't 
| recognize "point", well, "center" will just be a run-of-the-mill 
| map. The only magic here is that we may require '#' to always
| be printed first, but this depends on the protocol we'll be using 
| for constructing "point" objects when the class is recognized:
| 
| C1) If the interface is such that there has to be a factory 
| method accepting a map of all the keys, and returning the 
| constructed object, then this restriction is unnecessary. This 
| is a good way to do things in Perl/JavaScript object creation 
| interfaces; e.g. in Perl the method will typically merely bless 
| the map and be done. I suspect in Python things would
| be a bit less elegant.

Python is very flexible here...

| C2) In Java this may be less efficient (all these map and 
| String objects will have to be created and destroyed per each
| point object creation - slooow!). Any more efficient method will 
| have to involve tighter interaction between parsing the values 
| and creating the object, which means we have to know its class
| when starting to parse each member (e.g., we may be able to
| parse "42.1" directly into a float). I can't see, off the top of 
| my head, a reasonable protocol for this right now, but we may 
| want to require '#' to be the first key anyway, in case something 
| does come up.
| 
| My favorite is C2. Its downside is that you can't directly assign 
| a type for a list or a scalar; you have to assign it to a
| surrounding map. I foresee a long discussion coming...

And we have not even begun to talk about namespaces, or how
to make a name globally unique so that it can be moved across
multiple languages.  This is especially important for common
objects, like Date, Currency, and possibly even Party or Address.

| > > 14. Clark introduced a very short discussion on
| > >     the need for a global mechanism to uniquely
| > >     identify names in a non-language specific
| > >     manner.
| 
| Reverse DNS being the easiest way we've ever come up
| with for this. It works directly for accessing class names 
| in Java. In C++ you'll have to "register" classes manually 
| anyway, so using reverse DNS doesn't gain or cost anything.
| In Perl/JavaScript/Python we may want to set up some automatic
| way to convert "org.yaml.class" into something the native 
| type system recognizes...

Python has a package mechanism similar to Java.
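
As a rough illustration of that package mechanism, a dotted class name
can be resolved with the standard importlib module. The load_class
helper is an assumption for this sketch, and a name such as
"org.yaml.class" would only resolve if a matching package were
actually installed.

```python
import importlib


def load_class(qualified_name):
    """Resolve 'package.module.Name' to the object it names."""
    module_name, _, attr = qualified_name.rpartition(".")
    if not module_name:
        raise ValueError(f"expected a dotted name, got {qualified_name!r}")
    # Import the module part, then look up the final component in it.
    module = importlib.import_module(module_name)
    return getattr(module, attr)
```

For example, load_class("collections.OrderedDict") returns the standard
library class of that name.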

| > > 15. Clark agreed to write up the "single vs multi"
| > >     line controversy and post to the list so that
| > >     it is clearly understood.
| 
| I thought we settled this... Every scalar value is potentially multi-line.
| It doesn't seem to cost us anything, or does it?

Good.  I'll just document it in the updated spec... coming soon.

Clark

[Yaml-core] (backfill) YAML Meeting: Oren's Feedback

From: Clark C . E. <cc...@cl...> - 2001-05-18 19:41:11

----- Forwarded message from Oren Ben-Kiki <or...@ri...> -----
From: Oren Ben-Kiki <or...@ri...>
Subject: RE: Meeting Minutes
Date: Fri, 18 May 2001 19:18:14 +0200

> > 1. Brian stated that he would investigate Oren's Syntax
> >    and get back with us if it meets Perl's serialization
> >    requirements for hard references.  If not, specify
> >    what alternatives we can use.
> 
> I don't think it's that important to investigate. It will probably
> always be a moot point. I will let Data::Denter use its current scheme
> to deterministically round-trip all Perl data structures. YAML.pm
> probably will have no need for this. It's all academic and I have no
> spare time for academics for three more months. (My guess is, yes it
> could be made to work, but would be suboptimal for Perl people) Let's
> leave it at that for now.

Does that mean we are giving up on Denter using YAML syntax (extended to
handle pointer-to-pointer)?

> > 2. We agreed on Oren's reference (&*) syntax.
> > 
> > 3. We agreed on having an optional RFC 822 Header,
> >    this requires that a YAML text without this
> >    header must begin on line #2.  Furthermore,
> >    if an RFC 822 Header is present, then it must
> >    include "X-YAML-Version: 1.0"
> > 
> > 4. Brian said that he'd investigate the RFC a bit
> >    more specifically relating to the productions.
> >    (I'm not sure if this is necessary... )

I'm going to go over it with a fine-tooth comb, just to see what is involved
in making YAML a superset of it. I guess I'll also have to look at MIME
while I'm at it, with the same comb :-)

> On 4 & 5. I don't really like the blank line at the beginning thing
> because people will mess it up or not understand it. And we have many
> heuristic options.
> 
> A) Parse lookahead for X-YAML-Version
> B) Option-A rarely needed because as soon as we see a key that is *not*
> RFC822 compliant, we assume YAML. 99% of the time this is the first
> line!
> C) If there is no whitespace allowed before the colon in RFC822, we
> simply make it a requirement in YAML. Or does this break your RFC
> compatibility rules?
> 
> Just for my own edification, would you please explain the rationale
> behind making YAML RFC822 compliant. And do so with one or more specific
> examples. Thanks :)

Well, for example, suppose that YAML was a "good enough" superset of RFC822.
Then we could just adopt my idea that "blank lines separate top-level maps"
and we wouldn't have to say anything further about RFC822 headers, period.
If one wants to read/write a mail message as a YAML document, then it will
simply work (as long as he sticks to the "safe" constructs there). If one
wants to have a YAML document that has nothing to do with RFC822, that also
works. No need for any special statement about them. I like this approach
best.

> > 5. We talked at length about multi-line scalar text
> >    nodes.  Thus far, we agreed on option D, due to
> >    assumed compatibility with RFC 822.  Clark agreed
> >    to verify this compatibility.
> > 
> > 6. Brian stated that the quoting mechanism was not
> >    good enough for source code or ASCII art, and
> >    backed option F.  However, option F does not
> >    explicitly preserve trailing whitespace on a
> >    given line.  So Brian suggested using > <
> >    pairs.  Oren suggested using single quotes.
> >    Clark asked Brian to come up with something
> >    he likes and propose it.
> 
> Neil and I agree that the normal transport mechanism between Perl and
> Python serializer/parsers would definitely *not* be a mailer. And if a
> mailer was used, most people wouldn't give a darn about the trailing
> whitespace. And if they really did, we could just encode the whole
> document anyway. So I now definitely think the best-fit answer is:
> 
>     " this is the hash\n key for this example  :-) " : #class :

I assume the trailing ':' is a typo?

>        |# My Perl Subroutine
>        |
>        |  sub version {
>        |      if ($_[0] =~ /\n/) {
>        |          return \ "\to sender";
>        |      }        
>        |  }
> 
> Sorry for overloading this example with so many weird things. I'll just
> comment on the multiline semantics:
> 
> A) Trailing whitespace is preserved if the transporter preserves it.
> B) The content can always be encoded before transport anyway.
> C) Nothing is escaped. The content is truly verbatim. A '\' is a '\'.
> D) An implicit newline is assumed to be at the end of every line.

We have to decide what our position is about them, BTW. Is a newline a "\n"
or a "\r\n" - the answer may be different in-memory and in the text file
(and thank you, O nameless DOS/CPM programmer, for inflicting this on us :-)

> E) Note that the '|' is one column back from the actual indentation
> level. This is intentional. And it will work even if the indent width
> is set to one character wide. (not mandatory, but I like it.)

Under Python indentation rules, there's no problem indenting the "label"
line by 4 characters and the text lines by 7, or whatever. What you say
about one character indentation, however, implies that the following would
be legal:

    text:
    |multi-line
    |text

I'm not certain I like it. I think Clark should make the call here -
indentation is his baby.

> F) I'd like to push for this always starting on the next line if it is a
> map value. It has no relation to RFC822.

What's the harm in allowing:

    text: | Spaces   and " and \n, oh my!

('"' and '\' meaning themselves in this text).

I don't see why we need to make this distinction between multi- and single-
line text at all. It is bad enough we provide two different quoting
mechanisms...

> > This will work the way I intended it 98% of the time.
> 
> > 7. We agreed, after some discussion, that a YAML
> >    parser must support MIME.  We agreed implicitly
> >    that it must support base64 encoding.
> 
> > 8. We didn't discuss this... but it should be
> >    mentioned that to (a) support unicode and
> >    (b) support RFC 822, our texts must be UTF-8.
> >    Thus a YAML parser/writer will default to
> >    UTF-8, although other encoding support is
> >    optional.

Perhaps we should require that a YAML processor must *accept* UTF-16 files.
That goes well with the "superset" idea. If one wants to write a YAML
document which is also a mail message, he'll just write it in the default
UTF-8 encoding (or at least the first MIME part - I still have to read the
RFC properly).

> > 9. Clark agreed to make a "bootstrap" C program
> >    and upload to SourceForge.  Brian and Neil
> >    agreed to download and hack at will.
> 
> As I walked to the train station with Neil, he figured out the C
> implementation in his head and said he would try to get it done before
> bed.
> 
> > 10. Oren mentioned that he was thinking of doing
> >     a Java or Javascript implementation.

I started thinking about it and hit on an issue which Brian may already have
thought about - or will have to very soon, if he's covering YAML.pm :-) The
problem is we haven't defined the data model (or, viewing it differently,
the round-tripping issue).

In "dynamic" languages such as Perl, JavaScript, Python (and to some extent,
Java), it is natural to map a YAML map to the native hash, a list to a
vector/array, and a scalar value to a simple string. That works admirably
well, as long as the YAML entity hasn't been annotated with an ID or a class
name.

If one wants to provide a stable-round tripping utility (e.g., suppose I
want to write a YAML pretty printer), where am I to store the ID of a scalar
value? The class of a map? For this use case, it seems my best course of
action is to wrap the native construct (map/list/scalar) in an object which
has an "id", a "class", and a "value".

There are several options:

A) Use the native constructs when possible, and only use "wrapper" objects
when there's a need. That makes access pattern unpredictable: do I write
map{key} or map{key}.value?

B) Always use wrapper objects, and give up on de-serializing YAML into
arbitrary native data structures. Big hit on usefulness - if we do this,
Brian will just give up on us :-)

C) Declare that IDs may be re-written arbitrarily, even by pretty printers.
That is, banish them from the data model.

That leaves "class" as the only problematic issue. We explicitly decided not
to talk about it in the conference call. It seems to me like there's no way
around requiring that this data will survive round-trips, but I also don't
see how it is possible to de-serialize "scalar value" into a normal "Java
String" if someone attached an "unknown" class to it. 

So, the idea is to bite the bullet and remove "class" as something specified
in the "label line" (BTW, we need to define some terminology here; I'm using
"label lines" and "text lines" - or maybe it should be "content lines"?). It
turns out that we can still achieve most of the goals of the "class"
construct by making the key "#" magical in maps:

    center: %
        #: point
        x: 35.3
        y: 42.1

The idea is that the "point" class knows it has members "x" and "y" and that
their values must be floating-point numbers; so not being able to declare their
type is acceptable.

When serializing this to a language/system which doesn't recognize "point",
well, "center" will just be a run-of-the-mill map. The only magic here is
that we may require '#' to always be printed first, but this depends on the
protocol we'll be using for constructing "point" objects when the class is
recognized:

C1) If the interface is such that there has to be a factory method accepting
a map of all the keys, and returning the constructed object, then this
restriction is unnecessary. This is a good way to do things in
Perl/JavaScript object creation interfaces; e.g. in Perl the method will
typically merely bless the map and be done. I suspect in Python things would
be a bit less elegant.

C2) In Java this may be less efficient (all these map and String objects
will have to be created and destroyed per each point object creation -
slooow!). Any more efficient method will have to involve tighter interaction
between parsing the values and creating the object, which means we have to
know its class when starting to parse each member (e.g., we may be able to
parse "42.1" directly into a float). I can't see, off the top of my head, a
reasonable protocol for this right now, but we may want to require '#' to be
the first key anyway, in case something does come up.

My favorite is C2. Its downside is that you can't directly assign a type
for a list or a scalar; you have to assign it to a surrounding map. I foresee
a long discussion coming...
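
A minimal Python sketch of the factory approach in C1. The Point class,
the FACTORIES registry and the construct function are all hypothetical
names introduced for this illustration. A map whose "#" key names a
registered class is handed to that class's factory; anything else is
returned as a plain map.

```python
from dataclasses import dataclass


@dataclass
class Point:
    x: float
    y: float


# Hypothetical registry: YAML class name -> factory taking the other keys.
FACTORIES = {
    "point": lambda members: Point(float(members["x"]), float(members["y"])),
}


def construct(mapping):
    """Build a native object from a parsed map, honoring the magic '#' key."""
    factory = FACTORIES.get(mapping.get("#"))
    if factory is None:
        # Unknown or missing class: keep the run-of-the-mill map as-is.
        return mapping
    members = {k: v for k, v in mapping.items() if k != "#"}
    return factory(members)
```

Since the factory receives the whole map at once, the position of "#"
among the keys does not matter here, which matches the remark that C1
needs no ordering restriction.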

> > 11. Clark agreed to update the spec with the
> >     current agreements.
> 
> Send me a note when it's changed. I'll review for you.
> 
> > 
> > 12. Brian mentioned that he'd show YAML to one of
> >     his Perl friends.  (sorry I didn't catch his name)
> 
> Damian Conway http://www.csse.monash.edu.au/~damian/

His input will be greatly appreciated.

> > 13. Clark and Brian discussed the MIME usage.

And...?

> > 14. Clark introduced a very short discussion on
> >     the need for a global mechanism to uniquely
> >     identify names in a non-language specific
> >     manner.

Reverse DNS being the easiest way we've ever come up with for this. It works
directly for accessing class names in Java. In C++ you'll have to "register"
classes manually anyway, so using reverse DNS doesn't gain or cost anything.
In Perl/JavaScript/Python we may want to set up some automatic way to
convert "org.yaml.class" into something the native type system recognizes...

> > 15. Clark agreed to write up the "single vs multi"
> >     line controversy and post to the list so that
> >     it is clearly understood.

I thought we settled this... Every scalar value is potentially multi-line.
It doesn't seem to cost us anything, or does it?

> > 16. We made little progress on the scalar indicator
> >     for lists, to colon or not to colon.  It wasn't
> >     agreed, but Clark thinks this is someone else's
> >     monkey.  If Oren and Brian can't agree within
> >     7 days, Clark will put on the dictator cap.
> 
> We traded in the '$' for the ':'. '$' as the last character in a line

I thought ':' was the first one; it is "as if" it is a normal header, with
the key "just happening" to be empty. This seems more consistent.

> meant a multiline scalar was to follow. Converting this semantic to the
> ':' leaves us with these representations:
> 
>     key1 : @
>         single line
>         :
>             classless folded
>             multi line
>         another single line
>         and another
>         #class &0001 :

          : #class &0001

>             classed multi
>             line
>         #class &0002 classed single line
>         %
>             key : value
>         @

This is an empty list, right?

>         ~

And this is a null?

>         #classy %
>             key : value
>         : even this multi line on the same line
>           as a colon thingy works because there is
>           a little bit of indentation imposed by
>           colon. (Although I don't love it)

This means the following:

          : single line

Will also work, even though you *really* dislike it. I like them :-)

>         : "Another thingy like above that meets"
>           "RFC822 wackiness"
>         :
>            |    1
>            |   1 1
>            |  1 1 1
>            |Just for completeness :-)

I think we've said everything there's to be said about this, and whether or
not you find either:

    list:
      : One
      : Two
      : Three
        and Four

Or:

    list:
        One
        Two
        :
            Three
            and Four

To be beautiful or ugly is, when all is said and done, a matter of taste. To
you, the extra ':'s are an eyesore; to me it seems strange that the
multi-line value is "more indented"; it seems as though there's structure
involved, when there isn't. I also like being able to do /^:/ in VI to get
to the next entry.

I think we should just let Clark make the call. OK?

> > 17. It is nice to have Neil in on the talk!

Welcome aboard, Neil. The more the merrier!

Have fun,

    Oren Ben-Kiki

----- End forwarded message -----

[Yaml-core] Re: Meeting Minutes

From: Brian I. <briani@ActiveState.com> - 2001-05-18 17:03:33

"Clark C . Evans" wrote:
> 
> | On 4 & 5. I don't really like the blank line at the beginning thing
> | because people will mess it up or not understand it. And we have many
> | heuristic options.
> |
> | A) Parse lookahead for X-YAML-Version
> | B) Option-A rarely needed because as soon as we see a key that is *not*
> | RFC822 compliant, we assume YAML. 99% of the time this is the first
> | line!
> | C) If there is no whitespace allowed before the colon in RFC822, we
> | simply make it a requirement in YAML. Or does this break your RFC
> | compatibility rules?
> 
> A&B are good.  I don't really care about C, perhaps in the
> interest of consistency with both RFC822, but also Python
> code, we may not want to require the space before the colon.

I see.
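
Heuristics A and B quoted above could be sketched in Python roughly as
follows. The function name and the exact pattern for a header-style key
are assumptions for illustration (printable ASCII with no space or
colon, followed directly by a colon), not a full reading of RFC 822.

```python
import re

# Assumed shape of a header-style key: no spaces, colon right after it.
HEADER_KEY = re.compile(r"^[!-9;-~]+:")


def starts_with_header(lines):
    """Guess whether a text opens with a header block carrying X-YAML-Version."""
    for line in lines:
        if line.strip() == "":
            return False  # block ended without a version field
        if line[:1] in (" ", "\t"):
            continue  # folded continuation of the previous field
        if not HEADER_KEY.match(line):
            return False  # heuristic B: not header-like, assume YAML
        if line.lower().startswith("x-yaml-version:"):
            return True  # heuristic A: lookahead found the version field
    return False
```

With keys written as "key : value", the very first line already fails
the header pattern, so the lookahead in A is rarely exercised.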

> 
> | Just for my own edification, would you please explain the rationale
> | behind making YAML RFC822 compliant. And do so with one or more specific
> | examples. Thanks :)
> 
> I'm not all that concerned about RFC822 compliance.
> 
> I'm more concerned about consistency since we are going
> to allow RFC822 headers.  In particular, if someone
> sees a few RFC822 lines above and the YAML lines below
> the separating blank line, they will most likely assume
> that YAML has the same (or very similar rules).  Thus,
> those items _common_ in RFC822 should be allowed in YAML.
> 
> There will be a laundry list of RFC822 constructs that
> when moved into the YAML section will be illegal.

I think I understand better now. Thanks again.

> 
> | Neil and I agree that the normal transport mechanism between Perl and
> | Python serializer/parsers would definitely *not* be a mailer. And if a
> | mailer was used, most people wouldn't give a darn about the trailing
> | whitespace. And if they really did, we could just encode the whole
> | document anyway. So I now definitely think the best-fit answer is:
> |
> |     " this is the hash\n key for this example  :-) " : #class :
> |        |# My Perl Subroutine
> |        |
> |        |  sub version {
> |        |      if ($_[0] =~ /\n/) {
> |        |          return \ "\to sender";
> |        |      }
> |        |  }
> 
> Nice.  Is this fairly "optimal" for your purposes?
> 
> | Sorry for overloading this example with so many weird things.
> 
> Not at all.  This is good.
> 
> | I'll just comment on the multiline semantics:
> |
> | A) Trailing whitespace is preserved if the transporter preserves it.
> | B) The content can always be encoded before transport anyway.
> | C) Nothing is escaped. The content is truly verbatim. A '\' is a '\'.
> | D) An implicit newline is assumed to be at the end of every line.
> | E) Note that the '|' is one column back from the actual indentation
> | level. This is intentional. And it will work even if the indent width
> | is set to one character wide. (not mandatory, but I like it.)
> | F) I'd like to push for this always starting on the next line if it is a
> | map value. It has no relation to RFC822.
> |
> | This will work the way I intended it 98% of the time.
> 
> One question.  How are trailing new lines handled?  You may
> want to modify "D" so that there is a new line on every |
> line, except the last one.  Thus to get a trailing new-line,
> you'd have to do:

I had pretty much given up on that. Since this method isn't foolproof
anyway, I'd just have the caveat that *all* lines are assumed to have a
trailing newline.

> 
>   after :
>    | this has a
>    | trailing new line
>    |
>   before :
>    |
>    | This has a leading new
>    | line, but not a trailing
>    | new line.
>   both:
>    |
>    | This has both a leading
>    | and a trailing new line
>    |
>   another :
>    | this does not have
>    | a trailing new line,
>    | nor a leading new line.
> 
> Clear?  I think it beats :- as far as readability.

Yes it does. And I understand it. But I think it's not obvious enough.
Too subtle. It's exactly what I was trying to avoid with the :- thingy.
People would think that there's just an extra blank line. I'd much
prefer something like this:

    after :
       |this has a
       |trailing new line
    before :
       |
       |This has a leading new
       |line, but not a trailing
       \new line.
    both:
       |
       |This has both a leading
       |and a trailing new line
    another :
       |this does not have
       |a trailing new line,
       \nor a leading new line.

Also, you can't put a space after the '|'. It doesn't scale well past
these examples.
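
The '|' and '\' markers suggested above can be sketched as a small
Python reader. The parse_block name is hypothetical. It assumes every
'|' line contributes its text plus a newline, a '\' line contributes
its text with no newline, and nothing after the marker is stripped.

```python
def parse_block(lines):
    """Join verbatim block lines marked with '|' (newline) or '\\' (no newline)."""
    parts = []
    for line in lines:
        stripped = line.lstrip(" ")
        marker, content = stripped[:1], stripped[1:]
        if marker == "|":
            parts.append(content + "\n")  # implicit newline on every '|' line
        elif marker == "\\":
            parts.append(content)         # '\' suppresses the newline
        else:
            raise ValueError(f"line does not start with '|' or '\\': {line!r}")
    return "".join(parts)
```

Under these assumptions the "before" example yields a leading newline
and no trailing one, and the "after" example ends with a newline.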


> 
> | > 9. Clark agreed to make a "bootstrap" C program
> | >    and upload to SourceForge.  Brian and Neil
> | >    agreed to download and hack at will.
> |
> | As I walked to the train station with Neil, he figured out the C
> | implementation in his head and said he would try to get it done
> | before bed.
> 
> Great.  I'll focus on the specification today then
> rather than laying-in-code.

The spec is the most important thing IMO.

> 
> | > 16. We made little progress on the scalar indicator
> | >     for lists, to colon or not to colon.  It wasn't
> | >     agreed, but Clark thinks this is someone else's
> | >     monkey.  If Oren and Brian can't agree within
> | >     7 days, Clark will put on the dictator cap.
> |
> | We traded in the '$' for the ':'. '$' as the last character in a line
> | meant a multiline scalar was to follow. Converting this semantic to the
> | ':' leaves us with these representations:
> |
> |     key1 : @
> |         single line
> |         :
> |             classless folded
> |             multi line
> |         another single line
> |         and another
> |         #class &0001 :
> |             classed multi
> |             line
> |         #class &0002 classed single line
> |         %
> |             key : value
> |         @
> |         ~
> |         #classy %
> |             key : value
> |         : even this multi line on the same line
> |           as a colon thingy works because there is
> |           a little bit of indentation imposed by
> |           colon. (Although I don't love it)
> |         : "Another thingy like above that meets"
> |           "RFC822 wackiness"
> |         :
> |            |    1
> |            |   1 1
> |            |  1 1 1
> |            |Just for completeness :-)
> 
> Good deal.  Your example above, you have two colons:
> 
>     " this is the hash\n key for this example  :-) " : #class :
> 
> Is the second colon a typo, or is it required per this
> proposal?

I'm glad you noticed!

It's definitely not a typo. I wouldn't go so far as to say it's
mandatory but I suggest it for the following reasons:

PREMISE: Assuming that the ':' was used to replace the '$':

PREMISE: A #class or &id always used to be followed by %, @, $ or value.
$ was optional, but strongly suggested if a multiline started on the
next line.

CONCLUSION: In the *absence* of a #class or &id we'd have:

    key : $
        multi
        line

Translated to ':' speak, that's:

    key : :
        multi
        line

Which I suggested should just be collapsed to a single ':'.
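
The two-step translation can be sketched as a pair of textual rewrites. This
is only an illustration of the argument, assuming the indicator is the last
thing on the line; dollar_to_colon is a hypothetical helper, not proposed
parser code.

```python
import re


def dollar_to_colon(line):
    """Rewrite a trailing '$' multiline indicator as ':' (hypothetical helper)."""
    # Step 1: the '$' that ended the line becomes ':'.
    line = re.sub(r"\$\s*$", ":", line)
    # Step 2: "key : :" carries no extra information, so collapse it to "key :".
    return re.sub(r":\s*:\s*$", ":", line)


print(dollar_to_colon("key : $"))          # prints: key :
print(dollar_to_colon("key : #class $"))   # prints: key : #class :
```

The second call shows why the two-colon form remains when a #class sits
between the key separator and the multiline indicator.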

I think that covers it.

Cheers, Brian

-- 
perl -le 'use Inline C=>q{SV*JAxH(char*x){return newSVpvf
("Just Another %s Hacker",x);}};print JAxH+Perl'

[Yaml-core] (backfill) YAML Meeting: Working Draft 0.19a

From: Clark C . E. <cc...@cl...> - 2001-05-18 16:29:48

+-------------------------------------------------------------------------+
|               Welcome to YAML (tm) -- WORKING DRAFT 0.19a               |
+-------------------------------------------------------------------------+
| YAML (tm) is a straightforward data serialization language, offering an |
| alternative to XML where markup (named lists and mixed content) is not  |
| needed. YAML borrows ideas from rfc822, SAX, C, HTML, Perl, and Python. |
|                                                                         |
|   * YAML texts are brief and readable.                                  |
|   * YAML uses your language's native data structures.                   |
|   * YAML has a simple stream based interface.                           |
|   * YAML has a solid information model.                                 |
|   * YAML is expressive and extensible.                                  |
|   * YAML is easy to implement.                                          |
|                                                                         |
| YAML is a collaboration between Brian Ingerson (author of               |
| Data::Denter), Clark Evans, Oren Ben-Kiki, Sjoerd Visscher, and other   |
| members of the SML-DEV mailing list. YAML explicitly targets the object |
| serialization needs of the Python and Perl communities. Implementations |
| will be on their way within the next two weeks.                         |
+-------------------------------------------------------------------------+
|                                  News                                   |
+-------------------------------------------------------------------------+
|   * 17-MAY-2001: YAML now has a mailing list at SourceForge.            |
|   * 18-MAY-2001: YAML has had its first meeting. The minutes have       |
|     been sent out to the mailing list, and should be appearing in the   |
|     archives soon.                                                      |
|                                                                         |
+-------------------------------------------------------------------------+
|                              Key Concepts                               |
+-------------------------------------------------------------------------+
| YAML is founded on several key concepts from very successful languages. |
|                                                                         |
|   * YAML uses a type structure similar to Perl's. In YAML, there are    |
|     three fundamental structures: scalars ($), maps (%), and lists (@). |
|     YAML also supports references to enable the serialization of        |
|     graphs. Furthermore, each data value can be associated with a class |
|     name to allow the use of specific data types.                       |
|   * YAML uses block scoping similar to Python. In YAML, the extent of   |
|     a node is indicated by its child's nesting level, i.e., what column |
|     it is in. Block indenting provides for easy inspection of the       |
|     document's structure which helps to identify scope errors.          |
|   * YAML uses similar whitespace handling as HTML. In YAML, sequences   |
|     of spaces, tabs, and carriage return characters are folded into a   |
|     single space during parse. This wonderful technique makes markup    |
|     code readable by enabling indentation and word-wrapping without     |
|     affecting the canonical form of the content.                        |
|   * YAML uses similar slash style escape sequences as C. In YAML, \n    |
|     is used to represent a new line, \t is used to represent a tab, and |
|     \\ is used to represent the slash. In addition, since whitespace is |
|     folded, YAML uses bash style "\ " to escape additional spaces that  |
|     are part of the content and should not be folded. Lastly, the       |
|     trailing \ is used as a continuation marker, allowing content to be |
|     broken into multiple lines without introducing unwanted whitespace. |
|   * YAML allows for a rfc822 compatible header area for comments,       |
|     specific processing instructions, and encoding declarations. This   |
|     provides a flexible and forward looking method to augment the YAML  |
|     parser with other features such as a validator similar to TREX or   |
|     RELAX. Furthermore, this will allow a mail processing system to     |
|     directly use YAML as its input parser.                              |
|   * YAML supports binary and formatted text entities with MIME          |
|     multi-part attachments. Each attachment is given a reference        |
|     identifier which can be associated with a location in hierarchical  |
|     YAML content. This allows leaf values which would disrupt the       |
|     in-line structural flow to be handled out of band in a separate     |
|     block mechanism.                                                    |
|   * YAML has a SAX like sequential "C" API. This C library can be       |
|     used to easily construct native-language representations of a YAML  |
|     serialization. The API also showcases a clever substitutability     |
|     technique which allows schema changes to occur at the leaf nodes in |
|     a backwards compatible manner without breaking older code. This     |
|     brings resilience to older code, while allowing the structure of    |
|     your data to grow over time.                                        |
|                                                                         |
+-------------------------------------------------------------------------+
|                             Example: Basic                              |
+-------------------------------------------------------------------------+
| Below is an example of an invoice expressed via YAML. Each value's type |
| is indicated by a percent sign (map), an at sign (list), or an optional |
| dollar sign (scalar). The content for each value follows the indicator  |
| either on the same line for scalars or on subsequent indented lines.    |
| The content for a map, which is also the starting production, is a list |
| of key value pairs. Each key and value are separated by a colon.        |
| buyer    : %                                                            |
|     address     : %                                                     |
|        city       : Royal Oak                                           |
|        line one   : 458 Wittigen's Way                                  |
|        line two   : Suite #292                                          |
|        postal     : 48046                                               |
|        state      : MI                                                  |
|     family name : Dumars                                                |
|     given name  : Chris                                                 |
| date     : 12-JAN-2001                                                  |
| comments :                                                              |
|     Mr. Dumars is frequently gone in the morning                        |
|     so it is best advised to try things in late                         |
|     afternoon. \nIf Joe isn't around, try his house\                    |
|     keeper, Nancy Billsmer @ (734) 338-4338.\n                          |
| delivery : %                                                            |
|     method : UZS Express Overnight                                      |
|     price  : 45.50                                                      |
| invoice : 00034843                                                      |
| product : @                                                             |
|     %                                                                   |
|         desc      : Grade A, Leather Hide Basketball                    |
|         id        : BL394D                                              |
|         price     : 450.00                                              |
|         quantity  : 4                                                   |
|     %                                                                   |
|         desc      : Super Hoop (tm)                                     |
|         id        : BL4438H                                             |
|         price     : 2,392.00                                            |
|         quantity  : 1                                                   |
| tax      : 0.00                                                         |
| total    : 4237.50                                                      |
|                                                                         |
| Since "product" is a list, it only has values and thus is missing the   |
| key and colon. Also notice that the "comments" scalar is on multiple    |
| lines. Since whitespace is folded, the new line (\n) is escaped         |
| and the line ending \ is required to keep housekeeper as a single word. |
| By default, the serializer will sort map keys, although this isn't a    |
| requirement of the serialization structure.                             |
+-------------------------------------------------------------------------+
|                   Example: References and Class Names                   |
+-------------------------------------------------------------------------+
| Below is an example of a YAML document which demonstrates the use of    |
| references and classes. Immediately after an indicator a class name can |
| occur and then within parentheses an optional reference handle. If the  |
| indicator is a "*", then no further content is allowed, as this         |
| indicator signifies a reference to another value. The class name may be |
| used as a language-specific binding to a particular object or           |
| type appropriate class, otherwise it can be considered a comment. The   |
| production for allowable names and a namespace mechanism have yet to be |
| worked out.                                                             |
| buyer : %person                                                         |
|     comments :                                                          |
|         This is a person object accessible                              |
|         through the "buyer" key from the                                |
|         top level map.                                                  |
|     family name : Dumars                                                |
|     given name  : Chris                                                 |
| inline : $(0001)                                                        |
|     This is a folded text entity                                        |
|     that is associated with a                                           |
|     reference so that it can be                                         |
|     re-used later on.                                                   |
| seller : %person(0002)                                                  |
|     comments:                                                           |
|         This is another person object, only                             |
|         that it is given a handle of 0002 as                            |
|         well as a class so that it can be                               |
|         referred to later.  Handles must be                             |
|         numeric, and classes cannot start                               |
|         with a number.                                                  |
|     family name : Sellers                                               |
|     given name  : Peter                                                 |
| zzz :                                                                   |
|    comments:                                                            |
|        The first two items in this map are references                   |
|        The first is to the person object "Peter Sellers".               |
|        The second is to the inline text object "This is..."             |
|        The price scalar below is given a class "currency".              |
|    peter : *(0002)                                                      |
|    price : $currency                                                    |
|        23.34                                                            |
|    text  : *(0001)                                                      |
+-------------------------------------------------------------------------+
|                Example: Block References and Attachments                |
+-------------------------------------------------------------------------+
| Below is an example of a YAML document which includes the optional      |
| rfc822 style header, specifically a rfc2046 multipart header. A YAML    |
| Parser must handle these headers to allow for application specific      |
| processing instructions, and MIME for raw/binary references.            |
| Date: Sun, 13 May 2001 23:48:04 -0400                                   |
| MIME-Version: 1.0                                                       |
| Content-Type: multipart/related;                                        |
|     boundary="================================"                         |
| X-YAML-Version: 1.0                                                     |
|                                                                         |
| --================================                                      |
| Content-Type: text/plain; id="0001"                                     |
|                                                                         |
|   XX XXX    XXXXX   XX   XX                                             |
|    XXX XX       X   XX X XX                                             |
|    XX      XXXXXX   XX X XX                                             |
|    XX      X   XX   XXXXXXX                                             |
|   XXXX     XXXXX X   XX XX                                              |
|                                                                         |
| --================================                                      |
| Content-Type: image/gif; id="0002"                                      |
| Content-Transfer-Encoding: base64                                       |
|                                                                         |
| DlhGQAAOMAAAICBDaanAJSVAISFP7+/GbOzAJmZAIeHGbMzGbMzGbMzGbMzGbMzGbMzGbM  |
| CH+Dk1ZGUgd2l0aCBHSU1QACH5BAEKAAYALAAAAAAZAA8AQAR70EgZArlBWHw7Nts1gB6R  |
| BMlkp4lHJppkNoyW1r5SmcTeV6wUwrFI4VEulSMyRLchhYrYLq4MDKYrm9XuFQuIzLhALA  |
| +g44FBHybokQGdnivNfhJ8enwFSR12eB4jcWZ3gHeCJQJycXSJEzaIc5SIWz0RADs=      |
|                                                                         |
| --================================                                      |
| Content-Type: text/x-yaml; id="0003"                                    |
|                                                                         |
| an inline : $(0004)                                                     |
|     This is a folded text entity                                        |
|     that is associated with a                                           |
|     reference.                                                          |
| content :                                                               |
|    comment:                                                             |
|        The cyclic item is a reference                                   |
|        to the top-level map.                                            |
|    cyclic : *(0003)                                                     |
|    image  : *(0002)                                                     |
|    inline : *(0004)                                                     |
|    raw    : *(0001)                                                     |
|    title  : This contains multiple references                           |
|                                                                         |
+-------------------------------------------------------------------------+
|                            Information Model                            |
+-------------------------------------------------------------------------+
| A map/list/scalar data structures found in modern programming languages |
| such as ML, Python, Perl, and C. This model should also be very         |
| compatible with relational database tables. Note: This model lacks      |
| classes and references which are still under consideration.             |
| Document  The starting production for YAML is a Map.                    |
| Map       An un-ordered sequence of zero or more (Key,Value) tuples.    |
|           Where the Key is unique within the sequence and matches the   |
|           Key production.                                               |
| Value     Exactly one of Scalar, Map, or List                           |
| List      An ordered sequence of zero or more Values.                   |
| Scalar    Any type directly serializable through or able to be          |
|           constructed from a sequence of zero or more characters. These |
|           characters must match the Char production.                    |
| ----------------------------------------------------------------------- |
| Default   This is a synthesized attribute of every Value. If the Value  |
|           is a Scalar, then the Default property refers to the Value    |
|           itself. If this Value is a List, then the Default refers to   |
|           the Default property of the first Value in its sequence. If   |
|           the Value is a Map, then Default refers to the Default        |
|           property of the Value in its Pair entry lacking a Key. By     |
|           using Default, a Scalar Value can be substituted with a Map   |
|           or a List Value without breaking older code.                  |
|                                                                         |
| Take careful note that the information model does not admit a "parent"  |
| property of each value. Quite the contrary, YAML may be a graph         |
| structure and is not necessarily a tree.                                |
+-------------------------------------------------------------------------+
|                     Mapping To Popular Environments                     |
+-------------------------------------------------------------------------+
| For Python, the top-level object of the internal representation is a    |
| Dictionary, and from there, depending upon each value's indicator, can  |
| either be a List, Dictionary or a String. It is possible for a schema   |
| mechanism to be included which affords for more specific decoding into  |
| classes and types. The default attribute is implemented through a       |
| stand-alone function.                                                   |
|                                                                         |
| For Perl, the internal representation starts with a top-level hash. And |
| from there, depending upon the indicators can either be a list, hash,   |
| or string scalar. Of course, it is also possible for a schema mechanism |
| to be included which affords for more specific decoding. The default    |
| attribute is implemented through a stand-alone function.                |
|                                                                         |
| Haven't done Java or Javascript since '98, but I remember Strings, Maps |
| and Lists being Objects. So there shouldn't be any problem in Java.     |
| Javascript is probably in the same boat but I can't verify since that   |
| book has mysteriously disappeared as well.                              |
|                                                                         |
| For ML, C, and C++ all of which lack a built-in, variable type Map and  |
| List structure require a specific schema to build an internal           |
| representation. For these languages, a YamlValue type could be created  |
| with sub-types of Scalar, List, and Map. For C++, STL could make the    |
| implementation very quick, especially with iterator support. An         |
| alternative approach would be a class builder... but this, of course,   |
| requires a bit more smarts and a schema system.                         |
|                                                                         |
| Mapping to a relational database will also require some sort of schema  |
| to indicate how to pack/unpack. However, given that a tuple (record) is |
| easily associated with a Map, and a relation (table) is easily          |
| associated with a List, there should not be that much difficulty.       |
| Mapping NULL values will be represented by a lack of a particular map   |
| entry.                                                                  |
+-------------------------------------------------------------------------+
|                        Serialization Format / BNF                       |
+-------------------------------------------------------------------------+
| This section contains the BNF productions for the YAML syntax. Much to  |
| do...                                                                   |
+-------------------------------------------------------------------------+
|                             Parser Behavior                             |
+-------------------------------------------------------------------------+
| This section describes how a parser should parse YAML. Much to do...    |
+-------------------------------------------------------------------------+
|                    Emitter Behavior / Canonical Form                    |
+-------------------------------------------------------------------------+
| This section describes how an emitter should write YAML into canonical  |
| form. Includes specific word-wrapping algorithm. Minimal content        |
| length of 20 characters, and does its best to word-wrap by 76           |
| columns.                                                                |
+-------------------------------------------------------------------------+
|                             Implementations                             |
+-------------------------------------------------------------------------+
| To do... an implementation in C, C++/STL, Python, Java, and ...         |
+-------------------------------------------------------------------------+
|                                 Credits                                 |
+-------------------------------------------------------------------------+
| This work is the result of long, thoughtful discussions on the SML-DEV  |
| mailing list. Specific contributors include... (to do)                  |
+-------------------------------------------------------------------------+
|                              Some thoughts                              |
+-------------------------------------------------------------------------+
|  1. These are very preliminary thoughts on the subject; feedback is     |
|     very welcome.                                                       |
|  2. Implementations needed... Clark is happy to write the Python, C, &  |
|     perhaps even a C++ implementation. Any takers?                      |
|  3. Was thinking hard about using # for a comment indicator, or perhaps |
|     as a numeric indicator. Benefits? In any case, the BNF should leave |
|     all of these special characters open to future versions.            |
|                                                                         |
+-------------------------------------------------------------------------+
|                                   FAQ                                   |
+-------------------------------------------------------------------------+
|  1. Don't the indicator characters need to be escaped in the content?   |
|     Answer: No.                                                         |
|                                                                         |
+-------------------------------------------------------------------------+
|                          Specific Productions                           |
+-------------------------------------------------------------------------+
| Char        ::= #x9 | #xA | #xD |         Any unicode character,        |
|                 [#x20-#xD7FF] |           excluding surrogate blocks,   |
|                 [#xE000-#xFFFD] |         FFFE, and FFFF. Where unicode |
|                 [#x10000-#x10FFFF]        is defined by ISO/IEC         |
|                                           10646-2000                    |
| Characters  ::= Char*                     Zero or more characters.      |
| WhiteChar   ::= #x20 | #x9 | #xD | #xA    A space, tab, new line or     |
|                                           carriage return, escaped by   |
|                                           \s, \t, \n, and \r            |
|                                           respectively.                 |
| Whitespace  ::= WhiteChar+                Any sequence of spaces, tabs, |
|                                           new lines or carriage         |
|                                           returns.                      |
| Indicator   ::= '$' | '%' | '@' | '*'     The dollar sign indicates a   |
|                                           scalar, a percent sign        |
|                                           indicates a map, an at sign   |
|                                           indicates a list, and a star  |
|                                           represents a reference.       |
| Reserved    ::= WhiteChar | Indicator |   Printable, non-alpha,         |
|                 [#x21-#x2F] | '/' |       non-numeric ASCII characters  |
|                 [#x3B-#x40] | [#x5B-#x5E] excluding the period, colon,  |
|                 | #x60 | [#x7B-#x7F]      underscore, and dash.         |
| Key         ::= (Char - Reserved)*        One or more non-reserved      |
|                                           characters.                   |
+-------------------------------------------------------------------------+
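
The whitespace folding and escape rules from the Key Concepts section of the
draft can be sketched in a few lines of Python. This is an illustration
under stated assumptions, not the draft's reference algorithm: fold_scalar
and ESCAPES are hypothetical names, unknown escapes are passed through as
the escaped character, and leading and trailing unescaped whitespace is
dropped.

```python
# Escapes named in the draft: \n, \t, \\ and "\ ", plus \s and \r from the
# WhiteChar production.
ESCAPES = {"n": "\n", "t": "\t", "r": "\r", "s": " ", " ": " ", "\\": "\\"}


def fold_scalar(lines):
    """Fold raw scalar lines into content (a sketch, not a reference parser)."""
    text = "\n".join(lines)
    out = []
    pending_space = False              # saw unescaped whitespace, not yet emitted
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\":
            nxt = text[i + 1:i + 2]
            if nxt in ("", "\n"):
                # Trailing backslash: continuation. Skip the line break and
                # the next line's indentation without adding any whitespace.
                i += 2
                while i < len(text) and text[i] in " \t\r\n":
                    i += 1
                continue
            if pending_space and out:
                out.append(" ")
            pending_space = False
            out.append(ESCAPES.get(nxt, nxt))
            i += 2
        elif ch in " \t\r\n":
            pending_space = True       # any run of whitespace folds to one space
            i += 1
        else:
            if pending_space and out:
                out.append(" ")
            pending_space = False
            out.append(ch)
            i += 1
    return "".join(out)


# The "comments" scalar from the invoice example in the draft.
comments = fold_scalar([
    "    Mr. Dumars is frequently gone in the morning",
    "    so it is best advised to try things in late",
    "    afternoon. \\nIf Joe isn't around, try his house\\",
    "    keeper, Nancy Billsmer @ (734) 338-4338.\\n",
])
```

With these assumptions the line-ending backslash joins "house" and "keeper"
into a single word, and only the two escaped \n sequences survive as new
lines in the folded content.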

[Yaml-core] Testing

From: Michael L. <yam...@ya...> - 2001-05-18 16:27:41

Just testing to see if the mailing list works....



=====
Michael Lauzon
Maximum Linux Project, Founder
http://maxlinux.sourceforge.net/


[Yaml-core] (backfill) YAML Meeting Minutes: 17-MAY-2001

From: Clark C . E. <cc...@cl...> - 2001-05-18 16:01:22

----- Forwarded message from "Clark C . Evans" <cc...@cl...> -----
Date: Fri, 18 May 2001 10:01:25 -0500
From: "Clark C . Evans" <cc...@cl...>
Subject: Re: Meeting Minutes

On Fri, May 18, 2001 at 12:41:20AM -0700, Brian Ingerson wrote:
| > 9. Clark agreed to make a "bootstrap" C program
| >    and upload to source forge.  Brian and Neil
| >    agreed to download and hack at will.
| 
| As I walked to the train station with Neil, he figured out the C
| implementation in his head and said he would try to get it done 
| before bed.

I was wondering if I had been clear about the Iterator
and Visitor interfaces, where the Emitter procedure
implements the Visitor interface and the Parser 
implements the Iterator.  This is a pretty important
aspect of the API as it allows for both push and
pull based streams.  If you have any questions about
this, let's start a discussion about the implementation.
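
To make the push/pull distinction concrete, here is a rough Python sketch
(the bootstrap itself is planned in C). Every name in it is hypothetical:
iter_events stands in for a Parser by walking native data rather than YAML
text, RecordingVisitor stands in for an Emitter, and pump bridges the two.
The recorded lines are only a trace of the events, not draft YAML syntax.

```python
def iter_events(value):
    """Pull side: an iterator the consumer drains one event at a time."""
    if isinstance(value, dict):
        yield ("begin_map",)
        for key in sorted(value):
            yield ("key", key)
            yield from iter_events(value[key])
        yield ("end_map",)
    elif isinstance(value, list):
        yield ("begin_list",)
        for item in value:
            yield from iter_events(item)
        yield ("end_list",)
    else:
        yield ("scalar", str(value))


class RecordingVisitor:
    """Push side: the producer calls one method per event."""

    def __init__(self):
        self.lines = []
        self.depth = 0

    def _add(self, text):
        self.lines.append("    " * self.depth + text)

    def begin_map(self):
        self._add("%")
        self.depth += 1

    def end_map(self):
        self.depth -= 1

    def begin_list(self):
        self._add("@")
        self.depth += 1

    def end_list(self):
        self.depth -= 1

    def key(self, name):
        self._add(name + " :")

    def scalar(self, text):
        self._add(text)


def pump(events, visitor):
    """Bridge pull to push: drain an iterator into a visitor."""
    for name, *args in events:
        getattr(visitor, name)(*args)


visitor = RecordingVisitor()
pump(iter_events({"date": "12-JAN-2001", "product": ["ball", "hoop"]}), visitor)
```

Because the iterator and the visitor share one event vocabulary, a caller
can either pull events itself or hand a visitor to something that pushes
them, which is the flexibility described above.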

Further, if Neil wants to tackle the part of the spec
excluding RFC 822 and MIME, I can work on our implementation
of these specifications.  This might be a clean division
of work, although whoever implements first will probably
set the memory management policies.  It'd be cool if 
we could chat briefly about this so we are on the same page.
Also, I think I mentioned, but RFC822 limits us to UTF-8 
for our unicode version.  I suppose we could do UTF-16 for 
those files without the RFC822 header, but this should be optional.
If Neil would like, I can also handle the encoding mess.

Neil, thank you for offering to help!

| Damian Conway http://www.csse.monash.edu.au/~damian/

Danke.

Best,

Clark

P.S.  Let us all sign up on the source-forge mailing 
      list and start communicating this way.

----- End forwarded message -----

59 messages have been excluded from this view by a project administrator.
