Ingy dot Net wrote:
On 26/04/07 18:53 -0400, Kenneth Downs wrote:
Ingy dot Net wrote:
On 17/04/07 17:40 -0400, Clark C. Evans wrote:
You should be able to quote those Y/N items to force them
to be strings.
You might be able to configure your parser to not "implicitly
type" Y/N as a boolean value.
longest answer: >
You've hit the #1 usability problem with YAML: it's called "implicit
type resolution" and different implementations are doing it
The original goal was to make it easy to type in integers and have
them show up as integers w/o littering your text with "!!int".
Unfortunately, deciding where that line should be drawn is a bit hard.
In the next pass of YAML, I am going to recommend that all parsers
_only_ do implicit typing on:
(a) symbolic values, such as <<, which can be used to
augment the YAML syntax w/ very nice hooks
(b) numbers, "true" "false" and "null", following
the JSON standard (for compatibility)
At least, this should, IMHO, be the default. I think Ingy begs
to differ and believes the default should be *all strings* with
no implicit typing. Whatever we come up with, getting there
is sure to be unpleasant; but probably far less unpleasant than
the current state of affairs.
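
To make the candidate defaults concrete, here is a minimal Python sketch of three scalar resolvers: JSON-style typing as in (b), a looser variant that also turns Y/N-style words into booleans, and "all strings". The function names and the word lists are assumptions made up for illustration; they are not taken from any YAML implementation.

```python
import re

# Hypothetical word lists, for illustration only.
LOOSE_TRUE = {"y", "yes", "true", "on"}
LOOSE_FALSE = {"n", "no", "false", "off"}

_INT = re.compile(r"-?(0|[1-9][0-9]*)")
_FLOAT = re.compile(r"-?(0|[1-9][0-9]*)(\.[0-9]+)?([eE][-+]?[0-9]+)?")


def resolve_json_style(text, plain=True):
    """Implicitly type only numbers, true, false and null."""
    if not plain:
        return text  # quoted scalars always stay strings
    if text == "true":
        return True
    if text == "false":
        return False
    if text == "null":
        return None
    if _INT.fullmatch(text):
        return int(text)
    if _FLOAT.fullmatch(text):
        return float(text)
    return text


def resolve_loose(text, plain=True):
    """Additionally treat Y/N-style words as booleans."""
    if plain and text.lower() in LOOSE_TRUE:
        return True
    if plain and text.lower() in LOOSE_FALSE:
        return False
    return resolve_json_style(text, plain)


def resolve_all_strings(text, plain=True):
    """No implicit typing at all: every scalar stays a string."""
    return text
```

Under these assumptions a plain Y stays a string with the JSON-style and all-strings resolvers, becomes a boolean with the loose one, and stays a string everywhere once it is quoted (plain=False).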
I disagree but in specifics, not in spirit. The "Parser" should not do
of any form. It reports, for each scalar, a char-string value and whether
the scalar was plain or not.
A YAML "Load" operation consists of at least 3 steps: "parse", "compose",
and "construct". According to the spec, the introduction of a node "tag"
(aka "type") happens in the composer.
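
A toy Python sketch of that three-step split may help. Everything here (ScalarEvent, Node, the one-token-per-scalar input, the two tags) is a simplifying assumption for illustration, not the API of any real YAML library. The point is only that the parser reports a string plus a plain flag, and the composer is where a tag first appears.

```python
import re
from typing import NamedTuple


class ScalarEvent(NamedTuple):
    value: str   # raw character string
    plain: bool  # True if the scalar was unquoted


class Node(NamedTuple):
    tag: str
    value: str


def parse(tokens):
    """Toy parser: one token per scalar; double quotes mark a non-plain scalar."""
    for tok in tokens:
        quoted = len(tok) >= 2 and tok[0] == tok[-1] == '"'
        yield ScalarEvent(tok[1:-1] if quoted else tok, not quoted)


def compose(events):
    """Composer: the only stage that decides on a tag."""
    for ev in events:
        if ev.plain and re.fullmatch(r"-?[0-9]+", ev.value):
            tag = "int"
        else:
            tag = "str"
        yield Node(tag, ev.value)


def construct(nodes):
    """Constructor: turn tagged nodes into native values."""
    builders = {"int": int, "str": str}
    return [builders[node.tag](node.value) for node in nodes]


def load(tokens):
    return construct(compose(parse(tokens)))
```

With this sketch, changing the implicit typing policy means swapping compose and leaving parse untouched.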
Your point is that we got too cute with the default implicit types. String,
Integer and Number are fine as a default.
I would recommend that all implementations support an everything is string
I would ask, what is simplest and most consistent? And also, how did
the conversation start?
The conversation started because these two elements (is that the right
word?) give different results:
prop_1st: value # yields the string "value"
prop_2nd: Y # yields a numeric 1! Newbie says huh??
What options are available?
1) Tweak default behavior. Pro: Might satisfy this case. Con: Might
break older files, or require them to be version-stamped. Con: Will
just be an invitation to more tweaking and nobody will ever be happy.
Con: will create a list of incompatible versions with incomprehensible
variations (this is the end-result of taking this road). Think HTML 3,
HTML 3 for IE, HTML 4, HTML 4 for IE, HTML 4 for Mozilla pre 6, Mozilla
6, etc.
2) Support for header directives for the possible options. Possible
none: follow behavior from before directives became available, so my Y
above becomes a 1
"booltrue: Y, 1, Yes, YES", some kind of explicit list of values that
will be treated as boolean true. If my Y is not listed it won't be
treated as a boolean.
boolfalse: same as booltrue, list of false values
date: xx-xx-xxxx, anything that fits the picture is treated as a date.
numdigits: Treat any string composed only of numerals as a number
...others as they come to mind. Those are off the top of my head, a
real effort would have to be made to seek the list of directives that
served all purposes without overlap or missing possibilities.
3) Declaring types for named properties. In the above example I would
declare in the header that "prop_2nd" is a string.
4) Type-casting at the definition, which I believe is supported now with
If Ken were calling the shots, option 1 would be thrown out, option 4 is
already supported, so supporting options 2 and 3 would produce the
general solution, and then it becomes a matter of programmer preference
and then you wait for best practices to emerge through community use of
the various approaches.
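
Options 2 and 3 can be sketched together in Python. The directive names booltrue, boolfalse and numdigits come from the list above; the dictionary layout, the make_resolver helper and the "string"/"int" type names are assumptions invented for this example, and the date directive and the "none" fallback are left out.

```python
def make_resolver(directives=None, key_types=None):
    """Build a resolver from header-style directives and per-key types."""
    directives = directives or {}
    key_types = key_types or {}
    true_words = set(directives.get("booltrue", ()))
    false_words = set(directives.get("boolfalse", ()))
    numdigits = bool(directives.get("numdigits", False))
    casts = {"string": str, "int": int}  # hypothetical type names

    def resolve(key, text):
        # Option 3: a type declared for a named property always wins.
        if key in key_types:
            return casts[key_types[key]](text)
        # Option 2: only explicitly listed words become booleans.
        if text in true_words:
            return True
        if text in false_words:
            return False
        # Option 2: all-numeral strings become numbers if requested.
        if numdigits and text.isascii() and text.isdigit():
            return int(text)
        return text

    return resolve


resolve = make_resolver(
    directives={"booltrue": ["Yes", "YES"], "boolfalse": ["No"], "numdigits": True},
    key_types={"prop_2nd": "string"},
)
```

Here resolve("prop_2nd", "Y") returns the string "Y" because the property was declared a string, and a bare Y under any other key also stays a string because it is not in the booltrue list.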
It's a little more involved than picking a typing solution.
Actually my problem is a typing problem, and solving my own problem is
exactly as involved as picking a typing solution. The only situations
it touches for other parties are typing situations.
YAML is intended
to be used in both closed systems and open. In situations with a single
producer and consumer, and in those with many. In single programming languages
and multi. etc.
And this is relevant how? A general solution that allows a file to be
self-describing solves all cases. Different languages can make use of
the typing directives as needed/able. The typing method used by YAML
cannot change the fundamental typing abilities of any given language,
so the best way to handle lots of languages is to have the most
flexible way to describe the data.
Actually, to round out the general solution, the processor itself might
accept run-time parameters that override the directives inside of the
In small closed systems the YAML tool in question should be assumed to do the
Except when it doesn't, and you end up squeezing the balloon and always
watching it pop out somewhere else. The problem is that one person's
Right Thing is another person's Wrong Thing. You can never assume
except in trivial cases that code which is making assumptions will
always make the right assumptions.
If it doesn't this can easily be fixed by local code.
Yikes! Pushing the problem to code! A very strange approach in a
data-serialization project. I would expect more focus on the
possibilities of data-driven configurations.
There are also times when a document should be considered appropriate for
the masses, and data typing must be perfect and enforced. So far we really
only have tags for this. But we have talked here for years about defining a
"Schema" language for yaml. And documents could be given a schema, perhaps in
a header directive...
So we need to do that. But that will take effort...
But the real question to answer and make clear for now is when to apply
Given two parties using it, you'll get two opinions. Three parties,
three opinions. Good luck.
Secure Data Software, Inc.
631-379-7200 Fax: 631-689-0527