From: <ir...@ms...> - 2002-09-10 18:57:07
|
I think we need to articulate in the spec the multi-level loader details
we've been talking about, both to give a reference to language-module
implementers and the libyaml maintainer, and also to show users more
specifically what YAML supports. We need to give examples all the way
down to the native types in one or two of our reference platforms (I'm
using Python below). If language-specific information in the spec
offends people, we should at least give hints about recommended default
types.

I also think we need to decouple the vocabulary for a user-level
loading/dumping tool from "Parser/Loader/Emitter/Dumper" in the diagram.
Those four actually convert from one model to another, but a user-level
tool does something different: it exports/imports a native
representation *within* a level. Since the user doesn't care what YAML
calls its model-thunking routines, I'll use the terms "load" and "dump"
for the user-level tools and let others worry about what to call the
inter-model converters. In this scenario, "native model" truly
disappears because "native" is a concept orthogonal to the models.

With this in mind, here's a starting point, based on Clark's 4-model
proposal with a level 0 added. Obviously the models and specifics are in
flux, but we can get the outline down now and update the specifics
later. (Oh dear, I'm adding pages...)

=====

YAML has five models for representing data (levels 0 through 4),
arranged linearly from lowest (closest to the I/O stream) to highest
(most abstract). Sometimes these models are called "levels". Any model
may have a user-accessible loader tool that exports the data from that
model into a language-specific native data structure the application can
operate on. Likewise, the model may have a dumper tool that does the
opposite. We say *may* because the language implementation (e.g.,
PyYAML) is not required to support a loader/dumper for every level, but
if it does, the loader/dumper should support the following features at
minimum.
A loader may be a method or function you call that returns a native data
structure, but exactly how you invoke it is up to the language module.
Although each model has its own internal data structure (not normally
accessible to the user), YAML has only one persistent structure, the
"stream" contained in a .yml file. So when the application requests data
at a particular level, YAML must go *through* the lower models to read
it from the stream.

MODEL: Sub-syntax
LEVEL: 0
LOADER READS: input stream
LOADER RETURNS: a string containing the entire document
TYPICAL USES: not many

MODEL: Syntax
LEVEL: 1
LOADER READS: sub-syntax model
LOADER RETURNS:
  - each scalar as a string.
  - each [] collection as a list of pairs (0-based index, value)
  - each {} collection as a list of pairs (key, value)
  - key order and duplicate keys are preserved for collections
  - no information about type family
  - anchors/aliases are not resolved, they are just strings with "&"
    and "*" prefixes.
  - comments? Should there be an intermediate level that returns
    comments? But how will the application distinguish between a
    scalar and a comment if both are strings? Counting on the "#"
    prefix would be unreliable.
TYPICAL USES:
  - most of the serial-model uses can also be done here
TYPE INFORMATION:
  - the alternate type-family loader mentioned in the serial model
    could also operate here

MODEL: Serial
LEVEL: 2
LOADER READS: syntax model
LOADER RETURNS:
  - same as syntax model but...
  - anchors/aliases are resolved
  - no comments
TYPICAL USES:
  - A configuration file where all values are considered strings. The
    application handles type conversion itself. The application may
    want to validate "12345" before converting it to a number, or may
    want to raise a custom validation error.
  - An application that needs key order and duplicate keys preserved.
TYPE INFORMATION:
  - there may be an alternate loader method that returns a parallel
    structure containing the type family of each scalar (as a string)
    rather than the value itself. But how to encode the type family of
    collections? If you make the type family the collection's value,
    there's no place to list the type families of the collection's
    children.

MODEL: General
LEVEL: 3
LOADER READS: serial model
LOADER RETURNS:
  - I don't know how this is different from the serial model.

MODEL: Functional (aka graph)
LEVEL: 4
LOADER READS: general model
LOADER RETURNS:
  - same as general model but...
  - each scalar is resolved to its implicit/explicit type
  - each [] collection is converted to a list. order is preserved.
    duplicate keys are not an issue since YAML generated the keys.
  - each {} collection is converted to a dictionary. caveats:
    - key order is not guaranteed. Python dictionaries destroy key
      order and it cannot be reconstructed. PHP arrays preserve key
      order, but the loader may intentionally randomize the order to
      prevent you from exploiting this.
    - duplicate keys are dropped. Should this be done silently, or
      with a warning, or with an error, or whichever the user chooses?

The loader may give the user options to select from a variety of
alternate native representations.

The dumpers would generally operate in reverse. Probably they would need
to choose one or two output styles for all scalars, sequences and maps.
Choosing specific output styles beyond that really requires a schema.

--
-Mike (Iron) Orr, ir...@ms... (if mail problems: ms...@oz...)
http://iron.cx/ English * Esperanto * Russkiy * Deutsch * Espan~ol |
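The duplicate-key question raised for the functional model can be made concrete with a small Python sketch. It takes the list of (key, value) pairs that a serial-level loader is described as returning and builds a dictionary from it. The function name `pairs_to_dict`, the `on_duplicate` option and its three values are all hypothetical, chosen only for illustration, and the choice that the first occurrence of a key wins is an assumption the proposal does not settle.

```python
import warnings


def pairs_to_dict(pairs, on_duplicate="error"):
    """Build a dict from serial-level (key, value) pairs.

    on_duplicate is a hypothetical option: "error", "warn" or "silent".
    Assumption: when a duplicate is tolerated, the first occurrence wins.
    """
    if on_duplicate not in ("error", "warn", "silent"):
        raise ValueError("unknown on_duplicate policy: %r" % (on_duplicate,))
    result = {}
    for key, value in pairs:
        if key in result:
            if on_duplicate == "error":
                raise ValueError("duplicate key: %r" % (key,))
            if on_duplicate == "warn":
                warnings.warn("duplicate key dropped: %r" % (key,))
            continue  # drop the later occurrence
        result[key] = value
    return result


# Serial-level view of a mapping: all values are still strings,
# key order and the duplicate "name" key are preserved.
serial_pairs = [("name", "yaml"), ("year", "2002"), ("name", "YAML")]

print(pairs_to_dict(serial_pairs, on_duplicate="silent"))
# {'name': 'yaml', 'year': '2002'}
```

Note that dictionaries in Python 3.7 and later keep insertion order, so the key-order caveat above applies to older Pythons; the duplicate-key question is unaffected.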
From: Clark C. E. <cc...@cl...> - 2002-09-10 19:18:00
|
On Tue, Sep 10, 2002 at 11:57:04AM -0700, Mike Orr wrote:
| I think we need to articulate in the spec the multi-level loader details
| we've been talking about, both to give a reference to language-module
| implementers and the libyaml maintainer, and also to show users more
| specifically what YAML supports

I agree with your direction; although the word "Loader" we've
reserved for the operation which goes from the serial model
to the graph model; so I'd rather not muck with that. ;)
I like everything you've written here. From the output of the
parser, there are two separable issues: (a) creation of graph
using anchor/alias, and (b) resolution of types. Both of these
are the "loader" at this point, and perhaps their roles should
be split. I'm not sure.

Give me about 2 weeks to revise a draft of the information model
to reflect the discussion on this list (and your thoughtful wording
on a few tight items). At that point it'll be high time to have
another round of commenting, question/answer, etc. because I'm sure
I won't get it exactly right. But at this time there are too
many proposals floating around; give me some time to assimilate
it as a coherent explanation.

See my next post (about 20 min from now or so)

Best,

Clark |
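The two separable loader issues named above, (a) building the graph from anchors and aliases and (b) resolving types, could be sketched as two independent passes. This is only an illustration in Python: the tuple node shapes `("scalar", anchor, text)`, `("seq", anchor, children)` and `("alias", name)`, the function names, and the integer-only type rule are all assumptions, not anything the spec or libyaml defines.

```python
import re

INTEGER = re.compile(r"-?[0-9]+")


def build_graph(node, anchors):
    """Step (a): turn anchors/aliases into shared references. No typing."""
    if node[0] == "alias":
        return anchors[node[1]]
    kind, anchor, content = node
    if kind == "seq":
        result = []
        if anchor is not None:
            anchors[anchor] = result  # register first so cycles resolve
        result.extend(build_graph(child, anchors) for child in content)
        return result
    if anchor is not None:  # scalar: still an untyped string
        anchors[anchor] = content
    return content


def resolve_types(obj, memo=None):
    """Step (b): convert integer-looking strings, keeping sharing intact."""
    if memo is None:
        memo = {}
    if isinstance(obj, list):
        if id(obj) in memo:
            return memo[id(obj)]
        typed = []
        memo[id(obj)] = typed
        typed.extend(resolve_types(item, memo) for item in obj)
        return typed
    return int(obj) if INTEGER.fullmatch(obj) else obj


# A sequence whose first child is anchored and whose second is an alias.
tree = ("seq", None, [
    ("seq", "shared", [("scalar", None, "12345"), ("scalar", None, "abc")]),
    ("alias", "shared"),
])

graph = build_graph(tree, {})
typed = resolve_types(graph)
print(typed)                 # [[12345, 'abc'], [12345, 'abc']]
print(typed[0] is typed[1])  # True
```

Keeping the passes separate means an application can stop after step (a) and do its own validation of "12345" before any conversion happens.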
From: Tom S. <tra...@tr...> - 2002-09-10 19:42:27
|
clark, i have already started the reworking of 3 Information Models. if
you'd like, i can give you what i have thus far. you can then modify it
as you see fit. i've made it through the first part, including the
diagram, although i have one extra level in there.

by the way, speaking of that level. i made this split in order to access
directives, comments, and the like (the things lost in the current
serial model) in order for natives to be able to load that information
from the tree structured stream without having to parse the raw text
instead. doesn't that seem like a reasonable distinction, and
capability?

-tom

--
tom sawyer, aka transami
tra...@tr... |
From: Clark C. E. <cc...@cl...> - 2002-09-10 20:14:57
|
Tom,

There is a reason why comments and the like _aren't_ in the
serial model. It is because they shouldn't be used by
the loader when building the in-memory representation.
They are THROW AWAY comments. Just because it is in
the syntax doesn't mean it needs to be in the model;
this is the whole point of the model.

Clark |
From: Brian I. <in...@tt...> - 2002-09-10 20:31:49
|
On 10/09/02 20:16 +0000, Clark C. Evans wrote:
> Tom,
>
> There is a reason why comments and the like _aren't_ in the
> serial model. It is because they shouldn't be used by
> the loader when building the in-memory representation.
> They are THROW AWAY comments. Just because it is in
> the syntax doesn't mean it needs to be in the model;
> this is the whole point of the model.

+1 but...

This is the general case for data serialization. We want to have a relatively
simple parser/emitter for common YAML implementations like YAML.pm and PyYaml
and YAML4Ruby. This is libyaml. It uses the SERIAL model. It throws away
comments. Comments aren't in the SERIAL model. Fine.

There will also be more specialized classes of applications like a YAML
editor that must preserve all syntax info. These applications will not use
libyaml, nor the SERIAL model.

It's not that you MUST throw away comments when working with YAML. You just
need to write a parser (and possibly a model) that deals with them.

Cheers, Brian |
From: Tom S. <tra...@tr...> - 2002-09-10 20:47:38
|
On Tue, 2002-09-10 at 14:31, Brian Ingerson wrote:
> It's not that you MUST throw away comments when working with YAML. You just
> need to write a parser (and possibly a model) that deals with them.

Brian, i'm in agreement with you and clark. understand that i merely
postpone the removal of the comments, etc. by one level, which is why i
have an extra model level, Aliased, between Serial and Logical. it's
after the initial serialization that they are discarded. so there is at
least a place where they exist hooked into the event tree before being
jettisoned.

--
tom sawyer, aka transami
tra...@tr... |
From: Neil W. <neilw@ActiveState.com> - 2002-09-10 22:26:38
|
Brian Ingerson [10/09/02 13:31 -0700]:
> This is the general case for data serialization. We want to have a relatively
> simple parser/emitter for common YAML implementations like YAML.pm and PyYaml
> and YAML4Ruby. This is libyaml. It uses the SERIAL model. It throws away
> comments. Comments aren't in the SERIAL model. Fine.

Hey! Not true!

In the push API, there is a comment_notify() callback to enable
applications to keep them. Specifically apps that need to, like
pretty-printers. In the pull API, they're just returned with a special node
kind YAML_COMMENT, and every element in the node structure is empty except
'content', which is the comment itself.

> There will also be more specialized classes of applications like a YAML
> editor that must preserve all syntax info. These applications will not use
> libyaml, nor the SERIAL model.

Yes they will. Or at least, it was my intention to make it possible.

> It's not that you MUST throw away comments when working with YAML. You just
> need to write a parser (and possibly a model) that deals with them.

You *probably* ought to throw them out when going to a graph model or
something like that -- otherwise where do comments fit? They're kinda weird.
But you could I suppose; anyway, libyaml supports it.

Later,
Neil |
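The difference between the two comment mechanisms described above can be sketched with a toy model in Python. Nothing here is libyaml's API: `pull_nodes`, `push_parse`, the `on_comment` callback and the `COMMENT`/`SCALAR` kinds are hypothetical stand-ins for the pull interface, the push interface, comment_notify() and YAML_COMMENT. The "parser" only splits lines and recognizes whole-line `#` comments, and layering the push interface on top of the pull one is an assumption made for brevity.

```python
COMMENT = "comment"  # stand-in for a node kind like YAML_COMMENT
SCALAR = "scalar"


def pull_nodes(text):
    """Toy pull parser: yield (kind, content) for each non-blank line.

    This is not a YAML parser; it only knows whole-line '#' comments.
    """
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if stripped.startswith("#"):
            yield (COMMENT, stripped[1:].strip())
        else:
            yield (SCALAR, stripped)


def push_parse(text, on_scalar, on_comment=None):
    """Toy push parser: comments reach the caller only if it asks."""
    for kind, content in pull_nodes(text):
        if kind == COMMENT:
            if on_comment is not None:
                on_comment(content)
        else:
            on_scalar(content)


doc = "# header note\nalpha\nbeta\n"

# Loader-style use: no comment callback, so comments are thrown away.
scalars = []
push_parse(doc, on_scalar=scalars.append)
print(scalars)  # ['alpha', 'beta']

# Pretty-printer-style use: the pull interface hands comments back.
print(list(pull_nodes(doc)))
# [('comment', 'header note'), ('scalar', 'alpha'), ('scalar', 'beta')]
```

In this arrangement the decision to discard comments sits with the consumer rather than the parser, which is the point being argued in the thread.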
From: Tom S. <tra...@tr...> - 2002-09-10 22:31:01
|
On Tue, 2002-09-10 at 16:28, Neil Watkiss wrote:
> Brian Ingerson [10/09/02 13:31 -0700]:
> > This is the general case for data serialization. We want to have a relatively
> > simple parser/emitter for common YAML implementations like YAML.pm and PyYaml
> > and YAML4Ruby. This is libyaml. It uses the SERIAL model. It throws away
> > comments. Comments aren't in the SERIAL model. Fine.
>
> Hey! Not true!
>
> In the push API, there is a comment_notify() callback to enable
> applications to keep them. Specifically apps that need to, like
> pretty-printers. In the pull API, they're just returned with a special node
> kind YAML_COMMENT, and every element in the node structure is empty except
> 'content', which is the comment itself.

neil, so comments do have existence at the serial level in libyaml? is
that right?

--
tom sawyer, aka transami
tra...@tr... |
From: <ir...@ms...> - 2002-09-10 22:43:43
|
What's the state of libyaml? Is there a list of functions it will provide? Is it written in C? -- -Mike (Iron) Orr, ir...@ms... (if mail problems: ms...@oz...) http://iron.cx/ English * Esperanto * Russkiy * Deutsch * Espan~ol |
From: Neil W. <neilw@ActiveState.com> - 2002-09-11 01:51:50
|
Mike Orr [10/09/02 15:43 -0700]:
> What's the state of libyaml?

Pre-release, although I'm working hard to change that. It was *very* stable
about six months ago as a push parser. Then I decided to convert it to a pull
parser with a push parser implemented on top of that; then the spec went
haywire and I hit PAUSE.

Currently only YAML developers can access it. Anyone who wants an SSH account
for the YAML Perforce server can email me or Brian Ingerson offline and get
access.

> Is there a list of functions it will provide?
> Is it written in C?

It's written in C, but the API is not yet public (or complete). I suspect it
won't change much. Here's a quick glimpse of what you can expect:

---8<---
#include "yaml.h"

yaml_bool_t
exception_handler(yaml_exception_t *except)
{
    if (IS_FATAL_EXCEPTION(except->errorcode)) {
        /* complain at the user, if you want */
        return 0; /* 0 means a fatal error */
    }
    return 1; /* non-fatal error (we call these warnings) */
}

int
main()
{
    dYAML_OPTIONS(opts);
    yaml_parser_t *parser;
    /* buffer, length, arbitrary_input, events, result and node are
     * left undeclared in this preview. */

    /* Request that nodes have content recoded to UTF-16, regardless of the
     * original encoding. */
    opts.encoding = YAML_ENCODING_UTF16;

    /* Register an exception handler. Without an exception handler, all
     * parser exceptions (including warnings) are considered fatal. The
     * parser accumulates a list of non-fatal exceptions which you can query
     * periodically to display to the user. */
    opts.exception_handler = &exception_handler;

    /* Create a new parser to parse a UTF-8 stream from standard input. If
     * you specify YAML_ENCODING_DETECT, it uses the BOM (if any) to select
     * the encoding. If no BOM is found, it selects UTF-8. If you know there
     * isn't a BOM, you can specify the encoding explicitly. If you specify
     * YAML_ENCODING_UTF16 or YAML_ENCODING_UTF32, a BOM is required to tell
     * the difference between LE or BE. libyaml *only* supports UTF
     * encodings. It does not support any ISO-8859-1 caca.
     */
    parser = yaml_new_parser_fp(stdin, YAML_ENCODING_UTF8, &opts);

    /* OR -- create a new parser to parse a memory buffer */
    parser = yaml_new_parser_cstr(buffer, length, YAML_ENCODING_UTF32BE, &opts);

    /* OR -- create a parser to parse from an arbitrary input. The parser
     * calls the input function and tells it how much room it has, and the
     * function should read from whatever source it's using and fill up the
     * parser's buffer. */
    parser = yaml_new_parser(&arbitrary_input, &opts);

    /* PUSH parse */
    result = yaml_parse(parser, &events);

    /* PULL parse */
    while ((node = yaml_parse_next(parser))) {
        /* do something with the node here */
    }
}
--->8---

That's all for now. If you want to know more, please wait for the alpha
release. All I can say is "soon" -- probably before the end of the month for
at least some kind of preview. That's barring any major upheavals in the
spec.

You can expect at least a minimal Perl and Python loader built on libyaml to
be released at the same time.

Later,
Neil |
From: Brian I. <in...@tt...> - 2002-09-11 01:09:32
|
On 10/09/02 15:28 -0700, Neil Watkiss wrote:
> Brian Ingerson [10/09/02 13:31 -0700]:
> > This is the general case for data serialization. We want to have a relatively
> > simple parser/emitter for common YAML implementations like YAML.pm and PyYaml
> > and YAML4Ruby. This is libyaml. It uses the SERIAL model. It throws away
> > comments. Comments aren't in the SERIAL model. Fine.
>
> Hey! Not true!
>
> In the push API, there is a comment_notify() callback to enable
> applications to keep them. Specifically apps that need to, like
> pretty-printers. In the pull API, they're just returned with a special node
> kind YAML_COMMENT, and every element in the node structure is empty except
> 'content', which is the comment itself.
>
> > There will also be more specialized classes of applications like a YAML
> > editor that must preserve all syntax info. These applications will not use
> > libyaml, nor the SERIAL model.
>
> Yes they will. Or at least, it was my intention to make it possible.
>
> > It's not that you MUST throw away comments when working with YAML. You just
> > need to write a parser (and possibly a model) that deals with them.
>
> You *probably* ought to throw them out when going to a graph model or
> something like that -- otherwise where do comments fit? They're kinda weird.
> But you could I suppose; anyway, libyaml supports it.

My apologies. I didn't know it was that sophisticated. One could argue that
it need not be, for complexity reasons. But one could also argue that we
might as well have it be as functional as Neil cares to make it.

I assume the Loader could ignore comments without any significant loss in
speed (as opposed to having the parser just ignore them).

Cheers, Brian |
From: Tom S. <tra...@tr...> - 2002-09-10 20:39:15
|
On Tue, 2002-09-10 at 14:16, Clark C. Evans wrote:
> There is a reason why comments and the like _aren't_ in the
> serial model. It is because they shouldn't be used by
> the loader when building the in-memory representation.
> They are THROW AWAY comments. Just because it is in
> the syntax doesn't mean it needs to be in the model;
> this is the whole point of the model.

oh contra! though agreed that it is not to be brought down below its
level of application, it does nonetheless have potential applications at
its particular level!

for instance lets talk about ypath. although strictly general, it will
give an idea of what these dumper/loaders at different levels are all
about. this is of course fictitious syntax just to give one an idea of
the possibilities.

ypath-syntactical: (returns strings)

  $[0..*]   - return the entire document stream
  $[10]     - return character 10
  $[10..16] - return characters 10 through 16
  @[1..2]   - return the 3rd thru 5th lines
  etc...

ypath-serial: (returns nodes or strings)

  \*         - the entire tree
  \[3]       - the 3rd branch off of root
  \[2]\[4]   - the 4th branch of the 2nd branch off the root
  \[2]\[4]#  - any comment located on this branch.
  \[2]\[4]-# - any directive located on this branch.
  \[2]\[4]-  - the indentation of this branch (?)
  etc...

ypath-aliased:

  \[3]&  - any anchor on the third node
  \&[3]  - the third anchor in the document stream
  \[3]*  - any alias of the third node
  \*[3]  - the third alias of the document stream
  \*[3]& - the anchor of the third alias
  and others...

ypath-logical:

ypath-functional:

* these last two are the more traditional use cases of ypath so i'll let
these be. we all have an idea of what they are all about.

-tom

--
tom sawyer, aka transami
tra...@tr... |
From: Clark C. E. <cc...@cl...> - 2002-09-10 20:49:53
|
On Tue, Sep 10, 2002 at 02:49:29PM -0600, Tom Sawyer wrote:
| oh contra! though agreed that it is not to be brought down below its
| level of application, it does nonetheless have potential applications at
| its particular level!

We agree here. The reason for the SYNTAX model is for
applications like a YAML editor, etc. Applications at this
level also want to maintain syntax style and even where the
actual carriage returns are in folded blocks. It's a bloody
awful model, but it reflects the actual syntax.

| ypath-syntactical: (returns strings)
|
| $[0..*] - return the entire document stream
| $[10] - return character 10
| $[10..16] - return characters 10 through 16
| @[1..2] - return the 3rd thru 5th lines
| etc...

Interesting, but probably not that useful. ;)

| ypath-serial: (returns nodes or strings)
|
| \* - the entire tree
| \[3] - the 3rd branch off of root
| \[2]\[4] - the 4th branch of the 2nd branch off the root

Yep. BTW, good of you to use this sort of logic, appealing
to a direct application of the model. Nice.

| \[2]\[4]# - any comment located on this branch.
| \[2]\[4]-# - any directive located on this branch.
| \[2]\[4]- - the indentation of this branch (?)
| etc...

Comments aren't actually attached to nodes. This would cause
all sorts of problems and was hashed out about a year ago.
See the archives. It's a brutal decision and we don't have
a week to review it. So can we let it rest?

| ypath-aliased:
| \[3]& - any anchor on the third node
| \&[3] - the third anchor in the document stream
| \[3]* - any alias of the third node
| \*[3] - the third alias of the document stream
| \*[3]& - the anchor of the third alias
| and others...
|
| ypath-logical:
|
| ypath-functional:

I think you have too many models. Three will do just nicely. ;)

Best,

Clark |
From: Clark C. E. <cc...@cl...> - 2002-09-10 21:29:04
|
On Tue, Sep 10, 2002 at 03:10:42PM -0600, Tom Sawyer wrote:
| On Tue, 2002-09-10 at 14:51, Clark C. Evans wrote:
| > Comments aren't actually attached to nodes. This would cause
| > all sorts of problems and was hashed out about a year ago.
| > See the archives. It's a brutal decision and we don't have
| > a week to review it. So can we let it rest?

Err, um, I didn't intend to go off list... sorry.

| okay, if there's no way to get comments or indentation except by parsing
| the syntax that's fine, i'll drop. but what about directives? they have
| a bit more importance don't you think? so at what level can i load those
| below the syntactical one?

Err. I think my post was mis-leading. Comments, Indentation, and other
syntax stuff is in the syntax model (the one that reflects t
but to answer your question; comments, indentation, directives, etc
all belong in the syntax model and the parser is *welcome* to
provide this info (I think libyaml provides it if asked for)

| when you say three will do i get the feeling that no logical model has
| made it into your realm of possibilities?

Everything is in the realm of possibilities; I just don't
think that a stack of models can probably be improved upon.
I'm not sure... really. Model stuff is _hard_ and best done
as a collaboration, cuz I know I'm too stupid to see the
whole picture.

Anyway, to answer the most immediate concern, I don't think
that a tall hierarchy will do, I think it is probably more
like a cube...

      +---------+ (functional,graph,typed)
    / |         |
   /  |         |
  +   |         |
  |   +---------+ (functional,tree,typed)
  |  /         /
  | /         /
  |/         /
  +---------+ (functional,tree,untyped)
  ^
 /
(paired,tree,untyped)

Only that functional,untyped is distinctly _not_
a possibility... so a cube is out. ;(

Clark |
From: Tom S. <tra...@tr...> - 2002-09-10 22:07:35
|
On Tue, 2002-09-10 at 15:30, Clark C. Evans wrote:
> On Tue, Sep 10, 2002 at 03:10:42PM -0600, Tom Sawyer wrote:
> | On Tue, 2002-09-10 at 14:51, Clark C. Evans wrote:
> | > Comments aren't actually attached to nodes. This would cause
> | > all sorts of problems and was hashed out about a year ago.
> | > See the archives. It's a brutal decision and we don't have
> | > a week to review it. So can we let it rest?
>
> Err, um, I didn't intend to go off list... sorry.
>
> | okay, if there's no way to get comments or indentation except by parsing
> | the syntax that's fine, i'll drop. but what about directives? they have
> | a bit more importance don't you think? so at what level can i load those
> | below the syntactical one?
>
> Err. I think my post was mis-leading. Comments, Indentation, and other
> syntax stuff is in the syntax model (the one that reflects t
> but to answer your question; comments, indentation, directives, etc
> all belong in the syntax model and the parser is *welcome* to
> provide this info (I think libyaml provides it if asked for)

i see, so the syntax model is a little more than what i've been taking
it to be? just a hunk of text? that's actually the 0th pre-syntax model
as mentioned by mike. is that right? okay, if that's the case then that
part is already like i want. my split was trying to fix something i
didn't know was already there b/c the 0th level was never clearly
contrasted against the 1st (syntax).

> | when you say three will do i get the feeling that no logical model has
> | made it into your realm of possibilities?
>
> Everything is in the realm of possibilities; I just don't
> think that a stack of models can probably be improved upon.
> I'm not sure... really. Model stuff is _hard_ and best done
> as a collaboration, cuz I know I'm too stupid to see the
> whole picture.

well, mine isn't all that different from what is there already, and it's
not completely a stack. first off, the functional model is a subclass of
the logical one. and if you think about it one might be able to conceive
of it as a russian "onion" doll, one layer inside another inside
another. i'll think on it some more though.

> Anyway, to answer the most immediate concern, I don't think
> that a tall hierarchy will do, I think it is probably more
> like a cube...
>
>       +---------+ (functional,graph,typed)
>     / |         |
>    /  |         |
>   +   |         |
>   |   +---------+ (functional,tree,typed)
>   |  /         /
>   | /         /
>   |/         /
>   +---------+ (functional,tree,untyped)
>   ^
>  /
> (paired,tree,untyped)
>
> Only that functional,untyped is distinctly _not_
> a possibility... so a cube is out. ;(
>
> Clark

okay, lets review each of these terms cause i want to know exactly what
you mean by them. for instance graph is a real funny one, i've seen that
used so freely i don't even know if it has a meaning anymore ;) the
functional, are we on the same page there, the mathematical definition
--domain/range, etc. then typed vs. untyped, you're talking about a
semantic level like 0xFF = 255, distinguished from a format case like
10,000 = 10000 limited to regex. i assume tree is the syntax "flattened"
into a hierarchical sequence (of events). and finally paired, well i'm
thinking alias/anchor, that one escapes me.

so i take it there's:

  pairing vs. functional
  tree vs. graph
  untyped vs. typed

yes, indeed i really need to understand your terminology better. mine
is rather formally based on mathematics and literal websters usage. not
so much academic c.s. terminology b/c i've taught myself most everything
i know. please clarify tree vs. graph and pairing vs. functional.

--
tom sawyer, aka transami
tra...@tr... |
From: Clark C. E. <cc...@cl...> - 2002-09-11 00:02:18
|
On Tue, Sep 10, 2002 at 04:17:46PM -0600, Tom Sawyer wrote:
| yes, indeed i really need to understand your terminology better. mine
| is rather formally based on mathematics and literal websters usage. not
| so much academic c.s. terminology b/c i've taught myself most everything
| i know. please clarify tree vs graph and pairing vs. functional.

functional: This is the model where the collection is defined to be
            a mathematical mapping, where the domain is a set. Thus,
            an equality property between nodes is required to keep
            duplicates out.

paired:     This is where the collection is an ordered set of pairs.
            I also call this one a named-list; although ordered set
            of pairs is better (it's actually more strict than a
            named-list).

graph:      This is where the structure is in random access memory
            and can fold back on itself. A node can appear twice.

tree:       This is where each node has exactly one parent; a node
            cannot be referenced more than once. We represent graphs
            using trees by marking nodes which occur more than once
            with an anchor, and subsequent occurrences of the node
            with an alias which is logically linked to the anchor.

Clark
|
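The four terms above can be illustrated with a small Python sketch. This is only one reading of Clark's definitions, not anything from the spec or from an existing YAML library: the helper names (`count_refs`, `to_tree`, `graph_to_tree`), the generated anchor names (`a1`, `a2`, ...), and the `("anchor", name, body)` / `("alias", name)` tuple shapes are all assumptions made up for illustration.

```python
from collections import Counter

# paired: an ordered collection of (key, value) pairs.
# Order and duplicate keys both survive.
paired = [("b", "1"), ("a", "2"), ("b", "3")]

# functional: a mapping whose domain is a set, so key equality
# collapses duplicates (Python's dict keeps the last value).
functional = dict(paired)

# graph: an in-memory structure where one node is referenced twice.
shared = ["x", "y"]
graph = {"first": shared, "second": shared}


def count_refs(node, counts):
    """Count how often each collection object is reached from the root."""
    if isinstance(node, (list, dict)):
        counts[id(node)] += 1
        if counts[id(node)] > 1:
            return  # already walked once; this also stops on cycles
        children = node.values() if isinstance(node, dict) else node
        for child in children:
            count_refs(child, counts)


def to_tree(node, counts, anchors):
    """Rewrite a graph as a tree: first occurrence of a shared node gets
    an anchor, every later occurrence becomes an alias to that anchor."""
    if not isinstance(node, (list, dict)):
        return node
    key = id(node)
    if key in anchors:
        return ("alias", anchors[key])
    name = None
    if counts[key] > 1:
        name = "a%d" % (len(anchors) + 1)  # hypothetical anchor naming
        anchors[key] = name                # register before recursing
    if isinstance(node, dict):
        body = {k: to_tree(v, counts, anchors) for k, v in node.items()}
    else:
        body = [to_tree(v, counts, anchors) for v in node]
    return ("anchor", name, body) if name else body


def graph_to_tree(root):
    counts = Counter()
    count_refs(root, counts)
    return to_tree(root, counts, {})


tree = graph_to_tree(graph)
```

In the resulting `tree`, every node has exactly one parent: `"first"` holds the anchored list and `"second"` holds only an alias pointing back at it.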
From: Steve H. <sh...@zi...> - 2002-09-10 20:17:39
|
We seem to have at least two folks--Tom and Clark--interested in creating
a document that describes the YAML data models. Both guys are intelligent.
Both guys clearly see room for improvement. Both guys have slightly
different philosophies about YAML.

I propose that over the next two weeks they both maintain a Wiki page with
their latest and greatest conception of how the YAML data models work:

  http://wiki.yaml.org/yamlwiki/TomsModels
  http://wiki.yaml.org/yamlwiki/ClarksInfoModels (and then Clark has a few others)

When Tom and Clark agree on concepts, they are welcome to steal from each
other's pages.

Third party observers can also start their own pages. If they would rather
just get behind one of the existing proposals, though, they may just decide
to help Tom or Clark edit and strengthen their proposals.

At the end of two weeks, hopefully we'll have at least one, probably two,
maybe even three, very strong presentations of how the YAML data model
works. One of these presentations could be sanctioned in the spec. The
other could be left on the Wiki for posterity.

Cheers,

Steve
|
From: Clark C. E. <cc...@cl...> - 2002-09-10 20:26:29
|
Steve, I think a better approach is to put the spec up in the Wiki and then collaboratively edit it. The Wiki gives a history of who changed what. Clark |
From: Steve H. <sh...@zi...> - 2002-09-10 21:05:05
|
----- Original Message -----
From: "Clark C. Evans" <cc...@cl...>
>
> I think a better approach is to put the spec up in the Wiki and
> then collaboratively edit it. The Wiki gives a history of who
> changed what.
>

You and Tom seem to disagree on enough things that having two separate
proposals for a little while makes more sense. I'd rather see two
proposals that each person firmly believes in, than one proposal that
neither believes in.

Of course, insofar as you agree, please copy each other, and you can even
refactor sections of your documents to other pages, when you know you have
consensus on subgoals. (MoinMoin supports an #include directive.)

When I read the two proposals two weeks from now, I will disregard all
information about personal histories, what committees people served on,
what mailing lists they've read, what specs they've pored through, what
prior schemas they drafted, what school they went to, and whether they are
part of any holy triumvirates. I am just going to read both proposals, and
see which one reads the most clearly and makes the most sense. That's my
prerogative, right?

Cheers,

Steve
|
From: Clark C. E. <cc...@cl...> - 2002-09-10 20:37:57
|
Steve,

The different philosophies about YAML's direction are really not
spec-wording issues. They are actually much higher than
that and they are set by agreement of Brian, Oren and myself.
Spec wording issues are a reflection of these philosophies.

YAML itself is run mostly by consensus -- but when
it comes down to the wire, it is a consensus of Brian,
Oren and myself. And in some cases we've even had to
do a 2 out of 3 vote between us. This is YAML.

As for Tom's ideas of the spec; these are great to have
and I'm sure his perspectives will be examined and taken
into account -- I very clearly know what he is going for.
If you read the SML-DEV archives, you'll see a whole group
of us went down this road about 2-3 years ago.

I particularly like Mike Orr's wording on stuff and will
probably lift a few sentences here and there.

Further, I'll gladly start editing the spec on the website...
and people are welcome to help.

Best,

Clark
|
From: Brian I. <in...@tt...> - 2002-09-10 21:02:14
|
On 10/09/02 20:39 +0000, Clark C. Evans wrote:
> Steve,
>
> The different philosophies about YAML's direction are really not
> spec-wording issues. They are actually much higher than
> that and they are set by agreement of Brian, Oren and myself.
> Spec wording issues are a reflection of these philosophies.
>
> YAML itself is run mostly by consensus -- but when
> it comes down to the wire, it is a consensus of Brian,
> Oren and myself.

I'd be the last one to point this out. But it's true. I won't ever harp
on it though. I like to do everything in the open and be receptive to new
ideas. Sometimes when there are too many ideas, we need to step back and
think about why YAML was originally started. Oren, Clark and I have this
advantage. We've been working on this almost daily for close to 500 days.

It's funny, because this phenomenon of "too many cooks" was not a problem
in the past. We barely had anyone listening in, let alone speaking up. I
blame Steve Howell and Why and their excellent implementations. They
dragged in whole user communities. It's super to have this think-tank,
but sometimes it feels like a shark-tank!

> And in some cases we've even had to
> do a 2 out of 3 vote between us. This is YAML.

Rarely though. I can't think of any issue that Clark and Oren made which
I am completely unhappy with. We take time to get things right.

It seems that we at least did a good job on the syntax. Nobody ever
gripes about that. (thank goodness)

Well that's just my editorial. Back to business everybody. Move along
now...

Cheers,
Brian
|
From: Tom S. <tra...@tr...> - 2002-09-10 19:29:48
|
Mike,

this is crazy! except for one level distinction this is exactly my model
proposal, what i've now spent the last 24 hours working on non-stop.

  http://wiki.yaml.org/yamlwiki/TomsModels?action=show

is it me or was it really that hard to comprehend before? well please
give it another go. this time i think it will clarify some of your
questions.

On Tue, 2002-09-10 at 12:57, Mike Orr wrote:
> I think we need to articulate in the spec the multi-level loader details
> we've been talking about, both to give a reference to language-module
> implementers and the libyaml maintainer, and also to show users more
> specifically what YAML supports. We need to give examples all the
> way down to the native types in one or two of our reference platforms
> (I'm using Python below). If language-specific information in the spec
> offends people, we should at least give hints about recommended default
> types.
>
> I also think we need to decouple the vocabulary for a user-level
> loading/dumping tool from "Parser/Loader/Emitter/Dumper" in the
> diagram. Those four actually convert from one model to another, but
> a user-level tool does something different: it exports/imports a
> native representation *within* a level. Since the user doesn't care
> what YAML calls its model-thunking routines, I'll use the terms
> "load" and "dump" for the user-level tools and let others worry about
> what to call the inter-model converters.
>
> In this scenario, "native model" truly disappears because "native" is a
> concept orthogonal to the models.
>
> With this in mind, here's a starting point, based on Clark's 4-model
> proposal with a level 0 added. Obviously the models and specifics are
> in flux, but we can get the outline down now and update the specifics
> later. (Oh dear, I'm adding pages...)
>
> =====
> YAML has four models for representing data, arranged linearly
> from lowest (closest to the I/O stream) to highest (most abstract).
> Sometimes these models are called "levels". Any model may have a
> user-accessible loader tool that exports the data from that model into a
> language-specific native data structure the application can operate on.
> Likewise, the model may have a dumper tool that does the opposite. We
> say *may* because the language implementation (e.g., PyYAML) is not
> required to support a loader/dumper for every level, but if it does,
> the loader/dumper should support the following features at minimum.
> A loader may be a method or function you call that returns a native
> data structure, but exactly how you invoke it is up to the language
> module.
>
> Although each model has its own internal data structure (not normally
> accessible to the user), YAML has only one persistent structure, the
> "stream" contained in a .yml file. So when the application requests
> data at a particular level, YAML must go *through* the lower models
> to read it from the stream.
>
> MODEL: Sub-syntax
> LEVEL: 0
> LOADER READS: input stream
> LOADER RETURNS: a string containing the entire document
> TYPICAL USES: not many
>
> MODEL: Syntax
> LEVEL: 1
> LOADER READS: sub-syntax model
> LOADER RETURNS:
>   - each scalar as a string.
>   - each [] collection as a list of pairs (0-based index, value)
>   - each {} collection as a list of pairs (key, value)
>   - key order and duplicate keys are preserved for collections
>   - no information about type family
>   - anchors/aliases are not resolved, they are just strings with "&" and
>     "*" prefixes.
>   - comments? Should there be an intermediate level that returns
>     comments? But how will the application distinguish between a
>     scalar and a comment if both are strings? Counting on the "#" prefix
>     would be unreliable.
> TYPICAL USES:
>   - most of the serial-model uses can also be done here
> TYPE INFORMATION:
>   - the alternate type-family loader mentioned in the serial model
>     could also operate here
>
>
> MODEL: Serial
> LEVEL: 2
> LOADER READS: syntax model
> LOADER RETURNS:
>   - same as syntax model but...
>   - anchors/aliases are resolved
>   - no comments
> TYPICAL USES:
>   - A configuration file where all values are considered strings.
>     The application handles type conversion itself. The application
>     may want to validate "12345" before converting it to a number, or
>     may want to raise a custom validation error.
>   - An application that needs key order and duplicate keys preserved.
> TYPE INFORMATION:
>   - there may be an alternate loader method that returns a parallel
>     structure containing the type family of each scalar (as a string)
>     rather than the value itself. But how to encode the type family
>     of collections? If you make it the collection's value, there's no
>     place to list the type families of the collection's children.
>
>
> MODEL: General
> LEVEL: 3
> LOADER READS: serial model
> LOADER RETURNS:
>   - I don't know how this is different from the serial model.
>
>
> MODEL: Functional (aka graph)
> LEVEL: 4
> LOADER READS: general model
> LOADER RETURNS:
>   - same as general model but...
>   - each scalar is resolved to its implicit/explicit type
>   - each [] collection is converted to a list. order is preserved.
>     duplicate keys are not an issue since YAML generated the keys.
>   - each {} collection is converted to a dictionary. caveats:
>     - key order is not guaranteed. Python dictionaries destroy
>       key order and it cannot be reconstructed. PHP arrays preserve
>       key order, but the loader may intentionally randomize the order
>       to prevent you from exploiting this.
>     - duplicate keys are dropped. Should this be done silently, or
>       with a warning, or with an error, or whichever the user chooses?
>
>
> The loader may give the user options to select from a variety of
> alternate native representations.
>
> The dumpers would generally operate in reverse. Probably they would
> need to choose one or two output styles for all scalars, sequences and
> maps. Choosing specific output styles beyond that really requires a schema.
>
> --
> -Mike (Iron) Orr, ir...@ms... (if mail problems: ms...@oz...)
> http://iron.cx/ English * Esperanto * Russkiy * Deutsch * Espan~ol

--
tom sawyer, aka transami
tra...@tr...
|
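To make the per-level "LOADER RETURNS" lists in Mike's quoted proposal more concrete, here is a minimal Python sketch of what levels 0, 1, 2 and 4 might hand back for one tiny document. None of this is PyYAML's or libyaml's API. The function names (`resolve_aliases`, `to_native`), the encoding of an anchored scalar as the string `"&b 10"`, and the rule "all-integer keys means a [] collection" are assumptions chosen only to match the wording of the proposal. Level 3 is skipped because the proposal itself leaves it undefined.

```python
# Level 0 (sub-syntax): the whole document as one string.
level0 = "base: &b 10\ncopy: *b\nitems: [1, two]\n"

# Level 1 (syntax): scalars are strings, collections are lists of pairs,
# anchors/aliases are left unresolved as "&"/"*"-prefixed strings.
# This value is written by hand here; a real parser would produce it.
level1 = [
    ("base", "&b 10"),
    ("copy", "*b"),
    ("items", [(0, "1"), (1, "two")]),
]


def resolve_aliases(pairs, anchors=None):
    """Level 1 -> level 2: strip anchors and substitute aliases.
    Only scalar anchors are handled in this sketch."""
    anchors = {} if anchors is None else anchors
    out = []
    for key, value in pairs:
        if isinstance(value, list):
            value = resolve_aliases(value, anchors)
        elif value.startswith("&"):
            name, _, value = value[1:].partition(" ")
            anchors[name] = value
        elif value.startswith("*"):
            value = anchors[value[1:]]
        out.append((key, value))
    return out


def to_native(node):
    """Level 2 -> level 4: resolve scalar types, build lists and dicts.
    Implicit typing here is just 'integer or string'."""
    if isinstance(node, str):
        try:
            return int(node, 0)  # base 0 also accepts forms like "0xFF"
        except ValueError:
            return node
    # Assumption: a non-empty pair list with integer keys is a [] collection.
    if node and all(isinstance(k, int) for k, _ in node):
        return [to_native(v) for _, v in node]
    # Duplicate keys are dropped silently; the last one wins.
    return {k: to_native(v) for k, v in node}


level2 = resolve_aliases(level1)
level4 = to_native(level2)
```

Note that this sketch takes the "silently" option for duplicate keys, which is only one of the choices the proposal leaves open, and that an empty pair list is ambiguous between an empty sequence and an empty map.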
From: Steve H. <sh...@zi...> - 2002-09-10 19:57:28
|
We now have five model-related pages on the Wiki. http://wiki.yaml.org/yamlwiki/CategoryInfoModel Thanks for the work, guys. The next step might be to refactor and consolidate some of the info that's out there, to the extent that Tom, Clark, and others can come to a consensus. Cheers, Steve |
From: <ir...@ms...> - 2002-09-10 20:11:45
|
On Tue, Sep 10, 2002 at 01:40:24PM -0600, Tom Sawyer wrote:
> this is crazy! except for one level distinction this is exactly my model
> proposal, what i've now spent the last 24 hours working on non-stop.
>
> http://wiki.yaml.org/yamlwiki/TomsModels?action=show
>
> is it me or was it really that hard to comprehend before? well please
> give it another go. this time i think it will clarify some of your
> questions.

I was looking at that page as well as Clark's model e-mail while I was
writing this. Your page was the one that convinced me to add a level 0
for the "entire document as a string". I didn't know if it would be
useful, but I figure it's better to add it as a separate model now in
case it is, and then we can drop it later if we don't need it.

I followed Clark's model scheme rather than yours because I understood
how it related to the existing model hierarchy better, and how the
different levels relate to the user.

We are going in similar directions but with different emphases. You're
proposing a model structure and the concept that each level should be
importable/exportable to the user. I'm taking an existing model
(modifying it as little as necessary), and articulating how each level
relates to the user, and what specific data the user receives at each
level.

Also, I can't really follow your wording in the paragraphs. We can
discuss this further if you like.

--
-Mike (Iron) Orr, ir...@ms... (if mail problems: ms...@oz...)
http://iron.cx/ English * Esperanto * Russkiy * Deutsch * Espan~ol
|
From: Tom S. <tra...@tr...> - 2002-09-10 20:54:55
|
On Tue, 2002-09-10 at 14:11, Mike Orr wrote:
> Also, I can't really follow your wording in the paragraphs. We can
> discuss this further if you like.

yes, certainly. have you read it recently? took me awhile to get it
cleared up. but i'd be happy to clarify. i imagine the oddest part is
the use of semantic consumption. just think of that as turning something
that's a symbol into a meaningful entity. the reason i use consumption
is b/c the symbolic representation disappears and is replaced with the
semantic. (or nothing in the case of a nil-semantic).

-tom

--
tom sawyer, aka transami
tra...@tr...
|