From: Oren Ben-K. <or...@ri...> - 2002-07-02 15:50:51
Steve Howell [mailto:sh...@zi...] wrote:

> > - >
> >   This is paragraph
> >   Number 1.
> >
> >   There is 1 newline in between these.
> >   *This* accomplishes DWIM.
> >
> >
> >   There are 2 newlines in between these,
> >   and so on.
> > - |
> >   This is paragraph Number 1.
> >   There is 1 newline in between these. *This* accomplishes DWIM.
> >
> >   There are 2 newlines in between these, and so on.
> >
>
> I don't see how this is DWIM at all. YAML is robbing me of
> newlines when I don't expect them to be robbed. The only place
> I want newlines removed is inside the paragraph.

That is, there would be no way whatsoever to write the above example in a folded style? That seems wrong to me. I'd be forced to use escaped format to write the above? Ugh.

> Most folks separate paragraphs with empty lines. For example, you
> separate paragraphs in your email with empty lines. But now, for
> YAML, we are going to require an extra newline? This seems wrong
> to me.

Separating paragraphs with an empty line is exactly what I expect to happen... That's what the folded style *does*.

If you are using a folded format, your *internal* format is "newline to end a paragraph". Not two. If your *internal* format says "two newlines to separate a paragraph", it means you are doing another folding *on top* of the YAML folding. Fine, just write: this: | This is paragraph Number 1. There is 1 newline in between these. *This* accomplishes DWIM. There are 2 newlines in between these, and so on. And feed that into your own folding code.

If your problem is "how do I display folded text in a human-friendly way", the answer is "line-wrap it as if it will be written into a folded scalar". This gives you back your empty line. You *have* to do *something* to such text before displaying it to a user *anyway*, otherwise you will get extremely long lines which isn't what you mean.

If you have some OS or window system code that line-wraps the value but doesn't add an empty line, and you insist on displaying them, just double the newlines before you pass it to this code. In HTML, just change each line feed to </p><p>. And so on. The correct solution depends on the system used to display the text.

> > ... the above is the correct
> > interpretation, as a reading of the spec will quickly show:
> >
> > "... Thus a single line feed can be serialized as two, two
> > line feeds can be serialized as three, etc.".
>
> Yes, but then the spec also says this:
>
> In nested scalars, folding only applies to line feeds that separate
> text lines starting with a non-space character.

All the line feeds in the above examples satisfy this (except for the very final one and that one is controlled by the chomping modifier). Note the folding applies to "line feed*S*". A better phrasing would be "applies to *sequences of line feeds* that...".

> Hence, folding does not apply to
> leading line feeds, line feeds surrounding an empty line ending with a
> specific line break, or line feeds surrounding a text line
> that starts with a space character.

The above examples don't contain any of these exceptions. Finally:

> There's no compelling use case for smushed paragraphs.

I think that:

- An internal representation where a line feed represents a paragraph break and there's no trace of line wrapping is a common use case, and is supported perfectly by the folded scalar style.

- An internal representation where single line feeds are folded and double line feeds indicate a paragraph is a common use case and is supported perfectly by using the block scalar style.

- An internal representation where single line feeds are absolutely forbidden and a double line feed represents a paragraph is something I never heard of. As you point out, this internal representation format is not supported well by YAML.

Your proposal would make the first use case impossible.

For example:

> They'd still be representable in YAML under the more DWIMmish
> approach, although it might be a bit ugly. If we want to support
> smushed paragraphs better, perhaps we could say that a line that
> only has a "\" is not rendered, but forces the preceding line to
> have a newline?

And have to escape \ in folded scalars? No thanks.

Have fun,

Oren Ben-Kiki
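A minimal Python sketch of the N to N-1 reading Oren describes above. It is not from the thread: the names `unfold_paragraphs` and `to_html_paragraphs` are made up for illustration, and the code assumes every text line starts with a non-space character, so none of the spec's exceptions apply. The final line feed, which Oren says is governed by the chomping modifier, is simply stripped.

```python
import re


def unfold_paragraphs(folded):
    """Read a folded scalar's text using the N -> N-1 rule.

    Assumption (hypothetical helper): no line starts with a space and
    there are no leading line feeds, so folding applies to every run.
    """
    def replace_run(match):
        n = len(match.group(0))
        # A single line feed is only a line wrap and becomes a space;
        # a run of N > 1 line feeds stands for N - 1 real line feeds.
        return " " if n == 1 else "\n" * (n - 1)

    # Trailing line feeds are left to the chomping modifier; drop them here.
    return re.sub(r"\n+", replace_run, folded.strip("\n"))


def to_html_paragraphs(text):
    """Display helper: change each remaining line feed to </p><p>."""
    return "<p>" + text.replace("\n", "</p><p>") + "</p>"
```

Fed the folded example quoted at the top of the message, `unfold_paragraphs` should give the three-line literal form shown next to it; `to_html_paragraphs` is one of the display options Oren mentions, and doubling the line feeds instead would be `text.replace("\n", "\n\n")`.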
From: Steve H. <sh...@zi...> - 2002-07-02 16:26:06
What's the convention for separating paragraphs in underlying data? Do folks typically separate paragraphs with one line feed or two?

I know that when I typed this particular email document, I hit two line feeds between each paragraph, so at least the underlying human representation (for me) is two newlines. I am not sure about the machine's.

Would Oren argue that having to hit two newlines is just a weakness of my mailer?

I would argue that a "single newline" is used to separate lines, not paragraphs. You need a "double newline" to separate paragraphs. Most text renderers other than YAML then represent the double newlines verbatim. Then, to address variable width constraints, they insert additional line feeds near column 80, or near column 132, etc.
From: Clark C . E. <cc...@cl...> - 2002-07-02 16:43:41
On Tue, Jul 02, 2002 at 12:25:45PM -0400, Steve Howell wrote:
| What's the convention for separating paragraphs in underlying data?
| Do folks typically separate paragraphs with one line feed or two?

In previous systems I've worked on, we used a single carriage return to separate each paragraph. We could have used two carriage returns... but I don't see the need.

| I know that when I typed this particular email document, I hit two line feeds
| between each paragraph, so at least the underlying human representation (for
| me) is two newlines. I am not sure about the machine's.

Yes, but you also used a carriage return on each line...

| Would Oren argue that having to hit two newlines is just a weakness of my
| mailer?

IMHO, it's a weakness of terminals that don't word-wrap automatically.

| I would argue that a "single newline" is used to separate lines, not
| paragraphs. You need a "double newline" to separate paragraphs. Most text
| renderers other than YAML then represent the double newlines verbatim. Then,
| to address variable width constraints, they insert additional line feeds near
| column 80, or near column 132, etc.

When I have a system that properly word-wraps and formats (most good word processors), I use a *single* ENTER to separate paragraphs. This is the norm. When dealing with clients that don't wrap for us, we are forced to use ENTER to break lines, and thus use ENTER ENTER for paragraph separators; but once again, this is a limitation of the client, not an ideal state of being where a single ENTER is used to split paragraphs.

I hope my previous e-mail is illuminating. I think the problem isn't the old folded behavior (N -> N-1) but rather the transition between paragraphs and indented sections.

Best,

Clark
From: Steve H. <sh...@zi...> - 2002-07-02 16:54:58
From: "Clark C . Evans" <cc...@cl...>
> On Tue, Jul 02, 2002 at 12:25:45PM -0400, Steve Howell wrote:
> | What's the convention for separating paragraphs in underlying data?
> | Do folks typically separate paragraphs with one line feed or two?
>
> In previous systems I've worked on, we used a single carriage
> return to separate each paragraph. We could have used two
> carriage returns... but I don't see the need.

On a typewriter, you would hit two carriage returns between paragraphs to create the text you see here. On paper, you would see two line feeds. On your terminal, you would see two line feeds.

In your underlying representation, though, there would only be one paragraph separating the paragraphs. But, then, in YAML, I'd have to supply three carriage returns. THREE carriage returns! (Until I got my YAML-optimized editor, I guess.)

Why don't we just be consistent? Let's assume most apps use two carriage returns as the paragraph separator. Then, let's have YAML render the two carriage returns as two carriage returns. A human typing YAML would then also use two carriage returns to see the YAML. Then, when the YAML was read by another human, they would know that the two carriage returns actually mean two carriage returns. I don't think this will be two difficult for most people two understand.

Thanks,

Steve
From: Steve H. <sh...@zi...> - 2002-07-02 17:20:28
Retracting some of this, sorry...

----- Original Message -----
From: "Steve Howell" <sh...@zi...>

> > On Tue, Jul 02, 2002 at 12:25:45PM -0400, Steve Howell wrote:
> > | What's the convention for separating paragraphs in underlying data?
> > | Do folks typically separate paragraphs with one line feed or two?
> >
> > In previous systems I've worked on, we used a single carriage
> > return to separate each paragraph. We could have used two
> > carriage returns... but I don't see the need.
>
> On a typewriter, you would hit two carriage returns between paragraphs to
> create the text you see here. On paper, you would see two line feeds. On
> your terminal, you would see two line feeds.

Keep above.

>
> In your underlying representation, though, there would only be one paragraph
> separating the paragraphs. But, then, in YAML, I'd have to supply three
> carriage returns. THREE carriage returns! (Until I got my YAML-optimized
> editor, I guess.)
>

Retract above. Got off-by-oned.

> Why don't we just be consistent? Let's assume most apps use two carriage
> returns as the paragraph separator. Then, let's have YAML render the two
> carriage returns as two carriage returns. A human typing YAML would then
> also use two carriage returns to see the YAML. Then, when the YAML was read
> by another human, they would know that the two carriage returns actually mean
> two carriage returns. I don't think this will be two difficult for most
> people two understand.
>

Keep above.

> Thanks,
>
> Steve
>
From: Clark C . E. <cc...@cl...> - 2002-07-02 17:27:19
On Tue, Jul 02, 2002 at 12:54:35PM -0400, Steve Howell wrote:
| From: "Clark C . Evans" <cc...@cl...>
| > On Tue, Jul 02, 2002 at 12:25:45PM -0400, Steve Howell wrote:
| > | What's the convention for separating paragraphs in underlying data?
| > | Do folks typically separate paragraphs with one line feed or two?
| >
| > In previous systems I've worked on, we used a single carriage
| > return to separate each paragraph. We could have used two
| > carriage returns... but I don't see the need.
|
| On a typewriter, you would hit two carriage returns between paragraphs to
| create the text you see here. On paper, you would see two line feeds. On
| your terminal, you would see two line feeds.

You are mixing formatting with content. When you press ENTER in Microsoft Word, it spaces things out very nicely. Only one ENTER there. The ENTER signifies the end of the paragraph.

| In your underlying representation, though, there would only be one paragraph
| separating the paragraphs. But, then, in YAML, I'd have to supply three
| carriage returns. THREE carriage returns! (Until I got my YAML-optimized
| editor, I guess.)

At least you have a *way* to express a double ENTER after each paragraph. If we went with your suggestion, then I'd not have a way to express a single ENTER. This, IMHO, is a show stopper for your suggestion.

| Why don't we just be consistent? Let's assume most apps use two carriage
| returns as the paragraph separator. Then, let's have YAML render the two
| carriage returns as two carriage returns. A human typing YAML would then
| also use two carriage returns to see the YAML. Then, when the YAML was read
| by another human, they would know that the two carriage returns actually mean
| two carriage returns. I don't think this will be two difficult for most
| people two understand.

If we wanted to be consistent, then we wouldn't even have "folded" style at all; as the literal block does just fine. The whole point of folded style is to allow for information to be encoded in a way that is readable in your standard 76 column limited terminal. My content that I want to display nice in folded land, has very long paragraphs separated by a single line. This is my use case and the whole reason why I pushed for folded to begin with.

I've already conceded that carriage returns surrounding indented blocks should be significant; which is IMHO, a great idea. However, trying to change the established inter-paragraph rules is a dead-on-arrival proposition.

;) Clark
From: Steve H. <sh...@zi...> - 2002-07-02 17:59:03
> | Why don't we just be consistent? Let's assume most apps use two carriage
> | returns as the paragraph separator. Then, let's have YAML render the two
> | carriage returns as two carriage returns. A human typing YAML would then
> | also use two carriage returns to see the YAML. Then, when the YAML was read
> | by another human, they would know that the two carriage returns actually mean
> | two carriage returns. I don't think this will be two difficult for most
> | people two understand.
>
> If we wanted to be consistent, then we wouldn't even have "folded"
> style at all; as the literal block does just fine. The whole point
> of folded style is to allow for information to be encoded in a way
> that is readable in your standard 76 column limited terminal. My
> content that I want to display nice in folded land, has very long
> paragraphs separated by a single line. This is my use case and the
> whole reason why I pushed for folded to begin with.

Okay, since it's your use case, I won't argue any more. Let's document it better. I like the technique of showing the same data in both representations. Showing things in YAML goes a lot further than describing them in English. The spec might also talk about the use case a little more. Clearly, YAML's folding format will be optimized for the one-newline-between-paragraphs internal format. That's reasonable.

>
> I've already conceded that carriage returns surrounding indented
> blocks should be significant; which is IMHO, a great idea. However,
> trying to change the established inter-paragraph rules is a
> dead-on-arrival proposition. ;) Clark
>

Examples of that would be good too.

Thanks,

Steve
From: Steve H. <sh...@zi...> - 2002-07-02 17:11:53
Clarifying a few points below...

> | I know that when I typed this particular email document, I hit two line feeds
> | between each paragraph, so at least the underlying human representation (for
> | me) is two newlines. I am not sure about the machine's.
>
> Yes, but you also used a carriage return on each line...
>

Nope. The mailer DWIMmed that for me.

> | Would Oren argue that having to hit two newlines is just a weakness of my
> | mailer?
>
> IMHO, it's a weakness of terminals that don't word-wrap automatically.

But my mailer did word-wrap automatically. It just didn't smush my paragraphs automatically for me, because I don't want them smushed.

>
> | I would argue that a "single newline" is used to separate lines, not
> | paragraphs. You need a "double newline" to separate paragraphs. Most text
> | renderers other than YAML then represent the double newlines verbatim. Then,
> | to address variable width constraints, they insert additional line feeds near
> | column 80, or near column 132, etc.
>
> When I have a system that properly word-wraps and formats (most
> good word processors), I use a *single* ENTER to separate paragraphs.
> This is the norm. When dealing with clients that don't wrap for us,
> we are forced to use ENTER to break lines, and thus use ENTER ENTER
> for paragraph separators; but once again, this is a limitation of
> the client, not an ideal state of being where a single ENTER is used
> to split paragraphs.

Sure, some word processors word wrap automatically and convert a single ENTER into a paragraph separator. Many text editors (including EditPlus and the text editor for Outlook Plus) word wrap optionally for you, but they still make you hit two ENTERs at the end of a paragraph. This is fine, since you want a sense of closure at the end of the paragraph anyway. Folks who use YAML will be entering YAML in text editors, not in Word. Therefore, we want to optimize for them.

The "n MINUS 1" rule for interpreting YAML becomes an "n PLUS 1" rule for the person typing the YAML.

> I hope my previous e-mail is illuminating. I think the problem isn't
> the old folded behavior (N -> N-1) but rather the transition between
> paragraphs and indented sections.
>

It's a tough problem. I see how the "N-1" rule solves the smushed paragraph problem, but I wonder if the folded format should even address the smushed paragraphs.

Suppose YAML keeps the "N-1" rule. If I were to type text into YAML, I would certainly separate paragraphs with two newlines. Then I guess I'd have to adapt my application to use single newlines as separators internally, and then I'd put the newline back when I rendered it again, following the "N+1" rule.
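The write-side "N+1" rule Steve describes can be sketched in Python roughly as follows. This is an illustration only, not code from the thread: `fold_for_yaml` is a hypothetical name, the 76-column default is borrowed from Clark's earlier remark about terminals, and the sketch assumes the internal text has no leading or trailing line feeds and no lines that start with a space.

```python
import re
import textwrap


def fold_for_yaml(text, width=76):
    """Emit internal text as the body of a folded scalar (N -> N+1).

    Hypothetical helper: each run of N line feeds in the internal text
    is written as N + 1 line feeds, and each paragraph is line-wrapped
    so that the added single line feeds fold back into spaces on read.
    """
    out = []
    # The capturing group keeps the line-feed runs in the split result.
    for part in re.split(r"(\n+)", text):
        if part.startswith("\n"):
            out.append("\n" * (len(part) + 1))
        else:
            # Only break at spaces, so unfolding restores the same text.
            out.append(textwrap.fill(part, width=width,
                                     break_long_words=False,
                                     break_on_hyphens=False))
    return "".join(out)
```

Reading the result back with the "N-1" rule should return the original internal text, as long as that text uses single spaces between words.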
From: Clark C . E. <cc...@cl...> - 2002-07-02 16:32:44
For folded scalar, let's define two types of chunks, paragraph chunks and indented chunks. The problem, as I see it, is the transition between chunks. Between paragraph chunks we want the N-1 rule (so that one new line is possible as Oren wants). The problem is the transition between indented and paragraph chunks...

It seems that this discussion is emergent since the spec isn't exactly precise here; and whatever rule we choose it should be easy to grok but yet be intuitive. How about...

  New lines inside paragraphs are folded.
  New lines between paragraphs follow the N-1 rule
  All other new lines are significant

That should cover it. Here is an example of the permutations:

folded: >
  This is one
  paragraph.

  This is another
  paragraph.


  This is a third
  paragraph.
    1. This is indented
    2. List

    3. This is a continuation
    4. Of the indented list

  This is the fourth
  paragraph.

equals: |
  This is one paragraph.
  This is another paragraph.

  This is a third paragraph.
    1. This is indented
    2. List

    3. This is a continuation
    4. Of the indented list

  This is the fourth paragraph.
...

Clark

--
Clark C. Evans Axista, Inc.
http://www.axista.com 800.926.5525
XCOLLA Collaborative Project Management Software
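Clark's three rules can be sketched in Python along these lines. This is a hedged illustration, not a YAML parser and not code from the thread: `unfold_chunks` is a made-up name, the input is assumed to be the scalar's text with the block indentation already removed, and "indented" is taken to mean "starts with a space".

```python
def unfold_chunks(folded):
    """Apply the three proposed rules to a folded scalar's text.

    Hypothetical helper. Rules, as proposed in the message above:
      1. a line feed inside a paragraph is folded into a space;
      2. N line feeds between paragraph lines become N - 1;
      3. any line feed next to an indented line is kept as-is.
    """
    out = []
    empty_lines = 0        # empty lines seen since the last text line
    prev_indented = False
    for line in folded.strip("\n").split("\n"):
        if line.strip() == "":
            empty_lines += 1
            continue
        indented = line.startswith(" ")
        if out:
            n = empty_lines + 1  # line feeds between the two text lines
            if indented or prev_indented:
                out.append("\n" * n)        # rule 3: significant
            elif n == 1:
                out.append(" ")             # rule 1: folded
            else:
                out.append("\n" * (n - 1))  # rule 2: N-1
        out.append(line)
        empty_lines = 0
        prev_indented = indented
    return "".join(out)
```

Run on the `folded` example above (with the list items indented relative to the paragraphs), this should produce the text shown under `equals`; the trailing line feed is ignored here, since chomping is a separate question.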
From: Brian I. <in...@tt...> - 2002-07-02 16:43:58
On 02/07/02 12:38 -0400, Clark C . Evans wrote:
> For folded scalar, let's define two types of chunks, paragraph
> chunks and indented chunks. The problem, as I see it, is
> the transition between chunks. Between paragraph chunks we
> want the N-1 rule (so that one new line is possible as Oren wants).
> The problem is the transition between indented and paragraph
> chunks...
>
> It seems that this discussion is emergent since the spec isn't
> exactly precise here; and whatever rule we choose it should be
> easy to grok but yet be intuitive. How about...
>
>   New lines inside paragraphs are folded.
>   New lines between paragraphs follow the N-1 rule
>   All other new lines are significant

I think we all agree on these semantics, except what comes between paragraphs. But thanks for the nice document example :)

We just need to decide if the N-1 rule is correct. It's simply a matter of which is *more* correct. Tough problem. Let's keep thinking on it.

>
> That should cover it. Here is an example of the permutations:
>
> folded: >
>   This is one
>   paragraph.
>
>   This is another
>   paragraph.
>
>
>   This is a third
>   paragraph.
>     1. This is indented
>     2. List
>
>     3. This is a continuation
>     4. Of the indented list
>
>   This is the fourth
>   paragraph.
>
> equals: |
>   This is one paragraph.
>   This is another paragraph.
>
>   This is a third paragraph.
>     1. This is indented
>     2. List
>
>     3. This is a continuation
>     4. Of the indented list
>
>   This is the fourth paragraph.
> ...
>
> Clark
From: Clark C . E. <cc...@cl...> - 2002-07-02 16:51:18
On Tue, Jul 02, 2002 at 09:43:48AM -0700, Brian Ingerson wrote:
| > New lines inside paragraphs are folded.
| > New lines between paragraphs follow the N-1 rule
| > All other new lines are significant
|
| I think we all agree on these semantics, except what comes between
| paragraphs. But thanks for the nice document example :)
| We just need to decide if the N-1 rule is correct. It's simply a matter of
| which is *more* correct. Tough problem. Let's keep thinking on it.

IMHO, it's clear cut. One ENTER vs two ENTERs is a formatting matter (not a content matter). Area inside and surrounding the indented segments are where formatting matters and carriage returns should be preserved. However, inside paragraphs and between paragraphs is a content issue and not where formatting is preserved. Between paragraphs you want exactly one enter, not two.

Best,

Clark