Re: [Xsltforms-support] fn:tokenize in XSLTForms
From: Tim T. <tim...@gm...> - 2015-11-29 20:55:05
Alain,

This sounds great so far. Maybe it could even form the basis for more robust mixed content editing extensions in XForms. There are use cases (like TEI annotations[1]) that are not easily satisfied by RTE plugins, which were designed for HTML editing. An XSLTForms Markdown parser could also be interesting :-)

Regarding your questions, internal whitespace should be preserved when the data is serialized, and any final punctuation (like periods) should be maintained. Libraries in the U.S. follow the ISBD standard[2], which defines how punctuation should be used in catalog records. Semicolons are usually preceded and followed by a single whitespace character.

Also, I'm not sure about the first item in sub-values having a special meaning. It is usually capitalized, but that is just a formatting issue, not because it differs from the other values.

It's really too bad that, after creating structured data, it should have to be serialized back to a string, but that is the limitation of current cataloging formats (which is why we are trying to move to a new RDF-based model).

Thank you for your work on this!

Tim

[1] Winona Salesky has done some very interesting work in this regard: https://github.com/srophe/tei-editor
[2] https://en.wikipedia.org/wiki/International_Standard_Bibliographic_Description

--
Tim A. Thompson
Metadata Librarian (Spanish/Portuguese Specialty)
Princeton University Library

On Sat, Nov 28, 2015 at 3:03 AM, Alain Couthures <ala...@ag...> wrote:
> Tim,
>
> This is now partially implemented and I would be happy to have your remarks.
>
> Enabling this feature for just some nodes is not easy with mediatype parameters. So, I have created a new action named "split" to create an array node according to a separator and to perform left and right trim according to a regular expression.
>
> For your test form, it looks like this:
>
> <xf:split ev:event="xforms-model-construct-done"
>     ref="instance('result')/marc:collection/marc:record/marc:datafield[@tag = '245']/marc:subfield[@code = 'c']"
>     separator=";" left-trim="^\s\s*" right-trim="(\s|\.)\s*$"/>
> <xf:split ev:event="xforms-model-construct-done"
>     ref="instance('result')/marc:collection/marc:record/marc:datafield[@tag = '508']/marc:subfield[@code = 'a']"
>     separator=";" left-trim="^\s\s*" right-trim="(\s|\.)\s*$"/>
> <xf:split ev:event="xforms-model-construct-done"
>     ref="instance('result')/marc:collection/marc:record/marc:datafield[@tag = '508']/marc:subfield[@code = 'a']/array()/text()"
>     separator="," left-trim="^\s\s*" right-trim="(\s|\.)\s*$"/>
> <xf:split ev:event="xforms-model-construct-done"
>     ref="instance('result')/marc:collection/marc:record/marc:datafield[@tag = '511']/marc:subfield[@code = 'a']"
>     separator="," left-trim="^\s\s*" right-trim="(\s|\.)\s*$"/>
>
> As you can see, this action can be applied more than once on the same subfield in case of embedded separators such as ";" then ",". The right trim regular expression is able to remove the trailing ".". Maybe the "^" and the "$" should be automatically added...
>
> I have not yet implemented the new "+" feature for input so I have just tested with static input controls.
> It should be possible to add XForms triggers and actions to add/remove text items but I have not tested this yet:
>
> <xf:group ref="marc:datafield[@tag = '245']/marc:subfield[@code = 'c']/array()">
>   <xf:repeat nodeset="text()">
>     <xf:input ref=".">
>       <xf:label>Production credit: </xf:label>
>     </xf:input>
>   </xf:repeat>
> </xf:group>
>
> and
>
> <xf:group ref="marc:datafield[@tag = '508']/marc:subfield[@code = 'a']/array()">
>   <xf:repeat nodeset="array()">
>     <xf:input ref="text()">
>       <xf:label>Role: </xf:label>
>     </xf:input>
>     <xf:repeat nodeset="node()[preceding-sibling::node()]">
>       <xf:input ref=".">
>         <xf:label>Role by: </xf:label>
>       </xf:input>
>     </xf:repeat>
>   </xf:repeat>
> </xf:group>
>
> I have seen that the first item in sub-values has a special meaning so I have isolated it from the corresponding repeat. The corresponding XPath expression should naturally be "text()[preceding-sibling::text()]" but the XPath parser currently has an issue with it so I just tested with the more general expression "node()[preceding-sibling::node()]".
>
> This is not serialized back yet because I have questions about this. While the parsing separator can be just a single character, would you not prefer to have a serialized return with a separator followed by a single space character? Should the ending "." be added back? If so, my idea is to add attributes to the split action to allow the serializer to perform this. These parameters could be stored as attributes of the array node.
>
> What do you think?
>
> Alain
>
> On 20/11/2015 at 18:50, Tim Thompson wrote:
>
> Alain,
>
> This approach sounds very promising! Yes, I think it would be preferable to have the ability to specifically enable/disable the "+" feature as you suggest.
>
> Best regards,
> Tim
>
> --
> Tim A. Thompson
> Metadata Librarian (Spanish/Portuguese Specialty)
> Princeton University Library
>
> On Fri, Nov 20, 2015 at 12:13 PM, Alain Couthures <ala...@ag...> wrote:
>
>> Tim,
>>
>> The best approach could be to mix the two approaches:
>>
>> - let's define a mediatype such as "application/xml+csv" with an optional parameter named "separator"
>> - add support for this new mediatype in Fleur, my new DOM implementation I presented at XML Amsterdam, so each text value containing a separator is split into an array node with as many text nodes as items
>> - add support of arrays in XSLTForms input controls: as many HTML inputs as items with "+" and "-" buttons for each to add and delete items (every XForms input control for single strings will actually have a "+" button to allow generating an array from them)
>>
>> Would it be better for you to specifically enable/disable the "+" feature for, let's say, attributes, some elements, depending on the namespace, ...?
>>
>> What do you think?
>>
>> Alain
>>
>> On 20/11/2015 at 04:26, Tim Thompson wrote:
>>
>> Yes, library catalog data is notorious for using punctuation to indicate structure (a legacy of the card catalog days).
>>
>> Your proposal seems like a great, declarative solution, and I think it would be a valuable extension for XSLTForms. It could also be interesting if the XForms 2.0 CSV2XML mapping could be applied to the text content of individual elements (rather than only instances), so that it could be accessed using XPath.
>>
>> Best regards,
>> Tim
>>
>> --
>> Tim A. Thompson
>> Metadata Librarian (Spanish/Portuguese Specialty)
>> Princeton University Library
>>
>> On Thu, Nov 19, 2015 at 3:36 PM, Alain Couthures <ala...@ag...> wrote:
>>
>>> Tim,
>>>
>>> This is a very interesting test case because it shows that some elements might contain multiple values separated by ";". It looks like CSV contents within elements!
>>>
>>> I don't think fn:tokenize is the right tool for this situation: it is designed for returning strings, not nodes to be bound to controls. It is true that XSLTForms is using faked nodes for returning what can be seen as a sequence but it is just a trick for an XPath 1.0 engine.
>>>
>>> I have been thinking of a way of splitting the text node which is the child of the element into many and, then, it would be possible to bind them individually for editing. I am not sure that all DOM implementations in browsers will support adjacent text nodes without concatenating them. An XPath function should not modify nodes so a specific action would be required...
>>>
>>> It would be better, first, to consider that the element value has a specific data type, which is a restriction of xsd:string. As for RTE, the separator to be recognized can be associated with this data type. Then, the input control bound to each element with this data type will actually be rendered by XSLTForms with multiple HTML inputs, as many as sub-values, and with buttons to add and remove inputs. This would be developed as an XForms extension in XSLTForms.
>>>
>>> What do you think?
>>>
>>> Alain
>>>
>>> On 18/11/2015 at 23:34, Tim Thompson wrote:
>>>
>>> Alain,
>>>
>>> Thank you for your reply. Attached is a test case using fn:tokenize to create input controls. The purpose is to edit library catalog records for audiovisual material and transcribe information about production credits.
>>>
>>> Best regards,
>>> Tim
>>>
>>> --
>>> Tim A. Thompson
>>> Metadata Librarian (Spanish/Portuguese Specialty)
>>> Princeton University Library
>>>
>>> On Wed, Nov 18, 2015 at 3:57 PM, Alain Couthures <ala...@ag...> wrote:
>>>
>>>> Tim,
>>>>
>>>> Currently, fn:tokenize() is returning faked text nodes. This is good enough for xf:output and should also work with xf:setvalue.
>>>>
>>>> It would still be easy to create standalone text nodes owned by the default instance to do the trick more appropriately for complex cases.
>>>>
>>>> Could you please send me a test case?
>>>>
>>>> Thanks!
>>>>
>>>> --Alain
>>>>
>>>> On 18/11/2015 at 20:02, Tim Thompson wrote:
>>>>
>>>> Alain,
>>>>
>>>> Can fn:tokenize be used to update instance data with xf:input, or only to display data with xf:output? I have XML data with elements that contain semicolon-delimited strings that I would like to split across separate input controls, but I guess the data can't be updated once it is tokenized?
>>>>
>>>> Best wishes,
>>>> Tim
>>>>
>>>> --
>>>> Tim A. Thompson
>>>> Metadata Librarian (Spanish/Portuguese Specialty)
>>>> Princeton University Library
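
For readers who want to try the split-and-trim behaviour discussed in this thread outside a form, here is a minimal Python sketch. It is not XSLTForms code and does not reproduce the actual "split" action: the function names `split_value` and `join_values` are hypothetical, the sample credit string is made up, and the serialization rule (a space on each side of the semicolon, final period restored) is only an assumption based on Tim's description of ISBD punctuation. The two trim patterns are the ones shown in the xf:split examples.

```python
import re

# Trim patterns copied from the xf:split examples in the thread.
LEFT_TRIM = r"^\s\s*"
RIGHT_TRIM = r"(\s|\.)\s*$"


def split_value(value, separator, left_trim=LEFT_TRIM, right_trim=RIGHT_TRIM):
    """Split a delimited string and trim each item with the two regexes.

    Hypothetical helper: it mimics the idea of the "split" action on a
    plain string, not on DOM nodes.
    """
    items = []
    for raw in value.split(separator):
        item = re.sub(left_trim, "", raw)    # drop leading whitespace
        item = re.sub(right_trim, "", item)  # drop trailing whitespace or "."
        items.append(item)
    return items


def join_values(items, joiner=" ; ", final="."):
    """Serialize items back to one string (hypothetical helper).

    Assumes ISBD-style spacing around the semicolon and a restored
    final period, as described in the thread.
    """
    return joiner.join(items) + final


# Made-up sample value, not taken from a real catalog record.
credits = "produced by Jane Roe ; directed by John Doe."
items = split_value(credits, ";")
print(items)
print(join_values(items))
```

With this sample, splitting and then joining gives back the original string, which is the round trip the serialization questions above are about. A second pass of `split_value(item, ",")` over each item would correspond to the nested ";" then "," case.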