From: Kostis A. <ank...@gm...> - 2011-05-22 13:41:59
Hi,

I'm using Yaron Koren's very useful extension, ExternalData, to parse and semantically annotate data found in XML files within Maven repositories.
Yet the pom.xml files have repetitive, deep structures that the plain 'xml' format cannot handle well.
This is the perfect job for XPath.

So I added (SimpleXML) XPath capability to this extension by adding a new format='XPath' (to remain backward-compatible).

I think the same capability could be used by other formats, so a better name for the new XPath family of formats would be:
  'XML with XPath',
  'JSON with XPath',
  ...
Or it may be more appropriate to add a new parameter controlling whether the contents of external variables are XPath expressions.

I leave it up to you, Yaron, to decide.

See the attached patch against ExternalData-1.3.1.
It is moderately tested with the test cases below (no namespaces). It is not optimal (it re-scans the file for each XPath expression), but it gets the job done.

Sample usage and test cases are given below:

test.xml
------------
<root>
  <child attr="attr-val1" >ch1-val1</child>
  <child attr="attr-val2" >
    <grandchild attr="gattr-val1">gchild1</grandchild>
  </child>
  <child attr="attr-val3" >
    <grandchild attr="gattr-val2">gchild2</grandchild>
  </child>
  <child attr="attr-val4" >ch1-val4</child>
</root>

TEST PAGE:
------------------
{{#get_web_data:
url=http://some.net/test.xml
|format=xpath
|data=v0=/bad
v1=//child,
v2=child,
v3=/root/child,
v4=/root/child[4],
v5=//grandchild,
v6=//grandchild[@attr='gattr-val2'],
v7=child[@attr='attr-val2']/grandchild,
}}
* v0={{#external_value:v0}}
* v1={{#external_value:v1}}
* v2={{#external_value:v2}}
* v3={{#external_value:v3}}
* v4={{#external_value:v4}}
* v5={{#external_value:v5}}
* v6={{#external_value:v6}}
* v7={{#external_value:v7}}

* v0-table: {{#for_external_table: {{{v0}}}, }}
* v1-table: {{#for_external_table: {{{v1}}}, }}
* v3-table: {{#for_external_table: {{{v3}}}, }}
* v4-table: {{#for_external_table: {{{v4}}}, }}
* v5-table: {{#for_external_table: {{{v5}}}, }}
* v6-table: {{#for_external_table: {{{v6}}}, }}

RESULTS:
---------------
v0=
v1=ch1-val1
v2=ch1-val1
v3=ch1-val1
v4=ch1-val4
v5=gchild1
v6=gchild2
v7=gchild1
v0-table:
v1-table: ch1-val1,ch1-val4,
v3-table: ch1-val1,ch1-val4,
v4-table: ch1-val4,
v5-table: gchild1,gchild2,
v6-table: gchild2,

Regards,
Kostis Anagnostopoulos
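For readers who want to try the XPath expressions from the test page outside MediaWiki, here is a minimal sketch in Python. It is not the patch's code: the patch uses PHP's SimpleXML, while this uses the standard library's xml.etree.ElementTree, which supports only a subset of XPath. The helper name `evaluate` is hypothetical, and the synthetic "document" parent is a workaround assumed here so that absolute paths such as /root/child can be resolved. Reading the results above, #external_value appears to show the first match and #for_external_table appears to iterate over all matches; the sketch returns the full list so both views can be inspected.

```python
import xml.etree.ElementTree as ET

# The test.xml document from the message above.
TEST_XML = """<root>
  <child attr="attr-val1">ch1-val1</child>
  <child attr="attr-val2">
    <grandchild attr="gattr-val1">gchild1</grandchild>
  </child>
  <child attr="attr-val3">
    <grandchild attr="gattr-val2">gchild2</grandchild>
  </child>
  <child attr="attr-val4">ch1-val4</child>
</root>"""

# The variable-to-XPath mappings from the test page.
EXPRESSIONS = {
    "v0": "/bad",
    "v1": "//child",
    "v2": "child",
    "v3": "/root/child",
    "v4": "/root/child[4]",
    "v5": "//grandchild",
    "v6": "//grandchild[@attr='gattr-val2']",
    "v7": "child[@attr='attr-val2']/grandchild",
}


def evaluate(root, expr):
    """Return the stripped text of every element matching expr.

    Hypothetical helper: relative paths are evaluated from the root
    element; absolute paths are evaluated from a synthetic parent,
    because ElementTree has no document node of its own.
    """
    if expr.startswith("/"):
        doc = ET.Element("document")  # assumed stand-in for the document node
        doc.append(root)
        matches = doc.findall("." + expr)
    else:
        matches = root.findall(expr)
    return [(el.text or "").strip() for el in matches]


if __name__ == "__main__":
    root = ET.fromstring(TEST_XML)
    for name, expr in EXPRESSIONS.items():
        values = evaluate(root, expr)
        first = values[0] if values else ""
        print(f"{name}={first}  all={values}")
```

The two middle `child` elements contain only whitespace and a `grandchild`, so in this sketch they show up as empty strings in the full list for v1 and v3. How the extension itself renders such empty values is not something this sketch tries to reproduce.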
From: Yaron K. <ya...@wi...> - 2012-06-22 15:39:22
Hi Kostis,

Thanks for this patch. I finally checked it in to the External Data code - I don't know why it took over a year to take care of it, and I wish I had done it sooner, since I think this will be a useful feature in a lot of contexts. Anyway, better late than never, I hope.

I plan to release a new version of External Data, 1.4, with this feature included, soon.

-Yaron

--
WikiWorks · MediaWiki Consulting · http://wikiworks.com