Thread: XPath Search
From: Tod M. <tod...@gm...> - 2006-01-14 19:55:45
|
I just found this class and it seems to offer exactly what I want. I'm new to it & xpath though so I'm slowly learning how to do things.

I want to perform a search on my XML database for a string entered by a user. Say for instance "cabinet". From this I want to search several XML files of the following structure:

    <?xml version="1.0"?>
    <!DOCTYPE product PUBLIC "-//RYE//DTD IMAGEDB 1.0 Strict//EN" "component.dtd">
    <product>
      <component id="wall_cab">
        <title>Wall Cabinet</title>
        <image src="/images/cabinetry/wall_cab.jpg" />
        <link href="/products/cabinetry/wall.html" />
        <feature>Sturdy double wall door </feature>
        <feature>Door has perforated inside wall for hanging hooks for tools etc</feature>
        <feature>Solid steel construction 20-Gauge (.036") thick </feature>
        <feature>One shelf included 24" x 10" x 0.6" </feature>
        <keywords>tools sports sporting </keywords>
      </component>
      <component id="component2">
        ...
      </component>
    </product>

I have added keywords to enhance the search functionality. What I want to do is extract the title, image, and link of any product that contains, in any tag, the string entered by the user.

I'm not exactly sure how to do this. Can someone point me in the right direction? I know it's asking a lot, but even a quick pointer would help me out. I'm not a very experienced PHP programmer.

Thanks very much,

Brad
From: Nigel S. <nig...@us...> - 2006-01-29 03:11:57
|
Try stuff like this:

    // Find all nodes that have a feature element that contains the word
    // cabinet, case insensitively
    $aMatches = $Object->match("//*[contains(x-tolower(feature),'cabinet')]");

    // Cycle through the nodes we find
    foreach ($aMatches as $Match) {
        // $Match will be something like /Product[1]/Component[3]
        echo $Object->getAttributes($Match.'/image', 'src');
        echo $Object->wholeText($Match.'/title');
    }

Sorry it took so long to respond, I was hoping someone else on the list would beat me to it.

Nigel
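A note on the "in any tag" part of the question: in standard XPath 1.0, contains(., 'text') applied to a component tests the combined text of all its descendant elements (attribute values are not included), and translate() can stand in for a lower-casing function. The following is a minimal sketch under stated assumptions, not the PHP.XPath API from this thread: it uses PHP's bundled DOM extension (DOMDocument and DOMXPath), the helper name searchComponents is made up for illustration, and the second component in the sample data is hypothetical because the original elides it.

```php
<?php
// Sketch only: uses PHP's bundled DOM extension, not the PHP.XPath class.
// The second <component> is hypothetical sample data.
$xml = <<<XML
<product>
  <component id="wall_cab">
    <title>Wall Cabinet</title>
    <image src="/images/cabinetry/wall_cab.jpg" />
    <link href="/products/cabinetry/wall.html" />
    <feature>Sturdy double wall door</feature>
    <feature>Door has perforated inside wall for hanging hooks for tools etc</feature>
    <keywords>tools sports sporting</keywords>
  </component>
  <component id="hypothetical_shelf">
    <title>Example Shelf</title>
    <image src="/images/example.jpg" />
    <link href="/products/example.html" />
    <feature>Made-up second entry</feature>
    <keywords>garage</keywords>
  </component>
</product>
XML;

// Hypothetical helper: case-insensitive search over all text inside each component.
function searchComponents(DOMXPath $xp, string $term): array
{
    $upper = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
    $lower = 'abcdefghijklmnopqrstuvwxyz';

    // Keep only letters, digits and spaces so the term cannot break out of
    // the quoted XPath string literal.
    $needle = preg_replace('/[^a-z0-9 ]/', '', strtolower($term));
    if ($needle === '') {
        return []; // contains(x, '') is always true, so refuse empty searches
    }

    // XPath 1.0 has no lower-case(); translate() maps A-Z onto a-z instead.
    $query = "//component[contains(translate(., '$upper', '$lower'), '$needle')]";

    $hits = [];
    foreach ($xp->query($query) as $component) {
        // Relative expressions are evaluated against the matched component.
        $hits[] = [
            'title' => $xp->evaluate('string(title)', $component),
            'image' => $xp->evaluate('string(image/@src)', $component),
            'link'  => $xp->evaluate('string(link/@href)', $component),
        ];
    }
    return $hits;
}

$doc = new DOMDocument();
$doc->loadXML($xml);
$xp = new DOMXPath($doc);

$hits = searchComponents($xp, 'Cabinet');
print_r($hits);
```

Because the test runs on the component's whole string-value, a match in the title, any feature, or the keywords all count, which sidesteps the question of which child element to name inside contains().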
From: Tod M. <tod...@gm...> - 2006-02-06 18:34:21
|
Can someone help me out? I can't seem to get the match working.

I'm using:

    $components = $xPath->match("//component[contains(feature,'string')]");

which returns an empty array, even though "string" is in the file. When I use:

    $components = $xPath->match("//*[contains(.,'string')]");

however, it works. This isn't ideal, though, because I want to return the component node whose feature contains "string", not the feature itself.

Nigel, your solution of

    $components = $xPath->match("//*[contains(feature,'string')]");

doesn't work either. It seems that whenever I put the actual node name in the contains function, it doesn't work.

Brad
From: Tod M. <tod...@gm...> - 2006-02-06 18:45:42
|
I've narrowed the problem down. When I search through keywords in the XML file (each component has one keyword tag with multiple words), it works fine using:

    $components = $xPath->match("//component[contains(keyword,'string')]");

but when I search through features (each component has multiple features) using:

    $components = $xPath->match("//component[contains(feature,'string')]");

I get an empty array.

Do I have to do something different if there are multiple tags such as feature?
From: Nigel S. <nig...@us...> - 2006-02-07 22:43:11
|
> Do I have to do something different if there are multiple tags such as feature?

Worked it out. When you search for //component[contains(feature,'string')], it first evaluates //component to get a list of context nodes, which will be something like:

    XPathSet:Array
    (
        [0] => /product[1]/component[1]
        [1] => /product[1]/component[6]
    )

Then it filters the list based on the predicate, which is the contains(feature,'string') bit. To do this, it needs to evaluate "feature" for each of the context nodes. For the first context node this returns something like:

    Array
    (
        [0] => /product[1]/component[1]/feature[1]
        [1] => /product[1]/component[1]/feature[2]
        [2] => /product[1]/component[1]/feature[3]
        [3] => /product[1]/component[1]/feature[4]
    )

So the LHS is an array (a node-set) and the RHS is a string. According to http://www.w3.org/TR/xpath#function-string, the contains function converts the LHS to a string as follows: "A node-set is converted to a string by returning the string-value of the node in the node-set that is first in document order. If the node-set is empty, an empty string is returned." Thus it takes the text of /product[1]/component[1]/feature[1] and ignores the rest.

This explains why it works for the keyword, as you only have one keyword.

Instead of finding all components and then filtering the list, we should find all features and then go back up a level to get their owning components. So we do:

    //feature[contains(.,'search string')]/..

Nigel
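The first-node rule described above can be reproduced in a few lines. This is a minimal sketch that assumes PHP's bundled DOM extension (DOMDocument and DOMXPath) as a stand-in for the PHP.XPath class, so it illustrates the XPath 1.0 semantics rather than this library's internals. The helper name ids is made up, and the nested-predicate form //component[feature[contains(., ...)]] is an additional standard XPath alternative that is not mentioned in the thread.

```php
<?php
// Sketch only: PHP's bundled DOMXPath stands in for the PHP.XPath class.
$doc = new DOMDocument();
$doc->loadXML(
    '<product><component id="wall_cab">'
    . '<title>Wall Cabinet</title>'
    . '<feature>Sturdy double wall door</feature>'
    . '<feature>Door has perforated inside wall for hanging hooks for tools etc</feature>'
    . '<feature>Solid steel construction 20-Gauge (.036") thick</feature>'
    . '<keywords>tools sports sporting</keywords>'
    . '</component></product>'
);
$xp = new DOMXPath($doc);

// Hypothetical helper: collect the id attribute of every element a query selects.
function ids(DOMXPath $xp, string $query): array
{
    $out = [];
    foreach ($xp->query($query) as $node) {
        $out[] = $node->getAttribute('id');
    }
    return $out;
}

// contains() converts the node-set "feature" to a string, which keeps only
// the first feature in document order. "hooks" is in the second one.
$firstOnly = ids($xp, "//component[contains(feature, 'hooks')]");

// The text that contains() actually compared against.
$compared = $xp->evaluate('string(//component[1]/feature)');

// A word in the first feature is found by the same expression.
$inFirst = ids($xp, "//component[contains(feature, 'door')]");

// Testing each feature individually, two equivalent ways.
$nested    = ids($xp, "//component[feature[contains(., 'hooks')]]");
$viaParent = ids($xp, "//feature[contains(., 'hooks')]/..");

var_dump($firstOnly, $compared, $inFirst, $nested, $viaParent);
```

Both of the last two queries apply contains() to one feature at a time, so no node-set has to be collapsed to its first member.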
From: Tod M. <tod...@gm...> - 2006-02-10 20:07:07
|
Excellent, very thorough! Now, my only question is this: once I have a list of all the features that match a particular phrase, along with keywords etc., I want to get the component that each feature belongs to. I'm not exactly sure how to get the parent. I tried some of the XPath axis functions, such as "parent::feature", to no avail. I'm not even sure what the exact syntax is to get this to work. Can someone tell me how to get the component element of the feature that matches? Then all I need to do is eliminate duplicates and I'm set!

Thanks again for all the help,

Brad
From: Nigel S. <nig...@us...> - 2006-02-11 02:03:16
|
Returns a list of features:

    //feature[contains(.,'search string')]

Returns a list of components:

    //feature[contains(.,'search string')]/..

The trailing /.. is no accident. It filters duplicates too.

Nigel
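The parent step and its duplicate filtering can be checked with a small experiment. This is a minimal sketch, again assuming PHP's bundled DOMXPath rather than the PHP.XPath class from this thread. In XPath 1.0, /.. is shorthand for /parent::node(), and a location path yields a node-set, so a component reached from two matching features appears once. The name test after parent:: names the parent itself, which is why parent::feature selects nothing here: the parent of a feature is a component.

```php
<?php
// Sketch only: PHP's bundled DOMXPath stands in for the PHP.XPath class.
$doc = new DOMDocument();
$doc->loadXML(
    '<product><component id="wall_cab">'
    . '<title>Wall Cabinet</title>'
    . '<feature>Sturdy double wall door</feature>'
    . '<feature>Door has perforated inside wall for hanging hooks for tools etc</feature>'
    . '<feature>Solid steel construction 20-Gauge (.036") thick</feature>'
    . '</component></product>'
);
$xp = new DOMXPath($doc);

// Two features of the same component contain "wall".
$features = $xp->query("//feature[contains(., 'wall')]");

// Stepping up to the parent yields that component once, not twice.
$parents = $xp->query("//feature[contains(., 'wall')]/..");

// Unabbreviated form: the name after parent:: is the parent's element name.
$named = $xp->query("//feature[contains(., 'wall')]/parent::component");

// Selects nothing, because no feature has a feature as its parent.
$wrong = $xp->query("//feature[contains(., 'wall')]/parent::feature");

printf(
    "features=%d parents=%d named=%d wrong=%d\n",
    $features->length,
    $parents->length,
    $named->length,
    $wrong->length
);
```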
From: Tod M. <tod...@gm...> - 2006-02-13 17:45:44
|
I'll try it out. Thanks again!

Brad