Thread: Some suggestions for documentation
From: Peter R. <php...@pe...> - 2002-05-15 17:20:09
1. It's useful that the intro defines things like absolutexpath and xpathquery, but the function list doesn't use these consistently, using other terms like xpath expression or xpath string. I feel this is likely to confuse newbies.

2. an example of evaluate/match in the sample code, and when it would be used. Am I right in thinking this is the only function that uses an xpathquery, i.e. that works with a node set and not an absolute address? This was something I misunderstood when I first used the class, as I thought you always had to do an evaluate before using any of the other functions. If you only want one node, don't use evaluate, as it just slows things down.

3. it would be helpful if the functions were grouped together more, e.g. if getNode and getNodePath were together. In fact, a list of functions by category would be useful, so that, for example, wholeText appears with the other content retrieval functions, and it's explained how they differ.

4. equalNodes should read "compare two nodes"!

5. perhaps worth stressing with decodeEntities that it uses PHP functions, limited to 8859-1.

6. I think you mean 'relational database' - it may also be rational :-)
From: Nigel S. <nig...@us...> - 2002-05-16 01:36:14
> 1. It's useful that the intro defines things like absolutexpath and
> xpathquery, but the function list doesn't use these consistently, using other
> terms like xpath expression or xpath string. I feel this is likely to confuse
> newbies.

Well spotted. Hopefully I've caught all of these now. If not, please send the section reference.

> 2. an example of evaluate/match in the sample code, and when it would be
> used. Am I right in thinking this is the only function that uses an
> xpathquery, i.e. that works with a node set and not an absolute address? This
> was something I misunderstood when I first used the class, as I thought you
> always had to do an evaluate before using any of the other functions. If you
> only want one node, don't use evaluate, as it just slows things down.

Actually as of V3, if the parameter is an $xPathQuery, then you can pass in an xpath expression, so you don't have to call evaluate() all the time! So all these functions can now be called using things like /A/B/C rather than /A[1]/B[2]/C[2]. This was such a common problem that it really had to be addressed.

    function evaluate($xPathQuery, $baseXPath='') {
    function nodeName($xPathQuery) {
    function removeChild($xPathQuery, $autoReindex=TRUE) {
    function replaceChildByData($xPathQuery, $data, $autoReindex=TRUE) {
    function &replaceChild($xPathQuery, $node, $autoReindex=TRUE) {
    function insertChild($xPathQuery, $node, $shiftRight=TRUE, $afterText=TRUE, $autoReindex=TRUE) {
    function appendChild($xPathQuery, $node, $afterText=FALSE, $autoReindex=TRUE) {
    function insertBefore($xPathQuery, $node, $afterText=TRUE, $autoReindex=TRUE) {
    function setAttribute($xPathQuery, $name, $value, $overwrite=TRUE) {
    function setAttributes($xPathQuery, $attributes, $overwrite=TRUE) {
    function removeAttribute($xPathQuery, $attrList=NULL) {
    function getDataParts($xPathQuery) {
    function replaceData($xPathQuery, $replacement, $offset = 0, $count = 0, $textPartNr=1) {
    function insertData($xPathQuery, $data, $offset=0) {
    function appendData($xPathQuery, $data, $textPartNr=1) {
    function deleteData($xPathQuery, $offset=0, $count=0, $textPartNr=1) {

See the setModMatch() function to change the behaviour for when the query matches more than one node. These are the options:

    * - XPATH_QUERYHIT_ALL (default)
    * - XPATH_QUERYHIT_FIRST
    * - XPATH_QUERYHIT_UNIQUE

But clearly, passing in an absolute XPath will mean it doesn't have to call evaluate.

> 3. it would be helpful if the functions were grouped together more, e.g. if
> getNode and getNodePath were together. In fact, a list of functions by
> category would be useful, so that, for example, wholeText appears with the
> other content retrieval functions, and it's explained how they differ.

The functions are grouped as they appear in the source file, so by class and then in the order that they are defined in the class. I had some debate with Sam over the ordering in the file and we decided that the ordering of the functions in the source files should assist coding, and the doc will just have to live with it. I know it's not ideal, but I don't see an easy solution to re-doing the order for the doc, so I don't plan to do anything about this.

> 4. equalNodes should read "compare two nodes" !

Fixed, thanks.

> 5. perhaps worth stressing with decodeEntities that it uses PHP functions,
> limited to 8859-1.

Ok, have added:

    * It makes use of the get_html_translation_table(HTML_ENTITIES) php library
    * call, so is limited in the same ways. At the time of writing this seemed
    * to be restricted to iso-8859-1

How does that sound?

> 6. I think you mean 'relational database' - it may also be rational :-)

Heh, changed.

Thanks for the input Peter :o)

Nigel
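The limitation described in the added doc comment can be illustrated with the translation table itself. This is only a sketch, not the code of decodeEntities(): the helper name decodeLatin1Entities is hypothetical, and it assumes a PHP version (5.4 or later) in which get_html_translation_table() accepts an explicit encoding argument.

```php
<?php
// Hypothetical helper, NOT part of Php.XPath: decode named entities using the
// ISO-8859-1 translation table that PHP provides.
function decodeLatin1Entities(string $text): string
{
    // The table maps characters to entities, e.g. "\xE9" => "&eacute;".
    $table = get_html_translation_table(HTML_ENTITIES, ENT_QUOTES, 'ISO-8859-1');

    // Flip it so that each entity maps back to a single-byte Latin-1 character.
    return strtr($text, array_flip($table));
}

$decoded = decodeLatin1Entities('caf&eacute; &amp; cr&egrave;me');

// Print the bytes in hex so the single-byte encoding is visible.
echo bin2hex($decoded), "\n";
```

Because the table is requested for ISO-8859-1, each decoded character comes back as a single Latin-1 byte, which is the restriction the doc comment warns about.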
From: Peter R. <php...@pe...> - 2002-05-16 10:20:56
On Thursday 16 May 2002 2:20, Nigel Swinson wrote:

> > 2. an example of evaluate/match in the sample code, and when it would be
> > used. Am I right in thinking this is the only function that uses an
> > xpathquery, i.e. that works with a node set and not an absolute address?
> > This was something I misunderstood when I first used the class, as I
> > thought you always had to do an evaluate before using any of the other
> > functions. If you only want one node, don't use evaluate, as it just
> > slows things down.
>
> Actually as of V3, if the parameter is an $xPathQuery, then you can pass in
> an xpath expression, so you don't have to call evaluate() all the time! So
> all these functions can now be called using things like /A/B/C rather than
> /A[1]/B[2]/C[2]. This was such a common problem that it really had to be
> addressed.
>
> function evaluate($xPathQuery, $baseXPath='') {
> function nodeName($xPathQuery) {
> function removeChild($xPathQuery, $autoReindex=TRUE) {
> function replaceChildByData($xPathQuery, $data, $autoReindex=TRUE) {
> function &replaceChild($xPathQuery, $node, $autoReindex=TRUE) {
> function insertChild($xPathQuery, $node, $shiftRight=TRUE, $afterText=TRUE,
> $autoReindex=TRUE) {
> function appendChild($xPathQuery, $node, $afterText=FALSE,
> $autoReindex=TRUE) {
> function insertBefore($xPathQuery, $node, $afterText=TRUE,
> $autoReindex=TRUE) {
> function setAttribute($xPathQuery, $name, $value, $overwrite=TRUE) {
> function setAttributes($xPathQuery, $attributes, $overwrite=TRUE) {
> function removeAttribute($xPathQuery, $attrList=NULL) {
> function getDataParts($xPathQuery) {
> function replaceData($xPathQuery, $replacement, $offset = 0, $count = 0,
> $textPartNr=1) {
> function insertData($xPathQuery, $data, $offset=0) {
> function appendData($xPathQuery, $data, $textPartNr=1) {
> function deleteData($xPathQuery, $offset=0, $count=0, $textPartNr=1) {

hm, I can see that being able to update all the nodes in a set with one instruction is useful, but why getDataParts and not any of the other get functions? Documentation for getDataParts doesn't mention this!

Surely the main reason it was 'such a common problem' was because the documentation wasn't clear. To get a particular node, use a get function; to get a list of nodes meeting an xpathquery criterion, use evaluate/match, and then loop through the resulting array. Seems clear to me!

> The functions are grouped as they appear in the source file, so by class
> and then in the order that they are defined in the class. I had some
> debate with Sam over the ordering in the file and we decided that the
> ordering of the functions in the source files should assist coding, and the
> doc will just have to live with it. I know it's not ideal, but I don't see
> an easy solution to re-doing the order for the doc, so I don't plan to do
> anything about this.

perhaps what's needed is a separate FAQ: 'how do I ...'

> > 5. perhaps worth stressing with decodeEntities that it uses PHP
> > functions, limited to 8859-1.
>
> Ok have added:
>
> * It makes use of the get_html_translation_table(HTML_ENTITIES) php library
> * call, so is limited in the same ways. At the time of writing this seemed
> * to be restricted to iso-8859-1
>
> How does that sound?

it doesn't 'seem to be restricted', it is!

On a related issue, could you add the ability to set the source encoding? Though you have the options, which enable setting of target encoding, it looks like source definition isn't in phpxpath at present, meaning it uses the default, which in PHP for some reason is 8859-1. I would have thought PHP's default should be whatever is given in the xml file or, failing that, the xml default, which is utf8 - but ...

I am currently using non-Latin1 charsets only in a database, but intend to expand this into xml. I can foresee problems if it assumes everything is in 8859-1!
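The distinction drawn above, between a query that can match a node set and a fully indexed path that addresses a single node, can be sketched with PHP's bundled DOM extension. This deliberately uses DOMDocument and DOMXPath rather than the Php.XPath class discussed in the thread, so it shows the XPath behaviour only and says nothing about how Php.XPath implements it; the sample document is made up.

```php
<?php
// Illustration with PHP's bundled DOM extension, NOT the Php.XPath class.
$doc = new DOMDocument();
$doc->loadXML('<A><B><C>one</C><C>two</C></B><B><C>three</C><C>four</C></B></A>');
$xpath = new DOMXPath($doc);

// A query such as /A/B/C can match a whole node set, so the caller loops.
$all = [];
foreach ($xpath->query('/A/B/C') as $node) {
    $all[] = $node->textContent;
}

// A fully indexed path addresses at most one node.
$single = $xpath->query('/A[1]/B[2]/C[2]');
$one = $single->length === 1 ? $single->item(0)->textContent : null;

echo implode(',', $all), "\n";
echo $one, "\n";
```

The first query needs a loop over its result; the second can be treated as a direct address, which is the cheaper case the posters have in mind when they talk about avoiding an unnecessary evaluate.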
From: Nigel S. <nig...@us...> - 2002-05-17 10:32:42
> hm, I can see that being able to update all the nodes in a set with one
> instruction is useful, but why getDataParts and not any of the other get
> functions?

getData() calls getDataParts(), so getData() takes a xPathQuery. Have updated the source... Which other "get" functions are missing this?

> Documentation for getDataParts doesn't mention this!

The documentation for getDataParts doesn't mention this, because getDataParts() was altered since the V3.0 release to support an xPathQuery, hence it is different in the released documentation. Sorry for the confusion... I always refer to the most recent version from CVS unless I explicitly state otherwise.

> Surely the main reason it was 'such a common problem' was because the
> documentation wasn't clear. To get a particular node, use a get function; to
> get a list of nodes meeting an xpathquery criterion, use evaluate/match, and
> then loop through the resulting array. Seems clear to me!

I'm glad it was clear to you :o) The way the class is now it is more usable, at the expense of making novice users run less efficient calls by passing a xPathQuery rather than an absolute XPath.

> > The functions are grouped as they appear in the source file, so by class
> > and then in the order that they are defined in the class. I had some
> > debate with Sam over the ordering in the file and we decided that the
> > ordering of the functions in the source files should assist coding, and the
> > doc will just have to live with it. I know it's not ideal, but I don't see
> > an easy solution to re-doing the order for the doc, so I don't plan to do
> > anything about this.
>
> perhaps what's needed is a separate FAQ: 'how do I ...'

This is probably a good point to mention that ANYONE can author documentation for the class, and submit it at http://sourceforge.net/docman/new.php?group_id=36731

Sam or I have to "authorise the document" but unless you swear, curse and blaspheme it's likely to be published.

Given that there were nearly 2500 downloads of 2.2, and less than 1% support requests, I wouldn't know what to put in a FAQ, so at this stage have no intention of authoring one myself.

> On a related issue, could you add the ability to set the source encoding.
> Though you have the options, which enable setting of target encoding, it
> looks like source definition isn't in phpxpath at present, meaning it uses
> the default, which in PHP for some reason is 8859-1. I would have thought
> PHP's default should be whatever is given in the xml file or, failing that,
> the xml default, which is utf8 - but ...
> I am currently using non-Latin1 charsets only in a database, but intend to
> expand this into xml. I can foresee problems if it assumes everything is in
> 8859-1 !

Ummm I don't really know where to start in coding this feature, as I don't really understand what's required. It does sound like a useful addition though, as I know we have many European users. It's not an issue that is directly affecting me, so I'm afraid it's unlikely that I will look into this with any sort of priority. Don't suppose anyone else wants to? Sam? Peter? :o)

Nigel
From: Peter R. <php...@pe...> - 2002-05-17 17:07:54
On Friday 17 May 2002 11:34, Nigel Swinson wrote:

> > hm, I can see that being able to update all the nodes in a set with one
> > instruction is useful, but why getDataParts and not any of the other get
> > functions?
>
> getData() calls getDataParts(), so getData() takes a xPathQuery. Have
> updated the source... Which other "get" functions are missing this?

I was using your list, Nigel. getDataParts was the only one of the retrieval functions on it.

> This is probably a good point to mention that ANYONE can author
> documentation for the class,

had a feeling you might say that :-)

> > On a related issue, could you add the ability to set the source encoding.
>
> Ummm I don't really know where to start in coding this feature, as I don't
> really understand what's required. It does sound like a useful addition
> though, as I know we have many european users. It's not an issue that is
> directly affecting me, so I'm afraid it's unlikely that I will look into
> this with any sort of priority. Don't suppose anyone else wants to? Sam?
> Peter?

see xml_parser_create

add parameter to this statement (in, erm, importFromString), and then there would have to be some way of setting it to utf-8 or us-ascii (in case anyone actually uses that). Because you can pass the constructor the filename, you would have to pass this in as a param. Or of course not allow it here, but only with importFromString, and have a separate function to set before calling that.
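The suggestion above amounts to passing an encoding name through to xml_parser_create(). A minimal sketch of that idea follows. The wrapper createParserFor is hypothetical and is not Php.XPath code; the three names it accepts are the encodings the PHP manual lists for this function. Note that the manual for later PHP versions describes the argument as selecting the output encoding, with the input encoding detected automatically, so the effect may differ from the PHP 4 behaviour discussed in this thread.

```php
<?php
// Hypothetical wrapper, NOT Php.XPath code: create an XML parser for a
// caller-chosen encoding, rejecting names the xml extension does not support.
function createParserFor(string $encoding = 'UTF-8')
{
    $supported = ['ISO-8859-1', 'US-ASCII', 'UTF-8'];
    $encoding = strtoupper($encoding);
    if (!in_array($encoding, $supported, true)) {
        throw new InvalidArgumentException("Unsupported encoding: $encoding");
    }
    return xml_parser_create($encoding);
}

$parser = createParserFor('utf-8');

// Read back the encoding in which the parser will hand data to its handlers.
$target = xml_parser_get_option($parser, XML_OPTION_TARGET_ENCODING);
echo $target, "\n";
```

Whether such a parameter belongs on the constructor or on a separate setter called before importFromString, as proposed above, is a design choice this sketch does not settle.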
From: Nigel S. <nig...@us...> - 2002-05-21 23:18:22
> > > hm, I can see that being able to update all the nodes in a set with one
> > > instruction is useful, but why getDataParts and not any of the other get
> > > functions?
> >
> > getData() calls getDataParts(), so getData() takes a xPathQuery. Have
> > updated the source... Which other "get" functions are missing this?
>
> I was using your list, Nigel. getDataParts was the only one of the retrieval
> functions on it.

Ok, and which other get functions do you want xPathQuery access to? Perhaps there are reasons, but at the moment there doesn't seem to be a concrete example for me to comment on, and I'm too busy/lazy to research it. All I have to go on is "any of the other get functions".

> > > On a related issue, could you add the ability to set the source encoding.
> >
> > Ummm I don't really know where to start in coding this feature, as I don't
> > really understand what's required. It does sound like a useful addition
> > though, as I know we have many european users. It's not an issue that is
> > directly affecting me, so I'm afraid it's unlikely that I will look into
> > this with any sort of priority. Don't suppose anyone else wants to? Sam?
> > Peter?
>
> see xml_parser_create
> add parameter to this statement (in, erm, importFromString), and then there
> would have to be some way of setting it to utf-8 or us-ascii (in case anyone
> actually uses that). Because you can pass the constructor the filename, you
> would have to pass this in as a param. Or of course not allow it here, but
> only with importFromString, and have a separate function to set before
> calling that.

Sounds like you know what you are talking about. Looking forward to seeing your patch ;o)

Cheers,

Nigel
From: Peter R. <php...@pe...> - 2002-05-23 08:39:12
On Wednesday 22 May 2002 12:20 am, Nigel Swinson wrote:

> > see xml_parser_create
> > add parameter to this statement (in, erm, importFromString), and then
> > there would have to be some way of setting it to utf-8 or us-ascii (in
> > case anyone actually uses that). Because you can pass the constructor the
> > filename, you would have to pass this in as a param. Or of course not
> > allow it here, but only with importFromString, and have a separate
> > function to set before calling that.
>
> Sounds like you know what you are talking about. Looking forward to seeing
> your patch ;o)

ok, I will try and get this done (and tested!) next week. What's easiest for you - an actual patch (i.e. snippets of code) for you/whoever to edit in, or a complete new class file?
From: Nigel S. <nig...@us...> - 2002-05-24 02:04:20
> > Sounds like you know what you are talking about. Looking forward to seeing
> > your patch ;o)
>
> ok, I will try and get this done (and tested!) next week. What's easiest for
> you - an actual patch (i.e. snippets of code) for you/whoever to edit in, or
> a complete new class file?

A complete new class file would probably be easiest... can't be bothered learning how to use the patch tool. Please alter from the most recent version from CVS though (and tell us which version it was you took) so that it's clear what changes you made.

Also if it makes sense, writing some tests would be good and we could add them to the testHarness. Not sure how convenient/complex/worthwhile it would be to have much in the way of tests for this feature though...

Thanks :o)

Nigel

===========================
For the most recent version of Php.XPath, and an archive of this list visit:
http://www.sourceforge.net/projects/phpxpath
From: Peter R. <php...@pe...> - 2002-05-31 19:35:55
On Thursday 23 May 2002 09:25, Peter Robins wrote:

> ok, I will try and get this done (and tested!) next week.

have looked further into this and find there is no problem. In fact, it looks like the PHP manual is wrong. What the source encoding function in expat does is _override_ the encoding given in the file. So, if your xml file is defined as utf-8, it will be read in and processed as utf-8. As I can't think of any reason why you would want to override the file encoding, I see no reason for writing a function to do it!

However, whilst investigating this, I came across another phpxpath feature that I wasn't aware of before. Why does the handling of the xml PI and other header info such as dtd differ between exportAsXml and exportToFile? As default, exportAsXml keeps it, whereas exportToFile drops it. Is not the main difference between these 2 functions that exportToFile, er, well, writes the xml to a file?

This is neatly illustrated by the test suite. Take for example the simple-test files provided. These are defined as latin1, and if you change them using >127 characters such as accented ones, according to the display the encoding in the output file is latin1. However if you look at the actual file created, you find the encoding has disappeared, in other words it is now supposedly in utf-8. This should mean the display is screwy if you load this file in the test suite, but doesn't because the script doesn't define the html charset, so if your browser, like mine, uses latin1 as default, the display looks correct - in other words, the 2 errors cancel each other out. nice one :-)
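The finding above, that the parser honours the encoding declared in the file, can be checked with a few lines. This sketch assumes a current PHP version, where the xml extension detects the input encoding from the XML declaration and the target encoding controls only what the handlers receive; it does not reproduce the PHP 4 behaviour the thread is about, and the sample document is made up.

```php
<?php
// A document that declares ISO-8859-1 and contains the single byte 0xE9
// (e-acute in Latin-1).
$xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><word>caf\xE9</word>";

$parser = xml_parser_create();
// Ask for character data to be delivered to the handler as UTF-8.
xml_parser_set_option($parser, XML_OPTION_TARGET_ENCODING, 'UTF-8');

$text = '';
xml_set_character_data_handler($parser, function ($parser, $data) use (&$text) {
    $text .= $data; // character data may arrive in several chunks
});

$ok = xml_parse($parser, $xml, true);

// Hex output makes the byte-level result visible regardless of terminal charset.
echo $ok, ' ', bin2hex($text), "\n";
```

If the declaration is honoured, the Latin-1 byte is read correctly and re-encoded for the handler, with no source-encoding argument needed.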
From: Nigel S. <nig...@us...> - 2002-06-04 22:58:56
> However, whilst investigating this, I came across another phpxpath
> feature that I wasn't aware of before. Why does the handling of the
> xml PI and other header info such as dtd differ between exportAsXml
> and exportToFile? As default, exportAsXml keeps it, whereas
> exportToFile drops it. Is not the main difference between these 2
> functions that exportToFile, er, well, writes the xml to a file?

Well you could pass in the third argument as NULL and that would prevent it from overwriting the PI. I'm not sure why it's the way it is. Perhaps Sam has a comment? If not then perhaps we should only add the "default" PI if none is present.

Nigel
From: Sam B. <bs...@us...> - 2002-06-05 07:54:55
> ?:
> As default, exportAsXml keeps it, whereas
> exportToFile drops it. Is not the main difference between these 2
> functions that exportToFile, er, well, writes the xml to a file?

> Nigel:
> Well you could pass in the third argument as NULL and that would prevent
> it from overwriting the PI. I'm not sure why it's the way it is. Perhaps Sam
> has a comment? If not then perhaps we should only add the "default" PI if
> none is present.

Hi,

This is a leftover 'feature' from V1 of Php.XPath. V1 used to ignore the header, so the idea was to replace it by <?xml version="1.0"?> by default when exporting to a file. Then in V2 the user had the option to overwrite it, and in V3 we're able to keep the header 'as is', but exportToFile() would still overwrite it by default.

We could change the behaviour, but it could break people's code.

--
Sam Blum <bs...@us...>
From: Peter R. <php...@pe...> - 2002-06-07 17:58:50
On Wednesday 05 Jun 2002 00:00, Nigel Swinson wrote:

> > However, whilst investigating this, I came across another
> > phpxpath feature that I wasn't aware of before. Why does the
> > handling of the xml PI and other header info such as dtd differ
> > between exportAsXml and exportToFile? As default, exportAsXml
> > keeps it, whereas exportToFile drops it. Is not the main
> > difference between these 2 functions that exportToFile, er, well,
> > writes the xml to a file?
>
> Well you could pass in the third argument as NULL and that would
> prevent it from overwriting the PI. I'm not sure why it's the way
> it is. Perhaps Sam has a comment? If not then perhaps we should
> only add the "default" PI if none is present.

if I pass in NULL, it uses the default, which in this case I don't want. IMO

1. exportAsXml and exportToFile should do the same
2. if there is an input file (i.e. this is a change, not a create new), then use header from input file unless override param
3. if not (i.e. this is a new file), use a default <?xml ...> header unless override param

Whilst testing this, I came across another, um, feature. If you have 2 exports one after the other, for example

    ...
    print $xml->exportAsXml();
    print $xml->exportAsXml();
    ...

the first one will have the header, the second and subsequent will not. I take it this is a bug and not intentional? Seems to have something to do with the node index, though putting reindexNodeTree() between the 2 exports had no effect.
From: Nigel S. <nig...@us...> - 2002-06-14 21:38:08
> if I pass in NULL, it uses the default, which in this case I don't
> want. IMO
> 1. exportAsXml and exportToFile should do the same
> 2. if there is an input file (i.e. this is a change, not a create
> new), then use header from input file unless override param
> 3. if not (i.e. this is a new file), use a default <?xml ...> header
> unless override param

Ok, have updated all this. Now the class will use the following preference order in the _export() function, which is called by both exportToFile() and exportAsXml():

1) The xmlHeader you supply as an argument
2) The xmlHeader that it parsed at import time
3) A default of <?xml version="1.0"?>

I presume this is what we want? If not please get in touch.

> Whilst testing this, I came across another, um, feature. If you have 2
> exports one after the other, for example
> ...
> print $xml->exportAsXml();
> print $xml->exportAsXml();
> ...
> the first one will have the header, the second and subsequent will
> not. I take it this is a bug and not intentional? Seems to have
> something to do with the node index, though putting reindexNodeTree()
> between the 2 exports had no effect.

I forgot to try to reproduce this before I made the fix, and now can't reproduce it, so I presume that I fixed it on the way. If not then I'm sure we'll find out some time...

Cheers,

Nigel
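The preference order described above is a simple fallback chain. A minimal sketch follows. The function name chooseXmlHeader and its parameters are hypothetical; the actual _export() code is not shown in this thread, and treating an empty string the same as "not supplied" is an assumption of the sketch.

```php
<?php
// Hypothetical helper illustrating the preference order; this is NOT the
// actual _export() implementation from Php.XPath.
function chooseXmlHeader(?string $supplied, ?string $parsed): string
{
    $default = '<?xml version="1.0"?>';

    // 1) the header passed as an argument, 2) the header parsed at import time.
    foreach ([$supplied, $parsed] as $candidate) {
        if ($candidate !== null && $candidate !== '') {
            return $candidate;
        }
    }

    // 3) fall back to the default header.
    return $default;
}

$latin1 = '<?xml version="1.0" encoding="ISO-8859-1"?>';
$utf8   = '<?xml version="1.0" encoding="UTF-8"?>';

echo chooseXmlHeader($utf8, $latin1), "\n"; // argument wins
echo chooseXmlHeader(null, $latin1), "\n";  // parsed header is kept
echo chooseXmlHeader(null, null), "\n";     // default for a new document
```

Routing both export functions through one such choice is what makes exportAsXml() and exportToFile() agree, which was the first point of the request.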