From: Hannes G. <han...@de...> - 2005-03-26 18:16:33
Tobias, Chris, everybody,

Attached you will find two more patches to N3Parser.php, this time on the topic of whitespace handling. Without these patches I could easily end up with URIs that wrongly contained linebreaks (yikes!).

The first one is about the isWS() method, which uses simple string comparison to determine whether a string is whitespace only. I replaced this with a more powerful regular expression. It's faster, too. I hope that's ok for you guys.

The second one is about trimming all the lines extracted from the N3 file. It seems this had been in the code once before, involving the (strange, and now unused) method trimLine(). I added a call to trim() and removed the unused trimLine() method. I hope that's ok, too.

Then I have a question regarding a line of code that is obviously buggy, but seems to have a purpose unknown to me. In the applyStuff() method we find this:

«
if (isset($prefixes[$ns]))
    $list[$i] = '<'.$prefixes[$ns].$name.'>';
else if (isset($prefixes[substr($ns,2)]))
    $list[$i] = '^^'.$prefixes[substr($ns,2)].$name.'';
else {
    //die('Prefix not declared:'.$ns);
»

Unfortunately I have no idea what the "else if" clause is supposed to do, but it fails when the document contains both an undeclared two-letter prefix (like dc:) and the "empty" prefix. It then produces rather odd-looking URIs. If somebody explained the purpose of this line to me, I guess I could come up with a fix.

Oh, and another remark: in constants.php you have

«
define('FOAF_NS', 'http://xmlns.com/foaf/0.1/#');
»

The spec at that URL states that the namespace URI ends with the slash, not the hash. Alas, hash vs. slash, same old story.

Some more remarks re ResList+bNodes and findAsIterator() will follow soon.

Thanks very much indeed for all your patience.

Hannes
(<- feeling Bengee must think I'd rather use his tool.. ;-))