Tobias, Chris, everybody,
Attached you will find two more patches to N3Parser.php, this time on
the topic of whitespace handling.
Without these patches I easily ended up with URIs that wrongly
contained linebreaks (yikes!).
The first one is about the isWS() method, which uses simple string
comparison to determine whether a string is whitespace only. I replaced
this with a more powerful regular expression. It's faster, too. I hope
that's ok for you guys.
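For anyone without the patch at hand, the idea can be sketched roughly
as below. This is not the code from the attached patch: the function
name isWhitespaceOnly() and the exact pattern are assumptions made for
illustration only.

```php
<?php
// Hypothetical stand-in for isWS(); NOT the code from the attached patch.
// Returns true if $s is empty or consists only of whitespace characters.
function isWhitespaceOnly(string $s): bool
{
    // \A and \z anchor the match to the whole string;
    // \s covers spaces, tabs and line breaks.
    return preg_match('/\A\s*\z/', $s) === 1;
}
```

A single pattern like this treats spaces, tabs and line breaks alike,
which is the point of moving away from plain string comparison.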
The second one is about trimming all the lines extracted from the N3
file. It seems like this had been in the code once before, involving
the (strange, and now unused) method trimLine(). I added a call to
trim() and removed the unused trimLine() method. I hope that's ok, too.
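To illustrate what trimming each extracted line amounts to, here is a
minimal sketch. The helper name splitAndTrimLines() and the way the
input is split into lines are assumptions; the parser's actual
line-splitting code is not reproduced here.

```php
<?php
// Hypothetical helper; the real parser's line handling may differ.
// Splits N3 text into lines and strips surrounding whitespace from each.
function splitAndTrimLines(string $n3): array
{
    // Split on Windows, old Mac and Unix line endings.
    $lines = preg_split('/\r\n|\r|\n/', $n3);

    // trim() removes leading and trailing whitespace, including
    // stray carriage returns and tabs.
    return array_map('trim', $lines);
}
```

With every line trimmed up front, stray line-ending characters cannot
leak into tokens such as URIs later on.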
Well, then I have a question regarding a line of code that is
obviously buggy, but seems to have a purpose unknown to me.
In the applyStuff() method we find this:
«
if (isset($prefixes[$ns])) $list[$i] = '<'.$prefixes[$ns].$name.'>';
else if (isset($prefixes[substr($ns,2)])) $list[$i] = '^^'.$prefixes[substr($ns,2)].$name.'';
else {
//die('Prefix not declared:'.$ns);
»
Unfortunately I have no idea what the "else if" clause is supposed to
do, but it fails in the case where the document contains undeclared
two-letter prefixes (like dc:), and the "empty" prefix. It then
produces rather odd-looking URIs.
If somebody explained the purpose of this line to me, I guess I could
come up with a fix.
Oh, and another remark: In constants.php you have
«
define('FOAF_NS', 'http://xmlns.com/foaf/0.1/#');
»
The spec at that URL states that the NS URI ends with the slash, not
the hash. Alas, hash vs. slash, same old story.
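For reference, the corrected constant would presumably read as follows,
assuming nothing else in constants.php relies on the trailing hash:

```php
<?php
// FOAF namespace URI as given in the FOAF spec: it ends with a slash.
define('FOAF_NS', 'http://xmlns.com/foaf/0.1/');
```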
Some more remarks re ResList+bNodes and findAsIterator() follow soon.
Thanks very much indeed for all your patience.
Hannes (<- feeling Bengee must think I'd rather use his tool.. ;-))