|
From: Ian B. J. <ij...@w3...> - 2006-07-07 01:47:15
|
Hello,

One of my colleagues reported [1] an issue where UTF-8 characters
are escaped like this "\u00D0" (correct codepoints but escaped
instead of the actual characters).

Richard Cyganiak kindly made a suggestion [2]:

"Set UNIC_RDF to FALSE again to avoid this."

However, when I set UNIC_RDF to FALSE, the parser seems to fail.
Here is the query:

PREFIX : <http://www.w3.org/2000/10/swap/pim/contact#>
PREFIX doc: <http://www.w3.org/2000/10/swap/pim/doc#>
PREFIX mat: <http://www.w3.org/2002/05/matrix/vocab#>
PREFIX org: <http://www.w3.org/2001/04/roadmap/org#>
PREFIX rec: <http://www.w3.org/2001/02pd/rec54#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?doc, ?editor, ?title, ?date, ?versionOf, ?type, ?supersedes
WHERE {?doc rdf:type ?type; dc:title ?title; dc:date ?date;
doc:versionOf ?versionOf. OPTIONAL {?doc rec:supersedes ?supersedes}
OPTIONAL {?doc rec:editor [:fullName ?editor ] .}}
ORDER BY DESC(?date)

Here is the RDF source:
http://www.w3.org/2002/01/tr-automation/tr.rdf

I find that if I set UNIC_RDF to TRUE, parsing succeeds but I have the
escaping issue. If I set it to FALSE, parsing fails (with a "255" error
that I did not examine closely).

I would appreciate any suggestions and hope the above information is
sufficient to run the test; please let me know if more information is
required.

Thank you,

_ Ian Jacobs

[1] http://sourceforge.net/mailarchive/message.php?msg_id=15149273
[2] http://sourceforge.net/mailarchive/message.php?msg_id=15149274
-- 
Ian Jacobs (ij...@w3...) http://www.w3.org/People/Jacobs
Tel: +1 718 260-9447
|
From: Richard C. <ri...@cy...> - 2006-07-10 09:56:52
|
Ian,
The file parses fine here, with UNIC_RDF set to false. I didn't try
to run a SPARQL query though, so maybe the problem is not parsing but
somewhere in the SPARQL engine. Can you please provide the exact
error message, and a code sample that produces the error? Which
version of RAP and PHP ("php -v") is this?
Cheers,
Richard
|
|
From: Richard C. <ri...@cy...> - 2006-07-10 20:50:07
|
Hi Ian,
Thanks for the additional information! I *think* (but haven't
actually tried) that a small change in line 1282 of
api/sparql/SparqlEngine.php should fix the problem. Could you please try to replace
$label = htmlentities($varvalue->getLabel());
with
$label = htmlspecialchars($varvalue->getLabel());
and report back if it works?
Cheers,
Richard
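The suggested change can be checked in isolation with a short script. The following is only a sketch under stated assumptions, not code from RAP: the sample literal and the helper name parsesAsXml() are made up for illustration, and the explicit 'ISO-8859-1' argument is an assumption used to mimic the default charset that htmlentities() used in PHP 5.1. It shows why htmlentities() output can break an XML parser (named entities such as &acirc; are not defined in plain XML) while htmlspecialchars() output does not.

```php
<?php
// Sketch only: reproduces the failure mode outside RAP. The sample literal and
// the helper name parsesAsXml() are illustrative assumptions.

// "XHTML" followed by U+2122 (trade mark sign) as UTF-8 bytes E2 84 A2.
$label = "XHTML\xE2\x84\xA2 & <more>";

// Passing 'ISO-8859-1' mimics the PHP 5.1 default charset of htmlentities():
// each UTF-8 byte is treated as a separate Latin-1 character.
$viaEntities = htmlentities($label, ENT_COMPAT, 'ISO-8859-1');

// With ENT_COMPAT, htmlspecialchars() escapes only &, <, > and the double
// quote, and leaves the UTF-8 bytes of other characters untouched.
$viaSpecialChars = htmlspecialchars($label, ENT_COMPAT, 'UTF-8');

// Wraps the escaped text in an element and reports whether it parses as XML.
function parsesAsXml(string $escaped): bool
{
    $previous = libxml_use_internal_errors(true); // collect errors silently
    $xml = simplexml_load_string('<literal>' . $escaped . '</literal>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    return $xml !== false;
}

var_dump(parsesAsXml($viaEntities));     // htmlentities() result
var_dump(parsesAsXml($viaSpecialChars)); // htmlspecialchars() result
```

The first call should report a parse failure and the second a success, which matches the "Entity 'acirc' not defined" warning quoted below: XML predefines only the entities amp, lt, gt, quot and apos.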
On 10 Jul 2006, at 16:37, Ian B. Jacobs wrote:
> On Mon, 2006-07-10 at 11:56 +0200, Richard Cyganiak wrote:
>> Ian,
>>
>> The file parses fine here, with UNIC_RDF set to false. I didn't try
>> to run a SPARQL query though, so maybe the problem is not parsing but
>> somewhere in the SPARQL engine. Can you please provide the exact
>> error message, and a code sample that produces the error? Which
>> version of RAP and PHP ("php -v") is this?
>
> Hello Richard,
>
> Here's more information; I hope it helps. Thanks for your work on
> this.
>
> _ Ian
>
> [Info provided by my colleague Dom]
> ========================================
> Version info:
> dom@cumulustier:~$ php5 -v
> PHP 5.1.4-0.1 (cli) (built: Jun 13 2006 21:46:20)
> dom@cumulustier:~$ less /usr/local/lib/php/rdfapi-php/api/RdfAPI.php |grep "@version"
> // @version : $Id: RdfAPI.php,v 1.20 2006/05/15 05:24:35 tgauss Exp $
>
> When I run the SPARQL query on the tr.rdf with UNIC_RDF set to false, I
> get back a bunch of PHP errors à la:
>
> Warning: simplexml_load_string():
> Entity: line 1: parser error : Entity 'acirc' not defined
> in /usr/local/lib/php/rdfapi-php/api/sparql/SparqlEngine.php on
> line 1260
>
> Warning: simplexml_load_string(): iteral>Mark
> Baker</literal></binding><binding
> name="title"><literal>XHTMLâ
> in /usr/local/lib/php/rdfapi-php/api/sparql/SparqlEngine.php on line
> 1260
>
> Warning: simplexml_load_string():
> ^
> in /usr/local/lib/php/rdfapi-php/api/sparql/SparqlEngine.php on line
> 1260
>
> Warning: simplexml_load_string(): Entity: line 1: parser error :
> Input is not proper UTF-8, indicate encoding !
> Bytes: 0x84 0x26 0x63 0x65
> in /usr/local/lib/php/rdfapi-php/api/sparql/SparqlEngine.php on line
> 1260
>
> When I experimented to see what went wrong, it looks like the function
> that outputs the results as XML was failing on the intermediary
> content.
> ========================================
>
|
|
From: Ian B. J. <ij...@w3...> - 2006-07-11 13:40:44
|
Hello Richard,
The change seems to do the trick. Thank you!
_ Ian
-- 
Ian Jacobs (ij...@w3...) http://www.w3.org/People/Jacobs
Tel: +1 718 260-9447
|