Hi, thanks for replying.

We replaced the existing file with the new one you posted in bug 37575, but the problem persists.

Virtuoso contains the triple, and we can confirm that the problem occurs on reading. After inserting print statements as you suggested, we found that the problem arises when the parser is invoked to create the SMWSparqlResultWrapper data.
In the SMW_SparqlDatabase.php file, at line 463 (return $xmlParser->makeResultFromXml( $xmlResult );), when the makeResultFromXml function is invoked, the result data from the query is still fine, with the special characters correctly encoded. But inside that function, at line 79 (xml_parse( $parser, $xmlQueryResult, true );), after the data is parsed, the final result is malformed, as described in the first message.

We also tried putting print statements in the xmlHandleCData function, which is registered with the parser at line 70 by the following statement:
- xml_set_character_data_handler($parser, 'xmlHandleCData' );
Doing this, we realized that for a single string like "Marcos Nuñez", the xmlHandleCData function is called twice: first with the string "Marcos Nu" and then with the string "ñez".
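
This would match the symptom from the first message: if a character data handler keeps only the data from its most recent call, the stored value ends up as "ñez", i.e. only the part after the special character. PHP's XML parser may deliver the text of one element in several pieces, so a handler generally has to append each piece to a buffer and use the buffer only when the element ends. Below is a minimal sketch of that idea. The ChunkCollector class and the sample XML are hypothetical and are not SMW's code; whether xmlHandleCData actually overwrites instead of appending is an assumption that would need to be checked in the source.

```php
<?php
// Hypothetical collector, for illustration only (not the SMW parser class).
class ChunkCollector {
    /** @var string[] Complete text values, one per <literal> element. */
    public $values = [];

    /** @var string Text accumulated for the element currently being read. */
    private $buffer = '';

    public function startElement( $parser, $name, $attributes ) {
        // A new element starts: begin with an empty buffer.
        $this->buffer = '';
    }

    public function characterData( $parser, $data ) {
        // May be called several times for one text node, so append
        // instead of overwriting.
        $this->buffer .= $data;
    }

    public function endElement( $parser, $name ) {
        // Element names arrive upper-cased because case folding is
        // enabled by default.
        if ( $name === 'LITERAL' ) {
            $this->values[] = $this->buffer;
        }
        $this->buffer = '';
    }
}

// Simplified sample input, not a complete SPARQL result document.
$xml = '<?xml version="1.0" encoding="UTF-8"?>'
    . '<result><literal>Marcos Nuñez</literal></result>';

$collector = new ChunkCollector();
$parser = xml_parser_create( 'UTF-8' );
xml_set_element_handler(
    $parser,
    [ $collector, 'startElement' ],
    [ $collector, 'endElement' ]
);
xml_set_character_data_handler( $parser, [ $collector, 'characterData' ] );
xml_parse( $parser, $xml, true );

print_r( $collector->values );
```

With this approach it does not matter how many times the handler is called for one string, since the pieces are joined before the value is stored.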

Thanks again,

Marcelo, Pablo y Danilo

2012/10/23 Markus Krötzsch <markus@semantic-mediawiki.org>

This problem is not known (to me). A recent bugfix (bug 37575) corrected an encoding issue (the character encoding was not sent when calling the RDF store, which could lead to misunderstandings on the side of the store). You can check whether this fixes your problem. Testing this can be done by replacing a single file in SMW 1.7; see the link to gerrit in the bug report for details.

This may or may not help you. To further analyse the problem, make sure that Virtuoso really contains the triple you expect. It could also be that the error is already introduced when writing the data. If reading is the problem, it needs to be debugged further. You can do this by looking at the SPARQL query code (especially the part that reads SPARQL result sets) and inserting print statements to view the raw data coming from the store, and the raw data extracted before it is turned into SMW objects. This should clarify where information is lost.
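
For encoding problems, plain print statements can mislead, because the terminal or browser may itself re-interpret the bytes. One possible approach is to print the byte length and a hex dump next to the string. The sketch below uses a hypothetical helper name, debugBytes, which is not part of SMW.

```php
<?php
// Hypothetical debugging helper (not part of SMW): shows a string
// together with its byte count and its raw bytes in hexadecimal.
function debugBytes( $label, $value ) {
    return $label . ': "' . $value . '" ['
        . strlen( $value ) . ' bytes] ' . bin2hex( $value );
}

// "ñ" encoded as UTF-8 is the two bytes C3 B1.
echo debugBytes( 'raw', "Marcos Nu\xC3\xB1ez" ), "\n";
// prints: raw: "Marcos Nuñez" [13 bytes] 4d6172636f73204e75c3b1657a
```

Calling such a helper on the raw response from the store and again on each parsed value shows at which step bytes go missing.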

Good luck,


On 23/10/12 02:43, Danilo da Rosa wrote:
Hello, we are having some trouble developing a MW extension, using SMW
with Virtuoso, when the triple store contains string objects with special
characters.

When making a SPARQL query through the “SMWSparqlDatabase::doQuery
($sparql)” method, if the triple object has special characters, such as
accents (e.g. á) or “ñ”, the result does not show the whole string; it
shows only the part after the special character.

For example, having the following triple in Virtuoso:
"Marcos Nuñez"

And making this query, using the doQuery method:

DEFINE input:inference <>
DEFINE get:soft "replacing"
DEFINE input:same-as "yes"
PREFIX rdf:<http://www.w3.org/1999/02/22-rdf-syntax-ns#>
select ?o where {
{ < > <> ?o }

  When we print the result, we get:


We would like to know if this is a known error, and if there is any

Thanks in advance,

Marcelo, Pablo y Danilo

Semediawiki-devel mailing list