#86 Turtle serializer produces prefixed names with illegal characters

v6.1.3
open
None
5
2014-01-29
2013-02-12
Richard Cyganiak
No

The Turtle serializer of Virtuoso abbreviates IRIs to prefixed names where possible. So http://example.com/foo becomes ex:foo if an appropriate prefix is declared.

However, only certain characters are allowed in the local part of prefixed names (the foo part). See the Turtle spec:
http://www.w3.org/TR/turtle/#grammar-production-PrefixedName

Virtuoso produces prefixed names even when the local part contains characters that the grammar does not allow there. The resulting Turtle output is invalid and cannot be read by parsers that follow the spec more closely (like Jena).

The serializer should simply output the full unabbreviated IRI in this case.
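
To illustrate the suggested fallback, here is a minimal Python sketch. It is not Virtuoso's code, and the function name `turtle_term` and the prefix map are hypothetical. The character ranges are copied from the PN_CHARS_BASE, PN_CHARS_U and PN_CHARS productions of the Turtle grammar. As a simplification, the sketch ignores the PLX escapes (percent-encoded and backslash-escaped characters), so it may write out a full IRI in some cases where an escaped prefixed name would also be legal. It also assumes the input is already a valid IRI.

```python
import re

# Character ranges taken from the Turtle grammar productions.
PN_CHARS_BASE = (
    "A-Za-z\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02FF\u0370-\u037D"
    "\u037F-\u1FFF\u200C-\u200D\u2070-\u218F\u2C00-\u2FEF"
    "\u3001-\uD7FF\uF900-\uFDCF\uFDF0-\uFFFD\U00010000-\U000EFFFF"
)
PN_CHARS_U = PN_CHARS_BASE + "_"
PN_CHARS = PN_CHARS_U + "0-9\u00B7\u0300-\u036F\u203F-\u2040" + r"\-"

# PN_LOCAL without PLX escapes (simplification, see the text above):
# first char: PN_CHARS_U | ':' | [0-9]
# middle:     PN_CHARS | '.' | ':'
# last char:  PN_CHARS | ':'   (so a trailing '.' is not allowed)
PN_LOCAL_RE = re.compile(
    f"[{PN_CHARS_U}:0-9](?:[{PN_CHARS}.:]*[{PN_CHARS}:])?"
)


def turtle_term(iri, prefixes):
    """Return a prefixed name if the local part is legal, else the full IRI.

    `prefixes` maps prefix labels to namespace IRIs (hypothetical helper input).
    """
    for prefix, namespace in prefixes.items():
        if iri.startswith(namespace):
            local = iri[len(namespace):]
            # An empty local part is legal (the name is then just "prefix:").
            if local == "" or PN_LOCAL_RE.fullmatch(local):
                return f"{prefix}:{local}"
    # No usable prefix, or the local part contains an illegal character.
    return f"<{iri}>"
```

With a `dbr` prefix declared for http://ja.dbpedia.org/resource/, this sketch abbreviates 東京都 but writes the IRI ending in ダンス☆マン in full, because ☆ (U+2606) lies outside all of the ranges above.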

A minimal test case is attached. It causes a parse error because of the "star" character (☆) in the local part of the triple's subject. If the subject were written as the full IRI <http://ja.dbpedia.org/resource/ダンス☆マン>, things would be fine.

The test case is extracted from this live example:
http://ja.dbpedia.org/data/東京都.nt

See also:
http://ja.dbpedia.org/page/東京都

I'm not sure which versions of Virtuoso are affected -- the one used for the site above exhibits the problem.

1 Attachment

Discussion

  • Tim Haynes
    2014-01-29

    • assigned_to: Ivan Mikhailov