[gobo-eiffel-develop] File URI resolution

An interesting problem has emerged from one of the W3C XSLT tests.

The test looks like this:

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

   <?spec xslt#document?>

<xsl:template match="/">
  <out>
      <!-- test use of an escaped URI with document() function -->
      <xsl:copy-of select="document('xgespr%C3%A4ch.xml')/*"/>
  </out>
</xsl:template>

</xsl:stylesheet>

Note the (relative) file URI:

xgespr%C3%A4ch.xml

The percent encoding %C3%A4 is the UTF-8 byte sequence for lower-case
a-umlaut (U+00E4).

Now, on my Linux system, file name encoding is UTF-8 (I think), but the
file supplied with the test suite exists on my system with the
a-umlaut as a Latin-1 character. I am not sure whether this is a bug in
the unzip command or something else. In fact, I am very unsure about
everything to do with this; the one thing I am certain of is that the
percent encoding MUST be interpreted as UTF-8, and the file name
decoded accordingly.
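
To make the mismatch concrete, here is a minimal sketch in Python (used
purely for illustration; none of it is Gobo or ISE code). It
percent-decodes the URI as UTF-8 and then shows the two byte-level
names the file could have on disk. That the unpacked file really
carries the Latin-1 form is an assumption based on the description
above.

```python
from urllib.parse import unquote

escaped = "xgespr%C3%A4ch.xml"

# Percent-decode, interpreting the escaped octets as UTF-8.
decoded = unquote(escaped, encoding="utf-8", errors="strict")

# Name as a UTF-8 file system would store it: a-umlaut is two bytes.
utf8_name = decoded.encode("utf-8")

# Name as it apparently got unpacked: a-umlaut is a single Latin-1 byte.
latin1_name = decoded.encode("latin-1")

print(decoded)
print(utf8_name)     # b'xgespr\xc3\xa4ch.xml'
print(latin1_name)   # b'xgespr\xe4ch.xml'
print(utf8_name == latin1_name)   # False
```

On a file system that compares names byte for byte, a lookup using the
first byte string cannot find a file stored under the second.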

Not surprisingly, the test fails on my system.

But I begin to wonder if the test would not also have failed if the
file name were correct as a UTF-8 file name.

The resolver uses UT_FILE_URI_ROUTINES, which in turn makes use of

file_system.pathname_to_string (uri_to_pathname (a_uri))

As far as I can tell, the net result of this is to pass a UTF-8 byte
string to the Eiffel runtime as a STRING/STRING_8.

I suspect the runtime expects a Latin-1 name, but I don't know.
And what about a Windows system, where the file system uses UTF-16?
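
The suspected failure mode can be sketched as follows, again in Python
and purely as an illustration. It assumes the runtime treats each byte
of the string it receives as one Latin-1 character; whether the ISE
runtime actually does so is exactly the open question. The last lines
show how different the same name looks as UTF-16 code units.

```python
name = "xgespr\u00e4ch.xml"   # the decoded file name, with a-umlaut

# What the resolver appears to hand over: the UTF-8 bytes of the name.
utf8_bytes = name.encode("utf-8")

# Hypothesis: the receiver reads those bytes as Latin-1 characters.
misread = utf8_bytes.decode("latin-1")

print(misread)                   # a-umlaut becomes two characters
print(len(name), len(misread))   # 13 14

# For comparison, the same name as UTF-16 (little-endian) code units.
utf16_bytes = name.encode("utf-16-le")
print(len(utf16_bytes))          # 26
```

If the hypothesis holds, the runtime would look for a name that is one
character longer than the intended one and would not find the file,
whichever of the two on-disk encodings the file has.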

Manu, can you tell me what the ISE runtime expects? And does it make a
difference if you pass a STRING_32?
-- 
Colin Adams
Preston Lancashire