From: Grant M. <gr...@us...> - 2004-11-11 09:25:24
Update of /cvsroot/perl-xml/perl-xml-faq
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv18818

Modified Files:
    perl-xml-faq.xml
Log Message:
- added Q&A re 'use utf8;' in 5.8

Index: perl-xml-faq.xml
===================================================================
RCS file: /cvsroot/perl-xml/perl-xml-faq/perl-xml-faq.xml,v
retrieving revision 1.18
retrieving revision 1.19
diff -u -d -r1.18 -r1.19
--- perl-xml-faq.xml 11 Nov 2004 08:57:33 -0000 1.18
+++ perl-xml-faq.xml 11 Nov 2004 09:25:15 -0000 1.19
@@ -1401,7 +1401,7 @@
  </formalpara>
 
  <programlisting><![CDATA[
-use utf8;
+use utf8; # Only needed for 5.6, not 5.8 or later
 
 s/([\x{80}-\x{FFFF}])/'&#' . ord($1) . ';'/gse;
 ]]></programlisting>
@@ -1413,8 +1413,8 @@
 s/([^\x20-\x7F])/'&#' . ord($1) . ';'/gse;
 ]]></programlisting>
 
- <para>This version does not require 'use utf8'; does not require a
- version of Perl which recognises \x{NN} and handles characters
+ <para>This version does not require 'use utf8' with Perl 5.6; does not
+ require a version of Perl which recognises \x{NN} and handles characters
  outside the 0x80-0xFFFF range.</para>
 
  <para>Even if you are outputting Latin1, you will need to use a technique
@@ -1495,7 +1495,7 @@
  </formalpara>
 
  <programlisting><![CDATA[
-use utf8;
+use utf8; # Not required with 5.8 or later
 
 my $u_city = "S\x{E3}o Paulo";
 my $l_city = pack("C*", unpack('U*', $u_city));
@@ -1581,6 +1581,33 @@
 
  </qandaentry>
 
+ <qandaentry id="use_utf8">
+ <question>
+ <para>What does 'use utf8;' do?</para>
+ </question>
+
+ <answer>
+
+ <para>In Perl 5.8 and later, the sole use of the 'use utf8;' pragma is to
+ tell Perl that your script is written in UTF-8 (ie: any non-ASCII or
+ multibyte characters should be interpreted as UTF-8). So if your code is
+ plain ASCII, you don't need the pragma.</para>
+
+ <para>The original UTF8 support in Perl 5.6 required the pragma to
+ enable wide character support for builtin functions (such as length)
+ and the regular expression engine. This is no longer necessary in 5.8
+ since Perl automatically uses character rather than byte semantics
+ with strings that have the utf8 flag set.</para>
+
+ <para>You can find out more about how Unicode handling changed in
+ Perl 5.8 from the <ulink
+ url="http://search.cpan.org/dist/perl/pod/perl58delta.pod">perl58delta.pod</ulink>
+ file that ships with Perl.</para>
+
+ </answer>
+
+ </qandaentry>
+
  <qandaentry id="encoding_common">
  <question>
  <para>What are some commonly encountered problems with encodings?</para>
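
The behaviour described in the new Q&A can be checked with a short script. This is only a sketch, assuming Perl 5.8 or later and a source file that is plain ASCII. The variable names ($text, $chars, $bytes, $escaped) and the sample string are illustrative and are not taken from the FAQ. The substitution is the second one quoted in the diff above.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(encode);    # core module in Perl 5.8 and later

# No 'use utf8;' here: the source is plain ASCII and the non-ASCII
# characters are written as \x{...} escapes, so (assuming 5.8+) the
# pragma is not needed.

# \x{263A} is above 0xFF, so Perl stores this string with the utf8 flag set.
my $text = "S\x{E3}o Paulo \x{263A}";

# length() uses character semantics: it counts characters, not bytes.
my $chars = length($text);

# For comparison, the number of bytes once the string is encoded as UTF-8.
my $bytes = length(encode('UTF-8', $text));

# Replace every character outside printable ASCII with a numeric
# character reference, using the substitution quoted from the FAQ.
(my $escaped = $text) =~ s/([^\x20-\x7F])/'&#' . ord($1) . ';'/gse;

print "characters: $chars\n";
print "utf8 bytes: $bytes\n";
print "escaped:    $escaped\n";
```

If the script instead contained literal UTF-8 characters in the string (rather than \x{...} escapes), 'use utf8;' would be needed so that Perl reads them as characters.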