Hi.
I have the same problem. Two dashes, the en dash (Alt+0150) and the em dash (Alt+0151), are also detected as illegal characters. So I tried to clean the input with a small function:
function clean($data){
    // euro sign, en dash (Alt+0150), em dash (Alt+0151)
    $source_chars = array('€', '–', '—');
    $clean_chars = array(' Euro', '-', '-');
    return str_replace($source_chars, $clean_chars, $data);
}
But using it is not as easy as it looks.
The euro sign arrives as a different character, and that one is not easily replaceable either.
Even though I searched for the right place for two hours, I could not find out how to clean the input.
What I did find out is that the illegal character is detected in this line:
    $retstr = ($asciiEncoding) ? $retstr : $this->_encodeUTF16($retstr);
which passes $retstr to the function _encodeUTF16 at the end of reader.php.
It would be great if a cleaning function could be built in...
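A built-in cleaning step could look roughly like the sketch below. It assumes the cell text has already been converted to UTF-8, which is not the case at the point where reader.php calls _encodeUTF16 (there the string is still UTF-16LE). The name clean_utf8 is made up for illustration and is not part of reader.php.

```php
<?php
// Hypothetical helper, not part of reader.php.
// Assumes $text is valid UTF-8.
function clean_utf8($text)
{
    // Keys are written as UTF-8 byte sequences, so the encoding of
    // this source file does not matter.
    $map = array(
        "\xE2\x82\xAC" => ' Euro', // U+20AC euro sign
        "\xE2\x80\x93" => '-',     // U+2013 en dash (Alt+0150)
        "\xE2\x80\x94" => '-',     // U+2014 em dash (Alt+0151)
    );
    return strtr($text, $map);
}

// Prints: 10 Euro - 20 Euro
echo clean_utf8("10\xE2\x82\xAC \xE2\x80\x93 20\xE2\x82\xAC"), "\n";
```

strtr with an array replaces all keys in a single pass, so a replacement is never re-scanned by a later rule.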
@Vadim Tkachenko and Travis Harris:
Thanks a lot for the great work you did.
sunfish
Hi. This has been a very useful script for me. However, it does not seem to understand the euro symbol. Can anyone please help? Thanks.
I have now found a solution, though it may be quick and dirty.
Put a new function at the beginning of reader.php:
//cp1251_utf8 function inserted - called in function _encodeUTF16($string)
//example from http://de3.php.net/manual/en/function.iconv.php
//note: despite the name, the result uses HTML numeric entities, not UTF-8 bytes
function cp1251_utf8( $sInput )
{
    $sOutput = "";
    //echo 'input: '.$sInput.' // '; // debug output, disabled
    $iLength = strlen( $sInput );
    for ( $i = 0; $i < $iLength; $i++ )
    {
        $iAscii = ord( $sInput[$i] );
        //echo $iAscii.' -- ';
        if ( $iAscii >= 192 && $iAscii <= 255 )
            $sOutput .= "&#".( 1040 + ( $iAscii - 192 ) ).";"; // Cyrillic letters
        else if ( $iAscii == 168 )
            $sOutput .= "&#".( 1025 ).";";
        else if ( $iAscii == 184 )
            $sOutput .= "&#".( 1105 ).";";
        else if ( $iAscii == 19 )
            $sOutput .= "&#8211;"; // en dash, as an entity like the branches above
        else if ( $iAscii == 20 )
            $sOutput .= "&#8212;"; // em dash
        else if ( $iAscii == 172 )
            $sOutput .= "&#8364;"; // EURO
        else
            $sOutput .= $sInput[$i];
    }
    return $sOutput;
}
Then change the function _encodeUTF16 at the end of reader.php:
function _encodeUTF16($string){
    $result = cp1251_utf8($string); //inserted
    /* No longer needed?
    if ($this->_defaultEncoding){
        switch ($this->_encoderFunction){
            case 'iconv' :
                $result = iconv('UTF-16LE', $this->_defaultEncoding, $string);
                break;
            case 'mb_convert_encoding' :
                $result = mb_convert_encoding($string, $this->_defaultEncoding, 'UTF-16LE' );
                break;
        }
    }*/
    return $result;
}
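For comparison, here is a minimal sketch that converts the whole UTF-16LE string instead of mapping single bytes. It assumes the mbstring extension is loaded and that the page is delivered as UTF-8. The function name utf16le_to_utf8 is hypothetical and not part of reader.php.

```php
<?php
// Hypothetical standalone helper, not part of reader.php.
// Assumes the mbstring extension is loaded and $raw is UTF-16LE.
function utf16le_to_utf8($raw)
{
    return mb_convert_encoding($raw, 'UTF-8', 'UTF-16LE');
}

// "5 €" as UTF-16LE bytes: '5' = 35 00, ' ' = 20 00, U+20AC = AC 20.
$raw = "\x35\x00\x20\x00\xAC\x20";

// Prints the UTF-8 bytes in hex: 3520e282ac
echo bin2hex(utf16le_to_utf8($raw)), "\n";
```

The byte values also suggest why the function above tests for 19, 20 and 172: these are presumably the low bytes of U+2013 (0x13), U+2014 (0x14) and U+20AC (0xAC) as they appear in a UTF-16LE string, with the high byte 0x20 passing through as a space.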