Problem when sending and receiving special letters
Hi,
When the hamsam-lib receives special letters like the
Danish æ, ø and å, or like â and ä, extra \0 characters are
appended to the end of the received string.
E.g. if 'âââ' is sent to the lib, 'âââ\0\0\0' is received.
The other way around, when special letters are sent through the
hamsam-lib, letters are cut off from the end of the string.
E.g. if 'âââhello' is sent from the lib, 'âââhel' is received.
Regards Daniel
Logged In: YES
user_id=680966
I wish I could fix this right away. Unfortunately, this can only be done as part of a larger subproject: an
attempt to bring national language support to Hamsam.
I do not know Danish or any other European/Asian languages. I would greatly appreciate it if anyone could help me
with this.
Logged In: YES
user_id=884338
I don't think the problem is the language. My point in showing
the example with the character '' was that the problem
occurs whenever a character takes 4 bytes instead of the normal 2
bytes.
For ICQ I use another lib that works fine. I need a solution
today or as soon as possible :) and will therefore work on it.
If I find one, I'll post it...
Logged In: YES
user_id=680966
I know that the problem is not the language. The problem is certainly with Hamsam's support for languages other
than English. But I am unable to do any dev/testing on this, because my development platform does not
support all these languages and I do not know how to type them on the keyboard.
So I am waiting for your solution :-)
Logged In: YES
user_id=884338
Hi,
After some hours of work I found what made some
characters disappear when special characters were used,
e.g. '' and ''. The problem is in
MsnMessage.MsnMessage(Font, Color, String), in the last line:
addParam(Integer.toString(getPayloadString().length()));
The length tells the receiving client how many bytes to
read. The call above returns the number of characters, not
the length in bytes. Therefore '' counts as one
character (2 bytes) but should count as 4 bytes. The solution is to
replace the above code with:
try {
    addParam(Integer.toString(getPayloadString().getBytes("UTF-8").length));
} catch (UnsupportedEncodingException uee) {
    // Every Java platform is required to support UTF-8, so this should never happen.
}
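To illustrate the difference between the two lengths, here is a minimal standalone sketch. It is not Hamsam code: the class and method names are made up for illustration, and it uses StandardCharsets (Java 7+) where older JVMs would call getBytes("UTF-8"). It simulates a receiver that reads only as many bytes as the declared length says, for a string of three two-byte letters followed by "hello".

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Hamsam.
class PayloadLengthDemo {
    /** Length a byte-oriented receiver needs: the size of the UTF-8 encoding. */
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    /** Simulates a receiver that reads only declaredLength bytes of the UTF-8 payload. */
    static String readDeclared(String payload, int declaredLength) {
        byte[] bytes = payload.getBytes(StandardCharsets.UTF_8);
        int n = Math.min(declaredLength, bytes.length);
        return new String(bytes, 0, n, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String msg = "\u00e2\u00e2\u00e2hello";   // three times U+00E2, then "hello"
        System.out.println(msg.length());          // 8 characters
        System.out.println(utf8Length(msg));       // 11 bytes
        // Declaring the character count loses the tail of the message.
        System.out.println(readDeclared(msg, msg.length()).length());  // 5
        // Declaring the byte count delivers the whole message.
        System.out.println(readDeclared(msg, utf8Length(msg)).equals(msg)); // true
    }
}
```

How many letters are lost in a real message depends on the rest of the payload, so the exact cut-off may differ from this simulation.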
The \0 characters at the end of received strings must be
roughly the opposite problem: too many characters are read. I
haven't found the actual code to change yet, so for now I just
strip all \0 from the end :)
// Daniel J.
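One way the trailing \0 could arise (this is only a guess, the thread never identifies the actual code) is a character buffer sized by the byte count of the incoming data. The sketch below reproduces that effect and shows the "strip trailing \0" workaround. All names are hypothetical and none of this is taken from Hamsam.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Hamsam.
class TrailingNulDemo {
    /** Assumed bug: the char buffer is sized by the number of bytes, not characters. */
    static String readOversized(byte[] payload) {
        char[] chars = new char[payload.length];
        try (Reader in = new InputStreamReader(
                new ByteArrayInputStream(payload), StandardCharsets.UTF_8)) {
            int off = 0;
            int n;
            while (off < chars.length
                    && (n = in.read(chars, off, chars.length - off)) != -1) {
                off += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(chars); // slots that were never filled stay '\0'
    }

    /** Workaround: drop every '\0' at the end of the string. */
    static String stripTrailingNuls(String s) {
        int end = s.length();
        while (end > 0 && s.charAt(end - 1) == '\0') {
            end--;
        }
        return s.substring(0, end);
    }

    public static void main(String[] args) {
        byte[] payload = "\u00e2\u00e2\u00e2".getBytes(StandardCharsets.UTF_8); // 6 bytes
        String received = readOversized(payload);
        System.out.println(received.length());                    // 6: 3 letters + 3 '\0'
        System.out.println(stripTrailingNuls(received).length()); // 3
    }
}
```

Stripping hides the symptom. If the cause really is a buffer sized in bytes, building the string from only the characters actually read would be the cleaner fix.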
Logged In: YES
user_id=68628
Hi, this is not hard and has nothing to do with l10n in
particular, but you have to use the same encoding that Yahoo!
Messenger expects. From Daniel's and someone else's
comments, I understand it's UTF-8 in this case. Luckily Java
has the framework to do that easily. I'll try to fix this, as
this bug has been hitting me, too.
BTW, it's easy to test this: just cut and paste any
non-ASCII character from your favourite website/charmap/mail
and try sending it back and forth as part of a message.
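The same kind of check can be done offline, without a messenger account. The following is a minimal sketch with made-up names: it encodes a string with one charset and decodes it with another, which shows whether both ends agree on the encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Hamsam.
class RoundTripCheck {
    /** Encodes text with one charset and decodes the bytes with another. */
    static String roundTrip(String text, Charset encodeWith, Charset decodeWith) {
        return new String(text.getBytes(encodeWith), decodeWith);
    }

    public static void main(String[] args) {
        String sample = "\u00e6\u00f8\u00e5"; // U+00E6, U+00F8, U+00E5
        // Both sides use UTF-8: the text survives.
        System.out.println(sample.equals(
            roundTrip(sample, StandardCharsets.UTF_8, StandardCharsets.UTF_8)));      // true
        // Sender uses UTF-8, receiver assumes ISO-8859-1: the text is garbled.
        System.out.println(sample.equals(
            roundTrip(sample, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1))); // false
    }
}
```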
Logged In: YES
user_id=68628
Sorry, I see now that this report was for MSN, but I've been
told Yahoo! is having some problems, too.
Logged In: YES
user_id=68628
Here's the fix for the Yahoo! problem.
I also removed the creation of new String instances from
previous String objects. I don't see why that would be
useful (remember that Java strings are immutable).
(Sorry for inlining the patch, but for some reason I cannot
attach files to this report. Fortunately it's small enough.)
diff -urN hamsam/protocol/yahoo/Packet.java /home/jkohen/eclipse-bs-1_0/hamsam/hamsam/protocol/yahoo/Packet.java
--- hamsam/protocol/yahoo/Packet.java 2003-10-08 05:27:23.000000000 -0300
+++ /home/jkohen/eclipse-bs-1_0/hamsam/hamsam/protocol/yahoo/Packet.java 2004-04-07 21:37:00.000000000 -0300
@@ -20,6 +20,7 @@
package hamsam.protocol.yahoo;
import java.io.IOException;
+import java.io.UnsupportedEncodingException;
import java.util.Vector;
import java.util.Enumeration;
import hamsam.net.Connection;
@@ -325,8 +326,8 @@
// get value
if(data == 0xc0 && dataNext == 0x80)
{
- value = new String(dataArray, valueStart, i - valueStart);
this.data.add(new Integer(key));
+ value = new String(dataArray, valueStart, i - valueStart, "UTF-8");
this.data.add(value);
gettingKey = true;
key = 0;
@@ -353,7 +354,12 @@
while(all.hasMoreElements())
{
Object elem = all.nextElement();
- dataLength += elem.toString().length() + 2;
+ try {
+ dataLength += elem.toString().getBytes("UTF-8").length + 2;
+ } catch (UnsupportedEncodingException impossible) {
+ // assert false;
+ continue;
+ }
}
@@ -393,7 +399,14 @@
while(all.hasMoreElements())
{
Object elem = all.nextElement();
- byte[] elemData = elem.toString().getBytes();
+ byte[] elemData = null;
+ try {
+ elemData = elem.toString().getBytes("UTF-8");
+ } catch (UnsupportedEncodingException impossible) {
+ // assert false;
+ continue;
+ }
+
for(int i = 0; i < elemData.length; i++)
ret[index++] = elemData[i];
@@ -414,7 +427,7 @@
public synchronized void putData(int key, String val)
{
data.add(new Integer(key));
- data.add(new String(val));
+ data.add(val);
}
@@ -446,7 +459,7 @@
*/
public String getValue(int index)
{
- return new String((String) data.elementAt(index * 2 + 1));
+ return (String) data.elementAt(index * 2 + 1);
}