
#74 StringEncoder cuts off last character

Milestone: v0.3.9
Status: open
Owner: nobody
Priority: 5
Updated: 2014-08-14
Created: 2010-10-15
Creator: Axel
Private: No

The current implementation of StringEncoder always cuts off the last character of any data it reads, because it assumes that the last byte is a zero byte.

Unfortunately our provider (mblox) sends optional String values without the terminating zero byte, so we just get a corrupt value.
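To make the symptom concrete, here is a minimal sketch of the truncation. `TruncationDemo` and `decodeAssumingNul` are hypothetical names that only mimic the behaviour described above; this is not the library's actual code.

```java
import java.nio.charset.StandardCharsets;

final class TruncationDemo {
    // Mimics the reported behaviour: always drop the last byte on the
    // assumption that it is a nul terminator. Hypothetical helper.
    static String decodeAssumingNul(byte[] b, int offset, int length) {
        return new String(b, offset, length - 1, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] terminated = {'O', 'K', 0};   // value sent with a nul terminator
        byte[] unterminated = {'O', 'K'};    // value sent without one

        System.out.println(decodeAssumingNul(terminated, 0, 3));   // prints "OK"
        System.out.println(decodeAssumingNul(unterminated, 0, 2)); // prints "O"
    }
}
```

The second call loses a real character, which matches the corruption reported here.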

I was able to work around this by using my own encoder with this readFrom() implementation:

public Object readFrom(final Tag tag, final byte[] b, final int offset, final int length)
{
    try
    {
        // Nul-terminated -> get rid of the last byte (also guards against length == 0)
        if (length > 0 && b[offset + length - 1] == (byte) 0)
        {
            return new String(b, offset, length - 1, ASCII);
        }
        else
        {
            // Otherwise read the value fully
            return new String(b, offset, length, ASCII);
        }
    }
    catch (final java.io.UnsupportedEncodingException x)
    {
        // Java spec requires US-ASCII support
        throw new RuntimeException(ASCII_UNSUPPORTED_MSG);
    }
}

Discussion

  • Oran Kelly

    Oran Kelly - 2010-10-15

    The current behaviour of the smppapi is correct as per spec. TLVs that use the StringEncoder are defined in the spec as C-Octet Strings, which the spec explicitly defines as being ASCII bytes with a nul terminator. As such, I will be leaving the current behaviour as the default for the API.

    To support the incorrect behaviour of mblox, I guess I could put a hack into StringEncoder that uses an APIConfig property to decide whether or not to read or write the nul-terminator. Something like this:
    [code]
    public void writeTo(Tag tag, Object value, byte[] b, int offset) {
        try {
            byte[] b1 = value.toString().getBytes(ASCII);
            // Always write the string bytes themselves.
            System.arraycopy(b1, 0, b, offset, b1.length);
            // Don't encode the nul-terminator if the mblox hack is
            // enabled.
            if (!mbloxHack) {
                b[offset + b1.length] = (byte) 0;
            }
        } catch (java.io.UnsupportedEncodingException x) {
            // Java spec _requires_ US-ASCII support
            throw new RuntimeException(ASCII_UNSUPPORTED_MSG);
        }
    }
    [/code]

    Enabling or disabling the behaviour could then be controlled via the API config properties as loaded by the APIConfig class. Sound like a reasonable solution?
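
For the read side, a config-driven switch might look roughly like the sketch below. It is only an illustration: `ConfigurableStringDecoder` and the property name `example.tlv.string.lenient` are hypothetical, and a plain `java.util.Properties` stands in for APIConfig, whose actual lookup methods are not shown in this thread.

```java
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Hypothetical helper, not part of smppapi: decodes a TLV string value and
// decides from a config property whether a nul terminator is mandatory.
final class ConfigurableStringDecoder {
    // Hypothetical property name, chosen for this sketch only.
    static final String LENIENT_PROP = "example.tlv.string.lenient";

    private final boolean lenient;

    ConfigurableStringDecoder(Properties config) {
        // Default is strict, i.e. the spec-conformant behaviour.
        this.lenient = Boolean.parseBoolean(config.getProperty(LENIENT_PROP, "false"));
    }

    String readFrom(byte[] b, int offset, int length) {
        if (length == 0) {
            return "";
        }
        boolean nulTerminated = b[offset + length - 1] == 0;
        // Strict mode always treats the last byte as the terminator.
        // Lenient mode drops the last byte only if it really is nul.
        int textLength = (nulTerminated || !lenient) ? length - 1 : length;
        return new String(b, offset, textLength, StandardCharsets.US_ASCII);
    }
}
```

With the property unset the decoder behaves as the API does today; setting it to `true` keeps the last character of values that arrive without a terminator.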

     
  • Axel

    Axel - 2010-10-18

    Before adding hacks, I'd prefer to leave it to a custom encoder on our side.

    I had a quick look through the 3.4 spec and did not find a place where it says that TLVs should use C-Octet Strings, but I might not have searched hard enough.

    If it's actually legal to use plain ASCII Strings, what about providing both a CStringEncoder and a StringEncoder and letting framework users decide which one they need?
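
A rough sketch of what such a split could look like follows. The `TextCodec` interface and both class names are hypothetical and deliberately simplified; the real smppapi encoder methods have a different shape (they also take a `Tag`, for instance).

```java
import java.nio.charset.StandardCharsets;

// Hypothetical minimal interface for this sketch only.
interface TextCodec {
    int writeTo(String value, byte[] b, int offset);   // returns bytes written
    String readFrom(byte[] b, int offset, int length);
}

// C-Octet String: ASCII bytes followed by a single nul terminator.
final class CStringCodec implements TextCodec {
    public int writeTo(String value, byte[] b, int offset) {
        byte[] raw = value.getBytes(StandardCharsets.US_ASCII);
        System.arraycopy(raw, 0, b, offset, raw.length);
        b[offset + raw.length] = 0;
        return raw.length + 1;
    }

    public String readFrom(byte[] b, int offset, int length) {
        // The last byte is the terminator and is not part of the text.
        return new String(b, offset, Math.max(0, length - 1), StandardCharsets.US_ASCII);
    }
}

// Plain ASCII string: every byte of the value is text, no terminator.
final class PlainStringCodec implements TextCodec {
    public int writeTo(String value, byte[] b, int offset) {
        byte[] raw = value.getBytes(StandardCharsets.US_ASCII);
        System.arraycopy(raw, 0, b, offset, raw.length);
        return raw.length;
    }

    public String readFrom(byte[] b, int offset, int length) {
        return new String(b, offset, length, StandardCharsets.US_ASCII);
    }
}
```

Keeping the two variants as separate classes means the caller chooses per tag, and neither needs a global switch.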

     