jTDS - SQL Server and Sybase JDBC driver / Discussion / Help: Unable to insert Arabic data

tnabil - 2006-09-06

Hi,

I'm using jTDS to access a SQL server 2000 database, as part of a migration project from plain old ASP and VB COM objects to Java EE.

I'm also using iBatis for database access.

I have a problem inserting Arabic values in the database, when the data type of the field is varchar.

I don't have a problem with nvarchar fields.

Previously, I had a problem with both read and update, but adding the "charset=Cp1256" property to the URL fixed the read problem, unfortunately, it didn't work for the insert/update.

For that particular case, I switched to plain JDBC in order to debug the problem, and I found a solution, albeit a very awkward one. The solution will only work with plain JDBC (not iBatis), so I'm not inclined to use it.

The solution I found was to set the value using setBinaryStream instead of setString. I'm encoding the String using Cp1256 to a byte array and then using a ByteArrayInputStream to set the value on the CallableStatement.

I would very much appreciate if someone can guide me on how to fix this problem without those kinds of tricks, so I can revert back to iBatis.

Please note that the ASP code works with the same database.

Here's the code I used to fix the problem (I'm using a prepared statement here to simplify things a bit as the original procedure has 46 parameters):

String update = "update tmp_company_request set company_name_ar = ? where registration_id = ?"; ps = con.prepareStatement(update); String value = "اختبار آخر للغة العربية"; // Those are Arabic characters InputStream is = Utilities.getStringAsByteStream(value); ps.setBinaryStream(1, is, is.available()); ps.setInt(2, 703); ps.execute();

The utility method:

public static ByteArrayInputStream getStringAsByteStream(String value) { try { byte[] valueBytes; if (value == null) valueBytes = new byte[0]; else valueBytes = value.getBytes(Constants.DB_CHARSET); // Cp1256 return new ByteArrayInputStream(valueBytes); } catch (UnsupportedEncodingException e) { throw new SystemException(new StringBuffer( "An error occurred while attempting to use the ") .append(Constants.DB_CHARSET) .append(" character encoding").toString(), e); } }

The database server configuration as provided by the DBA is as follows:

Ø Server default collation Latin1-General, case-insensitive, accent-sensitive, kanatype-insensitive, width-insensitive for Unicode Data, SQL Server Sort Order 52 on Code Page 1252 for non-Unicode Data

Ø select @@language

us_english

Ø SQL Server 2000 supports these code pages.

Code page Description

1258 Vietnamese

1257 Baltic

1256 Arabic <-- Relevant

1255 Hebrew

And so on...

Every help is appreciated.
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
- Alin Sinpalean - 2006-09-06
  
  Why do you expect storing and retrieving arabic characters (Cp1256) in VARCHAR columns in a Latin-1 (aka Cp1252) database to work?
  
  Either use NVARCHAR or create the database/relevant column(s) with a Cp1256 charset.
  
  BTW the "read works, update/insert doesn't" part, is not very helpful. It should work, but there are tens of reasons why it might not, so unless you provide some code, database schema and description of what happens I can't really help. And -- this is quite important -- even if this gets fixed and starts working for you, it's still a hack and switching to another JDBC driver or DB access library will most likely break all your existing data. So do it right and take my first suggestion.
  
  Alin.
  
  If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
  - tnabil - 2006-09-06
    
    Alin,
    
    Thanks a lot for your help.
    
    Had this not been a migration project, then I would have taken your first suggestion right away. But, it's impossible to mess with those fields, there are multiple existing applications that are using them, and changing the fields (even if the data could be migrated) will mandate change to all of them.
    
    I can supply the code of a test case, and here it is
    
    [code]
    Connection con = null;
    PreparedStatement ps = null;
    
    try { con = ds.getConnection(); con.setAutoCommit(false); String update = "update tmp_company_request set company_name_ar = ? where registration_id = ?"; ps = con.prepareStatement(update); ps.setString(1, "طارق نبيل"); ps.setInt(2, 703); ps.execute(); con.commit(); } finally { if (ps != null) { try { ps.close(); } catch (SQLException e) { e.printStackTrace(); } } if (con != null) { try { con.close(); } catch (SQLException e) { e.printStackTrace(); } } }
    
    [/code]
    
    This code causes question marks to be inserted in the field.
    
    As for the table definition, here it is (the column of interest is company_name_ar)
    
    CREATE TABLE [tmp_company_request] (
    [registration_id] numeric IDENTITY (1, 1) NOT NULL ,
    [registration_date] [smalldatetime] NOT NULL ,
    [registration_type] [char] (1) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
    [company_type] [char] (1) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
    [company_code] [char] (5) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    [company_name] [varchar] (35) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
    [company_name_ar] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    [business_name] [varchar] (80) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    [po_boxno] [varchar] (6) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
    .
    .
    Some more column definitions
    .
    .
    [receipt_amount] numeric NULL ,
    [receipt_location] [char] (1) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    CONSTRAINT [PK_tmp_company_request] PRIMARY KEY CLUSTERED
    (
    [registration_id]
    ) ON [PRIMARY] ,
    CONSTRAINT [company_type_CK] CHECK ([company_type] = 'I' or [company_type] = 'F' or [company_type] = 'C' or [company_type] = 'S'),
    CONSTRAINT [registration_status_CK] CHECK ([registration_status] = 'P' or [registration_status] = 'A' or [registration_status] = 'R'),
    CONSTRAINT [registration_type_CK] CHECK ([registration_type] = 'N' or [registration_type] = 'M' or [registration_type] = 'A')
    ) ON [PRIMARY]
    GO
    
    If there's more information that can help solve the problem, please tell me.
    
    Related
    
    Code: code
    
    If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Unable to insert Arabic data

Forums

Help

Unable to insert Arabic data document.SUBSCRIPTION_OPTIONS = { "thing": "topic", "subscribed": false, "url": "subscribe", "icon": { "css": "fa fa-envelope-o" } };

Related

Unable to insert Arabic data