I'm using jTDS to access a SQL server 2000 database, as part of a migration project from plain old ASP and VB COM objects to Java EE.
I'm also using iBatis for database access.
I have a problem inserting Arabic values in the database, when the data type of the field is varchar.
I don't have a problem with nvarchar fields.
Previously, I had a problem with both read and update, but adding the "charset=Cp1256" property to the URL fixed the read problem, unfortunately, it didn't work for the insert/update.
For that particular case, I switched to plain JDBC in order to debug the problem, and I found a solution, albeit a very awkward one. The solution will only work with plain JDBC (not iBatis), so I'm not inclined to use it.
The solution I found was to set the value using setBinaryStream instead of setString. I'm encoding the String using Cp1256 to a byte array and then using a ByteArrayInputStream to set the value on the CallableStatement.
I would very much appreciate if someone can guide me on how to fix this problem without those kinds of tricks, so I can revert back to iBatis.
Please note that the ASP code works with the same database.
Here's the code I used to fix the problem (I'm using a prepared statement here to simplify things a bit as the original procedure has 46 parameters):
String update =
"update tmp_company_request set company_name_ar = ? where registration_id = ?";
ps = con.prepareStatement(update);
String value = "اختبار آخر للغة العربية"; // Those are Arabic characters
InputStream is = Utilities.getStringAsByteStream(value);
ps.setBinaryStream(1, is, is.available());
ps.setInt(2, 703);
ps.execute();
The utility method:
public static ByteArrayInputStream getStringAsByteStream(String value) {
try {
byte[] valueBytes;
if (value == null)
valueBytes = new byte[0];
else
valueBytes = value.getBytes(Constants.DB_CHARSET); // Cp1256
return new ByteArrayInputStream(valueBytes);
} catch (UnsupportedEncodingException e) {
throw new SystemException(new StringBuffer(
"An error occurred while attempting to use the ")
.append(Constants.DB_CHARSET)
.append(" character encoding").toString(), e);
}
}
The database server configuration as provided by the DBA is as follows:
Ø Server default collation Latin1-General, case-insensitive, accent-sensitive, kanatype-insensitive, width-insensitive for Unicode Data, SQL Server Sort Order 52 on Code Page 1252 for non-Unicode Data
Ø select @@language
us_english
Ø SQL Server 2000 supports these code pages.
Code page Description
1258 Vietnamese
1257 Baltic
1256 Arabic <-- Relevant
1255 Hebrew
And so on...
Every help is appreciated.
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
Why do you expect storing and retrieving arabic characters (Cp1256) in VARCHAR columns in a Latin-1 (aka Cp1252) database to work?
Either use NVARCHAR or create the database/relevant column(s) with a Cp1256 charset.
BTW the "read works, update/insert doesn't" part, is not very helpful. It should work, but there are tens of reasons why it might not, so unless you provide some code, database schema and description of what happens I can't really help. And -- this is quite important -- even if this gets fixed and starts working for you, it's still a hack and switching to another JDBC driver or DB access library will most likely break all your existing data. So do it right and take my first suggestion.
Alin.
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
Had this not been a migration project, then I would have taken your first suggestion right away. But, it's impossible to mess with those fields, there are multiple existing applications that are using them, and changing the fields (even if the data could be migrated) will mandate change to all of them.
I can supply the code of a test case, and here it is
[code]
Connection con = null;
PreparedStatement ps = null;
Hi,
I'm using jTDS to access a SQL server 2000 database, as part of a migration project from plain old ASP and VB COM objects to Java EE.
I'm also using iBatis for database access.
I have a problem inserting Arabic values in the database, when the data type of the field is varchar.
I don't have a problem with nvarchar fields.
Previously, I had a problem with both read and update, but adding the "charset=Cp1256" property to the URL fixed the read problem, unfortunately, it didn't work for the insert/update.
For that particular case, I switched to plain JDBC in order to debug the problem, and I found a solution, albeit a very awkward one. The solution will only work with plain JDBC (not iBatis), so I'm not inclined to use it.
The solution I found was to set the value using setBinaryStream instead of setString. I'm encoding the String using Cp1256 to a byte array and then using a ByteArrayInputStream to set the value on the CallableStatement.
I would very much appreciate if someone can guide me on how to fix this problem without those kinds of tricks, so I can revert back to iBatis.
Please note that the ASP code works with the same database.
Here's the code I used to fix the problem (I'm using a prepared statement here to simplify things a bit as the original procedure has 46 parameters):
The utility method:
The database server configuration as provided by the DBA is as follows:
Ø Server default collation Latin1-General, case-insensitive, accent-sensitive, kanatype-insensitive, width-insensitive for Unicode Data, SQL Server Sort Order 52 on Code Page 1252 for non-Unicode Data
Ø select @@language
Ø SQL Server 2000 supports these code pages.
Code page Description
1258 Vietnamese
1257 Baltic
1256 Arabic <-- Relevant
1255 Hebrew
And so on...
Every help is appreciated.
Why do you expect storing and retrieving arabic characters (Cp1256) in VARCHAR columns in a Latin-1 (aka Cp1252) database to work?
Either use NVARCHAR or create the database/relevant column(s) with a Cp1256 charset.
BTW the "read works, update/insert doesn't" part, is not very helpful. It should work, but there are tens of reasons why it might not, so unless you provide some code, database schema and description of what happens I can't really help. And -- this is quite important -- even if this gets fixed and starts working for you, it's still a hack and switching to another JDBC driver or DB access library will most likely break all your existing data. So do it right and take my first suggestion.
Alin.
Alin,
Thanks a lot for your help.
Had this not been a migration project, then I would have taken your first suggestion right away. But, it's impossible to mess with those fields, there are multiple existing applications that are using them, and changing the fields (even if the data could be migrated) will mandate change to all of them.
I can supply the code of a test case, and here it is
[code]
Connection con = null;
PreparedStatement ps = null;
[/code]
This code causes question marks to be inserted in the field.
As for the table definition, here it is (the column of interest is company_name_ar)
CREATE TABLE [tmp_company_request] (
[registration_id] numeric IDENTITY (1, 1) NOT NULL ,
[registration_date] [smalldatetime] NOT NULL ,
[registration_type] [char] (1) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
[company_type] [char] (1) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
[company_code] [char] (5) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[company_name] [varchar] (35) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
[company_name_ar] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[business_name] [varchar] (80) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[po_boxno] [varchar] (6) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
.
.
Some more column definitions
.
.
[receipt_amount] numeric NULL ,
[receipt_location] [char] (1) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
CONSTRAINT [PK_tmp_company_request] PRIMARY KEY CLUSTERED
(
[registration_id]
) ON [PRIMARY] ,
CONSTRAINT [company_type_CK] CHECK ([company_type] = 'I' or [company_type] = 'F' or [company_type] = 'C' or [company_type] = 'S'),
CONSTRAINT [registration_status_CK] CHECK ([registration_status] = 'P' or [registration_status] = 'A' or [registration_status] = 'R'),
CONSTRAINT [registration_type_CK] CHECK ([registration_type] = 'N' or [registration_type] = 'M' or [registration_type] = 'A')
) ON [PRIMARY]
GO
If there's more information that can help solve the problem, please tell me.
Related
Code: code