I am new to regular expressions.
Can you point me to the Javadoc for jregex?
I want to identify 2-byte Japanese characters in a String and replace them with 1-byte characters. Is this possible using jregex?
Could you please tell me which Pattern and classes to use for this?
Regards
Devi
> I want to identify 2 byte Japanese characters in a String
The only problem you may encounter is getting the correct String. jregex should handle it without problems.
You work with a Japanese string just like with any other string:
String jpString=...;
String jpPattern=...;
Pattern myPat=new Pattern(jpPattern);
Matcher m=myPat.matcher(jpString);
while(m.find()){ ... }
As I said above, the most problematic parts are the things denoted by "...". Getting them to work is too big a topic to cover here, so please post the specific problems you have.
> and replace them with 1 byte characters. Is this possible using jregex?
There is no such thing as a 1-byte character in Java (a char is always 2 bytes). What do you mean?
Regards
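To illustrate the point about character width, here is a minimal sketch using only the standard library. The class name CharWidthDemo and the helper utf8Length are made up for this example. It shows that every Java char is 2 bytes in memory, and that "1-byte" versus "multi-byte" only becomes meaningful once a String is encoded to bytes (UTF-8 is assumed here purely as an example encoding).

```java
import java.nio.charset.StandardCharsets;

public class CharWidthDemo {
    // Hypothetical helper: number of bytes the string takes when encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String latin = "a";
        String hiragana = "\u308F"; // HIRAGANA LETTER WA, written as an escape

        // In memory, both are a single 16-bit char.
        System.out.println(Character.BYTES);   // 2
        System.out.println(latin.length());    // 1
        System.out.println(hiragana.length()); // 1

        // Only the encoded form differs in width.
        System.out.println(utf8Length(latin));    // 1
        System.out.println(utf8Length(hiragana)); // 3
    }
}
```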
Hi Sergey,
Thank you for the response.
I created a bit of confusion when I said 1-byte chars. What I meant was characters that lie from 0-255 on the ASCII chart. Sorry about that.
I tried a little program and I am not getting the desired output. Can you tell me where I am making a mistake?
**************************************************
Pattern p = new Pattern("\"\\w+\"");
Replacer replacer=p.replacer("\"illegal token\"");
String str1 = "'102', '102', '102', '102', \"わかりました\"";
String str2 = "'102', '102', '102', '102', \"abcdegf\"";
out.println("<br><br>output of replacer str is : "+replacer.replace(str1));
out.println("<br><br>output of replacer str is : "+replacer.replace(str2));
*****************************************************
The output for str2 is correct and "abcdegf" gets replaced. But str1 contains Japanese characters. They do not get replaced.
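A likely cause, sketched here with the JDK's own java.util.regex rather than jregex: in java.util.regex, \w matches only [a-zA-Z_0-9] by default, so it never matches hiragana. Whether jregex defines \w the same way is an assumption that should be checked against its documentation. The class name WordClassDemo and the helper replaceToken are made up for this example. The sketch also shows two patterns that do match: \p{L} (a letter in any script) and a character class for everything outside the 0-255 range mentioned above.

```java
import java.util.regex.Pattern;

public class WordClassDemo {
    // The hiragana text from the post above, written with Unicode escapes
    // so the source compiles regardless of file encoding.
    static final String JP = "\u308F\u304B\u308A\u307E\u3057\u305F";

    // Default \w in java.util.regex is ASCII-only: [a-zA-Z_0-9].
    static final Pattern ASCII_WORD = Pattern.compile("\"\\w+\"");
    // \p{L} matches a letter from any script, including hiragana.
    static final Pattern ANY_LETTER = Pattern.compile("\"\\p{L}+\"");
    // One or more characters outside the range 0-255.
    static final Pattern ABOVE_255 = Pattern.compile("[^\\x00-\\xFF]+");

    // Hypothetical helper: replace every match with the quoted marker text.
    static String replaceToken(Pattern p, String input) {
        return p.matcher(input).replaceAll("\"illegal token\"");
    }

    public static void main(String[] args) {
        String str1 = "'102', '102', \"" + JP + "\"";
        String str2 = "'102', '102', \"abcdegf\"";

        System.out.println(replaceToken(ASCII_WORD, str2)); // ASCII token is replaced
        System.out.println(replaceToken(ASCII_WORD, str1).equals(str1)); // true: no match
        System.out.println(replaceToken(ANY_LETTER, str1)); // Japanese token is replaced
        System.out.println(ABOVE_255.matcher(JP).matches()); // true
    }
}
```

If the goal is to flag any quoted token that contains characters above 255, a pattern built around the third character class is probably closer to the intent than \w.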