In the example contained in the .zip file, the fields I and J (the COMP fields) are reported as Float data when you run the copybook through cb2java. But if you parse the example data, the resulting output for those columns is BigDecimal; I expected Double.
Am I wrong, or is this a bug?
Regards,
Sven
Sven-
Actually that is the intended 'natural' object for COBOL floating point types. The reason I made the decision to go with this was that float types are not consistent across platforms in COBOL. In AS400 (the platform I work with) floating point numbers are IEEE 754 just like Java but this is not true on all platforms. I believe, intuitively, that all floating point numbers can be represented as infinite length decimals, but I haven't proven this or anything. I'm actually a little surprised someone is actually using this feature. I merely added it for (semi-)completeness.
You can get the number as a float or a double (depending on the size) by casting to the appropriate Data class. Don't take this as a brush off. I'd really like to have a better solution than this. If you have an idea, please give me some advice.
And as always, if you would like to contribute or have any other problems or ideas please let me know. I'm not a COBOL expert by any stretch of the imagination and could use any and all help you care to give.
thanks,
-James
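On the question of whether every floating point number has a decimal representation: every finite IEEE 754 value is a binary fraction, so its decimal expansion is finite and a BigDecimal can hold it exactly. The sketch below uses only the standard library; the class name ExactDecimalDemo is made up for illustration and is not part of cb2java.

```java
import java.math.BigDecimal;

class ExactDecimalDemo {
    /**
     * Returns the exact decimal value of a finite double.
     * new BigDecimal(double) is exact; BigDecimal.valueOf(double) instead goes
     * through Double.toString and yields the shortest decimal that round-trips.
     * Throws NumberFormatException for NaN and the infinities.
     */
    static BigDecimal exact(double value) {
        return new BigDecimal(value);
    }

    public static void main(String[] args) {
        double d = 0.1;
        BigDecimal exact = exact(d);
        // Prints the full (finite) expansion of the double nearest to 0.1.
        System.out.println(exact);
        // Converting back gives the original double again.
        System.out.println(exact.doubleValue() == d);
    }
}
```

A float widens to double without loss, so the same call covers 4-byte values too. NaN and the infinities are the exceptions: they have no BigDecimal equivalent.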
Thanks for clarifying... I was comparing the "meta-data" and actual data output and it stood out a bit.
I'm giving it a shot to build cb2java into an ETL tool, and it's working pretty well so far :D . I did make a few small changes in the interfaces (getLength(), decimalPlaces() from private to public) but nothing else so far. I will let you know how it works out.
Regards,
Sven
I'm interested in seeing any changes you have made. If you are going to be using this as a core part of another tool, it might make sense to join this project. Just remember that this library is GPL. This pretty much has to be (as I understand it) because it uses source from another GPL project (cb2xml).
I'm pretty much ready to move this to a production-ready status (I've used it for 'real' project work) but I feel I need more validation. I'd also like to build a test suite. I should have done that first, I know.
I have some ideas for adding Visitor/Builder classes to make this library easier to use. If that's something you could use let me know.
How is the performance? I've seen some slowness on AS400 but I think that has to do with reading the copybooks from multi-member files.
thanks for your interest,
-James
Oh, one more thing. If you are not using IEEE 754 floats, your floats probably won't parse properly. I only have support for IEEE 754 built in right now. OK, yeah, that kind of seems stupid given my last post. It's ready to support more types in theory but I haven't done so basically because IEEE 754 parsing is built into Java and I didn't really figure anyone else would use it.
The only change I have made so far is to make getLength() and decimalPlaces() public, starting from Element downwards.
The idea is to incorporate it in a step in PDI/Kettle (http://kettle.pentaho.org) eventually. For the moment it's just a proof of concept thing, but the functionality seems to work ok. I'm currently still searching for example copybooks and data files to test further with.
Depending on what will happen it will either end up in PDI (also GPL'ed), or as a small separate open source project.
Regards,
Sven
OK so getLength on the Element class gives you the length of the serialized data and I think you are referring to the decimalLength method on Numeric.
If you don't mind can you explain why you need the length method? This is just the byte length of the serialized data and (I thought) only useful for parsing and writing. What am I missing?
thanks,
-James
Hello dubwai,
Thanks for creating this project. My task is to import COBOL data to the current application and I am using cb2java for this task. I must say that cb2java saves me months of development. Thanks again.
As with Sven Boden, I made Element.getLength() public. Also, in Copybook.parse(InputStream), I return a list of Records instead of a single record. I also pass in a number of bytes to skip after each record, since some data files contain a new line as a data record separator.
Thanks,
Trung Nguyen.
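The multi-record change described above can be sketched roughly as follows. This is not Trung's actual patch and does not use the cb2java API: RecordSplitter, split, recordLength and skipBytes are hypothetical names, and the sketch only cuts a stream into fixed-length byte records that could then be parsed one at a time.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

class RecordSplitter {
    /**
     * Splits a stream into fixed-length records, discarding skipBytes bytes
     * (for example a 1-byte newline separator) after each record.
     * A truncated final record raises EOFException via readFully.
     */
    static List<byte[]> split(InputStream in, int recordLength, int skipBytes) throws IOException {
        if (recordLength <= 0 || skipBytes < 0) {
            throw new IllegalArgumentException("recordLength must be > 0 and skipBytes >= 0");
        }
        List<byte[]> records = new ArrayList<>();
        DataInputStream data = new DataInputStream(in);
        while (true) {
            int first = data.read();
            if (first < 0) {
                break; // clean end of input on a record boundary
            }
            byte[] record = new byte[recordLength];
            record[0] = (byte) first;
            data.readFully(record, 1, recordLength - 1);
            records.add(record);
            // Skip the separator; tolerate a missing separator after the last record.
            for (int i = 0; i < skipBytes; i++) {
                if (data.read() < 0) {
                    break;
                }
            }
        }
        return records;
    }
}
```

Each returned array has exactly recordLength bytes, which is the serialized length that Element.getLength() is described as reporting earlier in the thread.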
Hi Trung. Thanks for providing feedback and I'm glad this is helping you. I am still considering the project alive even though I have not made any updates to it lately, in case anyone is wondering.
Returning a single record instead of multiples was definitely an oversight. I will try to incorporate the changes you and Sven have added in a new build soon. If you have any interest in joining the project, that is a possibility too.
Once the library is at a 1.0 release, I will probably add a 2.0 version that includes generics. I would also like to create a custom parser, because it seems to me that loading the sable classes takes a long time relative to the rest of the processing. That's my best guess at this point anyway. It would also allow for different licensing schemes. Right now cb2java uses the CB2XML parsing libraries and that project is GPL.
And of course I have a wish list of things like bean support and a visitor implementation.
Please continue to provide feedback, especially if you run into any issues.
thanks again,
-James
decimalLength() as well of course...
I use them for display purposes only. I'll send you some screenshots after it's more finished (give me a week or so). Essentially, a user can enter a copybook in a text window, and then in another window he will be able to see (but not adapt) the field names and field types that will be used in PDI after the extraction. For the moment I also show the length in the input file there, for information.
Regards,
Sven
Now that you mention it, I think I've needed this for my little testing gui. I think that I let Jython access private members though. This should be public. There's really no reason for it not to be. I like to try and understand where my preconceptions get off track.
thanks,
-James
Hi James,
After putting cb2java through more tests, I made some additional changes. So far, I am able to read data for the following data types:
05 f-alphabetic pic A(9).
05 f-alpha-numeric pic X(9).
05 f-numeric pic s9(9)V99.
05 f-comp-1 comp-1.
05 f-comp-2 comp-2.
05 f-comp-3 pic s9(9)V99 comp-3.
I am not able to use these types because they are binary and system dependent.
* 05 f-comp pic s9(9)V99 comp.
* 05 f-comp-4 pic s9(9)V99 comp-4.
* 05 f-comp-5 pic s9(9)V99 comp-5.
* 05 f-comp-x pic 9(9) comp-x.
* 05 f-binary pic s9(9)V99 binary.
-------------------------------------------------
And below are the changes I made:
Decimal.parse(byte[] bytes)
{
    String input = getString(bytes).trim();
    //original: sign leading
    //char c = input.charAt(0);
    //String s = (isPositive(c) ? "" : "-") + getNumber(c) + input.toString().substring(1);
    //tdn: sign trailing (last char contains the sign)
    int last = input.length() - 1;
    char c = input.charAt(last);
    //substring(0, last) keeps every char before the sign char (the end index is exclusive)
    String s = (isPositive(c) ? "" : "-") + input.substring(0, last) + getNumber(c);
Packed.parse(byte[] input)
{
    //original: sign leading
    //boolean negative = signed() && (input[0] & 0x0F) == 0x0D;
    //tdn: sign trailing (last nybble contains the sign)
    byte lastByte = input[input.length - 1];
    boolean negative = signed() && (lastByte & 0x0F) == 0x0D;
IEEE754.fromBytes(byte[] input, Precision p)
{
    long bits = 0;
    //original: not sure why it does not work
    //for (int i = 0; i < p.bytes; i++) {
    //    bits = bits | (0xFF & input[i]);
    //    if (i < p.bytes - 1) bits = bits << 8;
    //}
    // tdn:
    int shift = 0;
    for (int i = 0; i < input.length; i++) {
        bits |= ((long)(0xFF & input[i])) << shift;
        shift += 8;
    }
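One observation on the two loops above: the original assembles the bytes most-significant byte first (big-endian), while the replacement assembles them least-significant byte first (little-endian). So the difference may simply be the byte order of the platform that wrote the data, although nothing in the thread confirms that. The sketch below makes the byte order an explicit parameter using java.nio; FloatBytes and toDouble are hypothetical names, not part of cb2java.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class FloatBytes {
    /**
     * Decodes a 4-byte (single) or 8-byte (double) IEEE 754 value
     * using the given byte order.
     */
    static double toDouble(byte[] input, ByteOrder order) {
        ByteBuffer buffer = ByteBuffer.wrap(input).order(order);
        switch (input.length) {
            case 4:
                return buffer.getFloat();  // widened to double without loss
            case 8:
                return buffer.getDouble();
            default:
                throw new IllegalArgumentException("expected 4 or 8 bytes, got " + input.length);
        }
    }

    public static void main(String[] args) {
        byte[] bigEndian = {0x3F, (byte) 0xF8, 0, 0, 0, 0, 0, 0};    // 1.5 as a double
        byte[] littleEndian = {0, 0, 0, 0, 0, 0, (byte) 0xF8, 0x3F}; // same value, reversed
        System.out.println(toDouble(bigEndian, ByteOrder.BIG_ENDIAN));
        System.out.println(toDouble(littleEndian, ByteOrder.LITTLE_ENDIAN));
    }
}
```

With the byte order passed in, a single implementation could serve both platforms if the order were read from configuration, assuming both really are IEEE 754.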
Trung-
If the system is IEEE 754, I was expecting these types to work. It's been a while since I looked at this stuff. If you are not on IEEE 754 and you are feeling adventurous, feel free to take a shot at implementing the code to handle this on your platform. There is a way to set which float parser to use in the configuration file; there's just only one implementation.
I guess I missed the trailing sign. But is it not possible to have the leading char be the sign character?
If you have verified that your version works, I'll just incorporate it. I'm not inclined to figure out why the other doesn't work if we have a working version.
If you would like me to make you a developer on the project, it might be easier.
let me know.
thanks,
-James
Hi James,
For the IEEE754 issue: the original code works for small numbers but not for large ones. The updated code works for all of them, but I am not sure whether this is a system-dependent issue.
For the sign issue: on the PC platform using Micro Focus COBOL, a numeric display field can be defined with one of the following syntaxes (the first one is the default).
* 05 f1 pic s9 sign trailing.
* 05 f2 pic s9 sign trailing separate.
* 05 f3 pic s9 sign leading.
* 05 f4 pic s9 sign leading separate.
I guess that the AS400 platform chose the 3rd syntax as the default.
Thanks,
-Trung
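To make the leading/trailing distinction concrete for the non-separate (embedded) sign, here is a minimal sketch with a configurable sign position. It assumes an EBCDIC-style zoned encoding, where each byte carries a digit in its low nibble and the sign byte has 0xD in its high nibble for negative values; ASCII-based compilers may encode the embedded sign differently. ZonedDecimal and SignPosition are hypothetical names, not cb2java classes.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

class ZonedDecimal {
    /** Which byte carries the embedded sign (hypothetical configuration option). */
    enum SignPosition { LEADING, TRAILING }

    /**
     * Parses an EBCDIC-style zoned decimal: low nibble = digit,
     * high nibble of the sign byte = 0xD for negative (assumption).
     * scale is the number of implied decimal places (the digits after V).
     */
    static BigDecimal parse(byte[] input, SignPosition position, int scale) {
        if (input.length == 0) {
            throw new IllegalArgumentException("empty field");
        }
        int signIndex = (position == SignPosition.LEADING) ? 0 : input.length - 1;
        boolean negative = ((input[signIndex] >> 4) & 0x0F) == 0x0D;

        StringBuilder digits = new StringBuilder(input.length);
        for (byte b : input) {
            int digit = b & 0x0F;
            if (digit > 9) {
                throw new IllegalArgumentException("invalid digit nibble: " + digit);
            }
            digits.append((char) ('0' + digit));
        }
        BigInteger unscaled = new BigInteger(digits.toString());
        return new BigDecimal(negative ? unscaled.negate() : unscaled, scale);
    }
}
```

For example, the bytes F1 F2 D3 with SignPosition.TRAILING and scale 0 decode to -123, and D1 F2 F3 with SignPosition.LEADING decode to the same value. The separate-sign variants would need an extra byte holding a plain '+' or '-' and are not covered here.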
Trung-
Thanks again for your continued interest. So from what I understand, we need to support both a leading and a trailing sign even when the sign is not separate. I somehow neglected to deal with this. And I'm also reading that we need to support a configurable default.
I hate to be pushy but I'm asking you directly if you will join this project as a developer. It will allow you to make the changes you need directly in the project. It won't entail any large commitments. If you just make the changes you need for your work, that will be great.
If you are not sure how to work with svn, it's no problem. It's just like cvs but better. A few minutes with the manual and you'll be all set.
-Matt
Hi Matt,
I agree with you that there is a need for a configurable default.
I will be glad to join the project, probably as QA on the PC platform initially. I have been working with both cvs and svn.
Thanks,
-Trung
Trung-
I have added you as a developer. Could you put the changes you have made into SVN? I'd like to build another release soon. Otherwise, you could just send me your sources.
thanks,
-Matt (James)
Hi Matt,
Sorry that I didn't respond promptly. I am reluctant to check in the code since the parts I changed are system dependent. I should wait until we have some kind of configuration.
-Trung
I have added the issues mentioned in this thread as bugs in the tracker. I would like to start maintaining them there to ensure they are not lost.
If I have missed any issues or new ones are discovered, please add them to the bug tracker. Feel free to continue to post them here too, if desired.
thanks,
-James