I tried the following: `Cobol2JsonSchema.newCobol2Json(copyBookFile).cobol2jsonSchema(printStream);` but this is printing the actual data in JSON format, not the schema definition.
Does it support Avro format?
Thanks @bruce_a_martin. Is there a possibility to output the Avro schema along with the file? That is helpful when parsing large JSON files, as it avoids schema inference in the downstream tools used for processing. I have noticed that for large JSON processing it is better to supply the schema in Avro format rather than relying on inference; schema inference is a very expensive operation for large datasets.
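To show the downstream side of what I mean, here is a rough sketch of the kind of consumer I have in mind (Spark's Java API is just one example of such a tool; the field list and file path are placeholders, not the real layout):

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.StructType;

public class ReadJsonWithSchema {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("read-cbl2json-output")
                .master("local[*]")
                .getOrCreate();

        // Placeholder field list -- in practice this would mirror the copybook layout.
        StructType schema = StructType.fromDDL(
                "rrcTapeRecordId STRING, daRemarkSequenceNumber INT, daRemarkLine STRING");

        // Supplying the schema up front skips the expensive inference pass over the whole file.
        Dataset<Row> records = spark.read()
                .schema(schema)
                .json("/path/to/allPermits.json");   // placeholder path

        records.show(5, false);
        spark.stop();
    }
}
```

With the schema supplied, the read can start immediately instead of scanning the whole file first, which is why having the schema emitted alongside the data would help.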
Thanks @bruce_a_martin. Is there a possibility to output the JSON schema along with the file? That would be helpful when parsing large JSON files, since it avoids schema inference in the other tools used for processing.
Thanks @bruce_a_martin. Is there a possibility to redirect the generated errors to a file, or suppress them, so that they do not show up on stdout mixed in with the final JSON output?
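In case it helps explain what I am after, this is the kind of thing I could do on my side, assuming the messages are written to System.err (if they actually go to System.out this will not help):

```java
import java.io.FileOutputStream;
import java.io.PrintStream;

public class RedirectErrors {
    public static void main(String[] args) throws Exception {
        // Send everything written to stderr into a file, so stdout only carries the JSON.
        System.setErr(new PrintStream(new FileOutputStream("cbl2json-errors.log"), true));

        System.err.println("this ends up in cbl2json-errors.log, not on the console");

        // ... run the Cobol2Json conversion here, writing its JSON to System.out ...
    }
}
```

Redirecting with `2>` at the shell level would work too, but only if the error messages really do go to stderr.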
@bruce_a_martin thanks for the update. A flatten option will be a great addition. Does setWriteCheck address the REDEFINES issue raised earlier?
@bruce_a_martin thanks for the update. A flatten option will be a great addition. Are you planning to address the REDEFINES issue as well?
@bruce_a_martin thanks for the update. I will test it out. What is the option or flag to suppress groups? I assume this helps flatten out the record, which would be a great option for saving space and compressing large data further.
Hi @bruce_a_martin - I noticed that the non-pretty flag is resulting in errors. I tried validating using jq; pretty formatting is not yielding any errors.

    parse error: Unfinished string at EOF at line 1, column -1313092230
@bruce_a_martin, below is a snippet of the code:

    Cobol2Json.newCobol2Json(copyBookFile)
        .setFileOrganization(IFileStructureConstants.IO_BIN_TEXT)
        .setPrettyPrint(false)
        .setSplitCopybook(CopybookLoader.SPLIT_01_LEVEL)
        .setTagFormat(IReformatFieldNames.RO_CAMEL_CASE)
        .setDropCopybookNameFromFields(true)
        .setRecordSelection("DAROOT", Cobol2Json.newFieldSelection("RRC-TAPE-RECORD-ID", "01"))
        .setRecordSelection("DAPERMIT", Cobol2Json.newFieldSelection("RRC-TAPE-RECORD-ID", "02"))
        .setRecordSelection("DAFIELD",...
Thanks @bruce_a_martin. I noticed that in your latest updated jars setDropCopybookNameFromFields() is not working: whether it is set to true or false, the copybook name is still being added. With either setting I get the following output:

    {
      "allpermits": [{
        "daremark": {
          "rrcTapeRecordId": "12",
          "daRemarksSegment": {
            "daRemarkSequenceNumber": 1,
            "daRemarkFileDate": {
              "daRemarkFileCentury": 19,
              "daRemarkFileYear": 87,
              "daRemarkFileMonth": 3,
              "daRemarkFileDay": 3
            },
            "daRemarkLine":...
@bruce_a_martin how can I specify setPrettyPrint via the command line?
Thanks @bruce_a_martin. Is there a way in CBL2JSON to avoid the trimming of leading 0's for certain fields? I know you wrote a helper function to read the raw values in another thread. How can we apply that here in CBL2JSON? Thanks
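For context, the workaround I would rather avoid is re-padding the values myself after reading them, something like this (plain Java, with the width taken from the PIC clause):

```java
public class PadLeadingZeros {
    // Re-pad a value that has had its leading zeros trimmed, using the width from the PIC clause.
    static String padLeadingZeros(String trimmedValue, int declaredWidth) {
        StringBuilder sb = new StringBuilder(trimmedValue);
        while (sb.length() < declaredWidth) {
            sb.insert(0, '0');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(padLeadingZeros("12", 5)); // prints 00012, matching a PIC 9(05) field
    }
}
```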
Thanks @bruce_a_martin. That was a quick turnaround. I tested it and it finished to the end. Any idea why I am getting the errors below? Is there a way to parse these special Unicode characters?

    Line Number: 9030479 Error: Invalid Record Type �NO ALLOWABLE WILL BE ASSIGNED UNTIL THF0000000000 854e4f20414c4c4f5741424c452057494c4c2042452041535349474e454420554e54494c2054484630303030303030303030
    Line Number: 9117824 Error: Invalid Record Type � P0000000000 8520202020202020202020202020205030303030303030303030...
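For my own debugging I decoded the hex dump from the first of those lines (plain Java below); it looks like the record starts with a 0x85 byte rather than an ASCII digit, which I am guessing is why the record type cannot be matched:

```java
import java.nio.charset.StandardCharsets;

public class InspectBadRecord {
    public static void main(String[] args) {
        // Hex dump copied from the first failing line above.
        String hex = "854e4f20414c4c4f5741424c452057494c4c2042452041535349474e45"
                   + "4420554e54494c2054484630303030303030303030";

        byte[] bytes = new byte[hex.length() / 2];
        for (int i = 0; i < bytes.length; i++) {
            bytes[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }

        // The first byte is 0x85, not an ASCII digit, so the 2-character
        // RRC-TAPE-RECORD-ID field cannot match any of the numeric record selections.
        System.out.printf("first byte = 0x%02X%n", bytes[0]);
        System.out.println(new String(bytes, 1, bytes.length - 1, StandardCharsets.US_ASCII));
    }
}
```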
Thanks @bruce_a_martin. Can you also look into the following options for the long term:

* Output in pretty format or not
* Stream output to stdout instead of to a file

How soon can I get the updated jar with the error handling that skips records that can't be read? Thanks
Hi @bruce_a_martin, if it is using Jackson for streaming then that's great. I tried running the command below. It works almost 80% of the way through, but I am not sure why there is an error on line 3124060.

    java -jar Cobol2Json.jar -cobol /diskf/RRCDataFiles/allPermits.cbl -fileOrganisation Text -split 01 -recordSelection DAROOT RRC-TAPE-RECORD-ID=01 -recordSelection DAPERMIT RRC-TAPE-RECORD-ID=02 -recordSelection DAFIELD RRC-TAPE-RECORD-ID=03 -recordSelection DAFLDSPC RRC-TAPE-RECORD-ID=04 -recordSelection DAFLDBHL RRC-TAPE-RECORD-ID=05...
Thanks @bruce_a_martin. Does cbl2json support streaming the data in chunks?
Thanks for the update @bruce_a_martin. I am not sure how REDEFINES needs to be handled; I am new to COBOL data structures. Should only one of the fields be extracted? How should it be handled on the JRecord side? If you can share example code to tackle this, that would be great. Thanks
I am noticing this is occurring in the DAPERMIT section wherever a field has a REDEFINES clause.
Yes. I have attached it.
I am wondering if it is to do with records where there are REDEFINES, and whether those need to be removed since they share existing space with another field.
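Just to check my understanding of "share existing space", here is a toy Java sketch of two field layouts over the same bytes (the offsets and values are made up, not taken from the real copybook):

```java
public class RedefinesExample {
    public static void main(String[] args) {
        // One record: the first 8 bytes can be viewed two different ways.
        String record = "19870303REMAINDER-OF-RECORD";

        // View 1: a single 8-character field.
        String asSingleField = record.substring(0, 8);   // "19870303"

        // View 2 (REDEFINES view 1): the same 8 bytes split into a date.
        String century = record.substring(0, 2);         // "19"
        String year    = record.substring(2, 4);         // "87"
        String month   = record.substring(4, 6);         // "03"
        String day     = record.substring(6, 8);         // "03"

        System.out.println(asSingleField + " = " + century + "/" + year + "/" + month + "/" + day);
    }
}
```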
Hi @bruce_a_martin. I am trying to parse the DAPERMIT section of the COBOL data file based on the structure defined here: https://www.rrc.texas.gov/media/ezxjqdmn/oga049.pdf I convert each record to a JSON object. Within the object I noticed some fields are not being read correctly. These are the ones I noticed not being parsed correctly:

* daSurfaceSurveyDirection1
* daSurfaceSurveyFeet2
* daSurfaceSurvey
* daSurfaceSurveyFeet1
* daSurfaceLeaseDirection2
* daNearestLeaseLine
* daNearestWellFeet
* daNearestWell
* daSurfaceAbstract...
I noticed that the SEQUENCE-NUMBER indicates whether a record is a continuation of the previous one.
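Outside of JRecord, this is roughly the stitching I have in mind (plain Java; the column offsets are my guesses from the DAREMARK copybook, the file name is a placeholder, and I am assuming the sequence number restarts at 001 for each new remark):

```java
import java.io.BufferedReader;
import java.io.FileReader;

public class StitchRemarks {
    public static void main(String[] args) throws Exception {
        // Assumed offsets: record id cols 1-2, sequence number cols 3-5,
        // file date cols 6-13, remark text cols 14-83.
        try (BufferedReader reader = new BufferedReader(new FileReader("daremark-records.txt"))) {
            StringBuilder remark = new StringBuilder();
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.length() < 14 || !line.startsWith("12")) {
                    continue;                         // only DAREMARK records (record id "12")
                }
                int sequence = Integer.parseInt(line.substring(2, 5).trim());
                String text = line.substring(13, Math.min(83, line.length()));
                if (sequence == 1 && remark.length() > 0) {
                    System.out.println(remark);       // previous remark is complete
                    remark.setLength(0);
                }
                remark.append(text);
            }
            if (remark.length() > 0) {
                System.out.println(remark);
            }
        }
    }
}
```

If the sequence numbers do not restart at 001, this logic would need a different end-of-remark marker.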
Thanks @bruce_a_martin, are you able to share what the code might look like? How will I determine the end of the string here? Can P0000000000 be treated as a newline character?
Hi @bruce_a_martin, in the copybook file I have lots of FILLER entries like the ones below:

    01  DAREMARK.
        02  RRC-TAPE-RECORD-ID             PIC X(02).
        02  DA-REMARKS-SEGMENT.
            03  DA-REMARK-SEQUENCE-NUMBER  PIC 9(03) VALUE ZEROS.
            03  DA-REMARK-FILE-DATE.
                05  DA-REMARK-FILE-CENTURY PIC 9(02) VALUE ZEROS.
                05  DA-REMARK-FILE-YEAR    PIC 9(02) VALUE ZEROS.
                05  DA-REMARK-FILE-MONTH   PIC 9(02) VALUE ZEROS.
                05  DA-REMARK-FILE-DAY     PIC 9(02) VALUE ZEROS.
            03  DA-REMARK-LINE             PIC X(70) VALUE SPACES.
            03  FILLER                     PIC X(10) VALUE ZEROS.
        02  RRC-TAPE-FILLER...
I updated the copybook file section to the one below and it worked. Thanks:

    01  DAW999A1.
        05  RRC-TAPE-RECORD-ID        PIC X(02).
        05  DA-SURF-LOC-LONGITUDE     PIC 9(5)V9(9) VALUE SPACES.
        05  DA-SURF-LOC-LATITUDE      PIC 9(5)V9(9) VALUE SPACES.
    01  DAW999B1.
        05  RRC-TAPE-RECORD-ID        PIC X(02).
        05  DA-BOTTOM-HOLE-LONGITUDE  PIC 9(5)V9(9) VALUE SPACES.
        05  DA-BOTTOM-HOLE-LATITUDE   PIC 9(5)V9(9) VALUE SPACE
Hi @bruce_a_martin, I have managed to generate the Java code to read and extract the records. I have an issue with the following section:

    01  DAW999A1.
        05  DA-SURF-LOC-LONGITUDE     PIC 9(5)V9(7) VALUE SPACES.
        05  DA-SURF-LOC-LATITUDE      PIC 9(5)V9(7) VALUE SPACES.
    01  DAW999B1.
        05  DA-BOTTOM-HOLE-LONGITUDE  PIC 9(5)V9(7) VALUE SPACES.
        05  DA-BOTTOM-HOLE-LATITUDE   PIC 9(5)V9(7) VALUE SPACES.

Record numbers 14 and 15 have latitude and longitude values that appear in the data file like this:

    14 -94.3251710 31.4884060
    15...
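To spell out why I think PIC 9(5)V9(7) does not match the data, here is a small Java comparison of an implied-decimal reading versus the value as it actually appears in the file (the digit string for the implied case is just what 94.3251710 would look like with no sign and no decimal point):

```java
import java.math.BigDecimal;

public class ImpliedVsActualDecimal {
    public static void main(String[] args) {
        // PIC 9(5)V9(7) describes 12 digits with an *implied* decimal point and no sign:
        // "000943251710" would be read back as 94.3251710.
        BigDecimal implied = new BigDecimal("000943251710").movePointLeft(7);

        // What the file actually contains for record 14: a real minus sign and decimal point,
        // which an unsigned implied-decimal picture cannot represent.
        BigDecimal actual = new BigDecimal("-94.3251710");

        System.out.println(implied); // 94.3251710
        System.out.println(actual);  // -94.3251710
    }
}
```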
Bruce, thanks for your prompt reply. Commenting that line out worked. However, when you open the data file the encoding seems to be ASCII. So will there still be corruption with the COMP field? What is your recommendation for handling the COMP field in an ASCII data file to avoid such corruption or incorrect parsing of the file?
Hi, I am trying to read a single copybook that represents multiple records in an ASCII file output. I am new to COBOL and have been trying to parse the copybook in RecordEditor to generate the Java code. I can successfully load the copybook under Utilities -> Cobol Copybook Analysis. When I try to import the copybook using "Load Cobol Record Layout" I get a "String index out of range: 0" error. I need some guidance on whether the format and structure of the copybook are valid and also if the file structure is correct...