#22 Delaying Cell Processor exceptions

Milestone: Outstanding
Status: open
Owner: James Bassett
Labels: None
Priority: 6
Updated: 2013-04-26
Created: 2012-12-03
Creator: James Bassett
Private: No
8 comments

Currently, CSV reading/writing stops immediately when encountering a cell processor exception. This is fine in most cases, but it would be nice to have the option to process all columns, then throw the exception. This way, the whole row of data can be fixed at once, instead of re-running after fixing each column.

This is currently (v2.0.0) achievable by using a custom cell processor (see this question on StackOverflow for an example), but it would be better for the functionality to be built into the readers/writers.

The main use case for this is probably a batch scenario: you want to process all of the data (when writing) or all of the CSV (when reading) in one pass, and keep track of all of the errors so you can fix them all.
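
As a rough illustration of the requested behaviour, here is a minimal sketch of "process every column, then throw once per row". It does not use Super CSV's API: `CellError`, `RowProcessingException` and `DelayedRowProcessor` are hypothetical names invented for this example, and plain `Function` objects stand in for cell processors.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical stand-in types for illustration only; these are not Super CSV classes.
class CellError {
    final int rowNumber;
    final int columnNumber;
    final String message;

    CellError(int rowNumber, int columnNumber, String message) {
        this.rowNumber = rowNumber;
        this.columnNumber = columnNumber;
        this.message = message;
    }
}

// Carries every cell failure found in one row.
class RowProcessingException extends RuntimeException {
    final List<CellError> errors;

    RowProcessingException(List<CellError> errors) {
        super(errors.size() + " cell(s) failed in row " + errors.get(0).rowNumber);
        this.errors = errors;
    }
}

class DelayedRowProcessor {
    // Runs every processor, even after a failure, and throws once at the end of the row.
    static List<Object> processRow(int rowNumber, List<String> cells,
                                   List<Function<String, Object>> processors) {
        List<Object> results = new ArrayList<>();
        List<CellError> errors = new ArrayList<>();
        for (int i = 0; i < cells.size(); i++) {
            try {
                results.add(processors.get(i).apply(cells.get(i)));
            } catch (RuntimeException e) {
                // Record the failure (1-based column number) and carry on with the next column.
                errors.add(new CellError(rowNumber, i + 1, e.getMessage()));
                results.add(null); // placeholder keeps column positions aligned
            }
        }
        if (!errors.isEmpty()) {
            throw new RowProcessingException(errors);
        }
        return results;
    }
}
```

A batch job could catch `RowProcessingException` for each row, append its `errors` to a report, and move on to the next row, so that every problem in the file is known after a single pass.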

This has been on the roadmap for a while - so probably time to do it!

Discussion

  • James Bassett
    2012-12-09

    • milestone: Outstanding --> 2.1.0
     
  • Alan Mehio
    2012-12-19

    James,
    It would be nice to have this feature, but it is not enough on its own. In a GUI application, the user usually needs feedback on which rows and columns are incorrect, and on how many rows are correct and incorrect. Delaying the exception is fine, but a general callback mechanism would be better: something like a SAX parser's, but able to flag the exception in the callback without interrupting processing, for example by invoking the callback with different attributes/flags. The caller could then store all the bad rows along with the details of what is wrong with them, as well as all the correct rows.
    Currently, I am overriding the execute method of the NotNull and StrRegEx classes to prevent parsing from stopping when a row is defective.

    Hope this helps.
    Regards,
    Alan Mehio
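
One way to picture the SAX-style callback suggested above is sketched below. Everything here is an assumption made for illustration: `RowHandler`, `CellCheck`, `CellProblem`, `CallbackValidator` and `CollectingHandler` are hypothetical names, not part of Super CSV, and the two checks only loosely mimic what NotNull and StrRegEx do.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical types for illustration only; none of these exist in Super CSV.
class CellProblem {
    final int columnNumber;
    final String reason;

    CellProblem(int columnNumber, String reason) {
        this.columnNumber = columnNumber;
        this.reason = reason;
    }
}

// Returns null when the value is acceptable, otherwise a short reason.
interface CellCheck {
    String check(String value);
}

// SAX-like callbacks: the validator reports rows and never throws for bad data.
interface RowHandler {
    void onValidRow(int rowNumber, List<String> row);
    void onInvalidRow(int rowNumber, List<String> row, List<CellProblem> problems);
}

class CallbackValidator {
    static CellCheck notEmpty() {
        return v -> (v == null || v.isEmpty()) ? "must not be empty" : null;
    }

    static CellCheck matches(String regex) {
        Pattern pattern = Pattern.compile(regex);
        return v -> (v != null && pattern.matcher(v).matches()) ? null : "does not match " + regex;
    }

    // Checks every column of every row and reports each row exactly once.
    static void validate(List<List<String>> rows, List<CellCheck> checks, RowHandler handler) {
        for (int r = 0; r < rows.size(); r++) {
            List<String> row = rows.get(r);
            List<CellProblem> problems = new ArrayList<>();
            for (int c = 0; c < checks.size(); c++) {
                String value = c < row.size() ? row.get(c) : null;
                String reason = checks.get(c).check(value);
                if (reason != null) {
                    problems.add(new CellProblem(c + 1, reason));
                }
            }
            if (problems.isEmpty()) {
                handler.onValidRow(r + 1, row);
            } else {
                handler.onInvalidRow(r + 1, row, problems);
            }
        }
    }
}

// Example handler: keeps the counts and messages a GUI might display.
class CollectingHandler implements RowHandler {
    final List<Integer> validRows = new ArrayList<>();
    final List<String> problems = new ArrayList<>();
    int invalidRowCount;

    @Override
    public void onValidRow(int rowNumber, List<String> row) {
        validRows.add(rowNumber);
    }

    @Override
    public void onInvalidRow(int rowNumber, List<String> row, List<CellProblem> found) {
        invalidRowCount++;
        for (CellProblem p : found) {
            problems.add("row " + rowNumber + ", column " + p.columnNumber + ": " + p.reason);
        }
    }
}
```

With a handler like this, the caller decides what to do with each row: count it, display it, or write it to a separate file of rejected lines.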

     
  • Adam Brown
    2013-01-03

    Alan,
    To do this, couldn't you implement a "new LogAndDontStop()" cell processor, that just wraps the execute return statement in a try/catch, and then add the error message from the exception stack to log/report/output in your desired fashion?
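
A minimal sketch of the wrapper idea described above, under stated assumptions: the `Processor` interface is a simplified, hypothetical stand-in for a cell processor (it is not Super CSV's `CellProcessor`), and the "log" is just a list of strings.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified, hypothetical stand-in for a cell processor; not Super CSV's interface.
interface Processor {
    Object execute(Object value, int rowNumber, int columnNumber);
}

// Wraps another processor, records any failure, and lets processing continue.
class LogAndDontStop implements Processor {
    private final Processor next;
    private final List<String> log;

    LogAndDontStop(Processor next, List<String> log) {
        this.next = next;
        this.log = log;
    }

    @Override
    public Object execute(Object value, int rowNumber, int columnNumber) {
        try {
            return next.execute(value, rowNumber, columnNumber);
        } catch (RuntimeException e) {
            // Capture the position as plain values at the moment of failure.
            log.add("row " + rowNumber + ", column " + columnNumber + ": " + e.getMessage());
            return value; // pass the unprocessed value through so the row keeps its shape
        }
    }
}
```

Because the wrapper swallows the exception, the caller only learns that a row was bad by inspecting the log afterwards.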

     
  • James Bassett
    2013-01-03

    Yeah, that's essentially what the linked StackOverflow answer does. The only issue is that the row is still written (which is not always desirable) - hence the feature request for better support.

     
  • Adam Brown
    2013-01-11

    The other issue, which I just ran into while writing the "SuppressExceptions" cell processor described above, is that the rowContext is only passed by reference. When used on a reader, printing out the errors afterwards shows every one of them as coming from the last column of data. To do it right, it looks like the cellProcessors need to copy the context object rather than just include the reference in the thrown exception, since the context continues to be used and updated.
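
The aliasing problem described above can be reproduced in isolation. In this sketch, `RowContext` and `ContextCopyDemo` are hypothetical names chosen for the example (not Super CSV's classes), and an empty cell stands in for a failing cell processor.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical mutable context, reused for every cell in a row.
class RowContext {
    int rowNumber;
    int columnNumber;

    RowContext(int rowNumber, int columnNumber) {
        this.rowNumber = rowNumber;
        this.columnNumber = columnNumber;
    }

    // Snapshot of the current position, unaffected by later updates.
    RowContext copy() {
        return new RowContext(rowNumber, columnNumber);
    }
}

class ContextCopyDemo {
    // Returns the column number reported for each empty cell.
    // copyContext selects between storing a snapshot and storing the shared reference.
    static List<Integer> failedColumns(List<String> cells, boolean copyContext) {
        RowContext shared = new RowContext(1, 0); // one mutable context for the whole row
        List<RowContext> captured = new ArrayList<>();
        for (int i = 0; i < cells.size(); i++) {
            shared.columnNumber = i + 1; // updated in place for every cell
            if (cells.get(i).isEmpty()) {
                captured.add(copyContext ? shared.copy() : shared);
            }
        }
        // Errors are only reported after the whole row has been processed.
        List<Integer> columns = new ArrayList<>();
        for (RowContext ctx : captured) {
            columns.add(ctx.columnNumber);
        }
        return columns;
    }
}
```

For the cells `"", "b", "", "d"`, storing the shared reference reports columns 4 and 4, while storing a copy reports columns 1 and 3.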

     
  • James Bassett
    2013-04-24

    • Group: 2.1.0 --> Outstanding
     
  • James Bassett
    2013-04-24

    Deferring to next release

     
  • Feng Dihai
    2013-04-26

    It would be very helpful to be able to create a BAD file containing only the failed lines, rather than having processing interrupted.