#124 Support: Can beanVerifier validate all errors in a row

v1.0 (example)
open
nobody
None
5
2024-04-20
2024-04-13
No

Hello OpenCSV team,

First, thank you for the project.
I need some help with a use case. My project validates every column, and I need to collect the errors from all columns at once.
I want to get all of these exceptions in a single list with csvToBean.getCapturedExceptions().
The problem is that beanVerifier stops as soon as the first CsvConstraintViolationException is thrown.

I thought I could work around this by not throwing the error and instead storing the errors in the ExceptionHandler, but csvToBean.getCapturedExceptions() doesn't seem to see them and returns an empty list.

I have a minimal example project here: https://github.com/yatw/openCsvValidationDemo/blob/master/src/test/java/com/yatw/opencsvvalidationdemo/OpencsvvalidationdemoApplicationTests.java

I want to check with the team whether this is a limitation of the project or whether I am just not using beanVerifier correctly. Thank you.

Sincerely,
Yat Man, Wong

Discussion

  • Scott Conway

    Scott Conway - 2024-04-14

    If you want to look at a working example, download the opencsv code and look at AnnotationTest, specifically the testMultipleExceptionsPerLine method.

    Looking at your example (thank you for uploading it to GitHub, as that made it much easier to see what is going on), the first thing that jumped out at me was that you don't need .withExceptionHandler(new ExceptionHandlerQueue()), as that is the default used when you call .withThrowExceptions(false). You only need withExceptionHandler if you have your own custom handler you want to use.

    Your code is also not going to work because you have one verifier doing all three checks, so it automatically skips the others as soon as it throws an exception. Using the existing test as an example, I broke it into separate verifiers.

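    import com.opencsv.bean.BeanVerifier;
    import com.opencsv.bean.CsvToBean;
    import com.opencsv.bean.CsvToBeanBuilder;
    import com.opencsv.bean.HeaderColumnNameMappingStrategy;
    import com.opencsv.enums.CSVReaderNullFieldIndicator;
    import com.opencsv.exceptions.CsvConstraintViolationException;
    import com.opencsv.exceptions.CsvException;
    import org.junit.jupiter.api.Test; // JUnit 5 assumed
    import org.springframework.core.io.ClassPathResource;

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStream;
    import java.io.InputStreamReader;
    import java.io.Reader;
    import java.util.List;

    // BuildingDTO is the bean class from the demo project.
    class OpencsvvalidationdemoApplicationTests {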
    
        @Test
        void mytest() throws IOException {
            /**
             * Sample file
             * ID,Building Name,Country
             * ,testBuilding1,
             * 12345,testBuilding2,
             *
             * Sample validation check non null for every column
             * Expectation: csvToBean.getCapturedExceptions() return all errors from the same row, for every row
             */
            InputStream fileStream = new ClassPathResource("apr11.csv").getInputStream();
            Reader reader = new BufferedReader(new InputStreamReader(fileStream));
    
    
            HeaderColumnNameMappingStrategy<BuildingDTO> strategy = new HeaderColumnNameMappingStrategy<>();
            strategy.setType(BuildingDTO.class);
    
            BeanVerifier<BuildingDTO> nameVerifier = buildingDTO -> {
                if (buildingDTO.getName() == null){
                    throw new CsvConstraintViolationException("Name is required");
                }
                return true;
            };
    
            BeanVerifier<BuildingDTO> countryVerifier = buildingDTO -> {
                if (buildingDTO.getCountry() == null){
                    throw new CsvConstraintViolationException("Country is required");
                }
                return true;
            };
    
            BeanVerifier<BuildingDTO> idVerifier = buildingDTO -> {
                if (buildingDTO.getId() == null){
                    throw new CsvConstraintViolationException("ID is required");
                }
                return true;
            };
            CsvToBean<BuildingDTO> csvToBean = new CsvToBeanBuilder<BuildingDTO>(reader)
                    .withFieldAsNull(CSVReaderNullFieldIndicator.EMPTY_SEPARATORS)
                    .withMappingStrategy(strategy)
                    .withVerifier(idVerifier)
                    .withVerifier(nameVerifier)
                    .withVerifier(countryVerifier)
                    .withThrowExceptions(false)
                    .build();
    
            csvToBean.setThrowExceptions(false);
            List<BuildingDTO> buildings = csvToBean.parse();
            System.out.println("This is buildings");
            System.out.println(buildings);
            List<CsvException> exceptions = csvToBean.getCapturedExceptions();
            System.out.println("This is exceptions");
            System.out.println(exceptions);
            //This is buildings
            //[BuildingDTO(id=null, name=testBuilding1, country=null), BuildingDTO(id=null, name=testBuilding2, country=null)]
            //This is exceptions
            //[]
            reader.close();
        }
    }

    The issue, though, is that it did not work. I think that may be because of the lazy evaluation of the lambdas. You really need to move those into their own private inner class or a separate class. Sorry, I would have tested that myself, but things are getting busy this weekend at my house, so I will leave it to you to test it out.

    Scott :)

     
  • Yat Man, Wong

    Yat Man, Wong - 2024-04-19

    Thank you, Scott! I tested moving the verifiers into their own classes. What I found is that only the exception from the first verifier is returned from getCapturedExceptions(); the later verifiers are not executed once the first verifier has thrown an exception for that row.

    For example, in this change CountryVerifier was set first, so getCapturedExceptions() returns the exception from that verifier:
    https://github.com/yatw/openCsvValidationDemo/commit/657a9121e97b0a48ecad938e978e8ab58d34ef99#diff-b238fe60176ed02237cddabacb5b4ab05b6f7c26ede636fe8550faa77797979e
    If I instead set NameVerifier before CountryVerifier, the exception from NameVerifier would be thrown instead.

    I also tried testMultipleExceptionsPerLine() from the source code. It works well, but it uses validation from annotations, so I think the issue is in the verifier.

    Is BeanVerifier meant to support multiple errors in the same row?

     
    • Andrew Rucker Jones

      No, BeanVerifier is not meant to support multiple errors in the same row. As far as the implementation goes, BeanVerifier throws an exception if anything is wrong, which interrupts the normal flow of the program. It would be difficult to make that work for multiple errors in one row. Conceptually, I decided not to provide for the case of multiple errors per row because I figured if there was one error, it might create other errors by its mere existence. I'm thinking of programming in compiled languages like C thirty years ago when IDEs were little more than text editors. Back then, all you had to do was miss one semicolon, and the compiler spat out hundreds of errors. All of those errors did the programmer no good—there was really only one error, right at the beginning.
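
Since a verifier stops at its first throw, one possible workaround is to run every column check inside a single verifier and report them together. The following is a minimal sketch in plain Java, not opencsv's own mechanism: the class `AggregatingCheckSketch`, its nested `BuildingDTO` stand-in, and the helper names are all hypothetical. Inside a real BeanVerifier, the combined string would be passed to `new CsvConstraintViolationException(message)` when it is not empty, so the row still yields one exception that lists every problem.

```java
import java.util.ArrayList;
import java.util.List;

public class AggregatingCheckSketch {

    // Hypothetical stand-in for the BuildingDTO bean from the demo project.
    public static class BuildingDTO {
        final String id;
        final String name;
        final String country;

        public BuildingDTO(String id, String name, String country) {
            this.id = id;
            this.name = name;
            this.country = country;
        }
    }

    // Runs every column check and returns all messages,
    // instead of stopping at the first failed check.
    public static List<String> violations(BuildingDTO b) {
        List<String> messages = new ArrayList<>();
        if (b.id == null) {
            messages.add("ID is required");
        }
        if (b.name == null) {
            messages.add("Name is required");
        }
        if (b.country == null) {
            messages.add("Country is required");
        }
        return messages;
    }

    // Joins the messages into the single string one exception could carry.
    // Returns an empty string when the bean has no violations.
    public static String combinedMessage(BuildingDTO b) {
        return String.join("; ", violations(b));
    }

    public static void main(String[] args) {
        BuildingDTO row = new BuildingDTO(null, "testBuilding1", null);
        System.out.println(combinedMessage(row));
        // prints: ID is required; Country is required
    }
}
```

The trade-off is that getCapturedExceptions() would still hold one exception per row, so the individual messages have to be split back out of the combined text if they are needed separately.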

       
  • Scott Conway

    Scott Conway - 2024-04-20

    To expand on the cascading errors problem: we have had quite a few support requests where data was misquoted, causing shifts in the columns, so after the first error you really don't know whether anything after it is really an error or whether it's because you are looking at the wrong data for a given column.

     
  • Yat Man, Wong

    Yat Man, Wong - 2024-04-20

    Thank you both. The reason I have this use case is so that the user can see all the errors during an import, avoiding the need to fix them one by one. I could work around this by reading the CSV rows into a bean where all properties are strings and running my validation logic afterwards.
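
A rough sketch of that workaround, assuming the rows have already been parsed into an all-String bean: the class `PostParseValidationSketch`, the bean `RawBuilding`, and the "ID must be numeric" rule are hypothetical and only illustrate the idea. Validation runs after parsing, so every rule is applied to every row and the report is keyed by data row number.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PostParseValidationSketch {

    // Hypothetical all-String bean: every column is kept as text,
    // so type conversion cannot fail while the file is being read.
    public static class RawBuilding {
        final String id;
        final String name;
        final String country;

        public RawBuilding(String id, String name, String country) {
            this.id = id;
            this.name = name;
            this.country = country;
        }
    }

    static boolean isBlank(String s) {
        return s == null || s.trim().isEmpty();
    }

    // Applies every rule to one row and returns all problems found.
    public static List<String> check(RawBuilding row) {
        List<String> errors = new ArrayList<>();
        if (isBlank(row.id)) {
            errors.add("ID is required");
        } else if (!row.id.matches("\\d+")) {
            errors.add("ID must be numeric"); // hypothetical rule
        }
        if (isBlank(row.name)) {
            errors.add("Name is required");
        }
        if (isBlank(row.country)) {
            errors.add("Country is required");
        }
        return errors;
    }

    // Maps the 1-based data row number to its errors; clean rows are omitted.
    public static Map<Integer, List<String>> validateAll(List<RawBuilding> rows) {
        Map<Integer, List<String>> report = new LinkedHashMap<>();
        for (int i = 0; i < rows.size(); i++) {
            List<String> errors = check(rows.get(i));
            if (!errors.isEmpty()) {
                report.put(i + 1, errors);
            }
        }
        return report;
    }

    public static void main(String[] args) {
        List<RawBuilding> rows = Arrays.asList(
                new RawBuilding(null, "testBuilding1", null),
                new RawBuilding("12345", "testBuilding2", null));
        System.out.println(validateAll(rows));
        // prints: {1=[ID is required, Country is required], 2=[Country is required]}
    }
}
```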

    For the case of data missing one semicolon, I see OpenCSV already throws "CsvRequiredFieldEmptyException: Number of data fields does not match number." I agree that this error alone is sufficient to indicate the file is corrupted and that there is no need to run further validation.

    The ideal experience I have in mind: if the file is corrupted, throw CsvRequiredFieldEmptyException; otherwise, if all the columns are present, run all the BeanVerifier logic. Let me know if I am oversimplifying it.
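
The two-phase flow described above could look roughly like this. It is a plain-Java illustration on raw field arrays, not how opencsv works internally; `TwoPhaseCheckSketch`, `HEADER`, and `checkRow` are hypothetical names. A wrong field count is reported alone, because shifted columns would make the per-column checks unreliable; only structurally sound rows get every column checked.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class TwoPhaseCheckSketch {

    // Column names taken from the sample file in this thread.
    static final String[] HEADER = {"ID", "Building Name", "Country"};

    public static List<String> checkRow(String[] fields) {
        // Phase 1: structural check. If the field count is wrong,
        // report only that and skip the per-column checks.
        if (fields.length != HEADER.length) {
            return Collections.singletonList(
                    "Expected " + HEADER.length + " fields but found " + fields.length);
        }
        // Phase 2: the structure is fine, so report every empty column.
        List<String> errors = new ArrayList<>();
        for (int i = 0; i < HEADER.length; i++) {
            if (fields[i] == null || fields[i].trim().isEmpty()) {
                errors.add(HEADER[i] + " is required");
            }
        }
        return errors;
    }

    public static void main(String[] args) {
        System.out.println(checkRow(new String[] {"", "testBuilding1", ""}));
        System.out.println(checkRow(new String[] {"12345", "testBuilding2"}));
    }
}
```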

    But of course I didn't open the ticket to request a new feature. Thanks for helping me understand the library better.

     

    Last edit: Yat Man, Wong 2024-04-20
