Quotes for all values

nt4cats
2008-04-17
2012-09-14
  • nt4cats

    nt4cats - 2008-04-17

    I have to send data to a CSV Parser (not SuperCSV) that wants all non-numeric fields to be surrounded by double-quotes. The specific record I'm writing has all text fields -- therefore all of the fields in this record need to be quoted. Suppose that this record had three fields, this is what the output should look like:

    "Bob","Smith","Boston"
    "Susan","Jones","New York"
    "Terry","Walsh","Brussels"
    "Chris","Fury","Canberra"

    I wrote this:

    class QuotedStringProcessor extends CellProcessorAdaptor implements StringCellProcessor
    {
        public QuotedStringProcessor()
        {
            super();
        }

        public QuotedStringProcessor(CellProcessor next)
        {
            super(next);
        }

        public Object execute(final Object o, final CSVContext csvContext)
        {
            // Wrap the value in literal double quotes, then pass it down the chain.
            String quoted = "\"" + o.toString() + "\"";
            return next.execute(quoted, csvContext);
        }
    }
    

    ... thinking that would wrap all values in quotes.

    After implementing the above, SuperCSV "noticed" that the resulting strings contained quotes and produced something close to the following output:

    """Bob""","""Smith""","""Boston"""
    """Susan""","""Jones""","""New York"""
    """Terry""","""Walsh""","""Brussels"""
    """Chris""","""Fury""","""Canberra"""

    I understand why SuperCSV did that -- but I don't know how to get the result I want.

    Can you point me in the right direction? You don't need to write the code for me, but I'd appreciate a hint that says something like "If you implement a descendant of the Blah class that overrides the whatever() method ....". Thanks!
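
The tripled quotes follow from the usual CSV escaping rule: a field that contains a quote gets wrapped in quotes, and each embedded quote is doubled. The sketch below reproduces that rule in plain Java. It is an assumption about how the writer behaves, not Super CSV's actual source, and the class and method names (`EscapeDemo`, `escape`) are hypothetical.

```java
class EscapeDemo {
    // Conventional CSV rule (assumed here): wrap a field in quotes only when it
    // contains a comma, a quote, or a line break, and double embedded quotes.
    static String escape(String field) {
        boolean special = field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r");
        if (!special) {
            return field;
        }
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    public static void main(String[] args) {
        System.out.println(escape("Bob"));      // prints Bob
        System.out.println(escape("\"Bob\""));  // prints """Bob"""
    }
}
```

Because the cell processor has already added quotes to the value, the writer sees them as data and escapes them, which is why pre-quoting inside a processor cannot produce the desired output.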

     
    • Kasper B. Graversen

      Hi nt4cats

      LOL what an adventure :-) I'm afraid there is no such solution at the moment, but as you say, maybe we should define an interface RawOutput that would ensure data won't be mutated... I'll have a look at it for the next release..

      cheers,
      kasper

       
    • nt4cats

      nt4cats - 2008-04-30

      I wonder if it would make sense to provide a slightly higher-level concept/abstraction explicitly supporting always-quoted fields. This implementation may require a RawOutput interface ... but maybe it is as simple as changing your current logic that decides whether or not to surround output with quotes to recognize the field "type" (or definition) as requiring quotes instead of basing that decision solely on the content of the output.

      I'll crack open the source and see if I can provide something more constructive (if I end up implementing something I'll send it back to you to consider including in a future release).

       
    • nt4cats

      nt4cats - 2008-04-30

      ... to re-state my idea ...

      Today buried somewhere in SuperCSV there is logic that says:

      if(field_data.contains(special_characters))
      {
          // The output data has quotes, commas, or some other special characters. Surround with quotes!
          output_stream.print("\"");
          output_stream.print(field_data);
          output_stream.print("\"");
      }
      else
      {
          // No quotes are needed. Just write the data
          output_stream.print(field_data);
      }

      .......

      Instead this could say:
      if(field_data.contains(special_characters) || field_definition.quotesRequired() )
      {
          // The output data has quotes, commas, or some other special characters -- or this field is
          // defined as always needing quotes. Surround with quotes!
          output_stream.print("\"");
          output_stream.print(field_data);
          output_stream.print("\"");
      }
      else
      {
          // No quotes are needed. Just write the data
          output_stream.print(field_data);
      }

       
    • Kasper B. Graversen

      Hi nt4cats

      Yes, that sure is an idea. How do you see this being configured, i.e. how would you set up the writer and start writing? We have a list of cell processors; do you also want a list of cells-requiring-"? I think that would be a bit verbose. Maybe we can introduce a special cell processor to ensure the "'s... but then there is the problem that the writer operates on the raw data, not the processors... so... any ideas? ;)

      cheers, :)
      k

       
    • nt4cats

      nt4cats - 2008-06-03

      Well, I had to hack something together, so I just checked out the source from subversion and had to make relatively minor mods to two files. This was much more of a "quick and dirty" change than an elegant one, but I think it will work. I'll be actually trying it out later this afternoon or tomorrow.

      I went along the lines of the "list of cells requiring quotes" angle. I added an (optional) array of booleans to the AbstractCsvWriter (and the ICsvWriter interface) for the "requires quotes" flag -- including simple getter and setter methods for the array. If a user doesn't provide an array, the helper methods assume you don't want the "quotes required" behavior -- thus preserving backwards compatibility. I then made a few helper methods for wrapping a field in quotes (after checking to make sure that something else, like the escape() method, didn't already wrap it), checking the array to see if a field requires quotes, etc. I then just changed the two occurrences of:

      outStream.write(escapeString(content[i]));
      --with--
      outStream.write(wrapWithQuotesIfNeeded(escapeString(content[i]),i));

      I'll e-mail you the two files I changed after I do testing.
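
The approach described in this post can be sketched roughly as follows. This is a standalone illustration, not the patch itself: the class `QuotingRowWriter`, the `formatRow` helper, and the body of `escapeString` are assumptions made so the example runs on its own; only the names `escapeString` and `wrapWithQuotesIfNeeded` come from the post.

```java
class QuotingRowWriter {
    // Optional per-column flags; null means no forced quoting (old behaviour).
    private boolean[] quotesRequired;

    void setQuotesRequired(boolean[] flags) {
        this.quotesRequired = flags;
    }

    boolean[] getQuotesRequired() {
        return quotesRequired;
    }

    // Assumed stand-in for the writer's escaping: quote only when needed.
    static String escapeString(String field) {
        boolean special = field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r");
        return special ? "\"" + field.replace("\"", "\"\"") + "\"" : field;
    }

    String wrapWithQuotesIfNeeded(String escaped, int column) {
        boolean forced = quotesRequired != null
                && column < quotesRequired.length
                && quotesRequired[column];
        // A leading quote means escapeString already wrapped the field, because
        // a raw field containing any quote is always wrapped by it.
        if (!forced || escaped.startsWith("\"")) {
            return escaped;
        }
        return "\"" + escaped + "\"";
    }

    String formatRow(String... content) {
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < content.length; i++) {
            if (i > 0) {
                line.append(',');
            }
            line.append(wrapWithQuotesIfNeeded(escapeString(content[i]), i));
        }
        return line.toString();
    }
}
```

With flags `{true, true, true}`, `formatRow("Bob", "Smith", "Boston")` yields `"Bob","Smith","Boston"`; with no flags set, the same call yields `Bob,Smith,Boston`, so existing callers see no change.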

       
    • Kasper B. Graversen

      Hi Nt4cats

      Sounds good! We are about to do a beta release any day now... maybe we can have it in by the time of the real release if your implementation strategy proves not to be too clumsy to use in practice...

      thanks!
      kasper

       
    • Paul taylor

      Paul taylor - 2009-02-24

      I was also expecting this behaviour (although I don't have a particular reason for needing it). I would suggest the most sensible way to enable it would be to add another preference to CsvPreference.quoteAllStringvalues()

       
      • Kasper B. Graversen

        Actually, I think, as suggested in this thread, that doing it on a per-processor basis is what's most useful... who wants to give back to the project? :)

         
    • Brad Ostrand

      Brad Ostrand - 2009-04-20

      We are upgrading a legacy system that quotes all values and would like to see this feature added as well.

      I guess it would be nice to have the feature on a cell by cell basis, but for now globally quoting all the values would be great.
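
A global switch of the kind requested here could look something like the sketch below. It is only an illustration under stated assumptions: `GlobalQuoteWriter` and its `quoteAll` flag are hypothetical names and do not correspond to any existing Super CSV class or preference.

```java
class GlobalQuoteWriter {
    // Hypothetical global preference: when true, every field is quoted.
    private final boolean quoteAll;

    GlobalQuoteWriter(boolean quoteAll) {
        this.quoteAll = quoteAll;
    }

    String formatField(String field) {
        boolean special = field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r");
        if (quoteAll || special) {
            // Embedded quotes are still doubled so the output stays parseable.
            return "\"" + field.replace("\"", "\"\"") + "\"";
        }
        return field;
    }

    String formatRow(String... fields) {
        String[] out = new String[fields.length];
        for (int i = 0; i < fields.length; i++) {
            out[i] = formatField(fields[i]);
        }
        return String.join(",", out);
    }
}
```

Here the quoting decision stays inside the writer, so cell processors never see or add quote characters, and leaving the flag `false` keeps the conventional quote-only-when-needed output.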

       
      • Kasper B. Graversen

        Hi Brad

        Yes, please send me a patch file containing the said constructors for all processors, the added boolean variable and the check in the writer, then I'll integrate it.

        cheers,

         