#280 Permit loading of data delimited by spaces and handle arbitrary fields

David Benn

In a May 2011 team post, Aaron suggested: Add "spaces" to the delimiter options for the "load from file" routine. Right now it supports commas and tabs. The DOS/Fortran "ts" supports spaces so many people have data in that format. The old AAVSO data download tool exported that way too. So I think supporting "one or more consecutive spaces" as a delimiter would be very useful.

To which I replied: That turns out to be easier said than done, except if the file consists of JD and mag values. For example, it's not possible to parse a file of space separated rows in AAVSO download format without ambiguity, because some fields, like comments, have an unknown number of spaces. Comma-separated is also problematic in that regard for AAVSO download format, tab-separated less so. Adding quotes around fields when they are generated would help with that.
I actually started out with the intention of permitting space-separated files in VStar early on. The spec initially called for CSV and TSV simple file format and AAVSO download file format. Of course we've also recently added upload format as a plug-in to handle VPHOT output etc.

Correct me if I'm wrong, but I think TS manages to handle space-separated files because it only reads JD and mag values (e.g. see loadraw subroutine in ts12.f). Presumably only JD and mag matter since the period analysis functionality only uses these (e.g. not also error/uncertainty values).

However, I am keen to make importing of files as easy as possible, so if we are willing to live with some constraints, I'd be very happy to take this further. We could also do this via Observation Source plug-ins, but that is probably not the right approach here.
See also "Loading Observation Data From a File" in VStar Help for more.


  • David Benn

    David Benn - 2011-05-19

    In a follow-up to that post, Doug said: I have enough data in space delimited format that I have to support this! While I don't want to encourage a large number of newly created formats, I would suggest that the remainder of any line be displayed in the read in data table as a "Notes" string so that it could still be viewed.

    To which I replied: Okay. This seems like an important one to support.
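As a rough illustration of Doug's suggestion, here is a minimal Java sketch that splits a line on one or more consecutive spaces, reads the first two fields as JD and magnitude, and keeps the rest of the line as a "Notes" string. The class `SpaceDelimitedLineSketch` and its `Row` holder are hypothetical names invented for this example; this is not VStar's actual loader code, and it assumes JD and magnitude are always the first two fields.

```java
import java.util.Optional;

// Hypothetical sketch, not part of VStar: parse "JD mag [notes...]" lines.
final class SpaceDelimitedLineSketch {

    // Hypothetical holder for one parsed row (not VStar's ValidObservation).
    static final class Row {
        final double jd;
        final double mag;
        final String notes;

        Row(double jd, double mag, String notes) {
            this.jd = jd;
            this.mag = mag;
            this.notes = notes;
        }
    }

    // Returns an empty Optional for blank lines or lines whose first two
    // fields are not numeric (an assumption about how to treat bad rows).
    static Optional<Row> parse(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            return Optional.empty();
        }
        // A limit of 3 means at most two splits, so the third element keeps
        // the remainder of the line, internal spaces included.
        String[] parts = trimmed.split(" +", 3);
        if (parts.length < 2) {
            return Optional.empty();
        }
        try {
            double jd = Double.parseDouble(parts[0]);
            double mag = Double.parseDouble(parts[1]);
            String notes = parts.length == 3 ? parts[2] : "";
            return Optional.of(new Row(jd, mag, notes));
        } catch (NumberFormatException e) {
            return Optional.empty();
        }
    }
}
```

Capping the split at three parts is what sidesteps the ambiguity mentioned earlier in the thread: only the first two separators are treated as field boundaries, so a free-text remainder can contain any number of spaces.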

    I also quite like WinWWZ's data training feature: https://sourceforge.net/tracker/?func=detail&aid=2944587&group_id=263306&atid=1152052

    Last edit: David Benn 2013-05-30
  • David Benn

    David Benn - 2011-07-28

    This probably still needs more testing.

    Also, Doug's suggestion of making additional info at the end of the line into Notes should be looked at.

    • David Benn

      David Benn - 2013-05-30

      Note that with the advent of arbitrary fields in ValidObservation, this is now possible. Adding another internal format that treats everything but Julian Date, Magnitude, [Uncertainty] as a string to be added as a detail would accomplish this.

      So, the rules would be:

      1. If it's AAVSO download format, use that.
      2. If it's simple format, use that.
      3. Otherwise, if JD, mag, and some number of fields are present, treat it as the new anonymous format. If a header comment is present, this could be used to turn field names from "Detail 1..N" to useful names.
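The naming part of rule 3 could be sketched roughly as follows. This is only an illustration: `AnonymousFormatSketch` and `nameDetails` are hypothetical names, the detail values are assumed to be the fields left over after JD and magnitude (and uncertainty, if present) have been removed, and the header names are assumed to have already been extracted from a header comment by some other step. It does not show how AAVSO download format or simple format would be detected.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not part of VStar: name the extra fields of a row.
final class AnonymousFormatSketch {

    // Pairs each extra field value with a name. A name from the header is
    // used where one exists at the same position; otherwise the field falls
    // back to "Detail N" (1-based). Duplicate header names would overwrite
    // earlier entries; a real implementation would need to handle that.
    static Map<String, String> nameDetails(List<String> values,
                                           List<String> headerNames) {
        // LinkedHashMap keeps the fields in the order they appeared.
        Map<String, String> details = new LinkedHashMap<>();
        for (int i = 0; i < values.size(); i++) {
            boolean hasName = i < headerNames.size()
                    && !headerNames.get(i).trim().isEmpty();
            String name = hasName
                    ? headerNames.get(i).trim()
                    : "Detail " + (i + 1);
            details.put(name, values.get(i));
        }
        return details;
    }
}
```

With no header at all, passing an empty list for `headerNames` gives every field a "Detail N" name, which matches the default described in rule 3.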
  • David Benn

    David Benn - 2013-05-30
    • summary: Permit loading of data delimited by spaces --> Permit loading of data delimited by spaces and handle arbitrary fields
