Shouldn't be necessary to ever reload file on 3rd tab
System dynamics program with additional features for economics
Brought to you by:
hpcoder,
profstevekeen
CSV file
Really weird. It looks like an utterly straightforward file, but Ravel treats most axes as ignore, and the Name field isn't populated except for the first column.
Upping this to fatal since it's impossible to get the file into Ravel.
A related file had the same problem: https://archive.researchdata.leeds.ac.uk/1234/#
This is the metafile for the data https://archive.researchdata.leeds.ac.uk/1234/1/README.txt
Using the context menu I was able to specify the start of the data row/column, then identify the columns. But the import failed on row 301, where from the error message the first data column is being treated as an axis.
I'm guessing that this has something to do with UTF-8 encoding. Whatever, we need to be able to handle a file that appears as straightforward as this--maybe by converting to ASCII from UTF-8 on loading, if Ravel can only handle ASCII.
Whoops! On closer inspection, it appears that rows repeat--non-unique values. I'll check with the data collator, who's at the same seminar I'm at right now.
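A quick way to confirm repeated rows before importing is to count how often each combination of axis values occurs. This is only a sketch run outside Ravel: the function name `duplicate_keys`, the column positions and the sample data are all made up for illustration and are not taken from the actual file.

```python
import csv
import io
from collections import Counter


def duplicate_keys(text, key_columns):
    """Return {key tuple: count} for axis-value combinations seen more than once."""
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the header row (assumes exactly one header row)
    counts = Counter(tuple(row[i] for i in key_columns) for row in reader)
    return {key: n for key, n in counts.items() if n > 1}


# Hypothetical sample: columns 0 and 1 are the axes, column 2 is the data.
sample = "country,year,value\nUK,2000,1.0\nUK,2001,2.0\nUK,2000,3.0\n"
print(duplicate_keys(sample, [0, 1]))
```

Any key reported here appears on more than one row, which is the kind of non-unique value described above.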
Ravel has no problems with this file. See attached video for the process...
PS - the presence of N/A in the data columns is probably what caused the auto format guesser to fail. If they had been NaN, instead, it would have been fine.
Confirmed--editing NA to NaN meant no problems apart from the duplicate
rows (about which it seems the author of the file is unaware: he's at this
workshop I'm attending). But this indicates the need for the Missing Value
field to be on the 3rd tab in the import process rather than the 4th.
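The manual edit described above can be scripted as a preprocessing step. The following is a minimal sketch, assuming the missing-value markers are exactly `NA` or `N/A`; the function name `normalise_missing` and the sample data are hypothetical, and this runs outside Ravel rather than being part of it.

```python
import csv
import io

# Assumption: these are the only markers used for missing values in the file.
MISSING_TOKENS = {"NA", "N/A"}


def normalise_missing(text, tokens=MISSING_TOKENS):
    """Return CSV text with every cell that matches a missing marker replaced by NaN."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in csv.reader(io.StringIO(text)):
        writer.writerow(["NaN" if cell.strip() in tokens else cell for cell in row])
    return out.getvalue()


# Hypothetical sample with one missing data value.
sample = "country,year,value\nUK,2000,N/A\nUK,2001,2.5\n"
print(normalise_missing(sample))
```

Note that this replaces matching cells everywhere, including axis columns, so a legitimate label that happens to be `NA` would also be rewritten.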
On Fri, Jan 10, 2025 at 08:08:08AM -0000, Steve Keen wrote:
That is not what that field is for. It's for substituting another value
for missing values (default NaN) - let's say to 0. I'm not sure anyone has
actually found a use for it.
Being able to specify what pattern corresponds to not a value in the
dataset is not particularly helpful. It is only relevant for
autoguessing the format, so by the time you get a dialog box, you're
past that stage. Once the format is defined, Ravel treats all
non-numerical strings as a missing value.
--
Dr Russell Standish Phone 0425 253119 (mobile)
Principal, High Performance Coders hpcoder@hpcoders.com.au
http://www.hpcoders.com.au
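The behaviour described in the message above (any non-numerical string becomes a missing value, and the Missing Value field chooses what is substituted for it) can be illustrated roughly as follows. This is a Python sketch of the idea, not Ravel's code; `parse_cell` and its `missing_value` parameter are names invented here, and treating a literal NaN in the data as missing is an assumption.

```python
import math


def parse_cell(cell, missing_value=math.nan):
    """Convert a cell to a float, substituting missing_value when it is not a number."""
    try:
        value = float(cell)
    except ValueError:
        return missing_value  # any non-numerical string counts as missing
    # Assumption: a literal NaN in the data is also treated as missing.
    return missing_value if math.isnan(value) else value


print(parse_cell("1.5e3"))     # ordinary scientific notation
print(parse_cell("N/A"))       # missing, default substitute is NaN
print(parse_cell("N/A", 0.0))  # missing, substitute 0 instead
```

On this reading the field only affects what a missing cell turns into after parsing; it plays no part in recognising the file layout.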
Ok. But is it possible to add a multi-valued field to the import routine’s
guess for what represents a NaN? The import routine should be as wrinkle
free as possible, and the fact that its guess here resulted in no fields
being recognised initially was not good.
No - because it uses the presence of numerical data to infer the distinction between data and metadata. NaN and Inf are numerical data, as well as the normal scientific notation +/-X.XXXXE+/-XXX where X are digits (0-9). If the data is all numerical, or contains non-standard strings to represent not-a-number, there's not much we can do.
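To make the point above concrete, here is a toy version of such a heuristic in Python. It is purely illustrative and is not how Ravel's guesser is actually implemented; `is_numeric`, `guess_layout` and the sample rows are invented here. The sketch takes the first row containing any number as the start of the data, and the first column that is numeric on every data row as the first data column.

```python
def is_numeric(cell):
    """True for anything float() accepts, which includes NaN and Inf."""
    try:
        float(cell)
        return True
    except ValueError:
        return False


def guess_layout(rows):
    """Return (first data row, first data column); either may be None if not found."""
    data_row = next(
        (r for r, row in enumerate(rows) if any(is_numeric(c) for c in row)), None
    )
    if data_row is None:
        return None, None
    body = rows[data_row:]
    width = min(len(row) for row in body)
    for col in range(width):
        if all(is_numeric(row[col]) for row in body):
            return data_row, col
    return data_row, None


# Hypothetical rows: the same table with NaN and with N/A as the missing marker.
clean = [["country", "sector", "value"], ["UK", "A", "1.5"], ["UK", "B", "NaN"]]
dirty = [["country", "sector", "value"], ["UK", "A", "1.5"], ["UK", "B", "N/A"]]
print(guess_layout(clean))  # the value column is recognised as data
print(guess_layout(dirty))  # N/A stops the value column looking numeric
```

With `NaN` the toy guesser finds the data column; with `N/A` it finds none, which is the failure mode reported in this ticket. It also shows the other limitation mentioned: a numeric axis such as a year column would be indistinguishable from data under this rule.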
I do agree it is annoying that it doesn't load all columns with selector boxes, i.e. it shouldn't be necessary to 'reload' unless it fails to recognise the separator or quote characters. The reload button shouldn't be necessary on the 3rd tab, but even though I've looked at the JavaScript code, I don't really understand why it does that.
Ticket moved from /p/minsky/ravel/665/
Can't be converted: