
#606 Add a virtual unique id to import form

Milestone: Backlog
Status: open
Owner: nobody
Labels: feature
Priority: 6normal
Updated: 2024-07-08
Created: 2024-06-26
Creator: Steve Keen
Private: No

This is one of those files I was going to do a Ravel to Python comparison with. The original post is at:

https://medium.com/@jltlee/exploratory-analysis-of-the-dry-bean-dataset-unveiling-insights-and-patterns-8a6add89acdb

Firstly I couldn't import the data (of course) because there's no unique identifier for each row. So I thought I could use the counter checkbox, but that appeared to aggregate all the data. So I used Excel to add an extra "Datapoint" column and used that as a dimension. Then the data imported, but only for the very first entry in Class: the rest were blank.

We do need a way to add a unique identifier, which I thought the counter would do.

Anyway, when I did import, only the very first class had data.
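
For reference, the Excel workaround described above (adding a "Datapoint" column that numbers each row) can also be scripted. The following is only a minimal Python sketch using the standard library: the function name `add_row_ids` is hypothetical, and the sample column names and values are made up for illustration rather than taken from the actual dataset.

```python
import csv
import io


def add_row_ids(src, dst, id_column="Datapoint"):
    """Copy CSV rows from src to dst, prepending a 1-based row counter column.

    `add_row_ids` and the default column name are illustrative assumptions;
    this is not part of Ravel.
    """
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    header = next(reader)                      # first row holds the column names
    writer.writerow([id_column] + header)
    for row_id, row in enumerate(reader, start=1):
        writer.writerow([row_id] + row)        # every record gets a unique id


# Made-up sample standing in for the real data file.
sample = io.StringIO(
    "Area,Perimeter,Class\n"
    "100,40,SEKER\n"
    "120,44,SEKER\n"
    "90,38,BOMBAY\n"
)
out = io.StringIO()
add_row_ids(sample, out)
result = out.getvalue()
```

With real files, `src` and `dst` would be file objects opened with `newline=""`, as the `csv` module documentation recommends.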

3 Attachments

Discussion

  • High Performance Coder

    No - that is not the point of counter. And unique IDs per row are not useful to Ravel - those columns should be ignored.

     
  • High Performance Coder

    This sort of dataset is a typical machine learning dataset.
    There is only one axis here - class - the remaining columns consist of a bunch of variables (ML 'features').

    The analysis you can do? Average, standard deviation, max, min of the features are the obvious ones. Maybe one could do correlation analysis between different features.

    This example doesn't really exhibit "Ravel superiority", as you can do any of the conceivable analyses in a spreadsheet for about as much effort.
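
As a rough illustration of the per-class summaries listed above (average, standard deviation, max, min), here is a minimal Python sketch using only the standard library. The helper name `summarise` and the sample rows are hypothetical, and the sample standard deviation is an assumed choice; nothing here reflects how Ravel itself computes these statistics.

```python
from collections import defaultdict
from statistics import mean, stdev

# Made-up rows: one axis ("Class") plus a single numeric feature.
rows = [
    {"Class": "SEKER", "Area": 100.0},
    {"Class": "SEKER", "Area": 120.0},
    {"Class": "BOMBAY", "Area": 90.0},
]


def summarise(rows, feature, axis="Class"):
    """Group rows by the axis label and summarise one feature per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[axis]].append(row[feature])
    return {
        label: {
            "mean": mean(values),
            # Sample standard deviation needs at least two values.
            "stdev": stdev(values) if len(values) > 1 else 0.0,
            "min": min(values),
            "max": max(values),
        }
        for label, values in groups.items()
    }


summary = summarise(rows, "Area")
```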

     
  • High Performance Coder

    Actually, there is a use case for adding unique record ids. See below, where I'm attempting to compute the correlation between features and classes. The statistics are summed over the records within each class and feature.

    But for now, it is easy enough to add an extra column like you did, so I'll mark this as a worthwhile feature request.
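
To show why a record id matters for this kind of statistic, here is a minimal Python sketch that computes a Pearson correlation between two features within each class. The sums run over individual records, so each record has to stay distinguishable instead of being aggregated into one value per class. The data, the column names and the `pearson` helper are illustrative assumptions, and this is not a description of Ravel's own correlation operation.

```python
from collections import defaultdict
from math import sqrt

# Made-up records: (Datapoint id, Class, Area, Perimeter).
records = [
    (1, "SEKER", 100.0, 40.0),
    (2, "SEKER", 120.0, 44.0),
    (3, "SEKER", 110.0, 42.0),
    (4, "BOMBAY", 90.0, 39.0),
    (5, "BOMBAY", 95.0, 38.0),
    (6, "BOMBAY", 100.0, 37.0),
]


def pearson(xs, ys):
    """Pearson correlation coefficient from sums over paired observations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Keep every record separate within its class; the id is what stops
# rows with the same class label from collapsing into one entry.
by_class = defaultdict(list)
for _datapoint, label, area, perimeter in records:
    by_class[label].append((area, perimeter))

correlations = {
    label: pearson(*zip(*pairs)) for label, pairs in by_class.items()
}
```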

     
  • High Performance Coder

    • labels: --> feature
    • Priority: 2Critical --> 6normal
    • Milestone: Pascal --> Backlog
     
  • High Performance Coder

    • summary: Missing Data on Import --> Add a virtual unique id to import form
     
