#627 Crash on Loading Horizontal Data

Milestone: Babbage

Status: closed

Owner: High Performance Coder

Labels: None

Priority: 1Fatal

Updated: 2024-08-19

Created: 2024-08-18

Creator: Steve Keen

Private: No

The easy fix for this was to transpose the data using Excel, but it's still mysterious why this file crashed Ravel.

The MP4 files are too large to attach, so I'll send them as Google Drive links when I can reply from within Gmail.

1 Attachment

CrashOnFileImport20240818BB.rvl

Discussion

Steve Keen - 2024-08-18

Data files.

EnergySince1900.csv

PrimEn_Area.txt

Steve Keen - 2024-08-18

Movies of crashes

CrashOnFileImport20240818A.mp4
https://drive.google.com/file/d/1WcJMo_R4ntjVn7CKRwpJrP117NPaY1Hx/view?usp=drive_web

CrashOnFileImport20240818B.mp4
https://drive.google.com/file/d/1HBVbIryvIZzwt0Sc7NOy6XpS5Newlr6c/view?usp=drive_web

alternate
  
High Performance Coder - 2024-08-18

Obviously crashing is not good. This file is not a CSV file; it is of the "merged delimiters" type, where the delimiter is a space. There is an option within Ravel to handle that kind of file. Even worse, however, it is UTF-16 encoded, and Ravel cannot load such files at present. You have to open it in Excel or another spreadsheet program that can handle UTF-16 and save it as UTF-8. In time, we might add UTF-16 handling (a big job, though), but for now we need to detect the format and display an error message.

In time, people should stop using UTF-16 too. It is an abomination, particularly for Latin-based alphabets!
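
For anyone without a spreadsheet program to hand, the conversion described above can also be scripted. The following is only a rough Python sketch, not anything Ravel does internally: the function names `utf16_to_utf8` and `split_merged` are made up for illustration, and the sample row is invented rather than taken from the attached files. It re-encodes UTF-16 bytes as UTF-8 and then splits a line the way a "merged delimiters" reader would, treating a run of spaces as one separator.

```python
import re


def utf16_to_utf8(data: bytes) -> bytes:
    """Re-encode UTF-16 bytes as UTF-8.

    Python's "utf-16" codec uses a leading BOM to choose the byte order
    and drops the BOM from the decoded text.
    """
    return data.decode("utf-16").encode("utf-8")


def split_merged(line: str, delimiter: str = " ") -> list[str]:
    """Split a line, treating any run of delimiters as one separator.

    Assumes a single-character delimiter (hypothetical helper).
    """
    stripped = line.strip(delimiter + "\r\n")
    return re.split("(?:" + re.escape(delimiter) + ")+", stripped)


# Invented sample row standing in for the contents of a UTF-16 file.
raw = "Area  1900   1901\n".encode("utf-16")

text = utf16_to_utf8(raw).decode("utf-8")
fields = split_merged(text)
print(fields)
```

To convert a real file, read it in binary mode, pass the bytes to `utf16_to_utf8`, and write the result to a new file, which can then be imported with the merged delimiters option.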

High Performance Coder - 2024-08-19

status: open --> closed

assigned_to: High Performance Coder

High Performance Coder - 2024-08-19

Found a solution that prevents crashes or infinite loops on bad input data (trying to parse a UTF-16 file as UTF-8 is effectively bad data). I also added UTF-16/32 BOM detection, which throws an error for now. I will add UTF-16/32 support as a separate feature ticket for consideration later.
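
To illustrate the kind of BOM check described above, here is a minimal Python sketch. It is not the code that went into Ravel; `check_bom` and `UnsupportedEncodingError` are hypothetical names. It inspects the first bytes of a file and raises an error if they match a UTF-16 or UTF-32 byte order mark. The UTF-32 marks are tested first because the UTF-32 LE mark begins with the same two bytes as the UTF-16 LE mark.

```python
import codecs

# Longest marks first: BOM_UTF32_LE starts with the bytes of BOM_UTF16_LE.
_BOMS = [
    (codecs.BOM_UTF32_LE, "UTF-32 LE"),
    (codecs.BOM_UTF32_BE, "UTF-32 BE"),
    (codecs.BOM_UTF16_LE, "UTF-16 LE"),
    (codecs.BOM_UTF16_BE, "UTF-16 BE"),
]


class UnsupportedEncodingError(ValueError):
    """Raised when the input starts with a UTF-16/32 byte order mark."""


def check_bom(head: bytes) -> None:
    """Raise if the leading bytes of a file are a UTF-16/32 BOM.

    `head` is the first few bytes of the file (four are enough).
    Returns None when no such mark is found, e.g. for plain UTF-8.
    """
    for bom, name in _BOMS:
        if head.startswith(bom):
            raise UnsupportedEncodingError(
                f"{name} input is not supported; convert the file to UTF-8"
            )


# Usage: a file beginning FF FE is reported instead of being parsed.
try:
    check_bom(b"\xff\xfeA\x00r\x00")
except UnsupportedEncodingError as err:
    message = str(err)
    print(message)
```

A check like this only catches files that carry a BOM; UTF-16 data without one would still need the more general bad-input handling mentioned above.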
