Schema Detection
Local-first ETL/ELT studio: a drag-and-drop visual pipeline designer
Brought to you by:
slothflowlabs
Originally created by: KNP-BI
Originally owned by: SouravRoy-ETL
Firstly, super excited for this project. Love what I see so far. 👏
Just a quick test on a CSV file that has a Date column in dd/mm/yy format.
This is commonly misinterpreted with auto-detect and ends up assuming yyyy-mm-dd.
To get around this, I typically set to string and transform later.
When I set this column to string in the schema section, save, and run, the date column still comes out as yyyy-mm-dd.
Am I doing something wrong in the UI?
What is the correct way to force string?
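For reference, the "set to string and transform later" workaround described above can be sketched in a few lines of Python. This is only an illustration of the later transform step, not anything the tool generates; the helper name parse_ddmmyy is made up for the example.

```python
from datetime import date, datetime


def parse_ddmmyy(text: str) -> date:
    """Parse a dd/mm/yy string that was loaded as plain text.

    Hypothetical helper: the explicit format string removes the
    day/month ambiguity that auto-detection can get wrong.
    """
    return datetime.strptime(text.strip(), "%d/%m/%y").date()


if __name__ == "__main__":
    # 03/04/21 is 3 April 2021 here, not 4 March.
    print(parse_ddmmyy("03/04/21").isoformat())
```

The point of keeping the column as a string first is that the format is stated once, explicitly, instead of being guessed per file.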
Originally posted by: SouravRoy-ETL
Fixed in 385b618. Good catch.
You weren't doing anything wrong - the Schema panel actually wasn't being honored at execution time. The user-declared schema was stored on the node, but build_csv_source was emitting plain read_csv_auto(...), which ignores the saved schema and re-detects every column. With a dd/mm/yy column that means DuckDB picks the wrong format and you get the yyyy-mm-dd output you saw.
The fix threads node.data.schema through to the SQL builder. When you set a column to string (or any other type) in the Schema panel, the generated SQL now passes the declared types in a columns = {...} argument.
DuckDB's columns = {...} argument explicitly skips auto-inference for the listed columns, so VARCHAR stays VARCHAR and the rest of the row keeps its declared type. Works for src.csv and src.tsv today.
Released in https://github.com/SouravRoy-ETL/duckle/releases/tag/v0.1.0-hotfix
Thanks for the clear repro.
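To make the before/after concrete, here is a rough Python sketch of the idea behind the fix. It is not the project's actual code: the function signature, the sql_quote helper, and the plain dict standing in for node.data.schema are all assumptions for illustration. It also assumes the schema lists every column in the file, which is how DuckDB's columns argument is normally used.

```python
from typing import Dict, Optional


def sql_quote(value: str) -> str:
    """Quote a value as a SQL string literal, doubling single quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_csv_source(path: str, schema: Optional[Dict[str, str]] = None) -> str:
    """Build the SELECT for a CSV source (hypothetical signature).

    Without a schema, fall back to auto-detection (the old behaviour).
    With a schema, pass the declared types via columns = {...} so they
    are not re-detected.
    """
    if not schema:
        return f"SELECT * FROM read_csv_auto({sql_quote(path)})"
    columns = ", ".join(
        f"{sql_quote(name)}: {sql_quote(dtype)}" for name, dtype in schema.items()
    )
    return f"SELECT * FROM read_csv({sql_quote(path)}, columns = {{{columns}}})"


if __name__ == "__main__":
    print(build_csv_source("data.csv"))
    print(build_csv_source("data.csv", {"date": "VARCHAR", "amount": "DOUBLE"}))
```

With the date column declared as VARCHAR, the dd/mm/yy text reaches the pipeline untouched and can be parsed downstream with an explicit format.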
Originally posted by: SouravRoy-ETL
@all-contributors please add @KNP-BI for infrastructure, tests and code
Originally posted by: allcontributors[bot]
@SouravRoy-ETL
I've put up a pull request to add @KNP-BI! 🎉