Name | Modified | Size | Downloads / Week |
---|---|---|---|
README.md | 2024-12-09 | 9.5 kB | |
v0.15_ Experimental new CSV-, and Geographic integrations and many other fixes source code.tar.gz | 2024-12-09 | 76.8 MB | |
v0.15_ Experimental new CSV-, and Geographic integrations and many other fixes source code.zip | 2024-12-09 | 78.1 MB | |
Totals: 3 Items | 154.9 MB | 0 |
This release contains several new features, many fixes, and two exciting new experimental integrations:
- Experimental new CSV parser based on Deephaven-CSV. See below for more information.
- Experimental new GeoDataFrame class for working with geographical data (from GeoJson/Shapefile) and plotting it with Kandy. See below for more information.
- Full BigInteger support: Just like we support BigDecimal numbers, DataFrame now also supports BigInteger in parsing, converting, statistics, column arithmetics, etc.
- Custom SQL DataBase registration
- Improved parsing: Parsing and converting String columns to other types is now faster. We also introduce the new experimental ParserOptions.useFastDoubleParser setting which uses FastDoubleParser for faster and more flexible Double parsing.
- We continue improving our Compiler Plugin with every release. See below for more information.
- See this notebook for some more information about the changes.
New Experimental CSV integration
DataFrame's CSV parsing has been based on Apache Commons CSV from the beginning. While this has been sufficient for most applications, it had some issues: it could run out of memory, its performance was limited, and our API lacked clarity, documentation, and completeness.
For DataFrame 0.15, we introduce a new separate package org.jetbrains.kotlinx:dataframe-csv which tries to solve all these issues at once. It's based on Deephaven-CSV, which makes it faster and more memory efficient. And since we built it from the ground up, we made sure the API was complete, predictable, and documented carefully.
To try it yourself, explicitly add the dependency org.jetbrains.kotlinx:dataframe-csv to your project. In notebooks you can add enableExperimentalCsv=true to the %use-magic, like %use dataframe(enableExperimentalCsv=true).
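In a Gradle project that uses the Kotlin DSL, the dependency could be declared roughly as follows. This is only a sketch: the version 0.15.0 is an assumption taken from this release's tag, and the module is presumably used alongside the main DataFrame dependency your project already has.

```kotlin
// build.gradle.kts (sketch; the version is assumed from this release's tag)
dependencies {
    // New experimental CSV module, added explicitly next to your existing DataFrame dependency
    implementation("org.jetbrains.kotlinx:dataframe-csv:0.15.0")
}
```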
Use the new DataFrame.readCsv()/DataFrame.readTsv()/DataFrame.readDelim() functions over the old DataFrame.readCSV() ones.
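A minimal sketch of calling the new function might look like the following. The file name data.csv is hypothetical, the import path org.jetbrains.kotlinx.dataframe.io is an assumption based on where the existing IO functions live, and because the API is experimental the compiler may ask for an explicit opt-in.

```kotlin
import org.jetbrains.kotlinx.dataframe.DataFrame
import org.jetbrains.kotlinx.dataframe.io.readCsv // assumed package of the new function

fun main() {
    // "data.csv" is a hypothetical file in the working directory.
    // Note the lowercase "sv": readCsv is the new function, readCSV the old one.
    val df = DataFrame.readCsv("data.csv")

    // Print the parsed frame to check the inferred column types.
    println(df)
}
```

Switching an existing call site is then mostly a matter of changing readCSV to readCsv and checking that any named arguments still match the new signature.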
We happily await your feedback!
New Experimental Geo integration
Kandy v0.8 introduces geo-plotting which allows you to visualize geospatial/geographical data using the awesome Kandy DSL. To make working with this geographical data (from GeoJson/Shapefile) easier, we happily accepted the GeoDataFrame PR from the Kandy team.
To try it yourself, explicitly add the dependency org.jetbrains.kotlinx:dataframe-geo to your project or notebook (with the repository maven("https://repo.osgeo.org/repository/release")) and use GeoDataFrame.readGeoJson() or GeoDataFrame.readShapeFile() to get started!
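For a Gradle project with the Kotlin DSL, the setup described above could be sketched like this. The version 0.15.0 is an assumption taken from this release's tag, and mavenCentral() is assumed to be the repository your other dependencies come from.

```kotlin
// build.gradle.kts (sketch; the version is assumed from this release's tag)
repositories {
    mavenCentral()
    // Extra repository named in these release notes for the geo module
    maven("https://repo.osgeo.org/repository/release")
}

dependencies {
    // New experimental geo module
    implementation("org.jetbrains.kotlinx:dataframe-geo:0.15.0")
}
```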
Features
- New CSV implementation by @Jolanrensen in https://github.com/Kotlin/dataframe/pull/903
- GeoDataFrame init by @AndreiKingsley in https://github.com/Kotlin/dataframe/pull/909
- Change default flatten parent-child separator to "_" by @Jolanrensen in https://github.com/Kotlin/dataframe/pull/920
- Split OpenAPI in module needed for user projects and module needed for code-generation by @koperagen in https://github.com/Kotlin/dataframe/pull/916
- Support read unstructured excel file by @khm0651 in https://github.com/Kotlin/dataframe/pull/901
- Fast double parser by @Jolanrensen in https://github.com/Kotlin/dataframe/pull/935
- Implemented custom SQL DB registration by @zaleslaw in https://github.com/Kotlin/dataframe/pull/917
- Render FormattedFrame stored inside columns as HTML by @koperagen in https://github.com/Kotlin/dataframe/pull/944
- Adding some missing converters by @Jolanrensen in https://github.com/Kotlin/dataframe/pull/958
- Full BigInteger support by @Jolanrensen in https://github.com/Kotlin/dataframe/pull/972
Compiler Plugin
- [Compiler plugin] Lower frontend generated implicit receivers by @koperagen in https://github.com/Kotlin/dataframe/pull/869
- Generate valid code in transform(call) when interpret(call) fails by @koperagen in https://github.com/Kotlin/dataframe/pull/907
- [Compiler plugin] Support dataFrameOf(Pair<String, List<T>) by @koperagen in https://github.com/Kotlin/dataframe/pull/908
- [Compiler plugin] Add a mechanism to handle function calls to stdlib that can appear as df api arguments by @koperagen in https://github.com/Kotlin/dataframe/pull/914
- [Compiler plugin] Generate ColumnName annotations on frontend for all names that contain illegal characters by @koperagen in https://github.com/Kotlin/dataframe/pull/913
- Revert insertGenericTreeImpl by @koperagen in https://github.com/Kotlin/dataframe/pull/923
- [Compiler plugin] Propagate nullability in toDataFrame tree conversion by @koperagen in https://github.com/Kotlin/dataframe/pull/942
- Add castTo(Function) overload for workflows that use compiler plugin by @koperagen in https://github.com/Kotlin/dataframe/pull/948
- [Compiler plugin] Setup call transformer pipeline to handle (...) -> DataRow functions by @koperagen in https://github.com/Kotlin/dataframe/pull/918
- Compiler plugin read improvements by @koperagen in https://github.com/Kotlin/dataframe/pull/949
- [Compiler plugin] Support valueCounts by @koperagen in https://github.com/Kotlin/dataframe/pull/951
Fixes
- Adding contracts for Anycol.isValueColumn etc. for smart-casting by @Jolanrensen in https://github.com/Kotlin/dataframe/pull/882
- Fix publish indexes docs by @Jolanrensen in https://github.com/Kotlin/dataframe/pull/885
- Update algolia index builder by @koperagen in https://github.com/Kotlin/dataframe/pull/895
- Find KSP Configurations that are Added Later by @mgroth0 in https://github.com/Kotlin/dataframe/pull/881
- Partially inline AnyFrame typealias in return type position by @koperagen in https://github.com/Kotlin/dataframe/pull/888
- Deprecating DataFrame.read("", delimiter =) by @Jolanrensen in https://github.com/Kotlin/dataframe/pull/902
- Parsing improvements by @Jolanrensen in https://github.com/Kotlin/dataframe/pull/874
- Fixed local classes being inferred as Any by changing visibility check by @Jolanrensen in https://github.com/Kotlin/dataframe/pull/929
- Open File in readExcel in read-only mode by @koperagen in https://github.com/Kotlin/dataframe/pull/931
- Adds the binary compatibility validator plugin by @Jolanrensen in https://github.com/Kotlin/dataframe/pull/938
- Fixes nulls in framecols and improves column creation situation by @Jolanrensen in https://github.com/Kotlin/dataframe/pull/925
- Specify slf4j-api instead of slf4j-simple. by @erikogenvik in https://github.com/Kotlin/dataframe/pull/934
- [Important Fix!] Parse started to remove unselected columns by @Jolanrensen in https://github.com/Kotlin/dataframe/pull/947
- Fixed error message for ColumnAccessor by @zaleslaw in https://github.com/Kotlin/dataframe/pull/953
- fix crs by @AndreiKingsley in https://github.com/Kotlin/dataframe/pull/955
- Added inferNullability test for other databases by @zaleslaw in https://github.com/Kotlin/dataframe/pull/954
- Fix: disabled FastDoubleParser debug logs overload in the tests by @Jolanrensen in https://github.com/Kotlin/dataframe/pull/956
- describe() fixes by @Jolanrensen in https://github.com/Kotlin/dataframe/pull/937
- Fix IO closing & add new useful extensions by @AndreiKingsley in https://github.com/Kotlin/dataframe/pull/960
- Remove dependency on fuel by @koperagen in https://github.com/Kotlin/dataframe/pull/969
- [fix] Parser should be skipped after it fails to parse value by @koperagen in https://github.com/Kotlin/dataframe/pull/975
- small fix for file clash at impl/io/readDelim.kt by @Jolanrensen in https://github.com/Kotlin/dataframe/pull/976
- Fixed csv dependencies by @Jolanrensen in https://github.com/Kotlin/dataframe/pull/977
- Bumped deprecations of startsWith and endsWith in CS DSL to Error by @Jolanrensen in https://github.com/Kotlin/dataframe/pull/978
- Version bumps for 0.15 by @Jolanrensen in https://github.com/Kotlin/dataframe/pull/980
Docs and Examples
- READMEs by @Jolanrensen in https://github.com/Kotlin/dataframe/pull/900
- Make it clear that dataframes are immutable by @dmcg in https://github.com/Kotlin/dataframe/pull/924
- Add examples for rename by @koperagen in https://github.com/Kotlin/dataframe/pull/952
New Contributors
- @khm0651 made their first contribution in https://github.com/Kotlin/dataframe/pull/901
- @erikogenvik made their first contribution in https://github.com/Kotlin/dataframe/pull/934
- @dmcg made their first contribution in https://github.com/Kotlin/dataframe/pull/924
- @AndreiKingsley made their first contribution in https://github.com/Kotlin/dataframe/pull/909
Full Changelog: https://github.com/Kotlin/dataframe/compare/v0.14.2...v0.15.0