A curated list of data mining papers about fraud detection
Multi-purpose serial data visualization & processing
Upserts, Deletes And Incremental Processing on Big Data
Data and tools for generating and inspecting OLMo pre-training data
DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models
Data Science Guide With Videos And Materials
OpenGL Mathematics (GLM)
A simple interface for working with TeX documents
Non-Blocking Reactive Foundation for the JVM
Official HDF5® Library Repository
Concurrent and multi-stage data ingestion and data processing
Production-ready data processing made easy and shareable
The lxml XML toolkit for Python
GridDB is a next-generation open source database
Unified programming model for Batch and Streaming
A multi-cloud framework for big data analytics
A unified analytics engine for large-scale data processing
Miller is like awk, sed, cut, join, and sort for name-indexed data
A distributed and extensible workflow scheduler platform
HTML Loader
A modern library for 3D data processing
Flink CDC is a streaming data integration tool
Blazing-fast Data-Wrangling toolkit
Fast and Lightweight Logs and Metrics processor for Linux, BSD, and OSX
A standalone, large scale, open project for 2D/3D image processing