Machine learning in Python
Training data (data labeling, annotation, workflow) for all data types
Label Studio is a multi-type data labeling and annotation tool
A free, open-source, and cross-platform big data analytics framework
The open-source tool for building high-quality datasets
A reactive notebook for Python
AutoGluon: AutoML for Image, Text, and Tabular Data
High-level, high-performance dynamic language for technical computing
Data science on data without acquiring a copy
Create HTML profiling reports from pandas DataFrame objects
Analyzing, storing and visualizing big data, scientifically
Uncover insights, surface problems, monitor, and fine-tune your LLM
Detecting silent model failure. NannyML estimates performance
Python Stream Processing
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine
C++ DataFrame for statistical, financial, and ML analysis
A framework for real-life data science
Making Enterprise Data Intelligent and Responsive for AI
A system for quickly generating training data with weak supervision
Train machine learning models within Docker containers
Toolkit for making machine learning and data analysis applications
A self-hostable CDN for databases
Test Suites for validating ML models & data
A curated list of data mining papers about fraud detection