Unsplash images made available for research and machine learning
A powerful tool for creating datasets for LLM fine-tuning
This dataset code generates mathematical question and answer pairs
Photorealistic Synthetic Dataset for Holistic Indoor Scene
JSON to DataSet and DataSet to JSON converter for Delphi and Lazarus
Passport Index 2023: visa requirements for 199 countries, in .csv
The ExDARK dataset is the largest collection of low-light images
The first large-scale public benchmark dataset for image harmonization
GeoIP lookup over DAG-CBOR dataset loaded from IPFS
Framework to easily create LLM powered bots over any dataset
Tooling for the Common Objects In 3D dataset
Fluid, elastic data abstraction and acceleration for BigData/AI apps
Julia implementation of Parquet columnar file format reader
Dataset Management Framework, a Python library and a CLI tool to build
Unified open dataset enabling cross-embodiment learning for robotics
Data and tools for generating and inspecting OLMo pre-training data
Automatically find issues in image datasets
Hub of ready-to-use datasets for ML models
An in-memory database that persists on disk
A dataset consisting of 15,140 ChatGPT prompts from Reddit
The Abstraction and Reasoning Corpus
An open source implementation of CLIP
Import public NYC taxi and for-hire vehicle (Uber, Lyft)
A tool for semi-automatic cell type classification, harmonization
A list of online news & info sources in the AI/ML/Data Science space