A powerful tool for creating datasets for LLM fine-tuning
This dataset code generates mathematical question and answer pairs
Photorealistic Synthetic Dataset for Holistic Indoor Scene
Unsplash images made available for research and machine learning
JSON to DataSet and DataSet to JSON converter for Delphi and Lazarus
Passport Index 2023: visa requirements for 199 countries, in .csv
The ExDARK dataset is the largest collection of low-light images
The first large-scale public benchmark dataset for image harmonization
GeoIP lookup over DAG-CBOR dataset loaded from IPFS
Framework to easily create LLM powered bots over any dataset
Hub of ready-to-use datasets for ML models
Fluid, elastic data abstraction and acceleration for BigData/AI apps
Julia implementation of Parquet columnar file format reader
Dataset Management Framework, a Python library and a CLI tool to build
Data and tools for generating and inspecting OLMo pre-training data
Tooling for the Common Objects In 3D dataset
A list of online news & info sources in the AI/ML/Data Science space
A dataset consisting of 15,140 ChatGPT prompts from Reddit
Unified open dataset enabling cross-embodiment learning for robotics
An in-memory database that persists on disk
A tool for semi-automatic cell type classification, harmonization
An open source implementation of CLIP
Automatically find issues in image datasets
Image polygonal annotation with Python
Easily turn large sets of image URLs into an image dataset