Connect processes into powerful data pipelines with a simple Git-like filesystem interface. DataKit is a tool to orchestrate applications using a Git-like dataflow. It revisits the UNIX pipeline concept with a modern twist: streams of tree-structured data instead of raw text. DataKit lets you define complex build pipelines over version-controlled data.

DataKit is currently used as the coordination layer for HyperKit, the hypervisor component of Docker for Mac and Windows, and for the DataKitCI continuous integration system.

The src directory contains the main DataKit service, a Git-like database to which other services can connect. The ci directory contains DataKitCI, a continuous integration system that uses DataKit to monitor repositories and store build results.

The easiest way to use DataKit is to start both the server and the client in containers.
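
The idea of a Git-like dataflow over tree-structured data can be illustrated with a small sketch. The Python below is a toy in-memory model, not DataKit's actual interface: the names Store, Commit, and build_stage are hypothetical and exist only for this illustration. It shows one pipeline stage reading a versioned tree from one branch and committing its result to another, which is the pattern a CI system could use to store build results.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# Hypothetical toy model; none of these names come from DataKit itself.
Path = Tuple[str, ...]  # a path in the tree, e.g. ("src", "main.txt")


@dataclass
class Commit:
    tree: Dict[Path, str]        # full snapshot: path -> file contents
    parent: Optional["Commit"]   # previous commit on the branch, if any
    message: str


@dataclass
class Store:
    branches: Dict[str, Commit] = field(default_factory=dict)

    def commit(self, branch: str, changes: Dict[Path, str], message: str) -> Commit:
        """Apply `changes` on top of the branch head and record a new commit."""
        head = self.branches.get(branch)
        tree = dict(head.tree) if head else {}
        tree.update(changes)
        new = Commit(tree=tree, parent=head, message=message)
        self.branches[branch] = new
        return new

    def read(self, branch: str, path: Path) -> Optional[str]:
        """Return the contents at `path` on the branch head, or None."""
        head = self.branches.get(branch)
        return head.tree.get(path) if head else None

    def log(self, branch: str) -> List[str]:
        """Commit messages on the branch, newest first."""
        messages = []
        current = self.branches.get(branch)
        while current is not None:
            messages.append(current.message)
            current = current.parent
        return messages


def build_stage(store: Store, src: str, dst: str) -> Optional[Commit]:
    """One pipeline stage: read input from `src`, commit output to `dst`."""
    source = store.read(src, ("src", "main.txt"))
    if source is None:
        return None  # nothing to build yet
    result = source.upper()  # stand-in for a real build step
    return store.commit(dst, {("out", "main.txt"): result}, f"build of {src}")


store = Store()
store.commit("master", {("src", "main.txt"): "hello"}, "add source")
build_stage(store, "master", "results")
print(store.read("results", ("out", "main.txt")))  # prints "HELLO"
```

Because every stage reads and writes whole versioned trees rather than raw text, each intermediate result keeps its own history, and a later stage can be chained onto the "results" branch in the same way.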

Features

  • The easiest way to build the DataKit project is to use Docker
  • Git-like database to which other services can connect
  • DataKit is currently used as the coordination layer for HyperKit
  • Continuous integration system
  • DataKit allows you to define complex build pipelines over version-controlled data
  • A tool to orchestrate applications

License

Apache License V2.0

Additional Project Details

Operating Systems

Mac, Windows

Programming Language

OCaml (Objective Caml)

Related Categories

OCaml (Objective Caml) Database Software, OCaml (Objective Caml) CI CD, OCaml (Objective Caml) Data Pipeline Tool

Registered

2022-07-27