Cascalog is a powerful Clojure (and Java) data processing and querying library built atop Hadoop (via Cascading), providing a high-level, Datalog-inspired abstraction for both big data processing and local computation. Cascalog is hosted at Clojars, and some of its dependencies are hosted at Conjars. Both Clojars and Conjars are Maven repositories that are easy to use with Maven or Leiningen. The Cascalog website contains more information and links to various articles and tutorials. The best way to get started with Cascalog is to experiment with the toy datasets that ship with the project.
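
As a rough illustration of experimenting with the toy datasets, the sketch below shows what a first REPL session might look like. It assumes a Leiningen project that already depends on Cascalog, and it assumes the `cascalog.playground` namespace, its `bootstrap` macro, and an in-memory `age` dataset, as recalled from the project's introductory tutorial; these names may differ between Cascalog versions.

```clojure
;; Run inside `lein repl` in a project that depends on Cascalog.
;; Assumption: cascalog.playground provides `bootstrap` (which pulls in
;; cascalog.api) and an in-memory `age` dataset of [name age] tuples.
(use 'cascalog.playground)
(bootstrap)

;; Print the name of every person whose age is exactly 25.
;; `?<-` defines and runs a query; (stdout) is the output sink.
(?<- (stdout) [?person] (age ?person 25))

;; Bind the age to a variable and filter on it instead.
(?<- (stdout) [?person ?a] (age ?person ?a) (< ?a 30))
```

Because the playground data is served from memory, queries like these run locally without a Hadoop cluster.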

Features

  • Expressive, Datalog-like query language that runs on Hadoop or locally
  • Simplified abstraction over Cascading to avoid low-level Hadoop complexity
  • Seamless handling of distributed Big Data workflows
  • Pure Java API (JCascalog) available for Java integration and experimentation
  • Useful for prototyping data flows that scale from local tests to production clusters
  • Draws inspiration from existing tools like Pig, Hive, and Cascading while providing richer abstraction

Categories

Data Management

License

MIT License

Additional Project Details

Operating Systems

Linux, Mac, Windows

Programming Language

Java

Related Categories

Java Data Management System

Registered

2025-08-20