Cascalog is a powerful Clojure (and Java) data processing and querying library built atop Hadoop (via Cascading), providing a high-level, Datalog-inspired abstraction for both big data processing and local computation. Cascalog is hosted at Clojars, and some of its dependencies are hosted at Conjars. Both Clojars and Conjars are Maven repositories that are easy to use with Maven or Leiningen. The Cascalog website contains more information and links to various articles and tutorials. The best way to get started with Cascalog is to experiment with the toy datasets that ship with the project.
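As a rough sketch of what experimenting with the toy datasets can look like, the snippet below runs one query in local mode. It assumes a REPL started from a checkout of the Cascalog project (for example with `lein repl`), so that the `cascalog.playground` namespace and its sample `age` dataset are on the classpath. The namespace and dataset names may differ between Cascalog versions.

```clojure
;; Assumption: the REPL was started inside the Cascalog project, so the
;; playground namespace and its in-memory sample datasets are available.
(use 'cascalog.playground)

;; Assumption: bootstrap pulls the Cascalog query API into the current namespace.
(bootstrap)

;; Find every person whose age is 25 and print the results to standard output.
;; ?<- defines and executes a query; [?person] lists the output variables;
;; (age ?person 25) matches rows of the age dataset whose second field is 25.
(?<- (stdout)
     [?person]
     (age ?person 25))
```

The query is declarative: it states which tuples to match rather than how to scan them, which is the Datalog-inspired style the description refers to.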
Features
- Expressive, Datalog-like query language that runs on Hadoop or locally
- Simplified abstraction over Cascading to avoid low-level Hadoop complexity
- Seamless handling of distributed Big Data workflows
- Pure Java API (JCascalog) available for Java integration and experimentation
- Useful for prototyping data flows that scale from local tests to production clusters
- Draws inspiration from existing tools like Pig, Hive, and Cascading while providing richer abstraction
Categories
Data Management
License
MIT License