22 projects for "big data" with 2 filters applied:

  • 1
    testng

    TestNG testing framework

    TestNG is a testing framework inspired by JUnit and NUnit that introduces new functionality to make it more powerful and easier to use. Run your tests in arbitrarily big thread pools with various policies available (all methods in their own thread, one thread per test class, etc.).
    Downloads: 4 This Week
  • 2
    Bacalhau

    Community-driven, simple, yet powerful framework

    Bacalhau is a decentralized compute platform for running jobs on data stored across distributed networks, like IPFS or Filecoin, without moving the data to centralized cloud environments. It allows developers to run containerized workloads close to where the data lives, reducing latency, cost, and privacy risks. Bacalhau supports various runtime environments and is designed to make decentralized data processing as accessible as traditional cloud computing. It’s especially useful for...
    Downloads: 0 This Week
  • 3
    Apache Spark

    A unified analytics engine for large-scale data processing

    ...With Spark Streaming (microbatches) and Structured Streaming, it delivers low-latency event processing suitable for real-time analytics. The built-in MLlib library provides scalable machine learning algorithms, while GraphX enables graph computations integrated with data pipelines. Spark supports multiple languages—Scala, Java, Python, R—and connects with many storage systems like HDFS, S3, Cassandra, and streaming platforms like Kafka, making it a versatile choice for big data workloads in analytics, ETL, and data science.
    Downloads: 1 This Week
  • 4
    huihut interview

    A summary of C/C++ technical interview basics

    ...It’s organized to be approachable whether you’re a student preparing for your first internship or an experienced engineer brushing up on fundamentals before a big interview round.
    Downloads: 0 This Week
  • 5
    zpdf

    Zero-copy PDF text extraction library written in Zig

    zpdf is a high-performance PDF text extraction library written in Zig that focuses on speed, low overhead, and modern parsing techniques. It leans heavily on memory-mapped file reading and zero-copy patterns where possible, so it can scan large PDFs without repeatedly copying data around in memory. The library supports streaming extraction using efficient arena allocation, making it well suited for workloads that need to process big documents quickly or in batches. It implements multiple PDF decompression filters and handles common font encoding pathways, which are essential for turning raw PDF content streams into readable text. ...
    Downloads: 0 This Week
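The memory-mapped reading mentioned in the zpdf entry can be illustrated with a rough sketch. This is Python rather than Zig and is not zpdf's code or API; the `count_marker` helper and the sample bytes are made up for illustration. It scans a mapped file for a marker without first reading the whole file into a separate buffer.

```python
import mmap
import os
import tempfile


def count_marker(path: str, marker: bytes) -> int:
    """Count non-overlapping occurrences of marker in a memory-mapped file."""
    with open(path, "rb") as handle:
        # Length 0 maps the whole file; mapping an empty file raises ValueError.
        with mmap.mmap(handle.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            count = 0
            position = mapped.find(marker)
            while position != -1:
                count += 1
                position = mapped.find(marker, position + len(marker))
            return count


# Made-up sample bytes that only resemble a PDF content stream (assumption).
sample = b"BT (Hello) Tj ET\nBT (World) Tj ET\n"
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(sample)
try:
    text_blocks = count_marker(tmp.name, b"BT")
finally:
    os.remove(tmp.name)
```

The operating system pages the file in on demand, which is the general idea behind scanning large documents without repeated copying.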
  • 6
    Nano Events

    Simple and tiny (107 bytes) event emitter library for JavaScript

    Nano Events is a minimalistic, high-performance event emitter library for JavaScript. Its goal is to provide the simplest possible API to add pub/sub capabilities (emitters and listeners) to any JS object or application, while keeping overhead and bundle size extremely small. Rather than offering many complex features, nanoevents focuses on the core primitives: creating an emitter, subscribing to named events, emitting events with arbitrary data, and unsubscribing. Because of its minimal API...
    Downloads: 0 This Week
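The four primitives named in the Nano Events entry (create an emitter, subscribe, emit, unsubscribe) can be sketched in a few lines. The sketch below is in Python, not JavaScript, and is not the nanoevents source; the `Emitter` class and its method names are assumptions chosen for illustration.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, List


class Emitter:
    """Hypothetical minimal emitter; not the nanoevents implementation."""

    def __init__(self) -> None:
        self._listeners: DefaultDict[str, List[Callable[..., Any]]] = defaultdict(list)

    def on(self, event: str, callback: Callable[..., Any]) -> Callable[[], None]:
        """Subscribe and return a function that removes this subscription."""
        self._listeners[event].append(callback)

        def unsubscribe() -> None:
            if callback in self._listeners[event]:
                self._listeners[event].remove(callback)

        return unsubscribe

    def emit(self, event: str, *args: Any) -> None:
        """Call every listener registered for the event with the given data."""
        # Iterate over a copy so listeners may unsubscribe while being called.
        for callback in list(self._listeners[event]):
            callback(*args)


received = []
emitter = Emitter()
unbind = emitter.on("tick", received.append)
emitter.emit("tick", 1)
unbind()
emitter.emit("tick", 2)  # no listener remains, so nothing is recorded
```

Returning the unsubscribe function from `on` keeps the surface small: there is no separate `off` method to keep in sync.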
  • 7
    MOA - Massive Online Analysis

    Big Data Stream Analytics Framework.

    A framework for learning from a data stream, that is, a continuous supply of examples. It includes classification, regression, clustering, outlier detection, and recommender systems. MOA is related to the WEKA project and is also written in Java, while scaling to adaptive large-scale machine learning.
    Downloads: 25 This Week
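Learning from a continuous supply of examples, as the MOA entry describes, is often evaluated by testing on each example before training on it. The sketch below shows that test-then-train loop in Python with a deliberately trivial learner. It does not use MOA's API; `MajorityClassLearner`, `prequential_accuracy`, and the sample stream are all hypothetical.

```python
from collections import Counter
from typing import Any, Hashable, Iterable, Optional, Tuple


class MajorityClassLearner:
    """Toy online learner that always predicts the most frequent label so far."""

    def __init__(self) -> None:
        self.counts: Counter = Counter()

    def predict(self, features: Any) -> Optional[Hashable]:
        if not self.counts:
            return None  # nothing seen yet
        return self.counts.most_common(1)[0][0]

    def learn(self, features: Any, label: Hashable) -> None:
        self.counts[label] += 1


def prequential_accuracy(stream: Iterable[Tuple[Any, Hashable]],
                         learner: MajorityClassLearner) -> float:
    """Test on each example first, then train on it (one pass, constant memory)."""
    correct = total = 0
    for features, label in stream:
        if learner.predict(features) == label:
            correct += 1
        learner.learn(features, label)
        total += 1
    return correct / total if total else 0.0


# Made-up stream of (features, label) pairs.
stream = [((0,), "a"), ((1,), "a"), ((2,), "b"), ((3,), "a")]
accuracy = prequential_accuracy(stream, MajorityClassLearner())
```

Each example is seen once and then discarded, which is what lets this style of learner keep up with an unbounded stream.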
  • 8
    applied-ml

    Papers & tech blogs by companies sharing their work on data science

    ...For someone designing—or planning to build—a production ML system, this repo provides patterns, precedents, and lessons learned from firms that operate at big scale.
    Downloads: 0 This Week
  • 9
    wxMEdit

    wxMEdit, Cross-platform Text/Hex Editor, Improved Version of MadEdit

      • Added automatic update checking
      • Added bookmark support
      • Added a right-click context menu for each tab
      • Added support for purging histories
      • Added triple-click line selection
      • Added a FreeBASIC syntax file
      • Added an option to place configuration files in the %APPDATA% directory under Windows
      • Improved support for Find/Replace
      • Improved Mac OS X support
      • Improved system integration under Windows
      • Improved encoding detection results
      • Improved Hex editing support
      • Added more...
    Downloads: 128 This Week
  • 10
    Guide to Technical Interviews

    Guided collection and roadmap for preparing technical interviews

    This repository is a guided collection and roadmap for preparing technical interviews, covering the gamut from algorithmic challenges and data structures to system design and behavioral preparation. It consolidates resources like interview question lists, practice platforms, mock interview sites, and recommended books or blogs. For individuals targeting big-tech or rigorous interview processes, this acts as a structured study guide rather than a random list of links. The README breaks down preparation into categories — coding problems, system design, mock interview sites — so you can identify gap areas and allocate study time accordingly. ...
    Downloads: 0 This Week
  • 11
    jQuery json-viewer

    jQuery plugin for displaying JSON data

    json-viewer is a jQuery plugin for easily displaying JSON objects by transforming them into HTML.
    Downloads: 0 This Week
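The idea behind the json-viewer entry, turning a JSON value into nested HTML, can be sketched as a small recursive function. This is Python, not the plugin's jQuery code, and the `render_json` name, tag choices, and class names are assumptions made for illustration.

```python
import html
import json
from typing import Any


def render_json(value: Any) -> str:
    """Render a parsed JSON value as nested HTML lists (illustrative only)."""
    if isinstance(value, dict):
        items = "".join(
            f'<li><span class="key">{html.escape(str(key))}</span>: {render_json(item)}</li>'
            for key, item in value.items()
        )
        return f"<ul>{items}</ul>"
    if isinstance(value, list):
        items = "".join(f"<li>{render_json(item)}</li>" for item in value)
        return f"<ol>{items}</ol>"
    # Scalars: serialize back to JSON text, then escape it for safe embedding.
    return f'<span class="value">{html.escape(json.dumps(value))}</span>'


markup = render_json(json.loads('{"name": "a<b", "tags": [1, true]}'))
```

Escaping every key and scalar matters here, because JSON strings can contain markup that would otherwise be interpreted by the browser.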
  • 12
    Svelte forms lib

    A lightweight library for managing forms in Svelte

    Svelte Forms lib is a Formik-inspired library for building forms easily in a Svelte project. Forms often play a big part in modern web applications. We use forms to log in, place orders, book flights, and perform other data-entry tasks. In developing a form, it's important to create a flow that guides the user efficiently and effectively through the workflow. This library helps you build forms by exposing an easy API for form creation, validation, and submission.
    Downloads: 0 This Week
  • 13
    Big List of Naughty Strings

    List of strings which have a high probability of causing issues

    The Big List of Naughty Strings is a community-maintained catalog of “gotcha” inputs that commonly break software, from unusual Unicode to SQL and script injection payloads. It exists so developers and QA engineers can easily test edge cases that normal test data would miss, such as zero-width characters, right-to-left marks, emojis, foreign alphabets, and long or malformed strings.
    Downloads: 0 This Week
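The kind of edge-case testing this entry describes can be sketched by running a handful of awkward inputs through the code under test. The strings below are hand-picked examples of the categories mentioned, not entries copied from the list itself, and the JSON round-trip check is just one assumed property to test.

```python
import json

# Hand-picked samples of the categories named above (not taken from the list file).
SAMPLE_INPUTS = [
    "",                            # empty string
    "\u200b",                      # zero-width space
    "\u200fabc",                   # right-to-left mark
    "\U0001F600",                  # emoji outside the Basic Multilingual Plane
    "'; DROP TABLE users; --",     # SQL injection shape
    "<script>alert(1)</script>",   # script injection shape
    "A" * 10_000,                  # very long string
]


def round_trips(text: str) -> bool:
    """Property under test: encoding to JSON and back preserves the string."""
    return json.loads(json.dumps(text)) == text


failures = [text for text in SAMPLE_INPUTS if not round_trips(text)]
```

In a real suite, `round_trips` would be replaced by whatever function handles user input, and a non-empty `failures` list would point at the offending strings.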
  • 14
    data-science-ipython-notebooks

    Data science Python notebooks: Deep learning

    Data Science IPython Notebooks is a broad, curated set of Jupyter notebooks covering Python, data wrangling, visualization, machine learning, deep learning, and big data tools. It aims to be a practical map of the ecosystem, showing hands-on examples with libraries such as NumPy, pandas, matplotlib, scikit-learn, and others. Many notebooks introduce concepts step by step, then apply them to real datasets so readers can see techniques in action.
    Downloads: 0 This Week
  • 15
    Skill Map

    A visualization of programmer skill maps

    Skill‑Map is an open-source, collaborative project, originating from Geekbang, that offers a structured visualization of programmer skill maps across domains such as AI, big data, architecture, front-end, backend, DevOps, and testing. It serves as a navigable resource for organizing learning paths and essential knowledge areas. The project encourages community collaboration and feedback via GitHub Issues and is open to content updates and additions as a living reference.
    Downloads: 0 This Week
  • 16
    giServer

    giServer the easy to use and extensible batch and integration server

    ...Instead of using complex XML configuration files, an elaborate GUI for batch job management is included. Some possible usage scenarios are:
      - Automatic processing of incoming data files
      - Big Data applications
      - Process automation
      - Data Mining/Aggregation applications
      - Automatic Reporting
      - Processing and analysis of database records
    Downloads: 0 This Week
  • 17

    Big Sack

    Big Sack: A lightweight Java Key/Value store with undo and disk cache.

    Big Sack is a Java persistence mechanism that allows storage of key/value pairs following the popular Big Data paradigms. It's a very simple and straightforward way to bridge the gap between in-memory data structures and long-term storage. It has the convenience of the Java SDK TreeMap and TreeSet classes and is used in the same easy way, but it includes rollback through undo logging to checkpoint data, so that data does not wind up in an unknown state regardless of failures. ...
    Downloads: 0 This Week
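Rollback through undo logging, as mentioned in the Big Sack entry, can be sketched with an in-memory map that records the previous value of each key before overwriting it. This is a Python illustration, not Big Sack's Java API; the `UndoMap` class and its methods are hypothetical, and the disk cache is left out entirely.

```python
from typing import Any, Dict, List, Tuple

_MISSING = object()  # sentinel meaning "key did not exist before this write"


class UndoMap:
    """Hypothetical key/value map with checkpoint and rollback via an undo log."""

    def __init__(self) -> None:
        self._data: Dict[Any, Any] = {}
        self._undo: List[Tuple[Any, Any]] = []

    def put(self, key: Any, value: Any) -> None:
        # Log the old value first so the write can be reversed later.
        self._undo.append((key, self._data.get(key, _MISSING)))
        self._data[key] = value

    def get(self, key: Any) -> Any:
        return self._data[key]

    def keys(self) -> List[Any]:
        return sorted(self._data)  # sorted iteration, in the spirit of a TreeMap

    def checkpoint(self) -> None:
        """Accept all writes so far; they can no longer be rolled back."""
        self._undo.clear()

    def rollback(self) -> None:
        """Undo every write made since the last checkpoint, newest first."""
        while self._undo:
            key, old = self._undo.pop()
            if old is _MISSING:
                self._data.pop(key, None)
            else:
                self._data[key] = old


store = UndoMap()
store.put("a", 1)
store.checkpoint()
store.put("a", 2)
store.put("b", 3)
store.rollback()  # back to the state at the checkpoint
```

Replaying the log in reverse order is what restores the checkpointed state even when the same key was written several times.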
  • 18
    microhex [discontinued]

    Crossplatform hex-editing software based on Python and Qt

    This project is no longer supported. Use it at your own risk (or not at all). Microhex is an intuitive HEX editing application that enables you to view and manipulate binary data for any file on your computer. Microhex displays the integer column and the characters column, allowing you to add new columns and delete existing ones. Each column can be assigned an unlimited number of linked address bars.
    Downloads: 8 This Week
  • 19
    pyIRDG

    IMDb Relational Dataset Generator

    pyIRDG is a program written in Python to generate relational datasets in Prolog format. It uses data from the Internet Movie Database, with IMDbPY as the backend. A graphical user interface written in PyQt allows the user to link multiple entities together as a model for the generation process. The big four entities are Title, Person, Company, and Character. Many attributes can be chosen for inclusion in the output .pl file.
    Downloads: 0 This Week
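Writing linked entities out as Prolog facts, as the pyIRDG entry describes, might look roughly like the sketch below. It is not pyIRDG's code and does not call IMDbPY; the predicate names and the placeholder records are assumptions for illustration.

```python
from typing import List, Tuple


def prolog_atom(text: str) -> str:
    """Quote a string as a Prolog atom, escaping backslashes and single quotes."""
    escaped = text.replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"


def fact(predicate: str, *args: str) -> str:
    """Format one Prolog fact such as directed_by('T', 'P')."""
    arguments = ", ".join(prolog_atom(arg) for arg in args)
    return f"{predicate}({arguments})."


# Placeholder records, not real database entries.
links: List[Tuple[str, str]] = [("Example Title", "Example Person")]

lines = []
for title, person in links:
    lines.append(fact("title", title))
    lines.append(fact("person", person))
    lines.append(fact("directed_by", title, person))

program = "\n".join(lines)  # text that could be written to a .pl file
```

Quoting every value as an atom avoids surprises with names that start with an uppercase letter, which Prolog would otherwise read as variables.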
  • 20
    A generic, SQL-driven data audit tool for detecting differences between any JDBC-accessible database tables and other data sources. Platform independent. It works like a Unix-style diff for databases and produces key values together with the differing column name and data.
    Downloads: 1 This Week
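A keyed comparison of the kind this entry describes, reporting the key value plus the differing column name and data, can be sketched as follows. This is a Python illustration over in-memory rows rather than JDBC result sets, and it is not the tool's own code; `diff_rows`, the `"<row>"` marker, and the sample rows are hypothetical.

```python
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]


def diff_rows(left: List[Row], right: List[Row], key: str) -> List[Tuple[Any, str, Any, Any]]:
    """Return (key value, column, left data, right data) for every difference."""
    left_by_key = {row[key]: row for row in left}
    right_by_key = {row[key]: row for row in right}
    differences = []
    for key_value in sorted(left_by_key.keys() | right_by_key.keys()):
        left_row = left_by_key.get(key_value)
        right_row = right_by_key.get(key_value)
        if left_row is None or right_row is None:
            # The whole row is missing on one side.
            differences.append((key_value, "<row>", left_row, right_row))
            continue
        for column in sorted(left_row.keys() | right_row.keys()):
            if left_row.get(column) != right_row.get(column):
                differences.append(
                    (key_value, column, left_row.get(column), right_row.get(column))
                )
    return differences


# Made-up sample rows.
source = [{"id": 1, "name": "Ann", "city": "Oslo"}, {"id": 2, "name": "Bob", "city": "Rome"}]
target = [{"id": 1, "name": "Ann", "city": "Bergen"}, {"id": 3, "name": "Cy", "city": "Lima"}]
changes = diff_rows(source, target, key="id")
```

Indexing both sides by key first means row order in the two sources does not affect the result.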
  • 21
    Protodata
    Protodata is a language for manually creating binary data files without the use of a hex editor, with the original purpose of prototyping new file formats.
    Downloads: 0 This Week
  • 22
    pXw4Pa (poor XML wrapper for PHP arrays) is a pair of simple PHP functions, written with PHP 4.3.7, that read and write a PHP array from and to an XML file. It can be used to store data in XML files simply and quickly, without relying on larger libraries.
    Downloads: 0 This Week
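The read/write pair described in the pXw4Pa entry can be sketched as two small functions. The sketch is in Python with the standard library, not PHP, and is not pXw4Pa's code; the element and attribute names are assumptions, it handles only a flat mapping of strings, and it works on XML text rather than files.

```python
import xml.etree.ElementTree as ET
from typing import Dict


def dict_to_xml(data: Dict[str, str], root_tag: str = "array") -> str:
    """Serialize a flat string mapping as <array><item key="...">value</item></array>."""
    root = ET.Element(root_tag)
    for key, value in data.items():
        item = ET.SubElement(root, "item", {"key": key})
        item.text = value
    return ET.tostring(root, encoding="unicode")


def xml_to_dict(text: str) -> Dict[str, str]:
    """Parse the format produced by dict_to_xml back into a mapping."""
    root = ET.fromstring(text)
    # An empty element parses with text None, so fall back to "".
    return {item.get("key"): item.text or "" for item in root.findall("item")}


settings = {"host": "localhost", "port": "8080", "note": ""}
restored = xml_to_dict(dict_to_xml(settings))
```

Storing the key in an attribute instead of the tag name sidesteps the restriction that XML element names cannot start with a digit or contain spaces.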