Showing 102 open source projects for "large csv"

  • 1
    League CSV

    CSV data manipulation made easy in PHP

    League CSV is a PHP library for reading, writing, and manipulating CSV files. It offers a straightforward API for common CSV operations, including parsing data, writing rows, and formatting output. The library is designed to handle large datasets efficiently, which suits it to data processing tasks in web applications.
    Downloads: 1 This Week
  • 2
    Fast CSV

    CSV parser and formatter for node

    A high-performance Node.js library for parsing and formatting CSV data efficiently.
    Downloads: 0 This Week
  • 3
    Vince's CSV Parser

    A modern C++ library for reading, writing, and analyzing CSV

    There are plenty of other CSV parsers in the wild, but I had a hard time finding what I wanted. Inspired by Python's csv module, I wanted a library with simple, intuitive syntax. Furthermore, I wanted support for special use cases such as calculating statistics on very large files. With the deluge of large datasets available, a performant CSV parser is a necessity.
    Downloads: 0 This Week
  • 4
    QSV

    Blazing-fast Data-Wrangling toolkit

    qsv is a fast, command-line CSV data toolkit written in Rust that extends the capabilities of xsv. It’s designed to make working with CSV files at scale easy and efficient, offering over 40 powerful subcommands for tasks like querying, sampling, splitting, deduplicating, and more. qsv is ideal for data engineers, analysts, and developers who need high-performance CSV manipulation on the command line.
    Downloads: 85 This Week
  • 5
    WeChatMsg

    Project aimed at extracting, exporting, and analyzing chat records

    The WeChatMsg repository hosts an open-source project for extracting, exporting, and analyzing chat records from the WeChat messaging platform. It provides tools that read local WeChat database files and allow users to convert chat data into readable formats such as HTML, Word, and CSV, making it possible to inspect conversations outside the mobile app environment. Beyond simple export, the project includes mechanisms for analyzing chat histories and generating annual reports or visual...
    Downloads: 194 This Week
  • 6
    DuckDB

    DuckDB is an in-process SQL OLAP Database Management System

    ...DuckDB supports arbitrary and nested correlated subqueries, window functions, collations, complex types (arrays, structs), and more. For more information on the goals of DuckDB, please refer to the Why DuckDB page on our website. Use cases include processing and storing tabular datasets, e.g. from CSV or Parquet files; interactive data analysis, e.g. joining and aggregating multiple large tables; concurrent large changes to multiple large tables, e.g. appending rows or adding, removing, and updating columns; and large result set transfer to the client. For development, DuckDB requires CMake, Python3 and a C++11 compliant compiler. Run make in the root directory to compile the sources. ...
    Downloads: 9 This Week
  • 7
    NYC Taxi Data

    Import public NYC taxi and for-hire vehicle (Uber, Lyft)

    The nyc-taxi-data repository is a rich dataset and exploratory project around New York City taxi trip records. It collects and preprocesses large-scale trip datasets (fares, pickup/dropoff, timestamps, locations, passenger counts) to enable data analysis, modeling, and visualization efforts. The project includes scripts and notebooks for cleaning and filtering the raw data, memory-efficient processing for large CSV/Parquet files, and aggregation workflows (e.g. trips per hour, heatmaps of pickups/dropoffs). ...
    Downloads: 3 This Week
  • 8
    Readr

    Read flat files (csv, tsv, fwf) into R

    readr is an R package that provides a fast and friendly way to read rectangular data, such as CSV and TSV files. Part of the Tidyverse, it simplifies data import and parsing tasks in R.
    Downloads: 1 This Week
  • 9
    SiteDorks

    Automate search engine dorking across hundreds of websites

    ...A built-in dataset contains hundreds of websites grouped into categories such as cloud services, developer platforms, documentation sites, social platforms, and communication tools. Users can also supply custom domain lists or CSV files to tailor searches for tasks like penetration testing, bug bounty research, or OSINT investigations.
    Downloads: 5 This Week
  • 10
    DocStrange

    Extract and convert data from any document, images, pdfs, word doc

    DocStrange is an open-source document understanding and extraction library designed to convert complex files into structured, LLM-ready outputs such as Markdown, JSON, CSV, and HTML. Developed by Nanonets, the project combines OCR, layout detection, table understanding, and structured extraction into one end-to-end pipeline, which reduces the need to stitch together multiple separate services. It is built for developers who need high-quality parsing from scans, photos, PDFs, office files,...
    Downloads: 3 This Week
  • 11
    Miller

    Miller is like awk, sed, cut, join, and sort for name-indexed data

    Miller is like awk, sed, cut, join, and sort for data formats such as CSV, TSV, JSON, JSON Lines, and positionally indexed data. With Miller, you use named fields in these familiar formats without needing to count positional indices. Then, on the fly, you can add new fields which are functions of existing fields, drop fields, sort, aggregate statistically, pretty-print, and more. Miller operates on key-value-pair data while the...
    Downloads: 13 This Week
  • 12
    dataline

    AI data analysis and visualization on CSV, Postgres, MySQL, Snowflake

    dataline is an open-source AI data analysis and visualization platform that allows users to interact with datasets using natural language. The system enables both technical and non-technical users to explore data by asking questions conversationally, which the platform translates into database queries and analytical operations. It supports connections to multiple structured data sources such as PostgreSQL, MySQL, Snowflake, SQLite, Excel files, CSV datasets, and other database systems. Once...
    Downloads: 0 This Week
  • 13
    OmniTools

    Self-hosted collection of powerful web-based tools for everyday tasks

    ...The tool catalog spans both technical and non-technical needs, including image, video, audio, PDF, text, date/time, math, and data format utilities like JSON/CSV/XML helpers. It’s also packaged for straightforward self-hosting, with a lightweight Docker image and simple run commands, so it can be deployed quickly on a homelab or internal network.
    Downloads: 6 This Week
  • 14
    timelinize

    Store your data from all your accounts and devices

    ...It supports multiple visualization styles, such as chronological horizontal views and compact vertical modes, so you can tailor output to fit different screen sizes and content density. With automatic linking between events and optional categories or tags, timelines become easier to navigate and filter for large datasets, making this tool practical for history projects, product roadmaps, research archives, or personal journals.
    Downloads: 2 This Week
  • 15
    OpenBB

    Investment Research for Everyone, Everywhere

    Customize and speed up your analysis, bring your own data, and create instant reports to gain a competitive edge. Bring in a CSV file, a private endpoint, or an RSS feed, or even embed an SEC filing directly. Chat with financial data using large language models. Don’t waste time reading: create summaries in seconds and ask how that impacts investments. Create your dashboard with your favorite widgets. Create charts directly from raw data in seconds. ...
    Downloads: 2 This Week
  • 16
    database.build

    In-browser Postgres sandbox with AI assistance

    ...It eliminates the need for traditional database setup by running a lightweight version of PostgreSQL locally using PGlite, enabling developers to experiment, prototype, and analyze data without relying on external servers. Each database instance is paired with a large language model, allowing users to perform tasks such as generating schemas, importing CSV data, building diagrams, and creating reports using natural language interactions. The platform supports persistent storage through IndexedDB, ensuring that database changes remain available across sessions. It also provides tools for generating charts and visualizations, making it useful not only for developers but also for analysts and data-driven workflows.
    Downloads: 1 This Week
  • 17
    Query.jl

    Query almost anything in Julia

    Query is a package for querying Julia data sources. It can filter, project, join and group data from any iterable data source, including all the sources supported in IterableTables.jl. For example, one can query any of the following data sources: any array, DataFrames, DataStreams (including CSV, Feather, SQLite, ODBC), DataTables, IndexedTables, TimeSeries, Temporal, TypedTables and DifferentialEquations (any DESolution). The package currently provides working implementations for in-memory...
    Downloads: 0 This Week
  • 18
    slang

    Type-safe i18n for Dart and Flutter

    Type-safe i18n solution using JSON, YAML, CSV, or ARB files.
    Downloads: 0 This Week
  • 19
    CSV Quick Viewer

    CSV Quick Viewer

    ...CSVQuickViewer runs without administrative rights, provides column insights, and allows exporting filtered data, offering a fast, reliable, and local solution for inspecting structured text data. Ideal for analysts, developers, and support teams working with large or messy data files.
    Downloads: 49 This Week
  • 20
    Gmail Cleaner

    Web-based GUI to clean up Gmail: delete, mark as read

    ...The tool employs smart filtering options such as age, size, category, and sender, and runs entirely locally — meaning all Gmail API interactions happen on the user’s own machine without sending data to external servers. With Docker and native Python support, Gmail Cleaner works on Linux, macOS, and Windows and makes large-scale inbox management accessible to both casual and power users. It includes features like progress tracking, smart filters, label management, and CSV export for metadata, making it versatile for different cleanup strategies.
    Downloads: 2 This Week
  • 21
    sparklyr

    R interface for Apache Spark

    sparklyr is an R package that provides seamless interfacing with Apache Spark clusters—either local or remote—while letting users write code in familiar R paradigms. It supplies a dplyr-compatible backend, Spark machine learning pipelines, SQL integration, and I/O utilities to manipulate and analyze large datasets distributed across cluster environments.
    Downloads: 2 This Week
  • 22
    ImportExcel

    PowerShell module to import/export Excel spreadsheets, without Excel

    ImportExcel is a popular PowerShell module that enables reading, writing, and manipulating Excel spreadsheets without requiring Microsoft Excel to be installed on the host. It exposes straightforward cmdlets like Import-Excel and Export-Excel that convert between Excel sheets and PowerShell objects, making it simple to pipeline tabular data into reporting and automation flows. Advanced features include adding and formatting tables, setting number/date formats, creating charts, and applying...
    Downloads: 3 This Week
  • 23
    Plog

    Portable, simple and extensible C++ logging library

    Plog is a C++ logging library designed to be as simple, small and flexible as possible. It was created as an alternative to existing large libraries and provides some unique features, such as a CSV log format and wide string support.
    Downloads: 1 This Week
  • 24
    Python 100 Days

    Python - From Novice to Master in 100 Days

    Python-100-Days is a comprehensive, practice-first learning roadmap by Luo Hao that spans 100 days from absolute Python basics to professional, production-grade skills. It starts with foundational syntax, control flow, data structures, and functions, then advances through object-oriented programming, file I/O, exceptions, and modules. The middle sections focus on real-world Python applications, including working with CSV, Excel, Word, PowerPoint, PDFs, images, email/SMS, and regular...
    Downloads: 11 This Week
  • 25
    Eng2BN CSV Translator

    Translate English to Bangla in CSV files, by row range.

    Eng2BN CSV Translator is a user-friendly Python tool for translating English text to Bangla within CSV files. The application supports large datasets and lets users translate specific row ranges, making it well suited to batch processing.
    Downloads: 0 This Week