Showing 200 open source projects for "python data analysis"

  • 1
    Scry.info blockchain Data Protocol

    The first open source data protocol layer for the blockchain

    By providing an SDK for data exchange through the blockchain, it lets developers build DApps more conveniently. It mainly includes the following: data encryption and decryption, digital signatures, smart contracts, event notification, a data storage interface, data acquisition and query, digital currency payment, a third-party app payment interface, etc.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 2
    Tera

    An Internet-scale database

    ...Supports a RAMDISK/SSD/DFS tiered cache. Block cache and Bloom filters for real-time queries. Multi-type table support (RAMDISK/SSD/DISK table). Easy-to-use C++/Java/Python/RESTful API. Column-oriented storage and locality group support. Ranged and hashed sharding strategies.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 3
    Cosmos DB Spark

    Apache Spark Connector for Azure Cosmos DB

    Azure Cosmos DB Spark is the official connector for Azure Cosmos DB and Apache Spark. The connector allows you to easily read from and write to Azure Cosmos DB via Apache Spark DataFrames in Python and Scala. It also allows you to easily create a lambda architecture for batch processing, stream processing, and a serving layer while being globally replicated and minimizing the latency involved in working with big data.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 4
    Grinn

    graph database and R package for omic data integration

    http://kwanjeeraw.github.io/grinn/
    Downloads: 0 This Week
    Last Update:
    See Project
  • 5
    FAKE2DB

    Create custom test databases that are populated with fake data

    Generate databases filled with fake but valid data for test purposes, using the most popular patterns (AFAIK). Currently supported: sqlite, mysql, postgresql, mongodb, redis, CouchDB. Installation through PyPI retrieves 'fake-factory' as a main dependency. The db argument takes 6 possible options: sqlite, mysql, postgresql, mongodb, redis, CouchDB. The name argument is OPTIONAL; when it is absent, fake2db names the databases randomly. The host argument is OPTIONAL: the hostname to use for the database connection. Not used for...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    WikiSQL

    A large annotated semantic parsing corpus for developing NL interfaces

    A large crowd-sourced dataset for developing natural language interfaces for relational databases. WikiSQL is the dataset released along with our work Seq2SQL: Generating Structured Queries from Natural Language using Reinforcement Learning. Regarding tokenization and Stanza: when WikiSQL was written 3 years ago, it relied on Stanza, a CoreNLP Python wrapper that has since been deprecated. If you'd still like to use the tokenizer, please use the Docker image. We do not anticipate switching...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 7
    DatacenterManager

    UNIX Performance Monitoring / Trend Analysis Java Software

    Remotely inventory and poll UNIX servers in seconds, without installing extra software on your servers, using only plain old UNIX commands over SSH. https://sites.google.com/site/ronuitzaandam/ Your entire datacenter can be automatically inventoried by supplying a hostname, username, and password for each server, either one by one or via an automated CSV host-list import file. This software goes well with other UNIX software such as WinSCP and PuTTY.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    Wandora
    Wandora is a general-purpose information extraction, management, and publishing environment based on Topic Maps and Java. Wandora has several data storage options, rich data extraction, import and export capabilities, and an embedded server.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    FastVersion

    FastVersion is a QGIS plugin for data versioning in a PostGIS database

    ...The system uses algorithms and data structures so that information is not duplicated each time a version is created. We recommend downloading it from within the QGIS app: first Plugins->Manage and Install Plugins->Settings->Show also experimental plugins, then Plugins->Manage and Install Plugins->Search FastVersion. If downloaded from SourceForge, it must be unzipped in \.qgis2\python\plugins
    Downloads: 0 This Week
    Last Update:
    See Project
  • 10
    DataCleaner

    Data quality analysis, profiling, cleansing, duplicate detection +more

    DataCleaner is a data quality analysis application and a solution platform for DQ solutions. Its core is a strong data profiling engine, which is extensible and thereby adds data cleansing, transformations, enrichment, deduplication, matching and merging. Website: http://datacleaner.github.io
    Downloads: 2 This Week
    Last Update:
    See Project
  • 11

    Adele

    Adhoc Data Exploration - Live & Easy

    ...Many technical concepts are included in an easier form, for example realtime OLAP, transformations, charts, analysis tools,... Connectors (e.g. JDBC, SAP ABAP, OData) can be used to pre-analyse the data and extract it without saving the data as text files. A plugin concept for enhancements is available. Enjoy! It's free for commercial use too. Adele runs without installation from a USB stick on Windows, Linux and MacOSX. Last added changes: - data science tools (V1, IQR) - export to remote and desktop databases (mysql, sqlite, ms access) - internet features for emails and domains
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    This is a tool for searching, downloading, and analyzing patents. Analysis focuses on patent families, citation networks, assignee networks, and inventor networks.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    BioNLP-Corpora is a repository of biomedically and linguistically annotated corpora and biomedical data sources. There are many resources available in separate packages in this project.
    Downloads: 7 This Week
    Last Update:
    See Project
  • 14
    littletable is a lightweight in-memory data manager of collections of Python objects, providing ORM-like access for querying and joining data using object attributes as pseudo-columns.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15

    (py)biblib

    A python library to handle BibTeX bibliographic data.

    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    NoSQLMap

    Automated NoSQL database enumeration and web application exploitation

    A security tool for detecting and exploiting vulnerabilities in NoSQL databases, similar to SQLMap for traditional databases.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    LightProfiler

    Profiler for Oracle extended SQL trace files

    LightProfiler is an application for performance analysis of Oracle databases. It generates a detailed resource profile for extended SQL trace files (10046 event), containing information about response time consumption (by events, by cursors, etc.), data file usage, error analysis (SQL, PL/SQL) and much more. It also contains tools for additional processing of trace files (extracting session data, splitting files) and for managing database sessions (disconnecting, tracing, monitoring parameters, blocking locks, events, etc.).
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    Open Information Integration
    Open Information Integration Tool Suite (Open II) is used by analysts and programmers to accelerate data integration and harmonization across organizations. OpenII has a neutral schema repository for browsing and comparing all sorts of data models. OpenII is built as a Rich Client Platform Application on top of Eclipse 3.x. Developers need to download Eclipse, install the RCP support, the Fatjar plugin and the Delta Pack in one of the 3.x flavors. Release Notes Release Date: Jan...
    Downloads: 14 This Week
    Last Update:
    See Project
  • 19
    Collective Mind Technology

    plugin-based framework for systematic and reproducible experimentation

    New version moved to http://github.com/ctuning/ck. Collective Mind framework (cM) is an open-source, plugin-based, schema-free repository and infrastructure for collaborative, systematic and reproducible research and experimentation. This 3rd version (started in 2006) helps to implement, preserve, share and reproduce the whole experimental setup as connected modules and data. cM uses crowdsourcing to leverage the knowledge and computational resources of multiple users. For example, it includes multi-objective GCC, LLVM and ICC auto-tuning scenarios using shared benchmarks, codelets, data sets, and tools, combined with classification and predictive models. cM includes the OpenME interactive interface to open up and expose the internals of various third-party tools such as GCC, LLVM, run-time systems, etc. and connect them to cM through dynamic plugins that allow online analysis and tuning of programs and architectures. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    PyTables - Hierarchical datasets
    The goal of PyTables is to enable the end user to efficiently and easily manipulate large datasets (both homogeneous, i.e. arrays, and heterogeneous, i.e. tables) in a persistent, hierarchical way.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21

    PalOOCa OpenOffice Extension for Palo

    Palo OLAP OpenOffice Calc plugin for data analysis

    The PalOOCa Project offers a fast, flexible and intuitive Office-based Business Intelligence solution based on Jedox. It provides an extension for OpenOffice.org Calc which allows both read and write access to data in the Jedox OLAP Server from within Calc. If used together with the Open Source Jedox/Palo OLAP Server, it completes the Open Source MOLAP stack for Business Intelligence. In addition to Jedox OLAP, it is also (read-only) compatible with (almost) all OLAP servers supporting XMLA. Intended audiences are: - scientific research & teaching - financial analysis & controlling. Consult the FAQ in the Wiki.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 22

    ProC 3.0

    Smart workflow engine

    ProC 3.0 is a scientific workflow engine to build, manage and execute workflows (pipelines) in heterogeneous environments, supporting GRID and other means of parallel processing. It includes a data management component (DMC) to transparently access databases for storage of results, and it automatically adds metadata to track the processing of data products, so that a full processing history is available at all times. The software was developed and used within the ESA Planck satellite mission.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23

    helenos

    Web-based GUI tool to manage your data stored in Apache Cassandra

    Helenos is a free web-based environment that simplifies data exploration and schema management with the Apache Cassandra database.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    Graph-RAT
    Graph-RAT is a database abstraction layer designed to make it easy to use a large library of graph-analysis routines on a database as well as add new kinds of algorithms to data mining.
    Downloads: 0 This Week
    Last Update:
    See Project