Showing 68 open source projects for "files"

  • 1
    Metacrafter

    Metadata and data identification tool and Python library

    Python command-line tool and engine for labeling table fields and fields in data files. It can help you find meaningful data in your tables and data files, or locate personally identifiable information (PII). Metacrafter is a rule-based tool that labels the fields of database tables. It scans tables and finds person names, surnames, middle names, PII data, and basic identifiers such as UUID/GUID. The rules are written as .yaml files and can be easily extended.
    Downloads: 0 This Week
    See Project
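
    The rule-based labeling idea described for Metacrafter can be sketched in a few lines of Python. This is an illustration only: the `RULES` table, the `label_fields` function, and the `min_share` threshold are hypothetical and are not Metacrafter's API or its .yaml rule format.

```python
import re

# Hypothetical rules (label -> pattern); not Metacrafter's actual rule format.
RULES = {
    "uuid": re.compile(
        r"^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-"
        r"[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$"
    ),
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
}


def label_fields(rows, min_share=0.8):
    """Return {field: label} for fields where enough values match one rule."""
    labels = {}
    fields = rows[0].keys() if rows else []
    for field in fields:
        # Ignore empty values so sparse columns can still be labeled.
        values = [str(r[field]) for r in rows if r.get(field) not in (None, "")]
        if not values:
            continue
        for label, pattern in RULES.items():
            hits = sum(1 for v in values if pattern.match(v))
            if hits / len(values) >= min_share:
                labels[field] = label
                break
    return labels


rows = [
    {"id": "123e4567-e89b-12d3-a456-426614174000", "contact": "ann@example.org", "note": "hello"},
    {"id": "9f1c2d3e-0a1b-4c5d-8e9f-0123456789ab", "contact": "bob@example.org", "note": "n/a"},
]
print(label_fields(rows))  # {'id': 'uuid', 'contact': 'email'}
```

    A share threshold rather than an all-or-nothing match is one possible design choice; it tolerates a few dirty values in an otherwise consistent column.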
  • 2
    gusty

    Making DAG construction easier

    gusty allows you to control your Airflow DAGs, Task Groups, and Tasks with greater ease. gusty manages collections of tasks, represented as any number of YAML, Python, SQL, Jupyter Notebook, or R Markdown files. A directory of task files is instantly rendered into a DAG by passing a file path to gusty's create_dag function. gusty also manages dependencies (within one DAG) and external dependencies (dependencies on tasks in other DAGs) for each task file you define. All you have to do is provide a list of dependencies or external_dependencies inside of a task file, and gusty will automatically set each task's dependencies and create external task sensors for any external dependencies listed. gusty works with both Airflow 1.x and Airflow 2.x, and has even more features, all of which aim to make the creation, management, and iteration of DAGs more fluid, so that you can intuitively design your DAG and build your tasks.
    Downloads: 1 This Week
    See Project
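
    To make the dependency idea concrete, here is a small standard-library sketch of how a list of per-task dependencies determines a valid run order. It does not use gusty or Airflow; the `tasks` dictionary and the `execution_order` helper are hypothetical stand-ins for metadata that would come from task files.

```python
from graphlib import TopologicalSorter

# Hypothetical task metadata, as it might be read from a directory of task files.
tasks = {
    "extract": {"dependencies": []},
    "transform": {"dependencies": ["extract"]},
    "load": {"dependencies": ["transform"]},
    "report": {"dependencies": ["load", "transform"]},
}


def execution_order(tasks):
    """Return task names in an order that respects every listed dependency."""
    graph = {name: set(spec.get("dependencies", [])) for name, spec in tasks.items()}
    return list(TopologicalSorter(graph).static_order())


order = execution_order(tasks)
print(order)
```

    `graphlib` requires Python 3.9 or later and raises `CycleError` if the dependencies form a cycle.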
  • 3
    marimo

    A reactive notebook for Python

    ...Run one cell and marimo reacts by automatically running affected cells, eliminating the error-prone chore of managing the notebook state. marimo's reactive UI elements, like data frame GUIs and plots, make working with data feel refreshingly fast, futuristic, and intuitive. Version with git, run as Python scripts, import symbols from a notebook into other notebooks or Python files, and lint or format with your favorite tools. You'll always be able to reproduce your collaborators' results. Notebooks are executed in a deterministic order, with no hidden state: delete a cell and marimo deletes its variables while updating affected cells.
    Downloads: 6 This Week
    See Project
  • 4
    Luigi

    Python module that helps you build complex pipelines of batch jobs

    ...You can build pretty much any task you want, but Luigi also comes with a toolbox of several common task templates that you can use. It includes support for running Python MapReduce jobs in Hadoop, as well as Hive and Pig jobs. It also comes with file system abstractions for HDFS and local files that ensure all file system operations are atomic.
    Downloads: 2 This Week
    See Project
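
    As a rough illustration of what an atomic local file write means, the sketch below writes to a temporary file in the same directory and then renames it over the target. This is a common general technique, shown here with the standard library only; it is not Luigi's implementation, and `atomic_write` is a hypothetical helper name.

```python
import os
import tempfile


def atomic_write(path, data):
    """Write text to path so readers see either the old or the new content."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the target directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())
        os.replace(tmp_path, path)  # replaces the target in a single step
    except BaseException:
        # Clean up the partial temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

    A pipeline step that crashes halfway through then leaves no partial output file behind, so a later run cannot mistake it for a finished result.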
  • 5
    JS Analyzer

    Burp Suite extension for JavaScript static analysis

    JS Analyzer is a powerful static analysis tool implemented as a Burp Suite extension that helps security researchers and web developers automatically uncover important artifacts in JavaScript files during web application testing. It parses JavaScript responses intercepted by Burp Suite and intelligently extracts API endpoints, full URLs (including cloud storage links), secrets like API keys or tokens, and email addresses while filtering out noise from irrelevant code patterns. The extension is designed to reduce manual effort when analyzing large or obfuscated JavaScript assets, helping testers find security vulnerabilities and sensitive information faster and more reliably. ...
    Downloads: 0 This Week
    See Project
  • 6
    AI Data Science Team

    An AI-powered data science team of agents

    AI Data Science Team is a Python library and agent ecosystem designed to accelerate and automate common data science workflows by modeling them as specialized AI “agents” that can be orchestrated to perform tasks like data cleaning, transformation, analysis, visualization, and machine learning. It provides a modular agent framework where each agent focuses on a step in the typical data science pipeline — for example, loading data from CSV/Excel files, cleaning and wrangling messy datasets, engineering predictive features, building models with AutoML, connecting to SQL databases, and producing visual outputs — all driven by natural language or programmatic instructions. The project includes ready-to-use applications that showcase these agents in action, such as an exploratory data analysis copilot that generates reports, a pandas data analyst that combines wrangling and plotting, and SQL database agents that can query business databases and output results directly.
    Downloads: 0 This Week
    See Project
  • 7
    GemGIS

    Spatial data processing for geomodeling

    GemGIS is a Python-based, open-source geographic information processing library. It can preprocess spatial data such as vector data (shapefiles, GeoJSON files, GeoPackages, …), raster data (tif, png, …), data obtained from online services (WCS, WMS, WFS), or XML/KML files (soon). Preprocessed data can be stored in a dedicated Data Class to be passed to the geomodeling package GemPy in order to accelerate the model-building process. Postprocessing of model results will allow export from GemPy to geoinformation systems such as QGIS and ArcGIS or to Google Earth for further use.
    Downloads: 0 This Week
    See Project
  • 8
    dxf2gcode

    DXF2GCODE: converting 2D dxf drawings to CNC machine compatible G-Code

    DXF2GCODE is a tool for converting 2D (dxf, pdf, ps) drawings to CNC machine compatible G-Code. It supports Windows, Linux, and Mac through the Python scripting language.
    Downloads: 374 This Week
    See Project
  • 9
    HEALPix

    Data Analysis, Simulations and Visualization on the Sphere

    ...Gorski et al., 2005, Ap.J., 622, p.759
    Full software documentation: https://healpix.sourceforge.io/documentation.php
    Wiki pages: https://sourceforge.net/p/healpix/wiki/Home
    Exchanging data with HEALPix (in FITS files): https://sourceforge.net/p/healpix/wiki/Exchanging%20Data%20with%20HEALPix/
    GDL and FL users should read https://sourceforge.net/p/healpix/wiki/HEALPix%20and%20GDL/
    Downloads: 226 This Week
    See Project
  • 10
    File Sorter for Photographers

    Organize files/images from a csv or xlsx file.

    A user-friendly application to efficiently sort all types of files from a source folder into a destination folder based on a list of filenames provided in an Excel or CSV file.
    Downloads: 4 This Week
    See Project
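
    The workflow described above (pick out the files named in a list and place them in a destination folder) might look roughly like this with the Python standard library. It is a sketch under assumptions: the list is a CSV with a `filename` column, files are copied rather than moved, and `copy_listed_files` is a hypothetical name, not part of the application.

```python
import csv
import shutil
from pathlib import Path


def copy_listed_files(list_csv, source, destination, column="filename"):
    """Copy files named in a CSV column from source to destination.

    Returns (copied, missing) lists of file names.
    """
    source, destination = Path(source), Path(destination)
    destination.mkdir(parents=True, exist_ok=True)
    copied, missing = [], []
    with open(list_csv, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            name = (row.get(column) or "").strip()
            if not name:
                continue  # skip blank rows
            candidate = source / name
            if candidate.is_file():
                shutil.copy2(candidate, destination / name)  # keeps timestamps
                copied.append(name)
            else:
                missing.append(name)
    return copied, missing
```

    Returning the missing names separately makes it easy to report list entries that had no matching file in the source folder.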
  • 11
    Old File Delete

    Clean up old files with a single click.

    ...The app helps you instantly clear selected folders of accumulated digital clutter. Featuring a modern flat design, the interface is intuitive: simply select a folder, specify the number of days, and the program will find and remove outdated files. No complex settings—just cleanliness and speed.
    Downloads: 0 This Week
    See Project
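
    The "select a folder, specify the number of days" behaviour can be approximated with a short standard-library sketch. The function names are hypothetical, and two details are assumptions rather than facts about the app: age is judged by modification time, and only the top level of the folder is scanned.

```python
import time
from pathlib import Path


def find_old_files(folder, days, now=None):
    """Return files in folder whose modification time is older than `days` days."""
    now = time.time() if now is None else now
    cutoff = now - days * 86400  # seconds per day
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.stat().st_mtime < cutoff
    )


def delete_old_files(folder, days, dry_run=True):
    """Delete old files; with dry_run=True only report what would be removed."""
    old = find_old_files(folder, days)
    if not dry_run:
        for path in old:
            path.unlink()
    return old
```

    Defaulting to `dry_run=True` is a deliberate safety choice for the sketch, since deletion cannot be undone.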
  • 12
    Data Preprocessing Automate

    Data Preprocessing Automation: A GUI for easy data cleaning & visualiz

    Data Preprocessing Automation is a Python-based GUI application designed to simplify and automate data preprocessing tasks. It allows users to upload Excel files, automatically handle missing values, remove duplicates, and detect and remove outliers using statistical methods. The application provides data visualization tools, including box plots for distribution analysis and scatter plots for exploring relationships between variables. Users can download the processed data for further analysis. ...
    Downloads: 0 This Week
    See Project
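
    The description does not say which statistical method the application uses for outliers. As one common possibility, the sketch below applies the interquartile range (IQR) rule using only the standard library; `remove_outliers_iqr` and the factor `k=1.5` are assumptions for illustration.

```python
from statistics import quantiles


def remove_outliers_iqr(values, k=1.5):
    """Keep values within [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


readings = [10, 12, 11, 13, 12, 11, 100]
print(remove_outliers_iqr(readings))  # [10, 12, 11, 13, 12, 11]
```

    The same bounds are what a box plot draws as whiskers, which is why the IQR rule pairs naturally with box-plot visualization.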
  • 13
    Autoplot

    Autoplot is an interactive browser for data on the web

    Autoplot is an interactive browser for data on the web. Give Autoplot a URL or local file name and it creates a sensible plot of the data. Autoplot allows you to interactively browse data stored in ascii, .cdf, netcdf, and many other formats. Autoplot's source has been moved to GitHub. Thanks to SourceForge for many years of hosting!
    Downloads: 1 This Week
    See Project
  • 14

    DataPrep

    Python-based data preprocessing tool

    DataPrep v0.2 is a Tkinter-based GUI tool designed to assist users in data preprocessing, multicollinearity removal, and feature selection for a wide range of applications in Cheminformatics, Bioinformatics, Data Analysis, Feature Selection, Molecular Modeling, Machine Learning, and quantitative structure-property relationship (QSPR) studies. It includes functionality to load, process, and save datasets with support for different preprocessing & multicollinearity removal...
    Downloads: 0 This Week
    See Project
  • 15
    PipeRider

    Code review for data in dbt

    ...You can compare two previously generated reports or use a single command to compare the differences between the current branch and the main branch. The latter is designed specifically for code review scenarios. In our pull requests on GitHub, we not only want to know which files have been changed, but also the impact of these changes on the data. PipeRider can easily generate comparison reports with a single command to provide this information.
    Downloads: 0 This Week
    See Project
  • 16
    QuickPlot

    Simple user interface for gnuplot aimed at reflectometry data

    ...It supports templates for fast formatting of graphics, different plot styles, insets, axis and label options. One important feature is storing metadata in png and pdf files that can be used to reload any graph saved with QuickPlot.
    Downloads: 0 This Week
    See Project
  • 17
    PySchool

    Installable / Portable Python Distribution for Everyone.

    PySchool is a free and open-source Python distribution intended primarily for students learning Python and data analysis, but it can also be used by scientists, engineers, and data scientists. It includes more than 150 Python packages (full edition) including numpy, pandas, scipy, sympy, keras, scikit-learn, matplotlib, seaborn, beautifulsoup4...
    Downloads: 1,430 This Week
    See Project
  • 18
    XISMuS

    X-Ray Imaging Software for Multiple Samples

    ...IMPORTANT FIXES with respect to the base v2.0.0 version:
    v2.5.0 introduces the Differential Attenuation and Cube Viewer utilities, and migrates the user database to *.json files
    v2.4.3 fixes an issue with the K element in the fit-approx method
    v2.4.3 fixes an issue where saving plots with fit-approx or an auto-wizard could freeze the software
    v2.4.2 introduces Image Viewer to Mosaic
    v2.4.1 fixes an issue in merging H5 or EDF datasets with Mosaic
    Full changelog at https://linssab.github.io/history
    X-Ray Fluorescence Imaging Software for Multiple Samples is open source software to manipulate and study macro-X-Ray Fluorescence (MA-XRF) datasets. ...
    Downloads: 0 This Week
    See Project
  • 19
    pycoQC

    pycoQC computes metrics and generates interactive QC plots

    PycoQC computes metrics and generates interactive QC plots for Oxford Nanopore Technologies sequencing data. PycoQC relies on the sequencing_summary.txt file generated by Albacore and Guppy, but if needed it can also generate a summary file from basecalled fast5 files. The package supports 1D and 1D2 runs generated with MinION, GridION and PromethION devices and basecalled with Albacore 1.2.1+ or Guppy 2.1.3+. PycoQC is written in pure Python 3. Python 2 is not supported.
    Downloads: 0 This Week
    See Project
  • 20
    GPlates

    Interactive visualization of plate tectonics.

    GPlates is a plate-tectonics program. Manipulate reconstructions of geological and paleo-geographic features through geological time. Interactively visualize vector, raster and volume data. PyGPlates is the GPlates Python library. Get fine-grained access to GPlates functionality in your Python scripts.
    Downloads: 21 This Week
    See Project
  • 21
    repo2docker GitHub Action

    A GitHub action to build data science environment images

    Trigger repo2docker to build a Jupyter-enabled Docker image from your GitHub repository and push this image to a Docker registry of your choice. This will automatically attempt to build an environment from configuration files found in your repository. Images generated by this action are automatically tagged with both latest and <SHA> corresponding to the relevant commit SHA on GitHub. Both tags are pushed to the Docker registry specified by the user. If an image with the latest tag already exists in your registry, this Action attempts to pull that image as a cache to reduce unnecessary build steps.
    Downloads: 1 This Week
    See Project
  • 22
    Forecasting Best Practices

    Time Series Forecasting Best Practices & Examples

    ...Rather than creating implementations from scratch, we draw from existing state-of-the-art libraries and build additional utilities around processing and featurizing the data, optimizing and evaluating models, and scaling up to the cloud. The examples and best practices are provided as Python Jupyter notebooks and R markdown files, along with a library of utility functions.
    Downloads: 0 This Week
    See Project
  • 23
    QtiPlot
    QtiPlot is a user-friendly, platform-independent data analysis and visualization application similar to the non-free Windows program Origin.
    Downloads: 56 This Week
    See Project
  • 24
    nonechucks

    Deal with bad samples in your dataset dynamically

    nonechucks is a library that provides wrappers for PyTorch's datasets, samplers and transforms to allow for dropping unwanted or invalid samples dynamically. What if you have a dataset of 1000s of images, out of which a few dozen images are unreadable because the image files are corrupted? Or what if your dataset is a folder full of scanned PDFs that you have to OCRize, and then run a language detector on the resulting text, because you want only the ones that are in English? Or maybe you have an AlternateIndexSampler, and you want to be able to move to dataset[6] after dataset[4] fails while attempting to load! ...
    Downloads: 0 This Week
    See Project
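
    The idea of dropping bad samples on the fly can be shown without PyTorch. The sketch below is plain Python and is not the nonechucks API; `SkipBadSamples` and `FlakyDataset` are hypothetical classes written for this example.

```python
class SkipBadSamples:
    """Iterate over a dataset, skipping indices whose loading raises."""

    def __init__(self, dataset):
        self.dataset = dataset
        self.bad_indices = set()  # remembered so later passes skip them cheaply

    def __iter__(self):
        for index in range(len(self.dataset)):
            if index in self.bad_indices:
                continue
            try:
                sample = self.dataset[index]
            except Exception:
                self.bad_indices.add(index)
                continue
            yield sample


class FlakyDataset:
    """Toy dataset in which None stands for a corrupted file."""

    def __init__(self, items):
        self.items = items

    def __len__(self):
        return len(self.items)

    def __getitem__(self, index):
        item = self.items[index]
        if item is None:
            raise ValueError(f"sample {index} is unreadable")
        return item


data = SkipBadSamples(FlakyDataset(["a", None, "b", None, "c"]))
print(list(data))  # ['a', 'b', 'c']
```

    Remembering failed indices means the expensive load attempt happens only once per bad sample, even across multiple passes over the data.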
  • 25
    GEOMS2

    Geostatistics and geosciences modeling software

    ...attredirects=0&d=1 http://sourceforge.net/projects/geoms2/files/Mining.7z/download
    Downloads: 3 This Week
    See Project