Showing 207 open source projects for "duplicates"

  • 1
    Czkawka

    Multi-functional app to find duplicates, empty folders, similar images

    Czkawka (Polish for “hiccup”) is a lightning‑fast, multi‑purpose file cleaning tool written in Rust. It helps users declutter storage by finding duplicate files, similar images or audio, empty folders, and unusually large files through CPU‑efficient multithreading. It is available in both GUI (GTK‑based) and CLI versions.
    Downloads: 350 This Week
  • 2
    FDUPES

    FDUPES is a program for identifying or deleting duplicate files

    ...It works by scanning directories and subdirectories, identifying sets of files with identical content through size and hash comparisons, and then listing them together so users can examine duplicates. Once duplicates are identified, the tool offers interactive deletion options where users can choose which copy to keep and which to remove, or apply flags to automate deletion while preserving a single instance. Because it operates directly on file content rather than just filenames, fdupes can accurately detect true copies and guide cleaning operations in data cleanup or migration tasks. ...
    Downloads: 9 This Week
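    The size-then-hash approach described above can be sketched in Python. This is a minimal illustration, not fdupes' own code: the function names `file_digest` and `find_duplicate_sets` are hypothetical, and SHA-256 is an assumed choice of hash.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 16):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()  # assumed hash choice for this sketch
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicate_sets(root):
    """Group regular files under root that share both size and content hash."""
    by_size = defaultdict(list)
    for dirpath, _dirs, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    duplicate_sets = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a file with a unique size cannot have a duplicate
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[file_digest(path)].append(path)
        duplicate_sets.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return sorted(duplicate_sets)
```

    Grouping by size first means only files that could possibly match are ever hashed. A cautious tool would likely also compare candidate files byte by byte before deleting anything.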
  • 3
    fastdup

    An unsupervised and free tool for image and video dataset analysis

    fastdup is a free tool designed to rapidly extract insights from your image and video datasets. It helps you improve the quality of your dataset's images and labels and reduce data operations costs at scale.
    Downloads: 1 This Week
  • 4
    JobFunnel

    Scrape job websites into a single spreadsheet with no duplicates.

    JobFunnel is an automated tool for scraping job postings from job websites into a single .csv file with no duplicates. You can search for jobs with YAML configuration files or by passing command arguments. By performing regular scraping and reviewing, you can cut through the noise of even the busiest job markets. Run funnel with your settings YAML to populate your master CSV file with jobs from available providers.
    Downloads: 0 This Week
  • 5
    janitor

    Simple tools for data cleaning in R

    janitor provides simple, convenient tools for data cleaning, formatting, and exploration in R. It is especially useful for cleaning messy data frames, removing duplicates, formatting column names, and producing frequency tables in a tidy workflow.
    Downloads: 1 This Week
  • 6
    Kuroba Experimental

    Free and open source image board browser

    ...Ability to attach multiple media files to a reply, attach media files shared by external apps (even by some keyboards), attach remote media files by URL, etc. New image downloader: allows downloading images while the app is in the background, retrying failed downloads, resolving duplicates, etc.
    Downloads: 8 This Week
  • 7
    AIOStreams

    One addon to rule them all

    AIOStreams is a “super-add-on” for Stremio that consolidates results from many add-ons and debrid services into a single, customizable feed. Instead of juggling multiple add-ons, you point AIOStreams at them; it then queries, merges, de-duplicates, and re-ranks everything according to your rules. The project includes a powerful filtering and sorting engine—think conditions on resolution, codec, provider, cached status, seed count, and more—so power users can shape results precisely. It also provides a template-based formatter to control how streams are labeled in the UI, complete with live preview for quick iteration. ...
    Downloads: 19 This Week
  • 8
    trackerslist

    Updated list of public BitTorrent trackers

    trackerslist is a repository that provides continuously updated lists of public BitTorrent trackers, designed to improve torrent download performance and peer discovery. The project is maintained by an automated system that regularly checks tracker availability, removes duplicates, and ranks them based on reliability and latency. It offers multiple formats of tracker lists, including HTTP, HTTPS, WebSocket, and IP-based versions, making it compatible with a wide range of BitTorrent clients. The lists can be integrated into torrent files or magnet links to enhance connectivity and download speeds. Additionally, the project includes specialized lists for alternative networks such as I2P and Yggdrasil, expanding its usability in privacy-focused environments. ...
    Downloads: 6 This Week
  • 9
    PostgREST

    REST API for any Postgres database

    ...The structural constraints and permissions in the database determine the API endpoints and operations. Using PostgREST is an alternative to manual CRUD programming. Custom API servers suffer problems. Writing business logic often duplicates, ignores or hobbles database structure. Object-relational mapping is a leaky abstraction leading to slow imperative code. The PostgREST philosophy establishes a single declarative source of truth: the data itself. It’s easier to ask PostgreSQL to join data for you and let its query planner figure out the details than to loop through rows yourself. ...
    Downloads: 2 This Week
  • 10
    PhotoPrism

    AI-Powered Photos App for the Decentralized Web 🌈💎✨

    PhotoPrism® is an AI-Powered Photos App for the Decentralized Web. It makes use of the latest technologies to tag and find pictures automatically without getting in your way. You can run it at home, on a private server, or in the cloud. Our mission is to provide the most user- and privacy-friendly solution to keep your pictures organized and accessible. That's why PhotoPrism was built from the ground up to run wherever you need it, without compromising freedom, privacy, or functionality.
    Downloads: 20 This Week
  • 11
    Node Modules Inspector

    Interactive UI for local node modules inspection

    This is a tool (CLI + interactive UI) for inspecting the node_modules directory of a JavaScript/TypeScript project, created by Anthony Fu. It supports projects using npm, pnpm or bun. The idea is to help developers visualise the dependency graph, see which dependencies are installed, their sizes, types (ESM vs CJS), origins (catalog vs registry), and filter or build a static report of a project’s dependency tree. The project includes a web UI version that you can try at node-modules.dev, and...
    Downloads: 4 This Week
  • 12
    websocket for Go

    Minimal and idiomatic WebSocket library for Go

    ...RFC 7692 permessage-deflate compression. Compile to Wasm. Transparent message buffer reuse with wsjson and wspb subpackages. Gorilla writes directly to a net.Conn and so duplicates features of net/http.Client. Gorilla requires registering a pong callback before sending a Ping. Compare godoc of nhooyr.io/websocket with gorilla/websocket side by side. Will enable easy HTTP/2 support in the future.
    Downloads: 0 This Week
  • 13
    Aves

    Aves is a gallery and metadata explorer app, built for Android

    Aves is a feature-rich gallery app focused on organizing, browsing, and viewing photos and videos with precision. It indexes media on device storage and surfaces powerful ways to explore content, such as by folders, albums, dates, tags, and embedded metadata. The viewer prioritizes smooth gestures (pinch-to-zoom, pan, fling) and a clean, immersive interface that keeps controls out of the way until needed. Aves emphasizes metadata awareness, reading details like EXIF and other tags to enable...
    Downloads: 2 This Week
  • 14
    EPLB

    Expert Parallelism Load Balancer

    ...In EP, different “experts” are mapped to different GPUs or nodes, so load imbalance becomes a performance bottleneck if certain experts are invoked much more often. EPLB solves this by duplicating heavily used experts (redundancy) and then placing those duplicates across GPUs to even out computational load. It uses policies like hierarchical load balancing (grouped experts placed at node and then GPU level) and global load balancing depending on configuration. The logic is implemented in eplb.py and supports predicting placements given estimated expert usage weights. EPLB aims to reduce hot-spotting and ensure more uniform usage of compute resources in large MoE deployments.
    Downloads: 0 This Week
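    The replicate-then-place idea described above can be sketched in Python. This is a simplified greedy illustration under stated assumptions, not the logic in eplb.py: the function names `replicate_experts` and `place_replicas` are hypothetical, each replica is assumed to take an equal share of its expert's load, and real constraints such as a fixed number of slots per GPU or node-level grouping are ignored.

```python
import heapq


def replicate_experts(loads, num_replicas):
    """Give every expert one replica, then hand each spare replica to the
    expert whose per-replica load is currently the highest."""
    counts = [1] * len(loads)
    # heapq is a min-heap, so per-replica loads are stored negated.
    heap = [(-load, idx) for idx, load in enumerate(loads)]
    heapq.heapify(heap)
    for _ in range(num_replicas - len(loads)):
        _, idx = heapq.heappop(heap)
        counts[idx] += 1
        heapq.heappush(heap, (-loads[idx] / counts[idx], idx))
    return counts


def place_replicas(loads, counts, num_gpus):
    """Greedy placement: heaviest replica first, onto the least-loaded GPU."""
    replicas = sorted(
        ((loads[i] / counts[i], i) for i in range(len(loads)) for _ in range(counts[i])),
        reverse=True,
    )
    gpu_load = [0.0] * num_gpus
    gpu_experts = [[] for _ in range(num_gpus)]
    for share, expert in replicas:
        target = min(range(num_gpus), key=gpu_load.__getitem__)
        gpu_load[target] += share
        gpu_experts[target].append(expert)
    return gpu_experts, gpu_load


# Hypothetical usage weights: expert 0 is invoked far more often than the rest.
loads = [90, 30, 20, 10]
counts = replicate_experts(loads, num_replicas=6)
gpu_experts, gpu_load = place_replicas(loads, counts, num_gpus=2)
```

    With these example weights, the hot expert receives both spare replicas, so its load is split three ways before placement instead of pinning one GPU at the full weight.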
  • 15
    Pandas Profiling

    Create HTML profiling reports from pandas DataFrame objects

    ...File sizes, creation dates, dimensions, indication of truncated images and existence of EXIF metadata. Mostly global details about the dataset (number of records, number of variables, overall missingness and duplicates, memory footprint). Comprehensive and automatic list of potential data quality issues (high correlation, skewness, uniformity, zeros, missing values, constant values, among others).
    Downloads: 1 This Week
  • 16
    gitignore.io

    Create useful .gitignore files for your project

    ...You can access it from a clean web UI, a simple REST API, or the command line, making it easy to script into new-project scaffolds and automation. The generator accepts multiple technologies in one request, normalizes duplicates, and orders rules sensibly so the result is readable and effective. Templates are versioned and updated over time as tools evolve, helping teams avoid accidentally committing build artifacts, credentials, caches, and other noisy files. The repository includes documentation, example invocations, and contribution guidelines so users can add or refine templates.
    Downloads: 0 This Week
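    Combining several templates while dropping repeated rules can be sketched in Python. This is an illustrative assumption about how such a merge might work, not gitignore.io's actual implementation: the function `merge_gitignore_templates`, the header format, and the sample templates are all hypothetical.

```python
def merge_gitignore_templates(templates):
    """Merge (name, body) template pairs into one .gitignore text.

    Each rule is kept only the first time it appears, so patterns shared
    by several templates are emitted once. Comment and blank lines from
    the source templates are dropped in this sketch.
    """
    seen = set()
    merged = []
    for name, body in templates:
        merged.append(f"# --- {name} ---")
        for line in body.splitlines():
            rule = line.strip()
            if not rule or rule.startswith("#"):
                continue
            if rule in seen:
                continue  # already emitted by an earlier template
            seen.add(rule)
            merged.append(rule)
    return "\n".join(merged) + "\n"


# Hypothetical template contents, for illustration only.
templates = [
    ("Python", "# Byte-compiled files\n__pycache__/\n*.log\n.env\n"),
    ("Node", "node_modules/\n*.log\n.env\n"),
]
result = merge_gitignore_templates(templates)
```

    Keeping the first occurrence preserves each template's own ordering, which matters because later .gitignore rules can override earlier ones.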
  • 17
    MusicBrainz Server

    Server for the MusicBrainz project (website, API, database tools)

    MusicBrainz Server is the web application powering the MusicBrainz open music metadata database. It handles artist, album, track, and recording entities and allows users to submit edits, vote, merge duplicates, and browse relationships like releases or linked ISRCs. The server supports full-text search, advanced filtering, web APIs, data dumps, and integration with other services like Cover Art Archive and Discogs. Its architecture ensures referential integrity and versioned change tracking so edits can be audited or reverted, and moderation workflows manage conflicting contributions. ...
    Downloads: 0 This Week
  • 18
    Skills Janitor

    Audit, track usage, and compare your Claude Code skills

    The Skills Janitor project is a lightweight plugin designed to manage, audit, and optimize AI agent skill ecosystems, particularly for environments like Claude Code and OpenAI Codex. It functions as a “maintenance layer” for AI skills by automatically scanning installed skill directories, identifying duplicates, and analyzing their structure and usage. One of its core purposes is to help developers maintain a clean and efficient skill environment, especially as the number of installed skills grows over time. The system provides a set of command-based tools that allow users to perform health checks, generate reports, and automatically fix issues such as broken or redundant skills. ...
    Downloads: 0 This Week
  • 19
    SortPhotos

    SortPhotos is a Python script that organizes photos and videos

    ...SortPhotos includes options for copying versus moving files, recursive searches, silent or test modes, and customizable start times for when a “day” begins. It also prevents duplicate files by comparing content, with an option to keep duplicates if needed. With support for automation through launch agents or cron jobs, SortPhotos is well-suited for photographers, archivists, and anyone looking to streamline large personal or professional media collections.
    Downloads: 0 This Week
  • 20
    CleanVision

    Automatically find issues in image datasets

    CleanVision automatically detects potential issues in image datasets, such as images that are blurry, under- or over-exposed, or (near) duplicates. This data-centric AI package is a quick first step for any computer vision project to find problems in the dataset that you want to address before applying machine learning. CleanVision is simple to use: run the same couple of lines of Python code to audit any image dataset. The quality of machine learning models hinges on the quality of the data used to train them, but it is hard to manually identify all of the low-quality data in a big dataset. ...
    Downloads: 0 This Week
  • 21
    anti-copy-paster

    A plugin for IntelliJ IDEA for just-in-time code duplicates extraction

    The plugin monitors the copying and pasting that takes place inside the IDE. As soon as a code fragment is pasted, the plugin checks whether it introduces code duplication. If it does, the plugin calculates a set of code metrics for the fragment and compares them against the currently selected metric thresholds. If the chosen thresholds are exceeded, the plugin suggests that the developer perform the Extract Method refactoring and applies the refactoring if necessary.
    Downloads: 5 This Week
  • 22
    RemoveDuplicate

    Software that helps you remove duplicate files

    RemoveDuplicate is a Windows application designed to help users find and remove duplicate files within folders. It uses MD5 hashing to identify identical files and provides a user-friendly interface to manage and delete duplicates while keeping one copy of each file.
    Downloads: 5 This Week
  • 23
    Tartube

    Download videos/channels/playlists from YouTube and many other sites

    Tartube is a GUI front-end for youtube-dl, yt-dlp and other compatible video downloaders. It is written in Python 3 / Gtk 3 and runs on MS Windows, Linux, BSD and macOS.
    Downloads: 1,301 This Week
  • 24
    Audit Data Analytics
    Audit Data Analytics, LLC's ADA software is an open source solution for performing operations on audit data, such as filtering specific records and summarizing data sets.
    Downloads: 0 This Week
  • 25
    WizTree

    WizTree quickly analyzes disk space, showing large files and folders.

    WizTree is an ultra-fast and intuitive disk space analyzer that helps you instantly see what’s taking up space on your hard drives. Using advanced technology, it reads the Master File Table (MFT) directly on NTFS drives, making it dramatically faster than traditional scanners. The program displays your files and folders in a clear, visual tree-map view, showing their relative sizes so you can quickly locate and delete large or unnecessary items.
    Downloads: 744 This Week