Showing 77 open source projects for "content analysis"

  • 1
    Khazix Skills

    Digital Life Kazik Open Source AI Skills Collection

    The Khazix Skills project is an automation framework designed to transform GitHub repositories into structured, reusable AI agent skills. It acts as a pipeline that analyzes a repository’s metadata, extracts relevant information such as README content and commit hashes, and converts it into a standardized skill format that can be integrated into agent ecosystems. The system emphasizes lifecycle management by embedding versioning, traceability, and metadata directly into generated skill files,...
    Downloads: 0 This Week
  • 2
    Build with Claude

    A single hub to find Claude Skills, Agents, Commands, Hooks, Plugins

    Build with Claude is an open-source plugin marketplace and discovery hub for the Claude Code ecosystem that centralizes hundreds of plugins, agents, commands, hooks, skills, and marketplaces to enhance developer workflows with autonomous AI functionality. It serves as a one-stop index where users can browse curated agent modules for tasks like blockchain development, code analysis, DevOps, documentation generation, and much more — all designed to be installed directly into Claude Code using a simple plugin system. The repository includes an organized collection of community-maintained plugins, searchable by category, and offers clear instructions on how to add and install marketplace content within Claude Code environments. ...
    Downloads: 0 This Week
  • 3
    ArXiv MCP Server

    A Model Context Protocol server for searching and analyzing arXiv

    arxiv-mcp-server bridges AI assistants and the arXiv repository through a clean MCP interface, enabling search, metadata retrieval, and content access without bespoke scraping. With simple tools like “search” and “fetch,” an agent can find papers, pull abstracts, and download PDFs for downstream summarization or analysis. The project includes packaging and CI to publish to PyPI, plus tests and linting for reliability. Issue threads show feature requests such as extracting embedded LaTeX and improving markdown conversion, reflecting active community use in research flows. ...
    Downloads: 0 This Week
  • 4
    Sa2VA

    Official Repo For "Sa2VA: Marrying SAM2 with LLaVA

    Sa2VA is a cutting-edge open-source multi-modal large language model (MLLM) developed by ByteDance that unifies dense segmentation, visual understanding, and language-based reasoning across both images and videos. It merges the segmentation power of a state-of-the-art video segmentation model (based on SAM‑2) with the vision-language reasoning capabilities of a strong LLM backbone (derived from models like InternVL2.5 / Qwen-VL series), yielding a system that can answer questions about...
    Downloads: 0 This Week
  • 5
    DeepSeek AIO

    Access and use all DeepSeek AI models in one program.

    DeepSeek AIO is a simple program that allows you to interact with all DeepSeek large language models in one place. It supports text-based chats, data analysis, code generation, language translation, and more. The program is designed to make it easy for users to use DeepSeek's AI tools for different purposes without switching between multiple platforms.
    Downloads: 12 This Week
  • 6
    AnyTXT Searcher

    A Powerful Desktop Full-Text Search Engine, Just Like Local Google.

    AnyTXT Searcher is a powerful file full-text search engine, a desktop search application for fast document retrieval. Just like a local disk Google search engine, much faster than Windows Search, it is your ideal desktop file content full-text search engine. It has a powerful document parsing engine built in, which extracts the text of commonly used file formats without installing any other software, and combines the built-in high-speed indexing system to store the metadata of the...
    Downloads: 7,454 This Week
  • 7
    AI File Sorter

    Local AI file organization with categorization and rename suggestions

    ...For supported audio and video files, AI File Sorter can read embedded metadata (such as ID3, Vorbis, and MP4 tags) to suggest normalized names like year_artist_album_title.ext. AI analysis runs read-only, and all suggestions must be reviewed before being applied. AI File Sorter can run fully offline using local models like Mistral or LLaMA, so files and metadata stay on your device unless you configure a remote endpoint.
    Downloads: 216 This Week
  • 8
    DocWire SDK

    Award-winning modern data processing SDK in C++20

    DocWire SDK, a standout C++20, AI-driven data processing tool, has received an award from SourceForge and strong backing from Microsoft. It handles nearly 100 file types, empowering efficient text extraction, web data extraction, and document analysis. For businesses, the shift to DocWire SDK signifies a leap forward. It promises comprehensive document format support and the ability to extract valuable insights from email boxes, databases, and websites using cutting-edge AI. DocWire SDK aims to...
    Downloads: 1 This Week
  • 9
    sketch

    AI code-writing assistant that understands data content

    Sketch is an open-source AI-powered data analysis assistant designed specifically for pandas users, enabling natural language interaction with tabular datasets to generate code, insights, and transformations. It works by summarizing the structure and statistical properties of a dataset and providing that context to a language model, allowing it to generate highly relevant and accurate responses tailored to the data. The tool integrates directly into pandas dataframes through an extension,...
    Downloads: 0 This Week
  • 10
    GPT-2 Output Dataset

    Dataset of GPT-2 outputs for research in detection, biases, and more

    ...It contains 250,000 samples of GPT-2 outputs, generated with different sampling strategies such as top-k truncation, to highlight the diversity and quality of model completions. The dataset also includes corresponding human-written text for comparison, enabling researchers to explore methods for distinguishing machine-generated content from human-authored text. The repository provides scripts and metadata for working with the dataset, with the goal of supporting research in areas like detection, evaluation of text coherence, and analysis of generative models. While no active development is expected, the dataset remains a useful benchmark for tasks involving text classification, style analysis, and generative model evaluation.
    Downloads: 2 This Week
  • 11
    PumpkinBook

    Machine Learning formula derivation and analysis

    All the contents of the Pumpkin Book assume Mr. Zhou Zhihua's "Machine Learning" (the Watermelon Book) as prerequisite knowledge, so the best way to use the Pumpkin Book is to follow the Watermelon Book as the main line and refer to the Pumpkin Book when you encounter a formula that you cannot derive or understand. We strive to explain and derive each formula from the perspective of undergraduate mathematics. Therefore, we usually give out the mathematics knowledge of the super...
    Downloads: 0 This Week
  • 12
    README-AI

    README file generator, powered by AI

    README-AI is an automated documentation generator that creates structured README files for GitHub repositories using AI-powered analysis.
    Downloads: 0 This Week
  • 13
    Complete Machine Learning Package

    A comprehensive machine learning repository containing 30+ notebooks

    ...By organizing the content into modular notebooks, the project allows users to explore topics independently and experiment with the code directly.
    Downloads: 0 This Week
  • 14
    PromethAI

    Open-source framework that gives you AI Agents

    PromethAI-Backend is a backend framework for AI-driven automation and knowledge extraction. It is designed to integrate with large language models (LLMs) to provide AI-enhanced workflows, including content generation, summarization, and data analysis.
    Downloads: 0 This Week
  • 15
    Common Resource Grep - crgrep

    Common Resource Grep

    CRGREP searches for matching text in databases, various document formats, archives and other difficult-to-access resources. It is a command-line tool for name and content text matching in database tables, plain files, MS Office documents, PDF, archives, MP3 audio, image metadata, scanned documents, Maven dependencies and web resources. CRGREP will search resources nested within resources in any combination or depth, such as text within a document within a zip archive. Here you...
    Downloads: 4 This Week
  • 16

    Sentiment dataset of Algerian dialect

    Dataset of 11760 sentiment comments written in Algerian dialect

    * To cite this dataset, refer to https://doi.org/10.31449/inf.v46i6.3340
    * This sentiment dataset of Algerian dialect consists of 11760 comments (6111 positive / 5649 negative) collected from Facebook, YouTube and Twitter during Hirak 2019.
    * Comments concern the Algerian spoken language, written in Arabic and/or Latin characters and/or Arabizi, which could be either Modern Standard Arabic, French or local dialect.
    * Value ‘1’ is attributed for Positive review / value ‘0’...
    Downloads: 1 This Week
  • 17
    MAE (Masked Autoencoders)

    PyTorch implementation of MAE

    MAE (Masked Autoencoders) is a self-supervised learning framework for visual representation learning using masked image modeling. It trains a Vision Transformer (ViT) by randomly masking a high percentage of image patches (typically 75%) and reconstructing the missing content from the remaining visible patches. This forces the model to learn semantic structure and global context without supervision. The encoder processes only the visible patches, while a lightweight decoder reconstructs the...
    Downloads: 2 This Week
  • 18
    MTBook

    Machine Translation: Foundations and Models

    This tutorial systematically introduces the basic knowledge and modeling methods of machine translation and, on this basis, discusses some cutting-edge machine translation techniques (it was formerly known as "Machine Translation: Statistical Modeling and Deep Learning Methods"). Its content is compiled into a book, which can be used for study by senior undergraduates and graduate students in computer and artificial intelligence related majors, and can also be used as...
    Downloads: 0 This Week
  • 19
    Douyin-Bot

    Python TikTok bot

    Douyin-Bot is a Python automation project for interacting with the Douyin mobile app through Android device tooling. It was built as an experimental bot that captures phone screenshots, analyzes visible content, and performs automated app actions. The project uses Python and ADB to connect desktop-side logic with a mobile device. Its original goal was playful and experimental, focused on browsing and identifying content based on computer vision and face analysis. It is best understood as a demonstration of app automation, screen capture, API-based image analysis, and scripted mobile interaction. ...
    Downloads: 2 This Week
  • 20

    Semantic Assistants

    Natural Language Processing (NLP) for the Masses

    Semantic Assistants support users in content retrieval, analysis, and development, by offering context-sensitive NLP services directly integrated in standard desktop clients, like a word processor, and web information systems, like a wiki.
    Downloads: 0 This Week
  • 21
    RadicalSpam

    Open Source Anti-Spam and Anti-Virus Gateway

    RadicalSpam is a free and open source package distributed under GPL v2, including products such as Postfix, SpamAssassin, Amavisd-new, ClamAV, Razor, DCC, Postgrey and Bind. It provides a secure SMTP relay, ready to use in Linux and Docker environments. More information: http://www.radical-spam.org
    Downloads: 1 This Week
  • 22
    FALCON - Text Search Java Project

    JSON based text search Java Project

    What is it? "Falcon Search" is a Java API and tool for searching inside documents. It was originally started to search the content of PDF files under the project "HAWK Search". Searching with this tool is query-based, not word-based as in most document search tools or document readers. It also handles jumbled word order within a query and spelling mistakes. Commonly used techniques in this project are Natural Language...
    Downloads: 0 This Week
  • 23
    MyNook

    A machine learning system for supervised document classification

    An open source system for supervised document classification based on statistical machine learning techniques. Unlike state-of-the-art classification techniques, MyNook requires only the title of the document, not the content itself.
    Downloads: 0 This Week
  • 24
    Content Addressable Memory, Multi-Variate Statistics, Data Mining. Includes analyzing datasets, extracting patterns, and creating an empirical expert system. Computes joint probabilities and implements a "belief" as the solution of an equilibrium equation.
    Downloads: 0 This Week
  • 25
    Scan, the Semantic Content ANnotator, is a semantic pipeline that helps connect information extraction tools to semantic databases. It is UIMA-based and allows easy plugin writing for information extraction, ontology control, and storage in RDF repositories.
    Downloads: 0 This Week
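
As a rough illustration of the random masking step described in the MAE (Masked Autoencoders) entry above, the sketch below splits patch indices into visible and masked sets at a 75% mask ratio. It uses only the Python standard library and is not taken from the MAE repository. The function name random_patch_mask, the seed parameter, and the 196-patch count (a 14 x 14 grid, as a 224 x 224 image with 16 x 16 patches would give) are assumptions made for illustration.

```python
import random


def random_patch_mask(num_patches, mask_ratio=0.75, seed=None):
    """Split patch indices into (visible, masked) lists by uniform random sampling.

    Hypothetical helper for illustration only; not the MAE project's API.
    """
    if num_patches < 0:
        raise ValueError("num_patches must be non-negative")
    if not 0.0 <= mask_ratio < 1.0:
        raise ValueError("mask_ratio must be in [0, 1)")

    rng = random.Random(seed)          # local RNG so the global state is untouched
    indices = list(range(num_patches))
    rng.shuffle(indices)               # random permutation of patch positions

    num_masked = int(num_patches * mask_ratio)
    masked = sorted(indices[:num_masked])    # patches hidden from the encoder
    visible = sorted(indices[num_masked:])   # patches the encoder would process
    return visible, masked


if __name__ == "__main__":
    visible, masked = random_patch_mask(196, mask_ratio=0.75, seed=0)
    print(len(visible), len(masked))  # 49 147
```

In this sketch only the visible indices would be passed to an encoder, while a decoder would be asked to reconstruct the patches at the masked indices, which mirrors the division of work the entry describes.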