Showing 62 open source projects for "convert"

  • 1
    OpenVINO Training Extensions

    Trainable models and NN optimization tools

    OpenVINO™ Training Extensions provide a convenient environment to train Deep Learning models and convert them using the OpenVINO™ toolkit for optimized inference. When ote_cli is installed in the virtual environment, you can use the ote command line interface to perform various actions for templates related to the chosen task type, such as running, training, evaluating, exporting, etc. ote train trains a model (a particular model template) on a dataset and saves results in two files. ote optimize optimizes a pre-trained model using NNCF or POT depending on the model format. ...
    Downloads: 1 This Week
  • 2
    NeMo Retriever Library

    Document content and metadata extraction microservice

    NeMo Retriever Library is a scalable microservice framework designed for extracting, structuring, and enriching content from documents to support downstream generative AI applications. It processes various document types by splitting them into components such as text, tables, charts, and images, and then applies OCR and contextual analysis to convert them into structured data formats. The system is built on NVIDIA NIM microservices, enabling high-performance parallel processing and efficient handling of large datasets. It supports multiple extraction strategies for different document formats, balancing accuracy and throughput depending on the use case. Additionally, it can generate embeddings for extracted content and integrate with vector databases like Milvus, making it well-suited for retrieval-augmented generation pipelines.
    Downloads: 0 This Week
  • 3
    Anything to NotebookLM

    Multi-source content processor for NotebookLM

    Qiaomu Anything to NotebookLM is a Claude Code skill that turns many types of source material into structured NotebookLM-ready outputs. It is built for users who want to convert articles, web pages, videos, PDFs, office files, podcasts, images, and search results into more usable study or presentation formats. The project uses natural-language commands, so the user can ask for a podcast, slide deck, mind map, report, quiz, flashcards, or infographic without manually building the workflow. It supports multilingual material, with especially strong use cases for Chinese and English content. ...
    Downloads: 0 This Week
  • 4
    docext

    An on-premises, OCR-free unstructured data extraction

    ...This allows the system to detect and extract structured elements such as tables, signatures, key fields, and layout information while maintaining semantic understanding of the document content. The toolkit can also convert complex documents into structured markdown representations that preserve formatting and contextual relationships.
    Downloads: 0 This Week
  • 5
    MiniOneRec

    Minimal reproduction of OneRec

    ...The framework provides an end-to-end pipeline for building generative recommender systems, including semantic identifier construction, supervised fine-tuning, and reinforcement learning-based optimization. Semantic IDs are created using techniques such as quantized variational autoencoders to convert item features into token sequences that can be modeled by transformer architectures. Developers can train and evaluate recommendation models using different backbone language models while benefiting from the generative framework’s parameter efficiency and scalability.
    Downloads: 0 This Week
  • 6
    LLM Vision

    Visual intelligence for your home.

    ...Instead of relying only on traditional object detection pipelines, it allows users to send prompts about visual content and receive contextual descriptions or answers about what is happening in camera footage. The system can process events from surveillance platforms such as Frigate and convert them into meaningful summaries, notifications, or structured data for automation workflows. It also maintains a timeline of analyzed camera events that can be displayed in dashboards or queried through the assistant interface.
    Downloads: 0 This Week
  • 7
    JSON_REPAIR

    A python module to repair invalid JSON from LLMs

    json_repair is an open-source Python library designed to automatically fix malformed JSON data and convert it into valid, parseable structures. The tool is particularly useful in scenarios where JSON output is generated by large language models or external services that may produce syntactically invalid responses. Instead of failing when encountering errors such as missing quotes, trailing commas, or incomplete objects, the library analyzes the malformed data and reconstructs it into valid JSON. ...
    Downloads: 0 This Week
  • 8
    Qwen3-ASR

    Qwen3-ASR is an open-source series of ASR models

    Qwen3-ASR is an automatic speech recognition system in the QwenLM family, developed to convert spoken language into text with strong accuracy and real-time performance. As a specialized ASR variant of the broader Qwen language model ecosystem, it focuses on capturing reliable transcriptions from audio sources such as recordings, live streams, or conversational inputs while supporting low latency use cases. The architecture combines advanced neural acoustic modeling with context-aware language prediction so that outputs maintain both fidelity to the original speech and grammatical coherence. ...
    Downloads: 0 This Week
  • 9
    mcpo

    A simple, secure MCP-to-OpenAPI proxy server

    mcpo is a minimal bridge that exposes any MCP tool as an OpenAPI-compatible HTTP server. Instead of writing glue code, you point mcpo at an MCP server command and it generates REST endpoints and an OpenAPI spec that other systems (or LLM agent frameworks) can call immediately. This design lets you reuse a growing library of MCP servers with platforms that only understand HTTP+OpenAPI, unifying tool access across ecosystems. The project emphasizes “dead-simple” setup and pairs with Open WebUI...
    Downloads: 0 This Week
  • 10
    imgclsmob Deep learning networks

    Sandbox for training deep learning networks

    ...It includes implementations of models used for tasks such as image classification, object detection, semantic segmentation, and pose estimation. The repository also contains scripts that help train models, evaluate performance, and convert trained networks between different frameworks. Several deep learning frameworks are supported, allowing researchers to experiment with architectures in different environments. The project is frequently used by developers who want to study modern convolutional neural network designs and compare their performance across datasets.
    Downloads: 0 This Week
  • 11
    AI-Media2Doc

    AI tool converting video/audio into structured documents instantly

    AI-Media2Doc is a web-based application that uses large language models to convert video and audio content into structured, readable documents in a single workflow. It is designed to transform multimedia inputs into formats such as knowledge notes, summaries, mind maps, and social-style articles, making content easier to review and reuse. AI-Media2Doc emphasizes privacy by processing media locally in the browser using WebAssembly-based ffmpeg, ensuring that original video files are not uploaded externally. ...
    Downloads: 0 This Week
  • 12
    E2M

    E2M converts various file types (doc, docx, epub, html, htm, url

    E2M is a SourceForge mirror of the e2m open-source project, which focuses on providing tools or services designed to convert or process content between different formats or systems. Projects with similar naming conventions typically emphasize automation workflows where input data from one environment is transformed into another representation or output structure. The mirrored repository allows users to access the project’s codebase independently from its original hosting platform while preserving the development history and release artifacts. ...
    Downloads: 0 This Week
  • 13
    NExT-GPT

    Code and models for ICML 2024 paper, NExT-GPT

    ...The system connects a large language model with multimodal encoders and diffusion-based decoders so it can interpret information from different sensory formats and generate responses in different media types. This architecture allows the model to convert between modalities, such as generating images from text descriptions or producing audio or video outputs based on textual prompts. The project also introduces instruction-tuning strategies that enable the model to perform complex multimodal reasoning and generation tasks with minimal additional parameters.
    Downloads: 0 This Week
  • 14
    tf2onnx

    Convert TensorFlow, Keras, Tensorflow.js and Tflite models to ONNX

    tf2onnx converts TensorFlow (tf-1.x or tf-2.x), keras, tensorflow.js and tflite models to ONNX via command line or python API. Note: tensorflow.js support was just added. While we tested it with many tfjs models from tfhub, it should be considered experimental. TensorFlow has many more ops than ONNX and occasionally mapping a model to ONNX creates issues. tf2onnx will use the ONNX version installed on your system and installs the latest ONNX version if none is found. We support and test ONNX...
    Downloads: 0 This Week
  • 15
    Streamer-Sales

    LLM Large Model of Selling Anchor

    ...By analyzing product characteristics and marketing information, the model can produce engaging explanations that emphasize benefits, features, and emotional appeal to encourage viewers to make purchasing decisions. The system integrates multiple AI technologies including retrieval-augmented generation to incorporate product knowledge, speech synthesis to convert generated scripts into voice output, and digital human generation to create virtual hosts. It also supports automatic speech recognition and agent-based tools that can retrieve additional information such as logistics or product details during live sessions.
    Downloads: 0 This Week
  • 16
    shuyuan

    Reading book source

    ...The name suggests “academy” or “study hall,” and the tool aims to help users ingest, organize, and manage reading content — possibly offering features like text parsing, annotation, metadata generation, translation, or storage for later reference. The repository is set up to support document ingestion, indexing, and maybe some AI-aided summarization or lookup functions, which helps users convert large text corpora into a structured, searchable knowledge base. For learners, researchers, or avid readers, Shuyuan offers a way to bridge from plain text files or eBooks into a manageable, interactive resource — one where notes, references, and reading progress can be tracked. It likely supports different input formats (text, HTML, PDF), and may integrate optional translation or text normalization tools.
    Downloads: 0 This Week
  • 17
    Core ML Stable Diffusion

    Stable Diffusion with Core ML on Apple Silicon

    ...The Swift package relies on the Core ML model files generated by python_coreml_stable_diffusion. Hugging Face ran the conversion procedure on the following models and made the Core ML weights publicly available on the Hub. If you would like to convert a version of Stable Diffusion that is not already available on the Hub, please refer to the Converting Models to Core ML. Log in to or register for your Hugging Face account, generate a User Access Token and use this token to set up Hugging Face API access by running huggingface-cli login in a Terminal window.
    Downloads: 0 This Week
  • 18
    Melodfy

    ✨:AI-Powered Piano Audio to MIDI Converter 🎶

    Melodfy is an application that utilizes the power of artificial intelligence (developed by ByteDance) to seamlessly convert audio recordings of piano playing into playable MIDI files.
    Downloads: 7 This Week
  • 19
    Whisper Batch Transcriber

    Unlimited, private and free Speech-To-Text program

    ...It's free, fully automated, and unlimited, and uses state-of-the-art speech-to-text technology. It works 100% offline on your computer, privately and locally. Use cases: convert speeches, podcasts, webinars, monologues, storytelling, and other spoken audio into a formatted .txt file, one sentence per line. Notes: It is 2 GB in size and also requires 2-6 GB of GPU VRAM (basically, you need at least a mid-range gaming PC to use it). It is fairly slow to start (10 min) and to transcribe; this is normal behavior...
    Downloads: 8 This Week
  • 20
    AudioBC

    Offline desktop app to convert EPUB to MP3 using Kokoro-82M neural TTS

    ...Powered by the state-of-the-art Kokoro-82M neural engine, AudioBC produces natural, human-like speech that rivals premium cloud services. It is built with a focus on privacy and simplicity, offering a streamlined three-step workflow: Extract, Edit, and Convert. Key Features: Neural Quality TTS: Uses the compact yet powerful Kokoro-82M model for high-fidelity, expressive voice synthesis. Privacy-First & Offline: After a one-time initial model download, all processing happens on your CPU. Your books never leave your computer. Multi-Language Support: Curated voices for English (US & UK), Italian, French, Spanish, and Portuguese (BR). ...
    Downloads: 0 This Week
  • 21
    PoseidonQ - AI/ML Based QSAR Modeling

    ML based QSAR Modelling And Translation of Model to Deployable WebApps

    This software was made with the intention of making QSAR/QSPR development more efficient and reproducible. Published in the ACS Journal of Chemical Information and Modeling (https://pubs.acs.org/doi/10.1021/acs.jcim.4c02372). Simple to use, with no compromise on the essential features needed to build reliable QSAR models. Covers everything from generating reliable ML-based QSAR models to developing your own QSAR web app. For any feedback or queries, contact kabeermuzammil614@gmail.com - Available on...
    Downloads: 15 This Week
  • 22
    TITTSE

    Two Integrated Text To Speech Engines uses MMS & Silero

    TITTSE is a Python application that lets you quickly and easily convert text to speech in 15 different languages (more can be added easily) using two TTS engines. All you need is a text file ending in the tittse extension with 4 header lines, including the TITTSE language code (see the documentation for your language), the 'base' file name for the audio files TITTSE creates, the voice gender (girl or boy), and the offset (the number at which the file numbers appended to the base file name start).
    Downloads: 0 This Week
  • 23
    Image to Text

    Convert an image to text to spot intelligible words.

    The program converts an image, such as a photo, to text in order to analyze it and spot intelligible words. Use the program with photos of clouds, sea, soil, vegetation, or any other photo of a natural or man-made semi-homogeneous configuration to reveal the hidden universal-philosophical messages of the image. You can also use it on photos of people or art pieces to gain psychological insight into the person portrayed or the image's author.
    Downloads: 2 This Week
  • 24
    Img2Txt

    Img2Txt - Extract Text From Images using AI

    ...Img2Txt is a Python-based application packaged using PyInstaller that utilizes the power of pytesseract, an AI-powered optical character recognition (OCR) library, to extract text from images and convert it into plain text. The application features a simple, modern, user-friendly interface created using customtkinter, allowing users to easily process images and obtain the text within them. Support me at https://www.buymeacoffee.com/zsynctic; it will motivate me to create more projects. Support: for any questions or issues, please open an issue on the Img2Txt GitHub repository. ...
    Downloads: 2 This Week
  • 25
    Txt-2-Mp3 6.3 Mark 2 [I.S.A]

    Txt-2-Mp3 6.3 Mark 2 [Improved.Simplified.Alternative]

    'Txt2Mp3' is a desktop application developed using Python 3.6.8 and add-on libraries. It converts text into audio (.mp3) files using the gTTS (Google Text-to-Speech) library. Compatible only with Windows.
    Downloads: 0 This Week
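
Several entries in this listing describe converting malformed input into a valid structure; the JSON_REPAIR entry (7), for example, mentions fixing errors such as trailing commas. As a rough illustration of that one idea, here is a minimal Python sketch using only the standard library. It is not json_repair's implementation or API: the function name `strip_trailing_commas` is hypothetical, and it handles only dangling commas before a closing `}` or `]`, nothing else.

```python
import json


def strip_trailing_commas(text: str) -> str:
    """Remove commas that directly precede a closing } or ], outside strings.

    Hypothetical helper for illustration only; it does not fix missing
    quotes, incomplete objects, or any other kind of malformed JSON.
    """
    out = []
    in_string = False   # True while scanning inside a "..." literal
    escaped = False     # True right after a backslash inside a string
    for ch in text:
        if in_string:
            out.append(ch)
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            continue
        if ch == '"':
            in_string = True
            out.append(ch)
        elif ch in "}]":
            # Look back past whitespace; drop a dangling comma if one is there.
            i = len(out) - 1
            while i >= 0 and out[i].isspace():
                i -= 1
            if i >= 0 and out[i] == ",":
                del out[i]
            out.append(ch)
        else:
            out.append(ch)
    return "".join(out)


broken = '{"a": [1, 2, ], "b": "x,]", }'
repaired = strip_trailing_commas(broken)
print(json.loads(repaired))  # {'a': [1, 2], 'b': 'x,]'}
```

The scanner tracks whether it is inside a string so that a comma followed by a bracket within a quoted value (as in `"x,]"`) is left untouched, which a plain regular-expression substitution would not guarantee.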