Search Results for "jtdx-improved" - Page 2

Showing 178 open source projects for "jtdx-improved"

1

SeedVR2 Upscaler ComfyUI

Official SeedVR2 Video Upscaler for ComfyUI

...This project packages the SeedVR2 architecture as a custom node for ComfyUI, letting users upscale low-resolution video or imagery inside a node-based interface without needing to write code manually. The underlying SeedVR2 model is known for delivering high-quality video enhancement with strong temporal consistency and improved detail preservation by using diffusion-based techniques that are trained specifically on video sequences. Within the ComfyUI ecosystem, the upscaler integrates with existing nodes and pipelines, making it easier to combine with other processing steps such as denoising, color correction, or format conversion. Enthusiasts often use it for workflows ranging from hobby video enhancement to professional content improvement.

Downloads: 23 This Week

Last Update: 2026-01-07
See Project
2

DeepSeek-OCR 2

Visual Causal Flow

DeepSeek-OCR-2 is the second-generation optical character recognition system developed to improve document understanding by introducing a “visual causal flow” mechanism, enabling the encoder to reorder visual tokens in a way that better reflects semantic structure rather than strict raster scan order. It is designed to handle complex layouts and noisy documents by giving the model causal reasoning capabilities that mimic human visual scanning behavior, enhancing OCR performance on documents...

Downloads: 9 This Week

Last Update: 2026-02-03
See Project
3

CO3D (Common Objects in 3D)

Tooling for the Common Objects In 3D dataset

CO3Dv2 (Common Objects in 3D, version 2) is a large-scale 3D computer vision dataset and toolkit from Facebook Research designed for training and evaluating category-level 3D reconstruction methods using real-world data. It builds upon the original CO3Dv1 dataset, expanding both scale and quality—featuring 2× more sequences and 4× more frames, with improved image fidelity, more accurate segmentation masks, and enhanced annotations for object-centric 3D reconstruction. CO3Dv2 enables research in multi-view 3D reconstruction, novel view synthesis, and geometry-aware representation learning. Each of the thousands of sequences in CO3Dv2 captures a common object (from categories like cars, chairs, or plants) from multiple real-world viewpoints. ...

Downloads: 0 This Week

Last Update: 4 days ago
See Project
4

CodiumAI Cover-Agent

CodiumAI Cover-Agent: An AI-Powered Tool for Automated Test Generation

CodiumAI Cover Agent aims to help efficiently increase code coverage by automatically generating qualified tests to enhance existing test suites.

Downloads: 1 This Week

Last Update: 2025-05-21
See Project
5

Ricks-Lab GPU Utilities

A set of utilities for monitoring and customizing GPU performance

A set of utilities for monitoring GPU performance and modifying control settings. To get the most out of these utilities, you should run a kernel that supports the GPUs you have installed. If using AMD GPUs, installing the latest AMD GPU driver or ROCm package may provide additional capabilities. If you have Nvidia GPUs installed, nvidia-smi must also be installed for the utilities to read the cards. Writing to GPUs is...

Downloads: 8 This Week

Last Update: 2024-10-30
See Project
6

Audiogen Codec

48khz stereo neural audio codec for general audio

AGC (Audiogen Codec) is a convolutional autoencoder based on the DAC architecture, which holds SOTA. We found that training with EMA and adding a perceptual loss term with CLAP features improved performance. These codecs, being low compression, outperform Meta's EnCodec and DAC on general audio, as validated by internal blind Elo games. We trained (relatively) very low compression codecs to address a core issue in general music and audio generation: low acoustic quality and audible artifacts, which hinder industry use of these models. ...

Downloads: 7 This Week

Last Update: 2024-10-02
See Project
7

CogVideo

Text and image to video generation: CogVideoX and CogVideo

...Built on large-scale Transformer and diffusion architectures, it enables multimodal generation across text-to-video, image-to-video, and video continuation tasks. The latest CogVideoX models offer higher resolution outputs, longer video durations, and improved controllability through prompt engineering. The project includes tools for inference, fine-tuning, and optimization, making it suitable for both research and production use. It supports efficient deployment on a range of GPUs, including consumer hardware with quantization techniques. Overall, CogVideo provides a powerful framework for generating high-quality AI videos and experimenting with cutting-edge multimodal AI systems.

Downloads: 21 This Week

Last Update: 2025-10-04
See Project
8

x-transformers

A simple but complete full-attention transformer

...Proposes adding learned tokens, akin to CLS tokens, named memory tokens, that are passed through the attention layers alongside the input tokens. You can also use the l2 normalized embeddings proposed as part of fixnorm. I have found it leads to improved convergence when paired with small initialization (proposed by BlinkDL). The small initialization will be taken care of as long as l2norm_embed is set to True.

Downloads: 6 This Week

Last Update: 2026-02-12
See Project
9

autoMate

AI tool for automating desktop tasks via natural language input

autoMate is an AI-powered local automation tool designed to enable users to control and automate their computers using natural language instructions instead of traditional scripting or rule-based systems. It combines large language models with computer vision techniques to interpret user intent and understand on-screen content, allowing it to interact with graphical interfaces similarly to a human user. autoMate follows an observe-decide-act workflow, where it analyzes the screen, plans...

Downloads: 8 This Week

Last Update: 2026-03-31
See Project
10

Python API for JMComic

Python crawler and API for downloading JMComic albums and images

JMComic-Crawler-Python is a Python library and crawler framework designed to programmatically access and download comic content from the JMComic platform. It provides a structured API that allows developers to retrieve albums, chapters, and images using simple Python code while handling the necessary network requests and data processing behind the scenes. It supports both web-based and mobile API interfaces, enabling flexible interaction with the platform depending on the available...

Downloads: 11 This Week

Last Update: 7 days ago
See Project
11

Triton

Development repository for the Triton language and compiler

Triton is a programming language and compiler framework specifically designed for writing highly efficient custom deep learning operations, particularly for GPUs. It aims to bridge the gap between low-level GPU programming, such as CUDA, and higher-level abstractions by providing a more productive and flexible environment for developers. Triton enables users to write optimized kernels for machine learning workloads while maintaining readability and control over performance-critical aspects...

Downloads: 4 This Week

Last Update: 2026-03-20
See Project
12

wxPython Project Phoenix

wxPython's Project Phoenix. A new implementation of wxPython

...With wxPython software developers can create truly native user interfaces for their Python applications, that run with little or no modifications on Windows, Macs and Linux or other Unix-like systems. Welcome to wxPython's Project Phoenix! Phoenix is the improved next-generation wxPython, "better, stronger, faster than he was before." This new implementation is focused on improving speed, maintainability and extensibility. Just like "Classic" wxPython, Phoenix wraps the wxWidgets C++ toolkit and provides access to the user interface portions of the wxWidgets API, enabling Python applications to have a native GUI on Windows, Macs or Unix systems, with a native look and feel and requiring very little (if any) platform-specific code.

Downloads: 4 This Week

Last Update: 2026-02-08
See Project
13

HolyClaude

AI coding workstation: Claude Code + web UI + 5 AI CLIs + headless

HolyClaude is a developer-focused toolkit designed to enhance and extend the capabilities of Claude Code environments by providing structured prompts, utilities, and workflow enhancements for AI-assisted coding. The project centers around improving how developers interact with AI agents, enabling more efficient code generation, debugging, and task execution through optimized prompt engineering. It includes predefined templates and interaction patterns that guide the AI toward producing more...

Downloads: 6 This Week

Last Update: 4 days ago
See Project
14

IndexTTS2

Industrial-level controllable zero-shot text-to-speech system

IndexTTS is a modern, zero-shot text-to-speech (TTS) system engineered to deliver high-quality, natural-sounding speech synthesis with few requirements and strong voice-cloning capabilities. It builds on state-of-the-art models such as XTTS and other modern neural TTS backbones, improving them with a conformer-based speech conditional encoder and upgrading the decoder to a high-quality vocoder (BigVGAN2), leading to clearer and more natural audio output. The system supports zero-shot voice...

Downloads: 9 This Week

Last Update: 2025-11-27
See Project
15

VOID

Video Object and Interaction Deletion

VOID is an advanced AI video processing system developed by Netflix that focuses on removing objects from videos while preserving the physical and visual realism of the surrounding environment. Unlike traditional inpainting methods that only erase pixels or simple artifacts, VOID models the full interaction dynamics between objects and their environment, including shadows, reflections, and even physical consequences such as movement or balance changes. Built on top of transformer-based...

Downloads: 4 This Week

Last Update: 7 days ago
See Project
16

Ansible-lint

Best practices checker for Ansible

...Still, its rules are the result of community contributions, and each user can always disable them individually or by category. ansible-lint checks playbooks for practices and behavior that could potentially be improved. As a community-backed project, ansible-lint supports only the last two major versions of Ansible.

Downloads: 12 This Week

Last Update: 2026-04-01
See Project
17

bitsandbytes

Accessible large language models via k-bit quantization for PyTorch

bitsandbytes is an open-source library designed to make training and inference of large neural networks more efficient by dramatically reducing memory usage. Built primarily for the PyTorch ecosystem, the library introduces advanced quantization techniques that allow models to operate using reduced numerical precision while maintaining high accuracy. These optimizations enable large language models and other deep learning architectures to run on hardware with limited memory resources,...

Downloads: 6 This Week

Last Update: 2026-03-04
See Project
18

Papermerge

Open Source Document Management System for Digital Archives

...Instantly find relevant information using full text, tags and metadata-based search. Papermerge is free and open-source software, which means that transparency is the core value of our software development. Source code can be reviewed and improved by anyone from anywhere. Papermerge supports multiple users. Each user can be assigned different permissions to perform only a specific kind of action, e.g. view only documents from a specific folder. OCR technology is a vital part of Papermerge. It extracts text information from scanned documents and from PDF, JPEG, and TIFF files.

Downloads: 20 This Week

Last Update: 2025-07-24
See Project
19

SAM 2

The repository provides code for running inference with SAM 2

SAM2 is a next-generation version of the Segment Anything Model (SAM), designed to improve performance, generalization, and efficiency in promptable image segmentation tasks. It retains the core promptable interface—accepting points, boxes, or masks—but incorporates architectural and training enhancements to produce higher-fidelity masks, better boundary adherence, and robustness to complex scenes. The updated model is optimized for faster inference and lower memory use, enabling real-time...

Downloads: 8 This Week

Last Update: 2025-10-06
See Project
20

Imagen - Pytorch

Implementation of Imagen, Google's Text-to-Image Neural Network

...Architecturally, it is actually much simpler than DALL-E2. It consists of a cascading DDPM conditioned on text embeddings from a large pre-trained T5 model (attention network). It also contains dynamic clipping for improved classifier-free guidance, noise level conditioning, and a memory-efficient U-Net design. It appears neither CLIP nor prior network is needed after all. And so research continues. For simpler training, you can directly supply text strings instead of precomputing text encodings. (Although for scaling purposes, you will definitely want to precompute the textual embeddings + mask)

Downloads: 9 This Week

Last Update: 2024-10-07
See Project
21

PaddleNLP

Easy-to-use and powerful NLP library with Awesome model zoo

...Provides rich industry-level preset task capabilities through Taskflow, along with text-domain APIs that cover the whole workflow: the Dataset API, which supports loading a rich set of Chinese datasets; the Data API, for flexible and efficient data preprocessing; the Embedding API, with 60+ preset pre-trained word vectors; and the Transformer API, which provides 100+ pre-trained models. Together these can greatly improve the efficiency of NLP task modeling.

Downloads: 5 This Week

Last Update: 2025-05-21
See Project
22

autocrawler

Multiprocess Selenium crawler for downloading images by keywords

AutoCrawler is a Python-based image crawling tool designed to automatically download large numbers of images from search engines using automated browser interaction. It uses Selenium and a Chrome browser driver to navigate image search pages and collect image sources based on keywords provided by the user. AutoCrawler supports multiprocess and multithreaded downloading, which allows it to retrieve images faster by running several tasks simultaneously. Users provide search terms through a...

Downloads: 2 This Week

Last Update: 5 days ago
See Project
23

Archon

The knowledge and task management backbone for AI coding assistants

Archon is an open-source “command center” designed to enhance AI coding assistant workflows by giving developers a centralized environment for knowledge management, context engineering, and task coordination across AI agents. It acts as a backend (including an MCP server) that allows different AI coding tools and assistants to share the same structured context, knowledge base, and task lists, improving consistency, productivity, and collaboration across multi-agent interactions. Users can...

Downloads: 2 This Week

Last Update: 2 days ago
See Project
24

CodeGeeX2

CodeGeeX2: A More Powerful Multilingual Code Generation Model

...Compared to the first generation, it delivers a significant boost in programming ability across multiple languages, outperforming even larger models like StarCoder-15B in some benchmarks despite having only 6B parameters. The model excels at code generation, translation, summarization, debugging, and comment generation, and it supports over 100 programming languages. With improved inference efficiency, quantization options, and multi-query/flash attention, CodeGeeX2 achieves faster generation speeds and lightweight deployment, requiring as little as 6GB GPU memory at INT4 precision. Its backend powers the CodeGeeX IDE plugins for VS Code, JetBrains, and other editors, offering developers interactive AI assistance with features like infilling and cross-file completion.

Downloads: 5 This Week

Last Update: 7 days ago
See Project
25

Agent S

Agent S: an open agentic framework that uses computers like a human

Agent S is an open-source agentic framework designed to enable autonomous computer use through an Agent-Computer Interface (ACI). Built to operate graphical user interfaces like a human, it allows AI agents to perceive screens, reason about tasks, and execute actions across macOS, Windows, and Linux systems. The latest version, Agent S3, surpasses human-level performance on the OSWorld benchmark, demonstrating state-of-the-art results in complex multi-step computer tasks. Agent S combines...

Downloads: 11 This Week

Last Update: 2025-12-16
See Project