Join/Login
Business Software
Open Source Software
For Vendors
Blog
About
More

For Vendors Help Create Join Login

Business Software

Open Source Software

SourceForge Podcast

Resources

Articles
Case Studies
Blog

Menu

Help
Create
Join
Login

Home
Open Source Software
Artificial Intelligence
LLM Inference Tools
Search Results

Search Results for "can=" - Page 2

x

Sort By:

Relevance

Clear All Filters

OS

Windows 43
Linux 40
Mac 40
More...
ChromeOS 5
BSD 4
Mobile Operating Systems 1

Category

Artificial Intelligence 44
Software Development 8
Business 1
Multimedia 1

License

OSI-Approved Open Source 44

Translations

English 3

Programming Language

Python 44
C++ 1
JavaScript 1
Rust 1

Status

Production/Stable 2

Showing 44 open source projects for "can="

View related business solutions

LLM Inference Python Clear Filters & Widen Search

Go from Code to Production URL in Seconds
Cloud Run deploys apps in any language instantly. Scales to zero. Pay only when code runs.

Skip the Kubernetes configs. Cloud Run handles HTTPS, scaling, and infrastructure automatically. Two million requests free per month.

Try it free
$300 in Free Credit Towards Top Cloud Services
Build VMs, containers, AI, databases, storage—all in one place.

Start your project in minutes. After credits run out, 20+ products include free monthly usage. Only pay when you're ready to scale.

Get Started
1

DeepSpeed

Deep learning optimization library: makes distributed training easy

DeepSpeed is an easy-to-use deep learning optimization software suite that enables unprecedented scale and speed for Deep Learning Training and Inference. With DeepSpeed you can: 1. Train/Inference dense or sparse models with billions or trillions of parameters 2. Achieve excellent system throughput and efficiently scale to thousands of GPUs 3. Train/Inference on resource constrained GPU systems 4. Achieve unprecedented low latency and high throughput for inference 5. Achieve extreme compression for an unparalleled inference latency and model size reduction with low costs DeepSpeed offers a confluence of system innovations, that has made large scale DL training effective, and efficient, greatly improved ease of use, and redefined the DL training landscape in terms of scale that is possible. ...

Downloads: 0 This Week

Last Update: 2026-05-06
See Project
2

Curated Transformers

PyTorch library of curated Transformer models and their components

...Supports state-of-the-art transformer models, including LLMs such as Falcon, Llama, and Dolly v2. Implementing a feature or bugfix benefits all models. For example, all models support 4/8-bit inference through the bitsandbytes library and each model can use the PyTorch meta device to avoid unnecessary allocations and initialization.

Downloads: 0 This Week

Last Update: 2024-04-17
See Project
3

ollama_manager_gui

A graphical manager for ollama that can manage your LLMs

...It checks for ollama when launched and if it doesn't exist it will help by bringing you to the ollama site for download. This app is heavily upgraded and now also works properly on Linux. It now has progress bars and many many many improvements. It can launch the LLM by clicking the link. it can launch multiple LLMs in separate windows. It can also remove an installed LLM. There is a confirmation dialog when removing or hard resetting an LLM. It can also install LLMs. Reset deletes all data for LLM of your choosing and automatically reinstalls that LLM of your choosing by redownloading. ...

Downloads: 1 This Week

Last Update: 2025-08-14
See Project
4

OpenFieldAI - AI Open Field Test Tracker

OpenFieldAI is an AI based Open Field Test Rodent Tracker

...The software generates Centroid graph, Heat map and Line path and a spreadsheet containing all calculated parameters like - Speed - Time in and out of ROI - Distance - Entries/Exits for single/multiple pre-recorded videos or live webcam video. The ROI is assigned automatically in multiple video input , and can be manually given in single input. - For Queries/ Reporting Bugs, contact: kabeermuzammil614@gmail.com - Available on WIndows OS - Software Authorship - Muzammil Kabier and Shamili Mariya Varghese ( Sole Authors )

Downloads: 20 This Week

Last Update: 2026-01-05
See Project
Forever Free Full-Stack Observability | Grafana Cloud
Our generous forever free tier includes the full platform, including the AI Assistant, for 3 users with 10k metrics, 50GB logs, and 50GB traces.

Built on open standards like Prometheus and OpenTelemetry, Grafana Cloud includes Kubernetes Monitoring, Application Observability, Incident Response, plus the AI-powered Grafana Assistant. Get started with our generous free tier today.

Create free account
5

Bard API

The unofficial python package that returns response of Google Bard

...Please note that the bardapi is not a free service, but rather a tool provided to assist developers with testing certain functionalities due to the delayed development and release of Google Bard's API. It has been designed with a lightweight structure that can easily adapt to the emergence of an official API. Therefore, I strongly discourage using it for any other purposes. If you have access to official PaLM-2 API, replace the provided response with the corresponding official code.

Downloads: 0 This Week

Last Update: 2024-02-24
See Project
6

FastChat

Open platform for training, serving, and evaluating language models

FastChat is an open platform for training, serving, and evaluating large language model-based chatbots. If you do not have enough memory, you can enable 8-bit compression by adding --load-8bit to the commands above. This can reduce memory usage by around half with slightly degraded model quality. It is compatible with the CPU, GPU, and Metal backend. Vicuna-13B with 8-bit compression can run on a single NVIDIA 3090/4080/T4/V100(16GB) GPU. In addition to that, you can add --cpu-offloading to commands above to offload weights that don't fit on your GPU onto the CPU memory. ...

Downloads: 1 This Week

Last Update: 2024-02-11
See Project
7

Autodistill

Images to inference with no labeling

Autodistill uses big, slower foundation models to train small, faster supervised models. Using autodistill, you can go from unlabeled images to inference on a custom model running at the edge with no human intervention in between. You can use Autodistill on your own hardware, or use the Roboflow hosted version of Autodistill to label images in the cloud.

Downloads: 0 This Week

Last Update: 2024-08-14
See Project
8

SSD in PyTorch 1.0

High quality, fast, modular reference implementation of SSD in PyTorch

...The implementation is heavily influenced by the projects ssd.pytorch, pytorch-ssd and maskrcnn-benchmark. This repository aims to be the code base for research based on SSD. Multi-GPU training and inference: We use DistributedDataParallel, you can train or test with arbitrary GPU(s), the training schema will change accordingly. Add your own modules without pain. We abstract backbone, Detector, BoxHead, BoxPredictor, etc. You can replace every component with your own code without changing the code base. For example, You can add EfficientNet as the backbone, just add efficient_net.py (ALREADY ADDED) and register it, specific it in the config file, It's done! ...

Downloads: 0 This Week

Last Update: 2024-01-13
See Project
9

MMDeploy

OpenMMLab Model Deployment Framework

MMDeploy is an open-source deep learning model deployment toolset. It is a part of the OpenMMLab project. Models can be exported and run in several backends, and more will be compatible. All kinds of modules in the SDK can be extended, such as Transform for image processing, Net for Neural Network inference, Module for postprocessing and so on. Install and build your target backend. ONNX Runtime is a cross-platform inference and training accelerator compatible with many popular ML/DNN frameworks. ...

Downloads: 2 This Week

Last Update: 2023-12-25
See Project
Gemini 3 and 200+ AI Models on One Platform
Access Google's best plus Claude, Llama, and Gemma. Fine-tune and deploy from one console.

Build, govern, and optimize agents and models with Gemini Enterprise Agent Platform.

Start Free
10

pipeless

A computer vision framework to create and deploy apps in minutes

...It provides the development experience of serverless frameworks applied to computer vision. You provide some functions that are executed for new video frames and Pipeless takes care of everything else. You can easily use industry-standard models, such as YOLO, or load your custom model in one of the supported inference runtimes. Pipeless ships some of the most popular inference runtimes, such as the ONNX Runtime, allowing you to run inference with high performance on CPU or GPU out-of-the-box. You can deploy your Pipeless application with a single command to edge and IoT devices or the cloud.

Downloads: 2 This Week

Last Update: 2024-02-23
See Project
11

towhee

Framework that is dedicated to making neural data processing

Towhee is an open-source machine-learning pipeline that helps you encode your unstructured data into embeddings. You can use our Python API to build a prototype of your pipeline and use Towhee to automatically optimize it for production-ready environments. From images to text to 3D molecular structures, Towhee supports data transformation for nearly 20 different unstructured data modalities. We provide end-to-end pipeline optimizations, covering everything from data decoding/encoding, to model inference, making your pipeline execution 10x faster. ...

Downloads: 0 This Week

Last Update: 2023-12-05
See Project
12

SageMaker Inference Toolkit

Serve machine learning models within a Docker container

...The SageMaker Inference Toolkit implements a model serving stack and can be easily added to any Docker container, making it deployable to SageMaker. This library's serving stack is built on Multi Model Server, and it can serve your own models or those you trained on SageMaker using machine learning frameworks with native SageMaker support.

Downloads: 0 This Week

Last Update: 2023-10-25
See Project
13

Petals

Run 100B+ language models at home, BitTorrent-style

...Single-batch inference runs at ≈ 1 sec per step (token) — up to 10x faster than offloading, enough for chatbots and other interactive apps. Parallel inference reaches hundreds of tokens/sec. Beyond classic language model APIs — you can employ any fine-tuning and sampling methods, execute custom paths through the model, or see its hidden states. You get the comforts of an API with the flexibility of PyTorch. You can also host BLOOMZ, a version of BLOOM fine-tuned to follow human instructions in the zero-shot regime — just replace bloom-petals with bloomz-petals. ...

Downloads: 0 This Week

Last Update: 2023-09-06
See Project
14

Repo of Tree of Thoughts (ToT)

Implementation of "Tree of Thoughts

Language models are increasingly being deployed for general problem-solving across a wide range of tasks, but are still confined to token-level, left-to-right decision-making processes during inference. This means they can fall short in tasks that require exploration, strategic lookahead, or where initial decisions play a pivotal role. To surmount these challenges, we introduce a new framework for language model inference, Tree of Thoughts (ToT), which generalizes over the popular Chain of Thought approach to prompting language models and enables exploration over coherent units of text (thoughts) that serve as intermediate steps toward problem-solving. ...

Downloads: 0 This Week

Last Update: 2023-08-21
See Project
15

Lightning Bolts

Toolbox of models, callbacks, and datasets for AI/ML researchers

Bolts package provides a variety of components to extend PyTorch Lightning, such as callbacks & datasets, for applied research and production. Torch ORT converts your model into an optimized ONNX graph, speeding up training & inference when using NVIDIA or AMD GPUs. We can introduce sparsity during fine-tuning with SparseML, which ultimately allows us to leverage the DeepSparse engine to see performance improvements at inference time.

Downloads: 0 This Week

Last Update: 2024-08-15
See Project
16

Sockeye

Sequence-to-sequence framework, focused on Neural Machine Translation

...For a quickstart guide to training a standard NMT model on any size of data, see the WMT 2014 English-German tutorial. If you are interested in collaborating or have any questions, please submit a pull request or issue. You can also send questions to sockeye-dev-at-amazon-dot-com. Developers may be interested in our developer guidelines. Starting with version 3.0.0, Sockeye is also based on PyTorch. We maintain backwards compatibility with MXNet models of version 2.3.x with 3.0.x. If MXNet 2.x is installed, Sockeye can run both with PyTorch or MXNet. All models trained with 2.3.x (using MXNet) can be converted to models running with PyTorch using the converter CLI (sockeye.mx_to_pt).

Downloads: 0 This Week

Last Update: 2023-03-02
See Project
17

MMTracking

OpenMMLab Video Perception Toolbox

...We are the first open-source toolbox that unifies versatile video perception tasks include video object detection, multiple object tracking, single object tracking and video instance segmentation. We decompose the video perception framework into different components and one can easily construct a customized method by combining different modules. MMTracking interacts with other OpenMMLab projects. It is built upon MMDetection that we can capitalize any detector only through modifying the configs. All operations run on GPUs. The training and inference speeds are faster than or comparable to other implementations. ...

Downloads: 0 This Week

Last Update: 2023-08-15
See Project
18

AI Chatbots based on GPT Architecture

Training & Implementation of chatbots leveraging GPT-like architecture

...This repo is a simple attempt to help solve that problem. ai-msgbot covers the practical use case of building a chatbot that sounds like you (or some dataset/persona you choose) by training a text-generation model to generate conversation in a consistent structure. This structure is then leveraged to deploy a chatbot that is a "free-form" model that consistently replies like a human. Some of the trained models can be interacted with through the HuggingFace spaces and model inference APIs on the ETHZ Analytics Organization page on huggingface.co.

Downloads: 0 This Week

Last Update: 2023-03-23
See Project
19

Hugging Face Transformer

CPU/GPU inference server for Hugging Face transformer models

...Most tutorials on Transformer deployment in production are built over Pytorch and FastAPI. Both are great tools but not very performant in inference. Then, if you spend some time, you can build something over ONNX Runtime and Triton inference server. You will usually get from 2X to 4X faster inference compared to vanilla Pytorch. It's cool! However, if you want the best in class performances on GPU, there is only a single possible combination: Nvidia TensorRT and Triton. You will usually get 5X faster inference compared to vanilla Pytorch.

Downloads: 0 This Week

Last Update: 2022-08-22
See Project

Previous
1
You're on page 2
Next

Related Searches

smart home control panel

gpt

openfield

chatbot code

ssd health check

towhee

ai

inference engine

win

python code editor

Related Categories

Artificial Intelligence

Software Development

Business

Multimedia

SourceForge

Create a Project
Open Source Software
Business Software
Top Downloaded Projects

Company

About
Team
SourceForge Headquarters
1320 Columbia Street Suite 310
San Diego, CA 92101
+1 (858) 422-6466

Resources

Support
Site Documentation
Site Status
SourceForge Reviews

© 2026 Slashdot Media. All Rights Reserved.

Terms Privacy Opt Out Advertise