Join/Login
Business Software
Open Source Software
For Vendors
Blog
About
More

For Vendors Help Create Join Login

Business Software

Open Source Software

SourceForge Podcast

Resources

Articles
Case Studies
Blog

Menu

Help
Create
Join
Login

Home
Open Source Software
Artificial Intelligence
LLM Inference Tools
Search Results

Search Results for "compiler python linux" - Page 4

x

Sort By:

Relevance

Clear All Filters

OS

Windows 98
Linux 98
Mac 97
More...
ChromeOS 10
BSD 9
Mobile Operating Systems 1

Category

Artificial Intelligence 98
Software Development 21
Business 3
System 2
Database 1
Education 1
Formats and Protocols 1
Multimedia 1

License

OSI-Approved Open Source 98

Translations

English 4

Programming Language

Python 85
C++ 10
C 2
JavaScript 2
More...
Julia 1
Rust 1
Swift 1

Status

Production/Stable 3

Showing 98 open source projects for "compiler python linux"

View related business solutions

LLM Inference Windows Clear Filters & Widen Search

Gen AI apps are built with MongoDB Atlas
Build gen AI apps with an all-in-one modern database: MongoDB Atlas

MongoDB Atlas provides built-in vector search and a flexible document model so developers can build, scale, and run gen AI apps without stitching together multiple databases. From LLM integration to semantic search, Atlas simplifies your AI architecture—and it’s free to get started.

Start Free
Simple, Secure Domain Registration
Get your domain at wholesale price. Cloudflare offers simple, secure registration with no markups, plus free DNS, CDN, and SSL integration.

Register or renew your domain and pay only what we pay. No markups, hidden fees, or surprise add-ons. Choose from over 400 TLDs (.com, .ai, .dev). Every domain is integrated with Cloudflare's industry-leading DNS, CDN, and free SSL to make your site faster and more secure. Simple, secure, at-cost domain registration.

Sign up for free
1

Arize Phoenix

Uncover insights, surface problems, monitor, and fine tune your LLM

Phoenix provides ML insights at lightning speed with zero-config observability for model drift, performance, and data quality. Phoenix is an Open Source ML Observability library designed for the Notebook. The toolset is designed to ingest model inference data for LLMs, CV, NLP and tabular datasets. It allows Data Scientists to quickly visualize their model data, monitor performance, track down issues & insights, and easily export to improve. Deep Learning Models (CV, LLM, and Generative)...

Downloads: 0 This Week

Last Update: 5 days ago
See Project
2

marqo

Tensor search for humans

A tensor-based search and analytics engine that seamlessly integrates with your applications, websites, and workflows. Marqo is a versatile and robust search and analytics engine that can be integrated into any website or application. Due to horizontal scalability, Marqo provides lightning-fast query times, even with millions of documents. Marqo helps you configure deep-learning models like CLIP to pull semantic meaning from images. It can seamlessly handle image-to-image, image-to-text and...

Downloads: 0 This Week

Last Update: 2025-10-10
See Project
3

TensorFlow Model Optimization Toolkit

A toolkit to optimize ML models for deployment for Keras & TensorFlow

The TensorFlow Model Optimization Toolkit is a suite of tools for optimizing ML models for deployment and execution. Among many uses, the toolkit supports techniques used to reduce latency and inference costs for cloud and edge devices (e.g. mobile, IoT). Deploy models to edge devices with restrictions on processing, memory, power consumption, network usage, and model storage space. Enable execution on and optimize for existing hardware or new special purpose accelerators. Choose the model...

Downloads: 0 This Week

Last Update: 2024-02-08
See Project
4

SAHI

A lightweight vision library for performing large object detection

A lightweight vision library for performing large-scale object detection & instance segmentation. Object detection and instance segmentation are by far the most important fields of applications in Computer Vision. However, detection of small objects and inference on large images are still major issues in practical usage. Here comes the SAHI to help developers overcome these real-world problems with many vision utilities. Detection of small objects and objects far away in the scene is a major...

Downloads: 0 This Week

Last Update: 2025-09-28
See Project
Get the most trusted enterprise browser
Advanced built-in security helps IT prevent breaches before they happen

Defend against security incidents with Chrome Enterprise. Create customizable controls, manage extensions and set proactive alerts to keep your data and employees protected without slowing down productivity.

Download Chrome
5

DALI

A GPU-accelerated library containing highly optimized building blocks

The NVIDIA Data Loading Library (DALI) is a library for data loading and pre-processing to accelerate deep learning applications. It provides a collection of highly optimized building blocks for loading and processing image, video and audio data. It can be used as a portable drop-in replacement for built-in data loaders and data iterators in popular deep learning frameworks. Deep learning applications require complex, multi-stage data processing pipelines that include loading, decoding,...

Downloads: 0 This Week

Last Update: 2025-10-16
See Project
6

SageMaker Inference Toolkit

Serve machine learning models within a Docker container

Serve machine learning models within a Docker container using Amazon SageMaker. Amazon SageMaker is a fully managed service for data science and machine learning (ML) workflows. You can use Amazon SageMaker to simplify the process of building, training, and deploying ML models. Once you have a trained model, you can include it in a Docker container that runs your inference code. A container provides an effectively isolated environment, ensuring a consistent runtime regardless of where the...

Downloads: 0 This Week

Last Update: 2023-10-25
See Project
7

LLMFlows

LLMFlows - Simple, Explicit and Transparent LLM Apps

LLMFlows is a framework for building simple, explicit, and transparent applications utilizing Large Language Models (LLMs). It emphasizes clarity and control in the development process, allowing developers to create LLM-powered applications with well-defined workflows and interactions. LLMFlows supports various LLMs and provides tools to manage prompts, responses, and application logic effectively.

Downloads: 0 This Week

Last Update: 2025-03-19
See Project
8

llama2-webui

Run any Llama 2 locally with gradio UI on GPU or CPU from anywhere

Running Llama 2 with gradio web UI on GPU or CPU from anywhere (Linux/Windows/Mac).

Downloads: 0 This Week

Last Update: 2023-10-04
See Project
9

Medusa

Framework for Accelerating LLM Generation with Multiple Decoding Heads

Medusa is a framework aimed at accelerating the generation capabilities of Large Language Models (LLMs) by employing multiple decoding heads. This approach allows for parallel processing during text generation, significantly enhancing throughput and reducing response times. Medusa is designed to be simple to implement and integrates with existing LLM infrastructures, making it a practical solution for scaling LLM applications.

Downloads: 0 This Week

Last Update: 2025-03-19
See Project
Build Securely on AWS with Proven Frameworks
Lay a foundation for success with Tested Reference Architectures developed by Fortinet’s experts. Learn more in this white paper.

Moving to the cloud brings new challenges. How can you manage a larger attack surface while ensuring great network performance? Turn to Fortinet’s Tested Reference Architectures, blueprints for designing and securing cloud environments built by cybersecurity experts. Learn more and explore use cases in this white paper.

Download Now
10

Petals

Run 100B+ language models at home, BitTorrent-style

Run 100B+ language models at home, BitTorrent‑style. Run large language models like BLOOM-176B collaboratively — you load a small part of the model, then team up with people serving the other parts to run inference or fine-tuning. Single-batch inference runs at ≈ 1 sec per step (token) — up to 10x faster than offloading, enough for chatbots and other interactive apps. Parallel inference reaches hundreds of tokens/sec. Beyond classic language model APIs — you can employ any fine-tuning and...

Downloads: 0 This Week

Last Update: 2023-09-06
See Project
11

Repo of Tree of Thoughts (ToT)

Implementation of "Tree of Thoughts

Language models are increasingly being deployed for general problem-solving across a wide range of tasks, but are still confined to token-level, left-to-right decision-making processes during inference. This means they can fall short in tasks that require exploration, strategic lookahead, or where initial decisions play a pivotal role. To surmount these challenges, we introduce a new framework for language model inference, Tree of Thoughts (ToT), which generalizes over the popular Chain of...

Downloads: 0 This Week

Last Update: 2023-08-21
See Project
12

Lightning Bolts

Toolbox of models, callbacks, and datasets for AI/ML researchers

Bolts package provides a variety of components to extend PyTorch Lightning, such as callbacks & datasets, for applied research and production. Torch ORT converts your model into an optimized ONNX graph, speeding up training & inference when using NVIDIA or AMD GPUs. We can introduce sparsity during fine-tuning with SparseML, which ultimately allows us to leverage the DeepSparse engine to see performance improvements at inference time.

Downloads: 0 This Week

Last Update: 2024-08-15
See Project
13

ollama_manager_gui

A graphical manager for ollama that can manage your LLMs

This app will help install ollama and LLMs using the gui provided by this app. It checks for ollama when launched and if it doesn't exist it will help by bringing you to the ollama site for download. This app is heavily upgraded and now also works properly on Linux. It now has progress bars and many many many improvements. It can launch the LLM by clicking the link. it can launch multiple LLMs in separate windows. It can also remove an installed LLM. There is a confirmation dialog...

Downloads: 2 This Week

Last Update: 2025-08-14
See Project
14

pipeless

A computer vision framework to create and deploy apps in minutes

Pipeless is an open-source computer vision framework to create and deploy applications without the complexity of building and maintaining multimedia pipelines. It ships everything you need to create and deploy efficient computer vision applications that work in real-time in just minutes. Pipeless is inspired by modern serverless technologies. It provides the development experience of serverless frameworks applied to computer vision. You provide some functions that are executed for new...

Downloads: 1 This Week

Last Update: 2024-02-23
See Project
15

Temporal Inference Engine

A real time inference engine for temporal logical specifications

A real time inference engine for temporal logical specifications, which is able to acquire, process and generate any binary or real signal through POSIX IPC, files or UNIX sockets. Specifications of signals and dynamic systems are represented as special graphs and executed in real time, with a predictable sampling time of few milliseconds. Real time signal processing, dynamic system control, state machine modeling and logical property verification are some fields of application of this...

1 Review

Downloads: 0 This Week

Last Update: 2 hours ago
See Project
16

GPT-NeoX

Implementation of model parallel autoregressive transformers on GPUs

This repository records EleutherAI's library for training large-scale language models on GPUs. Our current framework is based on NVIDIA's Megatron Language Model and has been augmented with techniques from DeepSpeed as well as some novel optimizations. We aim to make this repo a centralized and accessible place to gather techniques for training large-scale autoregressive language models, and accelerate research into large-scale training. For those looking for a TPU-centric codebase, we...

Downloads: 0 This Week

Last Update: 2023-03-23
See Project
17

Sockeye

Sequence-to-sequence framework, focused on Neural Machine Translation

Sockeye is an open-source sequence-to-sequence framework for Neural Machine Translation built on PyTorch. It implements distributed training and optimized inference for state-of-the-art models, powering Amazon Translate and other MT applications. For a quickstart guide to training a standard NMT model on any size of data, see the WMT 2014 English-German tutorial. If you are interested in collaborating or have any questions, please submit a pull request or issue. You can also send questions...

Downloads: 0 This Week

Last Update: 2023-03-02
See Project
18

MMTracking

OpenMMLab Video Perception Toolbox

MMTracking is an open-source video perception toolbox by PyTorch. It is a part of OpenMMLab project. We are the first open-source toolbox that unifies versatile video perception tasks include video object detection, multiple object tracking, single object tracking and video instance segmentation. We decompose the video perception framework into different components and one can easily construct a customized method by combining different modules. MMTracking interacts with other OpenMMLab...

Downloads: 0 This Week

Last Update: 2023-08-15
See Project
19

AI Chatbots based on GPT Architecture

Training & Implementation of chatbots leveraging GPT-like architecture

Training & Implementation of chatbots leveraging GPT-like architecture with the aitextgen package to enable dynamic conversations. It sure seems like there are a lot of text-generation chatbots out there, but it's hard to find a python package or model that is easy to tune around a simple text file of message data. This repo is a simple attempt to help solve that problem. ai-msgbot covers the practical use case of building a chatbot that sounds like you (or some dataset/persona you choose...

Downloads: 0 This Week

Last Update: 2023-03-23
See Project
20

Hello AI World

Guide to deploying deep-learning inference networks

Hello AI World is a great way to start using Jetson and experiencing the power of AI. In just a couple of hours, you can have a set of deep learning inference demos up and running for realtime image classification and object detection on your Jetson Developer Kit with JetPack SDK and NVIDIA TensorRT. The tutorial focuses on networks related to computer vision, and includes the use of live cameras. You’ll also get to code your own easy-to-follow recognition program in Python or C++, and train...

Downloads: 0 This Week

Last Update: 2022-08-03
See Project
21

SageMaker MXNet Inference Toolkit

Toolkit for allowing inference and serving with MXNet in SageMaker

SageMaker MXNet Inference Toolkit is an open-source library for serving MXNet models on Amazon SageMaker. This library provides default pre-processing, predict and postprocessing for certain MXNet model types and utilizes the SageMaker Inference Toolkit for starting up the model server, which is responsible for handling inference requests. AWS Deep Learning Containers (DLCs) are a set of Docker images for training and serving models in TensorFlow, TensorFlow 2, PyTorch, and MXNet. Deep...

Downloads: 0 This Week

Last Update: 2022-07-05
See Project
22

Hugging Face Transformer

CPU/GPU inference server for Hugging Face transformer models

Optimize and deploy in production Hugging Face Transformer models in a single command line. At Lefebvre Dalloz we run in-production semantic search engines in the legal domain, in the non-marketing language it's a re-ranker, and we based ours on Transformer. In that setup, latency is key to providing a good user experience, and relevancy inference is done online for hundreds of snippets per user query. Most tutorials on Transformer deployment in production are built over Pytorch and FastAPI....

Downloads: 0 This Week

Last Update: 2022-08-22
See Project
23

BudgetML

Deploy a ML inference service on a budget in 10 lines of code

Deploy a ML inference service on a budget in less than 10 lines of code. BudgetML is perfect for practitioners who would like to quickly deploy their models to an endpoint, but not waste a lot of time, money, and effort trying to figure out how to do this end-to-end. We built BudgetML because it's hard to find a simple way to get a model in production fast and cheaply. Deploying from scratch involves learning too many different concepts like SSL certificate generation, Docker, REST,...

Downloads: 0 This Week

Last Update: 2022-08-26
See Project

Previous
1
2
3
You're on page 4
Next

Related Searches

ollama

phoenix

sahi

ai

inference engine

employee training records

self-learning ai

surveillance

ollama gui

ai open chat

Related Categories

Artificial Intelligence

Software Development

Business

System

Database

SourceForge

Create a Project
Open Source Software
Business Software
Top Downloaded Projects

Company

About
Team
SourceForge Headquarters
1320 Columbia Street Suite 310
San Diego, CA 92101
+1 (858) 422-6466

Resources

Support
Site Documentation
Site Status
SourceForge Reviews

© 2025 Slashdot Media. All Rights Reserved.

Terms Privacy Opt Out Advertise

×

Thanks for helping keep SourceForge clean.

X

You seem to have CSS turned off. Please don't fill out this field.

You seem to have CSS turned off. Please don't fill out this field.

Briefly describe the problem (required):

Upload screenshot of ad (required):

Select a file, or drag & drop file here.

✔

✘

Screenshot instructions:

Click URL instructions:
Right-click on the ad, choose "Copy Link", then paste here →
(This may not be possible with some types of ads)

More information about our ad policies

Ad destination/click URL: