Search Results for "human vision enhancer"

Showing 45 open source projects for "human vision enhancer"

1

RuView

Turn WiFi signals into real-time human sensing and spatial awareness.

RuView is an edge AI perception system that transforms ordinary WiFi signals into real-time environmental sensing and human pose estimation. Built on the concept of WiFi DensePose, it analyzes disturbances in WiFi Channel State Information (CSI) caused by human movement to reconstruct body position, breathing patterns, heart rate, and presence. Unlike traditional vision systems, RuView operates without cameras, wearables, or cloud connectivity, making it a privacy-first sensing solution. ...

Downloads: 59 This Week

Last Update: 5 days ago
See Project
2

python realtime human deteciton

human detection using yolov8

I would give you more, but I'm tired and it's 4:26 am. The YouTube video is slightly outdated but has more info. https://www.youtube.com/watch?v=UAkjyeTOyo4

Downloads: 0 This Week

Last Update: 2025-05-20
See Project
3

Sapiens

High-resolution models for human tasks

Sapiens is a research framework from Meta AI focused on embodied intelligence and human-like multimodal learning, aiming to train agents that can perceive, reason, and act in complex environments. It integrates sensory inputs such as vision, audio, and proprioception into a unified learning architecture that allows agents to understand and adapt to their surroundings dynamically. The project emphasizes long-horizon reasoning and cross-modal grounding—connecting language, perception, and action into a single agentic model capable of following abstract goals. ...

Downloads: 0 This Week

Last Update: 2025-10-07
See Project
4

DriveLM

Driving with Graph Visual Question Answering

...The system includes DriveLM-Data, a dataset built on driving environments such as nuScenes and CARLA, where human-written reasoning steps connect different layers of driving tasks. This design allows models to learn relationships between objects, behaviors, and navigation decisions through graph-structured logic.

Downloads: 1 This Week

Last Update: 2026-03-09
See Project
5

dots.ocr

Multilingual Document Layout Parsing in a Single Vision-Language Model

dots.ocr is a cutting-edge multilingual document parsing system built on a unified vision-language model that combines layout detection, text recognition, and structural understanding into a single architecture. Unlike traditional OCR pipelines that rely on multiple specialized components, dots.ocr integrates these processes end-to-end, reducing error propagation and improving consistency across tasks. The model is designed to recognize virtually any human script, making it highly effective for global and low-resource language scenarios. ...

Downloads: 0 This Week

Last Update: 2026-03-24
See Project
6

CogView4

CogView4, CogView3-Plus and CogView3 (ECCV 2024)

...It emphasizes bilingual usability, making it well-suited for cross-lingual multimodal applications. The model also supports fine-tuning and downstream customization, extending its applicability to creative content generation, human–computer interaction, and research on vision-language alignment.

Downloads: 0 This Week

Last Update: 6 days ago
See Project
7

autoMate

AI tool for automating desktop tasks via natural language input

autoMate is an AI-powered local automation tool designed to enable users to control and automate their computers using natural language instructions instead of traditional scripting or rule-based systems. It combines large language models with computer vision techniques to interpret user intent and understand on-screen content, allowing it to interact with graphical interfaces similarly to a human user. autoMate follows an observe-decide-act workflow, where it analyzes the screen, plans actions, and executes them through simulated input such as mouse clicks and keyboard events. Unlike conventional RPA tools that require predefined workflows, autoMate dynamically adapts to tasks by making autonomous decisions based on the current interface state. autoMate emphasizes local execution, meaning all processing happens on the user’s machine to maintain privacy and data security.

Downloads: 1 This Week

Last Update: 2026-03-31
See Project
8

The FreeMoCap Project

Free Motion Capture for Everyone

FreeMoCap is an open-source markerless motion capture system that enables users to record human movement using ordinary cameras and convert the footage into usable 3D motion data. The project’s goal is to democratize motion capture by removing the need for expensive suits or proprietary studio hardware, instead relying on computer vision and pose estimation pipelines. It processes synchronized video feeds to reconstruct skeletal motion, which can then be exported for animation, biomechanics research, or creative projects. ...

Downloads: 12 This Week

Last Update: 3 days ago
See Project
9

ElectronBot

ElectronBot is a mini desktop robot

...The creator provides full source materials—mechanical (3D printed or CNC parts), electronics (PCBs, custom boards), firmware and drivers—so someone can build or modify the robot themselves. The platform also integrates computer vision or gesture sensing (for example, keypoint detection of human pose) so the bot can respond dynamically to a person’s presence or movement.

Downloads: 1 This Week

Last Update: 2025-11-05
See Project
10

VisualGLM-6B

Chinese and English multimodal conversational language model

VisualGLM-6B is an open-source multimodal conversational language model developed by ZhipuAI that supports both images and text in Chinese and English. It builds on the ChatGLM-6B backbone, with 6.2 billion language parameters, and incorporates a BLIP2-Qformer visual module to connect vision and language. In total, the model has 7.8 billion parameters. Trained on a large bilingual dataset — including 30 million high-quality Chinese image-text pairs from CogView and 300 million English pairs — VisualGLM-6B is designed for image understanding, description, and question answering. Fine-tuning on long visual QA datasets further aligns the model’s responses with human preferences. ...

Downloads: 3 This Week

Last Update: 12 hours ago
See Project
11

MolmoWeb

Open multimodal web agent built by Ai2

...Unlike traditional automation tools that rely on structured HTML parsing or predefined APIs, MolmoWeb operates directly from screenshots of web pages, interpreting visual content in the same way a human user would. This approach allows it to generalize across different websites without requiring site-specific integrations, making it highly adaptable to diverse web environments.

Downloads: 0 This Week

Last Update: 2026-04-10
See Project
12

DINOv2

PyTorch code and models for the DINOv2 self-supervised learning

DINOv2 is a self-supervised vision learning framework that produces strong, general-purpose image representations without using human labels. It builds on the DINO idea of student–teacher distillation and adapts it to modern Vision Transformer backbones with a carefully tuned recipe for data augmentation, optimization, and multi-crop training. The core promise is that a single pretrained backbone can transfer well to many downstream tasks—from linear probing on classification to retrieval, detection, and segmentation—often requiring little or no fine-tuning. ...

Downloads: 0 This Week

Last Update: 2026-02-24
See Project
13

IPFS

IPFS implementation in Go

...IPFS keeps every version of your files and makes it simple to set up resilient networks for mirroring data. The Internet has turbocharged innovation by being one of the great equalizers in human history, but increasing consolidation of control threatens that progress. IPFS stays true to the original vision of an open, flat web by delivering technology to make that vision a reality.

Downloads: 1 This Week

Last Update: 2 days ago
See Project
14

UI-TARS

UI-TARS-desktop version that can operate on your local personal device

UI-TARS is an open-source multimodal “GUI agent” created by ByteDance: a model designed to perceive raw screenshots (or rendered UI frames), reason about what needs to be done, and then perform real interactions with graphical user interfaces (GUIs) — like clicking, typing, navigating menus — across desktop, browser, mobile, or game environments. Rather than relying on rigid, manually scripted UI automation, UI-TARS uses a unified vision-language model (VLM) that integrates perception, reasoning, grounding, and action into one end-to-end framework: it “thinks before acting,” enabling flexible, general-purpose automation. This allows it to perform complex, multi-step tasks such as filling forms, downloading files, navigating applications, and even controlling in-game actions — all by understanding the UI as a human would. ...

Downloads: 2 This Week

Last Update: 2025-12-01
See Project
15

The AI Scientist-v2

Workshop-Level Automated Scientific Discovery via Agentic Tree Search

...The system also integrates automated review mechanisms, including vision-language feedback loops, to iteratively refine the quality of generated research outputs.

Downloads: 1 This Week

Last Update: 2026-03-27
See Project
16

SteadyDancer

Harmonized and Coherent Human Image Animation

SteadyDancer is a research-oriented motion stabilization and dancer tracking system designed to analyze and correct motion in videos, making captured performances appear smoother and more stable while preserving expressiveness. It employs computer vision and motion modeling to estimate and reduce unwanted jitters, shakes, or camera wobbles — particularly in dance or movement sequences where traditional smoothing would distort intentional motion. By differentiating between intentional...

Downloads: 0 This Week

Last Update: 2026-02-05
See Project
17

Page Agent

JavaScript in-page GUI agent. Control web interfaces

Page Agent is an open-source in-page AI agent framework that allows developers to control and interact with web interfaces using natural language directly within the browser. Unlike traditional browser automation tools, it operates entirely through in-page JavaScript, eliminating the need for browser extensions, headless browsers, or external automation environments. The system enables users to manipulate the DOM through text-based commands, allowing complex workflows such as form filling,...

Downloads: 0 This Week

Last Update: 2026-04-14
See Project
18

AppAgent

Multimodal Agents as Smartphone Users, an LLM-based multimodal agent

...The system allows an AI agent to interpret visual information from the screen and translate natural language instructions into actions such as tapping, swiping, and navigating between application screens. Instead of requiring backend access to application APIs, the framework interacts with apps the same way a human user would, making it compatible with a wide variety of mobile applications. AppAgent combines vision capabilities with language reasoning to understand interface elements and determine which actions are required to accomplish a task. The system also includes mechanisms for exploration and learning, allowing the agent to analyze user interface layouts and build structured knowledge about how different apps function.

Downloads: 0 This Week

Last Update: 2026-03-04
See Project
19

Seamless Communication

Foundational Models for State-of-the-Art Speech and Text Translation

Seamless Communication is a research project focused on building more integrated, low-latency multimodal communication between humans and AI agents. The motivation is to move beyond “text in, text out” and enable direct, live, multi-turn exchange involving language, gesture, gaze, vision, and modality switching without user friction. The system architecture includes a real-time multimodal signal pipeline for audio, video, and sensor data, a dialog manager that can decide when to act (speak, gesture, point) or query, and a cross-modal reasoning layer that fuses perception with semantic context. The research prototype includes components for visual grounding (understanding when a user references something in view), gesture recognition and synthesis, and turn-taking mechanisms that mirror human conversational timing. ...

Downloads: 0 This Week

Last Update: 2025-10-06
See Project
20

OAGI Python SDK

Python SDK for the Computer Use model Lux, developed by OpenAGI

OAGI Python SDK is a Python client library for the Lux computer-use model that turns Lux into a programmable automation layer for operating human-facing software via vision and actions. It exposes the OAGI API in an ergonomic way, letting you trigger Lux in three main modes: Tasker for precise scripted sequences, Actor for fast one-shot tasks, and Thinker for open-ended, multi-step objectives. The SDK is designed around “computer use” as a paradigm, where the AI actually navigates interfaces, clicks, types, scrolls, and reads the screen through screenshots instead of only calling APIs. ...

Downloads: 0 This Week

Last Update: 2026-02-22
See Project
21

AI Employe

Create browser automation as if you were teaching a human using GPT-4

Try without Firebase authentication (temporary solution). Our stack consists of Next.js, Rust, Postgres, MeiliSearch, and Firebase Auth for authentication. Please sign up for a Firebase account and create a project. There are several techniques for this, ranging from sending a shortened form of HTML to GPT-3, creating a bounding box with IDs and sending it to GPT-4-vision to take actions, or directly asking GPT-4-vision to obtain the X and Y coordinates of the element. However, none of these...

Downloads: 0 This Week

Last Update: 2024-08-19
See Project
22

MDp

autonomous system + observable + understandable + controllable + AI

..." #1 An OS/Runtime for Innovation #2 A Self-Evolving Innovation Studio #3 A Distributed Yet Lightweight Cluster #4 A learning ecosystem #5 A living and fractal system #6 Optimization and lightweight design #7 An ethical and open framework #8 Memory and Self-Monitoring #9 Human and Social Development Tool #10 Prototype and Realistic Vision V0.1 → v0.1.7: dashboard, MC.py, task queue, simulated agents. Systemic and modular vision, but achievable step-by-step. # One-sentence summary: MD Portable + Venture Studio AI is a modular, self-resilient, and educational AI-orchestrated ecosystem capable of managing projects, capsules, and agents in a portable, multi-OS, and multi-studio environment, while remaining lightweight, educational, and open to the community. ...

Downloads: 0 This Week

Last Update: 2026-03-20
See Project
23

AI-Aimbot

CS2, Valorant, Fortnite, APEX, every game

AI-Aimbot is a computer vision project that demonstrates how artificial intelligence can be used to automatically identify and target opponents in video games. The system uses an object detection model based on the YOLOv5 architecture to detect human-shaped characters in gameplay screenshots or video frames. Once a target is identified, the program automatically adjusts the player’s aim toward the detected target, effectively automating the aiming process in first-person shooter games. ...

Downloads: 488 This Week

Last Update: 2026-03-15
See Project
24

MindForger

Thinking notebook and Markdown editor

...It is actually more than an editor or IDE - it's a human-mind-inspired personal knowledge management tool.

1 Review

Downloads: 13 This Week

Last Update: 2024-08-17
See Project
25

PIFuHD

High-Resolution 3D Human Digitization from A Single Image

PIFuHD (Pixel-Aligned Implicit Function for 3D human reconstruction at high resolution) is a method and codebase to reconstruct high-fidelity 3D human meshes from a single image. It extends prior PIFu work by increasing resolution and detail, enabling fine geometry in cloth folds, hair, and subtle surface features. The method operates by learning an implicit occupancy / surface function conditioned on the image and camera projection; at inference time it queries dense points to reconstruct a...

Downloads: 5 This Week

Last Update: 2025-10-06
See Project

© 2026 Slashdot Media. All Rights Reserved.