Showing 7 open source projects for "visual%20scraper"

View related business solutions
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • Reach Your Audience with Rise Vision, the #1 Cloud Digital Signage Software Solution Icon
    Reach Your Audience with Rise Vision, the #1 Cloud Digital Signage Software Solution

    K-12 Schools, Higher Education, Businesses, Restaurants

    Rise Vision is the #1 digital signage company, offering easy-to-use cloud digital signage software compatible with any player across multiple screens. Forget about static displays. Save time and boost sales with 500+ customizable content templates for your screens. If you ever need help, get free training and exceptionally fast support.
    Learn More
  • 1
    Self-Operating Computer

    Self-Operating Computer

    A framework to enable multimodal models to operate a computer

    ...Notably, it was the first known project to implement a multimodal model capable of viewing and controlling a computer screen. The framework supports features like Optical Character Recognition (OCR) and Set-of-Mark (SoM) prompting to enhance visual grounding capabilities. It is designed to be compatible with macOS, Windows, and Linux (with X server installed), and is released under the MIT license.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 2
    Open-AutoGLM

    Open-AutoGLM

    An open phone agent model & framework

    ...It aims to create an “AI phone agent” that can perceive on-screen content, reason about user goals, and execute sequences of taps, swipes, and text input via automated device control interfaces like ADB, enabling hands-off completion of multi-step tasks such as navigating apps, filling forms, and more. Unlike traditional automation scripts that depend on brittle heuristics, Open-AutoGLM uses pretrained large language and vision-language models to interpret visual context and natural language instructions, giving the agent robust adaptability across apps and interfaces.
    Downloads: 5 This Week
    Last Update:
    See Project
  • 3
    Android Use

    Android Use

    Automate native Android apps with AI using accessibility APIs

    android-action-kernel is an open source Python library designed to let AI agents control and automate native Android applications running on real devices or emulators. It fills a gap in automation tooling by focusing on mobile-first workflows where traditional browser or desktop-based automation doesn’t work; such as logistics, gig work, field operations, and other industries reliant on phones or tablets. The project works by using Android’s accessibility API to extract structured UI state...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 4
    airda

    airda

    airda(Air Data Agent

    airda(Air Data Agent) is a multi-smart body for data analysis, capable of understanding data development and data analysis needs, understanding data, generating data-oriented queries, data visualization, machine learning and other tasks of SQL and Python codes.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Axe Credit Portal - ACP- is axefinance’s future-proof AI-driven solution to digitalize the loan process from KYC to servicing, available as a locally hosted or cloud-based software. Icon
    Axe Credit Portal - ACP- is axefinance’s future-proof AI-driven solution to digitalize the loan process from KYC to servicing, available as a locally hosted or cloud-based software.

    Banks, lending institutions

    Founded in 2004, axefinance is a global market-leading software provider focused on credit risk automation for lenders looking to provide an efficient, competitive, and seamless omnichannel financing journey for all client segments (FI, Retail, Commercial, and Corporate.)
    Learn More
  • 5
    Universe Starter Agent

    Universe Starter Agent

    A starter agent that can solve a number of universe environments

    The universe-starter-agent repository is an archived OpenAI codebase designed as a starter reinforcement-learning agent that can interact with and solve tasks in OpenAI’s Universe environment platform. Its purpose is to serve as a baseline or reference implementation so researchers or developers can see how to build agents that operate in real-time, visual environments (e.g., games, browser apps) via pixel observations and keyboard/mouse actions. Under the hood, this starter agent implements a version of the A3C (Asynchronous Advantage Actor-Critic) algorithm, adapted for the specific challenges of Universe environments (e.g., network latency, VNC streaming, asynchronous observations). ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    aCompute

    aCompute

    Aims to enable researcher to tap in to mobile computing capability

    This is a software agent based computing program that will enable researchers and other users to tap in computing power of machine available by sharing work load on the fly with zero configuration on network & resources A self organizing agent program that will understand network and its resource. where as the only job left to researcher is to split up jobs in several chunks of programs either parallel or sequential jobs and go issue the job (A visual Modeler or Scripting support need to be yet designed) Software agents will automatically manage the rest or resource management, sharing , cloning of tasks etc. new resources can be added and removed from the system on fly; in layman terms the project will create an agent program that enable sharing & execution of program among all the available resources whether it be desktop, laptop, pda . thereby one can accelerate research to the very extent of resource availability with out bothering about anything... ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    Pythia is a natural language question answering system, which uses Speech Recognition and Text To Speech technologies to communicate with the user.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • Next