AskUI Vision Agent - Browse /v0.22.4 at SourceForge.net

Name	Modified	Size	Downloads / Week
README.md	2025-11-21	2.7 kB	0
v0.22.4 source code.tar.gz	2025-11-21	592.6 kB	0
v0.22.4 source code.zip	2025-11-21	753.1 kB	0
Totals: 3 Items		1.3 MB	0

What's Changed

- Feat/Add annotation function by @mlikasam-askui in https://github.com/askui/vision-agent/pull/188
- docs: add docstring to AndroidVisionAgent class by @mlikasam-askui in https://github.com/askui/vision-agent/pull/192
- feat(chat): select model of run with request params by @adi-wan-askui in https://github.com/askui/vision-agent/pull/193

🚀 Features

Element Annotation:
annotate() method: Generate interactive HTML files that visualize detected UI elements on screenshots. The generated HTML allows users to:
- View bounding boxes around all detected elements
- Hover over elements to see their names and text values
- Click on elements to copy their text values to the clipboard

```python
from askui import VisionAgent

with VisionAgent() as agent:
    # Annotate current screen and save to default 'annotations' directory
    agent.annotate()

    # Or specify custom screenshot and output directory
    agent.annotate(screenshot="screenshot.png", annotation_dir="htmls")
```

Also works with AndroidVisionAgent:

```python
from askui import AndroidVisionAgent

with AndroidVisionAgent() as agent:
    agent.annotate()
```
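To illustrate the kind of HTML that annotate() is described as producing, here is a minimal sketch, not the library's implementation. The function name `render_annotation_html` and the dict keys used for each element are hypothetical. It overlays one box per element on the screenshot, shows the name and text on hover via the `title` attribute, and copies the text on click via the browser's `navigator.clipboard.writeText`.

```python
import html
import json


def render_annotation_html(image_src, elements):
    """Return an HTML page that overlays one clickable box per element.

    `elements` is a list of dicts with the (hypothetical) keys
    name, text, xmin, ymin, xmax, ymax, given in screenshot pixels.
    """
    boxes = []
    for el in elements:
        # Hover text: escaped so it is safe inside an HTML attribute.
        tooltip = html.escape(f"{el['name']}: {el['text']}", quote=True)
        # json.dumps produces a valid JavaScript string literal;
        # html.escape then makes that literal safe inside onclick="...".
        js_text = html.escape(json.dumps(el["text"]), quote=True)
        style = (
            f"left:{el['xmin']}px;top:{el['ymin']}px;"
            f"width:{el['xmax'] - el['xmin']}px;"
            f"height:{el['ymax'] - el['ymin']}px"
        )
        boxes.append(
            f'<div class="box" style="{style}" title="{tooltip}" '
            f'onclick="navigator.clipboard.writeText({js_text})"></div>'
        )
    return (
        "<!DOCTYPE html>\n<html><head><style>"
        ".wrap{position:relative;display:inline-block}"
        ".box{position:absolute;border:2px solid red;cursor:pointer}"
        '</style></head><body><div class="wrap">'
        f'<img src="{html.escape(image_src, quote=True)}">'
        + "".join(boxes)
        + "</div></body></html>"
    )


page = render_annotation_html(
    "screenshot.png",
    [{"name": "button", "text": "Submit",
      "xmin": 10, "ymin": 20, "xmax": 110, "ymax": 60}],
)
```

Browsers generally restrict clipboard access to secure contexts, so whether the click-to-copy handler works from a locally opened file depends on the browser.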

locate_all_elements() method: Retrieve all detected elements programmatically as a list of DetectedElement objects:

```python
from askui import VisionAgent

with VisionAgent() as agent:
    detected_elements = agent.locate_all_elements()
    print(f"Found {len(detected_elements)} elements: {detected_elements}")

    # Access element properties
    for element in detected_elements:
        print(f"Name: {element.name}, Text: {element.text}")
        print(f"Position: {element.center}, Size: {element.width}x{element.height}")
```

New Data Models:
- DetectedElement: Represents a detected UI element with name, text, and bounding_box properties, plus convenience properties for center, width, and height
- BoundingBox: Represents element coordinates with xmin, ymin, xmax, ymax, plus convenience properties for width, height, and center
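The relationship between the two models can be pictured with plain dataclasses. This is an illustrative stand-in rather than the askui classes themselves: the field and property names come from the notes above, while the integer types and the integer-division rounding of center are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """Stand-in for the BoundingBox model (types are assumed)."""
    xmin: int
    ymin: int
    xmax: int
    ymax: int

    @property
    def width(self) -> int:
        return self.xmax - self.xmin

    @property
    def height(self) -> int:
        return self.ymax - self.ymin

    @property
    def center(self) -> tuple:
        # Assumption: center is rounded down to whole pixels.
        return ((self.xmin + self.xmax) // 2, (self.ymin + self.ymax) // 2)


@dataclass(frozen=True)
class DetectedElement:
    """Stand-in for the DetectedElement model."""
    name: str
    text: str
    bounding_box: BoundingBox

    # Convenience properties simply delegate to the bounding box.
    @property
    def center(self) -> tuple:
        return self.bounding_box.center

    @property
    def width(self) -> int:
        return self.bounding_box.width

    @property
    def height(self) -> int:
        return self.bounding_box.height


element = DetectedElement("button", "Submit", BoundingBox(10, 20, 110, 60))
print(element.center, element.width, element.height)
```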
Chat API Model Selection: A chat run can now specify its model through the model parameter of the run creation request, so the model can be chosen per run instead of always using the configured default model.
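The per-run override amounts to a simple fallback rule, sketched below. This is not the Chat API's code: the function name `resolve_run_model`, the dict-shaped request, and the placeholder default are all hypothetical; only the `model` parameter name comes from the notes above.

```python
from typing import Optional

# Hypothetical placeholder for whatever default model is configured.
DEFAULT_MODEL = "default-model"


def resolve_run_model(request_params: dict,
                      default_model: str = DEFAULT_MODEL) -> str:
    """Pick the model for a run: the request's `model` if given, else the default."""
    requested: Optional[str] = request_params.get("model")
    return requested if requested else default_model


# A run that names a model uses it; a run that does not falls back.
print(resolve_run_model({"model": "model-a"}))
print(resolve_run_model({}))
```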

📜 Documentation

AndroidVisionAgent class: Added a docstring with detailed parameter descriptions and usage examples (src/askui/android_agent.py:41-62).

Full Changelog: https://github.com/askui/vision-agent/compare/v0.22.3...v0.22.4

Source: README.md, updated 2025-11-21

AskUI Vision Agent Files

Enable AI to control your desktop, mobile and HMI devices
