| Name | Modified | Size | Downloads / Week |
|---|---|---|---|
| README.md | 2025-11-21 | 2.7 kB | |
| v0.22.4 source code.tar.gz | 2025-11-21 | 592.6 kB | |
| v0.22.4 source code.zip | 2025-11-21 | 753.1 kB | |
| Totals: 3 Items | 1.3 MB | 0 | |
What's Changed
- Feat/Add annotation function by @mlikasam-askui in https://github.com/askui/vision-agent/pull/188
- docs: add docstring to AndroidVisionAgent class by @mlikasam-askui in https://github.com/askui/vision-agent/pull/192
- feat(chat): select model of run with request params by @adi-wan-askui in https://github.com/askui/vision-agent/pull/193
🚀 Features
- Element Annotation:
  - `annotate()` method: Generate interactive HTML files that visualize detected UI elements on screenshots. The generated HTML allows users to:
    - View bounding boxes around all detected elements
    - Hover over elements to see their names and text values
    - Click on elements to copy their text values to the clipboard
```python
from askui import VisionAgent

with VisionAgent() as agent:
    # Annotate current screen and save to default 'annotations' directory
    agent.annotate()

    # Or specify custom screenshot and output directory
    agent.annotate(screenshot="screenshot.png", annotation_dir="htmls")
```
Also works with AndroidVisionAgent:
```python
from askui import AndroidVisionAgent

with AndroidVisionAgent() as agent:
    agent.annotate()
```
  - `locate_all_elements()` method: Retrieve all detected elements programmatically as a list of `DetectedElement` objects:
```python
from askui import VisionAgent

with VisionAgent() as agent:
    detected_elements = agent.locate_all_elements()
    print(f"Found {len(detected_elements)} elements: {detected_elements}")

    # Access element properties
    for element in detected_elements:
        print(f"Name: {element.name}, Text: {element.text}")
        print(f"Position: {element.center}, Size: {element.width}x{element.height}")
```
- New Data Models:
  - `DetectedElement`: Represents a detected UI element with `name`, `text`, and `bounding_box` properties, plus convenience properties for `center`, `width`, and `height`
  - `BoundingBox`: Represents element coordinates with `xmin`, `ymin`, `xmax`, `ymax`, plus convenience properties for `width`, `height`, and `center`
- Chat API Model Selection: Chat runs can now specify which model to use via the `model` parameter in the run creation request, allowing dynamic model selection per run instead of using only the configured default model.
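
To make the `DetectedElement` and `BoundingBox` models listed above more concrete, here is a minimal sketch of how such convenience properties could be derived from the four corner coordinates. It is an illustrative stand-in written with plain dataclasses, not the library's actual implementation: the integer coordinate types, the integer-division midpoint, and the `(x, y)` tuple shape of `center` are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """Illustrative stand-in for the release's BoundingBox (not the askui class)."""

    xmin: int
    ymin: int
    xmax: int
    ymax: int

    @property
    def width(self) -> int:
        return self.xmax - self.xmin

    @property
    def height(self) -> int:
        return self.ymax - self.ymin

    @property
    def center(self) -> tuple[int, int]:
        # Assumption: integer midpoint; the real library may round differently.
        return ((self.xmin + self.xmax) // 2, (self.ymin + self.ymax) // 2)


@dataclass(frozen=True)
class DetectedElement:
    """Illustrative stand-in for the release's DetectedElement (not the askui class)."""

    name: str
    text: str
    bounding_box: BoundingBox

    # Convenience properties simply delegate to the bounding box.
    @property
    def center(self) -> tuple[int, int]:
        return self.bounding_box.center

    @property
    def width(self) -> int:
        return self.bounding_box.width

    @property
    def height(self) -> int:
        return self.bounding_box.height


# Hypothetical element used only for demonstration.
button = DetectedElement(
    name="button",
    text="Submit",
    bounding_box=BoundingBox(xmin=10, ymin=20, xmax=110, ymax=60),
)
print(button.center, button.width, button.height)
```

Delegating `center`, `width`, and `height` to the bounding box keeps the coordinates as the single source of truth, which is presumably why the release describes them as convenience properties on both models.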
📜 Documentation
- AndroidVisionAgent class: Added comprehensive docstring with detailed parameter descriptions and usage examples for the `AndroidVisionAgent` class (`src/askui/android_agent.py:41-62`).
Full Changelog: https://github.com/askui/vision-agent/compare/v0.22.3...v0.22.4