prithivMLmods committed · Commit 982b3a3 · verified · 1 Parent(s): a09f09f

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -15,7 +15,7 @@ tags:
 
  # **GELab-Zero-4B-preview-GGUF**
 
- > The GELab-Zero-4B-preview from stepfun-ai is a 4B-parameter multimodal GUI agent model fine-tuned on Qwen3-VL-4B-Instruct, optimized for autonomous control of Android devices through visual understanding and actions like clicking, typing, swiping, and waiting, enabling zero-shot execution of complex, multi-step tasks across apps in categories such as food, transportation, shopping, and social without app-specific training. Designed for local deployment on consumer-grade hardware with low latency and full privacy, it excels in GUI navigation, UI element detection, localization, and open-world generalization on dynamic interfaces, achieving state-of-the-art results like 73.4% accuracy on AndroidDaily benchmarks—surpassing UI-TARS-1.5 by 26.4% and outperforming GPT-4o by 3.7x. Part of the open-source GELab-Zero project, it includes plug-and-play infrastructure for ADB connections, dependency management, task recording/replay, ReAct loops, multi-agent collaboration, and supports formats like GGUF for efficient inference via Ollama or llama.cpp on everyday machines.
+ > The [GELab-Zero-4B-preview](https://huggingface.co/stepfun-ai/GELab-Zero-4B-preview) from stepfun-ai is a 4B-parameter multimodal GUI agent model fine-tuned on Qwen3-VL-4B-Instruct, optimized for autonomous control of Android devices through visual understanding and actions like clicking, typing, swiping, and waiting, enabling zero-shot execution of complex, multi-step tasks across apps in categories such as food, transportation, shopping, and social without app-specific training. Designed for local deployment on consumer-grade hardware with low latency and full privacy, it excels in GUI navigation, UI element detection, localization, and open-world generalization on dynamic interfaces, achieving state-of-the-art results like 73.4% accuracy on AndroidDaily benchmarks—surpassing UI-TARS-1.5 by 26.4% and outperforming GPT-4o by 3.7x. Part of the open-source GELab-Zero project, it includes plug-and-play infrastructure for ADB connections, dependency management, task recording/replay, ReAct loops, multi-agent collaboration, and supports formats like GGUF for efficient inference via Ollama or llama.cpp on everyday machines.
 
  ## GELab-Zero-4B-preview [GGUF]