Not a ROS2 replacement. An intelligent upper-computer layer above ROS2.
VLAClaw keeps stable motion inside robot-side ROS2 nodes while adding an OpenClaw layer for observation, planning, skill validation, and execution monitoring.
Traditional upper computer
VLAClaw intelligent upper computer
Interface-first validation for reliable robot-agent deployment.
VLAClaw organizes the engineering path around interfaces, skill contracts, safety checks, and repeatable ROS2 robot workflows so each capability can be tested and extended systematically.
Interface validation
Goal: Prove that OpenClaw can connect to ROS2 robots through rosbridge.
Evidence: WebSocket endpoint, JSON pub/sub examples, sensor topics, and command templates.
Next: Validate against robot-side rosbridge_server and platform-specific topic names.
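To make the JSON pub/sub evidence concrete, here is a minimal sketch of the three rosbridge protocol frames a client would send: subscribe, advertise, and publish. The topic names (`/imu`, `/cmd_vel`) and the helper function names are assumptions for illustration; real topic names are platform-specific, as noted above.

```python
import json

# Assumed endpoint shape: ws://<robot-ip>:9090 (9090 is the rosbridge_websocket default port).


def subscribe_msg(topic: str, msg_type: str, throttle_ms: int = 0) -> str:
    """Ask rosbridge to forward messages from a ROS2 topic to this client."""
    return json.dumps(
        {"op": "subscribe", "topic": topic, "type": msg_type, "throttle_rate": throttle_ms}
    )


def advertise_msg(topic: str, msg_type: str) -> str:
    """Announce that this client intends to publish on a topic."""
    return json.dumps({"op": "advertise", "topic": topic, "type": msg_type})


def publish_twist_msg(topic: str, linear_x: float, angular_z: float) -> str:
    """Build a publish frame carrying a geometry_msgs Twist payload."""
    twist = {
        "linear": {"x": linear_x, "y": 0.0, "z": 0.0},
        "angular": {"x": 0.0, "y": 0.0, "z": angular_z},
    }
    return json.dumps({"op": "publish", "topic": topic, "msg": twist})


if __name__ == "__main__":
    # Hypothetical topic names; check the robot's own topic list first.
    print(subscribe_msg("/imu", "sensor_msgs/msg/Imu", throttle_ms=100))
    print(advertise_msg("/cmd_vel", "geometry_msgs/msg/Twist"))
    print(publish_twist_msg("/cmd_vel", 0.2, 0.0))
```

Each string would be sent as one text frame over the WebSocket connection. Building the frames separately from the transport keeps them easy to test without a robot attached.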
Skill abstraction
Goal: Turn scattered actions into an AI-callable capability layer.
Evidence: skills.yaml schema, action-group mapping, parameter limits, and execution feedback.
Next: Register 5-8 core skills first: stop, stand, walk, turn, sit_wave, status, camera, IMU.
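As a rough illustration of what a skill schema with parameter limits might look like, the sketch below registers a few of the core skills as Python data and validates a call against them. The field names, limit values, and the `validate_call` helper are hypothetical; the actual skills.yaml schema may differ.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ParamLimit:
    minimum: float
    maximum: float


@dataclass(frozen=True)
class SkillSpec:
    name: str
    action_group: Optional[str] = None  # authored motion file, if any
    params: Dict[str, ParamLimit] = field(default_factory=dict)


# Hypothetical registry; the limits are placeholders, not tuned values.
SKILLS: Dict[str, SkillSpec] = {
    "stop": SkillSpec("stop"),
    "walk": SkillSpec(
        "walk",
        params={"speed_mps": ParamLimit(0.0, 0.3), "duration_s": ParamLimit(0.1, 5.0)},
    ),
    "turn": SkillSpec("turn", params={"angle_deg": ParamLimit(-90.0, 90.0)}),
    "sit_wave": SkillSpec("sit_wave", action_group="sit_wave.d6a"),
}


def validate_call(name: str, args: Dict[str, float]) -> List[str]:
    """Return a list of problems; an empty list means the call is acceptable."""
    spec = SKILLS.get(name)
    if spec is None:
        return [f"unknown skill: {name}"]
    problems = [f"unexpected parameter: {key}" for key in args if key not in spec.params]
    for key, limit in spec.params.items():
        if key not in args:
            problems.append(f"missing parameter: {key}")
        elif not (limit.minimum <= args[key] <= limit.maximum):
            problems.append(f"{key}={args[key]} outside [{limit.minimum}, {limit.maximum}]")
    return problems
```

Returning a list of problems rather than a boolean gives the agent something specific to replan from when a call is rejected.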
Safety boundary
Goal: Keep model output at the skill layer rather than raw motor control.
Evidence: Skill Server checks speed, duration, robot posture, IMU stability, and emergency stop.
Next: Test refusal behavior and recovery flow on repeated unsafe commands.
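The checks listed above could be combined into a single gate that runs before any motion skill is dispatched. The following is a sketch under stated assumptions: the posture labels, the threshold values, and the `check_motion` name are all hypothetical and would need to be set per robot.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class RobotState:
    posture: str        # hypothetical labels, e.g. "standing", "sitting", "fallen"
    tilt_deg: float     # largest absolute roll/pitch angle derived from the IMU
    estop_active: bool


# Placeholder limits for illustration only.
MAX_SPEED_MPS = 0.3
MAX_DURATION_S = 5.0
MAX_TILT_DEG = 15.0


def check_motion(speed_mps: float, duration_s: float, state: RobotState) -> Tuple[bool, str]:
    """Return (allowed, reason); the first failing check decides the outcome."""
    if state.estop_active:
        return False, "emergency stop is active"
    if state.posture != "standing":
        return False, f"posture is {state.posture}, expected standing"
    if state.tilt_deg > MAX_TILT_DEG:
        return False, "IMU tilt exceeds stability limit"
    if not 0.0 <= speed_mps <= MAX_SPEED_MPS:
        return False, "speed outside allowed range"
    if not 0.0 < duration_s <= MAX_DURATION_S:
        return False, "duration outside allowed range"
    return True, "ok"
```

Checking emergency stop and posture before the numeric limits means a refusal reports the most safety-relevant reason first, which is useful when testing refusal behavior.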
Demo workflow readiness
Goal: Package voice, vision, action-group, and developer-integration flows into repeatable robot demos.
Evidence: Voice greeting, visual interaction, safe action group, and developer integration workflows.
Next: Add robot logs, field footage, and deployment notes as each workflow is verified.
VLAClaw turns the upper computer into an embodied agent layer.
OpenClaw observes sensor topics, reasons with VLM/LLM models, selects validated skills, sends commands through rosbridge, and replans from execution feedback.
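The observe, plan, validate, execute, and replan cycle described above can be sketched as a small loop. This is an illustrative skeleton, not the OpenClaw runtime: `run_agent_loop` and the stub functions in `demo` are hypothetical stand-ins for the real observation, model, validation, and rosbridge components.

```python
from typing import Callable, Dict, List, Optional

Observation = Dict[str, object]
SkillCall = Dict[str, object]
Feedback = Dict[str, object]


def run_agent_loop(
    observe: Callable[[], Observation],
    plan: Callable[[Observation, Optional[Feedback]], Optional[SkillCall]],
    validate: Callable[[SkillCall], bool],
    execute: Callable[[SkillCall], Feedback],
    max_steps: int = 5,
) -> List[Feedback]:
    """Run a bounded loop; the planner sees the previous step's feedback."""
    history: List[Feedback] = []
    feedback: Optional[Feedback] = None
    for _ in range(max_steps):
        call = plan(observe(), feedback)
        if call is None:  # planner signals that the task is finished
            break
        if validate(call):
            feedback = execute(call)
        else:
            feedback = {"skill": call["skill"], "status": "refused"}
        history.append(feedback)
    return history


def demo() -> List[Feedback]:
    """Stub components: the planner lowers the speed after a refusal."""
    def observe() -> Observation:
        return {"posture": "standing"}

    def plan(obs: Observation, feedback: Optional[Feedback]) -> Optional[SkillCall]:
        if feedback is None:
            return {"skill": "walk", "speed_mps": 1.0}
        if feedback["status"] == "refused":
            return {"skill": "walk", "speed_mps": 0.2}
        return None

    def validate(call: SkillCall) -> bool:
        return call["speed_mps"] <= 0.3

    def execute(call: SkillCall) -> Feedback:
        return {"skill": call["skill"], "status": "succeeded"}

    return run_agent_loop(observe, plan, validate, execute)
```

In the demo, the first plan is refused because the speed is too high, and the second plan succeeds after the planner adjusts it. The `max_steps` bound keeps a misbehaving planner from looping indefinitely.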
Perception In
Subscribe to camera, IMU, radar, odometry, and robot status topics as agent observations.
Intelligence Core
Use OpenClaw, VLM, LLM, memory, and safety rules to convert user intent into a skill plan.
Skill Execution Out
Call bounded robot skills that are validated before ROS2 lower controllers execute motion.
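For the Perception In side, one possible approach is a lookup table from ROS2 topic names to observation channels, applied to each incoming rosbridge frame. The topic names in `CHANNELS` and the record layout returned by `to_observation` are assumptions for illustration; real names depend on the robot.

```python
import json
import time
from typing import Dict, Optional

# Hypothetical topic-to-channel map; replace with the robot's actual topics.
CHANNELS: Dict[str, str] = {
    "/camera/image_raw/compressed": "camera",
    "/imu": "imu",
    "/scan": "radar",
    "/odom": "odometry",
    "/robot_status": "status",
}


def to_observation(raw: str, now: Optional[float] = None) -> Optional[dict]:
    """Convert one incoming rosbridge 'publish' frame into an observation record.

    Returns None for other frame types and for topics that are not mapped.
    """
    frame = json.loads(raw)
    if frame.get("op") != "publish":
        return None
    channel = CHANNELS.get(frame.get("topic"))
    if channel is None:
        return None
    return {
        "channel": channel,
        "received_at": time.time() if now is None else now,
        "data": frame.get("msg", {}),
    }
```

Dropping unmapped topics at this boundary means the agent only ever sees channels that were deliberately registered.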
A practical path from an existing ROS2 robot to an OpenClaw-controlled demo.
This is the implementation path for customers and developers: VLAClaw works with existing robot controllers and maps their capabilities into observations and validated skills.
Audit robot interfaces
List ROS2 topics, services, action groups, sensor streams, and safety commands already available on the robot.
Enable rosbridge
Run rosbridge_websocket on the robot and expose a stable WebSocket endpoint for upper-computer access.
Map observations
Normalize camera, IMU, radar, odometry, and status topics into OpenClaw-readable observation channels.
Register skills
Convert motion commands, action groups, services, and interaction behaviors into skills with parameters and limits.
Run bounded demos
Start with single-command and short workflow demos before moving to multi-step planning.
Measure and harden
Log latency, success rate, refusal behavior, recovery path, and feedback quality before deployment.
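For the measure-and-harden step, a small in-memory recorder is enough to start tracking latency, success rate, and refusals per run. The sketch below is a hypothetical helper (`SkillMetrics` and the outcome labels are assumptions), not a built-in VLAClaw component.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import List


@dataclass
class SkillMetrics:
    records: List[dict] = field(default_factory=list)

    def log(self, skill: str, latency_s: float, outcome: str) -> None:
        """Record one attempt; outcome is 'succeeded', 'failed', or 'refused'."""
        self.records.append({"skill": skill, "latency_s": latency_s, "outcome": outcome})

    def summary(self) -> dict:
        """Aggregate counts; rates are computed over executed (non-refused) calls."""
        executed = [r for r in self.records if r["outcome"] != "refused"]
        succeeded = [r for r in executed if r["outcome"] == "succeeded"]
        return {
            "attempts": len(self.records),
            "refusals": len(self.records) - len(executed),
            "success_rate": len(succeeded) / len(executed) if executed else None,
            "mean_latency_s": mean(r["latency_s"] for r in executed) if executed else None,
        }
```

Refusals are counted separately from failures because a refusal is the safety layer working as intended, while a failure is an execution problem.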
Designed for ROS2 robots first, then extended across embodiments.
The platform is strongest where the robot already exposes ROS2 topics or can bridge its controller into a small set of commands, sensors, and status messages.
Robot bodies
Quadruped robot dog
Primary MVP: Locomotion, action groups, camera, IMU, greeting demos.
Robotic arm / gripper
Planned extension: Skill schema supports grasp, release, pose, and vision-guided actions.
Interactive display robot
Ready to model: Expression display and speech skills can be integrated as non-motion skills.
Control interfaces
ROS2 topic publish / subscribe
Core path: Sensor observation and motion command surface.
ROS2 service call
Supported pattern: Useful for buzzer, mode switching, action triggering, and status queries.
Action group files
Skill source: .d6a or platform-specific motion files become validated semantic skills.
Compute placement
Raspberry Pi 5 gateway
Edge target: Handles connection, lightweight preprocessing, skill dispatch, and local services.
Laptop / workstation host
Development target: Preferred for coding, debugging, OpenClaw iteration, and visual inspection.
Cloud model fallback
Optional: Used for heavier VLM/LLM reasoning, long-context planning, and report generation.
OpenClaw x rosbridge x ROS2, separated for safety and deployability.
The architecture separates intelligence, communication, and real-time execution so developers can build robot agents without installing ROS2 on every host.
Human Interaction
Voice command, text instruction, web dashboard, and developer API.
OpenClaw / VLAClaw Upper Computer
Agent runtime for multimodal understanding, planning, memory, and safety validation.
ROS2 Bridge Layer
rosbridge_websocket exposes ROS2 pub/sub and services as JSON messages over WebSocket.
Robot Lower Computer
Robot-side ROS2 nodes handle sensors, action groups, motors, servos, displays, and arms.
Observation In
Command Out
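Services cross the bridge layer the same way topics do: as JSON frames. The sketch below builds a rosbridge `call_service` frame and matches the `service_response` frame that comes back by its `id`. The `/buzzer` service name in the usage note and both helper names are assumptions; passing `args` as an object keyed by request field name follows how common rosbridge clients send them.

```python
import itertools
import json
from typing import Optional, Tuple

_ids = itertools.count(1)


def call_service_msg(service: str, args: Optional[dict] = None) -> Tuple[str, str]:
    """Build a call_service frame; returns (call_id, json_text)."""
    call_id = f"call-{next(_ids)}"
    frame = {"op": "call_service", "service": service, "id": call_id}
    if args:
        frame["args"] = args  # request fields keyed by name
    return call_id, json.dumps(frame)


def parse_service_response(raw: str, expected_id: str) -> Optional[Tuple[bool, dict]]:
    """Return (ok, values) for the matching service_response frame, else None."""
    frame = json.loads(raw)
    if frame.get("op") != "service_response" or frame.get("id") != expected_id:
        return None
    return bool(frame.get("result", False)), frame.get("values", {})
```

For example, `call_service_msg("/buzzer", {"data": True})` would suit a hypothetical buzzer service whose request has a single boolean `data` field. Matching on `id` lets several service calls be in flight on one WebSocket connection without mixing up their replies.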
Skill-based control, not unsafe motor-level generation.
VLAClaw treats action groups, ROS2 commands, services, and navigation behaviors as semantic robot skills. The model chooses skills and parameters, not raw actuator values.
Direct AI-to-Motor Control
VLAClaw Skill-Based Control
A product matrix for robot-side embodied intelligence.
VLAClaw is a platform stack rather than a single remote-control page. Each module has a clear role in the upper-computer agent layer.
OpenClaw Runtime
Agent loop, tool calling, task planning, context memory, and cloud fallback.
ROSBridge Adapter
WebSocket client for ROS2 topic pub/sub, service calls, and JSON serialization.
Perception Hub
Camera, IMU, radar, odometry, and status streams normalized as observations.
Skill Server
Skill registry, parameter validation, execution dispatch, and feedback tracking.
ActionGroup Manager
Converts authored action groups such as sit_wave.d6a into callable robot skills.
Safety Guard
Speed limits, posture checks, emergency stop, and repeated-command filtering.
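Repeated-command filtering in the Safety Guard could be approached as a short time-window check on identical skill calls. The sketch below is one hypothetical way to do it; the `RepeatFilter` class, the one-second default window, and the choice to never filter `stop` are assumptions rather than documented behavior.

```python
import time
from typing import Callable, Dict, Tuple


class RepeatFilter:
    """Suppress identical skill calls that arrive within a short window."""

    def __init__(self, window_s: float = 1.0, clock: Callable[[], float] = time.monotonic):
        self.window_s = window_s
        self.clock = clock
        self._last_seen: Dict[Tuple, float] = {}

    def allow(self, skill: str, params: dict) -> bool:
        # Assumption: a stop request must always pass, however often it repeats.
        if skill == "stop":
            return True
        # Parameter values must be hashable (numbers, strings) to form the key.
        key = (skill, tuple(sorted(params.items())))
        now = self.clock()
        last = self._last_seen.get(key)
        # The timestamp is refreshed even on a blocked call, so a continuous
        # stream of repeats stays blocked until it pauses for a full window.
        self._last_seen[key] = now
        return last is None or (now - last) >= self.window_s
```

Injecting the clock makes the filter testable without sleeping, and a different parameter set counts as a different command, so `walk` at two speeds is not treated as a repeat.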
Reliability-first roadmap from voice MVP to multi-robot orchestration.
The roadmap is intentionally staged: build safe single-robot skills first, then add perception, recovery, navigation, and multi-embodiment orchestration.