Robotics
From real-world scans to simulation-ready assets—tool-driven 3D content preparation for robot training.
Robotics Overview
Modern robotics increasingly teaches robots to manipulate objects by learning from demonstrations and by training policies in digital simulation.
The Core Pipeline: Real World → Digital Simulation → Robot Training
Robotics teams often follow a loop: digitize real-world objects and scenes, train policies in simulation, then deploy and validate on physical robots.
Why Simulation is Essential
Simulation offers repeatability and scale that the real world cannot:
- Crash the robot in sim and reset instantly; no repairs.
- Train hundreds or thousands of environments in parallel.
- Synthesize unlimited rollouts and corner cases.
- Test dangerous actions safely.
Key Bottleneck: 3D Assets are Still Labor-Intensive
Across recent robotics/simulation literature (e.g., Seed3D, URDFormer, Real2Render2Real), the recurring issue is the content bottleneck—high-quality, compliant assets are expensive to build and maintain.
"building simulation models is often still done by hand … with predefined assets to construct rich scenes" — URDFormer (RSS 2024)
"generating high-quality, compliant, and intersection-free assets for simulation remains labor-intensive" — Real2Render2Real (2025)
Digitization Pipeline & Automation Maturity
Understanding where manual GUI work is still required versus what can be fully automated.
End-to-End Pipeline (Asset-Centric View)
Scan → Repair → Rig → Collision → Sim
Physical Object → 3D Scan
Capture real objects using mobile scanners, photogrammetry, or structured light.
Mesh Cleanup / Repair
Fix holes, noise, and non-manifold geometry. Manual work in Blender.
Articulation / Rigging
Define joints, bones, and motion limits. Manual setup in Blender.
Collision (Convex Decomposition)
Generate collision meshes. Highly automated via CoACD, V-HACD.
Simulation Testing
Test in PyBullet, MuJoCo, or other simulators. Scriptable.
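The repair step above is largely about making scans watertight and manifold. As a minimal illustration (not a production checker), the classic diagnostic is that every edge of a closed triangle mesh must be shared by exactly two faces: an edge used once is a hole boundary, and an edge used three or more times is non-manifold. A pure-Python sketch over faces given as vertex-index triples:

```python
from collections import Counter

def edge_histogram(faces):
    """Count how many faces use each undirected edge."""
    edges = Counter()
    for a, b, c in faces:
        for u, v in ((a, b), (b, c), (c, a)):
            edges[(min(u, v), max(u, v))] += 1
    return edges

def diagnose(faces):
    """Classify edges: 1 use = hole boundary, >2 uses = non-manifold."""
    edges = edge_histogram(faces)
    boundary = [e for e, n in edges.items() if n == 1]
    nonmanifold = [e for e, n in edges.items() if n > 2]
    return boundary, nonmanifold

# A single open triangle: all three edges are hole boundaries.
print(diagnose([(0, 1, 2)]))

# A tetrahedron is closed: every edge is shared by exactly two faces.
tet = [(0, 1, 2), (0, 3, 1), (1, 3, 2), (2, 3, 0)]
print(diagnose(tet))  # ([], [])
```

A repaired scan should report no boundary and no non-manifold edges; real repair tools (Blender's 3D-Print Toolbox, trimesh) run essentially this check plus self-intersection tests.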
Automation Maturity by Step
| Step | Typical Tools | Automation | GUI? |
|---|---|---|---|
| 3D Scanning | Mobile scanner apps, RealityScan | High | No |
| Mesh Cleanup / Repair | Blender (manual edit/sculpt/modifiers) | Low | Yes |
| Articulation / Rigging | Blender Armature / manual joint setup | Low | Yes |
| Convex Decomposition | CoACD, V-HACD | High | No |
| Simulation Testing | PyBullet, MuJoCo | High | No |
Key Finding
What many people intuitively label as "simulation readiness" often overweights convex decomposition, but collision generation is already highly automated (CoACD, V-HACD). The true manual bottlenecks are mesh cleanup/repair and articulation rigging.
Limits of Existing Automation (Why GUI Work Remains)
URDFormer-style "image → URDF" systems
- Depend heavily on bounding box / detection quality
- Often rely on predefined meshes that don't match real scenes
- Don't reliably infer physical parameters (mass, friction)
- Still need human GUI correction for key geometry / boundaries
Auto-retopology tools (Quadriflow, Instant Meshes)
- Unstable on complex geometry
- Still require manual cleanup around critical regions (e.g., joints)
- Common caveat: "may still need to clean up areas manually"
Where LLM Agents Fit (AgentHLE Lens)
What makes a good benchmark task for robotics asset preparation workflows.
AgentHLE Requirements
- Task requires GUI/tool operation (not pure coding).
- Clear file-level I/O definition.
- Acceptance criteria can be checked automatically or semi-automatically.
- Data at scale: ideally millions of samples, or at least an easy-to-collect ongoing stream.
What is NOT a Good Fit (v1)
- Tasks current LLMs already handle well (e.g., convex decomposition via the CoACD CLI); these are not distinctive for AgentHLE.
- Tasks requiring real scanning hardware or lab setups.
Best-Fit Robotics Tasks (High Pain + GUI + Verifiable)
Example Tasks (Benchmark Definitions)
These tasks are defined in AgentHLE style: end-to-end execution on a computer with real tools, producing artifacts that a reviewer can verify.
Design Principles (Same Across Industries)
- Tasks require tools/software; pure reasoning is not acceptable.
- Define only I/O and acceptance criteria.
- Prefer deterministic checks.
- Use public datasets or synthetic corruption pipelines.
- Keep the benchmark computer-completable.
Core Tasks (2)
3D Mesh Repair / Scan Cleanup
Every scanned object typically needs this step; manual work is commonly 2–8 hours per model.
Articulation Rigging
Manipulable objects (drawers, doors, tools) require correct joint definitions; existing automation is narrow and still needs human correction.
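For the articulation task, the target artifact is typically a URDF file whose joint types and limits match the object's real motion. As a hedged sketch (link names, travel range, and effort/velocity values are illustrative, not taken from any dataset), the joint block for a sliding drawer can be emitted with the standard library:

```python
import xml.etree.ElementTree as ET

def make_joint(name, joint_type, parent, child, axis, lower, upper):
    """Build a URDF <joint> element (illustrative subset of the spec)."""
    joint = ET.Element("joint", name=name, type=joint_type)
    ET.SubElement(joint, "parent", link=parent)
    ET.SubElement(joint, "child", link=child)
    ET.SubElement(joint, "axis", xyz=" ".join(map(str, axis)))
    ET.SubElement(joint, "limit", lower=str(lower), upper=str(upper),
                  effort="10", velocity="1")
    return joint

robot = ET.Element("robot", name="cabinet")
for link in ("base", "drawer"):
    ET.SubElement(robot, "link", name=link)
# The drawer slides 0..0.4 m along +x relative to the cabinet body.
robot.append(make_joint("drawer_slide", "prismatic",
                        "base", "drawer", (1, 0, 0), 0.0, 0.4))

print(ET.tostring(robot, encoding="unicode"))
```

The hard, manual part of the task is not emitting this XML but deciding the values: identifying which subparts move, segmenting them into separate links, and measuring axes and limits, which is where GUI work in Blender remains.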
Summary
| Task | Input | Output | Software | Scale |
|---|---|---|---|---|
| Mesh repair / scan cleanup | Damaged .obj/.glb | Repaired .obj/.glb | Blender | Millions |
| Articulation rigging | Static .obj + joint text | Articulated .glb/.urdf | Blender | Thousands+ |
Why these two: They require GUI operation (not solvable by a single CLI/script), are real production bottlenecks for simulation content, have scalable data strategies, and have clear verification hooks.
Workflow Collection Guide
What we need from robotics contributors to build high-quality benchmarks.
Required Deliverables (Per Task)
| Deliverable | Description | Quantity |
|---|---|---|
| Sample input files | Cover easy / medium / hard | 5–10 |
| Corresponding outputs | Ground-truth outputs | 5–10 |
| Screen recordings | Full workflow + "think-aloud" explanation | 2–3 |
| Evaluation script | Python script to check outputs | 1 |
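The evaluation script can often be fully deterministic. A minimal sketch (file contents and the tolerance are placeholders) that parses Wavefront OBJ vertex lines with the standard library and checks that a repaired mesh is non-empty and keeps roughly the same bounding box as the ground truth:

```python
def load_obj_vertices(text):
    """Collect 'v x y z' lines from a Wavefront OBJ string."""
    verts = []
    for line in text.splitlines():
        parts = line.split()
        if parts and parts[0] == "v":
            verts.append(tuple(float(x) for x in parts[1:4]))
    return verts

def bbox(verts):
    xs, ys, zs = zip(*verts)
    return (min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))

def check(repaired_text, truth_text, tol=0.05):
    """Pass if repaired mesh exists and its bbox is within tol of truth."""
    rv = load_obj_vertices(repaired_text)
    if not rv:
        return False
    (rlo, rhi) = bbox(rv)
    (tlo, thi) = bbox(load_obj_vertices(truth_text))
    return all(abs(a - b) <= tol for a, b in zip(rlo + rhi, tlo + thi))

truth = "v 0 0 0\nv 1 0 0\nv 0 1 0\nv 0 0 1\n"
print(check(truth, truth))  # True
```

A real script would add the watertightness test and a hole count; the point is that every criterion is a file-level check a reviewer can rerun.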
Screen Recording Guidelines
- Show the full process: open → operate → validate → export/save
- Narrate decisions as you work, e.g., "There's a hole here; I'll select the boundary loop and fill it…"
- Recordings are valuable for both training and evaluation
Data Sources (Public)
Mesh Repair
Objaverse (10M+ assets; watertight subset can be million-scale)
Articulation
PartNet-Mobility (2,000+ articulated objects; expandable)
Suggested Timeline
Professional Tools in Robotics
Contribute to Robotics
We seek high-level, representative contributions—not exhaustive documentation. Share your expertise in any of these areas:
Submit Landscape Understanding
Help us map subfields, tools, and workflows in robotics simulation. Share your perspective on the domain structure.
Submit a Workflow
Describe a specific professional task with tools, inputs, outputs, and how success is verified.
Our Commitments to Contributors
- Evaluation Only: All contributions are used exclusively for agent evaluation, never for model training.
- Partner Review: Industry partners can review and approve task specifications before public release.
- Data Control: Contributors can exclude sensitive or proprietary data from submissions.