Exporting Accurate Actuator Models from MuJoCo URDF for Sim-to-Real Transfer

Rafael Tran January 14, 2026 Research Blog

The first time you run a policy trained entirely in MuJoCo on real hardware, you'll notice the gap. The sim looks smooth. The hardware does something different — sometimes slightly, sometimes dramatically. That gap isn't random noise. It's a systematic consequence of how MuJoCo's actuator model diverges from what your physical actuators actually do. Closing it requires methodical work at the model level before any RL training begins, not just domain randomization applied afterward.

What MuJoCo's Default Actuator Model Gets Wrong

MuJoCo's basic joint actuator (the motor type) is an ideal torque source: the torque applied at the joint is the control signal multiplied by the actuator's gear value. Rotor inertia and damping are not part of the actuator itself; they are optional joint-level attributes (armature and damping). That's a reasonable approximation for some actuator types. It misses several things that matter greatly for strain-wave gearbox actuators:

Gearbox compliance: Strain-wave gearboxes have measurable torsional compliance (spring constant on the order of 1000–5000 Nm/rad depending on size). MuJoCo's default model treats the gearbox as a rigid connection. This matters most during contact events and load reversals.
Gearbox damping: The strain-wave flexspline introduces viscous friction that is load-dependent and not well captured by a constant damping coefficient at the joint level.
Motor inertia: The armature parameter captures rotor inertia, but it must be set to the value reflected to the joint output, which is the rotor inertia multiplied by the square of the gear ratio. Typical rotor values for the motors we use are 0.00015–0.0003 kg·m² before that scaling. With an 80:1 reduction the reflected inertia is 6400 times larger, roughly 0.96–1.92 kg·m², which is non-trivial and affects the dynamic response during fast transitions.
Velocity-dependent friction: Stiction and Coulomb friction at the gearbox output are not symmetric between directions and are poorly approximated by MuJoCo's frictionloss, which applies a single constant-magnitude dry friction in both directions.
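The gear-ratio scaling of rotor inertia is easy to get wrong by a factor of N or N², so a small worked calculation may help. The sketch below assumes the armature value is expressed at the joint output (rotor inertia times the square of the reduction), and uses the rotor range and 80:1 ratio quoted above; the function name is illustrative only.

```python
def reflected_inertia(rotor_inertia_kg_m2: float, gear_ratio: float) -> float:
    """Rotor inertia as seen at the joint output: J_rotor * N**2."""
    return rotor_inertia_kg_m2 * gear_ratio ** 2


GEAR_RATIO = 80.0  # 80:1 strain-wave reduction from the text

for j_rotor in (0.00015, 0.0003):  # rotor inertia range from the text, kg·m²
    armature = reflected_inertia(j_rotor, GEAR_RATIO)
    print(f"rotor {j_rotor:.5f} kg·m² -> armature {armature:.2f} kg·m²")
```

For comparison, a reflected inertia near 1 kg·m² can exceed the inertia of the link the joint is driving, which is why an unset or unscaled armature changes the response during fast transitions so visibly.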

Setting Up a Calibrated URDF Export

The workflow we follow starts with physical identification on a benchtop dynamometer before any URDF parameter is finalized. The key measurements to gather per joint size are:

Torsional stiffness sweep: Apply static torque in both directions across the full angular range of the flexspline. Fit a piecewise linear stiffness value. Use the lower bound as your conservative MuJoCo spring constant.
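No-load friction profile: Measure back-drive torque at multiple velocities (0.1, 0.5, 1.0, 2.0, 5.0 rad/s). Fitting a line to torque versus velocity gives you both components: the slope is the viscous term you'll encode in MuJoCo's damping parameter, and the zero-velocity intercept is the Coulomb term you'll encode in frictionloss.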
Motor electrical time constant: Measure the current rise time under a step voltage command. This determines whether your torque bandwidth model needs the motor's electrical dynamics or whether they're fast enough to treat the motor as ideal within your control bandwidth.
Reflected inertia validation: Compare the measured acceleration response to a step torque command against the model prediction. Tune the armature value to match.
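As an illustration of the friction step, here is a minimal sketch of separating the Coulomb and viscous components from a back-drive sweep by ordinary least squares. It assumes a simple model, tau = tau_c * sign(v) + b * v, with one Coulomb value for both directions. The torque data is synthetic, generated from made-up coefficients purely so the fit can be checked; it is not measured data.

```python
import math


def fit_friction(velocities, torques):
    """Least-squares fit of tau = tau_c * sign(v) + b * v.

    Returns (tau_c, b): Coulomb torque [N·m] and viscous
    coefficient [N·m·s/rad], via the 2x2 normal equations.
    """
    s_ss = s_sv = s_vv = s_st = s_vt = 0.0
    for v, tau in zip(velocities, torques):
        s = math.copysign(1.0, v)  # direction of motion
        s_ss += s * s
        s_sv += s * v
        s_vv += v * v
        s_st += s * tau
        s_vt += v * tau
    det = s_ss * s_vv - s_sv ** 2
    if abs(det) < 1e-12:
        raise ValueError("need at least two distinct speeds to separate the terms")
    tau_c = (s_st * s_vv - s_sv * s_vt) / det
    b = (s_ss * s_vt - s_sv * s_st) / det
    return tau_c, b


# Synthetic sweep at the speeds listed above, in both directions.
speeds = [0.1, 0.5, 1.0, 2.0, 5.0]  # rad/s
velocities = speeds + [-v for v in speeds]
TRUE_TAU_C, TRUE_B = 0.8, 0.12  # made-up values, for illustration only
torques = [TRUE_TAU_C * math.copysign(1.0, v) + TRUE_B * v for v in velocities]

frictionloss, damping = fit_friction(velocities, torques)
print(f"frictionloss = {frictionloss:.3f} N·m, damping = {damping:.3f} N·m·s/rad")
```

Running the same fit separately on the positive and negative halves of a real sweep is one way to see how asymmetric the friction is before collapsing it into a single value per parameter.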

Domain randomization doesn't compensate for a systematically wrong nominal model. It adds noise around whatever center you've given it — if the center is wrong, you're training policies to tolerate the wrong distribution.

MuJoCo URDF Parameters and Where They Map

The URDF itself doesn't carry most of these parameters — they live in the MuJoCo XML (.xml or embedded mujoco extension blocks within the URDF). The critical parameter mapping is:

Physical Property	MuJoCo Parameter	URDF Location
Rotor inertia (motor)	`armature`	MuJoCo joint extension block
Joint viscous friction	`damping`	MuJoCo joint extension block
Coulomb friction	`frictionloss`	MuJoCo joint extension block
Gear ratio	`gear`	MuJoCo actuator definition
Torque limits	`forcelimited`, `forcerange`	MuJoCo actuator definition
Link mass and inertia	`mass`, `inertia`	Standard URDF `inertial` tag
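To make the mapping concrete, the sketch below builds the two MJCF elements that carry most of these parameters, using only Python's standard library. The joint name and all numeric values are hypothetical placeholders, not identified values, and the helper function is illustrative rather than part of any SDK. One assumption worth checking against the MuJoCo documentation for your version: forcerange clamps the actuator's scalar force before the gear scaling is applied, so with a non-unit gear the limits are motor-side values.

```python
import xml.etree.ElementTree as ET


def joint_and_motor(name, armature, damping, frictionloss, gear, motor_torque_limit):
    """Build an MJCF <joint> (goes inside a <body>) and a matching <motor>
    (goes inside <actuator>) from per-joint parameters."""
    joint = ET.Element("joint", {
        "name": name,
        "type": "hinge",
        "armature": f"{armature:g}",          # reflected rotor inertia, kg·m²
        "damping": f"{damping:g}",            # viscous friction, N·m·s/rad
        "frictionloss": f"{frictionloss:g}",  # Coulomb (dry) friction, N·m
    })
    motor = ET.Element("motor", {
        "name": f"{name}_motor",
        "joint": name,
        "gear": f"{gear:g}",
        "forcelimited": "true",
        # Assumed to apply before gear scaling (see lead-in).
        "forcerange": f"{-motor_torque_limit:g} {motor_torque_limit:g}",
    })
    return joint, motor


# Hypothetical values for illustration only.
joint, motor = joint_and_motor(
    "knee", armature=0.96, damping=0.12, frictionloss=0.8,
    gear=80.0, motor_torque_limit=1.5,
)
print(ET.tostring(joint, encoding="unicode"))
print(ET.tostring(motor, encoding="unicode"))
```

Generating these elements from one parameter record per joint, rather than editing XML by hand, keeps the joint and actuator definitions from drifting apart as identification results are updated.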

Exporting From Our SDK

The Tendonkindle motion SDK ships with a MuJoCo export utility that generates per-joint XML parameter blocks from the same per-joint values used to initialize the hardware controller. This matters for a specific reason: when the simulator uses the same gear ratio, armature, and damping values as the real-time impedance controller, the behavioral correspondence between sim and hardware is significantly better at the nominal parameter point. We set these values from physical identification, not from theoretical motor specs.

The export also includes the force-torque sensor placement in the kinematic chain — placing it at the correct link with the right mass offset so the sensor frame in simulation matches the physical sensor location. Getting this wrong introduces a systematic error in any contact force estimation that the policy might use as an observation.

What Domain Randomization Should and Shouldn't Cover

Once you have a well-calibrated nominal model, domain randomization serves a legitimate purpose: covering the remaining uncertainty in parameters you can't measure precisely, and training resilience to hardware unit-to-unit variation. We typically randomize:

Link mass ±5–8% around measured values
Joint damping ±20% around the identified value (friction varies significantly with temperature)
External force disturbances (simulating ground irregularity and contact uncertainty)
Observation delay (0–3 ms) to account for real hardware sensor pipeline latency
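A minimal sketch of per-episode sampling for the ranges above follows. The uniform distributions, the choice of the wider ±8% mass bound as the default, and all names are assumptions made for illustration. External force disturbances are left out because the text gives no magnitudes for them.

```python
import random
from dataclasses import dataclass


@dataclass
class EpisodeParams:
    mass_scale: float      # multiplier on each measured link mass
    damping_scale: float   # multiplier on each identified joint damping
    obs_delay_s: float     # observation delay in seconds


def sample_episode_params(rng: random.Random, mass_spread: float = 0.08) -> EpisodeParams:
    """Draw one set of randomized parameters centered on the nominal model."""
    return EpisodeParams(
        mass_scale=rng.uniform(1.0 - mass_spread, 1.0 + mass_spread),
        damping_scale=rng.uniform(0.8, 1.2),   # ±20% around the identified value
        obs_delay_s=rng.uniform(0.0, 0.003),   # 0–3 ms
    )


rng = random.Random(0)  # seeded so runs are reproducible
params = sample_episode_params(rng)
print(params)
```

Expressing the samples as scale factors keeps the calibrated nominal values as the center of the distribution. With the MuJoCo Python bindings, the scales would typically be applied to copies of the nominal `model.body_mass` and `model.dof_damping` arrays at each episode reset.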

What domain randomization should not cover: the gap between a wrong nominal model and physical reality. We've seen teams apply 50% mass randomization to compensate for incorrect link inertia estimates. That produces policies that pass sim benchmarks but consistently exhibit the same systematic drift on hardware, because the envelope, however wide it is in mass, still doesn't contain the true value of the parameter that was actually wrong.

Get the nominal model right first. Randomize second. Measure the gap after your first hardware deployment, and use those measurements to tighten the model rather than widening the randomization bounds.

If you're working through this process for a specific actuator configuration and want to compare notes on identification methodology, our engineering team is available to discuss directly.
