Examples


Reinforcement learning

3D reaching task (the iiwa’s end-effector must reach a target point in space). Training was done in Omniverse Isaac Gym using the skrl reinforcement learning library. The real robot is controlled through the Python, ROS, and ROS2 APIs using the same library. Training and evaluation are performed in both Cartesian and joint control spaces.

Note

Visit the skrl documentation, under the Real-world examples section, to access the training and evaluation files (both in simulation and in the real world).

Implementation (see details in the table below):

  • The observation space is composed of the episode’s normalized progress, the robot joints’ normalized positions (\(q\)) in the interval -1 to 1, the robot joints’ velocities (\(\dot{q}\)) affected by a random uniform scale for generalization, and the target’s position in space (\(target_{_{XYZ}}\)) with respect to the robot’s base.

  • The action space, bounded in the range -1 to 1, consists of the scaled change in the robot joints’ positions for joint control, or the scaled change in the end-effector’s position (\(ee_{_{XYZ}}\)) for Cartesian control.

  • The instantaneous reward is the negative Euclidean distance (\(\text{d}\)) between the robot end-effector and the target position. The episode terminates when this distance is less than 0.035 meters in simulation (0.075 meters in the real world) or when the defined maximum timestep is reached.

  • The target position lies within a rectangular cuboid of dimensions 0.2 x 0.4 x 0.4 meters centered at (0.6, 0.0, 0.4) meters with respect to the robot’s base. The robot joints’ initial positions are drawn from the configuration [0°, 0°, 0°, -90°, 0°, 90°, 0°] perturbed with uniform random values between approximately -7° and 7°.
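The episode setup, reward, and termination described above can be sketched as follows. This is an illustrative reconstruction, not the library's API: the function names (`sample_target`, `initial_joint_positions`, etc.) are hypothetical, and the joint limits of the perturbation are the approximate values stated above.

```python
import numpy as np

rng = np.random.default_rng()

def sample_target():
    """Sample a target inside the 0.2 x 0.4 x 0.4 m cuboid centered at (0.6, 0.0, 0.4) m."""
    center = np.array([0.6, 0.0, 0.4])
    half = np.array([0.2, 0.4, 0.4]) / 2.0
    return rng.uniform(center - half, center + half)

def initial_joint_positions():
    """Default configuration [0, 0, 0, -90, 0, 90, 0] deg perturbed by ~U(-7, 7) deg, in radians."""
    q0 = np.radians([0.0, 0.0, 0.0, -90.0, 0.0, 90.0, 0.0])
    return q0 + rng.uniform(np.radians(-7.0), np.radians(7.0), size=7)

def reward(ee_xyz, target_xyz):
    """Negative Euclidean distance between the end-effector and the target."""
    return -np.linalg.norm(ee_xyz - target_xyz)

def terminated(ee_xyz, target_xyz, t, t_max=100, threshold=0.035):
    """Episode ends on reaching the target (0.035 m in simulation) or on timeout."""
    return np.linalg.norm(ee_xyz - target_xyz) <= threshold or t >= t_max - 1
```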

| Variable | Formula / value | Size |
|---|---|---|
| Observation space | \(\dfrac{t}{t_{max}},\; 2\,\dfrac{q - q_{min}}{q_{max} - q_{min}} - 1,\; 0.1\,\dot{q}\,U(0.5, 1.5),\; target_{_{XYZ}}\) | 18 |
| Action space (joint) | \(\dfrac{2.5}{120}\,\Delta q\) | 7 |
| Action space (Cartesian) | \(\dfrac{1}{100}\,\Delta ee_{_{XYZ}}\) | 3 |
| Reward | \(-\text{d}(ee_{_{XYZ}},\; target_{_{XYZ}})\) | |
| Episode termination | \(\text{d}(ee_{_{XYZ}},\; target_{_{XYZ}}) \le 0.035\) or \(t \ge t_{max} - 1\) | |
| Maximum timesteps (\(t_{max}\)) | 100 | |
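The observation and action scaling in the table can be sketched as below. This is a hedged reconstruction: the function names are illustrative, and the action formulas are read as "applied change = scale x action" (one plausible interpretation of the table), with the action already bounded in [-1, 1].

```python
import numpy as np

rng = np.random.default_rng()

def observation(t, q, dq, target_xyz, q_min, q_max, t_max=100):
    """Build the 18-dimensional observation from the table above."""
    progress = t / t_max                                   # episode's normalized progress (1)
    q_norm = 2.0 * (q - q_min) / (q_max - q_min) - 1.0     # joint positions in [-1, 1] (7)
    dq_scaled = 0.1 * dq * rng.uniform(0.5, 1.5)           # randomized velocity scale (7)
    return np.concatenate([[progress], q_norm, dq_scaled, target_xyz])  # + target (3) = 18

def apply_joint_action(q, action):
    """Joint control: scaled joint position change, 2.5/120 rad per unit action."""
    return q + (2.5 / 120.0) * action

def apply_cartesian_action(ee_xyz, action):
    """Cartesian control: scaled end-effector position change, 1 cm per unit action."""
    return ee_xyz + (1.0 / 100.0) * action
```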


Real-world (Python)



Real-world (ROS/ROS2)



Simulation (Omniverse Isaac Gym)


MoveIt in RViz



To run this demonstration, in addition to the packages listed in Installation (under the ROS/ROS2 section), the following packages are required in the ROS workspace. Make sure to source the ROS distribution and build the workspace.

| Package | ROS |
|---|---|
| KUKA LBR iiwa URDF | iiwa_description.zip |
| MoveIt configuration for LBR iiwa 14 | iiwa14_moveit_config.zip |

Launch the libiiwa ROS node in a terminal, and execute the Java library installed in the KUKA Sunrise Cabinet via the smartHMI:

$ roslaunch libiiwa_ros default.launch

Launch MoveIt with RViz in another terminal:

$ roslaunch iiwa14_moveit_config real.launch