A Modular Approach to Video Designation of Manipulation Targets for Mobile Manipulators

This process enables remote manipulation of objects using a 2D video feed.

Currently fielded EOD (explosive ordnance disposal) robots are limited in both mechanical ability and autonomous capability compared to the current state of the art in mobile robotics. To address this problem, the Joint Service EOD Program is developing the Advanced EOD Robot System (AEODRS). AEODRS consists of three system variants that differ in size: small for dismounted operations, medium for tactical operations, and large for base/infrastructure operations. Unlike past EOD unmanned ground vehicle (UGV) development efforts, these robots will be designed under a modular architecture consisting of several capability modules that are developed separately.

Figure: High-level overview of the AEODRS.

Both the medium and large variants will incorporate heavy-duty, high-degree-of-freedom manipulators that can be cumbersome for operators to maneuver even with the help of advanced controllers. As such, there are requirements for several semi-autonomous behaviors, one of which is the ability for the operator to designate an object in a 2D video feed and have the manipulator autonomously move to the object without grasping it.

The procedure imposes only a few requirements on the robot:

  - it is equipped with an encoded serial manipulator;
  - it has some type of 3D sensor (e.g., stereo camera, structured light, or scanning lidar) that can be aimed anywhere in the arm’s workspace;
  - it has at least one video camera that transmits to an Operator Control Unit (OCU), with no maximum number of cameras; and
  - there are known or encoded kinematic chains connecting the robot’s arm, cameras, and 3D sensor.

A brief high-level overview of the system is shown in the figure, which depicts a hand grenade inside a backpack being designated from a 2D video feed. The 2D pixel position is transformed into a 3D point using the forward kinematics of the robot. Inverse kinematics is used to solve for the joint angles required to bring the end effector to that point. Finally, the manipulator is commanded to those joint values.
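The first stage of that transformation, turning the selected pixel into a ray expressed in the robot's base frame, can be sketched with a pinhole camera model and the camera pose obtained from the robot's forward kinematics. The intrinsic values and camera pose below are placeholders for illustration, not parameters of the AEODRS cameras.

```python
import numpy as np

def pixel_to_ray(u, v, K, T_base_camera):
    """Convert a selected pixel (u, v) into a ray in the robot base frame.

    K is the 3x3 camera intrinsic matrix; T_base_camera is the 4x4 pose of
    the camera in the base frame, computed from the encoded kinematic chain.
    """
    # Back-project the pixel into a direction in the camera frame (pinhole model).
    d_cam = np.linalg.inv(K) @ np.array([u, v, 1.0])
    d_cam /= np.linalg.norm(d_cam)

    # Express the ray in the base frame: origin at the camera center,
    # direction rotated by the camera's orientation.
    origin = T_base_camera[:3, 3]
    direction = T_base_camera[:3, :3] @ d_cam
    return origin, direction

# Placeholder intrinsics and an identity camera pose for illustration.
K = np.array([[600.0,   0.0, 320.0],
              [  0.0, 600.0, 240.0],
              [  0.0,   0.0,   1.0]])
ray_origin, ray_direction = pixel_to_ray(350, 260, K, np.eye(4))
```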

The exact process is as follows:

  1. Video feeds are sent from the robot to the OCU.
  2. The user selects an object from the OCU display.
  3. The pixel position corresponding to the selected object is sent to the robot computer and transformed into a ray.
  4. The 3D location of the point of interest (POI) is estimated using the ray, the location of the ground, and the arm’s workspace.
  5. The pan/tilt mechanism points at the estimated point of interest.
  6. A point cloud is constructed by tilting the lidar up and down.
  7. The point cloud is inserted into an octree data structure.
  8. The ray from step 3 is cast into the octree to find the 3D location of the point of interest (a simplified sketch of this ray cast follows the list).
  9. Inverse kinematics is used to compute the arm pose required to bring the end effector close to the point of interest.
  10. The arm motors are commanded to that pose.
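Step 8 amounts to walking the selection ray through the occupancy data until it reaches the first occupied cell. A production system would use the octree's own ray-cast query, but the self-contained sketch below steps a ray through a plain set of occupied voxel indices to show the idea; the voxel size and the occupied set are made up for illustration.

```python
import numpy as np

def cast_ray(origin, direction, occupied, voxel_size=0.05, max_range=5.0):
    """Step along a ray and return the center of the first occupied voxel.

    `occupied` is a set of integer (i, j, k) voxel indices built from the
    point cloud; returns None if the ray exits max_range without a hit.
    """
    direction = direction / np.linalg.norm(direction)
    step = voxel_size * 0.5  # sample at half-voxel resolution
    t = 0.0
    while t < max_range:
        p = origin + t * direction
        idx = tuple(np.floor(p / voxel_size).astype(int))
        if idx in occupied:
            return (np.array(idx) + 0.5) * voxel_size  # voxel center
        t += step
    return None

# Toy example: one occupied voxel roughly 1 m in front of the sensor.
occupied = {(20, 0, 0)}  # with 5 cm voxels this spans x in [1.0, 1.05) m
poi = cast_ray(np.zeros(3), np.array([1.0, 0.0, 0.0]), occupied)
```

The 3D point returned here is what feeds the inverse-kinematics solver in step 9.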

Some mobile robot systems are equipped with a 3D sensor that constantly scans the environment 360° around the robot to carry out obstacle detection and path planning tasks. For such systems, it is likely that any pixel a user selects from any camera will correspond to a 3D point that has already been recorded. However, if a 3D sensor is onboard only for manipulation tasks, it needs to scan only when the user requests semi-autonomous manipulation. Or, if the 3D sensor has a limited field of view, it may not be pointing at the POI at the time the user selects it. In either of these cases, there needs to be a plan in place that lets the 3D sensor scan the area of interest quickly rather than waste time on a full hemispheric scan. To accomplish this, the location of the POI is first approximated using the forward kinematic model of the robot (e.g., a pan/tilt mechanism) and the camera’s intrinsic parameters. The 3D sensor can then center its scan on this estimated location to find the true position of the point of interest.
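Assuming the designated object rests on or near the ground, one simple way to form this initial estimate is to intersect the selection ray with the ground plane (z = 0 in the robot base frame) and clamp the result to the arm's reachable radius; the reach value below is a placeholder. The pan/tilt mechanism can then be aimed at this estimate so the 3D sensor's scan is centered on the region of interest.

```python
import numpy as np

def estimate_poi(origin, direction, max_reach=1.5):
    """Approximate the POI by intersecting the selection ray with the ground
    plane (z = 0 in the base frame), then clamping to the arm's workspace.
    """
    direction = direction / np.linalg.norm(direction)
    t = -origin[2] / direction[2] if abs(direction[2]) > 1e-6 else -1.0
    if t > 0:
        poi = origin + t * direction          # ray hits the ground plane
    else:
        # Ray is (nearly) level or points upward: fall back to a point at
        # the edge of the workspace along the ray.
        poi = origin + max_reach * direction

    # Clamp the horizontal distance to the arm's reachable radius.
    r = np.linalg.norm(poi[:2])
    if r > max_reach:
        poi[:2] *= max_reach / r
    return poi

# Camera 0.8 m above the ground, looking forward and pitched 30 degrees down.
pitch = np.radians(30)
poi_estimate = estimate_poi(np.array([0.0, 0.0, 0.8]),
                            np.array([np.cos(pitch), 0.0, -np.sin(pitch)]))
```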

This work was done by Aaron O’Toole and Jessica N. Jones of the Naval Surface Warfare Center Indian Head EOD Technology Division. NSWC-0001



This Brief includes a Technical Support Package (TSP), “A Modular Approach to Video Designation of Manipulation Targets for Mobile Manipulators” (reference NSWC-0001), which is available for download from the TSP library.
