This project addresses the challenge of autonomously navigating a robotic arm, specifically a UR5 manipulator, to pick up a cube in an environment where the initial positions of the robot's base frame, the RealSense camera, and the cube are unknown. The objective is to develop an integrated solution that combines computer vision techniques, registration, and robotic control strategies to achieve Position-Based Visual Servoing (PBVS).
- Object Detection and Segmentation: Utilizing a classical computer vision approach (color thresholding, Canny edge detection, and corner detection), the system identifies and segments the cube.
- Pose Estimation: The Perspective-n-Point (PnP) algorithm is employed to determine the pose of the cube relative to the RealSense camera. Given the 2D image coordinates of the cube's corners and their corresponding 3D coordinates in the cube's model frame (known from the cube's dimensions), PnP computes the rotation and translation vectors that describe the cube's position and orientation in the camera's field of view. This pose is the input to the subsequent manipulation steps.
- Registration Between Camera and Robot: The ar_track_alvar package is employed for camera-to-robot registration by detecting a predefined anchor marker in the RealSense camera's field of view. This marker has a known pose relative to the base_link of the UR5 robot, so the detected marker pose yields the transformation between the camera and the robot base.
- Robot Control Using MoveIt!: Finally, MoveIt! is used for motion planning, so the robot can follow a collision-free motion plan for the pick-and-place task. By commanding the end-effector to align with the estimated pose of the cube, the robot can then close the gripper and pick the cube up (the composed transform behind this is sketched right after this list).
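Conceptually, the pipeline reduces to composing two rigid transforms: the camera pose in the robot base frame (from the marker registration) and the cube pose in the camera frame (from PnP). A minimal NumPy sketch with placeholder matrices, not the project's actual code:

```python
import numpy as np

def to_homogeneous(R, t):
    """Pack a 3x3 rotation and a translation vector into a 4x4 transform."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

# Placeholder values; in the demo these come from ar_track_alvar and solvePnP.
T_base_cam = to_homogeneous(np.eye(3), [0.5, 0.0, 0.8])   # camera in base_link
T_cam_cube = to_homogeneous(np.eye(3), [0.0, 0.1, 0.6])   # cube in camera frame

# Pose the end-effector is sent to (before adding any grasp offset).
T_base_cube = np.dot(T_base_cam, T_cam_cube)
print(T_base_cube[:3, 3])  # cube position expressed in base_link
```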
The demo was run on Ubuntu 18.04 LTS, with ROS Melodic installed.
First, make sure you have already installed the libraries and the ROS wrapper for the RealSense camera: tutorial
Next, launch the RealSense camera:
roslaunch realsense2_camera rs_camera.launch align_depth:=true
The RGB image topic should be /camera/color/image_raw, and the camera info topic should be /camera/color/camera_info.
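To quickly check that the color stream is coming through, a minimal subscriber (a hypothetical standalone script, not part of this repo) could look like:

```python
#!/usr/bin/env python
import rospy
import cv2
from cv_bridge import CvBridge
from sensor_msgs.msg import Image

bridge = CvBridge()

def callback(msg):
    # Convert the ROS image to an OpenCV BGR image and display it.
    frame = bridge.imgmsg_to_cv2(msg, desired_encoding="bgr8")
    cv2.imshow("realsense color", frame)
    cv2.waitKey(1)

rospy.init_node("color_stream_check")
rospy.Subscriber("/camera/color/image_raw", Image, callback)
rospy.spin()
```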
Next, run the cube_pose_estimation.py file for cube segmentation and pose estimation.
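The sketch below is only an illustration of the classical approach used here (color thresholding, Canny edges, corner extraction, then solvePnP), not the actual contents of cube_pose_estimation.py; the HSV bounds, cube size, and intrinsics are placeholders:

```python
import cv2
import numpy as np

def estimate_cube_pose(bgr, K, dist, cube_side=0.05):
    """Segment a colored cube face and estimate its pose with PnP.
    The HSV bounds, cube_side, K, and dist are placeholder values."""
    hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)

    # 1) Color thresholding (hypothetical bounds for a red-ish cube).
    mask = cv2.inRange(hsv, (0, 120, 70), (10, 255, 255))

    # 2) Edge detection on the binary mask.
    edges = cv2.Canny(mask, 50, 150)

    # 3) Largest contour, approximated by its 4 corners.
    contours = cv2.findContours(edges, cv2.RETR_EXTERNAL,
                                cv2.CHAIN_APPROX_SIMPLE)[-2]
    if not contours:
        return None
    cnt = max(contours, key=cv2.contourArea)
    quad = cv2.approxPolyDP(cnt, 0.02 * cv2.arcLength(cnt, True), True)
    if len(quad) != 4:
        return None
    image_pts = quad.reshape(-1, 2).astype(np.float32)

    # 4) 3D corners of the visible cube face in the cube's own frame.
    #    The 2D/3D corner ordering must correspond (not enforced in this sketch).
    s = cube_side / 2.0
    object_pts = np.array([[-s, -s, 0], [s, -s, 0], [s, s, 0], [-s, s, 0]],
                          dtype=np.float32)

    # 5) PnP returns the cube's rotation and translation in the camera frame.
    ok, rvec, tvec = cv2.solvePnP(object_pts, image_pts, K, dist)
    return (rvec, tvec) if ok else None

# Example usage (take the real intrinsics from /camera/color/camera_info):
# K = np.array([[615.0, 0, 320.0], [0, 615.0, 240.0], [0, 0, 1]], dtype=np.float32)
# pose = estimate_cube_pose(cv2.imread("frame.png"), K, np.zeros(5))
```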
In order to launch the robot, you have to first create a workspace:
mkdir -p ur5_ws/src
cd ur5_ws/src
git clone this repo
git clone https://github.com/intuitivecomputing/ur5_with_robotiq_gripper
Then, build and source the path:
cd ../
catkin build
source ./devel/setup.bash
Launch the following in separate terminals in order to control the robot (remember to source the workspace in each terminal):
roslaunch icl_ur5_setup_bringup ur5_gripper.launch
roslaunch icl_ur5_setup_bringup activate_gripper.launch
roslaunch icl_ur5_setup_moveit_config ur5_gripper_moveit_planning_execution.launch
roslaunch icl_ur5_setup_moveit_config moveit_rviz.launch config:=true
After that, make sure the ar_track_alvar package is installed:
cd ~/ur5_ws/src
git clone https://github.com/ros-perception/ar_track_alvar
Put the tag_detector_rs.launch file into the ar_track_alvar/launch folder and build the workspace.
After making sure the RealSense can see the AR marker, run the launch file:
roslaunch ar_track_alvar tag_detector_rs.launch
Now you should be able to see the camera frame, the AR marker, and the pose of the cube in RViz, like this:
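Under the hood, the registration just chains the detected marker pose with the marker's known offset from the robot base. A rough sketch of that lookup (the frame names follow the realsense2_camera and ar_track_alvar defaults, and the marker-to-base transform is a placeholder you would measure for your own setup):

```python
#!/usr/bin/env python
import rospy
import tf
import numpy as np
from tf import transformations as tfs

rospy.init_node("camera_to_base_registration")
listener = tf.TransformListener()

# Known pose of the anchor marker in the robot base frame (placeholder values).
T_base_marker = tfs.translation_matrix([0.4, 0.0, 0.0])

# Pose of the detected marker in the camera frame, published by ar_track_alvar.
listener.waitForTransform("camera_color_optical_frame", "ar_marker_0",
                          rospy.Time(0), rospy.Duration(5.0))
trans, rot = listener.lookupTransform("camera_color_optical_frame", "ar_marker_0",
                                      rospy.Time(0))
T_cam_marker = np.dot(tfs.translation_matrix(trans), tfs.quaternion_matrix(rot))

# Camera pose in the base frame: base->marker composed with marker->camera.
T_base_cam = np.dot(T_base_marker, np.linalg.inv(T_cam_marker))
print(T_base_cam)
```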
Finally, run cube_pick_and_place.py, and the robot will pick the cube up and then drop it down!
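For reference, the core of a MoveIt!-driven pick looks roughly like the sketch below; the group names ("manipulator" and "gripper"), the named gripper target, and the grasp offset are assumptions, not taken from cube_pick_and_place.py:

```python
#!/usr/bin/env python
import sys
import rospy
import moveit_commander
from geometry_msgs.msg import Pose

moveit_commander.roscpp_initialize(sys.argv)
rospy.init_node("cube_pick_sketch")

arm = moveit_commander.MoveGroupCommander("manipulator")  # assumed group name
gripper = moveit_commander.MoveGroupCommander("gripper")   # assumed group name

# Target pose of the cube in the base frame (placeholder values; in the demo it
# comes from composing the PnP result with the camera-to-base registration).
target = Pose()
target.position.x = 0.4
target.position.y = 0.1
target.position.z = 0.05
target.orientation.w = 1.0

# Plan and execute a collision-free motion to a pre-grasp pose above the cube.
target.position.z += 0.10
arm.set_pose_target(target)
arm.go(wait=True)
arm.stop()
arm.clear_pose_targets()

# Descend, close the gripper, and lift.
target.position.z -= 0.10
arm.set_pose_target(target)
arm.go(wait=True)
gripper.set_named_target("close")  # assumed named target in the SRDF
gripper.go(wait=True)
target.position.z += 0.10
arm.set_pose_target(target)
arm.go(wait=True)
```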
Given the limited time, this demo was only meant to demonstrate the possibility of using the UR5 robot for an autonomous cube-retrieval task. To make the system more robust, the following future work should be implemented and explored:
- Although the RealSense camera already has on-chip calibration, a proper camera calibration procedure could still be conducted to improve pose estimation accuracy and rectify image distortion.
- Right now, classical CV techniques are used to detect and estimate the pose of the cube, but the classical approach has limitations. The main issue is that color thresholding restricts which object colors can be handled, and lighting also affects detection accuracy. To streamline the process and improve accuracy, a learning-based model such as PoseCNN, or an encoder-decoder network, could be trained on the specific object.
- Also, to improve pose estimation accuracy, the Iterative Closest Point (ICP) algorithm could be used instead of the Perspective-n-Point algorithm, in order to utilize the point cloud generated by the RealSense camera. Here is an in-depth guide on how to use ICP for pose estimation.
- Finally, to obtain a more robust registration between the camera and the robot base, a calibration procedure could be used instead of a single anchor AR marker. The idea is to first grasp an AR marker with the gripper, then move the robot to various points within the camera view. For each pose, the following relationships can be established (see the sketch at the end of this section):
- robot base to end-effector (gripper center): can be calculated using forward kinematics
- camera to AR marker: can be detected using packages like ar_track_alvar
If the AR marker is aligned with the gripper center, the transformation between the camera and the robot base can then be established.
This approach has already been implemented in another of my projects using ROS 2: AR pick and place
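As a sketch of the math behind this procedure (pure NumPy, with hypothetical recorded poses), each recorded robot pose gives one estimate of the camera-to-base transform, which can then be averaged:

```python
import numpy as np

def invert(T):
    """Invert a rigid 4x4 transform."""
    R, t = T[:3, :3], T[:3, 3]
    Ti = np.eye(4)
    Ti[:3, :3] = R.T
    Ti[:3, 3] = -np.dot(R.T, t)
    return Ti

# Hypothetical recordings: for each robot pose i,
#   recorded_base_ee[i]    -- end-effector (gripper center) in the base frame, from forward kinematics
#   recorded_cam_marker[i] -- AR marker in the camera frame, from ar_track_alvar
# Placeholder data (identity poses) just to make the sketch runnable.
recorded_base_ee = [np.eye(4)]
recorded_cam_marker = [np.eye(4)]

# If the marker is aligned with the gripper center, then for every pose:
#   T_cam_base = T_cam_marker[i] * inv(T_base_ee[i])
estimates = [np.dot(T_cam_marker, invert(T_base_ee))
             for T_base_ee, T_cam_marker in zip(recorded_base_ee, recorded_cam_marker)]

# Average the translations across poses (rotations would need a proper average,
# e.g. via quaternions) to get a more robust camera-to-base estimate.
t_cam_base = np.mean([T[:3, 3] for T in estimates], axis=0)
print(t_cam_base)
```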