This article is a tutorial on the code implemented in the repository stereo-dense-reconstruction. The theory is not covered in detail, so some basic reading on the topic is suggested before delving into the code. This ROS package computes dense disparity maps using LIBELAS and converts them into 3D point clouds. The method for generating disparity maps can be changed based on user preference. I wrote an article earlier on generating dense disparity maps using ORB descriptors; that method can run in real time with optimized code and parallelization.
An overview of the working code can be seen in the video above, which shows the generation of dense disparity maps and the resulting point clouds. The visualizations are shown in rviz. Towards the end of the video, point cloud transformation is demonstrated using dynamic_reconfigure.
A calibrated pair of stereo cameras is required for dense reconstruction. The calibration file is stored as a YAML file containing the following information about the stereo cameras:
- Intrinsic matrix of the left camera
- Intrinsic matrix of the right camera
- Distortion coefficients of the left camera
- Distortion coefficients of the right camera
- Rotation matrix from the left camera to the right camera
- Translation vector from the left camera to the right camera
- Rotation matrix XR for point cloud transformation
- Translation vector XT for point cloud transformation
See a sample calibration file. A YAML calibration file can be generated by using this stereo camera calibration tool. For more information see this detailed tutorial on stereo camera calibration.
Although we require only the matrices listed above, the sample calibration file contains other matrices found after rectification (e.g., Q etc.). These matrices are computed by OpenCV's stereoRectify function, and they depend on the output resolution of the rectified stereo images. Hence it is better to run stereoRectify online rather than storing those matrices in a calibration file; this gives us the flexibility to work at different image resolutions.
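One reason these matrices are resolution-dependent is that the intrinsic parameters themselves scale linearly with the image size. The helper below is a hypothetical illustration (not part of the repository) of how an intrinsic matrix would be rescaled when the rectified output resolution differs from the calibration resolution:

```cpp
#include <array>

// 3x3 intrinsic matrix stored row-major: fx 0 cx / 0 fy cy / 0 0 1.
using Mat3 = std::array<double, 9>;

// Scale an intrinsic matrix from the calibration resolution to a new
// output resolution. Focal lengths and the principal point scale
// linearly with image width/height; the bottom row is unchanged.
Mat3 scaleIntrinsics(const Mat3& K,
                     int calibW, int calibH,
                     int outW, int outH) {
    double sx = static_cast<double>(outW) / calibW;
    double sy = static_cast<double>(outH) / calibH;
    Mat3 Ks = K;
    Ks[0] *= sx;  // fx
    Ks[2] *= sx;  // cx
    Ks[4] *= sy;  // fy
    Ks[5] *= sy;  // cy
    return Ks;
}
```

Since the rectification outputs depend on these scaled values, recomputing them online for whatever output resolution is in use is simpler than storing one fixed set in the calibration file.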
The use of the XR and XT matrices is explained later. As initial values, XR can be set to the $3 \times 3$ identity matrix and XT to the $3 \times 1$ zero vector.
Disparity maps are generated from rectified stereo images. The raw stereo images should be published on a topic that publishes messages of type sensor_msgs/Image. The rectification of the stereo images is implemented as follows. First, declare all the calibration matrices.
calib_img_size is the resolution of the images used for stereo calibration. out_img_size is the output resolution of the rectified images; it can be set at will. Lower-resolution images require less computation time for dense reconstruction.
pub_img_left and pub_img_right are the publishers for the rectified images. The output images are published as sensor_msgs/CompressedImage messages, which helps reduce file size when collecting large datasets.
This function undistorts and rectifies the src image into dst. The homographic mappings rmapx and rmapy are found using OpenCV's initUndistortRectifyMap function.
This function computes all the projection matrices and the rectification transformations using OpenCV's stereoRectify and initUndistortRectifyMap functions respectively.
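The way the precomputed maps are applied is the same idea as OpenCV's remap: every destination pixel looks up its source coordinates in rmapx/rmapy. Below is a minimal standalone sketch of that lookup (hypothetical types and nearest-neighbour sampling, whereas cv::remap normally interpolates):

```cpp
#include <cmath>
#include <cstdint>
#include <vector>

// Minimal grayscale image: row-major pixel buffer.
struct Image {
    int w, h;
    std::vector<uint8_t> data;
    uint8_t at(int x, int y) const { return data[y * w + x]; }
};

// Apply precomputed rectification maps to an image: for each
// destination pixel (x, y), rmapx/rmapy give the source coordinates
// to sample. Out-of-bounds lookups are left black.
Image remapImage(const Image& src,
                 const std::vector<float>& rmapx,
                 const std::vector<float>& rmapy) {
    Image dst{src.w, src.h, std::vector<uint8_t>(src.data.size(), 0)};
    for (int y = 0; y < dst.h; ++y) {
        for (int x = 0; x < dst.w; ++x) {
            int sx = static_cast<int>(std::lround(rmapx[y * dst.w + x]));
            int sy = static_cast<int>(std::lround(rmapy[y * dst.w + x]));
            if (sx >= 0 && sx < src.w && sy >= 0 && sy < src.h)
                dst.data[y * dst.w + x] = src.at(sx, sy);
        }
    }
    return dst;
}
```

Because the maps only depend on the calibration and the output resolution, they are computed once and reused for every frame.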
This callback function takes a raw left stereo image as input, undistorts and rectifies it using the functions defined above, and publishes the result on the rectified image topic. A similar callback function, imgRightCallback, is defined for the right stereo image.
In the main function, the stereo calibration information is read into the calibration matrices declared earlier, and sub_img_left and sub_img_right subscribe to the topics publishing the raw stereo image data. The rectified output images are published on the following topics:
The ROS node can be run as described in the README of the repository.
Dense 3D Reconstruction
Generating dense 3D reconstructions involves two major steps: (1) computing a disparity map, and (2) converting the disparity map into a 3D point cloud. The code for dense reconstruction is implemented in the dense_reconstruction node.
Having time-synced stereo images is important for generating accurate disparity maps. To sync the left and right image messages by their header time stamps, the ApproximateTime policy is used. The procedure is implemented in the main function of the dense_reconstruction node. This syncs the two rectified image topics, and the callback function imgCallback receives the time-synced messages.
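The idea behind approximate-time pairing can be sketched without ROS. The toy function below (a hypothetical simplification, not message_filters' actual algorithm) pairs two sorted timestamp streams by nearest time within a tolerance:

```cpp
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Pair up two sorted streams of message timestamps (in seconds) by
// nearest time, keeping only pairs whose time difference is within
// `tol`. Unmatched messages are dropped, which is roughly what a
// time-synchronized callback ends up seeing.
std::vector<std::pair<double, double>>
syncByTime(const std::vector<double>& left,
           const std::vector<double>& right,
           double tol) {
    std::vector<std::pair<double, double>> pairs;
    std::size_t j = 0;
    for (double tl : left) {
        // Advance j to the right-stream timestamp closest to tl.
        while (j + 1 < right.size() &&
               std::abs(right[j + 1] - tl) <= std::abs(right[j] - tl))
            ++j;
        if (j < right.size() && std::abs(right[j] - tl) <= tol)
            pairs.emplace_back(tl, right[j]);
    }
    return pairs;
}
```

With a 20 Hz stereo pair, a tolerance of a few milliseconds typically keeps almost every frame while rejecting badly skewed pairs.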
Disparity Maps using LIBELAS
LIBELAS (Efficient Large-Scale Stereo Matching) is a great open-source library for generating dense disparity maps. It does not use global optimization techniques, yet it produces great results in near real time. More information about ELAS can be found here.
This function computes the dense disparity map using LIBELAS and publishes it as an 8-bit grayscale image on the topic /camera_left_rect/disparity_map. The disparity map is constructed with the left image as reference. The parameters for LIBELAS can be changed in the corresponding file in the repository.
Any method other than LIBELAS can be implemented inside the generateDisparityMap function to generate disparity maps; one can use OpenCV's StereoBM class as well. The output should be an 8-bit grayscale image.
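To make the matching step concrete, here is a toy block matcher on a single scanline (a drastic simplification of what LIBELAS or StereoBM do; the function name is made up). For each left pixel it tries every disparity up to a maximum and keeps the one minimizing the sum of absolute differences over a small window:

```cpp
#include <climits>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Toy scanline block matcher: for each left pixel x, try disparities
// d = 0..maxDisp and keep the d minimizing the sum of absolute
// differences (SAD) over a 3-pixel window against the right image.
std::vector<int> matchScanline(const std::vector<uint8_t>& left,
                               const std::vector<uint8_t>& right,
                               int maxDisp) {
    int n = static_cast<int>(left.size());
    std::vector<int> disp(n, 0);
    for (int x = 1; x < n - 1; ++x) {
        int bestSad = INT_MAX, bestD = 0;
        for (int d = 0; d <= maxDisp && x - d >= 1; ++d) {
            int sad = 0;
            for (int k = -1; k <= 1; ++k)
                sad += std::abs(int(left[x + k]) - int(right[x - d + k]));
            if (sad < bestSad) { bestSad = sad; bestD = d; }
        }
        disp[x] = bestD;
    }
    return disp;
}
```

Real matchers add subpixel refinement, consistency checks, and support-point interpolation (in ELAS' case), but the core cost comparison looks like this.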
Given a disparity map, a corresponding 3D point cloud can be easily constructed. The Q matrix stored in the calibration file is used for this conversion. The reconstruction is mathematically expressed by the following matrix equation:

$$\begin{bmatrix} X \\ Y \\ Z \\ W \end{bmatrix} = Q \begin{bmatrix} x \\ y \\ d(x,y) \\ 1 \end{bmatrix}$$
where $(X,Y,Z,W)^\top$ denotes the 3D point in homogeneous coordinates and $d(x,y)$ is the disparity of a point $(x,y)$ in the left image. The $4 \times 4$ matrix denotes the Q matrix. This matrix multiplication in homogeneous coordinates is implemented in the publishPointCloud function. It publishes a sensor_msgs/PointCloud message to the topic
camera_left_rect/point_cloud. The point cloud generated is in the reference frame of the left camera. Hence a transformation (XR, XT) is applied to transform the point cloud into a different reference frame, as required by the user. The transformation equation is as follows:

$$P_B = R\,P_A + T$$

where $P_A$ is a point in reference frame $A$ and $P_B$ is the corresponding point in reference frame $B$; $R$ and $T$ denote the rotation matrix and translation vector respectively. In this case $R$ = XR and $T$ = XT. Figure 1 shows some visualizations of the resulting point clouds from dense disparity maps.
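Both steps, the reprojection through Q and the rigid transformation by XR and XT, can be sketched in a few lines of plain C++ (hypothetical helper names, row-major matrices, not the repository's code):

```cpp
#include <array>

struct Point3 { double x, y, z; };

// Reproject an image point (x, y) with disparity d to 3D using a 4x4
// Q matrix (row-major): multiply Q by (x, y, d, 1)^T, then divide by
// the homogeneous coordinate W.
Point3 reproject(const std::array<double, 16>& Q,
                 double x, double y, double d) {
    double X = Q[0] * x + Q[1] * y + Q[2] * d + Q[3];
    double Y = Q[4] * x + Q[5] * y + Q[6] * d + Q[7];
    double Z = Q[8] * x + Q[9] * y + Q[10] * d + Q[11];
    double W = Q[12] * x + Q[13] * y + Q[14] * d + Q[15];
    return {X / W, Y / W, Z / W};
}

// Rigidly transform a point from frame A to frame B: P_B = R * P_A + T,
// with R a row-major 3x3 rotation and T a 3x1 translation.
Point3 transformPoint(const std::array<double, 9>& R,
                      const std::array<double, 3>& T, Point3 p) {
    return {R[0] * p.x + R[1] * p.y + R[2] * p.z + T[0],
            R[3] * p.x + R[4] * p.y + R[5] * p.z + T[1],
            R[6] * p.x + R[7] * p.y + R[8] * p.z + T[2]};
}
```

With a typical Q matrix, W works out to $d / T_x$, so depth falls off as $f\,T_x / d$: larger disparities mean closer points.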
Run the dense reconstruction node as described in the README, and visualize the point cloud in rviz.
Point Cloud Transformation
For ground robots with mounted stereo cameras, the point clouds are usually formed in the reference frame of the left camera. Often the point clouds need to be transformed to a different frame, e.g., a reference frame with the origin at the centre of rotation of the robot projected onto the ground plane. These transformations are hard to calculate mathematically; hence it is easier to find them visually, if the geometry of the scene is known.
The transformation from one reference frame to another is represented by the rotation matrix XR and the translation vector XT. XR can be decomposed into Euler angles ($\phi_x, \phi_y, \phi_z$) while XT contains the translation values in the $X, Y,$ and $Z$ directions. A config file cfg/CamToRobotCalibParams.cfg stores these six values. The config file is bound with the dense_reconstruction node using dynamic_reconfigure.
The above piece of code uses dynamic_reconfigure to configure the transformation parameters.
The above two functions are used to compose the matrices XR and XT. composeRotationCamToRobot finds a rotation matrix from three Euler angles using the $ZYX$ convention, and composeTranslationCamToRobot returns a $3\times 1$ translation vector.
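A minimal sketch of the $ZYX$ composition, assuming the same convention as composeRotationCamToRobot (the helper name below is made up and the repository's implementation may differ):

```cpp
#include <array>
#include <cmath>

using Mat3 = std::array<double, 9>;  // row-major 3x3 matrix

// Compose a rotation matrix from Euler angles using the ZYX
// convention: R = Rz(az) * Ry(ay) * Rx(ax), written out in closed
// form rather than as three matrix multiplications.
Mat3 eulerZYX(double ax, double ay, double az) {
    double cx = std::cos(ax), sx = std::sin(ax);
    double cy = std::cos(ay), sy = std::sin(ay);
    double cz = std::cos(az), sz = std::sin(az);
    return {cz * cy, cz * sy * sx - sz * cx, cz * sy * cx + sz * sx,
            sz * cy, sz * sy * sx + cz * cx, sz * sy * cx - cz * sx,
            -sy,     cy * sx,                cy * cx};
}
```

Setting all three angles to zero yields the identity matrix, which matches the suggested initial value of XR.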
Initially, the values of XR and XT in the calibration file were the $3\times 3$ identity matrix and the $3\times 1$ zero vector. The function debugCamToRobotTransformation(XR, XT) can be used to dynamically change the values of XR and XT with the help of dynamic_reconfigure. Once the desired transformation is found, the resulting values of XR and XT can be updated in the calibration file. To modify the transformation values, start the dynamic_reconfigure GUI (e.g., rqt_reconfigure).
Refer to the video in the Introduction section for a better understanding of how the point cloud transformation works; it is shown towards the end of the video.
Build and Run
Clone the repository stereo-dense-reconstruction into your ROS catkin workspace and build it using catkin_make. Further instructions on running each node are given in the README of the repository.