【原】高翔Slambook第七讲代码解读（2d-2d位姿估计）

小白学视觉 2021-01-28

展开全文

大家好我是小绿，给自己起了英文名叫gREEN，为了方便大家记住我已经把它写在车牌上了。这里感谢福特公司，个人很喜欢猛禽，于是改了张图以后就作为小绿的封面了。

图源：百度

修图：gREEN

下面开始本期的正文。

SLAM前端，或者说视觉里程计VO，做的最主要的一件事就是计算或者估计两帧图像之间的位姿变换，也即旋转矩阵R和平移向量t。根据所应用于计算的特征点信息的不同，可分为三类：2d-2d类、3d-2d类、3d-3d类，这取决于我们手头所现有的能拿来做位姿计算的数据形式。倘若我们只知道某个特征点在相机成像平面的位置，也即像素坐标或归一化平面坐标，却不知其深度，那么这将是一个只具备2d信息的特征点；如果我们通过三角化计算，或者通过RGB-D相机直接测得了某点的深度信息，结合其2d信息，我们能够推算其在该帧图像的相机坐标系下的位置，即相机坐标，此时便形成了一个具备3d信息的特征点。

我们不妨首先来看一下2d-2d位姿估计的过程。关于其中所使用的对极几何、对极约束等数学推算，小绿在此不做赘述，而是将重点放在代码上。

先把子函数声明与主函数放在这里：

#include <iostream>
#include <opencv2/core/core.hpp>
#include <opencv2/features2d/features2d.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/calib3d/calib3d.hpp>

using namespace std;
using namespace cv;

void find_feature_matches (
    const Mat& img_1, const Mat& img_2,
    std::vector<KeyPoint>& keypoints_1,
    std::vector<KeyPoint>& keypoints_2,
    std::vector< DMatch >& matches );

void pose_estimation_2d2d (
   const  std::vector<KeyPoint> & keypoints_1,
    const std::vector<KeyPoint> & keypoints_2,
    const std::vector< DMatch > & matches,
    Mat& R, Mat& t );

void verify_polar_constraint(
  const  std::vector<KeyPoint> & keypoints_1,
  const std::vector<KeyPoint> & keypoints_2,
  const Mat& R, const Mat& t,
  const std::vector<DMatch> & matches );


// 像素坐标转相机归一化坐标
Point2d pixel2cam ( const Point2d& p, const Mat& K );

int main ( int argc, char** argv )
{
    if ( argc != 3 )
    {
        cout<<"usage: pose_estimation_2d2d img1 img2"<<endl;
        return 1;
    }
    //-- 读取图像
    Mat img_1 = imread ( argv[1], CV_LOAD_IMAGE_COLOR );
    Mat img_2 = imread ( argv[2], CV_LOAD_IMAGE_COLOR );

    vector<KeyPoint> keypoints_1, keypoints_2;
    vector<DMatch> matches;
    find_feature_matches ( img_1, img_2, keypoints_1, keypoints_2, matches );
    cout<<"一共找到了"<<matches.size() <<"组匹配点"<<endl;

    //-- 估计两张图像间运动
    Mat R,t;
    pose_estimation_2d2d ( keypoints_1, keypoints_2, matches, R, t );
    
    //验证对极约束
    verify_polar_constraint ( keypoints_1, keypoints_2, R, t, matches );

    return 0;
}

可以看出这里在主函数之外声明了4个函数：find_feature_matches、pose_estimation_2d2d、verify_polar_constraint、pixel2cam。

其中find_feature_matches完全是由feature_extraction.cpp封装成的函数：对存储图像的两个Mat型变量、存储特征点信息的两个vector<KeyPoint>容器、存储特征点配对信息的一个vector<DMatch>容器进行引用调用，并在Mat类变量前加设了const限定符避免修改（因此这里使用&引用调用是为了避免传递参数时对象的构造与函数调用完结时对象的析构，节省内存与时间）。该函数无返回值，但在调用后会将特征点及特征点配对写入keypoints_1、keypoints_2与matches。

pose_estimation_2d2d函数则是本程序中的重头戏：在传递入特征点数据容器keypoints_1与keypoints_2、特征点配对信息容器matches后，进行计算并存入Mat类旋转矩阵R与平移向量t。其内部原理我们稍后会回过头来细细分析。

verify_polar_constraint函数则是在解算出R、t后，将其带回到特征点坐标中，来验算是否满足对极约束的一个过程。在验算过程中其函数内部调用了坐标变换函数pixel2cam（这不是opencv提供的），用来将像素坐标通过相机内参转化为归一化成像平面坐标。

那么对于主函数，其主要操作也可以简单列举如下：读取图像->初始化变量->寻找特征点->计算相机位姿变换->验证计算结果。下面我们来逐步解读各个子函数。

pixel2cam

Point2d pixel2cam ( const Point2d& p, const Mat& K )
{
    return Point2d
           (
               ( p.x - K.at<double> ( 0,2 ) ) / K.at<double> ( 0,0 ),
               ( p.y - K.at<double> ( 1,2 ) ) / K.at<double> ( 1,1 )
           );
}

调用这个函数会返回一个cv::Point2d类的变量，而Point2d类的变量会存储一个2d点的xy坐标，即有两个成员变量.x和.y，类型为double。在此函数中，直接return了一个通过Point2d构造的变量，而并没有先定义变量再返回（我竟然连这个都要感叹一下）。

而这个函数内部则使用了像素坐标、归一化平面坐标与相机内参的等式关系：

即

其中，使用K.at<double> (0,0)访问K矩阵（相机内参）的第(0,0)号元素。在使用.at访问或修改Mat类变量中的某个元素时，一定要确定元素的变量类型并填写在.at后的<>中，否则编译能够通过，但读取的值会出错。

pose_estimation_2d2d（本程序的主要计算环节）

void pose_estimation_2d2d ( const vector< KeyPoint >& keypoints_1, const vector< KeyPoint >& keypoints_2, const vector< DMatch >& matches, Mat& R, Mat& t )
{
    // 相机内参,TUM Freiburg2
    Mat K = ( Mat_<double> ( 3,3 ) << 520.9, 0, 325.1, 0, 521.0, 249.7, 0, 0, 1 );

    //-- 把匹配点转换为vector<Point2f>的形式
    vector<Point2f> points1;
    vector<Point2f> points2;

    for ( int i = 0; i < ( int ) matches.size(); i++ )
    {
        points1.push_back ( keypoints_1[matches[i].queryIdx].pt );
        points2.push_back ( keypoints_2[matches[i].trainIdx].pt );
    }

    //-- 计算基础矩阵
    Mat fundamental_matrix;
    fundamental_matrix = findFundamentalMat ( points1, points2, CV_FM_8POINT );
    cout<<"fundamental_matrix is "<<endl<< fundamental_matrix<<endl;

    //-- 计算本质矩阵
    Point2d principal_point ( 325.1, 249.7 );  //相机光心, TUM dataset标定值
    double focal_length = 521;      //相机焦距, TUM dataset标定值
    Mat essential_matrix;
    essential_matrix = findEssentialMat ( points1, points2, focal_length, principal_point );
    cout<<"essential_matrix is "<<endl<< essential_matrix<<endl;

    //-- 计算单应矩阵
    Mat homography_matrix;
    homography_matrix = findHomography ( points1, points2, RANSAC, 3 );
    cout<<"homography_matrix is "<<endl<<homography_matrix<<endl;

    //-- 从本质矩阵中恢复旋转和平移信息.
    recoverPose ( essential_matrix, points1, points2, R, t, focal_length, principal_point );
    cout<<"R is "<<endl<<R<<endl;
    cout<<"t is "<<endl<<t<<endl;
    
}

这里

Mat K = ( Mat_<double> ( 3,3 ) << 520.9, 0, 325.1, 0, 521.0, 249.7, 0, 0, 1 );

在定义Mat类变量K时直接进行了初始化，调用Mat_类构造函数并规定好元素类型为double，然后通过<<将元素依次传递进去，外面套上括号避免编译出错。

由于在进行对极几何计算时，需要使用特征点的2d坐标，而此时两帧图像中的特征点坐标还保存在两个存储KeyPoint类对象的容器keypoints_1和keypoints_2中，因此我们需要将其中的特征点坐标信息提取出来（方便一会使用OpenCV提供的计算函数进行计算）：

vector<Point2f> points1;
vector<Point2f> points2;

    for ( int i = 0; i < ( int ) matches.size(); i++ )
    {
        points1.push_back ( keypoints_1[matches[i].queryIdx].pt );
        points2.push_back ( keypoints_2[matches[i].trainIdx].pt );
    }

这里特征点坐标按照Point2f进行存储，因为在keypoints_1中存储的点的坐标（keypoints_1[0].pt）是Point2f类型的，因此定义两个存储Point2f类型对象的容器points1与points2。在坐标值的存储循环中，以i=0为例：

points1.push_back ( keypoints_1[matches[0].queryIdx].pt );
points2.push_back ( keypoints_2[matches[0].trainIdx].pt );

调用容器对应的push_back()函数将括号内的值加入到容器的尾部。此时加入的值分别为keypoints_1[matches[0].queryIdx].pt与keypoints_2[matches[0].trainIdx].pt，分别为第一对特征点对（角标从0开始，i=0对应第一对特征点对）中，前一帧（查询图像）中的特征点索引在keypoints_1中对应的特征点坐标，与后一帧（训练图像）中的特征点索引在keypoints_2中对应的特征点坐标（有点啰嗦）。

//-- 计算基础矩阵
    Mat fundamental_matrix;
    fundamental_matrix = findFundamentalMat ( points1, points2, CV_FM_8POINT );
    cout<<"fundamental_matrix is "<<endl<< fundamental_matrix<<endl;

进而，调用OpenCV提供的基础矩阵计算函数findFundamentalMat，按照八点法进行计算，并返回一个4×4的F矩阵fundamental_matrix。

//-- 计算本质矩阵
    Point2d principal_point ( 325.1, 249.7 );  //相机光心, TUM dataset标定值
    double focal_length = 521;      //相机焦距, TUM dataset标定值
    Mat essential_matrix;
    essential_matrix = findEssentialMat ( points1, points2, focal_length, principal_point );
    cout<<"essential_matrix is "<<endl<< essential_matrix<<endl;

同理，调用OpenCV提供的本质矩阵计算函数findEssentialMat计算本质矩阵essential_matrix。由于计算本质矩阵E时需要提供归一化平面坐标，因此需要将像素坐标转化成归一化平面坐标，需要提供相机内参c_u、c_v与f。

//-- 从本质矩阵中恢复旋转和平移信息.
    recoverPose ( essential_matrix, points1, points2, R, t, focal_length, principal_point );
    cout<<"R is "<<endl<<R<<endl;
    cout<<"t is "<<endl<<t<<endl;

最后，通过OpenCV提供的R、t计算函数recoverPose计算R和t。由于函数默认使用本质矩阵进行解算，因此需要传入E。

verify_polar_constraint

void verify_polar_constraint(
  const vector< KeyPoint >& keypoints_1, const vector< KeyPoint >& keypoints_2, const Mat& R, const Mat& t, const vector< DMatch >& matches )
{
  
    //-- 验证E=t^R*scale
    Mat t_x = ( Mat_<double> ( 3,3 ) <<
                0,                      -t.at<double> ( 2,0 ),     t.at<double> ( 1,0 ),
                t.at<double> ( 2,0 ),      0,                      -t.at<double> ( 0,0 ),
                -t.at<double> ( 1,0 ),     t.at<double> ( 0,0 ),      0 );

    cout<<"t^R="<<endl<<t_x*R<<endl;

    //-- 验证对极约束
    Mat K = ( Mat_<double> ( 3,3 ) << 520.9, 0, 325.1, 0, 521.0, 249.7, 0, 0, 1 );
    for ( DMatch m: matches )
    {
        Point2d pt1 = pixel2cam ( keypoints_1[ m.queryIdx ].pt, K );
        Mat y1 = ( Mat_<double> ( 3,1 ) << pt1.x, pt1.y, 1 );
        Point2d pt2 = pixel2cam ( keypoints_2[ m.trainIdx ].pt, K );
        Mat y2 = ( Mat_<double> ( 3,1 ) << pt2.x, pt2.y, 1 );
        Mat d = y2.t() * t_x * R * y1;
        cout << "epipolar constraint = " << d << endl;
    }
}

这里是为了验证刚才所计算得到的R和t是否满足对极约束：

这里有一个循环条件语句：

for ( DMatch m: matches )
    {
       ...
    }

使用for循环对vector<DMatch>类型的容器进行遍历，且在每个循环内部所产生的m也为DMatch类型的变量（由于这里使用“m : matches”而非“&m : matches”，所以我们得到的m只是matches这个容器中存储的第某个变量的副本，并非引用，因而无法修改matches中的具体变量）。

下面来看一下程序运行结果：