[Paper Reading] AirSLAM: An Efficient and Illumination-Robust Point-Line Visual SLAM System
AirSLAM: An Efficient and Illumination-Robust Point-Line Visual SLAM System [XHYWX24] arxiv:2408.03520
author: Kuan Xu , Yuefan Hao , Shenghai Yuan , Chen Wang , Lihua Xie
First, feature detection and tracking often fail due to drastic changes or low light, severely affecting the quality of the estimated trajectory [6], [7].
The proposed system is a hybrid system as we need the robustness of data-driven approaches and the accuracy of geometric methods.
Consequently, we aim to design an efficient unified model that can detect keypoints and line features concurrently However, achieving a unified model for keypoint and line detection is challenging, as these tasks typically require different real-image datasets and training procedures.
It is important to note that in this paper, the term “line detection” refers specifically to the wireframe parsing task.
In the first round, only the backbone and the keypoint detection module are trained, which means we need to train a keypoint detection network. In the second round, the backbone and the keypoint detection module are fixed, and we only train the line detection module on the Wireframe dataset.
We use LightGlue [26] to match keypoints
line representation: 使用的是 plucker 坐标系 $(n^T,v^T)^T$,v 是单位向量,表示直线的方向,n 是法向量,其模长直线到原点的距离。
直线上的三个点我们表示为 $X_1$、$X_2$ 和 $ X_l$,那么该直线的 Plucker 坐标系可以表示为:
\[\begin{bmatrix} n \\ v \end{bmatrix} = \begin{bmatrix} X_l \times (X_1 - X_2) \\ X_1 - X_2 \end{bmatrix}\]该直线的 plucker 坐标系在另一个坐标系下为:
\[\begin{align} \begin{bmatrix} n' \\ v' \end{bmatrix} &= \begin{bmatrix} (RX_l + t) \times [(RX_1+t) - (RX_2+t)] \\ (RX_1+t) - (RX_2+t) \end{bmatrix} \\ &= \begin{bmatrix} R[X_l \times (X_1 - X_2)] + t \times R(X_1 - X_2) \\ R(X_1 - X_2) \end{bmatrix} \\ &= \begin{bmatrix} Rn + \lfloor t \rfloor \times Rv \\ Rv \end{bmatrix} \\ &= \begin{bmatrix} R & \lfloor t \rfloor \times R \\ 0 & R \end{bmatrix} \begin{bmatrix} n\\ v \end{bmatrix} \end{align}\]