There are multiple ways of estimating camera position and orientation, typical approaches include

  • camera pose from point correspondences
  • camera pose from Fundamental matrix

We will discuss the theory and implementation of each approach in this post.

camera pose from point correspondences

Basic equation

Given a point correspondence \(\mathbf{X_i}\leftrightarrow\mathbf{x_i}\) bewteen 3D point \(\mathbf{X}_i\) and 2D image point \(\mathbf{x}_i\), we have the following relation

\[\mathbf{x}_i = P\mathbf{X}_i\]

Since \(P\) is only defined up to a scale, this relation is better written as

\[\mathbf{x}_i\times P\mathbf{X}_i = 0\]

Let \(\mathbf{x}_i = [x_i, y_i, w_i]\), and \(P = \begin{bmatrix}\mathbf{P^1}\\ \mathbf{P^2}\\ \mathbf{P^3} \end{bmatrix}\), we derive the relation

\[\begin{bmatrix} \mathbf{0}^\top & -w_i\mathbf{X}_i^\top & y_i\mathbf{X}_i^\top\\ w_i\mathbf{X}_i^\top & \mathbf{0}^\top & -x_i\mathbf{X}_i^\top\\ -y_i\mathbf{X}_i^\top & x_i\mathbf{X}_i^\top & \mathbf{0}^\top\\ \end{bmatrix} \begin{bmatrix}\mathbf{P^{1\top}}\\ \mathbf{P^{2\top}}\\ \mathbf{P^{3\top}} \end{bmatrix}=\mathbf{0}\]

It can be easily verified that the third column is the linear combination of the first two, thus we only have two equations from a correspondening pair.

\[\begin{bmatrix} \mathbf{0}^\top & -w_i\mathbf{X}_i^\top & y_i\mathbf{X}_i^\top\\ w_i\mathbf{X}_i^\top & \mathbf{0}^\top & -x_i\mathbf{X}_i^\top\\ \end{bmatrix} \begin{bmatrix}\mathbf{P^{1\top}}\\ \mathbf{P^{2\top}}\\ \mathbf{P^{3\top}} \end{bmatrix}=\mathbf{0}\]

since each pair of correspondence can provide 2 equations, and there are 11 DoF in \(P\), thus \(5\frac{1}{2}\) pairs of correspondences are required, only the \(x-\)coordinate or \(y-\)coordinate of the 6th pair is needed.

The projection matrix \(P\) is computed by solving \(A\mathbf{p}=0\), where \(\mathbf{p}\) contains the entries of \(P\).

camera pose from Fundamental matrix

We start by giving the conclusion:

Result 1. The general formula for a pair of canonic camera matrices corresponding to a fundamental matrix \(F\) is given by

\[P=\left[\mathbf{I} | \mathbf{0}\right] \quad P'=\left[\left[\mathbf{e}'\right]_\times F + \mathbf{e}'\mathbf{v}^\top | \lambda \mathbf{e}' \right]\]

where \(\mathbf{v}\) is any 3-vector, and \(\lambda\) a non-zero scalar.

The proof proceeds as follows:

1. Canonical formular

Result 2. Let \(F\) be a fundamental matrix and \(S\) any skew-symmetric matrix. Define the pair of camera matrics

\[P=\left[\mathbf{I} | \mathbf{0}\right] \quad P'=\left[SF|\mathbf{e}'\right]\]

where \(\mathbf{e}'\) is the epipole. To prove Result 2, we first need the following result:

Result 3. A non-zero matrix \(F\) is the fundamental matrix corresponding to a pair of camera matrix \(P\) and \(P'\) if and only if \(P'^\top FP\) is skew-symmetric.

Proof. The condition that \(P'^\top FP\) is skew-symmetric is equivalent to \(X^\top P'^\top FPX=0\), which is equivalent to \(x^\top F x=0\) if \(x=PX\) and \(x'=P'X\).

We can then prove Result 2 by showing that

\[\left[SF|\mathbf{e}'\right]^\top F \left[\mathbf{I} | \mathbf{0}\right] = \begin{bmatrix} F^\top S^\top F & \mathbf{0}\\ \mathbf{0}^\top & 0 \end{bmatrix}\]

which is skew symmetric.

The skew symmetric matrix \(S\) may be written in terms of its null vector as \(S=\left[s\right]_\times\). Then \(P'\) can be re-written as

\[P'=\left[\left[s\right]_\times F\mid \mathbf{e}'\right]\]

The projection matrix \(P'\) has rank 3 provided \(\mathbf{s}^\top \mathbf{e}'\not = 0\). A good choice of \(S\) is \(S=\left[\mathbf{e}'\right]_\times\), for in this case \(e'^\top e'\not = 0\).

2. Camera not at infinity

However, the camera center obtained above is at infinity. To see this, we just need to prove that the left \(3\times 3\) matrix of \(P'\) is singular, i.e., \(\left[s\right]_\times F\) has rank 2. (TBD).

The following result can help us get to our goal:

Result 4. Camera pairs in canonical form \(P=\left[\mathbf{I}\mid\mathbf{0}\right], P'=\left[A\mid\mathbf{a}\right]\) have the same fundamental matrix as the canonical pair \(P=\left[\mathbf{I}\mid\mathbf{0}\right], P'=\left[A + \mathbf{a}\mathbf{v}^\top\mid k\mathbf{a}\right]\).

where \(\mathbf{v}\) is any 3-vector, and \(k\) a non-zero scalar.

Result 1 follows from Result 4.