James Gregson's Website

Dec 03, 2021

Quaternion Notes

These are some helpful quaternion identities, all summarized from Kok et al., Using Inertial Sensors for Position and Orientation Estimation although I have put the scalar component last.

A quaternion \(q\) is represented as a 4-tuple, or an imaginary vector portion \(q_v=[q_x,q_y,q_z]^T\) and a scalar value \(q_s\) giving \(q = [q_v^T, q_s]^T\). The norm of a quaternion is \(\|q\|_2 = (q_v^T q_v + q_s^2)^{\frac{1}{2}}\), the conjugate of a quaternion is \(q^* = [-q_v^T, q_s]^T\) and the inverse of a quaternion is \(q^{-1} = \frac{q^*}{\|q\|_2^2}\).

Unit quaternions have \(\|q\|_2 = 1\), in which case \(q^* = q^{-1}\). Unit quaternions also represent spatial rotations/orientations of \(\theta\) radians around a unit axis \(a\) via \(q = [a^T \sin\left(\frac{\theta}{2}\right), \cos\left(\frac{\theta}{2}\right) ]^T\). Both \(q\) and \(-q = [-q_v, -q_s]^T\) represent the same orientation, although the one with positive scalar component takes the shortest rotation to achieve that orientation since \(\cos^{-1}(|x|) \leq \cos^{-1}(-|x|)\).

The multiplication of two quaternions is \(p \otimes q = [ (p_s q_v + q_s p_v + p_v \times q_v)^T, p_s q_s - p_v \times q_v ]^T\). The presence of the cross-product means that \(p\otimes q \neq q \otimes p\). When \(p, q\) are unit quaternions, \(p\otimes q\) is equivalent to chaining the two rotations. The rotation matrix corresponding to a unit quaternion is:

\begin{equation*} R(q) = \begin{bmatrix} 1 - 2 q_y^2 - 2*q_z^2 & 2 q_x q_y - 2 q_z q_s & 2 q_x q_z + 2 q_y q_s \\ 2 q_x q_y + 2 q_z q_s & 1 - 2 q_x^2 - 2 q_z^2 & 2 q_y q_z - 2 q_x q_s \\ 2 q_x q_z - 2 q_y q_s & 2 q_y q_z + 2 q_x q_s & 1 - 2 q_x^2 - 2 q_y^2 \end{bmatrix} \end{equation*}

or in python:

def quat2mat( q: np.ndarray ) -> np.ndarray:
    assert q.ndim == 1 and q.size == 4
    qx,qy,qx,qs = q
    return np.ndarray((
        ( 1 - 2*qy**2 - 2*qz**2,     2*qx*qy - 2*qz*qs,     2*qx*qz + 2*qy*qs ),
        (     2*qx*qy + 2*qz*qs, 1 - 2*qx**2 - 2*qz**2,     2*qy*qz - 2*qx*qs ),
        (     2*qx*qz - 2*qy*qs,     2*qy*qz + 2*qx*qs, 1 - 2*qx**2 - 2*qy**2 )
    ))

The, usually non-unit, quaternion representation of a vector \(v^\wedge = [ v^T, 0]^T\). The vector can be rotated by the unit-quaternion \(q\) via \(R(q) v = (q\otimes v^\wedge \otimes q^*)_v\) where only the vector portion is retained.

The quaternion-quaternion product can be represented by matrix multiplication in either the same, or reversed order, i.e.:

\begin{align*} \newcommand{\pd}[2]{\frac{\partial #1}{\partial #2}} p \otimes q &= p^L q = q^R p \hspace{1cm} & \mbox{where} \\ p^L &= \begin{bmatrix} p_s I + \lfloor p_v \rfloor_\times & p_v \\ -p_v^T & p_s \end{bmatrix} & \\ q^R &= \begin{bmatrix} q_s I - \lfloor q_v \rfloor_\times & q_v \\ -q_v^T & q_s \end{bmatrix} & \end{align*}

where \(\lfloor m \rfloor_\times\) is the matrix representation of the cross-product. This is extremely useful when differentiating quaternion expressions since \(p^L = \pd{}{q}(p\otimes q)\) and \(q^R = \pd{}{p}(p\otimes q)\).

A Rodrigues rotation vector \(\eta = \theta a\) where \(\|a\|_2 = 1\) can be mapped to a quaternion via the exponential map:

\begin{align*} \exp_q(\eta) =& q = \begin{pmatrix} a \sin\left(\frac{\theta}{2}\right) \\ \cos\left(\frac{\theta}{2}\right) \end{pmatrix} \\ \log_q(q) =& \eta = 2 q_v \end{align*}

Under the assumption that \(\theta << 1\), the following approximations can be made:

\begin{align*} \exp_q(\eta) \approx& \begin{pmatrix} \frac{\eta}{2} \\ 1 \end{pmatrix} \\ \pd{\exp_q(\eta)}{\eta} \approx& \begin{pmatrix} \frac{1}{2} I_{3\times3} \\ 0 \end{pmatrix} \\ \pd{\log_q(q)}{q} \approx& \begin{pmatrix} 2 I_{3\times3} & 0_{3\times1} \end{pmatrix} \end{align*}

The above are also very helpful in manifold optimization over quaternion states where the Newton update is generally small which results in linear Jacobians with respect to the update variables.

Nov 28, 2021

QR Decomposition Notes

I find this process easier to reason through by writing out a small 3-column version explicitly:

\begin{equation*} \begin{bmatrix} A_{*0} & A_{*1} & A_{*2} \end{bmatrix} = \begin{bmatrix} Q_{*0} & Q_{*1} & Q_{*2} \end{bmatrix} \begin{bmatrix} R_{00} & R_{01} & R_{02} \\ & R_{11} & R_{12} \\ & & R_{22} \end{bmatrix} \end{equation*}

which gives:

\begin{align*} &A_{*0} = Q_{*0} R_{00} \\ &A_{*1} = Q_{*0} R_{01} + Q_{*1} R_{11} \\ &A_{*2} = Q_{*0} R_{02} + Q_{*1} R_{12} + Q_{*2} R_{22} \end{align*}

From this and the orthonormal properties of \(Q\), the meaning of the \(R_{ji}\) values are clear, the projection of the \(i\)'th column of \(A\) onto the \(j\)'th column of \(Q\). These allows fixing the values of \(R_{ji}\) and norms of \(Q_{*i}\):

  • \(Q_{*i}^T Q_{*j} = 0, \forall i \neq j\): columns of \(Q\) are orthogonal so each \(R_{j*}\) value must account for entire parallel component of \(A_{*j}\) onto each column of \(Q\).
  • \(\|Q_{*i}\|_2 = 1, \forall i\): constrains \(\|Q_{*i}\|\) and corresponding \(R_{ji}\) values.

Therefore:

  • \(R_{00} = \|A_{*0}\|_2\) and \(Q_{*0} = A_{*0}/R_{00}\)
  • \(R_{01} = Q_{*0}^T A_{*1}\), \(s_1 = A_{*1} - R_{01} Q_{*0}\), \(R_{11} = \|s_1\|_2\) \(Q_{*1} = s_1/R_{11}\)
  • \(R_{02} = Q_{*0}^T A_{*2},~R_{12} = Q_{*1}^T A_{*2}, s_2 = A_{*2} - R_{02} Q_{*0} - R_{12} Q_{*1}, R_{22} = \| s_2 \|_2, Q_{*2} = s_2/R_{22}\)

or in general:

  • \(R_{ji} = Q_{*j}^T A_{*i}, \forall j < i\)
  • \(s_{i} = A_{*i} - \sum_j R_{ji} Q_{*j}\)
  • \(R_{ii} = \| s_{i} \|_2, Q_{*i} = s_i/R_{ii}\)

and in python:

def qr_gs( A, inplace=True ):
    '''Decompose A into Q*R with Q orthonormal and R upper triangular using classical Gram-Schmidt (unstable)'''
    A = A if inplace else A.copy()
    R = np.zeros((A.shape[1],A.shape[1]))
    for i in range( A.shape[1] ):
        for j in range(i):
            R[j,i] = np.sum(A[:,j]*A[:,i])
            A[:,i] -= R[j,i]*A[:,j]
        R[i,i] = np.linalg.norm(A[:,i])
        A[:,i] /= R[i,i]
    return A,R

The form above generates \(Q\) column by column but has stability issues due to the use of classical Gram-Schmidt. It can be improved by replacing classical Gram-Schmidt with modified Gram-Schmidt:

def qr_mgs( A, inplace=True ):
    '''Decompose A into Q*R with Q orthonormal and R upper triangular using modified Gram-Schmidt

    Assumes columns of A are linearly independent.
    '''
    A = A if inplace else A.copy()
    R = np.zeros((A.shape[1],A.shape[1]))
    for i in range( A.shape[1] ):
        R[i,i] = np.linalg.norm(A[:,i])
        A[:,i] /= R[i,i]
        for j in range(i+1,A.shape[1]):
            R[i,j] = np.dot(A[:,j],A[:,i])
            A[:,j] -= R[i,j]*A[:,i]
    return A,R

The inner loop of this can be replaced entirely with vectorization in python but the above form gives a reasonable starting point for porting to another language like C or C++. These implementations should also check for \(R_{ii} = 0\) which indicate a rank deficient system.