These are some helpful quaternion identities, all summarized from Kok et al., Using Inertial Sensors for Position and Orientation Estimation, although I have put the scalar component last.
A quaternion \(q\) is represented as a 4-tuple consisting of an imaginary vector portion \(q_v=[q_x,q_y,q_z]^T\) and a scalar value \(q_s\), giving \(q = [q_v^T, q_s]^T\). The norm of a quaternion is \(\|q\|_2 = (q_v^T q_v + q_s^2)^{\frac{1}{2}}\), the conjugate is \(q^* = [-q_v^T, q_s]^T\) and the inverse is \(q^{-1} = \frac{q^*}{\|q\|_2^2}\).
Unit quaternions have \(\|q\|_2 = 1\), in which case \(q^* = q^{-1}\). Unit quaternions also represent spatial rotations/orientations of \(\theta\) radians around a unit axis \(a\) via \(q = [a^T \sin\left(\frac{\theta}{2}\right), \cos\left(\frac{\theta}{2}\right) ]^T\). Both \(q\) and \(-q = [-q_v^T, -q_s]^T\) represent the same orientation, although the one with the positive scalar component corresponds to the shorter rotation since \(\cos^{-1}(|x|) \leq \cos^{-1}(-|x|)\).
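As a rough numpy sketch of these basics (scalar-last ordering throughout; the helper names quat_conj, quat_inv and quat_from_axis_angle are just my own):
import numpy as np

def quat_conj( q: np.ndarray ) -> np.ndarray:
    '''Conjugate: negate the vector part, keep the scalar part.'''
    return np.array(( -q[0], -q[1], -q[2], q[3] ))

def quat_inv( q: np.ndarray ) -> np.ndarray:
    '''Inverse: conjugate divided by the squared norm; equals the conjugate for unit quaternions.'''
    return quat_conj( q )/np.dot( q, q )

def quat_from_axis_angle( a: np.ndarray, theta: float ) -> np.ndarray:
    '''Unit quaternion for a rotation of theta radians about the unit axis a.'''
    return np.concatenate(( a*np.sin( theta/2 ), (np.cos( theta/2 ),) ))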
The multiplication of two quaternions is \(p \otimes q = [ (p_s q_v + q_s p_v + p_v \times q_v)^T, p_s q_s - p_v^T q_v ]^T\). The presence of the cross-product means that, in general, \(p\otimes q \neq q \otimes p\). When \(p, q\) are unit quaternions, \(p\otimes q\) is equivalent to chaining the two rotations. The rotation matrix corresponding to a unit quaternion is:
\begin{equation*}
R(q) = \begin{bmatrix} 1 - 2 q_y^2 - 2 q_z^2 & 2 q_x q_y - 2 q_z q_s & 2 q_x q_z + 2 q_y q_s \\
2 q_x q_y + 2 q_z q_s & 1 - 2 q_x^2 - 2 q_z^2 & 2 q_y q_z - 2 q_x q_s \\
2 q_x q_z - 2 q_y q_s & 2 q_y q_z + 2 q_x q_s & 1 - 2 q_x^2 - 2 q_y^2 \end{bmatrix}
\end{equation*}
or in python:
def quat2mat( q: np.ndarray ) -> np.ndarray:
    assert q.ndim == 1 and q.size == 4
    qx,qy,qz,qs = q
    return np.array((
        ( 1 - 2*qy**2 - 2*qz**2, 2*qx*qy - 2*qz*qs,     2*qx*qz + 2*qy*qs     ),
        ( 2*qx*qy + 2*qz*qs,     1 - 2*qx**2 - 2*qz**2, 2*qy*qz - 2*qx*qs     ),
        ( 2*qx*qz - 2*qy*qs,     2*qy*qz + 2*qx*qs,     1 - 2*qx**2 - 2*qy**2 )
    ))
A vector \(v\) has the (generally non-unit) quaternion representation \(v^\wedge = [ v^T, 0]^T\). The vector can be rotated by the unit quaternion \(q\) via \(R(q) v = (q\otimes v^\wedge \otimes q^*)_v\), where only the vector portion of the result is retained.
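In the same style, a sketch of the product and the vector rotation (quat_mul and quat_rotate are again names of my own; quat_conj is reused from the sketch above):
def quat_mul( p: np.ndarray, q: np.ndarray ) -> np.ndarray:
    '''Quaternion product p (x) q with scalar-last ordering.'''
    pv, ps = p[:3], p[3]
    qv, qs = q[:3], q[3]
    vec = ps*qv + qs*pv + np.cross( pv, qv )
    return np.concatenate(( vec, (ps*qs - np.dot( pv, qv ),) ))

def quat_rotate( q: np.ndarray, v: np.ndarray ) -> np.ndarray:
    '''Rotate vector v by unit quaternion q via q (x) v^ (x) q*, keeping the vector part.'''
    vq = np.concatenate(( v, (0.0,) ))
    return quat_mul( quat_mul( q, vq ), quat_conj( q ) )[:3]

# should agree with the rotation matrix: np.allclose( quat_rotate( q, v ), quat2mat( q ) @ v )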
The quaternion-quaternion product can be represented by matrix multiplication in either the same, or reversed order, i.e.:
\begin{align*}
\newcommand{\pd}[2]{\frac{\partial #1}{\partial #2}}
p \otimes q &= p^L q = q^R p \hspace{1cm} & \mbox{where} \\
p^L &= \begin{bmatrix} p_s I + \lfloor p_v \rfloor_\times & p_v \\ -p_v^T & p_s \end{bmatrix} & \\
q^R &= \begin{bmatrix} q_s I - \lfloor q_v \rfloor_\times & q_v \\ -q_v^T & q_s \end{bmatrix} &
\end{align*}
where \(\lfloor m \rfloor_\times\) is the matrix representation of the cross-product. This is extremely useful when differentiating quaternion expressions since \(p^L = \pd{}{q}(p\otimes q)\) and \(q^R = \pd{}{p}(p\otimes q)\).
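A sketch of building these two matrices (skew, quat_left and quat_right are illustrative names):
def skew( v: np.ndarray ) -> np.ndarray:
    '''Matrix form of the cross product: skew(v) @ u == np.cross(v, u).'''
    return np.array(( (     0, -v[2],  v[1] ),
                      (  v[2],     0, -v[0] ),
                      ( -v[1],  v[0],     0 ) ))

def quat_left( p: np.ndarray ) -> np.ndarray:
    '''Left-multiplication matrix: quat_left(p) @ q == p (x) q.'''
    L = np.zeros((4,4))
    L[:3,:3] = p[3]*np.eye(3) + skew( p[:3] )
    L[:3, 3] = p[:3]
    L[ 3,:3] = -p[:3]
    L[ 3, 3] = p[3]
    return L

def quat_right( q: np.ndarray ) -> np.ndarray:
    '''Right-multiplication matrix: quat_right(q) @ p == p (x) q.'''
    R = np.zeros((4,4))
    R[:3,:3] = q[3]*np.eye(3) - skew( q[:3] )
    R[:3, 3] = q[:3]
    R[ 3,:3] = -q[:3]
    R[ 3, 3] = q[3]
    return R

# quat_left(p) @ q, quat_right(q) @ p and quat_mul(p, q) should all agree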
A Rodrigues rotation vector \(\eta = \theta a\) where \(\|a\|_2 = 1\) can be mapped to a quaternion via the exponential map:
\begin{align*}
\exp_q(\eta) =& q = \begin{pmatrix} a \sin\left(\frac{\theta}{2}\right) \\ \cos\left(\frac{\theta}{2}\right) \end{pmatrix} \\
\log_q(q) =& \eta = 2 \arccos\left(q_s\right) \frac{q_v}{\|q_v\|_2}
\end{align*}
Under the assumption that \(\theta \ll 1\), the following approximations can be made:
\begin{align*}
\exp_q(\eta) \approx& \begin{pmatrix} \frac{\eta}{2} \\ 1 \end{pmatrix} \\
\pd{\exp_q(\eta)}{\eta} \approx& \begin{pmatrix} \frac{1}{2} I_{3\times3} \\ 0 \end{pmatrix} \\
\pd{\log_q(q)}{q} \approx& \begin{pmatrix} 2 I_{3\times3} & 0_{3\times1} \end{pmatrix}
\end{align*}
The above approximations are also very helpful in manifold optimization over quaternion states, where the update step is generally small, so the Jacobians with respect to the update variables stay simple and linear.
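A sketch of both maps, guarding the small-angle case (exp_q/log_q are illustrative names; arctan2 replaces arccos purely for numerical conditioning):
def exp_q( eta: np.ndarray, eps: float = 1e-12 ) -> np.ndarray:
    '''Map a rotation vector eta = theta*a to a unit quaternion (scalar last).'''
    theta = np.linalg.norm( eta )
    if theta < eps:
        # small-angle form [eta/2, 1], renormalized so the result stays a unit quaternion
        q = np.concatenate(( eta/2, (1.0,) ))
        return q/np.linalg.norm( q )
    return np.concatenate(( (eta/theta)*np.sin( theta/2 ), (np.cos( theta/2 ),) ))

def log_q( q: np.ndarray, eps: float = 1e-12 ) -> np.ndarray:
    '''Map a unit quaternion back to a rotation vector eta = theta*a.'''
    nv = np.linalg.norm( q[:3] )
    if nv < eps:
        return 2*q[:3]  # small-angle approximation eta ~= 2*q_v
    # 2*arctan2(nv, q_s) equals 2*arccos(q_s) for unit quaternions but is better conditioned
    return 2*np.arctan2( nv, q[3] )*q[:3]/nv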
I find the Gram-Schmidt QR decomposition \(A = QR\), with \(Q\) orthonormal and \(R\) upper triangular, easier to reason through by writing out a small 3-column version explicitly:
\begin{equation*}
\begin{bmatrix} A_{*0} & A_{*1} & A_{*2} \end{bmatrix} =
\begin{bmatrix} Q_{*0} & Q_{*1} & Q_{*2} \end{bmatrix} \begin{bmatrix}
R_{00} & R_{01} & R_{02} \\
& R_{11} & R_{12} \\
& & R_{22}
\end{bmatrix}
\end{equation*}
which gives:
\begin{align*}
&A_{*0} = Q_{*0} R_{00} \\
&A_{*1} = Q_{*0} R_{01} + Q_{*1} R_{11} \\
&A_{*2} = Q_{*0} R_{02} + Q_{*1} R_{12} + Q_{*2} R_{22}
\end{align*}
From this and the orthonormality of \(Q\), the meaning of the \(R_{ji}\) values is clear: \(R_{ji}\) is the projection of the \(i\)'th column of \(A\) onto the \(j\)'th column of \(Q\). This allows fixing the values of \(R_{ji}\) and the norms of \(Q_{*i}\):
- \(Q_{*i}^T Q_{*j} = 0, \forall i \neq j\): the columns of \(Q\) are orthogonal, so each \(R_{ji}\) must account for the entire component of \(A_{*i}\) parallel to \(Q_{*j}\).
- \(\|Q_{*i}\|_2 = 1, \forall i\): the columns of \(Q\) have unit norm, which pins down \(R_{ii}\) as the norm of whatever remains of \(A_{*i}\) once those projections are removed.
Therefore:
- \(R_{00} = \|A_{*0}\|_2\) and \(Q_{*0} = A_{*0}/R_{00}\)
- \(R_{01} = Q_{*0}^T A_{*1}\), \(s_1 = A_{*1} - R_{01} Q_{*0}\), \(R_{11} = \|s_1\|_2\) and \(Q_{*1} = s_1/R_{11}\)
- \(R_{02} = Q_{*0}^T A_{*2},~R_{12} = Q_{*1}^T A_{*2}, s_2 = A_{*2} - R_{02} Q_{*0} - R_{12} Q_{*1}, R_{22} = \| s_2 \|_2, Q_{*2} = s_2/R_{22}\)
or in general:
- \(R_{ji} = Q_{*j}^T A_{*i}, \forall j < i\)
- \(s_{i} = A_{*i} - \sum_{j<i} R_{ji} Q_{*j}\)
- \(R_{ii} = \| s_{i} \|_2, Q_{*i} = s_i/R_{ii}\)
and in python:
def qr_gs( A, inplace=True ):
    '''Decompose A into Q*R with Q orthonormal and R upper triangular using classical Gram-Schmidt (unstable)'''
    A = A if inplace else A.copy()
    R = np.zeros((A.shape[1],A.shape[1]))
    for i in range( A.shape[1] ):
        a_i = A[:,i].copy()  # project the *original* column, which is what makes this classical Gram-Schmidt
        for j in range(i):
            R[j,i] = np.dot(A[:,j],a_i)
            A[:,i] -= R[j,i]*A[:,j]
        R[i,i] = np.linalg.norm(A[:,i])
        A[:,i] /= R[i,i]
    return A,R
The form above generates \(Q\) column by column but has stability issues due to the use of classical Gram-Schmidt. It can be improved by replacing classical Gram-Schmidt with modified Gram-Schmidt:
def qr_mgs( A, inplace=True ):
    '''Decompose A into Q*R with Q orthonormal and R upper triangular using modified Gram-Schmidt
    Assumes columns of A are linearly independent.
    '''
    A = A if inplace else A.copy()
    R = np.zeros((A.shape[1],A.shape[1]))
    for i in range( A.shape[1] ):
        R[i,i] = np.linalg.norm(A[:,i])
        A[:,i] /= R[i,i]
        for j in range(i+1,A.shape[1]):
            R[i,j] = np.dot(A[:,j],A[:,i])
            A[:,j] -= R[i,j]*A[:,i]
    return A,R
The inner loop of this can be replaced entirely with vectorized numpy operations, but the explicit loops above give a reasonable starting point for porting to another language like C or C++. Both implementations should also check for \(R_{ii} = 0\), which indicates a rank-deficient system.
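As a rough sketch of that vectorization (one reasonable way to write it, with the rank-deficiency check included; qr_mgs_vec is just an illustrative name):
def qr_mgs_vec( A, inplace=True ):
    '''Modified Gram-Schmidt with the inner loop replaced by vectorized numpy operations.'''
    A = A if inplace else A.copy()
    R = np.zeros((A.shape[1],A.shape[1]))
    for i in range( A.shape[1] ):
        R[i,i] = np.linalg.norm(A[:,i])
        if R[i,i] == 0.0:
            raise ValueError('rank deficient: column %d lies in the span of the previous columns' % i)
        A[:,i] /= R[i,i]
        # project the normalized column out of all remaining columns at once
        R[i,i+1:] = A[:,i] @ A[:,i+1:]
        A[:,i+1:] -= np.outer( A[:,i], R[i,i+1:] )
    return A,R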