The essential matrix relates normalized coordinates in one view with those in another view of the same scene and is widely used in stereo reconstruction algorithms. It is a 3x3 matrix with rank 2, having only 8 degrees of freedom.
Given \(p\) as a normalized point in image P and \(q\) as the corresponding normalied point in another image Q of the same scene, the essential matrix provides the constraint:
It's important that \(p\) and \(q\) be normalized homogeneous points, i.e. the following where \([u_p,v_p]\) are the pixel coordinates of point p:
By fixing the extrinsics of image P as \(R_P = I\) and \(t_P = [0,0,0]^T\) the essential matrix relating P and Q can defined by the relationship:
where \(R\) and \(t\) are the relative transforms of camera Q with respect to camera P and \(t_\times\) is the matrix representation of the the cross-product: