Thursday, November 3, 2011

Moving between Coordinate Frames

Go to Index

Perhaps the most common thing one does when works in the field of computer graphics is doing transformations, and one of them is moving objects from one frame of reference to another. A classical example is camera transformation. We have our objects defined in local space, or world space, and then, to make projection transformation simple, we move all objects to camera space. Camera space is a space where all objects are positioned in such a way as if the camera was positioned in point $(0, 0, 0)$ and looking directly toward $-Z$ (right-handed system) or $+Z$ (left-handed system) axis. When I read about how the view transform matrix is carried out, I wasn't too much surprised. The idea is rather simple. First, we want to translate the camera to the origin, and then we want to orient it such that it looks down the $Z$ axis. We can do so using two matrix transformations, the first one is a translation matrix (filled with the negative coordinates of the camera's position) and the second one is a rotation matrix (filled with the orientation vectors of the camera). The translation matrix was obviously straightforward to understand but I could never fully grasp why the rotation matrix looked the way it looked). I went through basic vector/matrix math in a couple of books but nowhere could find the answer, with one exception. I found the answer in 3D Game Engine Design (2nd edition) by David H. Eberly. He wrote some simple formula from which on everything became immediately clear. So, let's start.

Let's first define our general problem of moving between coordinate frames (in 2D, for simplicity). Given some point $P$ we want to find its position $P'$ in some alternate frame of reference which is located at point $O$ and is spanned on two orthogonal vectors $R$ and $U$. The crucial condition here is that $P$, $O$, $R$ and $U$ are all defined in the same space, which is usually the case (if not, we can bring them using the very approach we're describing right now). The magic formula (which is nothing more than a simple linear combination of two vectors) then is:
\begin{eqnarray}
P = O + (P'_x*R) + (P'_y*U)
\end{eqnarray}
You can check empirically that this works. Just sketch a coordinate frame starting at $(0, 0)$ with two direction vectors $(1, 0)$ and $(0, 1)$. Then, choose some point $O$ in this coordinate frame, as well as vectors $R$ and $U$. And finally, pick some point $P$ and see yourself that the coordinates of $P$ in the starting coordinate frame correspond to the same point in the $(O, R, U)$ coordinate space. Here's some picture:


So let's say that $P = (P_x, P_y) = (5, 4)$, $O = (4.2, 2)$, $R = (\frac{\sqrt{2}}{2}, \frac{\sqrt{2}}{2})$, $U = (-\frac{\sqrt{2}}{2}, \frac{\sqrt{2}}{2})$. Then, the coordinates $(P'_x, P'_y)$ in the coordinate frame $(O, R, U)$ are, more or less, $P' = (2, 0.85)$. Let's now see how to find $P' = (P'_x, P'_y)$.

Okay, so we have one formula with two unknowns $P'_x$ and $P'_y$:
\begin{eqnarray}
P = O + (P'_x*R) + (P'_y*U)
\end{eqnarray}
Unsolvable? Of course it is. The equation above is actually not one but two equations:
\begin{eqnarray}
P_x &=& O_x + (P'_x*R_x) + (P'_y*U_x) \cr
P_y &=& O_y + (P'_x*R_y) + (P'_y*U_y)
\end{eqnarray}
We can now employ any technique to solve this system of equations for $P'_x$ and $P'_y$. However, we can use some assumptions that we have made to simplify the calculations. We assumed that $R$ and $U$ are orthogonal, which is usually the case, but doesn't have to be. If these vectors are not orthogonal then you need to solve the equations above as they are. But here, we take a shortcut, because we know that $R$ and $U$ are orthogonal (orthonormal, to be even more precise), so the following holds:
\begin{eqnarray}
R \cdot R &=& 1 \cr
R \cdot U &=& 0
\end{eqnarray}
Let's see what happens if we apply the dot product with $R$ on both sides:
\begin{eqnarray}
P &=& O + (P'_x*R) + (P'_y*U)\ \ \ / \cdot R \cr
P \cdot R &=& (O \cdot R) + (P'_x*R \cdot R) + (P'_y*U \cdot R) \cr
P \cdot R &=& (O \cdot R) + (P'_x*1) + (P'_y*0) \cr
P'_x &=& (P \cdot R) - (O \cdot R) \cr
P'_x &=& (P - O) \cdot R = (P_x - O_x) * R_x + (P_y - O_y) * R_y
\end{eqnarray}
Doing the same with $\cdot U$ applied to both sides will give:
\begin{eqnarray}
P'_y = (P - O) \cdot U = (P_x - O_x) * U_x + (P_y - O_y) * U_y
\end{eqnarray}
From these two equations we can now easily derive two matrices. The first one should translate the original point $P$ by $-O$. The second one is just filled with $R$ and $U$ vectors:
\[
\begin{bmatrix}
P'_x & P'_y & 1
\end{bmatrix}
=
\begin{bmatrix}
P_x & P_y & 1
\end{bmatrix}
\begin{bmatrix}
1 & 0 & 0 \cr
0 & 1 & 0 \cr
-O_x & -O_y & 1
\end{bmatrix}
\begin{bmatrix}
R_x & U_x & 0 \cr
R_y & U_y & 0 \cr
0 & 0 & 1
\end{bmatrix}
\]
Note that we're using homogenous coordinates here. Otherwise we wouldn't be able to "inject" translation into the matrix.

That's pretty much it.

EDIT 1:
Today is 7th of November. I actually put that note four days ago but have constantly modified it since then... mostly the LaTeX part. Seems that I finally got it working :).

EDIT 2:
Today is 8th of November. I made some big mistake in this note because I thought that it was not possible to solve for $P'_x$ and $P'_y$ if vectors $R$ and $U$ were not orthogonal. Of course, it is possible, and now the note explains how to do that.