Wednesday, 3 October 2012

Fast Texture Projection onto Geometry on the GPU

While implementing various graphics algorithms, a problem that must be solved from time to time is projecting a texture onto some geometry. This happens, for instance, when we want to render a water reflection or cast g-buffer textures onto a light's geometry.

The procedure is pretty straightforward. If we multiply a vertex's position by the whole world-view-projection matrix and perform the perspective division, we get the vertex's position in normalized device coordinates, whose components all lie in the $[-1, 1]$ interval for visible points. To project a texture onto geometry, we need to do exactly the same thing (but using the world-view-projection matrix of some sort of "projector", not the game's "camera") and then remap the vertex's final $x$ and $y$ coordinates from the $[-1, 1]$ interval to the $[0, 1]$ interval. This gives us texture coordinates that can be used to sample the texture.
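Written out, the mapping described above looks roughly like this. This is only a sketch in notation of my choosing: $u$ and $v$ are hypothetical names for the resulting texture coordinates, and $[x, y, z, w]$ is the vertex's position after multiplication by the projector's matrix.

$$u = \frac{1}{2} \cdot \frac{x}{w} + \frac{1}{2}, \qquad v = \frac{1}{2} \cdot \frac{y}{w} + \frac{1}{2}$$

For example, a point on the left edge of the projector's view has $\frac{x}{w} = -1$ and gets $u = 0$, a point in the middle has $\frac{x}{w} = 0$ and gets $u = 0.5$, and a point on the right edge has $\frac{x}{w} = 1$ and gets $u = 1$.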

Let's say that we have a vertex shader like this:
uniform mat4 worldViewProjTransform;
uniform mat4 projectorWorldViewProjTransform;

attribute vec4 attrib_position;

varying vec4 varying_projectorTexCoord;

void main()
{
  gl_Position = attrib_position * worldViewProjTransform;

  varying_projectorTexCoord = attrib_position * projectorWorldViewProjTransform;
}
In this vertex shader we simply multiply the vertex's position by the projector's matrix. The perspective division and the coordinate remapping do not happen here yet. They happen in the fragment shader:
uniform sampler2D projectorSampler;

varying vec4 varying_projectorTexCoord;

void main()
{
  vec4 projectorTexCoord = varying_projectorTexCoord;
  
  // Perspective division: x and y end up in [-1, 1]
  projectorTexCoord /= projectorTexCoord.w;
  // Remap x and y from [-1, 1] to [0, 1]
  projectorTexCoord.xy = 0.5 * projectorTexCoord.xy + 0.5;
  
  gl_FragColor = texture2D(projectorSampler, projectorTexCoord.xy);
}
In this pixel shader we first do the perspective division and then, once we know we are in the $[-1, 1]$ interval, we do the remapping.

The title of this post contains the word "fast". Is our approach fast? Well, on today's hardware it actually is, but we can do better. The first thing that should catch your attention is the division in the pixel shader. Can we do something about it? Yup. Basically every real-time shading language has a function like texture2DProj which, before sampling the texture, divides the coordinates passed as its second argument by their 4th component, $w$. So, we can try doing this:
uniform sampler2D projectorSampler;

varying vec4 varying_projectorTexCoord;

void main()
{
  vec4 projectorTexCoord = varying_projectorTexCoord;

  // Incorrect: this remap assumes the division by w has already happened
  projectorTexCoord.xy = 0.5 * projectorTexCoord.xy + 0.5;
  
  gl_FragColor = texture2DProj(projectorSampler, projectorTexCoord);
}
Will this work? Of course not. That is because we switched the order of operations. In this version of the code the perspective division happens after the remapping, whereas it should happen before. So, we need to fix the remapping code.

Remember that the projected coordinates $[x, y, z, w]$ that come into the pixel shader have not been divided by $w$ yet, so for points inside the projector's frustum $x$, $y$ and $z$ all lie in the $[-w, w]$ interval. This means that we want to remap $x$ and $y$ from $[-w, w]$ to $[0, w]$. Then the division can be performed either manually, followed by a call to texture2D, or by calling texture2DProj directly. So, our new pixel shader can look like this:
uniform sampler2D projectorSampler;

varying vec4 varying_projectorTexCoord;

void main()
{
  vec4 projectorTexCoord = varying_projectorTexCoord;

  // Remap x and y from [-w, w] to [0, w]; texture2DProj divides by w afterwards
  projectorTexCoord.xy = 0.5 * projectorTexCoord.xy + 0.5 * projectorTexCoord.w;
  
  gl_FragColor = texture2DProj(projectorSampler, projectorTexCoord);
}
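To see why this version works and the previous attempt did not, it may help to write out what the division by $w$ inside texture2DProj produces in each case. The sketch below covers only the $x$ coordinate; $y$ behaves the same way.

$$\frac{0.5x + 0.5w}{w} = 0.5 \cdot \frac{x}{w} + 0.5, \qquad \frac{0.5x + 0.5}{w} = 0.5 \cdot \frac{x}{w} + \frac{0.5}{w}$$

The first expression is exactly the remap of $\frac{x}{w}$ from $[-1, 1]$ to $[0, 1]$. The second one matches it only when $w = 1$.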
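Are we done yet? Well, there is actually one more thing we can do. We can move the "$0.5$" multiplication and addition to the vertex shader. The remapping is linear in $x$, $y$ and $w$, so doing it before interpolation gives the same result as doing it after. So the final vertex shader is: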
uniform mat4 worldViewProjTransform;
uniform mat4 projectorWorldViewProjTransform;

attribute vec4 attrib_position;

varying vec4 varying_projectorTexCoord;

void main()
{
  gl_Position = attrib_position * worldViewProjTransform;
 
  varying_projectorTexCoord = attrib_position * projectorWorldViewProjTransform;
  varying_projectorTexCoord.xy = 0.5*varying_projectorTexCoord.xy + 0.5*varying_projectorTexCoord.w;
}
and the final pixel shader is:
uniform sampler2D projectorSampler;

varying vec4 varying_projectorTexCoord;

void main()
{
  vec4 projectorTexCoord = varying_projectorTexCoord;

  gl_FragColor = texture2DProj(projectorSampler, projectorTexCoord);
}
Yes. Now we are done.

One might wonder if we could do the perspective division in the vertex shader and call texture2D instead of texture2DProj in the pixel shader. The answer is: we can't. During rasterization, the hardware interpolates all varyings passed from the vertex shader to the pixel shader. Assuming that varying_projectorTexCoord is an $[x, y, z, w]$ vector, the value read in the pixel shader is $[\mbox{int}(x), \mbox{int}(y), \mbox{int}(z), \mbox{int}(w)]$, where $\mbox{int}$ is an interpolating function. If we performed the division in the vertex shader, the vector arriving at the pixel shader would be of the form $[\mbox{int}(\frac{x}{w}), \mbox{int}(\frac{y}{w}), \mbox{int}(\frac{z}{w}), 1]$. And in general $\mbox{int}(\frac{x}{w}) \neq \frac{\mbox{int}(x)}{\mbox{int}(w)}$ (same for the other components). Note that the right-hand side of this inequality is what we get when we do the perspective division in the pixel shader.
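A quick numeric illustration of that inequality, under a simplifying assumption: take $\mbox{int}$ to be plain linear interpolation halfway between two vertices, which is simpler than what a real rasterizer does. The two vertices are made up for the example: $A$ has $x = 0$, $w = 1$ and $B$ has $x = 4$, $w = 4$, so $\frac{x}{w}$ is $0$ at $A$ and $1$ at $B$.

$$\mbox{int}\left(\frac{x}{w}\right) = \frac{0 + 1}{2} = 0.5, \qquad \frac{\mbox{int}(x)}{\mbox{int}(w)} = \frac{(0 + 4)/2}{(1 + 4)/2} = \frac{2}{2.5} = 0.8$$

The two results differ, so dividing in the vertex shader would sample the texture at the wrong place everywhere except at the vertices themselves.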

Instead of doing all this trickery, we could append a special remapping matrix to the projector's world-view-projection chain. What this matrix looks like can be found here (this is also a great tutorial about texture projection in general, by the way). The problem with this approach is that the matrix might not already be part of the chain, and for some reason you either don't want to or simply can't add it. You could, naturally, create the remapping matrix in the vertex shader, but that would raise the number of instructions. In that case it is better to use the approach discussed throughout this post, as it needs fewer instructions.
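For comparison, here is a minimal sketch of what building such a remapping matrix directly in the vertex shader could look like. The name remap is hypothetical, and the matrix is written for the row-vector convention (vector * matrix) used by the shaders in this post. The matrix in the linked tutorial may be laid out differently (for instance transposed, or also remapping $z$), so treat this as an illustration rather than a copy of it.

```glsl
uniform mat4 worldViewProjTransform;
uniform mat4 projectorWorldViewProjTransform;

attribute vec4 attrib_position;

varying vec4 varying_projectorTexCoord;

void main()
{
  // Hypothetical remapping matrix. The mat4 constructor takes its arguments
  // column by column, so each line below is one column. With v * M, component i
  // of the result is dot(v, column i), which gives:
  //   x' = 0.5*x + 0.5*w,  y' = 0.5*y + 0.5*w,  z' = z,  w' = w
  mat4 remap = mat4(0.5, 0.0, 0.0, 0.5,
                    0.0, 0.5, 0.0, 0.5,
                    0.0, 0.0, 1.0, 0.0,
                    0.0, 0.0, 0.0, 1.0);

  gl_Position = attrib_position * worldViewProjTransform;

  // Same result as the two-line remap in the final vertex shader above,
  // but paid for with a full vector-matrix multiplication.
  varying_projectorTexCoord = (attrib_position * projectorWorldViewProjTransform) * remap;
}
```

The pixel shader stays the same as the final one above. The extra vector-matrix multiplication is what makes this variant more expensive than the single line with "$0.5$" factors.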