Monday, February 26, 2018

DirectX 11, HLSL, GatherRed


Every once in a while I need one of the Gather functions from DirectX 11's HLSL library, GatherRed in this case. The function is useful because it fetches four texels with a single instruction and returns them all in one float4. As the name indicates, only the red-channel values of those four texels are stored in the float4; for the other channels there are GatherGreen, GatherBlue and GatherAlpha. If you only need data from one channel, these functions are well worth using, since one gather is generally faster than taking four samples individually.

Suppose that, instead of taking one regular sample at your original UV coordinates with Sample or SampleLevel, you call GatherRed. Which four texels (their red channels, that is) will be fetched, and in what order? The documentation of the HLSL Gather functions does not spell this out, so this short blog post is here to fill the gap. You can also stumble on that information in various DirectX 11 presentations.
 
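Take a look at the picture:

[Image: a 2x2 block of texels. The grey "base" texel at the top-left is labeled w, its right neighbor z, the texel below it x, and the bottom-right texel y.]
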
The grey texel is the one whose UV coordinates we have in the shader (the very center of that texel, to be more specific). If you call GatherRed you get the four labeled texels (again, only their red-channel values). Perhaps counter-intuitively, the value of the "base" texel is stored not in the return value's $x$ component but in $w$, as the image above shows. To make this concrete, the two following snippets are equivalent (assuming mySampler uses point filtering):
float r1 = myTexture.Sample( mySampler, uv, int2(0, 0) ).x;
float r2 = myTexture.Sample( mySampler, uv, int2(1, 0) ).x;
float r3 = myTexture.Sample( mySampler, uv, int2(0, 1) ).x;
float r4 = myTexture.Sample( mySampler, uv, int2(1, 1) ).x;
and:
float4 samples = myTexture.GatherRed( mySampler, uv + float2(0.5f, 0.5f)/myTextureDim );
float r1 = samples.w;
float r2 = samples.z;
float r3 = samples.x;
float r4 = samples.y;

And these as well:
float myValueR = myTexture.Sample( mySampler, uv ).x;
float myValueG = myTexture.Sample( mySampler, uv ).y;
float myValueB = myTexture.Sample( mySampler, uv ).z;
float myValueA = myTexture.Sample( mySampler, uv ).w;
and:
float myValueR = myTexture.GatherRed( mySampler, uv + float2(0.5f, 0.5f)/myTextureDim ).w;
float myValueG = myTexture.GatherGreen( mySampler, uv + float2(0.5f, 0.5f)/myTextureDim ).w;
float myValueB = myTexture.GatherBlue( mySampler, uv + float2(0.5f, 0.5f)/myTextureDim ).w;
float myValueA = myTexture.GatherAlpha( mySampler, uv + float2(0.5f, 0.5f)/myTextureDim ).w;
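
The offset and the component shuffle from the snippets above can be wrapped in a small helper. This is only a minimal sketch: the function name GatherRedReadingOrder and the register bindings are assumptions made for illustration, and it queries the texture size with GetDimensions instead of relying on a myTextureDim constant.

```hlsl
// Hypothetical resource bindings, for illustration only.
Texture2D<float4> myTexture : register(t0);
SamplerState      mySampler : register(s0);

// Gathers the red channel of the 2x2 block whose top-left texel is the one
// containing uv, and returns it in reading order:
//   .x = offset (0,0)   .y = offset (1,0)
//   .z = offset (0,1)   .w = offset (1,1)
float4 GatherRedReadingOrder(float2 uv)
{
    uint width, height;
    myTexture.GetDimensions(width, height);

    // Half a texel expressed in UV units.
    float2 halfTexel = 0.5f / float2(width, height);

    // GatherRed returns (bottom-left, bottom-right, top-right, top-left)
    // in .xyzw, so the base texel ends up in .w.
    float4 g = myTexture.GatherRed(mySampler, uv + halfTexel);

    // Reorder to (top-left, top-right, bottom-left, bottom-right).
    return g.wzxy;
}
```

With this helper, r1..r4 from the first pair of snippets are simply the .x, .y, .z and .w components of the returned value.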
You probably noticed the half-texel offset applied when using Gather (myTextureDim being the texture's size in texels). It is there because this instruction does not blindly fetch the texel at the specified uv together with its right, bottom and bottom-right neighbors. Instead, it fetches the four texels that bilinear filtering would read at that uv. Have a look at the image below:
 

Here we do not apply the half-texel offset. As a result Gather picks different texels than Sample does: it picks the 2x2 footprint that bilinear filtering would use. To counter that, and to make sure the texel under the sampled uv always ends up as the upper-left texel of the gathered block, we need to apply the half-texel offset.
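
To make the texel selection explicit, here is a rough reference version of GatherRed written with Load. It is a sketch under stated assumptions, not a description of what the hardware literally does: the name GatherRedReference is made up, mip level 0 and clamp addressing are assumed, and real GPUs compute the footprint in fixed point, so results may differ for uvs lying very close to a texel center.

```hlsl
// Hypothetical reference emulation of GatherRed (mip 0, clamp addressing).
float4 GatherRedReference(Texture2D<float4> tex, float2 uv)
{
    uint width, height;
    tex.GetDimensions(width, height);
    int2 dim = int2(width, height);

    // Top-left texel of the bilinear footprint around uv.
    // Texel centers sit at (i + 0.5) / dim, hence the -0.5.
    int2 topLeft = int2(floor(uv * float2(dim) - 0.5f));

    // Clamp both corners of the 2x2 block to the texture bounds.
    int2 p0 = clamp(topLeft,     int2(0, 0), dim - 1);
    int2 p1 = clamp(topLeft + 1, int2(0, 0), dim - 1);

    // Same component order as GatherRed.
    return float4(
        tex.Load(int3(p0.x, p1.y, 0)).r,   // x: bottom-left
        tex.Load(int3(p1.x, p1.y, 0)).r,   // y: bottom-right
        tex.Load(int3(p1.x, p0.y, 0)).r,   // z: top-right
        tex.Load(int3(p0.x, p0.y, 0)).r);  // w: top-left
}
```

For a uv in the upper-left quarter of texel i, uv * dim - 0.5 falls below i, so the footprint starts at texel i - 1. After adding half a texel, the floor yields i for any uv inside texel i, which is why the offset keeps the base texel in the w component.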

There. I hope you won't have to wonder about the order of samples returned by gathers anymore :). At least I know I won't.
 
ACKNOWLEDGEMENTS: I'd like to thank Klaudiusz Zych for pointing out to me the need to apply the half-texel offset. I missed it in the first version of this post. Klaudiusz also drew the image that explains graphically the need to use the half-texel offset.
Also thanks to @xi@g@me for pointing out the same mistake in the comments.

3 comments:

  1. The order is documented on this page:
    https://docs.microsoft.com/en-us/windows/win32/direct3dhlsl/gather4--sm5---asm-
    which is mentioned on this page:
    https://docs.microsoft.com/en-us/windows/win32/direct3dhlsl/texture2d-gather
    but thanks for making the picture, it helps with memorizing the pattern :)

  2. Good article, thanks :)
    If we look at the documentation of the asm instruction that gatherRed uses, here : https://docs.microsoft.com/en-us/windows/win32/direct3dhlsl/gather4--sm5---asm-

    We read this :
    "This is the same as point sampling with (u,v) texture coordinate deltas at the following locations: (-,+),(+,+),(+,-),(-,-), where the magnitude of the deltas are always half a texel."

    Thus, I wonder if the correct texture coordinate that shall be provided to gather is not the center of the W texel, but the intersection of the 4 texels to sample?
