## środa, 23 września 2015

### Deferred Lighting on Android

The demo is here; the package also contains README.txt file with steps needed to run it. Basically you will need Eclipse, Android NDK and possibly other stuff. To get everything up and running when developing Android apps I use this https://developer.nvidia.com/AndroidWorks.

Now a few words on how my implementation works.

The demo uses OpenGL ES 2.0. This API allows to barely render into one render target at a time and as far as I know does not provide floating point formats. How do you generate the g-buffer with that limitation? Well, you could go with a few passes but yeah, that is insane in terms of performance for Android devices. That's why I settled with one pass and fit everything into a single RGBA8 buffer. All I need for basic deferred lighting is position and normals. Instead of postition we store (linear) depth using 16 bits of the buffer (position is reconstructed later in the lighting pass). In case of normals I store $x$ and $y$ components in view-space and reconstruct $z$ in the lighting pass, hence only two (8-bit) channels suffice. All fits into a 32-bit buffer. Please examine shaders code to see how it all works.

The demo can use two common optimizations for the lights in the lighting pass. One is using scissor test and the other using stencil buffer. Both can be turned on/off in code using macros USE_SCISSOR and USE_STENCIL.

The demo renders 18 small-to-medium lights on Tegra 3 with 21 FPS which I consider not bad. Here goes a picture:

## sobota, 19 września 2015

### As Simple As Possible Explanation of Depth of Field in Ray Tracer

Have a look at the picture:

It shows a side view of the view frustum. There is the eye along with the "plane" (let's call it eye plane) the eye is on, the near plane at near plane distance (you could also call it film plane) and the focal plane at focal plane distance.

The point of depth of field is to make the image appear blurry everywhere where the generated rays don't intersect scene at focal plane distance. To achieve this, instead of ray tracing only one regular pinhole ray (the black ray in the picture), we generate a bunch of rays (let's call them depth of field rays) that originate at random places on the eye plane (one of these rays is the red ray in the picture). So, we know where the new ray originates. Now what we need to know is its direction. Since our image cannot be blurry at the focal plane, all of the depth of field rays must meet at the same point on the focal plane. This intersection point is simply calculated by finding the intersection point of the regular pinhole ray (the black ray in the picture) with the focal plane. So now, for each depth of field ray you have its origin (random point on the eye plane; the bigger the variance of those rays the bigger the DOF effect) and point it passes through (on the focal plane) hence the ray's direction is easy to be calculated. Once you've traced all DOF rays and got their radiances you just average them and get a nice depth of field effect.

That was it!

## wtorek, 19 maja 2015

### Spherical Harmonic Lighting - Analytical Lights

## SH Formulas

Here come the first three bands of SH functions (both in spherical and cartesian coords, with Condon-Shortley phase). I've seen them in various places but often with typos. The formulas here are taken from chapter 3.2 from GPU Pro 2.\begin{eqnarray}

Y_0^0 &=& \frac{1}{2} \sqrt{\frac{1}{\pi}} &=& \frac{1}{2} \sqrt{\frac{1}{\pi}} \cr

Y_1^{-1} &=& \frac{-1}{2} \sqrt{\frac{3}{\pi}} \sin(\theta) \sin(\phi) &=& \frac{-1}{2} \sqrt{\frac{3}{\pi}} y \cr

Y_1^0 &=& \frac{1}{2} \sqrt{\frac{3}{\pi}} \cos(\theta) &=& \frac{1}{2} \sqrt{\frac{3}{\pi}} z \cr

Y_1^1 &=& \frac{-1}{2} \sqrt{\frac{3}{\pi}} \sin(\theta) \cos(\phi) &=& \frac{-1}{2} \sqrt{\frac{3}{\pi}} x \cr

Y_2^{-2} &=& \frac{1}{2} \sqrt{\frac{15}{\pi}} \sin^2(\theta) \sin(2\phi) &=& \frac{1}{2} \sqrt{\frac{15}{\pi}} xy \cr

Y_2^{-1} &=& \frac{-1}{2} \sqrt{\frac{15}{\pi}} \sin(\theta) \cos(\theta) \sin(\phi) &=& \frac{-1}{2} \sqrt{\frac{15}{\pi}} yz \cr

Y_2^0 &=& \frac{1}{4} \sqrt{\frac{5}{\pi}} (3 \cos^2(\theta) - 1) &=& \frac{1}{4} \sqrt{\frac{5}{\pi}} (3z^2 - 1) \cr

Y_2^1 &=& \frac{-1}{2} \sqrt{\frac{15}{\pi}} \sin(\theta) \cos(\theta) \cos(\phi) &=& \frac{-1}{2} \sqrt{\frac{15}{\pi}} zx \cr

Y_2^2 &=& \frac{1}{4} \sqrt{\frac{15}{\pi}} \sin^2(\theta) \cos(2\phi) &=& \frac{1}{4} \sqrt{\frac{15}{\pi}} (x^2 - y^2) \cr

\end{eqnarray}

We'll work only with the first three bands as this is enough for accurately representing directional diffuse (Lambertian BRDF) lighting.

## Projecting Light and Transfer Functions into SH

Remember we're dealing with directional lights acting on Lambertian surfaces. The light function $L$ is simply constant (light) color ($lc$), projected into SH in the direction of the light ($ld$). Thus, $L$ projected into SH gives $L_l^m$:\begin{eqnarray}

L_0^0 &=& \frac{1}{2} \sqrt{\frac{1}{\pi}} lc \cr

L_1^{-1} &=& \frac{-1}{2} \sqrt{\frac{3}{\pi}} ld_y lc \cr

L_1^0 &=& \frac{1}{2} \sqrt{\frac{3}{\pi}} ld_z lc \cr

L_1^1 &=& \frac{-1}{2} \sqrt{\frac{3}{\pi}} ld_x lc \cr

L_2^{-2} &=& \frac{1}{2} \sqrt{\frac{15}{\pi}} ld_x ld_y lc \cr

L_2^{-1} &=& \frac{-1}{2} \sqrt{\frac{15}{\pi}} ld_y ld_z lc \cr

L_2^0 &=& \frac{1}{4} \sqrt{\frac{5}{\pi}} (3ld_z^2 - 1) lc \cr

L_2^1 &=& \frac{-1}{2} \sqrt{\frac{15}{\pi}} ld_z ld_x lc \cr

L_2^2 &=& \frac{1}{4} \sqrt{\frac{15}{\pi}} (ld_x^2 - ld_y^2) lc \cr

\end{eqnarray}

This is just light. We also need to project into SH the transfer function $A$, which in case of our Lambertian BRDF function is simply $\cos(\theta)$. The formula for this function projected into SH is derived here; have a look at equations (19) (this equation has two typos which are fixed here, thanks to KriS) and (26) in particular. Formulas:

\begin{eqnarray}

l = 1: & A_l &=& \frac{\sqrt{\pi}}{3} N_l \cr

l > 1, \mbox{odd}: & A_l &=& 0 \cr

l, \mbox{even}: & A_l &=& 2 \pi \sqrt{\frac{2l+1}{4 \pi}} \frac{(-1)^{l/2-1}}{(l+2)(l-1)} \frac{l!}{2^l ((n/2)!)^2} N_l \cr

\mbox{where} & N_l &=& \sqrt{\frac{4 \pi}{2l + 1}} \cr

\end{eqnarray}

Note that $A$ varies per SH band index. $A$ values for the first three SH bands are:

\begin{eqnarray}

A_0 &=& \pi \cr

A_1 &=& \pi \frac{2}{3} \cr

A_2 &=& \frac{\pi}{4} \cr

\end{eqnarray}

The final formula for each SH coefficient of light and transfer functions, projected into SH and convolved is:

\begin{eqnarray}

E_l^m = L_l^m A_l

\end{eqnarray}

## Final Vertex Color

To finally calculate the lighting for a vertex, given its normal $n$, we need to project the vertex's color ($vc$) in the direction of the normal into SH. This is pretty much the same what we did for the light function:\begin{eqnarray}

V_0^0 &=& \frac{1}{2} \sqrt{\frac{1}{\pi}} vc \cr

V_1^{-1} &=& \frac{-1}{2} \sqrt{\frac{3}{\pi}} n_y vc \cr

V_1^0 &=& \frac{1}{2} \sqrt{\frac{3}{\pi}} n_z vc \cr

V_1^1 &=& \frac{-1}{2} \sqrt{\frac{3}{\pi}} n_x vc \cr

V_2^{-2} &=& \frac{1}{2} \sqrt{\frac{15}{\pi}} n_x n_y vc \cr

V_2^{-1} &=& \frac{-1}{2} \sqrt{\frac{15}{\pi}} n_y n_z vc \cr

V_2^0 &=& \frac{1}{4} \sqrt{\frac{5}{\pi}} (3n_z^2 - 1) vc \cr

V_2^1 &=& \frac{-1}{2} \sqrt{\frac{15}{\pi}} n_z n_x vc \cr

V_2^2 &=& \frac{1}{4} \sqrt{\frac{15}{\pi}} (n_x^2 - n_y^2) vc \cr

\end{eqnarray}

Convolved with $E_l^m$ it gives the final vertex color $C_l^m$ for each band:

\begin{eqnarray}

C_l^m &=& V_l^m E_l^m

\end{eqnarray}

To retrieve the final vertex color $C$ all we need is to sum all SH components:

\begin{eqnarray}

C = \sum\limits_{l=0}^2 \sum\limits_{m=-l}^l C_l^m

\end{eqnarray}

## środa, 1 kwietnia 2015

### Data-oriented Design and Fourier Transform

I'm not going to go into details of how to code in data-oriented manner. I'm a very adept of this technique. What I want is to share with you some of my old code (2010-2011 or something) and its refactored counter-part (2015); refactored with data-oriented design in mind. How I see it.

The code consists mostly of a rather simple implementation of math and image library. There is also a sample that performs Discrete Fourier Transform using the image library.

fourier-old

The math library code is in math/include/blossom_math/ and math/src/.

The general image code is in common/include/blossom_common/image.hpp and common/src/image.cpp.

The sample's code is in samples/fourier/src/main.cpp. Also, the Fourier functions can be found here.

fourier-new

The math code is in include/math/ and src/math/.

The image code is in include/image/ and src/image/. Also, the Fourier functions are here.

The sample's code is in samples/fourier/src/main.cpp.

Both archives come with binaries (for Windows). In samples/fourier/data/ there is run.bat file that will run the sample and perform Fourier transform on selected image using specified filter.

Feel free to express what you think about my data-oriented understanding :).

## wtorek, 17 marca 2015

### Stepping Towards Target Value

current = current + step;If you wanted to make sure it doesn't step further once it has reached some target value you would do this:

float current = 0.0f; float target = 1.0f; float step = 0.1f; for (;;) { current = current + step; if (current > target) current = target; }Not a big deal. But once I stumbled upon such piece of code:

float current = 0.0f; float target = 1.0f; float step = 0.1f; for (;;) { float delta = fabs(target - current) + 0.0001f; float t = Clamp(step / delta); current = Lerp(current, target, t); }Turns out this code, at first glance looking quite complicated, does the same thing. So, why did the author of that code preferred it this way instead of something more simple as the solution presented in the beginning of this post? I'm not sure. Maybe because this code works correctly when we arbitrarily change target value? For instance, the first solution won't work when target is smaller than current. But we can easily handle this case as well.

Anyway, I was curious to find out that if from mathematical standpoint these two approaches were identical. Here's the proof:

\begin{eqnarray}

a &=& current \cr

b &=& target \cr

s &=& step \cr

t &=& \frac{s}{b-a} \cr

\end{eqnarray}

\begin{eqnarray}

Lerp(a, b, t) = (1-t)a + tb &=& \cr

(a - \frac{s}{b-a}a) + (\frac{s}{b-a}b) &=& \cr

\frac{(b-a)a - sa + sb}{b-a} &=& \cr

\frac{(b-a)a+s(b-a)}{b-a} &=& \cr

a + s

\end{eqnarray}

So, to sum up, the lerp way results in simple incrementation of a with s.

## niedziela, 15 marca 2015

### Normal Map Generator (from diffuse maps)

You can get my program's code here. It's written in C# and runs in command-line. The program takes four arguments:

- input diffuse map file's path
- output normal map file's path (can be the same as first argument)
- scale - the bigger this value the bigger the "contrast" of the normal map will be
- border - width of border of the image filled with (0, 0, 1) normal

## piątek, 9 stycznia 2015

### A Better Way To Use Lerp To Smooth Things Out

Note that given scenario allows $X$ to be changed while the time progresses, so that the object follows the $X$ position. Also, even though the code uses lerp function call, the movement is not linear in any sense of that word.

The thing that caught our attention and made us wonder is that interpolation parameter (the last argument of lerp function) can sometimes be greater than one, e.g. when $\Delta t$ is big because last frame lasted a long time for some reasons. We don't want that.

Given solution is wrong because the movement coded in such way is not independent from the frame rate. To see it clearly, let us rewrite equation $1$ by parametrizing it with $\Delta t$ and make it recursive sequence of positions at $n$-th frame. \begin{equation} \label{wrong2} P^{\Delta t}_n = lerp(P_{n-1}, X, a \cdot \Delta t)~~~~~~~~~~~~~~~(2) \end{equation} We assume for a moment for simplicity that $\Delta t$ and $X$ do not change during execution of the code. For our purposes, we can also assume that $P^{\Delta t}_0=0$ and compute the explicit form of equation $2$ (using our math skills or Wolfram Alpha). $$ P^{\Delta t}_n = X(1 - (1-a\cdot\Delta t)^n) $$ Now let us consider two special cases. Let $a=1$. One were code runs at $60$ FPS for $60$ frames and second one that runs at $10$ FPS for $10$ frames. $$ P^{1/60}_{60} = X(1 - (1-\frac{1}{60})^{60}) \approx 0.6352 X $$ $$ P^{1/10}_{10} = X(1 - (1-\frac{1}{10})^{10}) \approx 0.6513 X $$ Both cases last one second and it would be desirable if those values were the same. Thus, by example above, we see that this method is not frame rate independent.

Generally speaking, we want the following equality to be met \begin{equation} \label{req} P^{\Delta t}_{n} = P^{\Delta t \cdot 1/\beta}_{n \cdot \beta}~~~~~~~~~~~~~~~(3) \end{equation} for any value of $\beta$.

Once we have described and analysed the problem, we can try to come up with a better solution. We propose to use the following code: $$ newPosition = lerp(oldPosition, X, 1 - e^{-a \cdot \Delta t}) $$ where $e$ is the base of the natural logarithm and $a$ is a constant that denote how fast we want to interpolate the movement. Rewriting to a more concise form as we did before, we get $$ P^{\Delta t}_n = lerp(P_{n-1}, X, 1 - e^{-a \cdot \Delta t}). $$ Assuming initial conditions $P^{\Delta t}_0=0$, the explicit version can be computed (we used some math software again). \begin{equation} \label{good} P^{\Delta t}_n = X (1 - e^{-a\cdot\Delta t\cdot n})~~~~~~~~~~~~~~~(4) \end{equation}

Equation $4$ meets the requirement given by equation $3$: $$ P^{\Delta t \cdot 1/\beta}_{n \cdot \beta} = X(1 - e^{-a\cdot\frac{1}{\beta}\cdot\Delta t \cdot n \cdot \beta}) = X (1 - e^{-a\cdot\Delta t\cdot n}) = P^{\Delta t}_n $$ and therefore is independent from the frame rate.

To sum up, we propose to use the $1-e^{-a\cdot\Delta t}$ instead of $a\cdot\Delta t$ as the interpolation factor. It makes much more sense and the movement becomes more predictable when the frame rate varies.