GPU 2014 — Homework #6

GPU Programming for Video Games

Summer 2014

Homework #6: Peeking at Performance (Graduate Section Only)

Due: Friday, August 1 at 4:00 PM (via T-square)

This assignment will be graded out of 60 points.

Predicting and measuring
GPU performance is tricky and often counterintuitive. Because of
the myriad complex ways that GPU cores interact with the frame buffer,
main memory, and the CPU, it is quite easy
to draw the wrong conclusions.

But, we’re going to take a stab at it anyway.

For this assignment, you will choose two shaders:
a really simple one and a really complex one.
They should be ordinary vertex/fragment shader pairs, not
Unity-specific Surface Shaders.
Your complex shader
should be “simple” enough that it can still run using shader model
2.0 without needing shader model 3.0, but it should be as complex
as you can easily find that still works in 2.0. You can use
shaders you find on the web, shaders you’ve made,
shaders that I’ve provided to you, or some combination. All that
matters is that the code for one is much longer and more complicated
than the code for the other. If the complicated shader uses a lot
of textures that the simple one doesn’t, that’s even better!

Try compiling each of your shaders as shader model 2.0
and as shader model 3.0 (changing the #pragma target line as needed,
e.g., #pragma target 2.0 or #pragma target 3.0).
For each case, click “Show current” under “Debugging”
in the Inspector window for each shader to see the assembly code for
the vertex and fragment shaders. At the bottom of the vertex and
fragment shaders,
you will see an instruction count for each kind of shader.

Question 1:
For each shader model, how many vertex instructions
and fragment instructions does your simple shader need?
How many does your complex shader need?

Next, let’s try to test performance directly. Create a scene in Unity with
a large object
that takes up most of the screen. If one of your
shaders reacts to light, add a light source, and
make sure each light has its render mode set to “Important” and that its
range is large enough to include
your object.

Before you run your scene, activate
“Maximize on Play” and “Stats” in the Game window. Make
your Unity window as large as it can be.

Apply your simple shader, using shader model 2.0,
to your object and run your scene; record the “Renderer” time
and the milliseconds per frame (to the right of the FPS
value) shown in the Statistics window.
These may vary over time, so write down something in
the ballpark. Repeat this step for your simple shader, using
shader model 3.0, and then repeat it for your complex shader,
using shader model 2.0, and then shader model 3.0.
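
As a rough aid for turning jittery readings into "something in the ballpark," here is a minimal Python sketch. It relies on the standard relation that milliseconds per frame equals 1000 divided by frames per second, and it uses the median of several readings as the ballpark value. The function names `ms_per_frame` and `ballpark` are made up for this illustration, the median is just one reasonable choice, and the numbers in the example are placeholders, not real measurements.

```python
from statistics import median


def ms_per_frame(fps: float) -> float:
    """Convert a frames-per-second reading to milliseconds per frame."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def ballpark(readings):
    """Summarize several jittery readings with their median.

    Assumption: the median is used here because it ignores occasional
    spikes; an average would also be a defensible choice.
    """
    readings = list(readings)
    if not readings:
        raise ValueError("need at least one reading")
    return median(readings)


if __name__ == "__main__":
    # Placeholder values for illustration only, not real measurements.
    sampled_fps = [48.0, 50.0, 52.0]
    typical_fps = ballpark(sampled_fps)
    print("typical FPS:", typical_fps)
    print("ms per frame:", ms_per_frame(typical_fps))
```

Jotting down three or more readings per test case and summarizing them this way makes the four cases easier to compare.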

It can be difficult to “max out” a modern GPU. If needed, add
“Zbuffer Off” to your shaders, and make many copies of
your object using
the “Duplicate” keyboard shortcut to push the GPU harder so you
start seeing some differences in performance between your simple
and complex shaders.

Question 2: What are the performance metrics, “Renderer time”
and “milliseconds per frame,”
for the four cases? (SM 2.0 vs. 3.0,
simple vs. complex code)

Question 3:
Comment on how the performance metrics of “Renderer time”
and “milliseconds per frame”
relate to the instruction count. Do you conclude that
there is a straightforward mapping from instruction count to
milliseconds per frame or renderer time?
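
One possible way to probe this question, sketched in Python: if the mapping really were straightforward (roughly proportional), the time per instruction would come out about the same in all four cases. The sketch below divides each measured time by the instruction count and reports how far apart the resulting ratios are. The helper names `time_per_instruction` and `spread` are hypothetical, the choice of which instruction count to use (vertex, fragment, or their sum) is left to you, and the numbers in the example are placeholders, not real measurements. Fixed per-frame overhead alone can break proportionality, so treat this as a rough check rather than a definitive test.

```python
def time_per_instruction(cases):
    """Map each label to (measured ms) / (instruction count).

    cases: dict of label -> (instruction_count, milliseconds)
    """
    ratios = {}
    for label, (instructions, ms) in cases.items():
        if instructions <= 0 or ms <= 0:
            raise ValueError(f"non-positive value in case {label!r}")
        ratios[label] = ms / instructions
    return ratios


def spread(ratios):
    """Largest ratio divided by smallest ratio.

    A value near 1.0 is consistent with time being proportional to
    instruction count; a much larger value suggests it is not.
    """
    values = list(ratios.values())
    if not values:
        raise ValueError("need at least one case")
    return max(values) / min(values)


if __name__ == "__main__":
    # Placeholder numbers for illustration only; substitute your own
    # instruction counts (Question 1) and timings (Question 2).
    cases = {
        "simple, SM 2.0": (10, 2.0),
        "simple, SM 3.0": (10, 2.0),
        "complex, SM 2.0": (60, 3.0),
        "complex, SM 3.0": (55, 3.0),
    }
    ratios = time_per_instruction(cases)
    for label, ratio in ratios.items():
        print(f"{label}: {ratio:.4f} ms per instruction")
    print(f"spread (max/min): {spread(ratios):.2f}")
```

Running the same check with the “Renderer” time in place of milliseconds per frame shows whether the two metrics tell the same story.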

Create a Microsoft Word or PDF document (PDF is preferable, but
Word is OK) answering the above three questions, and also
containing the code for your simple and complex shaders. You need
not provide any commentary on the shaders themselves; just paste
them in. Also include a screenshot of one of your test cases for
your simple shader (I don’t care which shader model) and one
of your test cases for your complex shader (again, I don’t care which
shader model) showing off the Statistics window.
Upload your document
to T-square.
Include “HW6” and as much
as possible of your full name in the
filename, e.g., HW6_Aaron_Lanterman.doc. (The upload
procedure should be reasonably self-explanatory once
you log in to T-square.)
Be sure to finish sufficiently in
advance of the deadline that you can
work around any troubles T-square gives
you and still submit before the deadline.
If you have trouble getting T-square to work,
please e-mail your compressed
file to Prof. Lanterman at,
with “GPU HW #6” and your full
name in the header line;
please use this e-mail submission only
as a last resort if T-square isn’t working.