Special Topics Course Proposal for Fall 2007

ECE4893A/CS4803MPG: Multicore and GPU Programming for Video Games

Last updated 8/19/07: The course is now live! I’ll leave this proposal up, since
it explains some of our motivation for putting the class together (although some of it is now
out of date), but you’ll really want to go see the official course webpage
to find the latest and greatest info.

proposal submitted to ECE departmental committees

Video interview with Aaron Lanterman about the class,
conducted at the 2007 Game Developers Conference

Instructors (co-taught on first offering):
David Bader (CoC),
Aaron Lanterman (ECE), and
Hsien-Hsin “Sean” Lee (ECE)

Prerequisites: ECE3035: Mechanisms for Computation or CS2110: Computer Organization
and Programming. Students must be comfortable with C programming.
To be widely accessible to ECE students, no background in computer
graphics will be required.

A note on the title: Maybe “architectures” would work better than

The proposed class will cover the architecture and programming of
multicore processors and graphical processing units (GPUs), using examples
from the algorithmic needs of modern 3-D games (rendering, collision detection,
physics engines, and artificial intelligence), as well as techniques for
adapting such architectures for use in scientific applications, as in the
GPGPU movement.

The class will focus on inexpensive consumer hardware, particularly
the Playstation 3, the Xbox 360, and NVIDIA and ATI graphics cards.
Trade-offs between asymmetric multicore architectures (such as the STI Cell BE used
in the Playstation 3) and symmetric multicore architectures (such as the
triple-core PowerPC used in the Xbox 360) will be discussed.

While there are many examples of classes (listed below) that cover either multicore
or GPU architecture and programming, we are not aware of any existing
class (except for UIUC ECE498AL, which has a rather different flavor)
that tries to cover both in a synergistic fashion. Also, many of the
courses described below are advanced graduate courses. The intention here
is to provide something that is accessible to junior and senior undergraduates
as well as graduates. We may offer the same basic course
with two separate course numbers for the undergraduate and graduate
sections, with more advanced work required of the students signing up for
the graduate section.

Tentative Syllabus: The best ordering for the topics below is yet to
be decided.

  • Overview
  • Introduction to real-time computer graphics techniques (set groundwork)
  • Multicore concepts
  • PowerPC/AltiVec architecture (common to both Xbox 360 processor and the STI
    Cell BE)
    • “AltiVec Extension to PowerPC Accelerates Media Processing,”
      IEEE Micro, March-April 2000, pp. 85-95.
  • Playstation 2 architecture – the Emotion Engine (MIPS core, Vector Units,
    and Graphics Synthesizer)

  • STI Cell BE architecture
  • NVIDIA GPU architectures (used in Playstation 3)
  • ATI GPU architectures (used in Xbox 360)
  • Xbox 360 overall architecture
  • Playstation 3 overall architecture
  • XNA (Xbox 360) programming (particularly multithread support)
  • STI Cell BE programming
  • Flow control idioms in GPUs (z-culling, etc.)
  • Low-level GPU programming
  • High-level GPU programming using Cg
  • Computationally intensive algorithms in modern games (multicore CPU and GPU
    implementations)
    • Collision detection
    • Physics engines
    • Particle effects
    • Artificial intelligence
  • Example scientific applications (multicore CPU and GPU implementations)
  • Game programming on the STI Cell BE
    • The
      : Can a team of scrappy game programmers
      save Sony’s monster chip?, IEEE Spectrum, Dec. 2006, pp. 24-29.
  • Scientific computing on the STI Cell BE
  • CPU/GPU tradeoffs: What should go on the CPU? What should go on the GPU? In
    the case of the STI Cell BE, what should go on the PPE (Power Processor Element),
    and what should go on the SPEs (Synergistic Processing Elements)?
  • Future architectures: What should the Playstation 4 and Xbox 720 look like?

    Lab requirements:
    Ideally, the students would conduct experiments on actual Playstation 3s
    (loaded with Yellow Dog Linux) and
    Xbox 360s (programmed via Game Studio Express).
    However, their availability is not an absolute requirement for
    teaching the course; simulators
    for the Cell processor are freely available from IBM, and XNA
    provides multithreading
    support in both its Windows and Xbox 360 manifestations (hardware threads
    on the
    360 may become software threads on the PC). Certainly, programming on actual
    Playstation 3s and Xbox 360s would have a strong appeal for many students.

    (Students could also explore GPU concepts on the PS3; however, Sony appears to
    have shut off access to accelerated graphics from Linux. There may be
    a workaround for this; if not, standard PCs loaded with good graphics cards
    will suffice as an alternative.)

    Connections to other efforts in ECE: Dan Campbell of GTRI and
    Mark Richards of ECE recently
    achieved a 35x speedup of a synthetic aperture radar phase unwrapping
    algorithm using a GPU; he also recently acquired a PS3 for experimenting with
    Cell programming. The Air Force may be sending additional PS3s to hook
    to a computing cluster at GTRI. Several other ECE
    faculty, such as Sudha Yalamanchili and David Anderson,
    are interested in GPGPU computing and/or the STI Cell BE.

    Connections between ECE, CoC, and LCC:
    In an e-mail dated Feb. 8, Prof. Blair MacIntyre (undergraduate coordinator
    for CoC’s new School of Interactive Computing and faculty advisor for
    Computational Media) wrote: “I’ve talked
    to Greg Turk about getting
    more GPU stuff in our curricula, especially the ‘media’ thread, but we both
    agree a whole course is probably not appropriate. However, we also both
    agreed that a course that combines CPU and GPU issues more broadly would be
    good — we just weren’t the right folks to teach it.”

    MacIntyre further notes that the
    College of Computing’s Computational Media program
    “has a significant games thread,
    with students who want to learn how to build games,” and there is in general
    a growing interest in games in both the College of Computing and the
    School of Literature, Communication, and Culture (LCC).
    While on the faculty of LCC and part of its Experimental Game Lab,
    Michael Mateas (now at UC Santa Cruz) developed an
    “augmented reality” version of his remarkable, award-winning interactive
    one-act play Facade.

    Amy Bruckman
    offered Tech’s first Video Game Design class in 1998. CoC also now
    offers courses in computer graphics, computer animation, and digital
    video special effects, as well as a course
    in which the students program
    a Game Boy Advance.

    Of course, most topics in computer games naturally fit with
    CoC and LCC. The algorithms lie in CoC, and the art lies in LCC.
    Where could ECE fit into this picture? The algorithms may lie
    in CoC, but
    the “big iron” that runs those algorithms fits best within ECE.
    MacIntyre commented that CS4455: Video Game Design does not cover CPU/GPU
    material due to lack of time, and the proposed course would provide a perfect
    complement for students interested in such topics. MacIntyre is interested
    in cross-listing the proposed course in CS as an elective in the media thread,
    suggesting it for Computational Media games students, and also in including it
    in CoC’s scientific computation thread.

    David Bader,
    Executive Director of High-Performance Computing
    in the CoC, was recently made head of a Sony-Toshiba-IBM Center of
    Competence for the Cell processor.

    Relevance to industry: Blair MacIntyre noted that one comment he heard
    from games industry personnel is that
    “CPU/GPU programming skill is the biggest
    hole they have. They can’t find students who can do it well.”
    This gels with observations by Ars Technica writer
    Jeremy Reimer: “The biggest challenge facing
    game companies right now is the problem of writing multithreaded code that
    fully supports the multiple-core architectures of the latest PCs and the
    next generation game consoles” (from
    “[...] goes multicore”), and
    “If a programming genius
    like John Carmack [programmer of Doom and Quake]
    can be so befuddled by mysterious issues
    coming from multithreaded programming, what chance do mere mortals have?”
    (from “[...] game development and the next generation of [...]”).

    Motivating experience: In the Fall and Spring 2006 semesters,
    Aaron Lanterman taught
    Theory and Design of Music Synthesizers.
    About 2/3 of the class covered
    analog circuits, and the remaining part covered DSP. The homeworks were all
    based on studying real
    schematics from real synthesizers that real musicians have
    made music on. After spending years looking at idealized “textbook” examples,
    students benefited greatly from and responded well to analyzing more complex
    systems. Synthesizers essentially provided a “focal point” for thinking about
    issues in designing analog circuits. In this proposal, gaming systems such
    as the Playstation 3 and Xbox 360
    provide a similar focal point for thinking about issues
    in computer architecture.

    What this course is not: The course is not on game
    design per se; such issues, along with topics such as alternative
    user interfaces,
    social implications of games, gender and games,
    “theories” of games,
    virtual and augmented reality, networked and
    massively multiplayer games, prototyping, and user testing, are best covered
    in a class such as CS4455: Video Game Design. A few of these topics might
    be briefly touched upon here and there in the proposed class,
    but only tangentially.

    This is also not meant to be a course covering every conceivable
    kind of high-performance
    architecture, or a deep treatise on numerical methods and their high-performance
    implementations; such topics are best handled in their own courses, such
    as ECE6101: Parallel & Distributed Computer Architecture,
    CS477: Vector & Parallel Scientific Computing, CS6230: High Performance
    Parallel Computing, CS6236: Parallel & Distributed Simulation, CS6245:
    Parallelizing Compilers, and CS6290: High Performance Computer Architecture.

    Our interest lies specifically in those
    architectures available in inexpensive consumer hardware, which hence
    benefit from the
    extremes of mass production.

    Playstation 2 possibilities: It would also be possible to cover
    many of these ideas using
    Playstation 2s.
    Lanterman recently installed the official
    Sony Linux for Playstation 2
    on his personal Playstation,
    and has been playing around with it. The user has access to most of the
    hardware, including the
    Vector Units in the Emotion Engine
    and the Graphics Synthesizer (unlike the PS3, where Sony appears to have sadly
    shut off access to the best parts of the graphics hardware from Linux).
    The core of the Emotion Engine is a MIPS
    processor, which fits nicely with how ECE3035 is taught.
    This might be an
    interesting option to explore
    since a lot of students have their own Playstation 2s;
    they are also cheap ($125).
    (The $500-600
    price tag of the Playstation 3 has limited its market penetration.) The
    primary problem with this option is that the Sony Linux PS2 kit is sold out;
    Lanterman had to find his on
    eBay. You need the official Sony Linux CD in order to
    boot into Linux. There might be ways around that issue; also, Sony might
    be able to provide additional CDs directly, or offer some other alternative.
    (There is a
    devoted homebrew
    community that
    writes programs that run on
    the PS2 directly, as if they were standard PS2 games; however, the procedures
    for doing so are cumbersome.)

    Link to courses on both multicore and GPU processing:

    Links to courses on multicore processing:

    Links to courses on GPU architecture and also possibly GPGPU programming:

    • Caltech CS101.3: Hacking the GPU, “open to grads and undergrads
      with a background in either graphics,
      numerical analysis, computer languages or computer architecture.”
    • Univ. of Aarhus CS
      GPGPU (Q2 2004)
    • UNC at Chapel Hill COMP 290-058: General Purpose Computation
      using Graphics Processors
    • Stony Brook University CS690 – General Purpose Computing on
      Graphics Hardware. “Graduate standing or permission of instructor.”
      “The course is intended for anyone who has encountered a need for
      accelerated computing. The material and its presentation is suited for a
      general audience from all academic disciplines. The only prerequisite
      is knowledge of a programming language, preferably C/C++. No prior knowledge
      of computer graphics is required, but some mathematical background,
      such as linear algebra, is desirable.”
    • Univ. of Illinois at Chicago CS 594 – Special Topics – GPU Programming.
      Assumes computer graphics/OpenGL background.
    • UC Davis, EEC227 – Graphics Architecture.
      “Ideally, students should have a background in both
      computer architecture (at the level of EEC 170 or ECS 154B) and
      computer graphics (at the level of ECS 175). Students should also
      have a reasonable familiarity with the C programming language.”
    • Stanford CS448A – Real-Time Graphics Architectures.
      “The class is open to students with a background in computer graphics or
      computer systems and architecture. It may be taken for 1 or 3 credits.
      For 1 credit, each student will be expected to attend all the lectures
      and participate in discussions. For 3 credits, two projects will be assigned.
      The first will be to analyze tradeoffs in graphics architectures, and the second will be to design part of a graphics system.”
    • U. Penn CIS
      700/010 (Special Topics) – GPU Programming and