Volume 11, Issue 03

Tera-scale Computing


Intel Technology Journal - Featuring Intel's recent research and development

ISSN 1535-864X DOI 10.1535/itj.1103.08

  • Volume 11
  • Issue 03
  • Published August 22, 2007

  Section 2 of 10  

High-Performance Physical Simulations on Next-Generation Architecture with Many Cores

INTRODUCTION

The booming computer games and visual effects industries continue to drive the graphics community's seemingly insatiable desire for increased realism, believability, and speed. In the past decade, physical simulation has become a key to achieving the realism expected by audiences of games and movies. Physical simulation models the laws of physics to simulate life-like movement and interaction among objects, such as rigid and deformable bodies, human faces, cloth, and water.

Physical simulation can be used in a variety of settings such as weather prediction, movie special effects, and computer games. Complex natural phenomena such as ocean waves crashing on a shore, a flag waving in the wind, or bricks falling from a collapsing tower are modeled by means of numerical simulation of physical laws. Modeling different natural phenomena requires a diverse set of techniques, algorithms, and data structures, making physical simulation both complex and general. Computation and memory requirements are extremely demanding. This makes the workloads a challenging target for current as well as future architectures.

In this paper, we examine applications involving physical simulation for production environments and for gaming. For production physical simulation, we study the PhysBAM package from Stanford University [5, 11], which is used by several special-effects and film production companies, including Pixar and Industrial Light and Magic. The goal is to recreate the visual experience of a human observing a natural phenomenon. For gaming physical simulation, we study the open source ODE package [13]. This package provides similar functionality to the widely used commercial Havok Effect package from Havok. The goal of physical simulation in gaming is to make real-time interactions between objects as accurate as possible. The difference in goals for the two physical simulation domains leads to different choices for algorithms and data structures. However, these two domains do have many similar characteristics.

One common characteristic of production and gaming physical simulation is a need for significant acceleration. On a 4-way Intel® Xeon® processor 3.0GHz system with 16GB of DDR2-3200 memory and three levels of cache on each processor (16KB L1, 1MB L2, and 8MB L3), the production physics workloads take 5 to 188 seconds to process a single frame. These workloads have hundreds of thousands to a few million entities (tetrahedra or grid cells) interacting with each other. In contrast, for game physics workloads, only a thousand objects can currently interact in real time. Acceleration by an order of magnitude or more would allow improved accuracy, modeling of new effects, and even interactive or real-time production applications.

Multi-core processors are now common, and we expect the number of cores to increase steadily for the foreseeable future, so that multi-core processors capable of executing applications tens of times faster than today's processors are on the horizon. Such processors would improve the speed and realism of both production-quality and real-time game physical simulation applications. However, for an application to harness the computational power of such a processor, it must use multiple threads effectively. Parallelizing a large code base of the kind used by production or game physics applications is not trivial, especially when the target is to scale to tens of threads.
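As a rough illustration of the size of this gap, the per-frame times quoted above can be converted into the speedup needed to reach a given frame rate. The sketch below is back-of-the-envelope arithmetic only. The 30 frames-per-second target and the function name `required_speedup` are assumptions chosen for illustration, not figures or code from this study.

```python
def required_speedup(seconds_per_frame: float, target_fps: float = 30.0) -> float:
    """Speedup factor needed to process one frame within 1/target_fps seconds.

    target_fps defaults to 30, an assumed interactive frame rate.
    """
    if seconds_per_frame <= 0 or target_fps <= 0:
        raise ValueError("seconds_per_frame and target_fps must be positive")
    # Current time per frame divided by the time budget per frame (1 / target_fps).
    return seconds_per_frame * target_fps


if __name__ == "__main__":
    # 5 s and 188 s are the per-frame extremes reported for the production workloads.
    for seconds in (5.0, 188.0):
        factor = required_speedup(seconds)
        print(f"{seconds:6.1f} s/frame needs about {factor:,.0f}x to reach 30 fps")
```

Under that assumed target, the quoted range corresponds to factors of roughly 150x to 5,640x.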

Another requirement shared by the two categories of physical simulation applications is high memory bandwidth. The size of the data scales with the resolution or the number of objects in the simulation. Input sizes are often millions of volume elements or tens of thousands of objects. This leads to memory footprints of tens of megabytes, which is larger than typical caches. These applications therefore require either much larger caches or high main-memory bandwidth.
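The relationship between input size, footprint, and bandwidth can be sketched with simple arithmetic. Everything numeric in the example is hypothetical: the bytes stored per element, the number of full passes over the data per second, and the helper names are assumptions for illustration, and the model ignores any cache reuse.

```python
def footprint_bytes(num_elements: int, bytes_per_element: int) -> int:
    """Total bytes of simulation state for a given element count."""
    return num_elements * bytes_per_element


def streaming_bandwidth_gb_per_s(footprint: int, sweeps_per_second: float) -> float:
    """Main-memory bandwidth (GB/s) if every sweep reads the whole data set once.

    Assumes the data set does not fit in cache, so each sweep goes to memory.
    """
    return footprint * sweeps_per_second / 1e9


if __name__ == "__main__":
    # Hypothetical workload: 2 million elements at 32 bytes each.
    size = footprint_bytes(2_000_000, 32)
    print(f"footprint: {size / 1e6:.0f} MB")
    # Hypothetical rate: 1,000 full sweeps over the data per second.
    print(f"bandwidth: {streaming_bandwidth_gb_per_s(size, 1000):.0f} GB/s")
```

The point of the sketch is only that footprint grows linearly with element count, and that once the footprint exceeds the cache, required bandwidth grows with both the footprint and how often the data is traversed.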

Our contributions are as follows:

  • We have parallelized six state-of-the-art physical simulation applications: fluid dynamics [4], human face simulation [12], and cloth simulation [2] for production physics, and convex body collision [1, 3], game cloth [7], and game fluids [9] for game physics. In parallelizing these workloads, we employed various techniques, including parallelizing loops and graph operations and using alternative algorithms for better scalability.
  • We simulated and analyzed the scalability of these applications using cycle-accurate simulation of a chip-multiprocessor with 64 cores. The workloads studied achieved a parallel scaling of 30x to 60x for 64 cores.
  • We performed a detailed analysis of the memory requirements of these applications. Our study finds that future physical simulation workloads demand cache sizes close to 100 megabytes or physical main memory bandwidths in the hundreds of GB/s.
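To put the reported 30x to 60x scaling on 64 cores in perspective, the sketch below converts a speedup into parallel efficiency and into the serial fraction that Amdahl's law would imply. Amdahl's law is used here as a simplifying assumption, not as the analysis method of this paper: it ignores memory bandwidth limits, load imbalance, and synchronization costs, all of which also limit real scaling. The function names are illustrative.

```python
def parallel_efficiency(speedup: float, cores: int) -> float:
    """Fraction of ideal linear scaling achieved."""
    return speedup / cores


def amdahl_serial_fraction(speedup: float, cores: int) -> float:
    """Serial fraction s satisfying speedup == 1 / (s + (1 - s) / cores)."""
    if cores < 2:
        raise ValueError("cores must be at least 2")
    if not 1.0 <= speedup <= cores:
        raise ValueError("speedup must lie between 1 and the core count")
    # Rearranging Amdahl's law: cores / speedup - 1 == s * (cores - 1).
    return (cores / speedup - 1.0) / (cores - 1.0)


if __name__ == "__main__":
    cores = 64
    for speedup in (30.0, 60.0):
        eff = parallel_efficiency(speedup, cores)
        serial = amdahl_serial_fraction(speedup, cores)
        print(f"{speedup:.0f}x on {cores} cores: "
              f"efficiency {eff:.1%}, implied serial fraction {serial:.2%}")
```

Under that model, 30x on 64 cores corresponds to about 47% efficiency and a serial fraction near 1.8%, while 60x corresponds to about 94% efficiency and a serial fraction near 0.1%.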
