Code Sample: An Approach to Parallel Processing with Unreal Engine*

ID 671645
Updated 4/26/2018
Version Latest



File(s): Download (Intel Software Github)
License: Intel Sample Source Code License Agreement
Optimized for...  
Operating System: Windows® 10 (64 bit)
Hardware: GPU required
(Programming Language, tool, IDE, Framework)
Microsoft Visual Studio* 2017, Unreal Engine* 4, C#

Familiarity with Microsoft Visual Studio, Unreal Engine API, 3D graphics, parallel processing.


The idea behind this project was to provide a demonstration of parallel processing in gaming with Unreal Engine* and how to perform gaming-related physics using this game engine. In this domain, realism is important as an indicator of success. In order to mimic the actual world, many things need to happen at the same time which requires parallel processing. This code and accompanying article (see References below) cover development of a flocking algorithm, which is then demonstrated as schools of fish via an application. This application can be run single- or multi-threaded in order to see the differences in performance.  Furthermore, performance is improved by performing physics calculations on the GPU. The article also covers the overall complexity of the algorithm and how increasing the number of fish affects productivity.

  1. Overview of a flocking algorithm
  2. Overview of the algorithm implementation
  3. Single thread, multi-thread or GPU

Get Started

Overview of a flocking algorithm

In this example, a flock was defined as a school of fish. For each member, the algorithm needs to worry about cohesion, alignment and separation.  Each fish was calculated to “swim” within a school if it was within a certain distance from any other fish in the school. Members of a school will not act as individuals, but only as members of a flock, sharing same parameters such as the speed and the direction.

  • Cohesion: Fish search for their neighbors in a radius defined as the “Radius of Cohesion”. The current positions of all neighbors are summed and the result is divided by the number of neighbors, thus, the center of mass of the neighbors is obtained. This is the point to which the fish strive for cohesion. To determine the direction of movement of the fish, the current position of the fish is subtracted from the result obtained earlier, and then the resulting vector is normalized.
  • Separation: Fish search for their neighbors in a radius defined as the “Separation Radius”. To calculate the motion vector of an individual fish in a specific separation direction from a school, the difference in the positions of the neighbors and its own position is summed. The result is divided by the number of neighbors and then normalized and multiplied by -1 to change the initial direction of the fish to swim in the opposite direction of the neighbors.
  • Alignment: Fish search for their neighbors in a radius defined as the “Radius of Alignment”. The current speeds of all neighbors are summed, then divided by the number of neighbors. The resulting vector is normalized.
  • Reversal: The fish can only swim in a given space, the boundaries of which are specified. The moment a fish crosses a boundary must be identified. If a fish contacts a boundary then direction of the fish is changed to the opposite vector, thereby keeping the fish within the defined space.

These four basic principles of behavior for each fish in a school are combined to calculate the total position values, speed, and acceleration of each fish. These values must be calculated for each fish in each frame.

In the algorithm the concept of weight coefficients was introduced to increase or decrease the influence of each of these three modes of behavior (cohesion, separation, and alignment). The weight coefficient was not applied to the behavior of Reversal because fish were not permitted to swim outside of the defined boundaries. For this reason, Reversal had the highest priority. Also, the algorithm provided for maximum speed and acceleration

Overview of the algorithm implementation

To calculate the state of fish in a school, double buffering is used. Fish states are stored in an array of size N x 2, where N is the number of fish, and 2 is the number of copies of states. The algorithm is implemented using two nested loops. In the internal nested loop, the direction vectors are calculated for the three types of behavior (cohesion, separation, and alignment). In the external nested loop, the final calculation of the new state of the fish is made based on calculations in the internal nested loop. These calculations are also based on the values ​​of the weight coefficients of each type of behavior and the maximum values ​​of speed and acceleration. The article covers each of these loops in detail, as well as the compute shader used to calculate the position of each fish.

Single-thread, multi-thread or GPU

As was mentioned above, the example can be run in single- or multi-threaded mode, which is easily accomplished using ParallelFor.  ParallelFor can be used in either of two modes, depending on the state of the isSingleThread Boolean variable:

ParallelFor(cnt, [&agents, currentStatesIndex, previousStatesIndex, kCoh, kSep, kAlign, rCohesion, rSeparation, rAlignment, maxAccel, maxVel, mapSz, DeltaTime, isSingleThread](int32 fishNum) {

Because the position of each fish and its neighbors needs to be calculated for each frame increasing the number of fish greatly increases the number of calculations. It is not surprising that the application running in multiple threads outperforms the single-threaded mode. Splitting the work farther by utilizing the GPU further improves the performance. Again, this is covered in more detail in the article.


Nikolay Lazarev, Integrated Computing Solutions, Inc., Unreal Engine 4 - Parallel Processing a School of Fish, 2018

Updated Log

Created March 20, 2018