# Unreal Engine 4 Parallel Processing School of Fish

Published: 03/17/2018

Last Updated: 03/16/2018

Nikolay Lazarev

Integrated Computer Solutions, Inc.

## General Description of the Flocking Algorithm

The implemented flocking algorithm simulates the behavior of a school, or flock, of fish. The algorithm contains four basic behaviors:

- Cohesion: Fish search for their neighbors in a radius defined as the
*Radius of Cohesion*. The current positions of all neighbors are summed. The result is divided by the number of neighbors. Thus, the center of mass of the neighbors is obtained. This is the point to which the fish strive for cohesion. To determine the direction of movement of the fish, the current position of the fish is subtracted from the result obtained earlier, and then the resulting vector is normalized. - Separation: Fish search for their neighbors in a radius defined as the
*Separation Radius*. To calculate the motion vector of an individual fish in a specific separation direction from a school, the difference in the positions of the neighbors and its own position is summed. The result is divided by the number of neighbors and then normalized and multiplied by -1 to change the initial direction of the fish to swim in the opposite direction of the neighbors. - Alignment: Fish search for their neighbors in a radius defined as the
*Radius of Alignment*. The current speeds of all neighbors are summed, then divided by the number of neighbors. The resulting vector is normalized. - Reversal: All of the fish can only swim in a given space, the boundaries of which can be specified. The moment a fish crosses a boundary must be identified. If a fish
*hits*a boundary, then the direction of the fish is changed to the opposite vector (thereby keeping the fish within the defined space).

These four basic principles of behavior for each fish in a school are combined to calculate the total position values, speed, and acceleration of each fish. In the proposed algorithm, the concept of weight coefficients was introduced to increase or decrease the influence of each of these three modes of behavior (cohesion, separation, and alignment). The weight coefficient was not applied to the behavior of reversal, because fish were not permitted to swim outside of the defined boundaries. For this reason, reversal had the highest priority. Also, the algorithm provided for maximum speed and acceleration.

According to the algorithm described above, the parameters of each fish were calculated (position, velocity, and acceleration). These parameters were calculated for each frame.

## Source Code of the Flocking Algorithm with Comments

To calculate the state of fish in a school, double buffering is used. Fish states are stored in an array of size N x 2, where N is the number of fish, and 2 is the number of copies of states.

The algorithm is implemented using two nested loops. In the internal nested loop, the direction vectors are calculated for the three types of behavior (cohesion, separation, and alignment). In the external nested loop, the final calculation of the new state of the fish is made based on calculations in the internal nested loop. These calculations are also based on the values of the weight coefficients of each type of behavior and the maximum values of speed and acceleration.

External loop: At each iteration of a cycle, a new value for the position of each fish is calculated. As arguments to the lambda function, references are passed to:

agents | Array of fish states |

currentStatesIndex | Index of array where the current states of each fish are stored |

previousStatesIndex | Index of array where the previous states of each fish are stored |

kCoh | Weighting factor for cohesion behavior |

kSep | Weighting factor for separation behavior |

kAlign | Weighting factor for alignment behavior |

rCohesion | Radius in which neighbors are sought for cohesion |

rSeparation | Radius in which neighbors are sought for separation |

rAlignment | Radius in which the neighbors are sought for alignment |

maxAccel | Maximum acceleration of fish |

maxVel | Maximum speed of fish |

mapSz | Boundaries of the area in which fish are allowed to move |

DeltaTime | Elapsed time since the last calculation |

isSingleThread | Parameter that indicates in which mode the loop will run |

ParllelFor can be used in either of two modes, depending on the state of the isSingleThread Boolean variable:

```
ParallelFor(cnt, [&agents, currentStatesIndex, previousStatesIndex, kCoh, kSep, kAlign, rCohesion, rSeparation,
rAlignment, maxAccel, maxVel, mapSz, DeltaTime, isSingleThread](int32 fishNum) {
```

Initializing directions with a zero vector to calculate each of the three behaviors:

` FVector cohesion(FVector::ZeroVector), separation(FVector::ZeroVector), alignment(FVector::ZeroVector);`

Initializing neighbor counters for each type of behavior:

` int32 cohesionCnt = 0, separationCnt = 0, alignmentCnt = 0;`

Internal nested loop. Calculates the direction vectors for the three types of behavior:

` for (int i = 0; i < cnt; i++) {`

Each fish should ignore (not calculate) itself:

` if (i != fishNum) {`

Calculate the distance between the position of a current fish and the position of each other fish in the array:

` float distance = FVector::Distance(agents[i][previousStatesIndex].position, agents[fishNum][previousStatesIndex].position);`

If the distance is less than the cohesion radius:

` if (distance < rCohesion) {`

Then the neighbor position is added to the cohesion vector:

` cohesion += agents[i][previousStatesIndex].position;`

The value of the neighbor counter is increased:

```
cohesionCnt++;
}
```

If the distance is less than the separation radius:

` if (distance < rSeparation) {`

The difference between the position of the neighbor and the position of the current fish is added to the separation vector:

` separation += agents[i][previousStatesIndex].position - agents[fishNum][previousStatesIndex].position;`

The value of the neighbor counter is increased:

```
separationCnt++;
}
```

If the distance is less than the radius of alignment:

` if (distance < rAlignment) {`

Then the velocity of the neighbor is added to the alignment vector:

` alignment += agents[i][previousStatesIndex].velocity;`

The value of the neighbor counter is increased:

```
alignmentCnt++;
}
}
```

If neighbors were found for cohesion:

` if (cohesionCnt != 0) {`

Then the cohesion vector is divided by the number of neighbors and its own position is subtracted:

```
cohesion /= cohesionCnt;
cohesion -= agents[fishNum][previousStatesIndex].position;
```

The cohesion vector is normalized:

```
cohesion.Normalize();
}
```

If neighbors were found for separation:

` if (separationCnt != 0) {`

The separation vector is divided by the number of neighbors and multiplied by -1 to change the direction:

```
separation /= separationCnt;
separation *= -1.f;
```

The separation vector is normalized:

```
separation.Normalize();
}
```

If neighbors were found for alignment:

` if (alignmentCnt != 0) {`

The alignment vector is divided by the number of neighbors:

` alignment /= alignmentCnt;`

The alignment vector is normalized:

```
alignment.Normalize();
}
```

Based on the weight coefficients of each of the possible types of behavior, a new acceleration vector is determined, limited by the value of the maximum acceleration:

`agents[fishNum][currentStatesIndex].acceleration = (cohesion * kCoh + separation * kSep + alignment * kAlign).GetClampedToMaxSize(maxAccel);`

To limit the acceleration vector along the Z-axis:

` agents[fishNum][currentStatesIndex].acceleration.Z = 0;`

Add to the previous position of the fish the result of the multiplication of the new velocity vector and the time elapsed since the last calculation:

` agents[fishNum][currentStatesIndex].velocity += agents[fishNum][currentStatesIndex].acceleration * DeltaTime;`

The velocity vector is limited to the maximum value:

```
agents[fishNum][currentStatesIndex].velocity =
agents[fishNum][currentStatesIndex].velocity.GetClampedToMaxSize(maxVel);
```

To the previous position of a fish, the multiplication of the new velocity vector and the time elapsed since the last calculation is added:

` agents[fishNum][currentStatesIndex].position += agents[fishNum][currentStatesIndex].velocity * DeltaTime;`

The current fish is checked to be within the specified boundaries. If yes, the calculated speed and position values are saved. If the fish has moved beyond the boundaries of the region along one of the axes, then the value of the velocity vector along this axis is multiplied by -1 to change the direction of motion:

** **agents[fishNum][currentStatesIndex].velocity = checkMapRange(mapSz,
agents[fishNum][currentStatesIndex].position, agents[fishNum][currentStatesIndex].velocity);
}, isSingleThread);

For each fish, collisions with world-static objects, like underwater rocks, should be detected, before new states are applied:

` for (int i = 0; i < cnt; i++) {`

То detect collisions between fish and world-statiс objects:

```
FHitResult hit(ForceInit);
if (collisionDetected(agents[i][previousStatesIndex].position, agents[i][currentStatesIndex].position, hit)) {
```

If a collision is detected, then the previously calculated position should be undone. The velocity vector should be changed to the opposite direction and the position recalculated:

```
agents[i][currentStatesIndex].position -= agents[i] [currentStatesIndex].velocity * DeltaTime;
agents[i][currentStatesIndex].velocity *= -1.0;
agents[i][currentStatesIndex].position += agents[i][currentStatesIndex].velocity * DeltaTime;
}
}
```

Having calculated the new states of all fish, these updated states will be applied, and all fish will be moved to a new position:

** **for (int i = 0; i < cnt; i++) {
FTransform transform;
m_instancedStaticMeshComponent->GetInstanceTransform(agents[i][0]->instanceId, transform);

Set up a new position of the fish instance:

` transform.SetLocation(agents[i][0]->position);`

Turn the fish head forward in the direction of movement:

```
FVector direction = agents[i][0].velocity;
direction.Normalize();
transform.SetRotation(FRotationMatrix::MakeFromX(direction).Rotator().Add(0.f, -90.f, 0.f).Quaternion());
```

Update instance transform:

```
m_instancedStaticMeshComponent->UpdateInstanceTransform(agents[i][0].instanceId, transform, false, false);
}
```

Redraw all the fish:

```
m_instancedStaticMeshComponent->ReleasePerInstanceRenderData();
m_instancedStaticMeshComponent->MarkRenderStateDirty();
```

Swap indexed fish states:

` swapFishStatesIndexes();`

## Complexity of the Algorithm: How Increasing the Number of Fish Affects Productivity

Suppose that the number of fish participating in the algorithm is N. To determine the new state of each fish, the distance to all the other fish must be calculated (not counting additional operations for determining the direction vectors for the three types of behavior). The initial complexity of the algorithm will be O(N^{2}). For example, 1,000 fish will require 1,000,000 operations.

Figure 1: Computational operations for calculating the positions of all fish in a scene.

## Compute Shader with Comments

Structure describing the state of each fish:

```
struct TInfo{
int instanceId;
float3 position;
float3 velocity;
float3 acceleration;
};
```

Function for calculating the distance between two vectors:

```
float getDistance(float3 v1, float3 v2) {
return sqrt((v2[0]-v1[0])*(v2[0]-v1[0]) + (v2[1]-v1[1])*(v2[1]-v1[1]) + (v2[2]-v1[2])*(v2[2]-v1[2]));
}
RWStructuredBuffer<TInfo> data;
[numthreads(1, 128, 1)]
void VS_test(uint3 ThreadId : SV_DispatchThreadID)
{
```

Total number of fish:

` int fishCount = constants.fishCount;`

This variable, created and initialized in C++, determines the number of fish calculated in each graphics processing unit (GPU) thread (by default:1):

` int calculationsPerThread = constants.calculationsPerThread;`

Loop for calculating fish states that must be computed in this thread:

` for (int iteration = 0; iteration < calculationsPerThread; iteration++) {`

Thread identifier. Corresponds to the fish index in the state array:

` int currentThreadId = calculationsPerThread * ThreadId.y + iteration;`

The current index is checked to ensure it does not exceed the total number of fish (this is possible, since more threads can be started than there are fish):

```
if (currentThreadId >= fishCount)
return;
```

To calculate the state of fish, a single double-length array is used. The first N elements of this array are the new states of fish to be calculated; the second N elements are the older states of fish that were previously calculated.

Current fish index:

` int currentId = fishCount + currentThreadId;`

Copy of the structure of the current state of fish:

` TInfo currentState = data[currentThreadId + fishCount];`

Copy of the structure of the new state of fish:

` TInfo newState = data[currentThreadId];`

Initialize direction vectors for the three types of behavior:

```
float3 steerCohesion = {0.0f, 0.0f, 0.0f};
float3 steerSeparation = {0.0f, 0.0f, 0.0f};
float3 steerAlignment = {0.0f, 0.0f, 0.0f};
```

Initialize neighbors counters for each type of behavior:

```
float steerCohesionCnt = 0.0f;
float steerSeparationCnt = 0.0f;
float steerAlignmentCnt = 0.0f;
```

Based on the current state of each fish, direction vectors are calculated for each of the three types of behaviors. The cycle begins with the middle of the input array, which is where the older states are stored:

` for (int i = fishCount; i < 2 * fishCount; i++) {`

Each fish should ignore (not calculate) itself:

` if (i != currentId) {`

Calculate the distance between the position of current fish and the position of each other fish in the array:

` float d = getDistance(data[i].position, currentState.position);`

If the distance is less than the cohesion radius:

` if (d < constants.radiusCohesion) {`

Then the neighbor’s position is added to the cohesion vector:

` steerCohesion += data[i].position;`

And the counter of neighbors for cohesion is increased:

```
steerCohesionCnt++;
}
```

If the distance is less than the separation radius:

` if (d < constants.radiusSeparation) {`

Then the separation vector is added to the difference between the position of the neighbor and the position of the current fish:

` steerSeparation += data[i].position - currentState.position;`

The counter of the number of neighbors for separation increases:

```
steerSeparationCnt++;
}
```

If the distance is less than the alignment radius:

` if (d < constants.radiusAlignment) {`

Then the velocity of the neighbor is added to the alignment vector:

` steerAlignment += data[i].velocity;`

The counter of the number of neighbors for alignment increases:

```
steerAlignmentCnt++;
}
}
}
```

If neighbors were found for cohesion:

** ** if (steerCohesionCnt != 0) {

The cohesion vector is divided by the number of neighbors and its own position is subtracted:

` steerCohesion = (steerCohesion / steerCohesionCnt - currentState.position);`

The cohesion vector is normalized:

```
steerCohesion = normalize(steerCohesion);
}
```

If neighbors were found for separation:

` if (steerSeparationCnt != 0) {`

Then the separation vector is divided by the number of neighbors and multiplied by -1 to change the direction:

` steerSeparation = -1.f * (steerSeparation / steerSeparationCnt);`

The separation vector is normalized:

```
steerSeparation = normalize(steerSeparation);
}
```

If neighbors were found for alignment:

` if (steerAlignmentCnt != 0) {`

Then the alignment vector is divided by the number of neighbors:

` steerAlignment /= steerAlignmentCnt;`

The alignment vector is normalized:

```
steerAlignment = normalize(steerAlignment);
}
```

Based on the weight coefficients of each of the three possible types of behaviors, a new acceleration vector is determined, limited by the value of the maximum acceleration:

```
newState.acceleration = (steerCohesion * constants.kCohesion + steerSeparation * constants.kSeparation
+ steerAlignment * constants.kAlignment);
newState.acceleration = clamp(newState.acceleration, -1.0f * constants.maxAcceleration,
constants.maxAcceleration);
```

To limit the acceleration vector along the Z-axis:

` newState.acceleration[2] = 0.0f;`

To the previous velocity vector, the product of the new acceleration vector and the time elapsed since the last calculation is added. The velocity vector is limited to the maximum value:

```
newState.velocity += newState.acceleration * variables.DeltaTime;
newState.velocity = clamp(newState.velocity, -1.0f * constants.maxVelocity, constants.maxVelocity);
```

Add to the previous position of the fish the result of the multiplication of the new velocity vector and the time elapsed since the last calculation:

` newState.position += newState.velocity * variables.DeltaTime;`

The current fish is checked to be within the specified boundaries. If yes, the calculated speed and position values are saved. If the fish has moved beyond the boundaries of the region along one of the axes, then the value of the velocity vector along this axis is multiplied by -1 to change the direction of motion:

```
float3 newVelocity = newState.velocity;
if (newState.position[0] > constants.mapRangeX || newState.position[0] < -constants.mapRangeX) {
newVelocity[0] *= -1.f;
}
if (newState.position[1] > constants.mapRangeY || newState.position[1] < -constants.mapRangeY) {
newVelocity[1] *= -1.f;
}
if (newState.position[2] > constants.mapRangeZ || newState.position[2] < -3000.f) {
newVelocity[2] *= -1.f;
}
newState.velocity = newVelocity;
data[currentThreadId] = newState;
}
}
```

Table 1: Comparison of algorithms.

Fish |
Algorithm (FPS) |
Computing Operations |
||
---|---|---|---|---|

CPU SINGLE |
CPU MULTI |
GPU MULTI |
||

100 |
62 |
62 |
62 |
10000 |

500 |
62 |
62 |
62 |
250000 |

1000 |
62 |
62 |
62 |
1000000 |

1500 |
49 |
61 |
62 |
2250000 |

2000 |
28 |
55 |
62 |
4000000 |

2500 |
18 |
42 |
62 |
6250000 |

3000 |
14 |
30 |
62 |
9000000 |

3500 |
10 |
23 |
56 |
12250000 |

4000 |
8 |
20 |
53 |
16000000 |

4500 |
6 |
17 |
50 |
20250000 |

5000 |
5 |
14 |
47 |
25000000 |

5500 |
4 |
12 |
35 |
30250000 |

6000 |
3 |
10 |
31 |
36000000 |

6500 |
2 |
8 |
30 |
42250000 |

7000 |
2 |
7 |
29 |
49000000 |

7500 |
1 |
7 |
27 |
56250000 |

8000 |
1 |
6 |
24 |
64000000 |

8500 |
0 |
5 |
21 |
72250000 |

9000 |
0 |
5 |
20 |
81000000 |

9500 |
0 |
4 |
19 |
90250000 |

10000 |
0 |
3 |
18 |
100000000 |

10500 |
0 |
3 |
17 |
110250000 |

11000 |
0 |
2 |
15 |
121000000 |

11500 |
0 |
2 |
15 |
132250000 |

12000 |
0 |
1 |
14 |
144000000 |

13000 |
0 |
0 |
12 |
169000000 |

14000 |
0 |
0 |
11 |
196000000 |

15000 |
0 |
0 |
10 |
225000000 |

16000 |
0 |
0 |
9 |
256000000 |

17000 |
0 |
0 |
8 |
289000000 |

18000 |
0 |
0 |
3 |
324000000 |

19000 |
0 |
0 |
2 |
361000000 |

20000 |
0 |
0 |
1 |
400000000 |

**Figure 2:*** Comparison of algorithms.*

**Laptop Hardware**:

CPU – Intel^{®} Core^{™} i7-3632QM processor 2.2 GHz with turbo boost up to 3.2 GHz

GPU - NVIDIA GeForce* GT 730M

RAM - 8 GB DDR3*

^{1}

#### Product and Performance Information

^{1}

Performance varies by use, configuration and other factors. Learn more at www.Intel.com/PerformanceIndex.