Using C# for Real-time Systems

by Sergei Solokhin, January 20th, 2023

My area of expertise lies in real-time systems, motion capture, and character animation. I have gained experience in this field through my work in the previs and game industries, mainly using Autodesk MotionBuilder and various game engines such as Unreal Engine and the internal engine of Avalanche Studios. Along the way I have built a strong understanding of how to use the C++ programming language and its compiler to meet real-time evaluation needs.

Some time ago, I was hesitant about taking on the challenge of continuing my real-time systems work on the .NET platform. However, I soon realized that success depends less on the language or platform than on understanding its limitations and developing a "best practice" that plays to its strengths and works around its weaknesses. My best practice includes reading books and articles, learning from the work of others in public repositories, running my own benchmarks to draw conclusions about how best to do my work, and, of course, talking to my great colleagues to share experience.

There are two things that I keep in mind during the development process: code readability and budgeting.

Code readability

It is important to achieve a balance between optimization and readability when writing code so that the review process and collaboration with other team members are smooth and efficient. Going too far in optimizing code can increase complexity and reduce readability, thus slowing down the development process.

Budgeting

Another challenge in developing real-time systems is the resources of a PC, especially when doing prototypes or unit tests in a small synthetic environment. It is easy to run a lot of tasks in a frame without noticing the amount of memory and CPU/GPU power being used or to overlook which parts are causing a bottleneck. Even if the unit tests are successful, it does not necessarily mean that the system is correctly configured. It is important to benchmark the memory and CPU usage and be aware of what the budget should be for the system in order to ensure that its execution, memory usage, and threading do not become a bottleneck when combined with other systems under high load.

I am pleased to see that optimization in .NET and C# is now getting more attention, giving developers greater control over processes and resources. We are constrained to .NET Core 3.1 and C# 7.2, so I will share my best practice from that perspective with regard to features, memory, and CPU usage.

The main instruments I used to find best practices were:

- books and articles on .NET performance
- repositories of high-performance .NET code
- benchmarks using the BenchmarkDotNet package. The code and the benchmark package answered many questions, but I had to take the execution platform into account: we also use Unity, and results there can differ from those on plain .NET
- profiling, ranging from Stopwatch and manually prepared counters up to third-party .NET profilers and the Unity profiler

As for Stopwatch, I personally try to avoid creating new instances of the Stopwatch class. Instead, I use its static members GetTimestamp and Frequency, from which the elapsed number of seconds since an earlier timestamp can be calculated as:

double seconds = (Stopwatch.GetTimestamp() - startTimestamp) / (double)Stopwatch.Frequency;
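
The timestamp arithmetic above can be wrapped in a small value type so that starting a measurement allocates nothing on the heap. This is only a minimal sketch: the `FrameTimer` name and its members are hypothetical, and only `Stopwatch.GetTimestamp` and `Stopwatch.Frequency` come from the framework.

```csharp
using System.Diagnostics;

// Hypothetical helper: a struct, so creating one does not allocate on the heap.
public struct FrameTimer
{
    private long _startTimestamp;

    // Captures the current high-resolution timestamp.
    public static FrameTimer StartNew()
    {
        return new FrameTimer { _startTimestamp = Stopwatch.GetTimestamp() };
    }

    // Seconds elapsed since StartNew() was called.
    public double ElapsedSeconds
    {
        get
        {
            long delta = Stopwatch.GetTimestamp() - _startTimestamp;
            return delta / (double)Stopwatch.Frequency;
        }
    }
}
```

A caller would take `FrameTimer.StartNew()` at the start of a frame and read `ElapsedSeconds` at the end, for example to feed a per-frame budget counter.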

One of the biggest obstacles to having a good real-time experience is Garbage Collection (GC), which can cause "hitches" in the run-time execution. As such, I believe it is important to have a zero allocation strategy to minimize the impact of GC on performance. I have read some interesting articles that discuss the challenges of making high-loaded real-time systems using .NET C#, which I will share at the end of the article.

So here is my best practice that I would like to share.

Performance

Avoid runtime reflection for serialization. Consider either compile-time source generation or manual low-level writer calls such as WriteStartObject and WriteValue.
- I would not recommend reflection-based serialization like this
```
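using System.Text.Json;

[Serializable]
public class Item
{
	// Properties rather than fields: System.Text.Json skips public fields by default
	public float a { get; set; }
	public int b { get; set; }
}
...
Item item = new Item() {a = 1f, b = 5};
// The serializer discovers the members of Item through reflection at run time
string jsonString = JsonSerializer.Serialize(item);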
```
- use low-level primitive calls like these instead
```
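// writer is a low-level JSON writer, e.g. Newtonsoft.Json's JsonWriter
writer.WriteStartObject();
writer.WritePropertyName("a");
writer.WriteValue(item.a);
writer.WritePropertyName("b");
writer.WriteValue(item.b);
writer.WriteEndObject();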
```
- or consider the compile-time source generation mentioned above. I will not show an example of such usage here, but I will share a link to an article at the end.
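For reference, here is how the manual approach might look with `Utf8JsonWriter` from System.Text.Json, which is available on .NET Core 3.1. Treat it as a sketch under assumptions: the `ItemJson` helper is hypothetical, and the `MemoryStream` and the returned string are only there to keep the example short. In a hot path you would more likely write into a reused buffer.

```csharp
using System.IO;
using System.Text;
using System.Text.Json;

public class Item
{
    public float a;
    public int b;
}

// Hypothetical helper: writes each member explicitly, without reflection.
public static class ItemJson
{
    public static string Write(Item item)
    {
        using (var stream = new MemoryStream())
        {
            using (var writer = new Utf8JsonWriter(stream))
            {
                writer.WriteStartObject();
                writer.WriteNumber("a", item.a);
                writer.WriteNumber("b", item.b);
                writer.WriteEndObject();
            } // disposing the writer flushes pending bytes to the stream

            return Encoding.UTF8.GetString(stream.ToArray());
        }
    }
}
```

Because the members are named explicitly, public fields work here even though the reflection-based serializer would skip them by default.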
Use aggressive inlining. The annotation is only a hint, so the compiler is not obliged to follow it. I use it for small, simple calls that contain no exception logic
```
using System.Runtime.CompilerServices;
[MethodImpl(MethodImplOptions.AggressiveInlining)]
```
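As an illustration, the attribute could be applied to a tiny arithmetic helper like the one below. The `MathUtil.Lerp` function is a hypothetical example, not code from my project, and the JIT remains free to ignore the hint.

```csharp
using System.Runtime.CompilerServices;

public static class MathUtil
{
    // Small, simple, and free of exception logic: a reasonable inlining candidate.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static float Lerp(float a, float b, float t)
    {
        return a + (b - a) * t;
    }
}
```

Whether the hint pays off is best confirmed with a benchmark on the target platform rather than assumed.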
Consider using simple for loops instead of foreach and enumerables. Remember that I am writing from the perspective of C# 7.2; later releases of C# and .NET have improved in this area, so the situation may be different there. In my case, though, writing the critical real-time parts with for loops gives a visible performance gain.
Avoid using LINQ. LINQ expressions look cool and handy, but the downside is a drop in performance.
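To make the last two points concrete, here is a minimal sketch of the same query written with LINQ and with a plain for loop. The `Counting` class and its method names are hypothetical. The LINQ version allocates a closure for the captured `threshold` and an iterator object, and enumerates through interface calls. The loop version does none of that.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class Counting
{
    // LINQ version: concise, but allocates a closure and an iterator per call.
    public static int CountAboveLinq(List<float> values, float threshold)
    {
        return values.Where(v => v > threshold).Count();
    }

    // Plain for loop: indexes the list directly, no allocations.
    public static int CountAboveLoop(List<float> values, float threshold)
    {
        int count = 0;
        for (int i = 0; i < values.Count; i++)
        {
            if (values[i] > threshold)
            {
                count++;
            }
        }
        return count;
    }
}
```

Both methods return the same result, so the LINQ form can stay in tooling or setup code and the loop form can be used in per-frame code.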
Avoid exceptions in the real-time parts of the code.
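One common way to do that is the Try-pattern, where failure is reported through the return value instead of a thrown exception. The sketch below is an assumption about how this could look; `Samples.TryGetSample` is a hypothetical helper.

```csharp
public static class Samples
{
    // Returns false instead of throwing when the index is out of range.
    public static bool TryGetSample(float[] samples, int index, out float value)
    {
        // The unsigned comparison rejects negative and too-large indices in one check.
        if (samples == null || (uint)index >= (uint)samples.Length)
        {
            value = 0f;
            return false;
        }

        value = samples[index];
        return true;
    }
}
```

The caller decides what a failed lookup means for the current frame, for example reusing the previous value and incrementing an error counter.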
Logging (an IO operation) is also expensive for real-time applications. Use simpler logic instead: count the error-prone cases or keep a status variable for the current evaluation. These can then be printed less frequently, in batches together with other accumulated log data.
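A counter-based approach could be sketched as follows. The `FrameDiagnostics` class is hypothetical and is meant to be used from a single thread. It only increments integers during a frame and produces one summary string per batch of frames, which the caller can then hand to whatever logger it uses.

```csharp
public sealed class FrameDiagnostics
{
    private readonly int _reportEveryFrames;
    private int _frames;
    private int _errors;

    public FrameDiagnostics(int reportEveryFrames)
    {
        _reportEveryFrames = reportEveryFrames;
    }

    // Cheap enough to call from the real-time path: just an increment.
    public void CountError()
    {
        _errors++;
    }

    // Returns a summary once per batch, and null for all other frames.
    public string EndFrame()
    {
        _frames++;
        if (_frames < _reportEveryFrames)
        {
            return null;
        }

        string report = "frames=" + _frames + " errors=" + _errors;
        _frames = 0;
        _errors = 0;
        return report;
    }
}
```

The string is only built when a report is due, so frames in between stay free of allocations and IO.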
I would also like to mention parallel evaluation in C#. This, of course, refers to async/await methods and the task asynchronous programming model (TAP). I will not go into detail in this article; I would just recommend studying the topic to understand the patterns.

I would like to mention that, for me, a simple dedicated thread gives a more straightforward, more readable, and more functional solution. Async methods and Tasks are powerful instruments, but they play a different role: they reschedule the flow of execution, with some GC overhead from the state machines generated behind the scenes, and their performance depends on the final configuration of the system. They can even slow things down when many tasks are evaluated without any notion of the priority of subsystems. In Unity, I still think that Jobs combined with Burst's ahead-of-time compilation perform best. So I would only recommend learning the topic really well, as asynchronous execution is highly relevant to the hardware we have today.

Memory

A zero-allocation strategy means avoiding heap allocations during frame evaluation, so that the GC is not triggered massively and has fewer references to track. There are different strategies to tackle that. Some of them are covered in the articles that I will share at the end. My personal ways of dealing with it are the following.

Allocate on the stack where possible for small local processing arrays, and use an array pool for bigger arrays. What counts as small? From what I have learned, I would say about 769 bytes for the overall method stack.

There is also a helper method, RuntimeHelpers.TryEnsureSufficientExecutionStack, which checks whether there is sufficient stack left to execute an average .NET function and returns false if there is not
```
// Assigning to Span<int> keeps the code in a safe context (no unsafe pointer needed)
Span<int> arr = stackalloc int[8];
```
An array pool (ArrayPool) can be used for bigger arrays, but you must take care to always return rented arrays to the pool
```
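// The rented array may be longer than requested, so use only the first `size` elements
ItemData[] myData = ArrayPool<ItemData>.Shared.Rent(size);
// ... work with myData ...
ArrayPool<ItemData>.Shared.Return(myData);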
```
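The two techniques are often combined: a stack buffer for small inputs and a rented array for large ones, with a try/finally block guaranteeing the return to the pool. The sketch below is one possible shape of that pattern. The `Scratch` class, the `StackLimit` threshold of 64 elements, and the sum-of-squares workload are all hypothetical and only serve to show the buffer handling.

```csharp
using System;
using System.Buffers;

public static class Scratch
{
    // Hypothetical threshold: 64 longs = 512 bytes of stack.
    private const int StackLimit = 64;

    public static long SumOfSquares(ReadOnlySpan<int> input)
    {
        long[] rented = null;

        // Small inputs use the stack, large inputs rent from the shared pool.
        Span<long> scratch = input.Length <= StackLimit
            ? stackalloc long[StackLimit]
            : (rented = ArrayPool<long>.Shared.Rent(input.Length));

        // Both the stack buffer and a rented array can be larger than needed.
        scratch = scratch.Slice(0, input.Length);

        try
        {
            for (int i = 0; i < input.Length; i++)
            {
                scratch[i] = (long)input[i] * input[i];
            }

            long sum = 0;
            for (int i = 0; i < scratch.Length; i++)
            {
                sum += scratch[i];
            }
            return sum;
        }
        finally
        {
            // Always give a rented array back, even if the work above throws.
            if (rented != null)
            {
                ArrayPool<long>.Shared.Return(rented);
            }
        }
    }
}
```

The fixed-size `stackalloc` keeps stack usage predictable regardless of the input length.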
Avoid allocating new zero-length arrays; use the cached empty array instance instead
```
var emptyIntArray = Array.Empty<int>();
```

Use aligned memory allocations and cast them to structs.

A handy and efficient way to reinterpret a byte buffer as a Span of structs:

// count is the number of items the buffer should hold
byte[] buffer = new byte[count * ItemData.SIZE];
Span<ItemData> span = MemoryMarshal.Cast<byte, ItemData>(buffer);

A very efficient way to access an individual element:

int offset = index * ItemData.SIZE;
ref ItemData fullKey = ref Unsafe.As<byte, ItemData>(ref buffer[offset]);
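
Put together, the two access paths could look like the sketch below. The layout of `ItemData` (one float and one int, 8 bytes) and the `ItemBuffer` helper are assumptions made for the example. The struct must not contain references for `MemoryMarshal.Cast` to accept it.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

// Hypothetical layout: a 4-byte float followed by a 4-byte int.
[StructLayout(LayoutKind.Sequential, Pack = 4)]
public struct ItemData
{
    public const int SIZE = 8;
    public float Weight;
    public int Id;
}

public static class ItemBuffer
{
    // Views the whole byte buffer as a span of structs and writes one field.
    public static void SetId(byte[] buffer, int index, int id)
    {
        Span<ItemData> items = MemoryMarshal.Cast<byte, ItemData>(buffer.AsSpan());
        items[index].Id = id;
    }

    // Reinterprets the bytes at the element offset as a single struct reference.
    public static int GetId(byte[] buffer, int index)
    {
        int offset = index * ItemData.SIZE;
        ref ItemData item = ref Unsafe.As<byte, ItemData>(ref buffer[offset]);
        return item.Id;
    }
}
```

Neither path copies data: both read and write the original byte array in place. Note that `Unsafe.As` performs no bounds or type checks beyond the array index itself, so the offset arithmetic has to be right.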

Reuse memory when possible. One example is to use Span and ReadOnlySpan instead of copying and trimming strings or arrays:

string path = "process.this.long.&.string";
ReadOnlySpan<char> pathSpan = path.AsSpan();

int lastDelim = pathSpan.LastIndexOf('&');
if (lastDelim > 0)
{
	// A view over the original string: no copy, no allocation
	ReadOnlySpan<char> subPath = pathSpan.Slice(0, lastDelim);
	// Console.WriteLine(subPath.ToString());
}

Strings are expensive to use as keys; consider using a Guid or a hash instead. The logic is simple: every character of a string takes 2 bytes, while an entire uint hash code takes only 4 bytes.
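
As an illustration of the hash option, the sketch below computes a 32-bit FNV-1a hash over the UTF-16 characters of a string. The choice of FNV-1a and the `NameHash` class are assumptions for the example; any stable hash function would serve. Keep in mind that distinct strings can collide, so this fits best where the set of names is known and can be checked up front.

```csharp
using System.Collections.Generic;

public static class NameHash
{
    // 32-bit FNV-1a, processing one UTF-16 code unit per step.
    public static uint Compute(string text)
    {
        const uint offsetBasis = 2166136261;
        const uint prime = 16777619;

        uint hash = offsetBasis;
        for (int i = 0; i < text.Length; i++)
        {
            hash ^= text[i];
            hash = unchecked(hash * prime);
        }
        return hash;
    }

    // Hypothetical usage: hash the names once at setup, use uint keys per frame.
    public static Dictionary<uint, int> BuildIndex(string[] names)
    {
        var index = new Dictionary<uint, int>(names.Length);
        for (int i = 0; i < names.Length; i++)
        {
            index[Compute(names[i])] = i;
        }
        return index;
    }
}
```

The strings are hashed once during setup, so per-frame lookups compare 4-byte keys instead of walking character data.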