Writing Multithreaded Code with the Task Parallel Library (TPL)

Written by ssukhpinder | Published 2023/03/09

TL;DR: The "Parallel.For" example reads a directory name from the console and uses a TPL "For" loop to report the number of files directly inside the directory (files in subfolders are not counted) and their total size in bytes.

I am a big fan of running work on multiple threads in an application, and it is interesting to see how quickly parallelism can get through a heavy workload.

Let's explore TPL through the following topics:

  • Write a simple "Parallel.For" loop
  • Write a simple "Parallel.ForEach" loop
  • Cancel "Parallel.For" or "Parallel.ForEach" loop
  • Exception Handling in parallel loops

The "Parallel.For" Loop

The function below reads a directory name from the console and uses a TPL "For" loop to compute the result. It iterates over the directory and outputs the number of files directly inside it (files in subfolders are not counted), along with their total size in bytes. Because the loop body runs on several threads at once, the shared total is updated with "Interlocked.Add".

  // Requires: using System; using System.IO;
  //           using System.Threading; using System.Threading.Tasks;
  public static void BasicParallelForLoop()
  {
      long totalSize = 0;
      Console.WriteLine("Enter valid directory path :");
      String args = Console.ReadLine();
      if (!Directory.Exists(args))
      {
          Console.WriteLine("The specified directory does not exist.");
          return;
      }
      // GetFiles returns only the files directly inside the directory.
      String[] files = Directory.GetFiles(args);
      Parallel.For(0, files.Length,
          index =>
          {
              FileInfo fileInfo = new FileInfo(files[index]);
              long size = fileInfo.Length;
              // Iterations run concurrently, so add to the shared total atomically.
              Interlocked.Add(ref totalSize, size);
          });
      Console.WriteLine("Directory '{0}':, {1:N0} files, {2:N0} bytes", args, files.Length, totalSize);
  }

Output

The result below covers both scenarios: a valid and an invalid directory name.

//Valid directory scenario
Enter valid directory path :
E:\gifs
//Result
Directory 'E:\gifs':, 7 files, 2,313,294 bytes
//Invalid directory Result
//The specified directory does not exist.
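
The loop above calls "Interlocked.Add" once per file. As a possible variation (not part of the original sample), the thread-local overload of "Parallel.For" lets each worker keep its own subtotal and touch the shared total only once when it finishes. The sketch below uses an in-memory array of sizes instead of real files so it can run anywhere; the class name "ThreadLocalSumSketch" and the helper "SumSizes" are hypothetical names chosen for illustration.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ThreadLocalSumSketch
{
    // Hypothetical helper: sums the values using a per-worker subtotal.
    public static long SumSizes(long[] sizes)
    {
        long total = 0;
        Parallel.For<long>(0, sizes.Length,
            // localInit: each worker starts with its own subtotal of zero.
            () => 0L,
            // body: add the current element to the worker's subtotal (no locking needed).
            (index, loopState, subtotal) => subtotal + sizes[index],
            // localFinally: merge each worker's subtotal into the shared total atomically.
            subtotal => Interlocked.Add(ref total, subtotal));
        return total;
    }

    public static void Main()
    {
        long[] sizes = { 100, 250, 4096, 1 };
        Console.WriteLine(SumSizes(sizes)); // prints 4447
    }
}
```

Whether this is faster than the per-iteration "Interlocked.Add" depends on the workload; for a handful of files the difference is unlikely to matter.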

The "Parallel.ForEach" Loop

The "ForEach" loop works just like the "For" loop; the difference lies in the syntax. Let's solve the same example using the TPL "ForEach" loop.

Example Reference

As before, the example iterates over a directory and outputs the following:
- The number of files directly inside the directory (files in subfolders are not counted).
- Their total size in bytes.

Let's write some code. Notice that the syntax is more readable because each file path is passed to the loop body directly instead of being looked up by index.

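// "args" (the directory path) and "totalSize" are declared and validated
// exactly as in BasicParallelForLoop above.
String[] files = Directory.GetFiles(args);
Parallel.ForEach(files, (currentFile) =>
{
    FileInfo fileInfo = new FileInfo(currentFile);
    long size = fileInfo.Length;
    // Iterations run concurrently, so add to the shared total atomically.
    Interlocked.Add(ref totalSize, size);
});
Console.WriteLine("Directory '{0}':, {1:N0} files, {2:N0} bytes", args, files.Length, totalSize);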

Output

//Valid directory scenario
Enter valid directory path :
E:\gifs
//Result
Directory 'E:\gifs':, 7 files, 2,313,294 bytes
//Invalid directory Result
//The specified directory does not exist.
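
For readers who want to run the "ForEach" version without typing a path, here is a minimal self-contained sketch. It wraps the same loop in a hypothetical helper, "MeasureDirectory" (a name introduced here, not taken from the original sample), and exercises it against a throwaway directory created under the system temp path.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

public static class ForEachDirectorySketch
{
    // Hypothetical helper: counts the top-level files and sums their sizes.
    public static (int FileCount, long TotalBytes) MeasureDirectory(string path)
    {
        long totalSize = 0;
        string[] files = Directory.GetFiles(path); // top-level files only
        Parallel.ForEach(files, currentFile =>
        {
            long size = new FileInfo(currentFile).Length;
            Interlocked.Add(ref totalSize, size); // atomic update of the shared total
        });
        return (files.Length, totalSize);
    }

    public static void Main()
    {
        // Build a temporary directory containing two files of known size.
        string dir = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        Directory.CreateDirectory(dir);
        try
        {
            File.WriteAllBytes(Path.Combine(dir, "a.bin"), new byte[10]);
            File.WriteAllBytes(Path.Combine(dir, "b.bin"), new byte[32]);
            var result = MeasureDirectory(dir);
            // prints "2 files, 42 bytes"
            Console.WriteLine("{0} files, {1} bytes", result.FileCount, result.TotalBytes);
        }
        finally
        {
            Directory.Delete(dir, true); // clean up the temporary directory
        }
    }
}
```

Returning the values instead of printing them inside the helper is only a design choice for this sketch; it makes the loop easy to check against known input.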

Cancel "Parallel.For" or "Parallel.ForEach"

If the user wants to cancel a parallel loop while it is running, use a cancellation token; both "Parallel.For" and "Parallel.ForEach" support cancellation this way. The example iterates over an array of 100,000,000 integers (starting at 0) and cancels the loop partway through when the user presses the "s" key.

Let's write some code.

Use a "ParallelOptions" object to supply the cancellation token and the maximum degree of parallelism. Notice that the code below starts a task with "Task.Factory.StartNew" to read the key press, so that cancellation can be requested from another thread. The example applies the cancellation token to "Parallel.ForEach"; it can be used with "Parallel.For" in the same way.

  public static void BasicCancelLoop()
  {
      int[] nums = Enumerable.Range(0, 100000000).ToArray();
      CancellationTokenSource cts = new CancellationTokenSource();
      ParallelOptions parallelOptions = new ParallelOptions();
      parallelOptions.CancellationToken = cts.Token;
      parallelOptions.MaxDegreeOfParallelism = System.Environment.ProcessorCount;
      Console.WriteLine("Press any key to start. Press 's' to cancel.");
      Console.ReadKey();
      Task.Factory.StartNew(() =>
       {
           if (Console.ReadKey().KeyChar == 's')
               cts.Cancel();
           Console.WriteLine("press any key to exit");
       });
      try
      {
          Parallel.ForEach(nums, parallelOptions, (num) =>
          {
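              // Print the value and the ID of the thread that processed it.
              Console.WriteLine("{0} on {1}", num, Thread.CurrentThread.ManagedThreadId);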
              parallelOptions.CancellationToken.ThrowIfCancellationRequested();
          });
      }
      catch (OperationCanceledException e)
      {
          Console.WriteLine(e.Message);
      }
      finally
      {
          cts.Dispose();
      }
  }

Output

//print num values
....
....
....
//s key pressed
The operation was canceled.
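
The example above needs a key press, which makes it awkward to try in an automated run. As a rough non-interactive sketch, the token can instead be canceled on a timer with "CancellationTokenSource.CancelAfter". The class "TimedCancelSketch", the helper "RunUntilCanceled", and the 200 ms delay are assumptions made for illustration. The parallel loop observes the token passed in "ParallelOptions" by itself, so this sketch has no explicit "ThrowIfCancellationRequested" call in the loop body.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class TimedCancelSketch
{
    // Hypothetical helper: returns true if the loop was canceled before it finished.
    public static bool RunUntilCanceled(int cancelAfterMs)
    {
        using (var cts = new CancellationTokenSource())
        {
            var options = new ParallelOptions
            {
                CancellationToken = cts.Token,
                MaxDegreeOfParallelism = Environment.ProcessorCount
            };

            // Request cancellation automatically once the delay has elapsed.
            cts.CancelAfter(cancelAfterMs);
            try
            {
                // Far more iterations than can finish before the timer fires.
                Parallel.For(0, int.MaxValue, options, i =>
                {
                    Thread.Sleep(1); // simulated work
                });
                return false; // the loop ran to completion
            }
            catch (OperationCanceledException)
            {
                return true; // the loop stopped because the token was canceled
            }
        }
    }

    public static void Main()
    {
        Console.WriteLine(RunUntilCanceled(200) ? "Canceled" : "Completed");
    }
}
```

The "using" block disposes the "CancellationTokenSource" in the same way the "finally" block does in the original example.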

Exception Handling in parallel loops

The Parallel.For and Parallel.ForEach loops don't have any special mechanism to handle exceptions. In this respect they resemble regular "for" and "foreach" loops: an unhandled exception causes the loop to terminate. Because iterations run on multiple threads, the same kind of exception might be thrown on several threads at once, so your handling code has to cope with more than one exception. The loop wraps the exceptions thrown by its iterations in a System.AggregateException, which is what the caller catches.

Exception Handling Code

  public static void Handle()
  {
      int[] data = new int[3] { 1, 2, 3 };
      try
      {
          ProcessDataInParallel(data);
      }
      catch (AggregateException ex)
      {
          Console.WriteLine(ex.Message);
      }
  }
  private static void ProcessDataInParallel(int[] data)
  {
      Parallel.ForEach(data, d =>
      {
          throw new ArgumentException($"Exception with value : {d}");
      });
  }

Output

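The output shows the exceptions thrown by the loop iterations. Because the iterations run in parallel, the order (and possibly the number) of the messages can vary between runs.

One or more errors occurred. (Exception with value : 1) (Exception with value : 3) (Exception with value : 2)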
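
If the loop should keep processing the remaining items after a failure, one common approach is to catch exceptions inside the loop body, collect them in a thread-safe queue, and throw a single "AggregateException" once the loop has finished. The sketch below illustrates that idea; the class "ExceptionQueueSketch", the helper "ProcessAll", and the rule that values below 3 are invalid are all made up for the example.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class ExceptionQueueSketch
{
    // Hypothetical helper: processes every item, then reports all failures at once.
    public static void ProcessAll(int[] data)
    {
        // Thread-safe collection, because several iterations may fail concurrently.
        var exceptions = new ConcurrentQueue<Exception>();

        Parallel.ForEach(data, d =>
        {
            try
            {
                // Made-up validation rule, for illustration only.
                if (d < 3)
                    throw new ArgumentException($"Value is less than 3: {d}");
            }
            catch (Exception e)
            {
                // Store the exception and let the loop continue with other items.
                exceptions.Enqueue(e);
            }
        });

        // Surface every collected exception to the caller in one AggregateException.
        if (!exceptions.IsEmpty)
            throw new AggregateException(exceptions);
    }

    public static void Main()
    {
        try
        {
            ProcessAll(new[] { 1, 2, 3, 4 });
        }
        catch (AggregateException ae)
        {
            // Inspect each inner exception individually; their order may vary.
            foreach (Exception inner in ae.InnerExceptions)
                Console.WriteLine(inner.Message);
        }
    }
}
```

Iterating over "InnerExceptions" in the catch block gives access to each individual failure rather than only the combined message.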

GitHub Sample

https://github.com/ssukhpinder/TaskParallelLibExample

Why TPL?

First, TPL saves you the time of writing and tuning your own threading solution. It can also hook into ThreadPool internals that are not exposed publicly, which can boost performance in a variety of critical scenarios. Tasks take advantage of the work-stealing queues in the ThreadPool, which can help avoid the contention and cache-coherency issues that lead to performance degradation. There is a cost, though: using Tasks results in larger object allocations, and maintaining a Task's lifecycle (status, cancellation requests, exception handling, etc.) requires extra work and synchronization.


Thank you for reading. I hope you liked the article!



Written by ssukhpinder | Programmer by heart | C# | Python | .Net Core | Xamarin | Angular | AWS
Published by HackerNoon on 2023/03/09