paint-brush
Memory Efficient Multithreaded Incremental Segmented Sieve Algorithm: Sieve Optimizationsby@multithreading

Memory Efficient Multithreaded Incremental Segmented Sieve Algorithm: Sieve Optimizations

tldt arrow

Too Long; Didn't Read

With the growing computational needs associated with prime enumeration tasks, traditional implementations of the Sieve algorithm remain a bottleneck.
featured image - Memory Efficient Multithreaded Incremental Segmented Sieve Algorithm: Sieve Optimizations
MultiThreading.Tech: #1 Publication on Concurrent Programming HackerNoon profile picture

This paper is available on arxiv under CC 4.0 license.

Authors:

(1) Evan Ning, Milton Academy & [email protected];

(2) David R. Kaeli, Department of Electrical and Computer Engineering & Northeastern University [email protected].

2. Sieve Optimizations

Our implementation of the Sieve algorithm incorporates several optimizations to improve computational efficiency, selected specifically since we are mapping this algorithm to a multithreaded environment. We also consider a set of mathematical optimizations.

2.1 Memory Optimizations

Primes greater than 3 lie only in the form of 6k+1 and 6k+5, which means they are either 1 more or 5 more than a multiple of 6, as shown in Figure 2. By considering only numbers in the form 6k+1 and 6k+5, memory usage can be significantly reduced.


Figure 2. Prime greater than 3.

2.2 Divide the Range

If we are looking for primes up to a value of n, we can divide this range into segments. For instance, if n=10,000,000 and we have 10 threads, each thread would be assigned to handle 1,000,000 numbers. The main range can be split so that each thread works on contiguous segments of numbers, reducing the complexity of merging the results.

2.3 Multithreaded Tasks

A thread, often termed a lightweight process, represents the smallest sequence of programmed instructions that can be managed independently by a scheduler. Each thread is responsible for sieving out multiples of primes in its assigned range.


Due to memory constraints, it is not always possible to sieve very large numbers in a single thread. Utilizing consecutive chunks of memory for each thread can indeed help reduce the need for synchronization and thus can prevent memory contention. By allocating a distinct memory range for each thread, we ensure that no two threads are attempting to write to the same memory address at the same time, which is a common cause of race conditions. The segmented sieve approach sieves numbers in segments, making it memory efficient and parallel thread friendly.

2.4 Incremental Sieve

An incremental formulation of the sieve generates primes indefinitely (i.e., without an upper bound) by interleaving the generation of primes where the multiples of each prime P are generated directly by counting from the square of the prime in increments of P. The generation must be initiated only when the prime's square is reached, to avoid adverse effects on efficiency.