This paper is available on arxiv under CC 4.0 license.
Authors:
(1) Evan Ning, Milton Academy & [email protected];
(2) David R. Kaeli, Department of Electrical and Computer Engineering & Northeastern University [email protected].
Our implementation of the Sieve algorithm incorporates several optimizations to improve computational efficiency, selected specifically since we are mapping this algorithm to a multithreaded environment. We also consider a set of mathematical optimizations.
Primes greater than 3 lie only in the form of 6k+1 and 6k+5, which means they are either 1 more or 5 more than a multiple of 6, as shown in Figure 2. By considering only numbers in the form 6k+1 and 6k+5, memory usage can be significantly reduced.
If we are looking for primes up to a value of n, we can divide this range into segments. For instance, if n=10,000,000 and we have 10 threads, each thread would be assigned to handle 1,000,000 numbers. The main range can be split so that each thread works on contiguous segments of numbers, reducing the complexity of merging the results.
A thread, often termed a lightweight process, represents the smallest sequence of programmed instructions that can be managed independently by a scheduler. Each thread is responsible for sieving out multiples of primes in its assigned range.
Due to memory constraints, it is not always possible to sieve very large numbers in a single thread. Utilizing consecutive chunks of memory for each thread can indeed help reduce the need for synchronization and thus can prevent memory contention. By allocating a distinct memory range for each thread, we ensure that no two threads are attempting to write to the same memory address at the same time, which is a common cause of race conditions. The segmented sieve approach sieves numbers in segments, making it memory efficient and parallel thread friendly.
An incremental formulation of the sieve generates primes indefinitely (i.e., without an upper bound) by interleaving the generation of primes where the multiples of each prime P are generated directly by counting from the square of the prime in increments of P. The generation must be initiated only when the prime's square is reached, to avoid adverse effects on efficiency.