Segmented prime sieve for fun and profit by haampie · Pull Request #87 · JuliaMath/Primes.jl

haampie · 2020-09-07T14:19:54Z

Closes #45.

I have this code lying around for a bit now, but never turned it into a PR. Now I'm turning it into a PR to force myself to finish it 😅. It could use some tests and maybe some code style improvements / cleanup.

What it brings is an iterator:

using Primes.SegmentedSieve: SegmentIterator, vec_count_ones

function count_primes(a, b)
    cnt = 0
    for segment in SegmentIterator(a:b, 32 * 1024)
        cnt += vec_count_ones(segment.segment)
    end
    return cnt
end

which makes some operations very fast, as they operate on O(1) memory instead of O(n). For reference:

julia> @time count(Primes.primesmask(2^31, 2^32))
  7.362490 seconds (3.86 k allocations: 802.346 MiB, 1.18% gc time)
98182656

julia> @time count_primes(2^31, 2^32)
  0.452503 seconds (2.19 k allocations: 375.266 KiB)
98182656

this is 16x faster.

There's some neat optimizations: you can pick a buffer size to make things fit in L1 cache (above it's 32kb); the inner loop is unrolled by hand such that 8 multiples are removed per iteration; memory is compressed s.t. 1 byte equals an interval of 30 numbers (through a {2, 3, 5}-wheel); multiples of {7, 11, 13, 17} are "pre-sieved" in the sense that a buffer of size 7 * 11 * 13 * 17 is sieved and repeatedly copied over, instead of actually sieving 7, 11, 13 and 17 each over the full range.

You can also print the segment if you like:

julia> for segment in SegmentIterator(1:180, 2)
           println(segment)
       end
    01  07  11  13  17  19  23  29  
00   .   x   x   x   x   x   x   x  
30   x   x   x   x   x   .   x   x  

    01  07  11  13  17  19  23  29  
60   x   x   x   x   .   x   x   x  
90   .   x   x   x   x   x   x   .  

     01  07  11  13  17  19  23  29  
120   .   x   x   .   x   x   .   x  
150   x   x   .   x   x   .   x   x

So, here it has a buffer of 2 bytes, resembling an interval of 60 numbers, meaning it needs 3 iterations. Every x marks a prime number.

galanakis · 2021-12-11T23:16:23Z

Is there a documentation or a paper this algorithm is based on? I would like to help review it.

haampie · 2021-12-11T23:46:02Z

I think primesieve.org is a good reference. The wheel is a generalization of the idea of skipping multiples of 2 (which is easy to code), see https://en.wikipedia.org/wiki/Wheel_factorization for a good overview.

If you take a {2, 3, 5}-wheel, then consider any number z = 2*3*5*x + y = 30*x + y, where y = 1, ..., 30. Then z is composite if y is a multiple of 2, 3 or 5. So you only have to verify y = {1, 7, 11, 13, 17, 19, 23, 29} are prime for all x. Those are 8 numbers, and them being prime or not is 1 bit of info, so this neatly compressed to 1 byte of data per interval of 30 numbers on the number line.

The next optimization after compressing the number line by a constant factor is to work on constant size segments of the number line. So instead of 1:n, you work in segments 1:b, b+1:2b, 2b+1:3b, ..., m*b+1:n, and you allocate the boolean array once and for all. Effectively you do:

for (a, b) in segments # sieve a:b
  fill!(segment, 1)
  for p in primes_so_far
    # find the smallest number q st p * q >= a.
    # then remove p * q, p * (q + 1), p * (q + 2), ... until p * (q + m) > b from segment.
  end
  # find all primes in `segment` and push to `primes_so_far`
end

the inner loop is finicky, cause (a) you don't want to always compute the smallest q s.t. p * q >= a, instead you already know what it is thanks to the last iteration in the previous segment. and (b) you aren't going to remove multiples q, q+1, q+2, ... as you're skipping all p * (q+i)'s that aren't in the {1, 7, 11, 13, 17, 19, 23, 29} spokes -- and this is where the loop is unrolled by hand, and where all the pain and suffering comes from. There's going to be 8 loops for all possible p mod 30's, and 8 steps inside that loop for all q mod 30 values, leading to 64 entrypoints into the loops, and the main optimization is going to be here that instead of doing "cross off q * p, increment q, check if p * q is inside the segment" you're adding a hot loop where you're checking if you can cross off e.g. p * [q, q + 6, q + 10, q + 12, q + 16, q + 18, q + 22, q + 28] at once.

galanakis · 2021-12-12T01:28:08Z

Thank you. I am already familiar with wheel factorization and the the Segmented Sieve (Bays-Hudson). I have to think a bit about combining the two. I appreciate the long response and I will think about it. I really like the idea of combining the 8 bits into one byte.

galanakis · 2021-12-12T01:38:24Z

I have not thought about it very much, but this paper has some other little optimizations, under Algorithms B and C.

https://link.springer.com/content/pdf/10.1007/BF01932283.pdf

Is your segmented sieve similar to this? Or are you using a completely different approach which is incompatible. Maybe it is incompatible, because this paper basically uses the usual 2-wheel, but I really have to think about it.

oscardssmith · 2022-05-25T20:40:08Z

Want to merge update and this? It's a lot of extra code, but I've become convinced that something like this is necessary for optimal performance.

oscardssmith · 2022-05-25T20:40:08Z

Want to merge update and this? It's a lot of extra code, but I've become convinced that something like this is necessary for optimal performance.

Add a segmented sieve iterator

514a67c

haampie mentioned this pull request Dec 11, 2021

n-th prime is too slow. Consider using primecount as a backend #99

Open

oscardssmith mentioned this pull request Jun 13, 2022

faster prime sieve part 1 #120

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Segmented prime sieve for fun and profit#87

Segmented prime sieve for fun and profit#87
haampie wants to merge 1 commit intoJuliaMath:mainfrom
haampie:master

haampie commented Sep 7, 2020 •

edited

Loading

Uh oh!

galanakis commented Dec 11, 2021

Uh oh!

haampie commented Dec 11, 2021 •

edited

Loading

Uh oh!

galanakis commented Dec 12, 2021 •

edited

Loading

Uh oh!

galanakis commented Dec 12, 2021

Uh oh!

oscardssmith commented May 25, 2022

Uh oh!

oscardssmith commented May 25, 2022

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

haampie commented Sep 7, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

galanakis commented Dec 11, 2021

Uh oh!

haampie commented Dec 11, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

galanakis commented Dec 12, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

galanakis commented Dec 12, 2021

Uh oh!

oscardssmith commented May 25, 2022

Uh oh!

oscardssmith commented May 25, 2022

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

haampie commented Sep 7, 2020 •

edited

Loading

haampie commented Dec 11, 2021 •

edited

Loading

galanakis commented Dec 12, 2021 •

edited

Loading