class Bitcoin::GCSFilter

Golomb-coded set filter see github.com/bitcoin/bips/blob/master/bip-0158.mediawiki

Constants

MAX_ELEMENTS_SIZE

Attributes

encoded[R]
key[R]
m[R]
n[R]
p[R]

Public Class Methods

new(key, p, m, elements: nil, encoded_filter: nil) click to toggle source

initialize Filter object. @param [String] key the 128-bit key used to randomize the SipHash outputs. @param [Integer] p the bit parameter of the Golomb-Rice coding. @param [Integer] m which determines the false positive rate. @param [Array] elements the filter elements. @param [String] encoded_filter encoded filter with hex format. @return [Bitcoin::GCSFilter]

# File lib/bitcoin/gcs_filter.rb, line 24
def initialize(key, p, m, elements: nil, encoded_filter: nil)
  raise 'specify either elements or encoded_filter.' if elements.nil? && encoded_filter.nil?
  raise 'p must be <= 32' if p > 32
  @key = key
  @p = p
  @m = m
  if elements
    raise 'elements size must be < 2**32.' if elements.size >= MAX_ELEMENTS_SIZE
    @n = elements.size
    encoded = Bitcoin.pack_var_int(@n)
    bit_writer = Bitcoin::BitStreamWriter.new
    unless elements.empty?
      last_value = 0
      hashed_set = elements.map{|e| hash_to_range(e) }.sort
      hashed_set.each do |v|
        delta = v - last_value
        golomb_rice_encode(bit_writer, p, delta)
        last_value = v
      end
    end
    bit_writer.flush
    encoded << bit_writer.stream
    @encoded = encoded.bth
  else
    @encoded = encoded_filter
    @n, payload = Bitcoin.unpack_var_int(encoded_filter.htb)
  end
end

Public Instance Methods

f() click to toggle source

Range of element hashes, F = N * M

# File lib/bitcoin/gcs_filter.rb, line 54
def f
  n * m
end
hash_to_range(element) click to toggle source

Hash a data element to an integer in the range [0, F). @param [String] element with binary format. @return [Integer]

# File lib/bitcoin/gcs_filter.rb, line 61
def hash_to_range(element)
  hash = SipHash.digest(key, element)
  map_into_range(hash, f)
end
match?(element) click to toggle source

Checks if the element may be in the set. False positives are possible with probability 1/M. @param [String] element with binary format @return [Boolean] whether element in set.

# File lib/bitcoin/gcs_filter.rb, line 69
def match?(element)
  query = hash_to_range(element)
  match_internal?([query], 1)
end
match_any?(elements) click to toggle source

Checks if any of the given elements may be in the set. False positives are possible with probability 1/M per element checked. This is more efficient that checking Match on multiple elements separately. @param [Array] elements list of elements with binary format. @return [Boolean] whether element in set.

# File lib/bitcoin/gcs_filter.rb, line 78
def match_any?(elements)
  queries = elements.map{|e| hash_to_range(e) }.sort
  match_internal?(queries, queries.size)
end

Private Instance Methods

golomb_rice_decode(bit_reader, p) click to toggle source

decode golomb rice

# File lib/bitcoin/gcs_filter.rb, line 127
def golomb_rice_decode(bit_reader, p)
  q = 0
  while bit_reader.read(1) == 1
    q +=1
  end
  r = bit_reader.read(p)
  (q << p) + r
end
golomb_rice_encode(bit_writer, p, x) click to toggle source

encode golomb rice

# File lib/bitcoin/gcs_filter.rb, line 115
def golomb_rice_encode(bit_writer, p, x)
  q = x >> p
  while q > 0
    nbits = q <= 64 ? q : 64
    bit_writer.write(-1, nbits) # 18446744073709551615 is 2**64 - 1 = ~0ULL in cpp.
    q -= nbits
  end
  bit_writer.write(0, 1)
  bit_writer.write(x, p)
end
map_into_range(x, y) click to toggle source

hash are then mapped uniformly over the desired range by multiplying with F and taking the top 64 bits of the 128-bit result. lemire.me/blog/2016/06/27/a-fast-alternative-to-the-modulo-reduction/ stackoverflow.com/a/26855440

# File lib/bitcoin/gcs_filter.rb, line 88
def map_into_range(x, y)
  (x * y) >> 64
end
match_internal?(hashes, size) click to toggle source

Checks if the elements may be in the set. @param [Array] hashes the query hash list. @param [Integer] size query size. @return [Boolean] whether elements in set.

# File lib/bitcoin/gcs_filter.rb, line 96
def match_internal?(hashes, size)
  n, payload = Bitcoin.unpack_var_int(encoded.htb)
  bit_reader = Bitcoin::BitStreamReader.new(payload)
  value = 0
  hashes_index = 0
  n.times do
    delta = golomb_rice_decode(bit_reader, p)
    value += delta
    loop do
      return false if hashes_index == size
      return true if hashes[hashes_index] == value
      break if hashes[hashes_index] > value
      hashes_index += 1
    end
  end
  false
end