class ElastomerClient::Client::Scroller

Attributes

client[R]
query[R]
scroll_id[R]

Public Class Methods

new(client, query, opts = {})

Create a new scroller that can be used to iterate over all the documents returned by the `query`. The Scroller supports both the 'scan' and the 'scroll' search types.

See www.elastic.co/guide/en/elasticsearch/reference/current/search-request-scroll.html and www.elastic.co/guide/en/elasticsearch/reference/current/search-request-search-type.html#scan

client - ElastomerClient::Client used for HTTP requests to the server
query  - The query to scroll as a Hash or a JSON encoded String
opts   - Options Hash

:index       - the name of the index to search
:type        - the document type to search
:scroll      - the keep alive time of the scrolling request (5 minutes by default)
:size        - the number of documents per shard to fetch per scroll

Examples

scan = Scroller.new(client, {query: {match_all: {}}}, index: 'test-1')
scan.each_document { |doc|
  doc['_id']
  doc['_source']
}
# File lib/elastomer_client/client/scroller.rb, line 171
def initialize(client, query, opts = {})
  @client = client

  @opts = DEFAULT_OPTS.merge({ body: query }).merge(opts)

  @scroll_id = nil
  @offset = 0
end
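
The merge chain in the constructor fixes the precedence of options: defaults first, then the query stored under :body, then the caller's opts, with later hashes winning. The following is a minimal sketch of that precedence. EXAMPLE_DEFAULTS and build_opts are hypothetical names used for illustration only; the real contents of DEFAULT_OPTS are not shown on this page.

```ruby
# EXAMPLE_DEFAULTS is a hypothetical stand-in for DEFAULT_OPTS; the real
# default values are not shown in this document.
EXAMPLE_DEFAULTS = { scroll: "5m", size: 50 }.freeze

# Mirrors the merge chain in the constructor: later hashes win.
def build_opts(query, opts = {})
  EXAMPLE_DEFAULTS.merge({ body: query }).merge(opts)
end

query = { query: { match_all: {} } }

# The caller overrides :scroll and adds :index; :size keeps its default.
merged = build_opts(query, index: "test-1", scroll: "1m")
```

Because opts is merged last, a caller-supplied :body key would also replace the query argument.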

Public Instance Methods

clear!()

Terminate the scroll query. This will remove the search context from the cluster and no further documents can be returned by this Scroller instance.

Returns nil if the `scroll_id` is not valid; returns the response body if the `scroll_id` was cleared.

# File lib/elastomer_client/client/scroller.rb, line 244
def clear!
  return if scroll_id.nil?
  client.clear_scroll(scroll_id)
rescue ::ElastomerClient::Client::IllegalArgument
  nil
end
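
The three return paths of clear! (no scroll started, scroll cleared, scroll already gone) can be sketched with stubs. FakeClient, FakeIllegalArgument and clear_scroll_quietly are hypothetical stand-ins, not part of the library, and the response body shape is illustrative only.

```ruby
# Hypothetical stand-in for ::ElastomerClient::Client::IllegalArgument.
class FakeIllegalArgument < StandardError; end

# Hypothetical stub client that only knows how to clear scroll ids.
class FakeClient
  def initialize(valid_ids)
    @valid_ids = valid_ids
  end

  # Pretends to clear a search context; raises for unknown ids.
  def clear_scroll(scroll_id)
    raise FakeIllegalArgument, "unknown scroll id" unless @valid_ids.delete(scroll_id)

    { "succeeded" => true } # illustrative response body
  end
end

# Same control flow as clear!, written as a free function.
def clear_scroll_quietly(client, scroll_id)
  return if scroll_id.nil?

  client.clear_scroll(scroll_id)
rescue FakeIllegalArgument
  nil
end

client = FakeClient.new(["abc"])
never_started = clear_scroll_quietly(client, nil)   # no request is made
first_clear   = clear_scroll_quietly(client, "abc") # response body
second_clear  = clear_scroll_quietly(client, "abc") # error swallowed
```

Swallowing the error makes the call safe to repeat, which matters when it runs from an ensure clause.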
do_scroll()

Internal: Perform the actual scroll requests. This method will call out to the `Client#start_scroll` and `Client#continue_scroll` methods while keeping track of the `scroll_id` internally.

Returns the response body as a Hash.

# File lib/elastomer_client/client/scroller.rb, line 256
def do_scroll
  if scroll_id.nil?
    body = client.start_scroll(@opts)
    if body["hits"]["hits"].empty?
      @scroll_id = body["_scroll_id"]
      return do_scroll
    end
  else
    body = client.continue_scroll(scroll_id, @opts[:scroll])
  end

  @scroll_id = body["_scroll_id"]
  body
end
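
The empty-first-page branch is easiest to follow with a stubbed client. In the sketch below, FakeClient and MiniScroller are hypothetical classes that copy only the do_scroll logic; the scroll id strings and page contents are made up for illustration. It assumes a scan-style sequence in which the initial response carries a scroll id but no hits.

```ruby
# FakeClient is a hypothetical stub standing in for ElastomerClient::Client.
class FakeClient
  def initialize(pages)
    @pages = pages
    @cursor = 0
  end

  # First request of a scroll.
  def start_scroll(_opts)
    next_page
  end

  # Every later request of the same scroll.
  def continue_scroll(_scroll_id, _keep_alive)
    next_page
  end

  private

  # Hands out the next canned page together with a fresh scroll id.
  def next_page
    @cursor += 1
    hits = @pages.shift || []
    { "_scroll_id" => "scroll-#{@cursor}", "hits" => { "hits" => hits } }
  end
end

# MiniScroller copies the do_scroll logic and nothing else.
class MiniScroller
  attr_reader :client, :scroll_id

  def initialize(client, opts = {})
    @client = client
    @opts = opts
    @scroll_id = nil
  end

  def do_scroll
    if scroll_id.nil?
      body = client.start_scroll(@opts)
      if body["hits"]["hits"].empty?
        # No documents yet: remember the id and fetch the next page.
        @scroll_id = body["_scroll_id"]
        return do_scroll
      end
    else
      body = client.continue_scroll(scroll_id, @opts[:scroll])
    end

    @scroll_id = body["_scroll_id"]
    body
  end
end

# The first canned page is empty, as in a scan-style search.
pages = [[], [{ "_id" => "1" }, { "_id" => "2" }], [{ "_id" => "3" }]]
scroller = MiniScroller.new(FakeClient.new(pages), scroll: "1m")

first  = scroller.do_scroll
second = scroller.do_scroll
```

In this sketch the first do_scroll call issues two requests, so the caller never sees the empty initial page.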
each() { |hits| ... }

Iterate over all the search results from the scan query.

block - The block will be called for each set of matching documents returned from executing the scan query.

Yields a hits Hash containing the 'total' number of hits, current 'offset' into that total, and the Array of 'hits' document Hashes.

Examples

scan.each do |hits|
  hits['total']
  hits['offset']
  hits['hits'].each { |document| ... }
end

Returns this Scroller instance.

# File lib/elastomer_client/client/scroller.rb, line 199
def each
  loop do
    body = do_scroll

    hits = body["hits"]
    break if hits["hits"].empty?

    hits["offset"] = @offset
    @offset += hits["hits"].length

    yield hits
  end

  self
ensure
  clear!
end
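
The offset bookkeeping and the ensure-based cleanup in each can be sketched without a server. The helper each_page and its two lambda arguments are hypothetical: fetch_page stands in for do_scroll and on_finish for clear!. Unlike each, the helper does not return self.

```ruby
# Hypothetical helper mirroring the loop in each.
def each_page(fetch_page, on_finish)
  offset = 0
  loop do
    hits = fetch_page.call["hits"]
    break if hits["hits"].empty?

    # The offset is the count of documents yielded before this page.
    hits["offset"] = offset
    offset += hits["hits"].length

    yield hits
  end
ensure
  # Runs on normal completion, on break, and when the block raises.
  on_finish.call
end

# Canned response bodies; the final empty page ends the loop.
pages = [
  { "hits" => { "hits" => [{ "_id" => "1" }, { "_id" => "2" }] } },
  { "hits" => { "hits" => [{ "_id" => "3" }] } },
  { "hits" => { "hits" => [] } }
]

cleared = false
seen = []

each_page(-> { pages.shift }, -> { cleared = true }) do |hits|
  seen << [hits["offset"], hits["hits"].length]
end
```

Here seen records one [offset, page size] pair per yielded page, and cleared flips to true once iteration ends.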
each_document(&block)

Iterate over each document from the scan query. This method is just a convenience wrapper around the `each` method; it iterates the Array of documents and passes them one by one to the block.

block - The block will be called for each document returned from executing the scan query.

Yields a document Hash.

Examples

scan.each_document do |document|
  document['_id']
  document['_source']
end

Returns this Scroller instance.

# File lib/elastomer_client/client/scroller.rb, line 234
def each_document(&block)
  each { |hits| hits["hits"].each(&block) }
end
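
The delegation from each_document to each flattens pages into a single stream of documents. A minimal sketch follows, using a hypothetical PagedResults class whose each yields canned hits Hashes instead of talking to a server.

```ruby
# PagedResults is a hypothetical stand-in for Scroller.
class PagedResults
  def initialize(pages)
    @pages = pages
  end

  # Yields one hits Hash per canned page, then returns self.
  def each
    @pages.each { |docs| yield({ "hits" => docs }) }
    self
  end

  # Same one-line delegation as each_document.
  def each_document(&block)
    each { |hits| hits["hits"].each(&block) }
  end
end

results = PagedResults.new([[{ "_id" => "1" }, { "_id" => "2" }], [{ "_id" => "3" }]])

ids = []
returned = results.each_document { |doc| ids << doc["_id"] }
```

Since each_document returns whatever each returns, the caller gets the same object back and does not need to know where page boundaries fall.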