ragnar_retrieve_vss {ragnar} | R Documentation |
Vector Similarity Search Retrieval
Description
Computes a similarity measure between the query and the document embeddings and uses this similarity to rank and retrieve document chunks.
Usage
ragnar_retrieve_vss(
store,
query,
top_k = 3L,
...,
method = "cosine_distance",
query_vector = store@embed(query),
filter
)
Arguments
store |
A |
query |
Character. The query string to embed and use for similarity search. |
top_k |
Integer. Maximum number of document chunks to retrieve. Defaults to 3. |
... |
Additional arguments passed to methods. |
method |
Character. Similarity method to use: "cosine_distance" (the default), "euclidean_distance", or "negative_inner_product". |
query_vector |
Numeric vector. The embedding for query. Defaults to store@embed(query). |
filter |
Optional. A filter expression evaluated with
|
Details
Supported methods:
- cosine_distance – one minus the cosine of the angle between two vectors.
- euclidean_distance – L2 distance between vectors.
- negative_inner_product – negative sum of element-wise products.
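To make the three measures concrete, here is a minimal base R sketch that computes each of them for two toy vectors. The vectors and the helper functions are illustrative only and are not part of ragnar; the sketch also assumes the usual convention that cosine distance is one minus the cosine similarity.

```r
# Toy vectors standing in for a query embedding and a chunk embedding
# (made-up values, not real embeddings).
q <- c(1, 0, 1)
d <- c(1, 1, 0)

# Cosine distance: one minus the cosine of the angle between the vectors
# (assumed convention).
cosine_distance <- function(a, b) {
  1 - sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))
}

# Euclidean (L2) distance: length of the difference vector.
euclidean_distance <- function(a, b) sqrt(sum((a - b)^2))

# Negative inner product: negated sum of element-wise products.
negative_inner_product <- function(a, b) -sum(a * b)

metrics <- c(
  cosine_distance = cosine_distance(q, d),
  euclidean_distance = euclidean_distance(q, d),
  negative_inner_product = negative_inner_product(q, d)
)
print(metrics)
```

For all three helpers a smaller value means the two vectors are more similar, which is why the inner product is negated.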
If filter is supplied, the function first performs the similarity
search, then applies the filter in an outer SQL query. It uses the HNSW
index when possible and falls back to a sequential scan for large result
sets or filtered queries.
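As a rough picture of what a sequential scan does, the base R sketch below computes the distance from a query vector to every stored embedding and keeps the closest rows. It is not ragnar's implementation (which runs as SQL inside the store); the embeddings are made up, and ascending order by distance is an assumption.

```r
# Hypothetical 2-d embeddings, one row per chunk (made-up values).
embeddings <- rbind(
  a = c(1, 0),
  b = c(0, 1),
  c = c(1, 1),
  d = c(-1, 0)
)
query_vector <- c(1, 0.2)
top_k <- 3L

# Sequential scan: compute the Euclidean distance to every row.
# sweep() subtracts query_vector from each row of the matrix.
metric_value <- sqrt(rowSums(sweep(embeddings, 2, query_vector)^2))

# Keep the top_k rows with the smallest distance (assumed ascending order).
nearest <- head(sort(metric_value), top_k)
print(nearest)
```

An index such as HNSW avoids visiting every row, typically trading exactness for speed, whereas the scan above is exact but its cost grows with the number of stored chunks.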
Value
A tibble with the top_k retrieved chunks, ordered by metric_value.
Note
The results are not re-ranked after identifying the unique values.
See Also
Other ragnar_retrieve: ragnar_retrieve(), ragnar_retrieve_bm25(), ragnar_retrieve_vss_and_bm25()
Examples
## Build a small store with categories
store <- ragnar_store_create(
  embed = \(x) ragnar::embed_openai(x, model = "text-embedding-3-small"),
  extra_cols = data.frame(category = character()),
  version = 1 # store text chunks directly
)
ragnar_store_insert(
  store,
  data.frame(
    category = c(rep("pets", 3), rep("dessert", 3)),
    text = c("playful puppy", "sleepy kitten", "curious hamster",
             "chocolate cake", "strawberry tart", "vanilla ice cream")
  )
)
ragnar_store_build_index(store)
# Top 3 chunks without filtering
ragnar_retrieve(store, "sweet")
# Combine filter with similarity search
ragnar_retrieve(store, "sweet", filter = category == "dessert")