create_vectorstore {RAGFlowChainR}    R Documentation
Create a DuckDB-based vector store
Description
Initializes a DuckDB database connection for storing embedded documents, with optional support for the experimental 'vss' extension.
Arguments
db_path
Path to the DuckDB file. Use '":memory:"' to create an in-memory database.
overwrite
Logical; if 'TRUE', deletes any existing DuckDB file or table.
embedding_dim
Integer; the dimensionality of the vector embeddings to store.
load_vss
Logical; whether to load the experimental 'vss' extension. This defaults to 'TRUE', but is forced to 'FALSE' during CRAN checks.
Details
This function is part of the vector-store utilities for:
Embedding text via the OpenAI API
Storing and chunking documents in DuckDB
Building 'HNSW' and 'FTS' indexes
Running nearest-neighbour search over vector embeddings
Only create_vectorstore() is exported; helpers like insert_vectors(), build_vector_index(), and search_vectors() are internal but designed to be composable.
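To make the nearest-neighbour step concrete, here is a minimal sketch that runs an exact (brute-force) search directly in DuckDB through DBI. The table name demo_vectors, its columns, and the 3-dimensional toy embeddings are assumptions made for illustration and are not the schema this package creates. The sketch also assumes a DuckDB version that provides fixed-size ARRAY columns and array_distance(), and it does not use the 'vss' extension or an 'HNSW' index.

```r
library(DBI)

# In-memory database, so nothing is written to disk.
con <- dbConnect(duckdb::duckdb(), dbdir = ":memory:")

# Hypothetical schema: one row per chunk, with a fixed-size embedding column.
dbExecute(con, "
  CREATE TABLE demo_vectors (
    id INTEGER,
    content VARCHAR,
    embedding FLOAT[3]
  )
")

# Toy embeddings; real ones would come from an embedding model.
dbExecute(con, "
  INSERT INTO demo_vectors VALUES
    (1, 'about R',      [1.0, 0.0, 0.0]::FLOAT[3]),
    (2, 'about Python', [0.0, 1.0, 0.0]::FLOAT[3]),
    (3, 'about DuckDB', [0.9, 0.1, 0.0]::FLOAT[3])
")

# Exact search: order every row by Euclidean distance to the query vector.
hits <- dbGetQuery(con, "
  SELECT id, content,
         array_distance(embedding, [1.0, 0.0, 0.0]::FLOAT[3]) AS dist
  FROM demo_vectors
  ORDER BY dist
  LIMIT 2
")

dbDisconnect(con, shutdown = TRUE)
print(hits)
```

An approximate index such as 'HNSW' trades some accuracy for speed on large tables. A full scan like the one above is usually adequate for small collections.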
Value
A live DuckDB connection object. Be sure to manually disconnect with:
DBI::dbDisconnect(con, shutdown = TRUE)
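One way to avoid forgetting the disconnect is to register it with on.exit() inside a wrapper function. The sketch below uses a plain DuckDB connection so that it runs without this package. The helper name with_duckdb() is hypothetical and not part of RAGFlowChainR. The same pattern should apply to a connection returned by create_vectorstore().

```r
library(DBI)

# Hypothetical helper: open a connection, run fn(con), and always disconnect,
# even if fn() raises an error.
with_duckdb <- function(db_path, fn) {
  con <- dbConnect(duckdb::duckdb(), dbdir = db_path)
  on.exit(dbDisconnect(con, shutdown = TRUE), add = TRUE)
  fn(con)
}

# Usage: the connection is closed as soon as the function returns.
answer <- with_duckdb(":memory:", function(con) {
  dbGetQuery(con, "SELECT 42 AS answer")$answer
})
print(answer)
```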
Examples
## Not run:
# Create vector store
con <- create_vectorstore("tests/testthat/test-data/my_vectors.duckdb", overwrite = TRUE)
# Assume response is output from fetch_data()
docs <- data.frame(head(response))
# Insert documents with embeddings
insert_vectors(
  con = con,
  df = docs,
  embed_fun = embed_openai(),
  chunk_chars = 12000
)
# Build vector + FTS indexes
build_vector_index(con, type = c("vss", "fts"))
# Perform vector search
response <- search_vectors(con, query_text = "Tell me about R?", top_k = 5)
## End(Not run)