create_vectorstore {RAGFlowChainR}    R Documentation

Create a DuckDB-based vector store

Description

Initializes a DuckDB database connection for storing embedded documents, with optional support for the experimental 'vss' extension.

Arguments

db_path

Path to the DuckDB file. Use ":memory:" to create an in-memory database.

overwrite

Logical; if 'TRUE', deletes any existing DuckDB file or table.

embedding_dim

Integer; the dimensionality of the vector embeddings to store.

load_vss

Logical; whether to load the experimental 'vss' extension. This defaults to 'TRUE', but is forced to 'FALSE' during CRAN checks.
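To make the roles of embedding_dim and load_vss concrete, the sketch below builds the SQL statements a store of this kind might run at start-up. It is illustrative only: vectorstore_setup_sql(), the table name, and the column names are hypothetical and are not taken from RAGFlowChainR. The only DuckDB features assumed are the INSTALL/LOAD extension statements and the fixed-size FLOAT[n] array type.

```r
# Hypothetical helper, not part of RAGFlowChainR: builds the SQL a store
# like this might run at start-up. Table and column names are assumptions.
vectorstore_setup_sql <- function(embedding_dim, load_vss = TRUE,
                                  table = "vectors") {
  stopifnot(
    is.numeric(embedding_dim), length(embedding_dim) == 1,
    embedding_dim >= 1, embedding_dim == round(embedding_dim)
  )
  stmts <- character()
  if (isTRUE(load_vss)) {
    # DuckDB extensions are installed once, then loaded per session.
    stmts <- c(stmts, "INSTALL vss", "LOAD vss")
  }
  # FLOAT[n] is DuckDB's fixed-size array type; n comes from embedding_dim.
  c(stmts, sprintf(
    "CREATE TABLE %s (id INTEGER, content VARCHAR, embedding FLOAT[%d])",
    table, as.integer(embedding_dim)
  ))
}

vectorstore_setup_sql(embedding_dim = 1536, load_vss = FALSE)
```

Each returned statement could then be passed to DBI::dbExecute() on an open connection. Fixing the array length up front is why the embedding dimensionality has to be known when the store is created.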

Details

This function is part of the package's vector-store utilities.

Only create_vectorstore() is exported; helpers like insert_vectors(), build_vector_index(), and search_vectors() are internal but designed to be composable.

Value

A live DuckDB connection object. The connection is not closed automatically; disconnect it manually when you are done: DBI::dbDisconnect(con, shutdown = TRUE)
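Because the caller owns the connection, one way to avoid leaking it is to pair the call with on.exit(). The sketch below is a hypothetical wrapper, not something RAGFlowChainR provides: with_vectorstore() and its connect/disconnect arguments are illustrative names, with defaults that assume create_vectorstore() and DBI are available.

```r
# Hypothetical wrapper, not part of RAGFlowChainR: opens a store, runs `fn`
# on the connection, and always disconnects, even if `fn` raises an error.
with_vectorstore <- function(
    fn, ...,
    connect = create_vectorstore,
    disconnect = function(con) DBI::dbDisconnect(con, shutdown = TRUE)) {
  con <- connect(...)                   # extra arguments go to `connect`
  on.exit(disconnect(con), add = TRUE)  # registered before `fn` runs
  fn(con)
}
```

A call such as with_vectorstore(function(con) DBI::dbListTables(con), db_path = ":memory:") would then return the table names and close the connection on the way out.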

Examples

## Not run: 
# Create vector store
con <- create_vectorstore("tests/testthat/test-data/my_vectors.duckdb", overwrite = TRUE)

# Assume `response` is the output of fetch_data()
docs <- data.frame(head(response))

# Insert documents with embeddings
insert_vectors(
  con = con,
  df = docs,
  embed_fun = embed_openai(),
  chunk_chars = 12000
)

# Build vector + FTS indexes
build_vector_index(con, type = c("vss", "fts"))

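# Perform vector search (stored under a new name so `response` is not overwritten)
results <- search_vectors(con, query_text = "Tell me about R?", top_k = 5)

# Disconnect when finished
DBI::dbDisconnect(con, shutdown = TRUE)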

## End(Not run)


[Package RAGFlowChainR version 0.1.5 Index]