
Computes a similarity measure between the query embedding and the document chunk embeddings, and uses this measure to rank the chunks.

Usage

ragnar_retrieve_vss(
  store,
  text,
  top_k = 3L,
  method = c("cosine_distance", "cosine_similarity", "euclidean_distance", "dot_product",
    "negative_dot_product")
)

Arguments

store

A RagnarStore object.

text

A string to find the nearest matches for.

top_k

Integer, the maximum number of document chunks to retrieve.

method

A string specifying the method used to compute the similarity between the query and the document chunk embeddings stored in the database.

Value

A data frame of retrieved chunks. Each row corresponds to an individual chunk in the store. It always contains a column named text that holds the chunk text.

Details

The supported methods are:

  • cosine_distance: Measures the dissimilarity between two vectors based on the cosine of the angle between them. Defined as \(1 - \cos(\theta)\), where \(\cos(\theta)\) is the cosine similarity.

  • cosine_similarity: Measures the similarity between two vectors based on the cosine of the angle between them. Ranges from -1 (opposite directions) to 1 (same direction), with 0 indicating orthogonality.

  • euclidean_distance: Computes the straight-line (L2) distance between two points in a multidimensional space. Defined as \(\sqrt{\sum_i (x_i - y_i)^2}\).

  • dot_product: Computes the sum of the element-wise products of two vectors.

  • negative_dot_product: The negation of the dot product.
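
To make the definitions above concrete, here is a minimal base-R sketch of the five measures and of a top_k ranking. It is illustrative only and is not how the store evaluates them: the helper functions, the toy vectors query and docs, and the ranking step are all assumptions made for this example.

```r
# Toy query and document embeddings (made-up values, not real model output).
query <- c(1, 0, 1)
docs <- rbind(
  a = c(1, 0, 1),  # same direction as the query
  b = c(0, 1, 0),  # orthogonal to the query
  c = c(2, 1, 0)
)

# Hypothetical helpers that mirror the definitions listed above.
dot_product <- function(x, y) sum(x * y)
negative_dot_product <- function(x, y) -dot_product(x, y)
cosine_similarity <- function(x, y) {
  dot_product(x, y) / (sqrt(sum(x^2)) * sqrt(sum(y^2)))
}
cosine_distance <- function(x, y) 1 - cosine_similarity(x, y)
euclidean_distance <- function(x, y) sqrt(sum((x - y)^2))

# Score every document (one per row) against the query with one measure.
scores <- apply(docs, 1, cosine_distance, y = query)

# For a distance, smaller means closer: sort ascending and keep top_k.
top_k <- 2L
ranked <- head(sort(scores), top_k)
print(ranked)
```

In this sketch, a similarity measure such as cosine_similarity or dot_product would be sorted in decreasing order instead, since larger values mean a closer match. Negating the dot product turns it into a quantity that, like a distance, is smallest for the best match.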

See also