BM25 refers to Okapi Best Matching 25. See doi:10.1561/1500000019 for more information.
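
For orientation, the textbook form of the Okapi BM25 score of a document \(D\) for a query \(Q\) with terms \(q_1, \dots, q_n\) is given below. This is the standard formulation from the literature, not a statement of how the store computes it; the free parameters \(k_1\) and \(b\) and the exact IDF variant depend on the underlying implementation.

\[
\mathrm{score}(D, Q) = \sum_{i=1}^{n} \mathrm{IDF}(q_i) \cdot \frac{f(q_i, D)\,(k_1 + 1)}{f(q_i, D) + k_1 \left(1 - b + b\,\frac{|D|}{\mathrm{avgdl}}\right)}
\]

Here \(f(q_i, D)\) is the number of times \(q_i\) occurs in \(D\), \(|D|\) is the length of \(D\) in tokens, and \(\mathrm{avgdl}\) is the average document length in the collection.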

Usage

ragnar_retrieve_bm25(store, text, top_k = 3L)

Arguments

store

A RagnarStore object.

text

A string to find the nearest match to.

top_k

Integer, the maximum number of document chunks to retrieve.

Value

A data frame of retrieved chunks. Each row corresponds to an individual chunk in the store. It always contains a column named text that holds the chunk contents.

Details

The supported methods are:

  • cosine_distance: Measures the dissimilarity between two vectors based on the cosine of the angle between them. Defined as \(1 - \cos(\theta)\), where \(\cos(\theta)\) is the cosine similarity.

  • cosine_similarity: Measures the similarity between two vectors based on the cosine of the angle between them. Ranges from -1 (opposite) to 1 (identical), with 0 indicating orthogonality.

  • euclidean_distance: Computes the straight-line (L2) distance between two points in a multidimensional space. Defined as \(\sqrt{\sum_i (x_i - y_i)^2}\).

  • dot_product: Computes the sum of the element-wise products of two vectors.

  • negative_dot_product: The negation of the dot product.
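
The definitions above can be checked by hand with a few lines of base R. This is only an illustrative sketch: the vectors x and y are made-up values chosen for easy arithmetic, and the variable names simply mirror the method names in the list. It does not call into the package or show how the store computes these measures internally.

```r
# Two small example vectors (hypothetical values, not taken from any store).
x <- c(1, 2, 2)
y <- c(2, 4, 4)

# Sum of the element-wise products.
dot_product <- sum(x * y)

# The negation of the dot product.
negative_dot_product <- -dot_product

# Cosine of the angle between x and y: dot product divided by both norms.
cosine_similarity <- dot_product / (sqrt(sum(x^2)) * sqrt(sum(y^2)))

# One minus the cosine similarity.
cosine_distance <- 1 - cosine_similarity

# Straight-line (L2) distance between the two points.
euclidean_distance <- sqrt(sum((x - y)^2))
```

Because y is a positive multiple of x, the two vectors point in the same direction: the cosine similarity is 1 and the cosine distance is 0, even though the Euclidean distance between them is non-zero. Note also that the distance-style measures are smaller for closer vectors, while cosine_similarity and dot_product are larger.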

See also