Similarity-Search

Lucas-TY

AI | Jan 30, 2024 | Last edited: Feb 1, 2024

Sparse Search

TF-IDF

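  • TF: term frequency, how often a term occurs in a document (commonly the raw count divided by the document length).
  • IDF: inverse document frequency, the logarithm of the total number of documents divided by the number of documents that contain the term; rarer terms get a higher weight.
  • Output: the product TF × IDF, computed per term and per document.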
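
The TF, IDF, and output steps listed above can be sketched in a few lines of plain Python. This is a minimal illustration rather than a reference implementation: it assumes TF is the count divided by document length and IDF is the natural log of N over the document frequency, and the function name `tf_idf` and the toy documents are made up for the example. Libraries often use smoothed variants of these formulas.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per document; `docs` is a list of token lists."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears at least once.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        scores.append({
            # TF (count / length) times IDF (log of N / document frequency).
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


# Toy corpus (assumed data, whitespace-tokenized).
docs = [
    "the cat sat on the mat".split(),
    "the dog sat on the log".split(),
    "cats and dogs".split(),
]
scores = tf_idf(docs)
```

In the first document, "cat" ends up with a higher score than "the" even though "the" occurs twice, because "the" also appears in another document and so has a lower IDF.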

BM25

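  • The TF-IDF score increases linearly with term frequency: if a term's frequency doubles, so does its TF-IDF score.
  • BM25 addresses this by saturating the term-frequency contribution, so each additional occurrence adds less to the score, and by normalizing for document length.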
Figure: The IDF part of BM25 (left) compared to the IDF of TF-IDF (right).
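
To make the contrast with TF-IDF's linear growth concrete, here is a rough sketch of the usual Okapi BM25 per-term score. The parameter defaults `k1=1.5` and `b=0.75`, the "+1 inside the log" IDF variant, and all function names are assumptions chosen for illustration; real search engines differ in these details.

```python
import math


def bm25_idf(n_docs, doc_freq):
    """BM25-style IDF (variant with +1 inside the log so it stays positive)."""
    return math.log((n_docs - doc_freq + 0.5) / (doc_freq + 0.5) + 1.0)


def bm25_term_score(tf, doc_len, avg_doc_len, n_docs, doc_freq, k1=1.5, b=0.75):
    """Score contribution of a single query term for a single document."""
    # Length normalization: longer-than-average documents are penalized.
    norm = k1 * (1.0 - b + b * doc_len / avg_doc_len)
    # The tf / (tf + norm) shape saturates instead of growing linearly.
    return bm25_idf(n_docs, doc_freq) * tf * (k1 + 1.0) / (tf + norm)


def bm25_score(query_terms, doc, corpus, k1=1.5, b=0.75):
    """Sum the per-term scores of `query_terms` for one tokenized `doc`."""
    n_docs = len(corpus)
    avg_len = sum(len(d) for d in corpus) / n_docs
    score = 0.0
    for term in query_terms:
        tf = doc.count(term)
        if tf == 0:
            continue
        doc_freq = sum(1 for d in corpus if term in d)
        score += bm25_term_score(tf, len(doc), avg_len, n_docs, doc_freq, k1, b)
    return score


# Doubling the term frequency increases the score, but by less than a factor of two.
one = bm25_term_score(tf=1, doc_len=100, avg_doc_len=100, n_docs=10, doc_freq=2)
two = bm25_term_score(tf=2, doc_len=100, avg_doc_len=100, n_docs=10, doc_freq=2)

# Toy corpus (assumed data) for the full scoring function.
corpus = [["cat", "sat"], ["dog", "sat"], ["bird", "flew", "away"]]
cat_score = bm25_score(["cat"], corpus[0], corpus)
no_match = bm25_score(["cat"], corpus[1], corpus)
```

With these settings the per-term score can never exceed IDF × (k1 + 1), no matter how often the term repeats.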

Dense Search

Sentence-BERT

  • Compare sentence embeddings with cosine similarity.
  • Use the Hugging Face Transformers library to load the model and compute the embeddings.
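
As a rough sketch of the comparison step, the snippet below mean-pools token embeddings into one sentence vector and compares two sentences with cosine similarity. It assumes mean pooling (one common Sentence-BERT pooling choice) and uses tiny made-up vectors in place of real model output; in practice the token embeddings and attention masks would come from a transformer encoder loaded through Hugging Face. The helper names are hypothetical.

```python
import math


def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask is 1, ignoring padding tokens."""
    kept = [vec for vec, m in zip(token_embeddings, attention_mask) if m]
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors: dot(u, v) / (|u| * |v|)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Toy 2-dimensional "token embeddings" (assumed data, not real model output).
# The last token of sentence A is padding and is masked out.
sent_a = mean_pool([[1.0, 0.0], [0.0, 1.0], [9.0, 9.0]], [1, 1, 0])
sent_b = mean_pool([[2.0, 2.0]], [1])
sent_c = mean_pool([[1.0, -1.0]], [1])

sim_ab = cosine_similarity(sent_a, sent_b)  # same direction
sim_ac = cosine_similarity(sent_a, sent_c)  # orthogonal
```

Cosine similarity depends only on direction, not magnitude, which is why `sent_a` and `sent_b` count as maximally similar despite having different lengths.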