Vector Search: A Reranker Algorithm Showdown

by DataStax
November 26th, 2024
Too Long; Didn't Read

Rerankers are ML models that take a set of search results and reorder them to improve relevance. We tested 6 of them.
Vector search delivers semantic similarity effectively for retrieval-augmented generation (RAG), but it does poorly with short keyword searches or out-of-domain search terms. Supplementing vector retrieval with a keyword search such as BM25, and combining the results with a reranker, is becoming the standard way to get the best of both worlds.


Rerankers are ML models that take a set of search results and reorder them to improve relevance. They examine the query paired with each candidate result in detail, which is computationally expensive but produces more accurate results than simple retrieval methods alone. Reranking can serve as a second stage on top of a single search (pull 100 results out of vector search, then ask the reranker to identify the top 10) or, more often, as a way to combine results from different kinds of search; in this case, vector search and keyword search.
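To make the second-stage idea concrete, here is a minimal sketch in Python. It is not the code used in these tests: `rerank` and `toy_overlap_score` are hypothetical names, and the word-overlap scorer is only a stand-in for a real reranker model, which would score each (query, document) pair with a neural network.

```python
from typing import Callable, Sequence


def rerank(query: str,
           candidates: Sequence[str],
           score: Callable[[str, str], float],
           top_k: int = 10) -> list[str]:
    """Score every (query, candidate) pair and keep the top_k best."""
    # Deduplicate while preserving order, since vector and keyword search
    # often return some of the same documents.
    unique = list(dict.fromkeys(candidates))
    ranked = sorted(unique, key=lambda doc: score(query, doc), reverse=True)
    return ranked[:top_k]


def toy_overlap_score(query: str, doc: str) -> float:
    """Hypothetical stand-in for a reranker: fraction of query words found in doc."""
    q = set(query.lower().split())
    d = set(doc.lower().split())
    return len(q & d) / len(q) if q else 0.0


# Candidates from two retrievers (made-up documents for illustration).
vector_hits = ["cats are small mammals", "dogs chase cats", "stock prices fell"]
keyword_hits = ["dogs chase cats", "how to feed cats and dogs"]

top = rerank("cats and dogs", vector_hits + keyword_hits, toy_overlap_score, top_k=2)
print(top)
```

The expensive part is that `score` runs once per candidate, which is why a reranker is applied to a short candidate list rather than to the whole corpus.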


But how good are off-the-shelf rerankers? To find out, I tested six rerankers on the text from the ViDoRe benchmark, using Gemini Flash to extract text from the images. Details on the datasets can be found in section 3.1 of the ColPali paper. Notably, TabFQuAD and Shift Project sources are in French; the rest are in English.


We tested these rerankers:

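  • Reciprocal Rank Fusion (RRF), a formula for combining results from multiple sources without knowing anything about the queries or documents; it depends purely on relative ordering within each source. RRF is used in Elastic and LlamaIndex, among other projects.

  • Five ML rerankers, each discussed in the analysis below: Cohere's reranker, Voyage rerank-2, Voyage rerank-2-lite, Jina reranker v2, and BGE-reranker-v2-m3 (open source and self-hosted).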
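Because RRF is just a formula, it fits in a few lines. The sketch below assumes the usual formulation, in which each document's score is the sum of 1 / (k + rank) over every result list it appears in; the constant k = 60 is a commonly used default and not necessarily the value used in these tests. The function name and document IDs are illustrative.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranked list.

    rankings: iterable of lists, each ordered best-first.
    k: smoothing constant (assumed default; larger k flattens rank differences).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based: the top result contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=scores.get, reverse=True)


dpr_results = ["a", "b", "c"]    # hypothetical vector search output
bm25_results = ["c", "a", "d"]   # hypothetical keyword search output
fused = reciprocal_rank_fusion([dpr_results, bm25_results])
print(fused)
```

Note that the function never looks at the query or the document text, only at positions, which is exactly the property described above.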





Each reranker was fed the top 20 results from both DPR (raw vector search) and BM25, and the reranked results were scored with NDCG@5.
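For readers unfamiliar with the metric, NDCG@5 rewards placing relevant documents near the top of the first five results and normalizes against the best possible ordering. The sketch below is one common formulation (linear gain, log2 position discount); the exact variant used by the benchmark tooling is an assumption here, and the function names and document IDs are illustrative.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k relevance grades."""
    # Position i (0-based) is discounted by log2(i + 2), so rank 1 is undiscounted.
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))


def ndcg_at_k(ranked_ids, relevance, k=5):
    """NDCG@k for one query.

    ranked_ids: document IDs in the order the system returned them.
    relevance: dict mapping document ID to its relevance grade (missing = 0).
    """
    gains = [relevance.get(doc_id, 0.0) for doc_id in ranked_ids]
    # The ideal ranking lists the most relevant documents first.
    ideal = sorted(relevance.values(), reverse=True)
    ideal_dcg = dcg_at_k(ideal, k)
    return dcg_at_k(gains, k) / ideal_dcg if ideal_dcg > 0 else 0.0


judgments = {"d1": 1}  # hypothetical query with a single relevant document
perfect = ndcg_at_k(["d1", "d2", "d3"], judgments)
second = ndcg_at_k(["d2", "d1", "d3"], judgments)
missed = ndcg_at_k(["d2", "d3", "d4", "d5", "d6", "d1"], judgments)
```

With a single relevant document, the score is 1.0 when it is ranked first, drops to 1 / log2(3) when it is ranked second, and is 0 when it falls outside the top five.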


In the results, raw vector search (with embeddings from the bge-m3 model) is labeled dpr (dense passage retrieval). I chose bge-m3 to compute embeddings because that’s what the ColPali authors used as a baseline.


Here’s the data on relevance (NDCG@5):

[Chart: NDCG@5 relevance by reranker and dataset]

And here’s how fast they are at reranking searches in the arxiv dataset; latency is proportional to document length. The chart shows latency, so lower is better. The self-hosted bge model was run on an NVIDIA RTX 3090 using the simplest possible code, lifted straight from the Hugging Face model card.

[Chart: reranking latency by reranker on the arxiv dataset]

Finally, here’s how much it cost with each model to rerank the almost 3,000 searches from all six datasets. Cohere prices per search (with additional fees for long documents), while the others price per token.

[Chart: total reranking cost by model across all six datasets]

Analysis

  • All the models do roughly as well on the French datasets as they do on the English ones.


  • Cohere is significantly more expensive and offers slightly (but consistently) worse relevance than the other ML rerankers – but it’s 3x faster than the next-fastest services. Additionally, Cohere’s standard rate limits are the most generous.


  • Voyage rerank-2 is the king of reranking relevance in all datasets, at the cost of additional latency. Notably, it’s the only model that does not do worse than DPR alone in the arxiv dataset, which seems to be particularly tricky.


  • Voyage rerank-2-lite and Jina reranker v2 are very similar: they’re the same speed, hosted at the same price, and close to the same relevance (with a slight edge to Voyage). But Voyage’s standard rate limit is double Jina’s, and with Voyage you get a “real” Python client instead of having to make raw HTTP requests.


  • BGE-reranker-v2-m3 is such a lightweight model (under 600M parameters) that even on an older consumer GPU it is usably fast.

Conclusion

RRF adds little to no value to hybrid search scenarios; on half of the datasets, it performed worse than either BM25 or DPR alone. In contrast, all ML-based rerankers tested delivered meaningful improvements over pure vector or keyword search, with Voyage rerank-2 setting the bar for relevance.


Tradeoffs remain: Voyage rerank-2 offers superior accuracy, Cohere offers faster processing, and Jina or Voyage's lite model offers solid middle-ground performance. Even the open-source BGE reranker, while trailing commercial options, adds significant value for teams choosing to self-host.


As foundation models continue advancing, we can expect even better performance. But today's ML rerankers are already mature enough to deploy with confidence across multilingual content.



By Jonathan Ellis, DataStax

About Author

DataStax (@datastax)
DataStax is the real-time data company for building production GenAI applications.
