It can be confusing to hear search companies explain how search and AI work. Bing has added ChatGPT, which uses large language models (LLMs), but even before that it had deep learning capabilities. Google recently announced new image search capabilities and its own LLM service. At Algolia, we’re also about to introduce our own AI-powered technology that uses neural hashing to scale intelligent search for any application. With so many overlapping terms in play, it is easy to lose track of what each one means.
Let’s fix that by breaking down the technologies involved with search.
Keyword search engines have been around for decades. The Apache Lucene project is one of the best-known open source search engines offering keyword search functionality. This type of search engine uses statistical techniques to match queries to items in the index. It works much like the index at the back of a book, pointing to all the places in the book where a piece of information is located. Query processing technologies like typo tolerance, word segmentation, and stemming are also used to help search engines cope with misspellings and interpret what a query means.
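To make the book-index analogy concrete, here is a minimal sketch of an inverted index in Python. It is a toy under stated assumptions, not how Lucene or any production engine is implemented: the function names (`tokenize`, `build_index`, `keyword_search`) and the sample documents are hypothetical, and it does plain boolean matching without the statistical scoring, typo tolerance, or stemming mentioned above.

```python
from collections import defaultdict

PUNCTUATION = ".,!?\"'()"

def tokenize(text):
    """Lowercase, split on whitespace, and strip basic punctuation."""
    words = (word.strip(PUNCTUATION).lower() for word in text.split())
    return [word for word in words if word]

def build_index(docs):
    """Map each term to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index

def keyword_search(index, query):
    """Return IDs of documents that contain every query term."""
    terms = tokenize(query)
    if not terms:
        return set()
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)

# Hypothetical sample documents, for illustration only.
docs = {
    1: "Red running shoes",
    2: "Blue running jacket",
    3: "Red wool scarf",
}
index = build_index(docs)

print(keyword_search(index, "red running"))  # → {1}
print(keyword_search(index, "sneakers"))     # → set()
```

The second query shows the limitation discussed next: "sneakers" never appears in the index, so an exact-match lookup returns nothing even though document 1 is clearly relevant.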
Keyword search tends to be very fast and works well for exact query-keyword matches. However, it often struggles with long tail queries, concept searches, question-style searches, synonyms, and other phrases where the query doesn’t exactly match the content in the index. For this reason, many companies have added features such as AI synonym generation to help.
Semantic search involves understanding the meaning of words and phrases in a search query and returning results that are semantically related to the query. Semantic search engines use natural language processing (NLP) techniques to understand the meaning of words and phrases and to find related concepts, synonyms, and other related information that may be relevant to the search query.
AI search is a general and broader term that includes semantic search as well as other machine learning techniques for delivering search results. AI search typically involves several steps, including query processing, retrieval, and ranking.
Query processing: This step involves analyzing the user’s query to understand its intent, scope, and constraints. Query processing may include tasks such as parsing the query into its constituent parts, semantic understanding of keywords and phrases, normalizing the query to a standard format, and more.
Retrieval: Once the query has been processed, the system retrieves a set of documents or data items that match the query criteria. AI search typically uses machine learning algorithms to determine similarity and measure relatedness between terms to deliver relevant results.
Ranking: After the documents or data items have been retrieved, the system ranks them based on their relevance and importance to the user’s query. Learning-to-rank models, along with techniques such as reinforcement learning, are used to continuously optimize results.
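The three steps above can be sketched as a small pipeline. This is only an illustration of how the stages hand data to one another: the stop-word list, the catalog, and the `clicks` popularity signal are all hypothetical, and the hand-written overlap score stands in for the machine learning models a real AI search engine would use at each stage.

```python
STOP_WORDS = {"a", "an", "the", "for", "of", "to", "in"}

def process_query(raw_query):
    """Query processing: normalize case, split into terms, drop stop words."""
    terms = raw_query.lower().split()
    return [term for term in terms if term not in STOP_WORDS]

def retrieve(terms, catalog):
    """Retrieval: keep items that share at least one term with the query."""
    candidates = []
    for item in catalog:
        overlap = len(set(terms) & set(item["title"].lower().split()))
        if overlap > 0:
            candidates.append((item, overlap))
    return candidates

def rank(candidates):
    """Ranking: order by term overlap first, then by a popularity signal."""
    ordered = sorted(
        candidates,
        key=lambda pair: (pair[1], pair[0]["clicks"]),
        reverse=True,
    )
    return [item["title"] for item, _ in ordered]

def run_search(raw_query, catalog):
    """Chain the three stages: process, retrieve, rank."""
    return rank(retrieve(process_query(raw_query), catalog))

# Hypothetical catalog; "clicks" is a made-up popularity signal.
catalog = [
    {"title": "Trail running shoes", "clicks": 120},
    {"title": "Running socks", "clicks": 300},
    {"title": "Leather dress shoes", "clicks": 80},
    {"title": "Wool scarf", "clicks": 50},
]

print(run_search("Shoes for running", catalog))
# → ['Trail running shoes', 'Running socks', 'Leather dress shoes']
```

Keeping the stages separate means each one could be swapped for a learned component without touching the others.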
OpenAI’s ChatGPT, Google’s Bard, Midjourney, and similar AI technologies are what’s called generative AI. These general-purpose solutions predict a likely output for a given input and generate a fresh response rather than retrieving an existing one. They draw on pre-existing text and visual content to generate something new.
On the other hand, search engines can use AI to improve search results. Just like generative AI, search AI can be used to understand natural language inputs. Unlike generative AI, search engines do not create new content. Both technologies can be used together or independently. Generative AI technologies can be used to aid with creative output, and search is used to filter and rank results. Someone looking for new fashion ideas might ask a chatbot what the latest trends are, get suggestions, and then use search to find matching products. Or you might use search to find products and then ask chat to explain the pros and cons of each result.
Both generative chat AI and search AI often provide a better user experience through the understanding of natural language.
Large language models (LLMs) have been around for a while now, but GPT has put them in the spotlight. LLMs are artificial intelligence models that are trained to process and generate natural language text. These models are typically built using deep learning techniques and require vast amounts of data and computational resources for training. At Algolia, we use LLMs, too, but to aid in machine understanding. We use LLMs to create vectors which we can use to compare queries to results.
Vectorization is the process of converting words into vectors (numbers), which allows their meaning to be encoded and processed mathematically. You can think of vectors as groups of numbers that represent something. Vectors that encode meaning in this way are called embeddings. In practice, vectors are used for automating synonyms, clustering documents, detecting specific meanings and intents in queries, and ranking results. Embeddings are very versatile: other objects, such as entire documents, images, video, and audio, can be embedded too.
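As a rough illustration of what "processed mathematically" means, the sketch below compares vectors with cosine similarity, one widely used measure of how closely two vectors point in the same direction. The three-dimensional vectors are hand-made for this example; real embeddings are produced by a trained model and typically have hundreds of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical, hand-made vectors (not the output of any real model).
embeddings = {
    "sneaker": [0.9, 0.8, 0.1],
    "trainer": [0.85, 0.75, 0.2],
    "banana": [0.1, 0.2, 0.95],
}

similar = cosine_similarity(embeddings["sneaker"], embeddings["trainer"])
unrelated = cosine_similarity(embeddings["sneaker"], embeddings["banana"])

# "sneaker" scores much closer to "trainer" than to "banana".
print(f"sneaker vs trainer: {similar:.3f}")
print(f"sneaker vs banana:  {unrelated:.3f}")
```

Because related words end up with a high score even when they share no letters, this kind of comparison can back features such as automatic synonyms.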
Vector search is a way to use word embeddings (or embeddings of images, videos, documents, and so on) to find related objects with similar characteristics. It relies on machine learning models that detect semantic relationships between objects in an index.
There are many different approximate nearest neighbor (ANN) algorithms for finding similar vectors. Techniques such as HNSW (Hierarchical Navigable Small World), IVF (Inverted File), and PQ (Product Quantization, a technique that compresses vectors by splitting them into sub-vectors and replacing each with a short code) are some of the most popular ANN methods. Each technique focuses on improving a particular performance property, such as memory reduction with PQ or fast but accurate search times with HNSW and IVF. It is common practice to mix several components to produce a ‘composite’ index to achieve optimal performance for a given use case.
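To give a feel for how an ANN method trades a little accuracy for speed, here is a minimal sketch of the IVF idea: vectors are grouped under their closest centroid ahead of time, and a query scans only the closest group or groups instead of the whole index. Everything here is an assumption made for illustration: the function names, the two-dimensional vectors, and the hand-picked centroids, which a real system would normally learn with a clustering algorithm such as k-means. Production libraries are considerably more sophisticated.

```python
import math
from collections import defaultdict

def nearest_centroids(vector, centroids, n):
    """Indices of the n centroids closest to the vector (Euclidean)."""
    order = sorted(
        range(len(centroids)),
        key=lambda i: math.dist(vector, centroids[i]),
    )
    return order[:n]

def build_ivf(vectors, centroids):
    """Group each vector ID under its single closest centroid."""
    lists = defaultdict(list)
    for vec_id, vec in vectors.items():
        cell = nearest_centroids(vec, centroids, 1)[0]
        lists[cell].append(vec_id)
    return lists

def ivf_search(query, vectors, centroids, lists, k=2, nprobe=1):
    """Scan only the nprobe closest cells, then return the k nearest IDs."""
    candidates = []
    for cell in nearest_centroids(query, centroids, nprobe):
        candidates.extend(lists.get(cell, []))
    candidates.sort(key=lambda vec_id: math.dist(query, vectors[vec_id]))
    return candidates[:k]

# Hypothetical 2-D vectors forming two obvious clusters.
vectors = {
    "a": [0.1, 0.2], "b": [0.2, 0.1], "c": [0.15, 0.3],
    "d": [0.9, 0.8], "e": [0.8, 0.95],
}
centroids = [[0.15, 0.2], [0.85, 0.9]]  # hand-picked, not learned
lists = build_ivf(vectors, centroids)

print(ivf_search([0.12, 0.22], vectors, centroids, lists))  # → ['a', 'c']
```

With `nprobe=1` the query never looks at "d" or "e". That is where the speedup comes from, and also why the result is approximate: a true neighbor sitting just across a cell boundary can be missed unless `nprobe` is raised.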
One of the challenges of working with vectors is their size. They tend to be very large arrays of numbers that often call for specialized databases and GPU management. Neural hashing is a new process that uses neural networks to compress vectors so they can be processed up to 500 times faster than standard vector calculations and run on commodity hardware.
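The sketch below shows why compact binary hashes are cheap to compare. It is not neural hashing and not Algolia's method: it swaps in random-hyperplane hashing, a classic locality-sensitive hashing technique, where a neural approach would learn the hash function with a network instead. The function names, the 16-bit hash length, and the sample vector are all assumptions for illustration.

```python
import random

def make_hyperplanes(n_bits, dim, seed=0):
    """Draw one random hyperplane (a normal vector) per output bit."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]

def binary_hash(vector, hyperplanes):
    """One bit per hyperplane: 1 if the vector lies on its positive side."""
    bits = 0
    for plane in hyperplanes:
        dot = sum(x * w for x, w in zip(vector, plane))
        bits = (bits << 1) | (1 if dot > 0 else 0)
    return bits

def hamming_distance(hash_a, hash_b):
    """Number of differing bits, computed with an XOR and a bit count."""
    return bin(hash_a ^ hash_b).count("1")

planes = make_hyperplanes(n_bits=16, dim=4)

# Hypothetical 4-dimensional vector and two variants of it.
vector = [0.3, -0.2, 0.7, 0.1]
scaled = [2 * x for x in vector]    # same direction, different length
opposite = [-x for x in vector]     # exactly the opposite direction

print(hamming_distance(binary_hash(vector, planes), binary_hash(scaled, planes)))    # → 0
print(hamming_distance(binary_hash(vector, planes), binary_hash(opposite, planes)))  # → 16
```

Each vector is reduced to a single 16-bit integer here, and comparing two of them takes one XOR and a bit count rather than a floating-point calculation across every dimension.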
Hybrid search is the combination of vector search with keyword search. Vector search is terrific for fuzzy or broad searches, but keyword search still rules the roost for precise queries. For example, when you query for “Adidas” on a keyword engine, by default you will only see the Adidas brand. The default behavior in a vector engine is to return similar results (Nike, Puma, Adidas, and so on) because they are all in the same conceptual space. Keyword search still provides better results for short queries with specific intent.
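One common way to merge a keyword result list with a vector result list is reciprocal rank fusion (RRF), sketched below. This is offered as an example of the general idea only; it is not a description of how Algolia NeuralSearch combines results. The product names are hypothetical, and `k=60` is the constant conventionally used with RRF.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Score each item by summing 1 / (k + rank) over every ranked list."""
    scores = {}
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] = scores.get(item, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda item: scores[item], reverse=True)

# Hypothetical results for the query "adidas".
keyword_results = ["adidas samba", "adidas gazelle"]
vector_results = ["nike cortez", "adidas samba", "puma suede", "adidas gazelle"]

fused = reciprocal_rank_fusion([keyword_results, vector_results])
print(fused)
# → ['adidas samba', 'adidas gazelle', 'nike cortez', 'puma suede']
```

Items that both engines agree on rise to the top, so the exact brand matches lead, while the conceptually similar products from the vector engine still appear further down.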
Hybrid search offers the best of both worlds, providing speed and accuracy for exact matches and simple phrases, while vectors improve long tail queries and open the door to new search solutions. At Algolia, our hybrid AI solution, Algolia NeuralSearch, is coming soon.