Search engines power the search features of a growing range of platforms. Among the popular options, Meilisearch and Manticore Search each stand out with distinct offerings.
Choosing the right search engine for your project, however, requires a thorough understanding of each one’s performance, use cases, and limitations. This article compares Meilisearch and Manticore Search, focusing on their feature sets and on data ingestion and search performance in three real-world benchmarks: 10 million NGINX logs, a Hacker News dataset of 1.1 million documents, and a Hacker News dataset of 116 million documents, all available at DB Benchmarks. All the performance test scripts, configurations, and data collections are publicly available, so the results can be reproduced.
Both Manticore and Meilisearch position themselves as full-text search engines. The key element in full-text search engines is how they rank documents during a search.
Choosing the right search ranking algorithm is crucial to ensure users can find the information they need with precision and recall. In the context of full-text search relevance, it is essential to understand how these algorithms work and how they contribute to providing accurate and meaningful search results.
Manticore Search is very flexible in controlling search ranking and exposes dozens of ranking factors; however, by default, it employs the classical BM25 algorithm and its derivatives. BM25 is a well-established information retrieval algorithm that calculates the relevance of documents based on term frequency and inverse document frequency.
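To make the term frequency and inverse document frequency idea concrete, here is a minimal sketch of the textbook BM25 formula in Python. It is not Manticore Search’s implementation: the function name `bm25_scores`, the parameter values `k1=1.2` and `b=0.75`, the smoothed IDF variant, and the three toy documents are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenized document in `docs` against `query` with textbook BM25.

    k1 and b are conventional default values assumed for this sketch.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Inverse document frequency: rarer terms weigh more.
            # The "1 +" inside the log keeps the value non-negative.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates (k1) and is normalized by length (b).
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Toy corpus, invented for this example.
docs = [
    "manticore search ranks documents with bm25".split(),
    "nginx logs are loaded in batches".split(),
    "search search search engines compare search speed".split(),
]
scores = bm25_scores(["search", "bm25"], docs)
for doc, score in zip(docs, scores):
    print(f"{score:.3f}  {' '.join(doc)}")
```

The first document matches both query terms, including the rare term, so it outranks the third document even though the third repeats "search" four times. The second document matches nothing and scores zero.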
An ongoing pull request for the BEIR (Benchmarking and Evaluation of Information Retrieval) benchmark demonstrates Manticore Search’s commitment to search relevance. BEIR is an evaluation framework that measures the performance of information retrieval systems on various tasks, such as document retrieval and question-answering. The results of the BEIR benchmark can be found here:
https://docs.google.com/spreadsheets/d/1_ZyYkPJ_K0st9FJBrjbZqX14nmCCPVlE_y3a_y5KkYI/edit#gid=0.
In contrast, Meilisearch claims to offer good search relevance, but there are no public benchmarks available to substantiate this assertion. According to a discussion on Hacker News, Meilisearch users have mentioned its search relevance, but without any empirical evidence, it is difficult to compare its performance to Manticore Search objectively.
Overall, Manticore Search’s use of proven ranking algorithms and participation in the BEIR benchmark highlights its commitment to providing highly relevant search results, making it a reliable choice for various applications. While Meilisearch may excel at full-text search relevance too, it is difficult to make a definitive statement since there are no established benchmarks and the algorithm used is not widely known.
Manticore Search handles large datasets effectively (e.g. the 1.7 billion docs taxi rides test, or simply Craigslist.org) by offering both row-wise and columnar storage. The columnar storage is specifically designed to accelerate search and lower RAM consumption on large datasets, while the default row-wise storage delivers very high performance on small and medium datasets. This flexibility makes Manticore Search a good fit for a wide range of applications.
Meilisearch, on the other hand, struggles with larger datasets: we could not load the larger Hacker News dataset into it even after 2 days of loading. Its loading performance also degrades as the dataset grows, with each subsequent batch of documents taking longer to load than the previous one. This points to a data scalability problem that can hurt applications requiring real-time data ingestion or indexing of large datasets. Meilisearch processes document updates in a single queue, which can lead to bottlenecks and reduced performance over time.
It is crucial to note that document updates in Meilisearch are not instantly reflected in search queries. This is because Meilisearch employs an asynchronous task queue for handling updates, ensuring search performance remains stable even during intensive indexing operations.
When updating a document, the change is added to the task queue and processed by the engine in the background. Once the task is completed, the updated data becomes available in the search results. The processing time can vary depending on the update size and server resources. To monitor task status, you can utilize the Tasks API, which offers information on task progress and completion.
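The polling workflow described above can be sketched in Python as follows. This is an illustration under assumptions, not official client code: the helper names `fetch_task` and `wait_for_task`, the local address `http://localhost:7700`, and the polling interval and timeout are all hypothetical choices. The sketch relies on the Tasks API returning a task object with a `status` field from `GET /tasks/{task_uid}`.

```python
import json
import time
import urllib.request

# Statuses after which a task will not change any more.
FINAL_STATUSES = {"succeeded", "failed", "canceled"}


def fetch_task(task_uid, base_url="http://localhost:7700", api_key=None):
    """GET /tasks/{task_uid} and return the decoded JSON task object.

    base_url and api_key are assumptions for a local test instance.
    """
    request = urllib.request.Request(f"{base_url}/tasks/{task_uid}")
    if api_key:
        request.add_header("Authorization", f"Bearer {api_key}")
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_for_task(task_uid, fetch=fetch_task, interval=0.5, timeout=60.0):
    """Poll a task until it reaches a final status or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        task = fetch(task_uid)
        if task["status"] in FINAL_STATUSES:
            return task
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_uid} is still {task['status']}")
        time.sleep(interval)
```

A caller would take the task identifier returned by a document write and pass it to `wait_for_task` before issuing a search that depends on the update. The `fetch` parameter exists so the polling loop can be exercised without a running server.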
In contrast, Manticore Search offers real-time insert, replace, and delete capabilities, with changes visible as soon as the query completes.
In summary, while Meilisearch provides fast and efficient search capabilities, keep in mind that updates to documents might not be immediately visible in search results due to the asynchronous task processing.
Meilisearch is known for its impressive speed, outperforming Elasticsearch in many cases. However, its performance is most noticeable when working with small datasets. As the dataset size increases, Meilisearch’s performance may decline.
Manticore Search consistently delivers fast query performance for various query types and dataset types, outperforming both Meilisearch and Elasticsearch. With optimized row-wise and columnar indexing methods, Manticore ensures a responsive search experience, crucial for maintaining user engagement in high-performance applications.
In contrast, Meilisearch struggles with efficiently handling large datasets and suffers from performance degradation during document loading. Therefore, Manticore is the superior choice for those who don’t want to worry about their dataset size.
The Hacker News small dataset benchmark, which features a collection of 1.1 million curated Hacker News comments with numeric fields (source: https://zenodo.org/record/45901/), highlights the higher search performance of Manticore Search over Meilisearch. The dataset contains textual data from comments and numeric fields such as upvotes, timestamps, and user IDs. The benchmark test involves running full-text and analytical queries to assess the search engines’ capabilities.
The benchmark results can also be verified through this link.
Unfortunately, Meilisearch is not capable of executing many types of queries, such as aggregation queries and those with negative full-text search terms.
An interesting aspect of this benchmark is the significant difference in disk space usage between the two search engines:
/perf/test_engines/tests/hn_small/manticore # du -sh idx
1.1G idx
/perf/test_engines/tests/hn_small/meilisearch # du -sh .
38G .
Meilisearch requires 34x more disk space to store the same dataset compared to Manticore Search.
In terms of data loading performance it took:
to fully complete data loading.
This test involves the same 1.1 million curated Hacker News comments dataset (source: https://zenodo.org/record/45901/), but multiplied 100 times, resulting in about 116 million documents. The benchmark covers both full-text and analytical queries, making it an excellent test case for evaluating search engine capabilities on a larger scale.
Meilisearch couldn’t load the data in 2 days. Its insert performance degraded as the database grew. We tried to speed up loading by sending batches in parallel, but without success: all batches went into a single queue, so we couldn’t achieve any improvement. In about 2 days Meilisearch loaded only 38% of the data, which already consumed over 850 GB of disk space. In stark contrast, Manticore Search stored the entire dataset in approximately 100 GB of disk space and loaded it in 2 hours 9 minutes using a single CPU core (and loading scales virtually linearly with the number of cores).
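As a rough back-of-the-envelope check on these figures, the short Python sketch below extrapolates the Meilisearch numbers to the full dataset and derives Manticore’s average ingestion rate. The linear extrapolation is an assumption made only for illustration. Since insert performance degraded as the data grew and disk usage was reported as "over 850 GB", the projected values are best read as lower bounds.

```python
# Figures quoted in the text above.
TOTAL_DOCS = 116_000_000                     # about 116 million documents
MEILI_FRACTION_LOADED = 0.38                 # 38% of the data loaded...
MEILI_DAYS_SPENT = 2.0                       # ...after about 2 days
MEILI_DISK_GB = 850.0                        # disk used at that point
MANTICORE_LOAD_SECONDS = 2 * 3600 + 9 * 60   # 2 hours 9 minutes
MANTICORE_DISK_GB = 100.0                    # approximate full-dataset size

# Assumption: time and disk usage grow linearly with the share of data loaded.
meili_projected_days = MEILI_DAYS_SPENT / MEILI_FRACTION_LOADED
meili_projected_disk_gb = MEILI_DISK_GB / MEILI_FRACTION_LOADED

# Average Manticore ingestion rate over the whole load, on one CPU core.
manticore_docs_per_second = TOTAL_DOCS / MANTICORE_LOAD_SECONDS

print(f"Meilisearch projected load time: {meili_projected_days:.1f} days")
print(f"Meilisearch projected disk usage: {meili_projected_disk_gb:.0f} GB")
print(f"Manticore average ingestion rate: {manticore_docs_per_second:.0f} docs/s")
print(f"Projected disk ratio: {meili_projected_disk_gb / MANTICORE_DISK_GB:.1f}x")
```

Under the linear assumption, a full Meilisearch load would take more than five days and more than 2 TB of disk, while the Manticore load averages roughly 15 thousand documents per second.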
The inability of Meilisearch to process the entire Hacker News large dataset highlights its challenges in managing and scaling with more extensive data collections. Manticore Search’s superior performance in this benchmark underscores its capacity to handle large-scale search requirements, making it a more suitable choice for applications with larger data collections.
Since we couldn’t load the data into Meilisearch, you can check the Manticore-only results here.
This test is based on a dataset containing 10 million NGINX logs. The source of this dataset is Kaggle. Web server logs register various events, providing valuable insights into website visitors, user behavior, crawlers accessing the site, business intelligence, security issues, and more. The benchmark uses a curated list of typical queries that a random DevOps engineer might run.
Manticore Search and Meilisearch again differed significantly in disk space usage. Manticore Search used 4.4 GB of disk space, while Meilisearch consumed 69 GB, roughly 16 times more (69 / 4.4 ≈ 15.7). Although the difference is less dramatic than in the Hacker News small dataset test, it is still noteworthy, especially considering that the Logs10m dataset contains less text data.
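The disk usage ratios for the two benchmarks that completed on both engines can be rechecked with a few lines of Python. The sizes are the ones reported in this article (`du -sh` output for the Hacker News small dataset, and the figures above for the logs dataset). The helper name `disk_ratio` is an assumption, and because `du -sh` rounds its output, the ratios are approximate.

```python
def disk_ratio(larger_gb, smaller_gb):
    """Return how many times more disk space the larger footprint takes."""
    return larger_gb / smaller_gb


# Sizes in GB as reported in the article: (Meilisearch, Manticore Search).
hn_small_ratio = disk_ratio(38.0, 1.1)
logs10m_ratio = disk_ratio(69.0, 4.4)

print(f"Hacker News small: {hn_small_ratio:.1f}x")
print(f"Logs10m: {logs10m_ratio:.1f}x")
```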
It took Meilisearch around 20 minutes to load the data, whereas Manticore finished in 6 minutes.
You can find the detailed comparison of the performance results using the provided link. Please note that many of the empty results are simply due to Meilisearch being unable to handle certain types of queries, so those queries were skipped during the benchmarking process.
Small-scale projects: Meilisearch’s lightweight nature and ease of deployment make it suitable for small projects with limited data and search requirements, such as small-scale e-commerce, personal websites, local directories, or simple web applications, where fast data loading, advanced search features and scalability are not critical factors.
When choosing a search engine for your project, it is crucial to consider factors such as search relevance, scalability, and performance. Manticore Search stands out as the superior choice for diverse applications and use cases, ensuring optimal search performance and relevance regardless of dataset size. Its advanced search and analytics capabilities make it a reliable choice for projects that demand high-performance search functionality.
Meilisearch is suitable for small projects where advanced search features and scalability are not critical factors.
Ultimately, the choice between Manticore Search and Meilisearch will depend on your specific needs and project requirements.