Once upon a time, a company I worked for had a problem: we had thousands of messages flowing through our data pipeline each second, and we wanted to be able to send email and SMS alerts to our users when messages matching specific criteria were seen. The first attempt at an alerting system utilized PipelineDB. To make a long story short, not only was that architecture rigid and hard to change, it didn't scale well and was constantly having performance issues. We would get called out by users for not sending alerts that should have triggered.

## Enter Elasticsearch

Elasticsearch is a NoSQL distributed database that is good for, well, searching. I would never recommend it as a transactional database for basic CRUD actions, but aggregations, metrics, and percolate queries are where it has really shined in my experience.

## What is a Percolate Query?

A good way to think about Elastic's percolate feature is that it is an inverse search. A normal search entails storing data (JSON documents) and making a query to retrieve a subset of that data. A percolate query is when many queries are stored in the database, and documents are "percolated" to return the subset of queries that match each document.

## What does it look like?

Following Elastic's documentation, we will create an index with a mapping (which is basically a loosey-goosey SQL schema) for an index that holds percolating queries:

```
PUT /my-index
{
  "mappings": {
    "properties": {
      "threshold": {
        "type": "long"
      },
      "count": {
        "type": "long"
      },
      "query": {
        "type": "percolator"
      }
    }
  }
}
```

`my-index` is the name of the index. `threshold` and `count` are fields that we plan on utilizing in either the queries or the documents. All fields should be defined in the mapping.

Now that we have an index that can store percolating queries, we can register a new query:

```
PUT /my-index/my-doc/1?refresh
{
  "threshold": 100,
  "query": {
    "bool": {
      "must": {
        "query_string": {
          "default_field": "query_string",
          "query": "count:>100"
        }
      }
    }
  }
}
```

The `query` object contains all the logic for percolation. If a document's `count` field is greater than 100, then this query will be returned in the document's result set. The only purpose of the `threshold` field is convenience: when we are doing CRUD operations on our queries, we can manage the threshold in its own field instead of parsing the query string every time.

Now, let's percolate a document and see if it matches:

```
GET /my-index/_search
{
  "query": {
    "percolate": {
      "field": "query",
      "document": {
        "count": 101
      }
    }
  }
}
```

Response:

```
{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 1,
    "hits": [
      {
        "_index": "my-index",
        "_type": "my-doc",
        "_id": "1",
        "_score": 1,
        "_source": {
          "threshold": 100,
          "query": {
            "bool": {
              "must": {
                "query_string": {
                  "default_field": "query_string",
                  "query": "count:>100"
                }
              }
            }
          }
        }
      }
    ]
  }
}
```

Because the count was greater than the threshold, the percolate query was returned!

As you can see, this works great for an alerting system: users can create "alerts" which we store as percolating queries. For example, a user can create a query that triggers when a Twitter post mentions their name, or when the temperature in a city rises above a certain threshold.

## Use it

Percolate queries are perfect for when you have an ever-changing set of criteria (probably created by users) that many documents need to be checked against.
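To make that concrete, here is a minimal sketch of how an alerting service might register alerts and percolate incoming messages using the official Python client (elasticsearch-py). The client calls mirror the requests above, but the connection URL, alert IDs, and the `send_alert()` notifier are hypothetical placeholders, and exact client signatures vary a bit between elasticsearch-py versions.

```python
# Minimal alerting sketch built on percolate queries (assumes a local
# Elasticsearch node and the elasticsearch-py client; send_alert() is a
# hypothetical stand-in for a real email/SMS notifier).
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumed local node


def register_alert(alert_id: str, threshold: int) -> None:
    """Store a user's alert as a percolator query in my-index."""
    es.index(
        index="my-index",
        id=alert_id,
        body={
            "threshold": threshold,
            "query": {"query_string": {"query": f"count:>{threshold}"}},
        },
        refresh=True,
    )


def send_alert(alert_id: str, message: dict) -> None:
    """Hypothetical notifier; a real system would send email/SMS here."""
    print(f"alert {alert_id} triggered by {message}")


def check_message(message: dict) -> None:
    """Percolate one incoming message against every stored alert."""
    resp = es.search(
        index="my-index",
        body={"query": {"percolate": {"field": "query", "document": message}}},
    )
    for hit in resp["hits"]["hits"]:
        send_alert(hit["_id"], message)


if __name__ == "__main__":
    register_alert("1", threshold=100)
    check_message({"count": 101})  # matches alert "1", so send_alert fires
```

The nice property here is that adding or editing an alert is just an index operation; the code that percolates incoming messages never has to change.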
I've used them for alerting and auto-tagging systems in the past. Let me know on Twitter if you have questions or can think of another interesting use case for them!

Lane Wagner on Twitter: @wagslane

Subscribe to Qvault: https://qvault.io

Lane Wagner on GitHub: https://github.com/lane-c-wagner

Previously published at https://qvault.io/2019/11/14/how-percolate-queries-in-elasticsearch-make-alerting-a-breeze/