Why Amazon’s Search Feels Broken

Search Amazon for a phone, sort by lowest price, and watch the search results lose their mind.

You were trying to be sensible. Frugal, even. Instead you are knee-deep in glitter cases, mystery charger bundles, screen protectors for devices you do not own, and one suspiciously cheap listing that appears to be selling spiritual optimism rather than an actual handset.

It is tempting to say Amazon has bad search. Sometimes that is true. But it is also a little unfair.

Because the search box gets blamed for crimes committed much earlier in the supply chain.

I ended up building ProductNormaliser after running into exactly this problem from the wrong side of the screen. Once you start pulling product data from retailers and manufacturers, you realise the search layer is often being asked to perform miracles on top of a catalogue that does not consistently know what a product is. And if the catalogue is confused, search does not become clever.

Baymard’s 2024 benchmark makes the practical version of this point rather nicely. Across 5,000 manually scored search UX ratings, it found that 41% of sites fail to fully support eight core search query types. Even “exact” searches have issues on 33% of sites, and “compatibility” searches fail on 31%. So even when people paste in a model number or search for something like a Dell laptop charger, a depressing number of sites still manage to act surprised.

That matters because shoppers are not submitting beautiful clean database queries. They are rummaging. They are guessing. They are pasting model names from review sites. They are searching for the thing they own in the hope of finding the thing that fits it. Baymard’s examples are painfully familiar: a search for “Vostro 2420 adapter” can surface irrelevant adapters and incompatible products, while better implementations make compatibility obvious in the results. In other words, what looks like a ranking problem is often a product understanding problem wearing a ranking problem’s hat.

Amazon adds another layer of mischief because its results page is not a neutral librarian. Amazon’s own advertising documentation says Sponsored Products can appear at the top of, alongside, or within shopping results, and on product pages too. So even before we get to relevance, the page is already balancing retrieval, merchandising, and paid placement. Researchers in SIGIR’s e-commerce search overview make the same broader point: marketplace search has to juggle customer goals, business goals, ranking, and the messy logistics of product discovery in a two-sided marketplace.

Still, ads are only part of the mess. The deeper problem is that the web cannot agree what the product is. One retailer says a laptop is 15.6 inches. Another says 39.6 cm. One says OLED. Another says LED. One lists the refresh rate. Another forgets. One page is copied from a launch spec sheet and never updated. Another quietly mutates over time. And sometimes the one detail that actually matters has been artistically embedded inside an image, which is lovely if your target audience is a human and less lovely if your target audience is a machine trying not to embarrass itself.

On paper, standards exist. Schema.org has a Product type and an additionalProperty escape hatch for whatever does not fit neatly. Google tells merchants to use product structured data and, for variant-heavy products, ProductGroup with fields like variesBy, hasVariant, and productGroupID so Google can understand which listings are variations of the same parent product. Google Merchant also recommends supplying GTIN, brand, and MPN to improve listing performance and help customers find products. GS1, meanwhile, defines GTIN as the identifier used to uniquely identify trade items across the supply chain. And yet the existence of standards is not the same thing as agreement.
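To make the identifier point concrete, here is a minimal sketch of the one check a catalogue can run on a GTIN before trusting it: the GS1 check digit. The function names are mine, chosen for illustration, and passing this check only shows the number is well-formed, not that it was ever assigned to a real product.

```python
def gtin_check_digit(body: str) -> int:
    """Compute the GS1 check digit for the digits that precede it."""
    total = 0
    # Walking from the right, the weights alternate 3, 1, 3, 1, ...
    for i, ch in enumerate(reversed(body)):
        total += int(ch) * (3 if i % 2 == 0 else 1)
    return (10 - total % 10) % 10


def is_valid_gtin(code: str) -> bool:
    """True if code is 8, 12, 13 or 14 ASCII digits with a correct check digit."""
    if not (code.isascii() and code.isdigit()):
        return False
    if len(code) not in (8, 12, 13, 14):
        return False
    return gtin_check_digit(code[:-1]) == int(code[-1])


if __name__ == "__main__":
    # Sample barcode numbers used purely as test input.
    for candidate in ("4006381333931", "036000291452", "4006381333932"):
        print(candidate, is_valid_gtin(candidate))
```

A surprising amount of feed data fails even this, usually because a leading zero was eaten by a spreadsheet somewhere along the way.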

A 2022 paper on product data quality in e-commerce lands on an unusually plainspoken recommendation: use a single standard for product master data and a common product identifier if you want higher quality product data. Which sounds obvious until you spend five minutes in the wild and discover that every seller, retailer, feed, and marketplace has found its own creative way to be approximately correct.

Amazon’s own catalogue structure tells the same story. Its seller documentation describes parent-child relationships as groups of related products that vary by attributes like size or colour, all gathered under one product detail page so buyers can compare options. Helpful for shoppers, yes. Also a reminder that marketplaces are not dealing in one neat object called “the product”. They are dealing in parents, children, bundles, accessories, variants, stale listings, and a great many near-misses pretending to be the same thing.

This is the point where “better search” stops being the first project, because before you can rank anything well, you need to decide whether two pages describe the same phone, two variants of the same phone, or two entirely different objects that happen to share an alarming number of nouns.

Computer scientists call this entity matching. Recent work on product entity matching describes the problem as identifying records from different sources that refer to the same real-world product despite differences in representation. Other work on multimodal product matching points out that text and numbers are not always enough, and that images can improve recall and overall match quality. An industry paper on fashion e-commerce puts it more bluntly: product matching is a key capability for discoverability, curation, and pricing.
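For a feel of what entity matching looks like at its most basic, here is a rough sketch, assuming a rule-based approach rather than anything from the papers above. The Listing type, the match function, the 0.6 threshold, and the sample listings are all invented for illustration. The idea is simply to trust shared identifiers first and fall back to title overlap only to flag candidates for review.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Listing:
    """A hypothetical, heavily simplified product record from one source."""
    title: str
    brand: Optional[str] = None
    gtin: Optional[str] = None
    mpn: Optional[str] = None


def tokens(text: str) -> set[str]:
    """Lowercased alphanumeric tokens from a title."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: set[str], b: set[str]) -> float:
    """Shared tokens divided by all tokens; 0.0 if either side is empty."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def match(a: Listing, b: Listing, threshold: float = 0.6) -> str:
    """Return 'same', 'review' or 'different' for a pair of listings."""
    # A GTIN on both sides is the strongest evidence either way.
    # Note that different GTINs can still be variants of one product family.
    if a.gtin and b.gtin:
        return "same" if a.gtin == b.gtin else "different"
    # Brand plus manufacturer part number is the next best thing.
    if a.brand and b.brand and a.mpn and b.mpn:
        if (a.brand.lower(), a.mpn.lower()) == (b.brand.lower(), b.mpn.lower()):
            return "same"
    # Title overlap alone is weak, so it only ever earns a human review.
    score = jaccard(tokens(a.title), tokens(b.title))
    return "review" if score >= threshold else "different"


if __name__ == "__main__":
    one = Listing("Acme Book 14 65W AC Adapter", brand="Acme", mpn="AB-65")
    two = Listing("65W charger for Acme Book", brand="ACME", mpn="ab-65")
    print(match(one, two))  # identifiers agree, so the titles never matter
```

Real systems replace the title fallback with learned models and, as the multimodal work suggests, images. The ordering of evidence is the part worth keeping.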

That is the bit people mean when they complain that search feels drunk. The drunkenness is upstream. Search is being asked to rank over a pile of conflicting claims, inconsistent titles, half-parsed attributes, missing identifiers, and variant families that only sometimes make sense. Then we stare at the result page and blame the retrieval model, which is a bit like blaming a waiter because the kitchen cannot agree what a potato is. A product normaliser exists to deal with that ugly middle layer.

It standardises units. It maps aliases. It decides that 15.6 inches and 39.6 cm are not ideological enemies. It separates the stable product from the unstable offer. It keeps source-level truth, meaning what each page actually said and when it said it, separate from canonical truth, meaning the best current conclusion after weighing the evidence. That distinction sounds fussy right up until you need to explain why one retailer says 144Hz, another says 165Hz, and your system has calmly decided to believe one of them.
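Here is a small sketch of that split between source-level and canonical truth, using the screen-size example. To be clear about assumptions: the Observation type, the function names, and the sample sources are invented for illustration, and the majority vote is a deliberately crude stand-in for "weighing the evidence", not a description of how ProductNormaliser actually decides.

```python
from collections import Counter
from dataclasses import dataclass

CM_PER_INCH = 2.54


@dataclass(frozen=True)
class Observation:
    """Source-level truth: what one page said, and when it said it."""
    source: str
    seen_on: str  # ISO date string
    value: float
    unit: str


def to_inches(value: float, unit: str) -> float:
    """Convert a length in inches or centimetres to inches."""
    unit = unit.strip().lower()
    if unit in ("in", "inch", "inches", '"'):
        return value
    if unit in ("cm", "centimetre", "centimetres", "centimeter", "centimeters"):
        return value / CM_PER_INCH
    raise ValueError(f"unknown length unit: {unit!r}")


def canonical_inches(observations: list[Observation]) -> float:
    """Canonical truth: the most common value after rounding to 0.1 inch."""
    votes = Counter(round(to_inches(o.value, o.unit), 1) for o in observations)
    value, _count = votes.most_common(1)[0]
    return value


if __name__ == "__main__":
    # Hypothetical claims about one laptop's screen size.
    claims = [
        Observation("retailer-a.example", "2024-03-01", 15.6, "inches"),
        Observation("retailer-b.example", "2024-03-04", 39.6, "cm"),
        Observation("retailer-c.example", "2023-11-20", 17.3, "in"),
    ]
    print(canonical_inches(claims))  # the claims list itself is left untouched
```

The observations are never overwritten. The canonical value is derived from them, which is what lets you answer "why did you believe that one?" later.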

That is roughly why I built ProductNormaliser: an open product-intelligence engine for turning messy retail and manufacturer page data into canonical, comparable product records while preserving the evidence trail. The evidence trail is the part I care about most, because it keeps the reasoning inspectable instead of flattening everything into one polished lie. Once you have that layer, the problem becomes much more manageable.

Exact searches work better because model aliases and identifiers are normalised instead of left to fend for themselves in title text. Filters start operating on actual attributes rather than decorative fragments. Compatibility stops being a hopeful keyword coincidence and starts becoming a relation you can represent properly. Ranking still matters, obviously. Merchandising still matters. Ads are still there, waving. But at least the system is no longer trying to reason over seven contradictory descriptions of the same monitor and calling it a clean catalogue.
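As a rough illustration of the first and third points, here is a sketch of model-alias normalisation feeding a compatibility lookup. Everything in it is hypothetical: the alias table, the accessory SKUs, and the compatibility data are made up, and a real system would hold these relations in a proper store rather than two dictionaries.

```python
import re


def normalise_model(text: str) -> str:
    """Collapse case, spacing and punctuation: 'Vostro-2420' -> 'vostro2420'."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


# Hypothetical alias table mapping alternative names to one canonical key.
ALIASES = {
    normalise_model("Vostro 14 2420"): "vostro2420",
}

# Hypothetical compatibility relation: accessory SKU -> canonical model keys.
COMPATIBLE_WITH = {
    "ADAPTER-A": {"vostro2420"},
    "ADAPTER-B": {"othermodel100"},
}


def accessories_for(model_query: str) -> list[str]:
    """Return accessory SKUs explicitly recorded as fitting the queried model."""
    key = normalise_model(model_query)
    key = ALIASES.get(key, key)
    return sorted(sku for sku, models in COMPATIBLE_WITH.items() if key in models)


if __name__ == "__main__":
    for query in ("VOSTRO-2420", "vostro 14 2420", "Inspiron 9999"):
        print(query, accessories_for(query))
```

The lookup never touches title text. An adapter appears because something asserted that it fits, not because its listing happens to contain the right nouns.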

So yes, Amazon’s search often feels broken. But the interesting part is why. Not because search engineers forgot how search works. Not only because ads and marketplace incentives muddy the water. Mostly because product identity on the web is a shabby, contradictory, badly supervised affair, and search is the poor soul forced to make it look coherent at speed.