From Features to Vectors: The Secret Language of AI

A couple of weeks back, I came across an interesting blog post about optimizing distance calculations while finding the top-k most similar vectors. The author was benchmarking performance across different CPUs to validate the improvements. That caught my attention.

I thought: why not help validate this across another architecture? So I decided to run similar performance tests on my own machine. But before doing that, I went back to revisit some of the concepts I had learned in school, concepts that suddenly felt alive again.

And I realized something. These ideas are simple, and once you understand them properly, you never forget them. So let's break this down the way I wish someone had explained it to me.

Everything Can Be Represented as Numbers

We have all been using ChatGPT for search, Claude for writing code, and Gemini for image search. The results are so good that it feels like magic. But under the hood, everything is represented as numbers.

At first, it feels strange. How can you turn a picture, a song, or even a sentence into numbers? But numbers are just measurements.

A house can be described by size, number of rooms, and distance from the city → [2000, 3, 5]
A song can be described by tempo, loudness, and danceability → [120, 0.8, 0.95]
A person can be described by height, weight, and age → [170, 60, 27]

Each of these lists is a vector: an ordered collection of numbers capturing the essential features of something.

What Is a Vector, Really?

Properties like commutativity, associativity, and distributivity hold true for vectors in 1D, 2D, 3D, and even N-dimensional space.
This is not because we simply extrapolate from lower dimensions, but because these properties are part of the very definition of a vector space in linear algebra. Whether we are working with a single number on a line or a million-dimensional embedding, the same fundamental rules apply.

We can easily visualize a 1D vector on a line, a 2D vector on a plane, and a 3D vector in space. Beyond that, visualization becomes difficult, but mathematically nothing changes. The structure remains consistent. Each axis measures something meaningful.

Now imagine a vast, high-dimensional universe where every sentence, every image, every song, every user is just a point. That's where embeddings live.

Why Do We Need Vectors?

Because machines only understand numbers. If we want to find similar images, recommend songs, retrieve semantically related text, or cluster users, we need a numeric representation. And once everything becomes a vector, similarity becomes geometry.

Closer vectors = more similar meaning.

But here's the subtle question: what does "closer" actually mean?

The Geometry of Similarity

Distance is not universal. There isn't just one way to measure closeness. Different distance metrics create different geometries of meaning. Let's explore them intuitively.

L2 Distance: The Straight-Line View (Euclidean)

Imagine standing at point A and wanting to reach point B. What's the shortest path? A straight line. That's L2 distance, also known as Euclidean distance.

It measures overall geometric closeness. Big differences matter more because they're squared before being added. Visually, equal distance forms circles (or spheres in 3D), and meaning spreads smoothly outward.

This is the "as the crow flies" distance. If two embeddings differ strongly in one dimension, L2 punishes that heavily. It rewards global similarity.
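As a quick sketch of the idea (plain Python, no libraries, using the toy house vectors from earlier as illustrative data), L2 distance squares each per-dimension difference before summing:

```python
import math

def l2_distance(a, b):
    """Euclidean (L2) distance: square the per-dimension
    differences, sum them, then take the square root."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Toy "house" vectors from earlier: size, rooms, distance from city
house_a = [2000, 3, 5]
house_b = [1500, 2, 8]

# The 500-unit size gap dominates once squared, illustrating
# how L2 punishes one strong mismatch heavily.
print(l2_distance(house_a, house_b))
```

Note that the raw features live on very different scales here; in practice you would normalize the dimensions before comparing, but the geometric intuition is the same.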
L1 Distance: The City Block View (Manhattan)

Now imagine you're in Manhattan. You can't walk diagonally through buildings. You move along streets and avenues. That's L1 distance, also called Manhattan distance.

Instead of a straight line, you measure total horizontal movement plus total vertical movement. No squaring, just absolute differences. Visually, equal distance forms diamonds instead of circles.

L1 measures total change across dimensions. It's often more robust: one large difference doesn't explode the distance as dramatically as in L2. It's the "walk the grid" definition of similarity.

Chebyshev Distance: The Maximum Difference Rule

Now think of a chessboard. How many moves does a king need to reach another square? It depends only on the largest coordinate difference. That's Chebyshev distance.

It measures the single biggest difference across all dimensions. Everything else is ignored. Visually, equal distance forms squares.

This metric asks: what is the strongest mismatch between these two things? It's useful when the largest deviation defines similarity. It's the "weakest link" metric.

But There's Something Even More Subtle

All of these measure how far apart two points are. But in high-dimensional embedding spaces, we often care less about magnitude and more about direction.

Cosine Similarity: The Angle of Meaning

Imagine two arrows starting at the origin. One might be long. One might be short. But if they point in the same direction, they represent similar meaning. That's cosine similarity.

Instead of measuring distance, cosine measures the angle between vectors. If the angle is small, they're similar. If the angle is 90°, they're unrelated. If they point in opposite directions, they're completely different.

Why does this matter?
Because embeddings often encode meaning in direction, not length.

Take two sentences: "The dog is running in the park." and "A puppy is playing outside." Their vectors may not have the same magnitude. But if they point in roughly the same direction in meaning-space, cosine similarity sees them as close.

Cosine asks: are these two pieces of data talking about the same thing? It ignores how "strong" or "confident" the embedding is and focuses purely on semantic orientation. Visually, similarity becomes angular; meaning becomes directional.

Changing the Metric Changes the Shape of Meaning

In high dimensions, L2 makes similarity spherical, L1 makes it diamond-shaped, Chebyshev makes it box-shaped, and cosine makes it angular. Same vectors, different geometry, different interpretation of meaning. When you choose a metric, you are choosing how intelligence perceives similarity.

What Is an Embedding?

An embedding is not just a vector. It is a learned coordinate in a space where distance reflects meaning.

Modern systems, such as OpenAI's embedding models, Google's multimodal models, and Meta's representation learning systems, train neural networks to arrange data in this space so that similar ideas move closer and different concepts move apart. The training process literally reshapes geometry until meaning becomes measurable.

Multimodal Embeddings: A Shared Map of Reality

Now imagine something even more powerful. Different types of data, such as text, images, and audio, are all mapped into the same space.
For example, an image of a golden retriever, the sentence "A friendly golden retriever.", and the sound of barking all become nearby points. This is how systems like OpenAI's multimodal models or Google DeepMind's representation systems build a universal map of meaning. Search with text and retrieve images; search with audio and retrieve video. Everything becomes geometry.

Why This Matters

Once everything is a vector, search becomes nearest-neighbor lookup, recommendations become geometric proximity, clustering becomes spatial grouping, and classification becomes boundary drawing. Vector databases exist because meaning has become spatial.

And underneath every modern AI system lies this quiet truth:

Intelligence is geometry in disguise.

And distance, whether L1, L2, Chebyshev, or cosine, is the ruler by which machines measure thought.
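To close the loop with the top-k benchmark that started all this, here is a minimal, library-free sketch of a top-k nearest-neighbor search under the four metrics discussed above. The helper names and the tiny 2D "corpus" are purely illustrative, not the setup from the blog post I was reproducing:

```python
import heapq
import math

def l1(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def l2(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def chebyshev(a, b):
    return max(abs(x - y) for x, y in zip(a, b))

def cosine_distance(a, b):
    # 1 - cosine similarity, so smaller still means "more similar"
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

def top_k(query, vectors, k, metric):
    """Return the k vectors closest to `query` under `metric`."""
    return heapq.nsmallest(k, vectors, key=lambda v: metric(query, v))

# Hypothetical toy 2D embeddings
corpus = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]]
query = [1.0, 0.05]

print(top_k(query, corpus, 2, l2))               # [[1.0, 0.0], [0.9, 0.1]]
print(top_k(query, corpus, 2, cosine_distance))  # same winners, ranked by angle
```

Real systems replace the brute-force scan with approximate nearest-neighbor indexes, but the ruler being swapped in and out is exactly one of these metrics.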