Traditional hashing techniques like perceptual hashing (pHash) and locality-sensitive hashing (LSH) are widely used for image similarity detection. However, they often fail in real-world scenarios where images undergo transformations such as cropping, noise addition, or color changes.

## Graph Neural Networks: A Smarter Approach to Image Similarity

Graph Neural Networks (GNNs) provide a more robust alternative by modeling image relationships as a graph and propagating similarity information through message passing. Instead of treating images as isolated entities, GNNs represent images as nodes in a graph whose edges define their relationships. This enables context-aware learning, making similarity detection far more robust.

## Let's Talk About Real-World Needs

Consider an AI-driven content moderation system that must detect inappropriate images across billions of uploads daily. If a flagged image is slightly altered, whether cropped, filtered, or resized, traditional hashing fails, allowing harmful content to evade detection.

Now imagine an e-commerce search engine that helps customers find similar products. Two shirts with different backgrounds but identical designs may have very different pixel values, yet they should still be recognized as the same product.

In both cases, a GNN-based approach outperforms traditional hashing by understanding relationships beyond pixels and using graph structures to connect similar items dynamically.

## Building an Image Similarity System with Graph Neural Networks

Let's walk through a tutorial for building a GNN-based similarity system on any image set.

### Extract Image Features Using CLIP or ResNet

Before constructing a graph, we extract feature embeddings from images using a pre-trained model:

```python
import torch
import torchvision.transforms as transforms
from PIL import Image
from torchvision import models

# Load a pre-trained model and drop its classification head
# (older torchvision versions use pretrained=True instead of weights=)
model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
model.fc = torch.nn.Identity()
model.eval()

# Image transformation; the normalization matches ImageNet pre-training
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225])
])

# Load and preprocess an image, then extract its embedding
def extract_features(image_path):
    image = Image.open(image_path).convert('RGB')
    image = transform(image).unsqueeze(0)  # Add batch dimension
    with torch.no_grad():
        features = model(image)
    return features.numpy().flatten()  # 2048-D for ResNet-50

feature_vector = extract_features("my_image.jpg")
```
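Note that ResNet-50 produces 2048-D embeddings, while the rest of this tutorial assumes 512-D vectors, which happens to be what CLIP's ViT-B/32 image encoder outputs. As the section heading promises, here is a minimal CLIP variant of the same step; this sketch assumes the Hugging Face `transformers` package and the public `openai/clip-vit-base-patch32` checkpoint:

```python
# CLIP-based feature extraction (a sketch; assumes Hugging Face
# transformers and the openai/clip-vit-base-patch32 checkpoint)
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

clip_model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
clip_processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
clip_model.eval()

def extract_clip_features(image_path):
    image = Image.open(image_path).convert("RGB")
    inputs = clip_processor(images=image, return_tensors="pt")
    with torch.no_grad():
        features = clip_model.get_image_features(**inputs)
    return features.numpy().flatten()  # 512-D for ViT-B/32

feature_vector = extract_clip_features("my_image.jpg")
```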
### Construct a Graph from Image Embeddings

We can now construct a KNN (k-nearest-neighbor) graph in which each node represents an image and edges indicate similarity.

```python
import networkx as nx
import numpy as np
from sklearn.neighbors import NearestNeighbors

# Simulated feature vectors for 100 images (512-D, matching CLIP ViT-B/32;
# use 2048 here if you extracted ResNet-50 embeddings instead)
image_embeddings = np.random.rand(100, 512)

# Build a KNN graph; ask for k+1 neighbors because each point's
# nearest neighbor is itself
k = 5
nbrs = NearestNeighbors(n_neighbors=k + 1, metric="cosine").fit(image_embeddings)
distances, indices = nbrs.kneighbors(image_embeddings)

# Construct the graph: nodes are images, weights are cosine similarities
G = nx.Graph()
for i in range(len(image_embeddings)):
    for j in range(1, k + 1):  # Skip column 0, the self-match
        G.add_edge(i, indices[i, j], weight=1 - distances[i, j])

# Inspect the graph structure
print("Graph nodes:", G.number_of_nodes())
print("Graph edges:", G.number_of_edges())
```
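The next step converts this graph to PyTorch Geometric format by hand. As an alternative sketch, `torch_geometric.utils.from_networkx` can do the conversion for us; to my understanding it also emits both directions of each undirected edge, which PyG's directed `edge_index` format expects:

```python
# Alternative conversion (a sketch, assuming torch_geometric.utils.from_networkx):
# builds the Data object directly from the NetworkX graph
import torch
from torch_geometric.utils import from_networkx

data = from_networkx(G)
data.x = torch.tensor(image_embeddings, dtype=torch.float)
print(data)  # roughly: Data(edge_index=[2, E], weight=[E], x=[100, 512])
```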
### Train a Graph Neural Network (GNN) for Similarity Learning

We now use PyTorch Geometric to train a GraphSAGE model on this graph. Two details matter: PyG treats `edge_index` as directed, so each undirected edge needs both directions, and the self-supervised reconstruction loss needs the 128-D output mapped back to the 512-D input space, which a small linear decoder handles.

```python
import torch
import torch.nn.functional as F
from torch_geometric.nn import SAGEConv
from torch_geometric.utils import to_undirected

# Convert the NetworkX graph to PyTorch Geometric format,
# mirroring each undirected edge in both directions
edge_index = torch.tensor(list(G.edges), dtype=torch.long).t().contiguous()
edge_index = to_undirected(edge_index)
x = torch.tensor(image_embeddings, dtype=torch.float)

# Define a two-layer GraphSAGE encoder
class GraphSAGE(torch.nn.Module):
    def __init__(self, in_channels, hidden_channels, out_channels):
        super().__init__()
        self.conv1 = SAGEConv(in_channels, hidden_channels)
        self.conv2 = SAGEConv(hidden_channels, out_channels)

    def forward(self, x, edge_index):
        x = self.conv1(x, edge_index)
        x = F.relu(x)
        x = self.conv2(x, edge_index)
        return x

# Initialize the model: 512-D input -> 128-D embeddings.
# The linear decoder maps embeddings back to 512-D so the
# self-supervised reconstruction loss is well-defined.
model = GraphSAGE(512, 256, 128)
decoder = torch.nn.Linear(128, 512)
optimizer = torch.optim.Adam(
    list(model.parameters()) + list(decoder.parameters()), lr=0.01)

for epoch in range(50):
    optimizer.zero_grad()
    out = model(x, edge_index)
    loss = F.mse_loss(decoder(out), x)  # Reconstruct the input features
    loss.backward()
    optimizer.step()
    print(f"Epoch {epoch + 1}, Loss: {loss.item():.4f}")
```
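As a quick sanity check you can bolt on after training (a sketch under the setup above, not part of the original walkthrough), graph neighbors should now score higher cosine similarity than random pairs:

```python
# Sanity check (a sketch): neighbors in the graph should be more
# similar than randomly paired images after training
import numpy as np

with torch.no_grad():
    emb = model(x, edge_index).numpy()
emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)  # unit vectors

src, dst = edge_index.numpy()
neighbor_sim = (emb[src] * emb[dst]).sum(axis=1).mean()

rng = np.random.default_rng(0)
ri, rj = rng.integers(0, len(emb), size=(2, len(src)))
random_sim = (emb[ri] * emb[rj]).sum(axis=1).mean()

print(f"Mean neighbor similarity: {neighbor_sim:.3f}")
print(f"Mean random similarity:   {random_sim:.3f}")
```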
### Querying Similar Images Using the GNN Embeddings

Now that training is complete, we can retrieve the top-k most similar images for a given query.

```python
from sklearn.metrics.pairwise import cosine_similarity

# Extract the learned GNN embeddings
with torch.no_grad():
    learned_embeddings = model(x, edge_index).numpy()

# Compute cosine similarity against the whole collection
query_idx = 0  # Example query image
similarities = cosine_similarity(
    [learned_embeddings[query_idx]], learned_embeddings)

# Rank by similarity; position 0 is the query itself, so skip it
top_k = np.argsort(-similarities[0])[1:6]
print("Most similar images:", top_k)
```
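Brute-force cosine similarity is fine for a hundred images, but the billion-scale scenarios described earlier call for a dedicated nearest-neighbor index. A minimal sketch, assuming the `faiss` package is installed:

```python
# Scaling sketch (assumes the faiss package): cosine similarity
# becomes inner product on L2-normalized float32 vectors
import faiss
import numpy as np

emb = np.ascontiguousarray(learned_embeddings, dtype=np.float32)
faiss.normalize_L2(emb)                    # in-place normalization
index = faiss.IndexFlatIP(emb.shape[1])
index.add(emb)

query = emb[0:1]                           # already normalized
scores, neighbors = index.search(query, 6)
print("Most similar images:", neighbors[0][1:])  # drop the query itself
```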
{loss.item()}") Querying Similar Images Using the GNN Embeddings Querying Similar Images Using the GNN Embeddings Querying Similar Images Using the GNN Embeddings Now that training is complete, we can retrieve the top k most similar images for a given query. from sklearn.metrics.pairwise import cosine_similarity #Extract GNN embeddings with torch.no_grad(): learned_embeddings = model(x, edge_index).numpy() #Compute cosine similarity query_idx = 0 # Example query image similarities = cosine_similarity([learned_embeddings[query_idx]], learned_embeddings) top_k = np.argsort(-similarities[0])[:5] print("Most similar images:", top_k) from sklearn.metrics.pairwise import cosine_similarity #Extract GNN embeddings with torch.no_grad(): learned_embeddings = model(x, edge_index).numpy() #Compute cosine similarity query_idx = 0 # Example query image similarities = cosine_similarity([learned_embeddings[query_idx]], learned_embeddings) top_k = np.argsort(-similarities[0])[:5] print("Most similar images:", top_k) Final Thoughts GNNs create a smarter and more adaptive approach to image similarity. Unlike hashing, which relies on rigid pixel comparisons, GNNs learn meaningful relationships between images, making them far more robust to transformations, occlusions, and noise. In today’s world where image datasets are exploding in size from social media platforms to medical imaging archives and the rise of synthetic data, scalability and accuracy are critical. GNNs provide an efficient way to structure and retrieve similar images improving applications like visual search, content moderation, recommendation engines and AI driven media management. The future of image search isn’t just about finding duplicates, it’s about understanding the deeper connections between images, and GNNs take us one step forward to that a reality.