In this article, we'll show you how to create a handy web app that can summarize the content of any web page. Using Next.js for a smooth and fast web experience, LangChain for processing language, OpenAI for generating summaries, and Supabase for managing and storing vector data, we'll build a powerful tool together.
We all face information overload with so much content online. By making an app that gives quick summaries, we help people save time and stay informed. Whether you're a busy worker, a student, or just someone who wants to keep up with news and articles, this app will be a helpful tool for you.
Our app will let users enter any website URL and quickly get a brief summary of the page. This means you can understand the main points of long articles, blog posts, or research papers without reading them fully.
This summarization app can be useful in many ways. It can help researchers skim through academic papers, keep news lovers updated, and more. Plus, developers can build on this app to create even more useful features.
Next.js is a powerful and flexible React framework developed by Vercel that enables developers to build server-side rendering (SSR) and static web applications with ease. It combines the best features of React with additional capabilities to create optimized and scalable web applications.
The OpenAI module in Node.js provides a way to interact with OpenAI’s API, allowing developers to leverage powerful language models like GPT-3 and GPT-4. This module enables you to integrate advanced AI functionalities into your Node.js applications.
LangChain is a powerful framework designed for developing applications with language models. Originally developed for Python, it has since been adapted for other languages, including Node.js. Here’s an overview of LangChain in the context of Node.js:
LangChain is a library that simplifies the creation of applications using large language models (LLMs). It provides tools to manage and integrate LLMs into your applications, handle the chaining of calls to these models, and enable complex workflows with ease.
Large Language Models (LLMs) like OpenAI’s GPT-3.5 are trained on vast amounts of text data to understand and generate human-like text. They can generate responses, translate languages, and perform many other natural language processing tasks.
Supabase is an open-source backend-as-a-service (BaaS) platform designed to help developers quickly build and deploy scalable applications. It offers a suite of tools and services that simplify database management, authentication, storage, and real-time capabilities, all built on top of PostgreSQL.
Before we start, make sure you have the following: Node.js and npm installed, an OpenAI account with an API key, and a Supabase account.
First, we need to set up a Supabase project and create the necessary tables to store our data.
Go to Supabase, and sign up for an account.
Create a new project, and make note of your Supabase URL and API key. You'll need these later.
Create a new SQL query in your Supabase dashboard, and run the following scripts to create the required tables and functions:
First, create an extension if it doesn’t already exist for our vector store:
create extension if not exists vector;
Next, create a table named “documents.” This table will be used to store and embed the content of the web page in vector format:
create table if not exists documents (
id bigint primary key generated always as identity,
content text,
metadata jsonb,
embedding vector(1536)
);
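Each row in documents pairs a chunk of page text with its embedding and a metadata object we'll later use to filter chunks by file. The vector(1536) size matches the dimensionality of OpenAI's text-embedding-ada-002, the default model used by LangChain's OpenAIEmbeddings. As a rough TypeScript sketch of what a row looks like (illustrative only, not generated types):

```typescript
// Hypothetical shape of a row in the "documents" table (sketch only).
interface DocumentRow {
  id: number;
  content: string; // the raw text chunk
  metadata: { file_id?: number }; // jsonb; we store the owning file's id here
  embedding: number[]; // vector(1536) maps to an array of 1536 floats
}

// Example row, roughly as saveFile() will produce later in this article.
const row: DocumentRow = {
  id: 1,
  content: "Next.js is a React framework...",
  metadata: { file_id: 42 },
  embedding: new Array(1536).fill(0),
};
```

The file_id key in metadata is what ties a chunk back to the files table we create below.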
Now, we need a function to query our embedded data:
create or replace function match_documents (
query_embedding vector(1536),
match_count int default null,
filter jsonb default '{}'
) returns table (
id bigint,
content text,
metadata jsonb,
similarity float
) language plpgsql as $$
-- Prefer table columns over the same-named output parameters above
#variable_conflict use_column
begin
return query
select
id,
content,
metadata,
1 - (documents.embedding <=> query_embedding) as similarity
from documents
where metadata @> filter
order by documents.embedding <=> query_embedding
limit match_count;
end;
$$;
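The <=> operator is pgvector's cosine-distance operator, so 1 - (a <=> b) yields cosine similarity: 1 for vectors pointing the same way, 0 for unrelated ones. For intuition, here is the same math in TypeScript:

```typescript
// Cosine similarity, matching the "1 - cosine distance" expression in the SQL above.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Identical directions score 1; orthogonal vectors score 0.
console.log(cosineSimilarity([1, 0], [1, 0])); // 1
console.log(cosineSimilarity([1, 0], [0, 1])); // 0
```

Supabase computes this on the database side; the snippet is only to show what the similarity score returned by match_documents means.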
Next, we need to set up our table for storing the web page's details:
create table if not exists files (
id bigint primary key generated always as identity,
url text not null,
created_at timestamp with time zone default timezone('utc'::text, now()) not null
);
Now, create a new Next.js application and move into its directory:
npx create-next-app summarize-page
cd ./summarize-page
Install the required dependencies:
npm install @langchain/community @langchain/core @langchain/openai @supabase/supabase-js langchain openai axios
Then, we will install Material UI for building our interface; feel free to use another library:
npm install @mui/material @emotion/react @emotion/styled
Next, we need to set up the OpenAI and Supabase clients. Create a libs directory inside src, and add the following files.
src/libs/openAI.ts
This file will configure the OpenAI client.
import { ChatOpenAI, OpenAIEmbeddings } from "@langchain/openai";
const openAIApiKey = process.env.OPENAI_API_KEY;
if (!openAIApiKey) throw new Error('OpenAI API Key not found.')
export const llm = new ChatOpenAI({
openAIApiKey,
modelName: "gpt-3.5-turbo",
temperature: 0.9,
});
export const embeddings = new OpenAIEmbeddings(
{
openAIApiKey,
},
{ maxRetries: 0 }
);
llm: The language model instance, which will generate our summaries.
embeddings: This will create embeddings for our documents, which help in finding similar content.
src/libs/supabaseClient.ts
This file will configure the Supabase client.
import { createClient } from "@supabase/supabase-js";
const supabaseUrl = process.env.SUPABASE_URL || "";
const supabaseAnonKey = process.env.SUPABASE_ANON_KEY || "";
if (!supabaseUrl) throw new Error("Supabase URL not found.");
if (!supabaseAnonKey) throw new Error("Supabase Anon key not found.");
export const supabaseClient = createClient(supabaseUrl, supabaseAnonKey);
supabaseClient: The Supabase client instance to interact with our Supabase database.
Create a services directory, and add the following files to handle fetching content and managing files.
src/services/content.ts
This service will fetch the web page content and clean it by removing HTML tags, scripts, and styles.
import axios from "axios";
export async function getContent(url: string): Promise<string> {
const response = await axios.get<string>(url);
const htmlContent = response.data;
if (!htmlContent) return "";
// Remove unwanted elements and tags
return htmlContent
.replace(/style="[^"]*"/gi, "")
.replace(/<style[^>]*>[\s\S]*?<\/style>/gi, "")
.replace(/\s*on\w+="[^"]*"/gi, "")
.replace(
/<script(?![^>]*application\/ld\+json)[^>]*>[\s\S]*?<\/script>/gi,
""
)
.replace(/<[^>]*>/g, "")
.replace(/\s+/g, " ");
}
This function fetches the HTML content of a given URL and cleans it up by removing styles, scripts, and HTML tags.
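To see what the cleanup does, here is the same regex pipeline extracted into a standalone function and run on a small inline snippet:

```typescript
// Same cleanup pipeline as getContent, pulled out for a quick demonstration.
function cleanHtml(htmlContent: string): string {
  return htmlContent
    .replace(/style="[^"]*"/gi, "") // inline style attributes
    .replace(/<style[^>]*>[\s\S]*?<\/style>/gi, "") // <style> blocks
    .replace(/\s*on\w+="[^"]*"/gi, "") // inline event handlers (onclick, ...)
    .replace(
      /<script(?![^>]*application\/ld\+json)[^>]*>[\s\S]*?<\/script>/gi,
      ""
    ) // <script> blocks, keeping JSON-LD metadata
    .replace(/<[^>]*>/g, "") // any remaining tags
    .replace(/\s+/g, " "); // collapse whitespace
}

const html =
  '<div style="color:red" onclick="hack()">Hello<script>evil()</script> <b>world</b></div>';
console.log(cleanHtml(html)); // "Hello world"
```

Regex-based stripping is a pragmatic shortcut for a tutorial; for production use you may prefer a real HTML parser such as cheerio.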
src/services/file.ts
This service will save the web page content into Supabase and retrieve summaries.
import { embeddings, llm } from "@/libs/openAI";
import { supabaseClient } from "@/libs/supabaseClient";
import { SupabaseVectorStore } from "@langchain/community/vectorstores/supabase";
import { StringOutputParser } from "@langchain/core/output_parsers";
import {
ChatPromptTemplate,
HumanMessagePromptTemplate,
SystemMessagePromptTemplate,
} from "@langchain/core/prompts";
import {
RunnablePassthrough,
RunnableSequence,
} from "@langchain/core/runnables";
import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";
import { formatDocumentsAsString } from "langchain/util/document";
export interface IFile {
id?: number | undefined;
url: string;
created_at?: Date | undefined;
}
export async function saveFile(url: string, content: string): Promise<IFile> {
const doc = await supabaseClient
.from("files")
.select()
.eq("url", url)
.single<IFile>();
if (!doc.error && doc.data?.id) return doc.data;
const { data, error } = await supabaseClient
.from("files")
.insert({ url })
.select()
.single<IFile>();
if (error) throw error;
const splitter = new RecursiveCharacterTextSplitter({
separators: ["\n\n", "\n", " ", ""],
});
const output = await splitter.createDocuments([content]);
const docs = output.map((d) => ({
...d,
metadata: { ...d.metadata, file_id: data.id },
}));
await SupabaseVectorStore.fromDocuments(docs, embeddings, {
client: supabaseClient,
tableName: "documents",
queryName: "match_documents",
});
return data;
}
export async function getSummarization(fileId: number): Promise<string> {
const vectorStore = await SupabaseVectorStore.fromExistingIndex(embeddings, {
client: supabaseClient,
tableName: "documents",
queryName: "match_documents",
});
const retriever = vectorStore.asRetriever({
filter: (rpc) => rpc.filter("metadata->>file_id", "eq", fileId),
k: 2,
});
const SYSTEM_TEMPLATE = `Use the following pieces of context to explain what it is about and summarize it.
If you can't explain it, just say that you don't know; don't try to make up an explanation.
----------------
{context}`;
const messages = [
SystemMessagePromptTemplate.fromTemplate(SYSTEM_TEMPLATE),
HumanMessagePromptTemplate.fromTemplate("{format_answer}"),
];
const prompt = ChatPromptTemplate.fromMessages(messages);
const chain = RunnableSequence.from([
{
context: retriever.pipe(formatDocumentsAsString),
format_answer: new RunnablePassthrough(),
},
prompt,
llm,
new StringOutputParser(),
]);
const format_summarization =
`
Give it a title, subject, description, and conclusion of the context in this format, replacing the brackets with the actual content:
[Write the title here]
By: [Name of the author or owner or user or publisher or writer or reporter if possible, otherwise leave it "Not Specified"]
[Write the subject, it could be a long text, at least minimum of 300 characters]
----------------
[Write the description in here, it could be a long text, at least minimum of 1000 characters]
Conclusion:
[Write the conclusion in here, it could be a long text, at least minimum of 500 characters]
`;
const summarization = await chain.invoke(format_summarization);
return summarization;
}
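The RecursiveCharacterTextSplitter used in saveFile tries each separator in order (paragraphs first, then lines, then words) so chunks break on natural boundaries when possible. LangChain's real splitter also merges small pieces and supports overlap between chunks; the following is only a simplified sketch of the recursive idea:

```typescript
// Simplified recursive splitter sketch: split on the coarsest separator,
// then recurse with finer separators on any piece still over the limit.
function splitRecursive(
  text: string,
  separators: string[],
  chunkSize: number
): string[] {
  if (text.length <= chunkSize) return [text];
  const [sep, ...rest] = separators;
  if (sep === undefined) {
    // No separators left: hard-cut the text into fixed-size chunks.
    const chunks: string[] = [];
    for (let i = 0; i < text.length; i += chunkSize) {
      chunks.push(text.slice(i, i + chunkSize));
    }
    return chunks;
  }
  return text
    .split(sep)
    .filter((piece) => piece.length > 0)
    .flatMap((piece) => splitRecursive(piece, rest, chunkSize));
}

const chunks = splitRecursive(
  "First paragraph.\n\nSecond paragraph.",
  ["\n\n", "\n", " "],
  20
);
console.log(chunks); // ["First paragraph.", "Second paragraph."]
```

Both chunks stay under the limit because the text split cleanly at the paragraph boundary; only oversized pieces get re-split on finer separators.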
saveFile: Saves the file and its content to Supabase, splits the content into manageable chunks, and stores them in the vector store.
getSummarization: Retrieves relevant documents from the vector store and generates a summary using OpenAI.
Now, let's create an API handler to process the content and generate a summary.
src/pages/api/content.ts
import { getContent } from "@/services/content";
import { getSummarization, saveFile } from "@/services/file";
import { NextApiRequest, NextApiResponse } from "next";
export default async function handler(
req: NextApiRequest,
res: NextApiResponse
) {
if (req.method !== "POST")
return res.status(405).json({ message: "Method not allowed" });
const { body } = req;
try {
const content = await getContent(body.url);
const file = await saveFile(body.url, content);
const result = await getSummarization(file.id as number);
res.status(200).json({ result });
} catch (err) {
res.status(500).json({ error: err });
}
}
This API handler receives a URL, fetches the content, saves it to Supabase, and generates a summary. It handles both the saveFile
and getSummarization
functions from our services.
Finally, let's create the frontend in src/pages/index.tsx
to allow users to input URLs and display the summarizations.
src/pages/index.tsx
import axios from "axios";
import { useState } from "react";
import {
Alert,
Box,
Button,
Container,
LinearProgress,
Stack,
TextField,
Typography,
} from "@mui/material";
export default function Home() {
const [loading, setLoading] = useState(false);
const [url, setUrl] = useState("");
const [result, setResult] = useState("");
const [error, setError] = useState<any>(null);
const onSubmit = async () => {
try {
setError(null);
setLoading(true);
const res = await axios.post("/api/content", { url });
setResult(res.data.result);
} catch (err) {
console.error("Failed to fetch content", err);
setError(err as any);
} finally {
setLoading(false);
}
};
return (
<Box sx={{ height: "100vh", overflowY: "auto" }}>
<Container
sx={{
backgroundColor: (theme) => theme.palette.background.default,
position: "sticky",
top: 0,
zIndex: 2,
py: 2,
}}
>
<Typography sx={{ mb: 2, fontSize: "24px" }}>
Summarize the content of any page
</Typography>
<TextField
fullWidth
label="Input page's URL"
value={url}
onChange={(e) => {
if (result) setResult("");
setUrl(e.target.value);
}}
sx={{ mb: 2 }}
/>
<Button
disabled={loading}
variant="contained"
onClick={onSubmit}
>
Summarize
</Button>
</Container>
<Container maxWidth="lg" sx={{ py: 2 }}>
{loading ? (
<LinearProgress />
) : (
<Stack sx={{ gap: 2 }}>
{result && (
<Alert>
<Typography
sx={{
whiteSpace: "pre-line",
wordBreak: "break-word",
}}
>
{result}
</Typography>
</Alert>
)}
{error && <Alert severity="error">{error.message || String(error)}</Alert>}
</Stack>
)}
</Container>
</Box>
);
}
This React component allows users to input a URL, submit it, and display the generated summary. It handles loading states and error messages to provide a better user experience.
Create a .env file in the root of your project to store your environment variables:
SUPABASE_URL=your-supabase-url
SUPABASE_ANON_KEY=your-supabase-anon-key
OPENAI_API_KEY=your-openai-api-key
Finally, start your Next.js application:
npm run dev
Now you should have a running application where you can input a web page's URL and receive a summarized version of its content.
Congratulations! You've built a fully functional web page summarization application using Next.js, OpenAI, LangChain, and Supabase. Users can input a URL, fetch the content, store it in Supabase, and generate a summary using OpenAI's capabilities. This setup provides a robust foundation for further enhancements and customization based on your needs.
Feel free to expand on this project by adding more features, improving the UI, or integrating additional APIs.
The complete source code is available at: https://github.com/firstpersoncode/summarize-page
Happy coding!