I write a newsletter called Above Average, where I talk about the second-order insights behind everything that is happening in big tech. If you are in tech and don’t want to be average, subscribe to it.

A lot of people want podcasts transcribed so they can read them instead of listening. We can go one level up and extract insights from the podcasts as well, using the OpenAI API. Here’s my tweet exchange, which provoked this experiment.

So here is what we are going to do, and it will be slightly different from what I suggested in the tweet. The goal is to pick a YouTube video, get a transcription of that video, and then use prompt engineering to extract insights, ideas, book quotes, summaries, etc.

To summarize, we achieve our goal in three steps:

Step 1: Select a YouTube podcast video
Step 2: Transcribe the video
Step 3: Get insights from the transcription

Step 1: Select a YouTube podcast video

A recent podcast conversation that broke YouTube was Jeff Bezos on Lex Fridman’s podcast. So, for this exercise, I will pick this video.

Step 2: Transcribe the video

I used LangChain’s YoutubeAudioLoader along with OpenAI’s audio-to-text model, Whisper, to transcribe the YouTube video. As usual, you need your OpenAI secret key to use the following script.

import os
import sys
import openai

from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # read local .env file
openai.api_key  = os.environ['OPENAI_API_KEY']


## youtube video's audio loader - langchain 
from langchain_community.document_loaders.blob_loaders.youtube_audio import YoutubeAudioLoader
from langchain_community.document_loaders.generic import GenericLoader
from langchain_community.document_loaders.parsers import OpenAIWhisperParser #, OpenAIWhisperParserLocal

url="https://www.youtube.com/watch?v=DcWqzZ3I2cY&ab_channel=LexFridman"
save_dir="outputs/youtube/"
loader = GenericLoader(
    YoutubeAudioLoader([url], save_dir),
    OpenAIWhisperParser(),
)
docs = loader.load()
print(docs[0].page_content[0:500])

# Specify the file path where you want to save the text
file_path = "audio-transcript.txt"
try:
    # 'w' overwrites any earlier transcript, so re-running the script does not append duplicates
    with open(file_path, 'w', encoding='utf-8') as file:
        for doc in docs:
            file.write(doc.page_content)
    print(f'Transcript saved to {file_path}')
except FileNotFoundError:
    print(f"Error: Could not open '{file_path}' for writing; check that the directory exists.")
except Exception as e:
    print(f"An error occurred: {e}")

You might see the following error while running this script. I pasted the solution that worked for me on a Windows system.

ERROR: Postprocessing: ffprobe and ffmpeg not found. Please install or provide the path using --ffmpeg-location.

Running this script will generate the transcript and store it in a text file, audio-transcript.txt.

Step 3: Extract insights from the conversation

To extract insights, I am using the OpenAI API, and here is the script. The code loads the transcript text and passes it along with a prompt designed to extract insights, people, and books. To get more interesting things out of this conversation, you can come up with a more interesting prompt.

Note that the file name is slightly different because I had to cut the transcript to a shorter length: my completion query to the OpenAI API exceeded my TPM (tokens per minute) limit.

import os
import openai

from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # read local .env file
openai.api_key  = os.environ['OPENAI_API_KEY']

client = openai.OpenAI()

file_path = "audio-transcript-copy.txt"
try:
    with open(file_path, 'r', encoding='utf-8') as file:
        long_text = file.read()
    print(f'{file_path} has been read')
except FileNotFoundError:
    # Stop here: without the transcript, long_text is undefined and the prompt below cannot be built
    raise SystemExit(f"Error: Input file '{file_path}' not found.")
except Exception as e:
    raise SystemExit(f"An error occurred: {e}")

prompt2 = f"""
You will be provided with text delimited by triple quotes.
The given text is a podcast transcript.

Provide the host and guest name.
Summarize the transcript into 10 points.

If any people are referred to in the transcript, extract the people mentioned and list them along with some info about them in the following format.
1. Person 1's Name: Person 1's profession or what he or she is known for or the context in which he or she was referred to.
2. Person 2's Name: Person 2's profession or what he or she is known for or the context in which he or she was referred to.
...
N. Person N's Name: Person N's profession or what he or she is known for or the context in which he or she was referred to.
If the transcript doesn't contain references to any people then simply write \"No people referred to in the conversation.\"

Extract the books mentioned and list them in the following format.
1. Book 1's Title: Context in which the book was referred to.
2. Book 2's Title: Context in which the book was referred to.
...
N. Book N's Title: Context in which the book was referred to.
If the transcript doesn't contain references to any books then simply write \"No books referred to in the conversation.\"

If you find any inspirational quotes, compile them into a list.

\"\"\"{long_text}\"\"\"
"""

response = client.chat.completions.create(
  model="gpt-4",
  messages=[
    {
      "role": "user",
      "content": prompt2
    }
  ],
  temperature=0.7,
  #max_tokens=64,
  #top_p=1
)

print(response.choices[0].message.content)

Here is the output I got:

Result: Lex Fridman & Jeff Bezos Podcast Summary

AI PRODUCT IDEA ALERT 1: A service for podcasters that generates smart transcripts with insights. This would be a B2B play.

AI PRODUCT IDEA ALERT 2: Instead of a service for podcast creators, the customers could be B2C: people who listen to podcasts and want to read through them and build their own library of insights.

Expect existing podcast hosting companies like Spotify to pick up both of these ideas and launch them as new features. If I were a Product Manager at any such company, I would be pitching them by now.

That’s it for day 5 of 100 Days of AI.

Follow me on Twitter or LinkedIn for the latest updates on 100 Days of AI, or bookmark this page. Check out Day 4: How to Use ChatGPT to be More Productive?

Also published here.
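
A side note on the TPM limit mentioned in Step 3: instead of cutting the transcript short, one option is to split it into smaller pieces and send each piece in its own request. The following is only a rough sketch of the splitting part. The function name split_into_chunks and the max_chars default are hypothetical choices made for illustration, not something from the original scripts, and character count is used as a crude stand-in for token count.

```python
def split_into_chunks(text: str, max_chars: int = 12000) -> list:
    """Split text into pieces of at most max_chars characters, breaking on whitespace.

    Hypothetical helper: max_chars is an assumed budget, not an OpenAI limit.
    """
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")

    chunks = []
    current = []       # words collected for the chunk being built
    current_len = 0    # length of " ".join(current)

    for word in text.split():
        # A single word longer than the budget is hard-split so every chunk fits.
        while len(word) > max_chars:
            if current:
                chunks.append(" ".join(current))
                current, current_len = [], 0
            chunks.append(word[:max_chars])
            word = word[max_chars:]

        # The +1 accounts for the joining space when the chunk already has words.
        extra = len(word) + (1 if current else 0)
        if current_len + extra > max_chars:
            chunks.append(" ".join(current))
            current, current_len = [word], len(word)
        else:
            current.append(word)
            current_len += extra

    if current:
        chunks.append(" ".join(current))
    return chunks


if __name__ == "__main__":
    demo = split_into_chunks("one two three four five", max_chars=9)
    print(demo)
```

Each chunk could then take the place of long_text in the prompt, with the per-chunk answers combined in a final request. Pacing the requests would still be needed to stay under a per-minute limit, and a tokenizer-based count would be more accurate than characters.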


100 Days of AI Day 5: Transcription and Extracting Insight from Podcasts with OpenAI
