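In this blog, we will build a live image search and query it with natural language. For example, you can search for "an elephant" or "a cute animal" with a list of images as input.

We are going to use a multimodal embedding model to understand and embed the images, and build a vector index for efficient retrieval. We will use CocoIndex to build the indexing flow; it is an ultra-performant real-time data transformation framework. While the flow is running, you can add new files to the folder, and it processes only the changed files and indexes them within a minute.

It would mean a lot to us if you could drop a star at CocoIndex on GitHub, if this tutorial is helpful.

Technologies

CocoIndex

CocoIndex is an ultra-performant real-time data transformation framework for AI.

CLIP ViT-L/14

CLIP ViT-L/14 is a powerful vision-language model that can understand both images and text. It is trained to align visual and textual representations in a shared embedding space, which makes it a good fit for our image search use case.

In our project, we use CLIP to: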
- Generate embeddings of the images directly
- Convert natural language search queries into the same embedding space
- Enable semantic search by comparing query embeddings with image embeddings

Qdrant

Qdrant is a high-performance vector database. We use it to store and query the embeddings.

FastAPI

FastAPI is a modern, fast (high-performance) web framework for building APIs with Python 3.7+ based on standard Python type hints. We use it to build the web API for the image search.

Prerequisites
- Postgres. CocoIndex uses Postgres to keep track of data lineage for incremental processing.
- Install Qdrant.

Define the indexing flow

Flow design

The flow diagram illustrates how we process the images:
- Read image files from the local filesystem
- Use CLIP to understand and embed each image
- Store the embeddings in a vector database for retrieval

1. Ingest the images

import datetime

import cocoindex

@cocoindex.flow_def(name="ImageObjectEmbedding")
def image_object_embedding_flow(flow_builder: cocoindex.FlowBuilder, data_scope: cocoindex.DataScope):
    data_scope["images"] = flow_builder.add_source(
        cocoindex.sources.LocalFile(path="img", included_patterns=["*.jpg", "*.jpeg", "*.png"], binary=True),
        refresh_interval=datetime.timedelta(minutes=1)  # Poll for changes every 1 minute
    )
    img_embeddings = data_scope.add_collector()
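To make the source step more concrete, here is a minimal plain-Python sketch of the kind of rows a local-file source with include patterns could produce. It is not CocoIndex's implementation: the helper name `list_matching_files` is hypothetical, and matching the patterns against the bare file name is an assumption made for illustration.

```python
import fnmatch
from pathlib import Path


def list_matching_files(root: str, patterns: list[str]) -> list[dict]:
    """Return one row per file under `root` whose name matches any glob pattern.

    Hypothetical helper: each row mimics the two sub fields the tutorial
    mentions for the source table, `filename` and `content`.
    """
    rows = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and any(fnmatch.fnmatch(path.name, p) for p in patterns):
            rows.append({
                "filename": str(path.relative_to(root)),
                "content": path.read_bytes(),  # raw bytes, as with binary=True
            })
    return rows
```

Calling `list_matching_files("img", ["*.jpg", "*.jpeg", "*.png"])` would return the image files under `img` and skip everything else.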
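flow_builder.add_source creates a table with sub fields (filename, content). See the documentation for more details.

2. Process each image and collect the information

2.1 Embed the image with CLIP

import functools
import io
from typing import Literal

import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# Assumption: Hugging Face checkpoint id for CLIP ViT-L/14; the original snippet does not show this constant.
CLIP_MODEL_NAME = "openai/clip-vit-large-patch14"

@functools.cache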
def get_clip_model() -> tuple[CLIPModel, CLIPProcessor]:
    model = CLIPModel.from_pretrained(CLIP_MODEL_NAME)
    processor = CLIPProcessor.from_pretrained(CLIP_MODEL_NAME)
    return model, processor
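The loader above is wrapped in `functools.cache` so that it runs only once. The following stand-alone sketch shows that behavior with a stub in place of the real model; `load_model_stub` and `load_count` are hypothetical names used only for illustration.

```python
import functools

load_count = 0  # how many times the loader body has actually executed


@functools.cache
def load_model_stub() -> dict:
    """Stand-in for an expensive model load (no real model involved)."""
    global load_count
    load_count += 1
    return {"name": "stub-model"}


first = load_model_stub()
second = load_model_stub()  # served from the cache, body does not run again
print(load_count)       # 1
print(first is second)  # True
```

Because the function takes no arguments, every call after the first returns the same cached object.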
The @functools.cache decorator caches the results of a function call. In this case, it ensures that we load the CLIP model and processor only once.

@cocoindex.op.function(cache=True, behavior_version=1, gpu=True)
def embed_image(img_bytes: bytes) -> cocoindex.Vector[cocoindex.Float32, Literal[768]]:
    """
    Convert image to embedding using CLIP model.
    """
    model, processor = get_clip_model()
    image = Image.open(io.BytesIO(img_bytes)).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        features = model.get_image_features(**inputs)
    return features[0].tolist()
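The `cache=True` and `behavior_version=1` arguments on the decorator above suggest that results are reused as long as the input and the function's behavior stay the same. The sketch below illustrates that idea with a content-hash key. It is an assumption about the concept, not CocoIndex's actual caching code, and `cached_by_content` and `fake_embed` are hypothetical names.

```python
import hashlib
from typing import Callable

# Shared store: (sha256 of input bytes, behavior version) -> cached embedding
_cache: dict[tuple[str, int], list[float]] = {}


def cached_by_content(fn: Callable[[bytes], list[float]], behavior_version: int):
    """Wrap `fn` so identical input bytes are computed once per behavior version."""
    def wrapper(data: bytes) -> list[float]:
        key = (hashlib.sha256(data).hexdigest(), behavior_version)
        if key not in _cache:
            _cache[key] = fn(data)
        return _cache[key]
    return wrapper


calls: list[bytes] = []


def fake_embed(data: bytes) -> list[float]:
    """Stub embedding that records each real invocation."""
    calls.append(data)
    return [float(len(data))]


embed_v1 = cached_by_content(fake_embed, behavior_version=1)
embed_v1(b"abc")
embed_v1(b"abc")          # cache hit: fake_embed is not called again
embed_v2 = cached_by_content(fake_embed, behavior_version=2)
embed_v2(b"abc")          # new version: cached result is not reused
print(len(calls))         # 2
```

In this sketch, bumping the version number is what forces unchanged images to be recomputed after the embedding logic changes.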
embed_image is a custom function that uses the CLIP model to convert an image into a vector embedding. It accepts image data in bytes format and returns a list of floating-point numbers representing the image's embedding.

The function supports caching through the cache parameter. When enabled, the executor stores the function's results for reuse during reprocessing, which is particularly useful for computationally intensive operations. See the documentation for more details.

Then we process each image and collect the information.

with data_scope["images"].row() as img:
    img["embedding"] = img["content"].transform(embed_image)
    img_embeddings.collect(
        id=cocoindex.GeneratedField.UUID,
        filename=img["filename"],
        embedding=img["embedding"],
    )
2.3 Collect the embeddings

Export the embeddings to a collection in Qdrant.

img_embeddings.export(
    "img_embeddings",
    cocoindex.storages.Qdrant(
        collection_name="image_search",
        grpc_url=QDRANT_GRPC_URL,
    ),
    primary_key_fields=["id"],
    setup_by_user=True,
)
3. Query the index

Embed the query with CLIP, which maps both text and images into the same embedding space and so enables cross-modal similarity search.

def embed_query(text: str) -> list[float]:
    model, processor = get_clip_model()
    inputs = processor(text=[text], return_tensors="pt", padding=True)
    with torch.no_grad():
        features = model.get_text_features(**inputs)
    return features[0].tolist()
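Since text and image embeddings share one space, retrieval comes down to ranking image vectors by their similarity to the query vector. Here is a minimal pure-Python sketch of cosine-similarity ranking; the three-dimensional vectors and file names are made-up toy data, and in the tutorial this step is delegated to Qdrant rather than done by hand.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_images(query_vec: list[float], image_vecs: dict[str, list[float]], limit: int):
    """Return the `limit` most similar (filename, score) pairs, best first."""
    scored = [(name, cosine_similarity(query_vec, vec)) for name, vec in image_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]


# Toy data for illustration only; real CLIP vectors have hundreds of dimensions.
toy_images = {
    "elephant.jpg": [0.9, 0.1, 0.0],
    "car.jpg": [0.0, 1.0, 0.0],
    "zebra.jpg": [0.5, 0.5, 0.0],
}
top = rank_images([1.0, 0.0, 0.0], toy_images, limit=2)
print([name for name, _ in top])  # ['elephant.jpg', 'zebra.jpg']
```

A vector database performs the same kind of ranking, but with an index that avoids comparing the query against every stored vector.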
Define a FastAPI endpoint /search that performs semantic image search.

@app.get("/search")
def search(q: str = Query(..., description="Search query"), limit: int = Query(5, description="Number of results")):
    # Get the embedding for the query
    query_embedding = embed_query(q)
    
    # Search in Qdrant
    search_results = app.state.qdrant_client.search(
        collection_name="image_search",
        query_vector=("embedding", query_embedding),
        limit=limit
    )
    
This searches the Qdrant vector database for similar embeddings and returns the top limit results. The remainder of search() formats them:

    # Format results
    out = []
    for result in search_results:
        out.append({
            "filename": result.payload["filename"],
            "score": result.score
        })
    return {"results": out}
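A client could consume this endpoint roughly as sketched below. The helpers `build_search_url` and `image_urls` are hypothetical; the `/search` parameters, the `results` response shape, the `/img` static path and port 8000 are taken from the snippets in this post. The sample response body is invented for illustration.

```python
import json
from urllib.parse import quote, urlencode


def build_search_url(base_url: str, query: str, limit: int = 5) -> str:
    """Build the GET URL for the /search endpoint with URL-encoded parameters."""
    params = urlencode({"q": query, "limit": limit})
    return f"{base_url.rstrip('/')}/search?{params}"


def image_urls(response_body: str, base_url: str) -> list[str]:
    """Turn a /search JSON response into URLs under the /img static mount."""
    results = json.loads(response_body)["results"]
    return [f"{base_url.rstrip('/')}/img/{quote(r['filename'])}" for r in results]


url = build_search_url("http://localhost:8000", "cute animals")
print(url)  # http://localhost:8000/search?q=cute+animals&limit=5

# Invented sample response, shaped like the endpoint's return value.
sample = '{"results": [{"filename": "my elephant.jpg", "score": 0.31}]}'
print(image_urls(sample, "http://localhost:8000"))
# ['http://localhost:8000/img/my%20elephant.jpg']
```

Fetching `url` with any HTTP client while the backend is running would return the real results.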
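This endpoint enables semantic image search: users can find images by describing them in natural language rather than relying on exact keyword matches.

Application

FastAPI

from fastapi import FastAPI, Query
from fastapi.middleware.cors import CORSMiddleware
from fastapi.staticfiles import StaticFiles

app = FastAPI()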
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)
# Serve images from the 'img' directory at /img
app.mount("/img", StaticFiles(directory="img"), name="img")
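This sets up the FastAPI application with CORS middleware and static file serving. The app is configured to:

- Allow cross-origin requests from any origin
- Serve static image files from the 'img' directory
- Handle API endpoints for the image search functionality

from dotenv import load_dotenv
from qdrant_client import QdrantClient

@app.on_event("startup")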
def startup_event():
    load_dotenv()
    cocoindex.init()
    # Initialize Qdrant client
    app.state.qdrant_client = QdrantClient(
        url=QDRANT_GRPC_URL,
        prefer_grpc=True
    )
    app.state.live_updater = cocoindex.FlowLiveUpdater(image_object_embedding_flow)
    app.state.live_updater.start()
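The live updater started above keeps the index in sync as files change. As a conceptual illustration of "process only what changed", the sketch below compares two snapshots of a folder. This is a hypothetical stand-in, not how CocoIndex tracks changes (the post says it uses Postgres for data lineage); `snapshot` and `diff_snapshots` are names invented here.

```python
from pathlib import Path


def snapshot(root: str) -> dict[str, int]:
    """Map each file's relative path to its modification time in nanoseconds."""
    base = Path(root)
    return {
        str(p.relative_to(base)): p.stat().st_mtime_ns
        for p in base.rglob("*")
        if p.is_file()
    }


def diff_snapshots(old: dict[str, int], new: dict[str, int]) -> dict[str, list[str]]:
    """Classify files as added, changed or removed between two snapshots."""
    return {
        "added": sorted(new.keys() - old.keys()),
        "changed": sorted(k for k in new.keys() & old.keys() if new[k] != old[k]),
        "removed": sorted(old.keys() - new.keys()),
    }


before = {"a.jpg": 1, "b.jpg": 1}
after = {"a.jpg": 1, "b.jpg": 2, "c.png": 5}
print(diff_snapshots(before, after))
# {'added': ['c.png'], 'changed': ['b.jpg'], 'removed': []}
```

A polling loop would take a new snapshot at each refresh interval and re-embed only the files listed under "added" and "changed".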
The startup event handler initializes the application when it first starts up. Here is what each part does:

- load_dotenv(): loads environment variables from a .env file, which is useful for configuration such as API keys and URLs.
- cocoindex.init(): initializes the CocoIndex framework, setting up the necessary components and configuration.
- Qdrant client setup:
  - Creates a new QdrantClient instance.
  - Configures it to use the gRPC URL specified in environment variables.
  - Enables gRPC preference for better performance.
  - Stores the client in the FastAPI app state so it is accessible across requests.
- Live updater setup:
  - Creates a FlowLiveUpdater instance for image_object_embedding_flow.
  - This enables real-time updates to the image search index.
  - Starts the live updater to begin monitoring for changes.

This initialization ensures that all required components are properly configured and running when the application starts.

Frontend

You can check the frontend code here. We intentionally kept it simple and minimal to focus on the image search functionality.

Time to have fun!

Create a collection in Qdrant:

curl -X PUT 'http://localhost:6333/collections/image_search' \
-H 'Content-Type: application/json' \
-d '{
    "vectors": {
        "embedding": {
            "size": 768,
            "distance": "Cosine"
        }
    }
}'
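If you prefer Python over curl, the same request can be assembled with the standard library. This is a sketch under the same assumptions as the curl command (Qdrant's REST API on localhost:6333, a named vector "embedding" of size 768); `build_create_collection_request` is a hypothetical helper, and the request is only built here, not sent.

```python
import json
import urllib.request


def build_create_collection_request(
    base_url: str, collection: str, vector_name: str, size: int, distance: str = "Cosine"
) -> urllib.request.Request:
    """Build (but do not send) a PUT request that mirrors the curl command."""
    body = {"vectors": {vector_name: {"size": size, "distance": distance}}}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/collections/{collection}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


request = build_create_collection_request(
    "http://localhost:6333", "image_search", "embedding", 768
)
print(request.get_method(), request.full_url)
# PUT http://localhost:6333/collections/image_search

# To actually create the collection (requires a running Qdrant instance):
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```

The vector size must match the dimension of the embeddings the flow produces, which is why 768 appears both here and in the embedding function's return type.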
 
 
 
 
Set up the indexing flow:

cocoindex setup main.py

The flow is set up with a live updater, so you can add new files to the folder and they will be indexed within a minute.

Run the backend:

uvicorn main:app --reload --host 0.0.0.0 --port 8000

Run the frontend:

cd frontend
npm install
npm run dev

Go to http://localhost:5174 to search.

Now add another image to the img folder, for example the linked sample image, or any picture you like. Wait a minute for the new image to be processed and indexed.

If you want to monitor the indexing progress, you can view it in CocoInsight:

cocoindex server -ci main.py

Finally, we are constantly improving, and more features and examples are coming soon. If you love this article, please give us a star ⭐ on GitHub to help us grow. Thanks for reading!
