
[Part 5] Building a Vector Search API with FastAPI and PostgreSQL

ygtoken 2025. 3. 7. 14:56

📌 Overview

This post explains how to build a vector search API with FastAPI, PostgreSQL, and pgvector.

✅ Store vector data with pgvector

✅ Implement vector search as a REST API with FastAPI

✅ Use an AI model to convert text into embedding vectors and store them

 


🚀 1. FastAPI Project Setup

First, create the FastAPI project and install the required packages.

 

1️⃣ Create the FastAPI project directory

mkdir fastapi-vector-search
cd fastapi-vector-search

 

2️⃣ Set up a Python virtual environment (optional)

python3 -m venv venv
source venv/bin/activate  # macOS/Linux
venv\Scripts\activate     # Windows

 

3️⃣ Install FastAPI and the PostgreSQL client

pip install fastapi uvicorn psycopg2 sqlalchemy numpy

 

 


🚀 2. Creating the PostgreSQL Table

Create a table with a pgvector column so that the FastAPI application can store its data.

 

1️⃣ Connect to PostgreSQL

kubectl exec -it $(kubectl get pod -n database -l app.kubernetes.io/name=postgresql -o jsonpath="{.items[0].metadata.name}") -n database -- psql -U postgres -d ragdb

 

2️⃣ Create the vector table

CREATE TABLE IF NOT EXISTS embeddings (
    id SERIAL PRIMARY KEY,
    content TEXT,
    embedding vector(3)  -- stores 3-dimensional vectors (adjust for real use, e.g. 1536 dimensions)
);

 

✅ Verify that the table was created

\dt
          List of relations
 Schema |    Name    | Type  |  Owner
--------+------------+-------+----------
 public | embeddings | table | postgres
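Because the column is declared as vector(3), pgvector rejects vectors of any other length. The following is a minimal sketch of checking the dimension on the Python side and formatting a list in pgvector's bracketed text form. The names `EXPECTED_DIM` and `to_pgvector_literal` are hypothetical helpers for illustration and are not part of the original project.

```python
from typing import Sequence

# Assumption: matches the vector(3) column in the table definition above.
EXPECTED_DIM = 3


def to_pgvector_literal(values: Sequence[float], expected_dim: int = EXPECTED_DIM) -> str:
    """Validate the dimension and format values as a pgvector text literal."""
    if len(values) != expected_dim:
        raise ValueError(f"expected {expected_dim} dimensions, got {len(values)}")
    # pgvector's text input format is a bracketed, comma-separated list.
    return "[" + ",".join(str(float(v)) for v in values) + "]"


literal = to_pgvector_literal([0.1, 0.2, 0.3])
```

A string in this form can be passed as a query parameter and cast with `::vector` on the SQL side.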

 

 


🚀 3. Connecting FastAPI to PostgreSQL

Now connect the FastAPI application to PostgreSQL.

📌 Create main.py

from typing import List

import psycopg2
from fastapi import FastAPI, HTTPException, Query
from pydantic import BaseModel

app = FastAPI()

# PostgreSQL connection settings
DB_HOST = "localhost"
DB_PORT = "5432"
DB_NAME = "ragdb"
DB_USER = "postgres"
DB_PASSWORD = "postgresql"

# Open a new PostgreSQL connection
def get_db_connection():
    return psycopg2.connect(
        host=DB_HOST, port=DB_PORT,
        database=DB_NAME, user=DB_USER, password=DB_PASSWORD
    )

# Request body model
class VectorData(BaseModel):
    content: str
    embedding: List[float]

# Insert API
@app.post("/add_vector/")
def add_vector(data: VectorData):
    conn = get_db_connection()
    cur = conn.cursor()

    try:
        # psycopg2 sends the list as a PostgreSQL array; cast it to pgvector's vector type
        cur.execute(
            "INSERT INTO embeddings (content, embedding) VALUES (%s, %s::vector) RETURNING id;",
            (data.content, data.embedding)
        )
        new_id = cur.fetchone()[0]
        conn.commit()
        return {"message": "Vector added successfully", "id": new_id}

    except Exception as e:
        conn.rollback()
        raise HTTPException(status_code=500, detail=str(e))

    finally:
        cur.close()
        conn.close()

# Nearest-neighbor search API
# A list-valued query parameter must be declared with Query();
# otherwise FastAPI expects it in the request body.
@app.get("/search/")
def search_vector(query_vector: List[float] = Query(...)):
    conn = get_db_connection()
    cur = conn.cursor()

    try:
        # <-> is pgvector's Euclidean (L2) distance operator; the explicit cast is
        # required because the operator is not defined for plain arrays
        cur.execute(
            "SELECT content, embedding <-> %s::vector AS distance FROM embeddings ORDER BY distance LIMIT 1;",
            (query_vector,)
        )
        result = cur.fetchone()

    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))

    finally:
        cur.close()
        conn.close()

    # An empty table returns no row
    if result is None:
        raise HTTPException(status_code=404, detail="No vectors stored yet")
    return {"content": result[0], "distance": result[1]}
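The search endpoint orders rows by `<->`, pgvector's Euclidean (L2) distance operator, and keeps the closest one. As a rough illustration of what that query computes, here is an in-memory sketch in plain Python. The function names `l2_distance` and `nearest` and the sample rows are assumptions for illustration only. The real work is done inside PostgreSQL.

```python
import math
from typing import List, Sequence, Tuple


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(rows: List[Tuple[str, Sequence[float]]], query: Sequence[float]) -> Tuple[str, float]:
    """In-memory counterpart of ORDER BY embedding <-> query LIMIT 1."""
    if not rows:
        raise LookupError("no rows to search")
    content, embedding = min(rows, key=lambda row: l2_distance(row[1], query))
    return content, l2_distance(embedding, query)


# Hypothetical sample data mirroring the (content, embedding) columns.
sample_rows = [
    ("Hello, world!", [0.1, 0.2, 0.3]),
    ("far away", [5.0, 5.0, 5.0]),
]
best_content, best_distance = nearest(sample_rows, [0.2, 0.2, 0.2])
```

This linear scan is only meant to make the ordering logic concrete. It is not how pgvector executes the query.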

 

 


🚀 4. Running and Testing FastAPI

1️⃣ Run FastAPI

uvicorn main:app --host 0.0.0.0 --port 8000 --reload

✅ Once the server is running, open http://localhost:8000/docs to test the API through Swagger UI.

 


2️⃣ Test adding vector data

Test the POST /add_vector/ API

{
    "content": "Hello, world!",
    "embedding": [0.1, 0.2, 0.3]
}

 

✅ Example response

{
    "message": "Vector added successfully",
    "id": 1
}
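Besides Swagger UI, the same request can be prepared from a script. The sketch below builds the POST request with the standard library only, assuming the server is reachable at http://localhost:8000 as in the uvicorn command above. `build_add_vector_request` is a hypothetical helper name.

```python
import json
import urllib.request

# Assumption: the address used when starting uvicorn above.
BASE_URL = "http://localhost:8000"


def build_add_vector_request(content: str, embedding: list) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for /add_vector/."""
    payload = json.dumps({"content": content, "embedding": embedding}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/add_vector/",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_add_vector_request("Hello, world!", [0.1, 0.2, 0.3])

# To send it while the server is running:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```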

 

 


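3️⃣ Test vector search

FastAPI reads a list-valued query parameter from repeated keys, so pass each component of the query vector as its own query_vector parameter.

GET /search/?query_vector=0.2&query_vector=0.2&query_vector=0.2

✅ Example response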

{
    "content": "Hello, world!",
    "distance": 0.141421
}
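The search request and its expected distance can be reproduced from Python. This is a small sketch, assuming the endpoint takes query_vector as a list-valued query parameter (which FastAPI reads from repeated keys) and that the server runs at http://localhost:8000. `build_search_url` is a hypothetical helper name.

```python
import math
from urllib.parse import urlencode

# Assumption: the address used when starting uvicorn above.
BASE_URL = "http://localhost:8000"


def build_search_url(query_vector: list) -> str:
    """Encode the vector as repeated query_vector=... parameters."""
    # doseq=True expands the list into one key=value pair per element.
    return f"{BASE_URL}/search/?" + urlencode({"query_vector": query_vector}, doseq=True)


url = build_search_url([0.2, 0.2, 0.2])

# Euclidean distance between the stored and the query vector, rounded as in the response example.
expected_distance = round(math.dist([0.1, 0.2, 0.3], [0.2, 0.2, 0.2]), 6)
```

The computed value agrees with the distance in the example response, since the stored vector differs from the query by 0.1 in two of its three components.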

 

 


🚀 5. Generating Vectors with an AI Model

In practice, vector data is usually generated by an AI embedding model.

For example, OpenAI's text-embedding-ada-002 model converts text into a 1536-dimensional vector, so the embedding column must be declared as vector(1536) rather than the vector(3) used in the examples so far.

 

1️⃣ Install the OpenAI package and set the API key

pip install openai

 

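📌 Add an embedding function (update main.py)

from openai import OpenAI

# Requires openai>=1.0, where the older openai.Embedding.create call is no longer supported
client = OpenAI(api_key="your-openai-api-key")

def get_embedding(text: str) -> List[float]:
    response = client.embeddings.create(
        input=text,
        model="text-embedding-ada-002"
    )
    return response.data[0].embedding

@app.post("/add_text/")
def add_text(content: str):
    # text-embedding-ada-002 returns 1536 values, so the embedding column must be vector(1536)
    embedding = get_embedding(content)
    return add_vector(VectorData(content=content, embedding=embedding))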
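Hard-coding the API key in main.py is convenient for a demo but easy to leak. As one possible alternative, the key could be read from an environment variable. The sketch below assumes the variable is named OPENAI_API_KEY, and `load_openai_api_key` is a hypothetical helper, not part of the original code.

```python
import os
from typing import Mapping


def load_openai_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the API key from the environment, failing early if it is missing."""
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set")
    return key


# Example with an explicit mapping instead of the real environment.
demo_key = load_openai_api_key({"OPENAI_API_KEY": "sk-demo"})
```

The returned value can then be used wherever the key is currently written as a string literal.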

 

 


📌 6. Summary

✅ Built a vector search API with FastAPI, PostgreSQL, and pgvector

✅ Implemented APIs that store vectors in the database and search them

✅ Used the OpenAI API to convert text into vectors and store them

✅ The API can be tested through Swagger UI (http://localhost:8000/docs)

 

 
