
[Part 7] Optimizing FastAPI Vector Search Performance (Applying pgvector Indexes and Auto Scaling)

ygtoken 2025. 3. 7. 15:01

📌 Overview

 

This post covers how to optimize large-scale vector search performance with FastAPI + PostgreSQL + pgvector.

✅ Faster search using pgvector's HNSW (Hierarchical Navigable Small World) index

✅ Handling large volumes of vector data and optimizing search

✅ Applying Auto Scaling to FastAPI on Kubernetes

 


🚀 1. Applying an HNSW Index to Optimize pgvector Performance

 

pgvector supports several distance measures for vector search: L2 (Euclidean) distance with the <-> operator, cosine distance with <=>, and inner product with <#> (which returns the negative inner product).

An HNSW (Hierarchical Navigable Small World) index keeps searches fast even on large datasets. It is an approximate nearest-neighbor index, so it trades a small amount of recall for speed.
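To make the three distance measures concrete, here is a minimal pure-Python sketch of what each pgvector operator computes. The stored vector `row` is a hypothetical value chosen for illustration; it is not taken from the table used in this post.

```python
import math


def l2_distance(a, b):
    """Euclidean distance: the value pgvector's <-> operator returns."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def negative_inner_product(a, b):
    """Negative dot product: the value pgvector's <#> operator returns."""
    return -sum(x * y for x, y in zip(a, b))


def cosine_distance(a, b):
    """1 - cosine similarity: the value pgvector's <=> operator returns."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


query = [0.2, 0.2, 0.2]
row = [0.1, 0.2, 0.3]  # hypothetical stored embedding

print(l2_distance(query, row))
print(negative_inner_product(query, row))
print(cosine_distance(query, row))
```

For all three operators a smaller value means a closer match, which is why pgvector queries sort in ascending order.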

 

1️⃣ Checking the Existing Vector Table

 

First, list the tables in psql to confirm which one currently stores the vector data.

\dt

 

✅ Example output

           List of relations
 Schema |    Name    | Type  |  Owner
--------+------------+-------+----------
 public | embeddings | table | postgres

 

📌 Inspect the current table structure

\d embeddings

 

✅ Example output

  Column   |   Type    | Collation | Nullable |                Default
-----------+-----------+-----------+----------+----------------------------------------
 id        | integer   |           | not null | nextval('embeddings_id_seq'::regclass)
 content   | text      |           |          |
 embedding | vector(3) |           |          |
Indexes:
    "embeddings_pkey" PRIMARY KEY, btree (id)

 

 


2️⃣ Applying the HNSW Index

 

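An HNSW index can greatly improve vector search performance. The statement below builds one on the embedding column with the vector_l2_ops operator class, which serves L2 distance (<->) queries.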

CREATE INDEX embeddings_hnsw ON embeddings USING hnsw (embedding vector_l2_ops);
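The operator class has to match the operator used in queries, and pgvector also accepts the HNSW build parameters `m` and `ef_construction` (defaults 16 and 64). As a rough sketch, a small helper could generate the statement for each metric. The helper name and the index naming scheme are hypothetical, not part of pgvector.

```python
# Operator class per metric; each one pairs with a pgvector query operator.
OPERATOR_CLASSES = {
    "l2": "vector_l2_ops",          # pairs with <->
    "cosine": "vector_cosine_ops",  # pairs with <=>
    "ip": "vector_ip_ops",          # pairs with <#>
}


def build_hnsw_index_sql(table, column, metric="l2", m=16, ef_construction=64):
    """Build a CREATE INDEX statement for a pgvector HNSW index.

    Hypothetical helper. Names are interpolated into SQL, so only plain
    identifiers are accepted.
    """
    for name in (table, column):
        if not name.isidentifier():
            raise ValueError(f"unsafe identifier: {name!r}")
    opclass = OPERATOR_CLASSES[metric]
    index_name = f"{table}_{column}_hnsw_{metric}"  # hypothetical naming scheme
    return (
        f"CREATE INDEX {index_name} ON {table} "
        f"USING hnsw ({column} {opclass}) "
        f"WITH (m = {int(m)}, ef_construction = {int(ef_construction)});"
    )


print(build_hnsw_index_sql("embeddings", "embedding"))
print(build_hnsw_index_sql("embeddings", "embedding", metric="cosine"))
```

Higher `m` and `ef_construction` values generally improve recall at the cost of build time and memory, so the defaults are a reasonable starting point.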

 

✅ Verify that the index was created

SELECT indexname, indexdef FROM pg_indexes WHERE tablename = 'embeddings';

 

✅ Example output

    indexname    |                                     indexdef
-----------------+---------------------------------------------------------------------------------
 embeddings_pkey | CREATE UNIQUE INDEX embeddings_pkey ON embeddings USING btree (id)
 embeddings_hnsw | CREATE INDEX embeddings_hnsw ON embeddings USING hnsw (embedding vector_l2_ops)

📌 Once you have confirmed that the HNSW index exists, move on to testing search performance.
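An index that exists is not necessarily used: on a very small table the PostgreSQL planner may still choose a sequential scan. One way to check is to run EXPLAIN on the search query and look for an index scan node. The sketch below only inspects plan text. The helper name is hypothetical, and the sample plan lines are hand-written and simplified for illustration, not captured from a real run.

```python
def uses_index(plan_lines, index_name):
    """Return True if any EXPLAIN plan line is an index scan on index_name."""
    needle = f"Index Scan using {index_name} "
    return any(needle in line for line in plan_lines)


# Statement to run in psql or through a database driver.
EXPLAIN_SQL = (
    "EXPLAIN SELECT content FROM embeddings "
    "ORDER BY embedding <-> '[0.2, 0.2, 0.2]' LIMIT 5;"
)

# Hand-written, simplified plan lines (assumption, not real output).
index_plan = [
    "Limit",
    "  ->  Index Scan using embeddings_hnsw on embeddings",
    "        Order By: (embedding <-> '[0.2,0.2,0.2]'::vector)",
]
seq_plan = [
    "Limit",
    "  ->  Sort",
    "        ->  Seq Scan on embeddings",
]

print(uses_index(index_plan, "embeddings_hnsw"))
print(uses_index(seq_plan, "embeddings_hnsw"))
```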

 


🚀 2. Optimizing Search over Large-Scale Vector Data

 

Next, run search queries that take advantage of the HNSW index.

 

📌 Nearest-neighbor search using the HNSW index

SELECT content, embedding <-> '[0.2, 0.2, 0.2]' AS distance
FROM embeddings
ORDER BY distance
LIMIT 5;

 

✅ Example output

      content       | distance  
--------------------+-----------
 Hello, world!      | 0.141421
 How are you?       | 0.173205
 This is PostgreSQL | 0.500000
(3 rows)
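Inside a FastAPI handler the query vector arrives as a list of floats rather than a hard-coded literal. The sketch below shows one way to turn it into a parameterized query. The function names are hypothetical, and the `%s` placeholders assume a psycopg-style driver; other drivers use a different placeholder syntax.

```python
def to_vector_literal(values):
    """Format numbers as pgvector text input, e.g. '[0.2,0.2,0.2]'."""
    return "[" + ",".join(repr(float(v)) for v in values) + "]"


def nearest_neighbors_query(query_vector, limit=5):
    """Return (sql, params) for an L2 nearest-neighbor search.

    The vector is passed as a bound parameter and cast to vector in SQL,
    so user input is never concatenated into the statement.
    """
    sql = (
        "SELECT content, embedding <-> %s::vector AS distance "
        "FROM embeddings ORDER BY distance LIMIT %s"
    )
    return sql, (to_vector_literal(query_vector), int(limit))


sql, params = nearest_neighbors_query([0.2, 0.2, 0.2])
print(sql)
print(params)
```

With psycopg the pair would be passed as `cursor.execute(sql, params)`.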

📌 Cosine similarity search

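-- <=> is the cosine distance operator (<#> is negative inner product); similarity = 1 - distance
-- For this query to use an HNSW index, build one with vector_cosine_ops
SELECT content, 1 - (embedding <=> '[0.2, 0.2, 0.2]') AS similarity
FROM embeddings
ORDER BY embedding <=> '[0.2, 0.2, 0.2]'
LIMIT 5;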

 

✅ Example output

      content       | similarity  
--------------------+------------
 How are you?       | 0.98
 Hello, world!      | 0.95
 This is PostgreSQL | 0.80
(3 rows)

 

 


🚀 3. Applying FastAPI Auto Scaling on Kubernetes

 

1️⃣ Enabling the Kubernetes HPA (Horizontal Pod Autoscaler)

 

Configure FastAPI to scale out automatically when it has to handle a high volume of requests.

 

📌 Install Metrics Server, which the HPA needs for resource metrics

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

 

📌 Create autoscaling.yaml for the FastAPI HPA

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: fastapi-vector-search-hpa
  namespace: fastapi
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: fastapi-vector-search
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70

 

📌 Apply the HPA

kubectl apply -f autoscaling.yaml -n fastapi

 

✅ Check the HPA status

kubectl get hpa -n fastapi

 

✅ Example output

NAME                        REFERENCE                        TARGETS   MINPODS   MAXPODS   REPLICAS
fastapi-vector-search-hpa   Deployment/fastapi-vector-search 50%/70%   1         5         2

This confirms that the HPA scales the FastAPI Pods automatically in response to traffic: here it is running 2 replicas at 50% average CPU utilization against the 70% target.
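The replica count comes from the scaling rule in the Kubernetes documentation: desired replicas = ceil(current replicas × current metric / target metric), skipped when the ratio is within a tolerance (0.1 by default) and clamped to minReplicas and maxReplicas. The function below is a simplified sketch of that rule with a hypothetical name. It ignores stabilization windows, pod readiness, and missing metrics. CPU utilization is measured against each Pod's CPU request, so the Deployment's containers need resources.requests.cpu set for this metric to work.

```python
import math


def desired_replicas(current_replicas, current_utilization, target_utilization,
                     min_replicas=1, max_replicas=5, tolerance=0.1):
    """Simplified HPA rule: ceil(current * ratio), clamped to [min, max]."""
    ratio = current_utilization / target_utilization
    # Within the tolerance band the HPA leaves the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))


# Values from the example output: 2 replicas at 50% against a 70% target.
print(desired_replicas(2, 50, 70))
# A hypothetical load spike to 140% average CPU utilization.
print(desired_replicas(2, 140, 70))
```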

 


📌 4. Summary

 

✅ Optimized vector search performance with an HNSW index

✅ Ran vector searches using cosine similarity and L2 distance

✅ Applied FastAPI Auto Scaling with the Kubernetes HPA

✅ Greatly improved vector search performance, enabling fast search even over large volumes of data

 

 
