infocepo.com — Cloud, IA & Labs
Architecture de l'écosystème API IA & Cloud — April 2026
Kubernetes Cluster (Traefik Ingress + cert-manager)
Traefik Ingress
TLS · Route · Auth
↓
API LLM
api.ailab.infocepo.com
API STT
api-audio2txt.ailab.infocepo.com
API TTS
api-txt2audio.ailab.infocepo.com
API TXT2IMAGE
OpenDalle
API Realtime
WebSocket / WebRTC
API Summary
Embeddings
bge-m3
ChromaDB
Vector DB
Image-to-Text
ai-vision / OCR
Diarization
MediaWiki
infocepo.com/wiki
↓
LiteLLM Proxy
Keys · Quotas · Logs
↓
vLLM
Prod — Qwen3.6, gemma4
Ollama
Dev — rapid prototyping
Whisper 3
STT engine
Kokoro/OmniVoice
TTS engine
OpenDalle
Image gen
↓
S3 Storage
s3.ailab.infocepo.com
Container Registry
registry.ailab.infocepo.com
MariaDB
bitnami_mediawiki
Keycloak
SSO / Auth
Uptime Kuma
Monitoring
Frontend
Backend API
Model/DB
Cloud Service
Security
Realtime
Stack IA
- 12+ APIs OpenAI-compatible
- LiteLLM proxy (clés, quotas, logs)
- Qwen3.6, gemma4, whisper3-turbo
- RAG optimisé (BAAI/bge-m3, LightRAG)
- Realtime AI: WebSocket/WebRTC
Infrastructure
- Kubernetes (Traefik + cert-manager)
- vLLM en prod, Ollama en dev
- S3 + Container Registry
- MariaDB + ChromaDB vectoriel
- Keycloak SSO, Uptime Kuma
Services
- LLM · STT · TTS · Image2Text
- Summary · Embeddings · Diarization
- TXT2IMAGE · Realtime API
- MediaWiki (infocepo.com/wiki)
- Cloud Lab & audit (ServerDiff.sh)
infocepo.com Cloud, IA & Labs — Architecture diagram generated April 2026