Mangum (or a similar ASGI adapter) wraps FastAPI so it can handle Lambda’s event format; container images allow larger dependencies (up to 10 GB, versus 250 MB unzipped for zip packages) and a closer match to local dev. Both approaches need disciplined dependency slimming and memory-size (power) tuning.

[Figure: cloud dashboard showing function metrics relevant to API latency.]
Tie alarms to cold-start count and duration, not just errors, when user-facing routes live on Lambda.
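One way to get at cold-start numbers is to read them out of the function's own logs. The sketch below assumes the standard Lambda `REPORT` log line, which includes an `Init Duration` field only on invocations that initialized a new execution environment. The function name `summarize_cold_starts` and the sample lines are made up for illustration.

```python
import re

# Matches the start of a Lambda REPORT line and captures the invoke duration.
REPORT_RE = re.compile(r"^REPORT RequestId: (\S+)\s+Duration: ([\d.]+) ms")
# "Init Duration" is only present when the invocation was a cold start.
INIT_RE = re.compile(r"Init Duration: ([\d.]+) ms")

# Illustrative sample data, not real log output.
SAMPLE = [
    "REPORT RequestId: aaa-111\tDuration: 35.20 ms\tBilled Duration: 36 ms\t"
    "Memory Size: 512 MB\tMax Memory Used: 120 MB\tInit Duration: 412.50 ms",
    "REPORT RequestId: bbb-222\tDuration: 8.10 ms\tBilled Duration: 9 ms\t"
    "Memory Size: 512 MB\tMax Memory Used: 121 MB",
    "START RequestId: ccc-333 Version: $LATEST",
]


def summarize_cold_starts(lines):
    """Count invocations and cold starts found in REPORT log lines."""
    invocations = 0
    init_ms = []
    for line in lines:
        if not REPORT_RE.match(line):
            continue  # skip START/END lines and application output
        invocations += 1
        init = INIT_RE.search(line)
        if init:
            init_ms.append(float(init.group(1)))
    return {
        "invocations": invocations,
        "cold_starts": len(init_ms),
        "cold_start_rate": len(init_ms) / invocations if invocations else 0.0,
        "max_init_ms": max(init_ms, default=0.0),
    }


if __name__ == "__main__":
    print(summarize_cold_starts(SAMPLE))
```

A summary like this could feed a custom metric that an alarm watches; the alarm thresholds themselves depend on the latency budget of the route.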

Operational musts

  • Warm critical routes with provisioned concurrency if p99 matters for user-facing auth or checkout flows.
  • Externalize large ML models to EFS, or lazy-download them on first invoke, so they don’t bloat the deployment package and every cold start.
  • Run CI integration tests against the same Dockerfile stage you deploy to Lambda to avoid “works on my laptop.”
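The lazy-download point can be sketched as a module-level cache: the model is fetched once per execution environment and reused on warm invocations. Everything named here (`MODEL_PATH`, `fetch_model`, `get_model`, the fake weights) is hypothetical; a real `fetch_model` would pull from S3 or read from an EFS mount.

```python
import os
import tempfile
from pathlib import Path
from typing import Callable, Optional

# Assumed location: the temp directory is /tmp on Lambda, its writable
# scratch space. An EFS mount path could be supplied via MODEL_PATH instead.
MODEL_PATH = Path(
    os.environ.get("MODEL_PATH", str(Path(tempfile.gettempdir()) / "model.bin"))
)

# Module-level state survives across warm invocations of the same environment.
_model: Optional[bytes] = None


def fetch_model(dest: Path) -> None:
    """Placeholder download step; replace with an S3 or HTTP fetch."""
    dest.parent.mkdir(parents=True, exist_ok=True)
    dest.write_bytes(b"fake-model-weights")


def get_model(fetch: Callable[[Path], None] = fetch_model) -> bytes:
    """Return the cached model, fetching and loading it only when needed."""
    global _model
    if _model is None:
        if not MODEL_PATH.exists():
            fetch(MODEL_PATH)
        _model = MODEL_PATH.read_bytes()
    return _model


def handler(event, context):
    # Only routes that need the model pay the load cost, and only once
    # per execution environment.
    model = get_model()
    return {"statusCode": 200, "body": f"model bytes: {len(model)}"}
```

The trade-off is that the first request to hit a fresh environment still pays the download, so this pairs naturally with provisioned concurrency on latency-sensitive routes.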

If traffic steadies, a small Fargate service behind an ALB may cost less, and handle long-lived connections more simply, than pushing every edge case into Lambda.
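A rough way to check the cost side of that claim is to compare monthly compute charges under both models. The sketch below is back-of-the-envelope arithmetic only: the rates in the usage example are assumptions that approximate published on-demand prices and should be replaced with current figures for your region, and it ignores ALB, data transfer, and free-tier effects.

```python
def lambda_monthly_cost(requests, avg_ms, memory_gb,
                        price_per_gb_second, price_per_million_requests):
    """Lambda compute cost: billed GB-seconds plus a per-request charge."""
    gb_seconds = requests * (avg_ms / 1000.0) * memory_gb
    request_cost = requests / 1_000_000 * price_per_million_requests
    return gb_seconds * price_per_gb_second + request_cost


def fargate_monthly_cost(tasks, vcpu, memory_gb,
                         price_per_vcpu_hour, price_per_gb_hour, hours=730):
    """Fargate compute cost for always-on tasks (ALB charges not included)."""
    hourly = vcpu * price_per_vcpu_hour + memory_gb * price_per_gb_hour
    return tasks * hours * hourly


if __name__ == "__main__":
    # All numbers below are illustrative assumptions, not quoted prices.
    lam = lambda_monthly_cost(
        requests=50_000_000, avg_ms=120, memory_gb=1.0,
        price_per_gb_second=0.0000166667, price_per_million_requests=0.20,
    )
    far = fargate_monthly_cost(
        tasks=2, vcpu=0.5, memory_gb=1.0,
        price_per_vcpu_hour=0.04048, price_per_gb_hour=0.004445,
    )
    print(f"Lambda:  ${lam:,.2f}/month")
    print(f"Fargate: ${far:,.2f}/month")
```

Because Lambda cost scales with request volume while Fargate cost scales with the number of always-on tasks, steady high traffic tends to favor the fixed-size service, and spiky or low traffic tends to favor Lambda.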