Skip to content

Infrastructure Recommendations

Deployment guidelines for hosting the AI Dashboard application across cloud environments.


1. Stateless API Node Sizing

The FastAPI gateway handles compute-heavy tasks like schema serialization, Pydantic validations, and real-time SSE streaming generation.

Sizing Baseline

  • CPU: Since multi-stage agent workflows run within localized Python async execution loops, assign multi-core instances (minimum 2 vCPUs) to prevent stream bottlenecks.
  • Memory: Base workers need sufficient memory buffers (minimum 2GB RAM per running container) to process intermediate Pandas DataFrames efficiently.

2. Horizontal Scaling Guidelines

Because generation streams emit directly over open network connections, routing needs careful infrastructure configuration:

graph TD
    Client[Client Connection] --> LB[Load Balancer]

    subgraph "Stateless Application Nodes"
        LB --> Node1[FastAPI Gateway Instance 1]
        LB --> Node2[FastAPI Gateway Instance 2]
    end

    Node1 --> DB[(Shared MongoDB Cluster)]
    Node2 --> DB

Routing Considerations

  • Session Restoration: Because final dashboard configurations are persisted centrally inside MongoDB, subsequent refinement operations (POST /dashboard/refine) can route to any healthy stateless gateway instance safely.
  • Connection Throughput: Ensure downstream load balancers have generous request timeouts configured (e.g., 30+ seconds) to prevent dropping open SSE streaming sockets during complex agent generations.