How we reduced Lambda cold starts at ACG
Cold starts are the tax you pay for Lambda's automatic scaling. At A Cloud Guru, our original GraphQL architecture amplified this tax significantly — a single page load could trigger four or five simultaneous cold starts. Here is how we fixed it.
The original architecture
Our GraphQL gateway resolved queries by directly invoking Lambda functions for each microservice. A single page load that required data from four services would invoke four Lambdas in parallel. If none of those Lambdas had been invoked in roughly the last 15 minutes, all four would cold start simultaneously. We measured this as a five-second delay for users hitting that page, which was completely unacceptable.
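A rough model shows why the fan-out amplifies the problem. This is only an illustrative sketch: the `Invocation` type, the latency figures, and the per-function cold probability are hypothetical, not ACG measurements. It assumes the gateway waits for every parallel call, so the page is as slow as its slowest call, and that each function is cold independently of the others.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Invocation:
    service: str
    warm_ms: int        # hypothetical latency when the function is warm
    cold_start_ms: int  # hypothetical extra initialisation time when cold
    is_cold: bool

    @property
    def latency_ms(self) -> int:
        return self.warm_ms + (self.cold_start_ms if self.is_cold else 0)


def page_latency_ms(invocations: List[Invocation]) -> int:
    # Parallel calls finish when the slowest one does.
    return max(call.latency_ms for call in invocations)


def prob_any_cold(p_cold_each: float, n_functions: int) -> float:
    # Chance that at least one of n independent functions is cold.
    return 1.0 - (1.0 - p_cold_each) ** n_functions
```

Under these assumptions, if any one function has a 10% chance of being cold, a page that fans out to four functions hits at least one cold start about 34% of the time, and a single cold call is enough to slow the whole page.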
The architecture also had a coupling problem: the GraphQL schema lived only on the gateway. Changing a type required two separate deployments — one to the service and one to the gateway — and tight coordination between teams.
Solution: schema stitching
Instead of the gateway invoking Lambda functions directly, we implemented schema stitching. Each microservice became responsible for its own schema definition and exposed it through a GraphQL endpoint over HTTP. The gateway stitched these schemas together at startup and delegated query resolution to the appropriate service over HTTP rather than through the Lambda Invoke API.
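The routing idea behind stitching can be sketched in a few lines. This is a toy model, not our gateway code and not a real stitching library: `RemoteService` and `Gateway` are hypothetical names, the in-process callables stand in for HTTP requests to each service's GraphQL endpoint, and only top-level query fields are handled.

```python
from typing import Callable, Dict, List


class RemoteService:
    """Hypothetical stand-in for one microservice's GraphQL endpoint."""

    def __init__(self, name: str, resolvers: Dict[str, Callable[[], object]]):
        self.name = name
        self._resolvers = resolvers

    def introspect(self) -> List[str]:
        # A real service would answer an introspection query over HTTP.
        return sorted(self._resolvers)

    def execute(self, field: str) -> object:
        # A real gateway would POST the sub-query to the service here.
        return self._resolvers[field]()


class Gateway:
    def __init__(self, services: List[RemoteService]):
        # Built once at startup: map each root field to the service owning it.
        self.routes: Dict[str, RemoteService] = {}
        for service in services:
            for field in service.introspect():
                if field in self.routes:
                    owner = self.routes[field].name
                    raise ValueError(
                        f"field {field!r} is defined by both {owner} and {service.name}"
                    )
                self.routes[field] = service

    def query(self, fields: List[str]) -> Dict[str, object]:
        # Delegate each requested field to the service that owns it.
        return {field: self.routes[field].execute(field) for field in fields}
```

In this sketch the gateway holds only a routing table derived from what the services publish, so a service can change its own fields without a matching change to gateway code.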
This had two important effects. First, each microservice now had a single Lambda function handling all of its API traffic rather than one Lambda per operation. Because that one function is invoked more often, it is more likely to stay warm, which means fewer cold starts per service. Second, the gateway no longer needed to own the schema definitions; it only stitched together what each service published, so services owned and deployed their own schemas independently.
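The utilisation argument can be made concrete with a small counting sketch. It is deliberately simplified and rests on stated assumptions: a function is cold only on its first invocation, and concurrency and idle expiry are ignored. The service and operation names are hypothetical.

```python
from typing import Callable, Iterable, Tuple

Request = Tuple[str, str]  # (service, operation)


def count_cold_starts(
    requests: Iterable[Request], function_for: Callable[[Request], str]
) -> int:
    warm = set()  # functions that have already been invoked once
    cold_starts = 0
    for request in requests:
        function_name = function_for(request)
        if function_name not in warm:
            cold_starts += 1
            warm.add(function_name)
    return cold_starts


def per_operation(request: Request) -> str:
    # Old layout: one Lambda per operation.
    return f"{request[0]}-{request[1]}"


def per_service(request: Request) -> str:
    # New layout: one Lambda per service.
    return request[0]


traffic = [("courses", "list"), ("courses", "get"), ("users", "get"), ("courses", "list")]
print(count_cold_starts(traffic, per_operation))
print(count_cold_starts(traffic, per_service))
```

For the same traffic, the per-service layout touches fewer distinct functions, so fewer of the requests land on a function that has not been invoked yet.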
Results
Performance testing showed a 20% reduction in p99 latency on cold paths, from five seconds down to four seconds. More meaningfully, we estimated that the architecture change saved approximately 60 engineer-hours per month by eliminating the double-deployment requirement for schema changes.
Cold starts didn't disappear, but they became a problem for individual services to optimise rather than an unavoidable cascade on every page load.