Serverless architecture is a cloud execution model where the provider dynamically manages infrastructure allocation, running application code only in response to events and charging solely for actual compute time consumed.
In a typical serverless setup, an API Gateway sits at the edge of the system, receiving HTTP requests from clients and routing them to the appropriate function. Each function is a stateless unit of business logic — deployed independently and scaled to zero when idle. When a request arrives, the provider cold-starts a container to execute the function, or reuses a warm instance if one is available. This elastic scaling is automatic: warm invocations begin within milliseconds, while a cold start adds noticeably more latency.
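As a concrete sketch, a minimal gateway-facing handler might look like the following. The `handler(event, context)` signature and the `queryStringParameters` field follow the AWS Lambda proxy-integration convention, but the names and payload here are illustrative:

```python
import json

def handler(event, context):
    # Everything the function needs arrives in the event; nothing is kept
    # between invocations, so any warm instance can serve any request.
    params = event.get("queryStringParameters") or {}
    name = params.get("name", "world")
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"hello, {name}"}),
    }
```

Because the handler is a plain function of its input event, it can be exercised locally with a dict before it is ever deployed.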
Functions commonly interact with several downstream services:
- Object storage (e.g., S3) for reading and writing files or configuration
- Managed databases (e.g., DynamoDB, Aurora Serverless) for persistent state
- Message queues or event buses to trigger asynchronous downstream processing
- External APIs for third-party integrations
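A sketch of a function touching two of these services is below. The `storage` and `table` clients are injected rather than constructed inside the function, and their `get_object`/`put_item` methods are simplified stand-ins loosely modelled on SDK shapes, not real signatures:

```python
def process_upload(storage, table, bucket: str, key: str) -> dict:
    """Read an uploaded object, derive a record, persist it downstream."""
    body = storage.get_object(bucket, key)          # read the uploaded file
    record = {"pk": key, "size_bytes": len(body)}   # derive a persistent record
    table.put_item(record)                          # write it to the table
    return record
```

Injecting the clients keeps the business logic testable locally with in-memory fakes, while the deployed function wires in real SDK clients at module scope.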
An authorization layer — often a dedicated authorizer function or a JWT verification step inside the gateway — validates tokens before business logic runs, keeping security concerns separate. An event source mapping allows functions to also be triggered by queue messages, storage events, or scheduled rules rather than only HTTP calls.
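To make the token-validation step concrete, here is a minimal sketch of HS256 JWT signing and verification using only the standard library. A real authorizer would also validate `exp`, `aud`, and `iss` claims and would use a vetted JWT library rather than hand-rolled parsing:

```python
import base64, hashlib, hmac, json

def _b64url(data: bytes) -> str:
    # JWTs use unpadded URL-safe base64
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def _unb64url(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))

def sign_hs256(claims: dict, secret: bytes) -> str:
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url(json.dumps(claims).encode())
    sig = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url(sig)}"

def verify_hs256(token: str, secret: bytes):
    """Return the claims if the signature checks out, else None."""
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
    except ValueError:
        return None  # malformed token
    expected = hmac.new(secret, f"{header_b64}.{payload_b64}".encode(),
                        hashlib.sha256).digest()
    # Constant-time comparison avoids leaking signature bytes via timing
    if not hmac.compare_digest(expected, _unb64url(sig_b64)):
        return None
    return json.loads(_unb64url(payload_b64))
```

An authorizer function would call something like `verify_hs256` on the bearer token and reject the request before the handler runs, so business logic never sees unauthenticated traffic.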
The diagram shows the synchronous HTTP path from client through API Gateway to a handler function, followed by reads and writes to downstream services, and the asynchronous path where a function publishes events to a queue that triggers further processing. Compare with Auto Scaling Workflow for VM-based scaling patterns, or Cloud Load Balancing for request distribution across long-running services. For access control wiring, see Cloud IAM Permission Model.
Developers deploy individual functions, small units of business logic, without managing servers, operating systems, or scaling configuration; the provider allocates infrastructure on demand, runs code only in response to events, and bills solely for the compute time actually consumed.
A cold start occurs when a serverless function is invoked but no warm container is available — the provider must allocate a new execution environment, download the function package, and initialise the runtime before the handler runs. Cold starts add latency ranging from hundreds of milliseconds to several seconds depending on the runtime and package size. Mitigation strategies include provisioned concurrency, reducing package size, choosing faster-starting runtimes (e.g., Node.js or Python rather than the JVM), and keeping functions warm with scheduled pings.
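One mitigation is worth illustrating: code at module scope runs once per execution environment, during the cold start, and is then reused by every warm invocation. Keeping expensive setup (SDK clients, parsed config, connection pools) there amortises its cost. A minimal sketch:

```python
import time

# Runs once per execution environment (the cold start); warm invocations
# reuse these objects instead of rebuilding them. CONFIG is a stand-in for
# real initialisation work such as creating SDK clients.
_ENV_STARTED = time.monotonic()
CONFIG = {"feature_flags": {"beta": True}}

def handler(event, context):
    # env_age_s grows on a warm instance and resets to ~0 after a cold
    # start -- logging it is a cheap way to observe container reuse.
    return {
        "env_age_s": time.monotonic() - _ENV_STARTED,
        "beta_enabled": CONFIG["feature_flags"]["beta"],
    }
```

The flip side of this reuse is the stateless contract discussed below: module-level objects may survive between invocations, but the platform gives no guarantee that they will.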
Serverless offers automatic scaling to zero, no infrastructure management, and per-invocation billing — ideal for event-driven, spiky, or infrequent workloads. Containers offer consistent performance, no cold starts, full control over the runtime environment, and lower per-unit cost at sustained high throughput. Containers are better suited to long-running services, WebSocket connections, or workloads that outgrow function timeout limits.
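The cost trade-off can be made concrete with back-of-envelope arithmetic. The default prices below are illustrative placeholders in the spirit of published pay-per-use rates, not quoted figures:

```python
def serverless_monthly_cost(invocations: int, ms_per_invocation: float,
                            gb_memory: float,
                            price_per_gb_second: float = 0.0000166667,
                            price_per_million_requests: float = 0.20) -> float:
    """Back-of-envelope function bill: compute (GB-seconds) plus requests.
    Prices are illustrative defaults, not actual provider rates."""
    gb_seconds = invocations * (ms_per_invocation / 1000.0) * gb_memory
    return (gb_seconds * price_per_gb_second
            + invocations / 1_000_000 * price_per_million_requests)
```

The bill scales linearly with traffic, while a reserved container's cost is roughly flat; the crossover point where containers become cheaper is exactly the "sustained high throughput" regime described above.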
Use serverless for event-driven workloads (file processing, queue consumers, webhooks), scheduled tasks, APIs with variable or unpredictable traffic, and prototypes where operational simplicity matters more than cost optimisation at scale. Avoid serverless for latency-sensitive APIs where cold starts are unacceptable, long-running background workers, or workloads with sustained high concurrency where reserved capacity is cheaper.
Putting too much business logic into a single large function defeats the purpose of serverless — functions should be small and single-purpose. Not configuring function timeouts and concurrency limits allows runaway functions to exhaust quotas or run up costs. Storing state between invocations in memory or local disk violates the stateless contract and causes unpredictable failures. Missing dead-letter queue configuration causes silent event loss when a function repeatedly fails to process a message.
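The dead-letter pitfall can be illustrated with a small sketch. Here `dlq` is just a list standing in for a real dead-letter queue, and the retry loop is a simplification of what managed event sources do before redirecting a message:

```python
def consume(message, process, dlq, max_attempts=3):
    """Retry a message a bounded number of times; on final failure, capture
    it in the dead-letter queue instead of silently dropping it."""
    for attempt in range(1, max_attempts + 1):
        try:
            return process(message)
        except Exception as exc:
            if attempt == max_attempts:
                dlq.append({"message": message,
                            "error": str(exc),
                            "attempts": attempt})
                return None
```

Without the `dlq.append` branch, a message that fails every attempt simply disappears — which is exactly the silent event loss the pitfall describes.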