Skip to content

About

Waiting Room queues visitors when your traffic approaches a previously defined threshold that might otherwise bring an application down.

Waiting Room process flow showing how a request is managed by Cloudflare and placed in a waiting room before reaching the origin website

User flow

Once you have created and activated a waiting room for a specific application page:

  • If a page is not experiencing heavy traffic, a visitor accesses the page directly.

  • If page traffic approaches a user-defined threshold, a visitor enters a virtual waiting room until it is their turn to access the page:

    • Each user receives a cookie to manage the dynamic outflow of requests from the waiting room to the origin website in First In First Out (FIFO) order.
    • While in the waiting room, the user’s browser automatically refreshes every 20 seconds to give them updated information about their estimated wait time.
    • When a user exits the waiting room and reaches your application, they can leave and re-enter without waiting for the length of time specified by the session duration.
    • Because waiting rooms support dynamic inflow and outflow, new spots appear more quickly and estimated wait times are lower and more accurate.

Architecture

Waiting Room is built on Workers that runs across a global network of Cloudflare data centers.

When a request comes to a host or path covered by a Waiting Room, that request goes to a Waiting Room Worker in the closest geographic data center. The Worker then needs to make a decision: whether to send users to the queue or the website.

That decision itself depends on two factors: admin-defined thresholds and the Waiting Room state.

For admin-defined thresholds, the two measures that matter are total active users and new users per minute:

  • total active users is a target threshold for how many simultaneous users you want to allow on the pages covered by your waiting room.

  • new users per minute defines the target threshold for the maximum rate of user influx to your website per minute.

A sharp spike in either of these values might result in queuing. Another configuration that affects how we calculate the total active users is session duration. A user is considered active for session duration minutes since the request is made to any page covered by a waiting room.

The other factor is the Waiting Room state, which is maintained at the local data center level but then also changes continuously based on the traffic around the world. Each data center works with its own Waiting Room state. This state is a snapshot of the traffic pattern for the website around the world available at that point in time. The advantage of using this approach - making decisions at the Worker level - is that we can make decisions without any significant latency added to the request. The algorithm for Waiting Room dynamically allocates a certain number of slots available to each Worker based on the Waiting Room state. Queueing starts when the slots run out within the Worker. The lack of additional latency added enables the customers to turn on the waiting room all the time without worrying about extra latency to their users.

The Waiting Room state is updated with global information every few seconds. We have a pipeline set up in Cloudflare Durable Objects that ensures changes in traffic get propagated around the world. This architecture ensures that we do not introduce additional latency, as well as that we are making decisions with as near-time accuracy as possible.

For even more details about the architecture and why we made these decisions, refer to our deep-dive technical blog.