The importance of batching writes at scale when using serverless tools
I know Firestore and BQ charge per use, but when you're building you don't think about enormous usage! Let's get batching!
In the world of building SaaS, we all chase “hockey stick” growth. It’s the dream, right? But sometimes that growth hits you fast and hard like a cold brew coffee on a summer’s morning.
Recently, I onboarded a new customer to RocketFlag, and let’s just say they were a bit larger than I anticipated. They are a scaling SaaS business themselves, and once they plugged into the API, the floodgates didn’t just open; they were ripped off the hinges and sent far off down the river into the sunset.
To give you some context, my application was comfortably cruising along serving about 1 to 2 requests per second (RPS). It was a gentle stream.
Overnight, that stream turned into a firehose. The application started serving a baseline of 30 RPS during the quietest parts of the day, peaking at about 70 RPS.
Now, normally traffic is a good thing. But when your backend architecture is built for “get it working” rather than “massive scale,” things get interesting (and expensive) very quickly.
The Challenge
The core issue wasn’t that the servers couldn’t handle the load. The problem was the cost model of the services I was using.
Rocket Flag relies heavily on Firestore for application state and BigQuery for the heavy lifting on analytics.
If you’ve used Firestore, you know it charges based on reads and writes. It’s incredibly cheap for small workloads, but when you go from 2 writes a second to 70, you are firing writes at the database at a rate that starts to make the billing dashboard look like a horror movie.
Every single API request was triggering:
A write to Firestore.
A streaming insert to BigQuery to track statistics.
At that volume, I wasn’t just processing data; I was effectively burning money. I wanted to keep my costs under control, and I knew that if I let this run for a month, the bill would be astronomical.
The Fix: Batching to the rescue
So, what do you do when your infrastructure is bleeding cash? You pull an all-nighter and build a batching system.
Overnight, I refactored the ingestion pipeline to move away from real-time, direct writes.
Instead of writing to the database every single time a request hits the API (which could have been thousands of times a minute), the application now groups requests into chunks in memory.
Here is the logic I settled on:
Buffer: Incoming requests are held in memory.
Flush: The buffer flushes to the database once a minute.
Write: We perform a single batch write operation to Firestore and a batch load job to BigQuery.
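The three steps above can be sketched in a few lines. This is a minimal, illustrative version in Python (the post doesn't say what language RocketFlag is written in, so the language and the `flush_fn` callback are assumptions); in the real app `flush_fn` would be one Firestore batch commit plus one BigQuery load job.

```python
import threading

class WriteBuffer:
    """Buffers incoming events in memory and flushes them on a timer.

    `flush_fn` is a placeholder for the real work: a single batched
    write to Firestore and a batch load job to BigQuery.
    """

    def __init__(self, flush_fn, interval_seconds=60):
        self._events = []
        self._lock = threading.Lock()
        self._flush_fn = flush_fn
        self._interval = interval_seconds
        self._timer = None

    def add(self, event):
        # Called once per API request: cheap in-memory append, no network call.
        with self._lock:
            self._events.append(event)

    def flush(self):
        # Swap the buffer out under the lock, then write outside it.
        with self._lock:
            batch, self._events = self._events, []
        if batch:
            self._flush_fn(batch)  # one batched call instead of len(batch) calls

    def start(self):
        # Re-arm the timer after each flush. A production version would also
        # flush on shutdown so in-flight events aren't lost.
        def tick():
            self.flush()
            self._timer = threading.Timer(self._interval, tick)
            self._timer.daemon = True
            self._timer.start()
        tick()
```

Note the buffer swap inside `flush`: the lock is held only long enough to exchange the list, so incoming requests are never blocked waiting on a slow database write.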
The Results
The difference was immediate. Observe the diagram below. Can you spot where the batch writes were deployed?
By moving from thousands of API calls per minute to just a handful, my costs came back under control and the app improved its efficiency:
Reduced HTTP Overhead: We aren’t opening and closing thousands of connections to the database APIs anymore. The application is spending less time waiting on network calls and more time processing logic.
Cost Control: This was the big one. Instead of paying for 4,200 writes a minute (at peak), we are paying for a fraction of that by grouping updates.
Analytics Efficiency: Loading data into BigQuery in batches is significantly more affordable and efficient than the expensive streaming inserts.
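The cost numbers are easy to sanity-check. A quick back-of-envelope calculation (using the peak traffic and flush interval quoted above; the variable names are just for illustration):

```python
# Peak traffic from the post: 70 requests per second, each previously
# triggering its own Firestore write.
peak_rps = 70
unbatched_writes_per_minute = peak_rps * 60  # one write per request

# With batching: the buffer flushes once a minute, producing a single
# batch commit regardless of how many requests arrived in that window.
batched_flushes_per_minute = 1

print(unbatched_writes_per_minute)  # 4200 writes/min at peak, unbatched
```

That's the "4,200 writes a minute" figure: one flush per minute replaces four thousand individual calls, and BigQuery batch load jobs are billed far more favourably than streaming inserts on top of that.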
The Trade-off
There is no such thing as a free lunch in software engineering. Every architectural decision is a set of trade-offs, which is why any experienced architect will give you the same answer when you ask a question: it depends.
In this case, the trade-off is latency.
Previously, the data in RocketFlag was live. If a flag hit came in, you saw it instantly. With this new batching approach, the data can be up to one minute late.
Is that a problem?
For this specific use case, I don’t think so. When you are looking at high-volume analytics, seeing the data update in “near real-time” (within 60 seconds) is usually an acceptable compromise for the massive cost savings and stability it brings. For example, if you are hitting 70 RPS, then a stat which shows you’ve had 4.9m requests this month so far isn’t going to materially change much from one minute to the next.
Summary
Scaling is exciting, but it exposes the cracks in your pricing and architecture models pretty quickly.
By shifting from a naive “write-on-request” model to a buffered batching system, I managed to keep RocketFlag alive and not melting my card while serving a massive jump in traffic.
If you are building a SaaS, keep an eye on those write-heavy operations; they might just bite you in the arse when you finally land that big customer!

I've had to do similar with a log ingestion -> BigQuery problem. I'll have to fill you in next time I see you
Good one. How did you handle the potential loss of messages in the buffer if the app crashes?