Site Update: RSS Bandwidth Fixes
Published on 01/14/2021, 554 words, 3 minutes to read
Well, so I think I found out where my Kubernetes cluster cost came from. For context, this blog gets a lot of traffic. Since the last deploy, my blog has served its RSS feed over 19,000 times. I had some pretty naive code powering the RSS feed. It basically looked something like this:
- Write RSS feed content-type and beginning of feed
- For every post I have ever made, include its metadata and content
- Write end of RSS feed
This code was fantastically simple to develop, but it was very expensive in terms of bandwidth. When you add all this up, my RSS feed used to be a response of more than one megabyte. It was also only getting larger as I posted more content.
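The three steps above can be sketched roughly like this. This is an illustration in Python, not the code that actually powers the blog; the `Post` record, its field names, and the `example.com` channel details are all made up for the example.

```python
from dataclasses import dataclass
from typing import Iterable
from xml.sax.saxutils import escape


@dataclass
class Post:
    # Hypothetical post record; the field names are illustrative only.
    title: str
    link: str
    body_html: str


def render_feed(posts: Iterable[Post]) -> str:
    """Render every given post, full content included, into one RSS document."""
    # Step 1: beginning of the feed (placeholder channel metadata).
    parts = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<rss version="2.0"><channel>',
        "<title>Example blog</title>",
        "<link>https://example.com/</link>",
        "<description>Example feed</description>",
    ]
    # Step 2: every post, with no limit, so the response grows with each new post.
    for post in posts:
        parts.append(
            "<item>"
            f"<title>{escape(post.title)}</title>"
            f"<link>{escape(post.link)}</link>"
            f"<description>{escape(post.body_html)}</description>"
            "</item>"
        )
    # Step 3: end of the feed.
    parts.append("</channel></rss>")
    return "\n".join(parts)
```

Passing only a slice of the newest posts (for example `render_feed(posts[:10])`) instead of the whole archive is the simplest way to see how much of the response size comes from old content.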
This is unsustainable, so I have taken multiple actions to try and fix this from several angles.
Rationale: this is my most commonly hit and largest endpoint. I want to try and cut down its size.
current feed (everything): 1356706 bytes
20 posts: 177931 bytes
10 posts: 53004 bytes
5 posts: 29318 bytes pic.twitter.com/snjnn8RFh8
— Cadey A. Ratio (@theprincessxena)
January 15, 2021
Yes, that graph is showing in gigabytes. We're so lucky that bandwidth is free on Hetzner.
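To put the numbers from the tweet in perspective, here is a back-of-the-envelope calculation. It assumes every one of the 19,000 requests received the full, uncompressed body, which ignores compression, response headers, and any requests beyond that lower bound, so treat the output as an estimate only.

```python
FULL_FEED_BYTES = 1_356_706  # "current feed (everything)" from the tweet
TRUNCATED_BYTES = {20: 177_931, 10: 53_004, 5: 29_318}  # posts kept -> bytes
REQUESTS = 19_000  # "over 19,000" feed requests since the last deploy


def transfer_bytes(response_bytes: int, requests: int) -> int:
    """Total bytes sent if every request gets the whole response."""
    return response_bytes * requests


def reduction(response_bytes: int) -> float:
    """Fraction of the full feed size saved by a smaller response."""
    return 1 - response_bytes / FULL_FEED_BYTES


if __name__ == "__main__":
    total = transfer_bytes(FULL_FEED_BYTES, REQUESTS)
    print(f"everything: {total / 1e9:.1f} GB")
    for count, size in TRUNCATED_BYTES.items():
        sent = transfer_bytes(size, REQUESTS)
        print(f"{count} posts: {sent / 1e9:.1f} GB ({reduction(size):.1%} smaller)")
```

Under those assumptions the full feed works out to roughly 25.8 GB for 19,000 requests, while a 10-post feed would be about 1 GB.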
First, I finally set up the site to run behind Cloudflare. The Cloudflare settings are very permissive, so your RSS feed reading bots or whatever should NOT be affected by this change. If you run into any side effects as a result of this change, contact me and I can fix it.
Second, I also now set cache control headers on every response. By default the "static" pages are cached for a day and the "dynamic" pages are cached for 5 minutes. This should allow new posts to show up as quickly as they did before.
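As a minimal sketch of that policy, the two lifetimes translate into `Cache-Control` values like the ones below. The `kind` argument and the `public` directive are assumptions made for the example; the post does not say how pages are classified or which other directives are sent.

```python
STATIC_MAX_AGE = 24 * 60 * 60  # one day, in seconds
DYNAMIC_MAX_AGE = 5 * 60  # five minutes, in seconds


def cache_control(kind: str) -> str:
    """Return a Cache-Control header value for a "static" or "dynamic" page."""
    # Anything that is not explicitly static gets the short lifetime,
    # so a misclassified page goes stale for minutes rather than a day.
    max_age = STATIC_MAX_AGE if kind == "static" else DYNAMIC_MAX_AGE
    return f"public, max-age={max_age}"
```

With these values a cache may keep a static page for 86400 seconds and a dynamic page for 300 seconds before it has to check back with the server.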
Third, I set up ETags for the feeds. Each of my feeds will send an ETag in a response header. Please send this tag back in the If-None-Match header of future requests to ensure that you don't ask for content you already have. From what I recall, most RSS readers should already support this, but I'll monitor the situation as reality demands.
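Here is a rough sketch of how the ETag exchange works on the server side. Deriving the tag from a SHA-256 hash of the body is an assumption for illustration, as are the `make_etag` and `respond` helpers; the post does not say how the blog computes its tags.

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    """Derive a quoted entity tag from the response body (assumed scheme)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(
    body: bytes, if_none_match: Optional[str] = None
) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload) for a feed request."""
    etag = make_etag(body)
    headers = {"ETag": etag}
    if if_none_match is not None:
        # The header may carry several comma-separated tags or a bare "*".
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        # If-None-Match uses weak comparison, so a leading W/ is ignored.
        candidates = [t[2:] if t.startswith("W/") else t for t in candidates]
        if "*" in candidates or etag in candidates:
            # The client already has this version: send no body at all.
            return 304, headers, b""
    return 200, headers, body
```

A reader that stores the `ETag` from its first 200 response and sends it back as `If-None-Match` gets an empty 304 until the feed actually changes.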
Lastly, I adjusted the ttl (time to live) of the RSS feed so that compliant feed readers should only check once per day. I've seen some feed readers request the feed as often as every 5 minutes, which is very excessive. Hopefully this setting will gently nudge them into behaving.
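For reference, RSS 2.0 expresses `<ttl>` in minutes, so "once per day" corresponds to 1440. The sketch below builds a bare channel with that value; the `build_channel` helper is hypothetical, and 1440 is inferred from "once per day" rather than taken from the blog's actual feed.

```python
import xml.etree.ElementTree as ET

TTL_MINUTES = 24 * 60  # RSS 2.0 <ttl> is in minutes; one day = 1440


def build_channel(title: str, link: str, description: str) -> str:
    """Build a minimal RSS 2.0 document whose channel carries a <ttl>."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    # Hint to readers that the feed can be cached for a day before refreshing.
    ET.SubElement(channel, "ttl").text = str(TTL_MINUTES)
    return ET.tostring(rss, encoding="unicode")
```

The value is only a hint, so a reader that ignores it will still poll as often as it likes; that is where the cache headers and ETags pick up the slack.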
As a nice side effect, I should have slightly lower RAM usage on the blog server too! Right now it's sitting at about 58 and a half MB of RAM, but with fewer copies of my posts sitting in memory this should fall by a significant amount.
If you have any feedback about this, please contact me or mention me on Twitter. I read my email frequently and am notified about Twitter mentions very quickly.
Facts and circumstances may have changed since publication. Please contact me before jumping to conclusions if something seems wrong or unclear.
Tags: devops, optimization