In pursuit of compute
What I learned about Google Cloud compute while figuring out how to host a Discord bot
Hey there! It’s been a while.1
I’ve been on parental leave for a few weeks, which has been amazing for lots of reasons. It’s the first time in my career I’ve taken several consecutive weeks off work, and while it would be wrong to call it a “break”, the change of pace has been transformative. Or do I mean informative? I’m not sure. I didn’t get much sleep last night.
Anyway, at the moment my daughter is sleeping on my lap and so I have a few minutes to write up a side project I’ve been playing with.
I’m working on some fun updates to the Cloud Resume Challenge (more on that in a few weeks), and one thing I need to build is a bot to help manage the 4,000-member CRC Discord server. My requirements for the bot architecture:
I want it to be as serverless/managed as possible, because I’m still that guy;
I want to run it on Google Cloud, because I’m still trying to get my head around Google Cloud;
I want to run it for free, because I want to understand the limits of the Google Cloud free tier and also because I like free things.
If you don’t know much about Discord or Discord bots, there is endless documentation out there, much of it either 1) contradictory, due to Discord’s habit of releasing vaguely similar new APIs at breakneck pace into a peculiar ecosystem of community-maintained language SDKs that are constantly winking in and out of existence like firefly lights, or 2) out-of-date, due to knock-on effects of 1). Bottom line: it’s a much more bewildering space than I expected.
So this post is going to feel a bit random, but what I’m going to try to do is talk you through my process of discovering the best way to host a Discord bot on Google Cloud and what I learned about Google Cloud’s compute options along the way.
First idea: Cloud Run
At first blush, this seems like a classic serverless use case. We want to write some Python2 that will listen to events from Discord and maybe send some scheduled messages. In the AWS world, I’d have used Lambda and API Gateway to do this. In the Google Cloud world we could use Cloud Functions, but recent updates to Cloud Functions have made clear that it’s pretty much just a programming conceit on top of Cloud Run now, so for the sake of keeping my mental models straight let’s consider Cloud Run to be Google Cloud’s default “serverless” offering.
I found Cloud Run interesting to work with, for reasons outlined in this Twitter thread:
AWS people can think of Cloud Run as a more developer-friendly, slightly more “serverless” version of Fargate; the emulated development options are nice, and it has super smooth, flexible scaling and concurrency out of the box - all the way down to zero.3
But there’s a problem. The problem is Discord’s classic way of interacting with bots, which is basically just to hold open a socket on a piece of long-running compute. (There is a newer way of talking to Discord bots that’s more event-driven - we’ll get to that in a minute; I’m sharing my learning in the order I figured it out.) That means we can’t host our bot using Cloud Run’s classic execution mode, which only spins up a CPU to process requests and scales to zero the rest of the time. Instead, we’d have to use the newer “always-on CPU allocation” feature that keeps the container running indefinitely.4 And this runs afoul of both my serverless and cost management goals. Classic Cloud Run has a great free tier; you can crunch through millions of free requests per month, just like on Lambda. Switch over to Always-On and the cheapest CPU is going to cost something like $20/month. Ugh. Let’s keep looking.5
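For the curious, the gateway contract can be sketched with nothing but stdlib dicts. This is my own illustration, not from any SDK: the opcode numbers below come from Discord’s Gateway docs, while the intents value and property strings are just examples. The heartbeat is the crux: it has to keep flowing for the life of the bot, which is exactly why the process can never scale to zero.

```python
from typing import Optional

def identify_payload(token: str) -> dict:
    # Opcode 2 "Identify": sent once after connecting to authenticate the bot.
    # The intents value is illustrative: 1 (GUILDS) + 512 (GUILD_MESSAGES).
    return {
        "op": 2,
        "d": {
            "token": token,
            "intents": 513,
            "properties": {"os": "linux", "browser": "crc-bot", "device": "crc-bot"},
        },
    }

def heartbeat_payload(sequence: Optional[int]) -> dict:
    # Opcode 1 "Heartbeat": must be re-sent on a server-specified interval
    # for as long as the bot lives. No heartbeat, no bot.
    return {"op": 1, "d": sequence}
```

Every one of those heartbeats needs a warm CPU behind it, which is the whole problem for scale-to-zero compute.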
GCE (the coward’s refuge)
For the reasons I just explained, the traditional way of hosting a Discord bot has always been just to dump it on the cheapest VM you can find. This is in fact what Google Cloud suggests, by way of a DevRel-authored tutorial blog from a year or two ago: just plop the thing on Google Compute Engine and have done with it. This has the advantage of being free: Google Cloud’s free tier lets you run one e2-micro instance at any time. That tutorial doesn’t really give a serious example, though; it suggests just SSHing into the server and executing your bot script from the command line. I’m looking for quick and cheap, but I’m not a barbarian.
It turns out that Google Cloud doesn’t have a standalone VPS product like AWS Lightsail or [insert your favorite “deploy a container to the cloud with 1 command” product here], but what they do have are container-optimized VMs. These are locked-down instances that run Docker and allow you to specify the entrypoint when you deploy the instance itself. If you want to run a single container on a single VM, for free, a container-optimized e2-micro GCE instance is the Google Cloud way to do it. And I did it.
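For reference, deploying that way is a single command. This is a sketch: the instance name, project, and image path are hypothetical, and note that the free e2-micro only applies in certain US regions.

```shell
# Hypothetical names throughout; Container-Optimized OS is implied by
# create-with-container. The free-tier e2-micro applies only in
# us-west1, us-central1, and us-east1.
gcloud compute instances create-with-container crc-discord-bot \
  --zone=us-east1-b \
  --machine-type=e2-micro \
  --container-image=us-docker.pkg.dev/my-project/bots/crc-bot:latest \
  --container-restart-policy=always
```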
(Pausing here to shout out the Google Cloud GitHub Actions developed by Google Cloud’s DevRel team, and specifically the GCE example workflow; they made it super easy for me to get my bot from repo to the cloud.)
So now I have a Discord bot running on Google Cloud, for free, but very much not serverless. This isn’t just a cosmetic complaint: my bot can’t scale beyond its single e2-micro instance, it’s not highly available, and the persistent-socket architecture means all my control logic for scheduled tasks lives inside my server process instead of using Google Cloud Scheduler.
To boldly go where no event has gone before
Here’s where I started learning about newer Discord API modes like Interaction Events, which let you leave the old socket-holding clients behind and just respond to HTTP POST requests like a normal web app. That would seem to let us go back to Cloud Run and build a truly event-driven bot, right?
Well, sort of. I’m not aware of anyone who’s built this on Cloud Run, but Gerald McAlister tried it on Lambda, and the problem he ran into was … wait for it…
(are you waiting?)
… cold starts.
Discord enforces a 3-second timeout on responses to its interaction events. Depending on your language, package size, etc., this can leave you with little time to process an I/O-bound command after a cold start. McAlister’s clever solution is a CDK construct that uses two chained Lambda functions - one to respond to the initial request and send Discord back a provisional response, the other to do the bulk of the command processing and return the final output. This is, needless to say, a hack - and I think it still might not avoid cold-start timeouts at the p99+ tail.
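In Discord terms, the trick hinges on interaction response type 5, a deferred channel message, which tells Discord to show the user a “thinking…” placeholder while a second worker finishes the job. Here’s a rough stdlib sketch of the two phases; the response type and the @original endpoint come from Discord’s Interactions docs, while the function names and IDs are mine.

```python
# Type 5 is DEFERRED_CHANNEL_MESSAGE_WITH_SOURCE in Discord's Interactions
# docs; function names and IDs here are illustrative.
DEFERRED_CHANNEL_MESSAGE_WITH_SOURCE = 5

def initial_response() -> dict:
    # Phase 1: this JSON must reach Discord within the 3-second window.
    # Discord then shows a "thinking..." placeholder to the user.
    return {"type": DEFERRED_CHANNEL_MESSAGE_WITH_SOURCE}

def followup_url(application_id: str, interaction_token: str) -> str:
    # Phase 2: the slower worker PATCHes the real content to the original
    # interaction response via this webhook-style endpoint.
    return (
        "https://discord.com/api/v10/webhooks/"
        f"{application_id}/{interaction_token}/messages/@original"
    )
```

The 3-second clock only applies to phase 1, which is why splitting the work across two functions buys you breathing room - as long as the first function itself beats its cold start.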
If I were building on AWS, I’d be tempted to borrow his CDK construct, but I kind of doubt this trick is worth porting over to Google Cloud. Come to think of it, most of what I need this bot to do can probably be implemented using webhooks, where the command is initiated by my app rather than by the Discord server. And THAT I can do - for free! - on Cloud Run.
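To show how little code that path needs, here’s a stdlib-only sketch. The webhook URL shape and the "content" field are Discord’s; the helper function is mine, and actually sending the request is left to the caller.

```python
import json
from urllib import request

def build_webhook_post(webhook_url: str, content: str) -> request.Request:
    # Build (but don't send) a POST to a Discord webhook URL. Discord
    # accepts a JSON body with a "content" field, among others.
    payload = json.dumps({"content": content}).encode("utf-8")
    return request.Request(
        webhook_url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# To actually send, e.g. from a handler kicked off by Cloud Scheduler:
#   request.urlopen(build_webhook_post(url, "Weekly standup starts now!"))
```

Because the app originates the request on its own schedule, there’s no socket to hold open and no 3-second clock to beat, so classic scale-to-zero Cloud Run works fine.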
Links and events
The Cloud Resume Challenge has always been about helping people build something real. But beyond the OG challenge, there’s great need for more reality-informed cloud project ideas touching on security, observability, cloud automation, and more. So the Cloud Resume Challenge is now an open-source project inviting community extensions. Recent add-on projects have included a classical networking jaunt and a secure software supply chain challenge. If you have cloud knowledge and would like to propose a challenge, I hope you’ll contribute a project prompt or suggest one via issue!
On a similar note, I keynoted Pluralsight’s Tech Skills Day, sharing my recommendations on how to build a great cloud side project.
I chatted with The Cloudcast about building multi-cloud skills. (Spoiler: I think you should, even if you’d never consider building a multicloud app.)
Just for fun
I’ve guested on a few podcasts recently, but none more fun than this episode of The Changelog’s “Song Encoder”, where we took a tour through a bunch of my tech songs and examined the key question: why would anyone do this?
Reminder for those who are new here: in each issue of Cloud Irregular, I try to provide a piece of useful cloud commentary, round up links to things I’ve published elsewhere, and leave you with something silly and fun. While there is technically a paid version of this newsletter, I disabled payments several months ago and have no plans to turn them back on.
2. Well, I want to write some Python.
3. “Serverless” compute services ranked from “most serverless to least serverless”, in my opinion: AWS Lambda, Google Cloud Run, AWS Fargate, AKS virtual nodes.
4. We’d also need to be careful about concurrency; a Cloud Run container can handle up to 80 concurrent requests by default, but to keep from confusing the Discord client we’d need to lock it down to 1.
5. I did end up using Cloud Run for an adjacent little web service, though. No complaints so far.