The threat to open source comes from within

These are personal opinions and should not be construed as thought leadership

Apr 08, 2024

A great civilization is not conquered from without, until it has destroyed itself from within.

-Will Durant

The viability of open-source software was challenged twice over the past couple of weeks. One attack came from an outside adversary, the other from within the community itself. While the external threat triggered wider consternation, the internal threat seems to me far more dangerous.

NB: while I have created and contributed to various open-source projects over the years, I am certainly not an expert on OSS governance. Today’s edition of Good Tech Things is not intended to be “thought leadership” or whatever, just a tool for me to organize my own thoughts.

Outside threat: The XZ thing

If nobody had ever heard of open-source, and you wrote an undergrad paper today proposing it as the way to develop the software that runs the world, you would get an F minus.

Thesis sentence of your undergrad paper: Random people will anonymously create much of the world’s critically important software in their spare time, companies will make billions of dollars by using it for free, and this situation will remain shelf-stable for decades.

Instructor’s comment in red pen: This makes no sense. What would motivate any self-respecting software developer to participate in a scheme like this? And how do you know the resultant code will be any good?

Point 1 of your paper: Obsessive nerds will tear each other to pieces to make sure the code is as good as possible, because they just can’t help themselves.

Instructor, in red pen: That doesn’t sound like a sustainable state of affairs.

Point 2 of your paper: It works for Wikipedia.

Instructor, in even redder pen: You are not allowed to cite Wikipedia in an academic context. F minus.

Anyway, open-source (like Wikipedia and BGP) is one of those networked-collaboration things that makes no sense on paper, but in practice is more resilient than any centrally-managed system could ever be. Not because of some brilliant technical innovation, but because it’s an intensely human process. It works thanks to the pride and determination and sheer bloody-mindedness of beautiful weirdos all over the world.

The XZ thing put that messy, human resilience on full display. Here are the facts as I understand them:

There is an open-source data compression library called XZ that ships in popular Linux distros such as Debian, Fedora, and Ubuntu.
XZ’s maintainer, Lasse Collin, was evidently somewhat burned out.
A bad actor, probably state-sponsored, began harassing Collin through the project’s mailing list, using multiple personas to demand that he pay more attention to the project.
At the same time, the bad actor used a different persona (“Jia Tan”) to worm their way into Collin’s confidence over a period of months, eventually convincing him to let Jia Tan help out as a maintainer on XZ to quiet all the complaints.
Collin, grateful for the help, allowed Jia Tan to commit code to the project.
Some of the code Jia Tan put into XZ was a backdoor that would have granted the adversaries SSH access to EVERY SERVER RUNNING THE COMPROMISED VERSION OF XZ.

The backdoor was only discovered when a Microsoft employee and certified legend named Andres Freund noticed SSH calls consuming extra CPU while he was running Postgres benchmarks, traced them to XZ, and sounded the alarm.

Cue freakout all over social media as everyone imagines what would have happened if said backdoor had survived undetected long enough for state-sponsored adversaries to get SSH privileges on every Linux box in the world.

This is exactly the scenario your instructor with the red pen was afraid of. What chance do solitary, uncompensated open-source maintainers have defending themselves from skilled, well-funded state actors with the patience to run years-long social engineering campaigns? Probably not much chance.

But that’s missing the point. As Mark Atwood says, the attack was not attempted because XZ was open source. The attack failed because it was open source.

Do you know what state actors do when they want to compromise closed-source software? They just get hired at the software company and put across their dirty tricks with much-reduced scrutiny. Sunlight is the best disinfectant, and open source is solar-powered.

The XZ attack certainly exposes longstanding issues with OSS: maintainer burnout, the difficulty of aligning funding incentives, etc. But overall, I think it’s a beautiful example of open source doing what it does best: swarming a problem, socializing the fix, and then propagating better practices into other projects within days, not years.

In my opinion, the least-helpful part of the response to XZ was actually from GitHub themselves, who suspended the project page—making it harder for the community to reconstruct the scope and timeline of the attack.

And that sort of gestures toward the real problem.

The inside threat

Again, caveat that I am neither a lawyer nor an open-source policy wonk, so I might get something wrong here, and feel free to let me know if I do.

External threats, as the XZ thing demonstrates, seem to have a galvanizing effect on the broader open-source community. OSS just gets stronger under stresses like that.

The thing I’m most worried about now is the opposite: a chilling effect. And it’s been creeping up on OSS like a glacier for over a decade.

Indulge me while I detour for a quick history lesson. It’s important, I promise.

Late 2000s / early 2010s: Open-source as a loss leader. Companies like MongoDB and Elastic, who gained popularity among developers by giving their software away for free, have a fairly decent business model for open-source: they also sell hosted versions of their software as a service, alongside commercial support and so on.
Mid-2010s: The host with the most. Guess who turns out to be REALLY good at selling hosted software as a service alongside commercial support and so on? Cloud providers! All of a sudden AWS has an Elasticsearch service, a Redis service, etc. Their network effects suck more and more customer dollars into the cloud provider, bypassing the company who is actually creating the open-source software. The OSS companies get increasingly frustrated by this.
Late 2010s-early 2020s: License to kill. The OSS companies fight back with the only weapon they have: the license terms of their software. MongoDB switches to the Server-Side Public License (SSPL), as do Elastic and, just recently, Redis. Hashicorp chooses the Business Source License (BSL or BUSL) for its projects like Terraform and Vault. The new terms are not so much “open-source” as they are “source-available”: they are specifically designed to prevent other companies (cough, AWS) from making money by hosting their software.
Mid-2020s: Fork you. The open-source community, which didn’t ask for any of this to happen (and is somewhat egged on by the cloud providers), parries with the only weapon THEY have: forking the projects and building parallel versions under the original, open-source license terms. Terraform spawns OpenTofu, backed by the Linux Foundation; Redis is getting a fork called Valkey.
Current situation: Everyone is mad at everyone.

Against that backdrop, a very curious article appeared last week by Matt Asay, a VP at MongoDB. The article suggests, more through insinuation than outright accusation, that the OpenTofu project is copying code from the new, BSL-licensed version of Terraform back into their open-source fork.

I should make it clear that I like and respect Matt; he and I have agreed on plenty of things over the years. He is also, unlike almost everyone involved in this discourse, trained as a lawyer. So I hesitate to dismiss his concerns just because he works for a source-available software company.

But I also can’t write this off as a minor squabble. Hashicorp evidently agrees with Matt; they’ve sent a formal knock-it-off letter to the OpenTofu maintainers. And at least one other critic of OpenTofu finds that the code in question sure does look an awful lot like Hashicorp’s new features. The whole thing is made weirder due to the presence of Hashicorp copyrights in OpenTofu’s new files, which Matt finds suspicious but the OpenTofu maintainers say they included out of caution because they’re moving a lot of code around. Again, not a lot of lawyers involved here.

On the other side of the dispute, you have many angry proponents of OpenTofu saying that this is a hit job, that of course the code looks similar because there’s only one reasonable way to implement the feature, etc. Calling back to our history lesson, it’s instructive to note that some of the loudest defenders of OpenTofu work for AWS.

I have no way of knowing whether OpenTofu stole Hashicorp’s code or not. Joe Duffy thinks that as a project whose goal is essentially to keep pace with Terraform’s roadmap, they’re in a tough spot either way. OpenTofu has said they’ll publish something soon that will clarify their innocence, but I doubt they’ll convince anyone who doesn’t already support them.

I guess I can see a couple ways this could go:

OpenTofu clearly demonstrates how they came up with the questionable code on their own, and Hashicorp’s lawyers back down. If that’s the case, I expect and hope that Matt would retract his article and offer an apology.
OpenTofu’s explanation raises more questions than it answers, and Hashicorp decides to pick a nasty legal fight with the Linux Foundation.
There isn’t convincing evidence one way or the other, so nobody backs down, but also it isn’t really worth suing over, so everybody just grumbles at each other and nurses a grudge for next time.

I think it’s a lose-lose situation. Whichever way it shakes out, cases like this—and I expect they’ll become more common as more OSS projects like OpenTofu and Valkey are started that “shadow” source-available versions—have a chilling effect on everybody. Enterprises, who struggled to trust open-source for years, now have a new reason to fear that OSS code might leave them liable to action from companies like Hashicorp. Developers hesitate to fork OSS projects that might someday become source-available. Contributors don’t want to be fired from their day jobs or named in a lawsuit because somebody makes a bad-faith case that they plagiarized code. Can you blame them if they burn out, slip away, become less enthusiastic about open-source solutions?

Here is where, if I were a thought leader, I would make some grand-sounding call for companies to “do better”, or for OSS foundations to “revisit their governance structures”, or something. But this is an impasse that can’t be solved with platitudes. Cloud companies, fundamentally, see open-source as something to exploit; OSS companies see it as incompatible with a sustainable business model. Contributors who don’t work for either side are getting trampled on. And eventually, as the ecosystem fractures, everybody loses. The whole situation is just sad.

In the end, the fragile balance of open source—that unlikely blend of personalities and incentives that has driven tech’s innovation engine throughout the 21st century—won’t be upset by the odd state actor, or by malicious spam, or whatever scary new thing The Register is up in arms about tomorrow. It can only be disrupted when the community comes to believe maintaining that balance is more trouble than it’s worth. That’s the threat we should all be concerned about.

Good Sponsored Thing

Pluralsight’s Tech Skills Day is coming up on April 25th. I’ll be doing a fireside chat with Faye Ellis reflecting on 4 years of the Cloud Resume Challenge; you can catch that and lots more talks to level up your tech skills when you register for free here.

Cartoon of the day

I guess this more or less sums up the discourse:

JK Gunnink

Apr 9, 2024

Great article Forrest. Thank you for your views and commentary! I think you nailed it with your point in the timeline: Current situation: Everyone is mad at everyone.
