Amazon’s quiet open source revolution

After years of getting a free ride from open source projects, the company is developing its own obsession with contributing.

Something has changed at Amazon Web Services (AWS) with regard to its formerly fraught relationship with open source. Though it was always incorrect to lambast AWS for “strip-mining” open source, as Daisuke Wakabayashi did in The New York Times, there was just enough smoke in that “strip-mining” fire to make the accusation seem somewhat credible.

After all, a quick perusal of top open source projects with the Cloud Native Computing Foundation, the Apache Software Foundation, or just about anywhere would have shown that Google tended to top the open source contribution charts, with Microsoft a strong second. AWS was off in the distance, likely congratulating itself on relieving customers of the “undifferentiated heavy lifting” of managing open source by themselves.

Well, that was then; this is now. Service (product) teams at AWS finally seem to be getting the message that to deliver on “Customer Obsession,” Amazon’s foremost leadership principle (and on others such as Ownership and Deliver Results), they need to be obsessed with open source contributions too.

Weird, but true

I’ve mentioned before that AWS seems to be changing its mentality around ownership. Ownership, Amazon’s number 2 leadership principle, led some AWS service teams to assume the only way to truly care for customers was to own every aspect of the experience. That made it difficult to engage with open source communities, because doing so seemed to leave Amazon at the mercy of the community to fix bugs and so on.

Some AWS service teams were reluctant to contribute lest they reveal too much about how their systems run or enable competitors with bug fixes or features that differentiated Amazon’s own services. In the process, they piled up technical debt, making it harder to give the customer what they really wanted: an easy way to run Apache Spark, or MySQL, or [insert open source project here].

While I was working at AWS, I saw this begin to change, if slowly. Now it seems to be accelerating quickly. Take, for example, PostgreSQL. A few years back, AWS was regularly criticized (rightly so, I’d argue) for free-riding on PostgreSQL. The company made lots of money managing PostgreSQL for customers but gave little back.

Now, however, the PostgreSQL committer page is filled with AWS employees. Some of these people were already committers and were hired by AWS to work on PostgreSQL (and presumably AWS database services such as RDS and Aurora), but Nathan Bossart, Masahiko Sawada, and others earned that distinction through their contributions. I’d hazard a guess that AWS is now the third-largest corporate contributor to PostgreSQL if you aggregate its employees’ contributions. I’m not at all downplaying the value of others’ contributions. Rather, I’m pointing out the astonishing increase in AWS’ involvement.

The long road

Let’s remember that the open source spadework is not done. For example, AWS makes a lot of money from its Kubernetes service but still barely scrapes into the top 10 contributors for the past year. The same is true for other banner open source projects that AWS has managed services for, such as OpenTelemetry, or projects its customers depend on, such as Knative (AWS comes in at #12). What about Apache Hadoop, the foundation for AWS Elastic MapReduce? AWS has just one committer. For Apache Airflow, the numbers are better.

This is glass-half-empty thinking, anyway. The fact that AWS has any committers to these projects is an important indicator that the company is changing. A few years back, there would have been zero committers to these projects. Now each has at least one, and some have many.

All of this signals a different destination for AWS. The company has always been great at running open source projects as services for its customers. As I found while working there, most customers just want something that works. But getting it to “just work” in the way customers want (i.e., the vanilla version of an open source project, not some forked, “premium” version) requires that AWS get its hands dirty in the development of the project. Engineering teams weren’t traditionally incentivized to do that; apparently they are now.

All of this is good for AWS, good for its customers, and good for open source. It’s hard to overemphasize just how differently AWS runs, given its scale. At scale, things break, and AWS has learned how to fix them. If more of that know-how finds its way into open source projects, it benefits everyone and, I’d argue, creates much bigger markets where AWS can sell its services.