Open source in 2024: Tackling challenges related to security, AI, and long-term sustainability


The first piece of open source code was published just over 70 years ago, and open-source software is now found in almost every application.

A 2024 report from Synopsys found that the average application has over 500 open source components in it, and most recent industry reports show that over 95% of codebases contain open source software. 

Chris Aniszczyk, CTO of the Cloud Native Computing Foundation and VP of developer relations at the Linux Foundation, says that while open source has largely been used in applications in the technology sector, it has been expanding into nearly every industry in recent years, including agriculture and pharma. The Linux Foundation also recently announced OS-Climate to tackle climate change problems.

Given the pervasiveness of open source software, let’s look at some of the trends we’ve been seeing across the last year and what we can expect from the open source community this year. 

Open source security is now being tackled by governments

In general, open source software has been under more of a microscope lately, due to several major security issues over the past decade involving open source components, such as the Log4Shell vulnerability in Log4j.

Both the United States and European Union are now acting to improve the security of open source projects. Within the U.S., President Joe Biden signed an executive order on improving cybersecurity, and a part of that is improving open source security. CISA also has several initiatives tackling this issue. 

In the EU, the Cyber Resilience Act places stricter security requirements on software. While it doesn’t target open source software specifically, Mike Milinkovich, executive director of the Eclipse Foundation, says “there’s really no way that you can regulate the software industry without regulating open source as some sort of a first order side effect.”

The executive order has made people start thinking more about things like Software Bills of Materials (SBOMs) and vulnerability management (including license management), said Michele Rosen, research director at IDC.

“If you’re installing a package that three dependencies deep is using some sort of GPL software, and you’re now building software on it, that can be a big legal risk for a company,” she said. “So one of the things that they’re finding is that SBOM management systems can help with not only managing the vulnerabilities, but also managing the licenses of the underlying code.”
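
To make Rosen's scenario concrete, here is a minimal sketch of the kind of check an SBOM tool might run: walk the dependency graph and flag any package, however deep, whose declared license matches a copyleft policy. The package names, the dictionary layout, and the `find_copyleft` helper are all hypothetical and invented for illustration; real SBOM formats such as SPDX or CycloneDX carry far more detail.

```python
from collections import deque

# Hypothetical, heavily simplified SBOM: package -> declared license and direct dependencies.
# All package names here are made up for illustration.
SBOM = {
    "my-app": {"license": "Apache-2.0", "deps": ["web-framework"]},
    "web-framework": {"license": "MIT", "deps": ["template-engine"]},
    "template-engine": {"license": "BSD-3-Clause", "deps": ["text-utils"]},
    "text-utils": {"license": "GPL-3.0-only", "deps": []},
}

# Assumption: the company's policy flags licenses whose identifier starts with these prefixes.
COPYLEFT_PREFIXES = ("GPL-", "AGPL-")


def find_copyleft(sbom, root):
    """Breadth-first walk from root; return (package, license, depth) for flagged licenses."""
    findings = []
    seen = {root}
    queue = deque([(root, 0)])
    while queue:
        name, depth = queue.popleft()
        entry = sbom[name]
        if entry["license"].startswith(COPYLEFT_PREFIXES):
            findings.append((name, entry["license"], depth))
        for dep in entry["deps"]:
            if dep not in seen:  # avoid revisiting shared or cyclic dependencies
                seen.add(dep)
                queue.append((dep, depth + 1))
    return findings


for name, license_id, depth in find_copyleft(SBOM, "my-app"):
    print(f"{name} ({license_id}) found {depth} dependencies deep")
```

In this toy graph the GPL-licensed package sits three levels below the application, which is exactly the kind of dependency a developer would be unlikely to notice by reading only the top-level manifest.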

According to Aniszczyk, this regulation and push for transparency makes sense, because when we go to the grocery store, for example, we want to know exactly what is in the food we’re buying. Until now, there hasn’t really been an incentive to do that with software.

“We just have so much choice in open source land and developers just use what they find on GitHub or GitLab, or all over the internet,” said Aniszczyk. “And there’s just not this maturity that you would find in industries like manufacturing or so on where there’s like a little bit more scrutiny on the supply chain.”

Milinkovich is hopeful that a side effect of this regulation is that it entices larger corporations to contribute back to open source more.

“There is absolutely no incentive in any part of that relationship for the companies in particular that are using open source to contribute anything back,” said Milinkovich. “There’s no reason to; it’s like ‘thanks for the free stuff.’ And then we’re going to put it into our applications in our internal systems. And that’s great. But regulation changes that equation somewhat. So with regulation, now, they might have a requirement to be able to produce SBOMs, they might have a requirement to demonstrate that the software components that they’re using in their products that they’re selling to the US government have to follow the NIST SSDF capabilities.”

Open source may win the AI race

A leaked memo from a Google staffer last May titled “We Have No Moat And Neither Does OpenAI” explored the idea that as Google was busy trying to compete with OpenAI, they realized the possibility that neither company would win the AI race: open source could.

“The moats memo was basically saying open source guys are getting similar results, or in some ways, even better results. And they’re advancing at a pace that’s faster, even with much smaller datasets,” said Milinkovich.

The memo states: “Plainly put, they are lapping us. Things we consider ‘major open problems’ are solved and in people’s hands today … Open-source models are faster, more customizable, more private, and pound-for-pound more capable. They are doing things with $100 and 13B params that we struggle with at $10M and 540B. And they are doing so in weeks, not months.”

Some of the large companies are even starting to open source their models, and open source makers are also striking deals with the larger companies, said Rosen.

For instance, Meta has partially open sourced Llama, and Mistral, the French startup producing open source models, recently made a deal with Microsoft.

“So I think it’s pretty clear that open models are going to play a part in this whole AI space one way or the other … there was a question I would say last year where some people were implying that network effects being what they are, we were all going to sort of converge on a single model and I don’t see that happening at all, I think there’s going to be a proliferation,” she said.

Another thing to keep an eye on when it comes to AI is how contributions made using AI will be handled, given the fact that the author might not actually be the author, said Milinkovich.

He believes that it will become more popular to use tools that check for plagiarism. “There’s some options in Copilot, where it will check to see if the code that it has produced is almost identical to code that went into its training data,” he said. “If there’s something that would be interpreted by a human as looking like plagiarism, you need to try to use those tools to avoid that.”

Rosen says “the problem is that particularly with an open source model, it’s very hard to know how to apply those licenses to let’s say the training data set or the architecture or even the system prompt or something like that.”

The impact of tech layoffs on open source

According to Rosen, about half of the open source contributors are paid in some way to contribute to open source. That’s why when Google decided to lay off its open source division last year, it made some waves. 

Google wasn’t the only one; according to Crunchbase’s layoff tracker, 191,000 tech workers lost their jobs in 2023, and as of March 8th, another 31,000 had already been laid off this year.

However, despite the layoffs, data from the Open Source Contributor Index reveals the number of active contributors from top tech companies (including Google) went up every single month in 2023. 

“It’s true that obviously some of the open source, commercial software leaders were subject to layoffs,” said Rosen. “And even though we know that there must have been some developers laid off who were contributing to open source projects, it’s important to put those layoffs in context. The losses represented a relative minority of the hiring that had taken place for the two or three previous years, so the overall impact, it’s not something that I’ve seen or that I have a sense that there has been a drain.”

How to sustain open-source projects long-term

Long-term sustainability of open source projects is another thing that has gotten more attention over the past few years. Several popular projects changed their license or business model in the last year. For instance, HashiCorp switched Terraform from MPL v2 to the Business Source License last year, and earlier this year, Buoyant announced that stable Linkerd releases would only go out to Enterprise users. Also, Red Hat had previously announced that the source code for its RHEL releases would only be available through CentOS Stream, which upset many in the open source community.

These aren’t isolated incidents over the last year, however; a number of other open source projects have changed their licenses over the years, including Akka, CockroachDB, Elasticsearch, MongoDB, Redis, and more.

Aniszczyk believes that because of the backlash companies faced, this isn’t going to be a common occurrence for open-source projects. “I think that’s going to happen less because of how much pain it caused them, like they lost a lot of community trust,” he said, speaking of HashiCorp. 

Rosen says that she believes companies are starting to think more about the long-term strategy of a project than they used to.

“[They’re] maybe being a little bit more active in diversifying the management and really trying to think about a longer term strategy,” she said. “Whereas I think a lot of open source projects are launched sort of in the innovation mindset, and maybe don’t think about longer term governance. If this project becomes successful, how are we going to maintain it, what’s going to happen?”

A paper published in January by the Harvard Business School revealed that 96% of the value of open source is generated by 5% of developers. 

“We have a relatively small population of people that, frankly, society is depending upon,” said Milinkovich. “And, you know, how do we make sure that those people don’t burn out? … How do we make sure those developers are sustained, but also how are they replaced as they retire and the next generation has to come back in behind them and pick up the mantle of some of these core pieces of infrastructure.” 

The value of open source

It’s an important problem to solve, because that same Harvard Business School paper valued the demand side of open source software at $8.8 trillion and supply side at $4.15 billion.

“We find that firms would need to spend 3.5 times more on software than they currently do if OSS did not exist,” the researchers stated in the report. 

Milinkovich believes Harvard’s numbers are an underestimate of the value because they only measured websites and not operating systems. 

“Some of the headlines I’ve seen make me think they didn’t actually read the paper, because it’s like, you know, ‘open source is worth $8.8 trillion?’ No, they only measured a fraction of the open source ecosystem, right? They only measured websites, and they specifically excluded operating systems. So basically, the economic value of all of the web infrastructure around the planet that we use every day, and open source’s contributions to that is about $8.8 trillion, but that excludes other uses. It excludes operating systems. So it’s obviously in fact, much, much higher than that.”


