What Comes After
Welcome
Dear Reader,
It is well known that many of our most excruciating arguments about religion and philosophy are secretly arguments about definitions. With that said, I will add, it is quite a novel thing when the definition is in the foreground and everyone knows that is what they are arguing about.
A doozy of a definition argument is brewing right now at the Open Source Initiative (OSI). The OSI recently released version one of their definition for "Open Source AI" [1]. At a glance, the definition appears to fall squarely into the basic open source ethos. The user is accorded the right to:
- Use the system for any purpose and without having to ask for permission.
- Study how the system works and understand how its results were created.
- Modify the system for any purpose, including to change its output.
- Share the system for others to use with or without modifications, for any purpose.
So far so good, right? But the thing is, it all depends on what you call "the system." To the OSI, "the system" is the initial software used as a starting point for developing an AI model – not the model itself. In other words, the definition allows for secrecy in the training data used to build the model. If you can't fully see how the model was trained, you really can't study it and "understand how its results were created." As security expert Bruce Schneier and others have pointed out, "the training data is the source code – it's how the model gets programmed" [2].
To be fair, the Open Source AI Definition (OSAID) [3] does call for sharing "sufficiently detailed information about the data used to train the system so that a skilled person can build a substantially equivalent system." But the OSI is a little cagey in the follow-through, requiring only what it calls a "complete description" of the data used for training (as opposed to the data itself). The definition also refers to "unshareable data" without fully explaining why the data would be unshareable. Somewhere in the commentary, the OSI gives an example of medical data, but it doesn't really restrict the term to legally unshareable data and appears open to the possibility that "unshareable" could be a business choice. If the data is obtained from a third party, you have to say where you got it, but again, you don't have to provide the data. In the FAQ that accompanies the definition, the OSI states that, for purposes of the definition, "training data does not equate to software source code."
As many have pointed out, the OSI is largely funded by high-tech companies that are actively competing against each other in AI. Google, Microsoft, and Meta are all sponsors, and these mega-vendors have a strong incentive to protect their business interests when it comes to AI development. On the other hand, part of the goal of the OSI is to stay relevant. If the definition is too strict, no one will follow it, and any potential benefits, such as disclosure of source code and free distribution, will be lost. The OSI has always been the pragmatist, as opposed to the Free Software Foundation, where the focus has always been a harder line on rights and principles.
Many experts have pointed to the dangers of drifting into a future where society is run by AI programs trained on secret data. Those dangers are as stark as ever. If you're wondering whether the open source definition will be a useful tool in combating the problem, the answer so far appears to be no.
Some are asking whether the open source definition is still even useful for promoting free software. IBM/Red Hat's legal gambit for restricting access to RHEL source code is another example of big companies looking for loopholes to slip around the open source concept. DRM is another example, and the whole idea of the web-based service architecture poses a significant problem for the definition. (If the software runs on a web server, is it distributed in a way that would trigger the need to share changes as defined in the GPL?) Bruce Perens, creator of the original open source definition, has already said he is looking for "what comes after open source" [4].
Looks like we'll all be looking.
Editor in Chief, Joe Casad
Infos
[1] Open Source AI definition: https://opensource.org/ai/open-source-ai-definition
[2] "AI Industry is Trying to Subvert the Definition of Open Source AI" by Bruce Schneier, November 8, 2024: https://securityboulevard.com/2024/11/ai-industry-is-trying-to-subvert-the-definition-of-open-source-ai/
[3] OSAID FAQ: https://hackmd.io/@opensourceinitiative/osaid-faq
[4] "What Comes After Open Source? Bruce Perens is Working On It" by Thomas Claburn, The Register, December 27, 2023: https://www.theregister.com/2023/12/27/bruce_perens_post_open/