Zack's Kernel News
Chronicler Zack Brown reports on the latest news, views, dilemmas, and developments within the Linux kernel community.
Inheriting Filesystem Capabilities
Christoph Lameter posted a patch to make filesystem capabilities inheritable the way the SUID bit is. When you set the SUID bit on an executable and another user runs it, it runs with your permissions rather than that user's. Any files it creates are owned by you, and any other programs it invokes likewise run as you instead of as that user.
Capabilities don't have that kind of inheritability. So, if you write a script and give it certain capabilities, such as allowing raw network access, any scripts invoked by that script will not have the capability to do raw network access. Thus, the script would not be able to rely on any other tools to help do that part of its work. Christoph said, "This is behavior that is counterintuitive to the expected behavior of processes in Unix."
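To see why a helper loses the capability, it may help to model what happens to a process's capability sets at exec time. The following is a minimal sketch, assuming the non-root rules documented in the capabilities(7) man page as they stood before ambient capabilities were introduced. The function and variable names are invented for illustration, and none of this is Christoph's patch.

```python
# Illustrative model only: capability sets are plain integers used as bitmasks.
CAP_NET_RAW = 13  # bit number of CAP_NET_RAW in <linux/capability.h>


def cap_bit(cap):
    """Return a bitmask with only capability number `cap` set."""
    return 1 << cap


def caps_after_exec(p_inheritable, p_bounding, f_permitted, f_inheritable, f_effective):
    """Model the non-root exec transformation (pre-ambient rules).

    p_* are the calling process's sets, f_* the executed file's sets.
    Returns (permitted, effective, inheritable) for the new program.
    """
    permitted = (p_inheritable & f_inheritable) | (f_permitted & p_bounding)
    effective = permitted if f_effective else 0
    inheritable = p_inheritable  # carried across exec unchanged
    return permitted, effective, inheritable


ALL = (1 << 64) - 1  # hypothetical bounding set that allows everything
NET_RAW = cap_bit(CAP_NET_RAW)

# 1. A shell with no capabilities runs a tool whose file grants CAP_NET_RAW.
tool = caps_after_exec(0, ALL, f_permitted=NET_RAW, f_inheritable=0, f_effective=True)

# 2. The tool execs a helper that carries no file capabilities at all.
helper = caps_after_exec(tool[2], ALL, f_permitted=0, f_inheritable=0, f_effective=False)

# 3. Even if the tool raises CAP_NET_RAW in its own inheritable set, the
#    helper's file must also list it as inheritable for it to survive.
helper_raised = caps_after_exec(NET_RAW, ALL, 0, 0, False)
helper_marked = caps_after_exec(NET_RAW, ALL, 0, NET_RAW, True)

print(tool[0] == NET_RAW, helper[0] == 0, helper_raised[0] == 0, helper_marked[0] == NET_RAW)
```

In this model the helper only keeps CAP_NET_RAW when both the calling process and the helper's own file mark it inheritable, which is the dependence on per-file setup that the thread goes on to debate.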
Making capabilities inheritable, Christoph said, was preferable to simply running executables with the SUID bit set. The SUID bit is a very blunt tool, giving the executable *all* the permissions of its owner, whereas capabilities are more surgical, allowing you to constrain those permissions to just the set that is needed.
Christoph pointed out that this had been a problem for quite a while and that no better alternative seemed to be available. He remarked that "some involved in security development under Linux have even stated that they want to rip out the whole thing and replace it." He explained:
This patch does not change the default behavior but it allows to set up a list of capabilities in the proc filesystem that will enable regular unix inheritance only for the selected group of capabilities.
With that it is then possible to do something trivial like setting CAP_NET_RAW on an executable that can then allow that capability to be inherited by others.
Christoph also added, "I usually do not dabble in security and I am not sure if this is done correctly. If someone has a better solution then please tell me but so far we have not seen anything else that actually works."
Serge Hallyn felt there were some dangers here. POSIX capabilities were tied to the privileges of both the user and the file itself, whereas Christoph's code seemed to tie capabilities to just the file. Serge suggested adding a new capability, listing the capabilities available to be inheritable by that particular user. The user could then choose which capabilities would be inheritable from a given executable. This way both user privileges and file privileges would be respected.
However, Serge also said, "Not saying this is a good idea necessarily, but worth thinking about."
Casey Schaufler remarked that the POSIX draft relevant to this whole question was only a draft document that had ultimately been withdrawn. So, there could be no question of true POSIX conformance or lack thereof on this issue. Casey also said:
The POSIX capability scheme is the simplest mechanism we could come up with that allows existing setuid programs to work unmodified and still make it possible to constrain specific capabilities. Is it complicated? Yes. Why is it complicated? Because you need the option of using the file capabilities to raise and lower the privilege of a program. Had we the option of requiring the programs to do that themselves, the whole thing would have been easier. You also need the option of having a capability aware program manipulate it's own capabilities.
All the UNIX systems that implemented capabilities did so using one variate or another of the POSIX scheme. One, Trusted IRIX, successfully eliminated root privilege.
In terms of Christoph's comment that some security folks wanted to rip out capabilities entirely and replace them with something else, Casey remarked, "I'm game to participate in such an effort. The POSIX scheme is workable, but given that it's 20 years old and hasn't developed real traction it's hard to call it successful."
To address POSIX capabilities' lack of traction over 20 years, Serge said, "I personally think it's two things: 1. lack of toolchain and fs support. The fact that we cannot to this day enable ping using capabilities by default because of cpio, tar and non-xattr filesystems is disheartening. 2. It's hard for users and applications to know what caps they need. Yes the API is a bear to use, but we can hide that behind fancier libraries. But using capabilities requires too much in-depth knowledge of precisely what caps you might need for whatever operations library may now do when you asked for something."
In response to Serge's first point, Mimi Zohar said, "We're working on resolving the CPIO issue. tar currently supports xattrs. At this point, how many non-xattr file systems are there really?" Austin Hemmelgarn replied, "FAT* and UFS immediately come to mind, and I know of people who use UFS for their root filesystem."
In response to Serge's second point, Casey said, "If the audit system reported the capabilities relevant to the decision you'd have what you need. If you failed because you didn't have CAP_CHMOD or you succeeded because you had CAP_SYS_ADMIN it should show up in the audit record. Other systems have used this approach."
Andy Lutomirski, however, didn't agree with Serge's point about needing filesystem support. He said, "if I hold a capability and I want to pass that capability to an exec'd helper, I shouldn't need the fs's help to do this." To which Christoph said, "amen!"
At a certain point, Christoph reined in the discussion somewhat, reiterating that the problem had lingered for too long and needed a real live patch. He stressed that in his patch, "the file being executed can inherit the parent caps without having to set caps in the new executable." He wasn't going for anything fancier than that, given that nothing fancier was actually on the table.
Serge said that he actually still preferred his earlier suggestion of introducing a new capability that listed inheritable capabilities. Christoph asked for a real live patch, and, at that point, folks delved into a technical discussion addressing various fine points and implementation details, with objections and affirmations focusing more on the code than on the high-level direction.
At this point, it seems that one form or another of Christoph's desired inheritability feature will probably eventually go into the kernel. There is still some controversy surrounding it, however. For one thing, as Casey pointed out, POSIX capabilities were never truly standardized. For another, there is an existing base of software that still needs to run properly, on top of whatever solution comes along. And, finally, there are security concerns that trump all other concerns but that also tend to be somewhat convoluted. All of these things are a recipe for compromise that makes it hard to predict what the final result will look like.
Reporting on Inactive CPUs
Some ideas that seem good end up going nowhere, at least for a while.
Yalin Wang posted a patch to cause /proc/stat to list all CPUs on a running system, regardless of whether a given CPU was online or offline. The reason, he said, was that some CPUs went online or offline dynamically and might need to be tracked. And, if a library wanted to know how many CPUs were on the system, it should get the real number, rather than just the number of CPUs currently online.
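As a rough illustration of what a library sees today, the sketch below collects the CPU numbers that have a per-CPU line in /proc/stat text. The sample text is made up, and the helper name is hypothetical; it only assumes the usual layout in which each per-CPU line starts with "cpu" followed by the CPU number.

```python
import re


def cpus_in_proc_stat(text):
    """Return the sorted CPU numbers that have a per-CPU line in /proc/stat text."""
    found = set()
    for line in text.splitlines():
        # The aggregate "cpu " line has no number and is skipped by the pattern.
        match = re.match(r"cpu(\d+)\s", line)
        if match:
            found.add(int(match.group(1)))
    return sorted(found)


# Made-up sample in which CPU 2 is offline and therefore has no line.
SAMPLE = """cpu  400 0 300 9000 10 0 5 0 0 0
cpu0 100 0 100 3000 5 0 2 0 0 0
cpu1 150 0 100 3000 3 0 2 0 0 0
cpu3 150 0 100 3000 2 0 1 0 0 0
intr 12345 0 0
"""

print(cpus_in_proc_stat(SAMPLE))
```

On a live system the same function could be fed the contents of /proc/stat. A caller that simply counts the returned entries would undercount the machine whenever a CPU is offline, which is the situation Yalin wanted to address.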
David Rientjes liked the idea, but he didn't think it was necessary to add the information to /proc/stat. The /sys/devices/cpu file reports the number of CPUs in the system. It made more sense to update that file to list all CPUs instead of just online CPUs. Andrew Morton also pointed out that /proc/cpuinfo should be updated to list all CPUs as well.
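For comparison, sysfs on current kernels exposes CPU sets as compact range lists in files such as /sys/devices/system/cpu/online and /sys/devices/system/cpu/present. Those paths are an assumption about the reader's system rather than something quoted from the thread. A minimal sketch of expanding such a list, with hypothetical helper names, might look like this:

```python
def parse_cpu_list(text):
    """Expand a kernel CPU list string such as "0-3,5" into a list of ints."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue  # an empty string means no CPUs in that set
        if "-" in part:
            first, last = part.split("-")
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


def offline_cpus(present, online):
    """Return CPUs that appear in the `present` list but not in `online`."""
    return sorted(set(parse_cpu_list(present)) - set(parse_cpu_list(online)))


# Made-up example: four CPUs present, CPU 2 currently offline.
print(parse_cpu_list("0-1,3"))
print(offline_cpus("0-3", "0-1,3"))
```

Comparing the two lists gives the total and the online count separately, so a tool does not have to infer one from the other.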
At this point, however, Yalin hit a stumbling block. In the Android kernel, some code depended on these files to determine the number of online CPUs rather than their total number. If he changed the behavior of those files, he'd break compatibility.
So, that was the end of that.
Adding Timekeeping Tests to the Kernel
John Stultz posted some timekeeping test patches. He'd hosted them on GitHub for a few years, but now he had the time to make them kernel-ready, so he wanted to get some feedback on what he should change.
The tests did things like setting the system time to a value that might lead to problems, such as the last moment of the last day of 1999. Some of John's tests could have a destructive effect along those lines, and some would produce quiet warnings if the system behaved in an improper but non-threatening way.
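The difference between the two kinds of test can be sketched in a few lines. The example below is not taken from John's suite: it computes the timestamp for the last second of 1999 in UTC (the value a destructive test might set the clock to, which would need root and is not done here), and it runs a harmless consistency check that the monotonic clock never steps backward. Both function names are hypothetical.

```python
import calendar
import time


def last_second_of_year(year):
    """Seconds since the Unix epoch for 23:59:59 UTC on December 31 of `year`."""
    return calendar.timegm((year, 12, 31, 23, 59, 59))


def monotonic_never_goes_backward(samples=100_000):
    """Non-destructive check: read the monotonic clock repeatedly and
    report whether any reading was earlier than the one before it."""
    previous = time.monotonic_ns()
    for _ in range(samples):
        current = time.monotonic_ns()
        if current < previous:
            return False
        previous = current
    return True


print(last_second_of_year(1999))
print(monotonic_never_goes_backward())
```

A check of the second kind only observes the clock, so it is safe to run by default, while anything that writes the computed timestamp back to the system clock belongs in the opt-in, destructive category.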
Richard Cochran liked the patches, and Shuah Khan did as well. She suggested having non-destructive tests run by default and having the user specify destructive tests as desired. She also suggested that John "use kselftest.h reporting mechanism for new tests. posix_timers.c is updated to use it and it would make sense use it for new tests as well."
John said he'd give that a try, although he also added, "one thing I've tried to do with my test suite is minimize any sort of test-infrastructure dependencies, so as much as possible, single test files can be plucked out, built and run by themselves." But, he said he'd ditch that plan if no one was into it.
Shuah saw the value in his approach and said that adapting to kselftest was not mandatory. She suggested he give it a shot, but "if it does become hard, I am not going to make it a requirement to use it."
It seems clear that John's patches will get into the kernel soon, perhaps with some changes. They don't seem to have any controversy attached.