Use mhddfs to group hard disks and directories
United
The multi-hard drive disk filesystem (mhddfs) combines directories or hard disks on a union filesystem to create a single, large, virtual filesystem that you can use both locally and via Samba or NFS.
Establishing a reliable system and keeping track of a continually growing collection of movies and audio files can be very time consuming. What makes matters worse is that multimedia data typically resides on various disks.
This is where mhddfs enters the game: Using a union filesystem, it groups files from different locations to create a virtual directory. The tool not only combines existing data, it also provides details about free storage space on the individual filesystems. (See the box "What Is a Union Filesystem?")
Consequently, it is no longer a problem to use small disks to store a music collection that extends over three disks. You could just as easily store rock music on one disk, classical tracks on another, and e-books on the third. What happens, however, if your rock music disk is full, but your e-book disk still has room to spare? Things start to become untidy again.
An alternative would be to create a RAID [1] array, but you would always have to compromise between keeping your data safe and using storage space; it does not appear to be a viable solution for the example in this article. The use of LVM [2] only makes sense with RAID for reasons of data safety, and again this does not help solve the problem presented here.
Fortunately, mhddfs offers precisely the functionality that most users need in this case: If you run out of space on one of the grouped disks, the data can be migrated in the background to a different disk with free space without the user even noticing. By default, mhddfs reserves 4GB on each disk for emergencies. If needed, you can use the mlimit=<Limit> option to lower this value to as little as 100MB at the outset.
Transparent Write Access
For the virtual array to work, mhddfs makes not only read access but also write access transparent. This sets it apart from UnionFS; AuFS, as commonly used by Live media; and OverlayFS, which was recently added to the kernel. Whereas legacy union filesystems rely on copy-on-write (COW) [3] and thus write only to the top layer, mhddfs writes not just to the top layer of the filesystem, but to all underlying layers, too.
Mhddfs stores files that you add to the virtual array on the first hard disk, as long as it has sufficient space (i.e., as long as the mlimit is still upheld). After this, it checks the remaining disks in sequence to see if they have sufficient space. If none of the disks can uphold its mlimit, mhddfs uses the disk with the most free space.
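The selection rule described above can be sketched in a few lines of Python. This is a toy model rather than mhddfs's actual code: The function name choose_disk is made up for illustration, and the sketch assumes that a disk counts as "having sufficient space" when its free space is at least the mlimit value.

```python
GIB = 1024 ** 3


def choose_disk(free_bytes, mlimit=4 * GIB):
    """Return the index of the disk on which a new file would be created.

    free_bytes lists the free space of each disk in array order.
    Assumption: a disk qualifies if its free space is at least mlimit.
    """
    # First pass: walk the disks in order and take the first one
    # that still upholds the limit.
    for index, free in enumerate(free_bytes):
        if free >= mlimit:
            return index
    # Fallback: no disk upholds the limit, so use the one with
    # the most free space (the first such disk if several tie).
    return max(range(len(free_bytes)), key=lambda i: free_bytes[i])
```

With free space of 1GB, 50GB, and 20GB, the function skips the first disk and returns index 1; if all three disks are below the limit, it returns the index of the emptiest one.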
Mhddfs always stores files atomically (i.e., each file resides as a whole on a single disk), avoiding the kind of file splitting that you see with LVM. This works with all popular Linux filesystems, as well as with Samba and NFS shares, because both return correct information about occupied and free space on the respective filesystems. SSHFS does not meet this criterion, and the mhddfs developers thus warn against integrating it.
If mhddfs notices during a write that the disk in question does not have enough space, it moves the data it has already written to another disk with more space as a background operation and continues the write action on that disk. The writing program does not notice this. In other words, you can work with the virtual filesystem as if you were working on a single large disk.
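To make the idea more concrete, the following Python sketch simulates a write that runs out of room halfway through. It only does bookkeeping on a list of free-space values and does not touch any real files; the function name write_file and the choice of the emptiest disk as the migration target are assumptions made for the sake of the example, not a description of mhddfs's internals.

```python
def write_file(free, chunks, start=0):
    """Simulate writing a file chunk by chunk; return the final disk index.

    free is a list of free bytes per disk and is updated in place.
    chunks lists the size of each write in bytes.
    """
    current = start
    written = 0  # bytes of this file stored so far
    for size in chunks:
        if free[current] < size:
            # The current disk is full: look for the disk with the most room.
            target = max(range(len(free)), key=lambda i: free[i])
            if target == current or free[target] < written + size:
                raise OSError("no disk can hold the file")
            # Move what has been written so far, then continue on the target.
            free[current] += written
            free[target] -= written
            current = target
        free[current] -= size
        written += size
    return current
```

Starting with free = [100, 500] and two 60-byte chunks, the first chunk lands on disk 0, the second no longer fits there, and the whole file ends up on disk 1, leaving free at [100, 380]. The caller only ever sees one uninterrupted sequence of writes.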
No matter where data resides or how much space is available on individual disks, you only see the complete remaining free space. If you later buy a disk with enough capacity and decide to stop using the smaller disks in the mhddfs array, or if you want to use the smaller disks elsewhere, you can simply copy the content of the virtual filesystem to the new disk and unmount the smaller disks.
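The combined figure is simply the sum over the member filesystems. A minimal Python sketch of this arithmetic, using shutil.disk_usage from the standard library, might look as follows; the function name combined_usage is hypothetical, and the sketch assumes that every path sits on a different filesystem (two directories on the same disk would be counted twice).

```python
import shutil


def combined_usage(paths, usage=shutil.disk_usage):
    """Return (total, free) in bytes, summed over all given paths.

    usage is any callable that returns an object with .total and .free
    attributes for a path; shutil.disk_usage does this for real
    filesystems.
    """
    total = 0
    free = 0
    for path in paths:
        stats = usage(path)
        total += stats.total
        free += stats.free
    return total, free
```

Calling combined_usage with the mountpoints of the individual disks should yield roughly the values that df reports for the virtual filesystem.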
Flexible Storage
Mhddfs is available from the repositories of most distributions; you can thus use your distribution's package manager for the install. If you prefer to build mhddfs yourself, you can pick up the source code from the mhddfs Subversion repository [4]. Using the tool is very easy in practice. In the following example, I use three hard disks: sda1, sdb1, and sdc1; Listing 1 shows the situation at the start.
Listing 1
Example Disks
You can now create a new mountpoint for the array you will be creating and assign the permissions by typing:
mkdir /mnt/media
chmod 775 /mnt/media
From now on, the FUSE filesystem, which was installed to fulfill one of mhddfs's dependencies, comes into its own with its ability to migrate kernel space functions to userspace.
You do not need to be root to use mhddfs; a normal user account is fine. The account simply needs to belong to the fuse group. You can ensure this by typing:
addgroup <User> fuse
Now create the new array (Listing 2, line 1); the -o allow_other option lets users other than the one who mounted the array access it and create files.
Listing 2
Mount Disks to Filesystem
Additionally, you can specify the mlimit parameter, as mentioned before, but options really belong in /etc/fstab. Assuming that the mount works, you will see output as shown in Listing 2. All three disks are mounted; all logged-in users have access, and the limit is 4GB. The results, viewed using df -h, should look something like Listing 3.
Listing 3
New Filesystem
As you can see, the software has created the new filesystem; the total capacity is the sum of the capacities of the individual disks, and the same is true of the free space. The next task is to provide this setup automatically at boot time. To do so, add a new line to your /etc/fstab file (Listing 4).
Listing 4
Creating the Filesystem at Bootup
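As a rough guide to what such an entry can look like, the following Python helper assembles one line for /etc/fstab. It assumes the "mhddfs#<directories> <mountpoint> fuse <options> 0 0" form given in the mhddfs documentation; the helper name fstab_entry and the directories /mnt/sda1, /mnt/sdb1, and /mnt/sdc1 are hypothetical placeholders for wherever your three disks are mounted.

```python
def fstab_entry(branches, mount_point, mlimit=None):
    """Build an /etc/fstab line for an mhddfs array.

    branches are the directories to combine (hypothetical paths in the
    usage example); mlimit is an optional size string such as "2G".
    """
    options = ["defaults", "allow_other"]
    if mlimit:
        options.append(f"mlimit={mlimit}")
    # The member directories are joined by commas after the "mhddfs#" prefix.
    source = "mhddfs#" + ",".join(branches)
    return f"{source} {mount_point} fuse {','.join(options)} 0 0"


if __name__ == "__main__":
    print(fstab_entry(["/mnt/sda1", "/mnt/sdb1", "/mnt/sdc1"], "/mnt/media"))
```

Check the generated line against the man page for your version before adding it to /etc/fstab, because a faulty entry can interrupt the boot process.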
If you do experience problems, it is a good idea to use additional options to define where the software writes a logfile and how verbose mhddfs's output is (Listing 5). For more details, refer to the mhddfs man page [5].
Listing 5
Mhddfs Options
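The sketch below shows how these options fit together on the command line. It builds the argument list for an mhddfs call from the option names logfile, loglevel, mlimit, and allow_other. The helper name mhddfs_command, the paths in the usage example, and the accepted loglevel range of 0 (debug) to 2 (standard) are assumptions based on a reading of the man page, so verify them against the version installed on your system.

```python
import shlex


def mhddfs_command(branches, mount_point, logfile=None, loglevel=None, mlimit=None):
    """Return the argument list for mounting an mhddfs array."""
    options = ["allow_other"]
    if mlimit is not None:
        options.append(f"mlimit={mlimit}")
    if logfile is not None:
        options.append(f"logfile={logfile}")
    if loglevel is not None:
        # Assumed range: 0 = debug, 1 = info, 2 = standard messages.
        if loglevel not in (0, 1, 2):
            raise ValueError("loglevel must be 0, 1, or 2")
        options.append(f"loglevel={loglevel}")
    return ["mhddfs", ",".join(branches), mount_point, "-o", ",".join(options)]


if __name__ == "__main__":
    command = mhddfs_command(
        ["/mnt/sda1", "/mnt/sdb1"],  # hypothetical member directories
        "/mnt/media",
        logfile="/var/log/mhddfs.log",  # hypothetical logfile location
        loglevel=1,
    )
    print(shlex.join(command))
```

The helper only prints the command; nothing is mounted until you run the resulting line yourself.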
If needed, you can add more disks to the array at any time. To do so, unmount the array and mount it again with the additional disks included. Then update the line in your /etc/fstab so that the extended array is mounted automatically.
If you want to stop using the program, remove the line from the /etc/fstab file and delete the mountpoint for the array. If you have a distribution that uses Systemd, you can launch mhddfs via the init system; the "Launching mhddfs with Systemd" box describes this option.
Launching mhddfs with Systemd
Mhddfs does not come with a unit file for Systemd. For this reason, you need to create the file, named /etc/systemd/system/mnt-media.mount, then copy the contents of Listing 6 to the file. Systemd expects the name of a mount unit to match its mountpoint, so the unit for /mnt/media must be called mnt-media.mount. The command
systemctl daemon-reload
then makes Systemd read the new file. To have the array mounted at boot time, enable the unit with
systemctl enable mnt-media.mount
and, to mount the array immediately without rebooting, type
systemctl start mnt-media.mount
Listing 6
/etc/systemd/system/mnt-media.mount
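Because Systemd derives the required unit name from the mountpoint, it can help to compute the name instead of guessing it. The following Python function is a simplified re-implementation of the path escaping that the systemd-escape tool performs with its --path and --suffix=mount options; the function name mount_unit_name is made up, and for unusual paths the output of systemd-escape itself remains authoritative.

```python
def mount_unit_name(mount_point):
    """Return the mount unit name that Systemd expects for a mountpoint."""
    # Drop leading, trailing, and duplicate slashes.
    components = [part for part in mount_point.split("/") if part]
    if not components:
        return "-.mount"  # the root filesystem
    joined = "/".join(components)
    escaped = []
    for position, char in enumerate(joined):
        if char == "/":
            escaped.append("-")
        elif char.isascii() and (
            char.isalnum() or char in ":_" or (char == "." and position > 0)
        ):
            escaped.append(char)
        else:
            # Everything else, including "-" and a leading ".", becomes \xNN.
            escaped.extend("\\x%02x" % byte for byte in char.encode("utf-8"))
    return "".join(escaped) + ".mount"


if __name__ == "__main__":
    print(mount_unit_name("/mnt/media"))
```

For the mountpoint used in this article, the function returns mnt-media.mount. A hyphen in a directory name, as in /mnt/my-disk, is escaped, which gives mnt-my\x2ddisk.mount.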
On the Safe Side
The driver, which is what mhddfs is at the end of the day, focuses on a single task in the classic Unix style, and it does its job well. However, it does not offer any kind of backup in the case of failure. A disk failure in the array will therefore cause loss of data. One drawback in practice is that you do not know where the software will store a new file and thus do not know what data you stand to lose if a disk dies on you.
The only remedy is to back up the data involved. Mhddfs is often used in combination with SnapRAID [6] to add a modicum of safety. Beyond this, you can also mirror the array one-to-one. To do so, create a second mhddfs instance on your backup disk and synchronize the two instances using Rsync or a similar tool [7].