Detect duplicates with fdupes
Double Trouble
The command-line fdupes tool helps you find and remove duplicate files in the directories you specify.
Hard disks have an unpleasant tendency to fill up faster than expected, and the reason is not always obvious. Tidiness matters here: untidy, poorly organized disks tend to fill up faster than well-organized ones. Because life is a mixture of order and chaos, most users probably face this problem at some point.
The unexpectedly high utilization level of hard disks is often caused by duplicate files. The typical candidates are photos, music, or videos, which can quickly occupy several gigabytes of space and are often difficult to find. There are several graphical applications on Linux to help you detect and remove duplicates like this, and there are several more for the command line.
GUI or CLI?
Well-known cleanup tools with a graphical interface include FSlint and dupeGuru. In this article, I will look at fdupes for the command line [1], first released in 2000. Most distributions carry the tool, which weighs in at just over 100KB, in their repositories; you can install it with your distribution's package manager. Listing 1 shows the commands for Debian, Fedora, and Arch Linux.
Listing 1
Installing fdupes
### Debian and derivatives:
$ sudo apt install fdupes
### Fedora:
$ sudo dnf install fdupes
### Arch Linux and derivatives:
$ sudo pacman -S fdupes
The current 2.2.1 version from September 2022 has not made its way into all repositories [2]. If you want to compile fdupes from the source code, you can use the tarball from GitHub. After unpacking, just follow the familiar three-step process of ./configure, make, and make install. As of fdupes 2.0, there are two dependencies that you may also need to resolve yourself, depending on the distribution. To do this, follow the instructions in the INSTALL file from the unpacked archive.
After the install, you can use the tool immediately without any configuration. It identifies duplicate files in the specified directories in several steps. File names play no role in detection. Instead, two files must first be the same size; if they are, fdupes compares their MD5 checksums. Finally, the software performs a byte-by-byte comparison to make sure that the files really are identical.
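The staged check described above can be sketched with standard tools. This is only an illustration of the size, checksum, and byte-by-byte sequence, not fdupes's actual implementation; same_file is a hypothetical helper, and the sketch assumes GNU coreutils (wc, md5sum, cmp) are available.

```shell
# Sketch of a staged duplicate check: size, then MD5, then byte-by-byte.
# same_file is a hypothetical helper, not part of fdupes.
same_file() {
  local a=$1 b=$2
  # Stage 1: files of different sizes can never be duplicates (cheap check).
  [ "$(wc -c < "$a")" -eq "$(wc -c < "$b")" ] || return 1
  # Stage 2: compare the MD5 checksums of the contents.
  [ "$(md5sum < "$a")" = "$(md5sum < "$b")" ] || return 1
  # Stage 3: a byte-by-byte comparison rules out a checksum collision.
  cmp -s -- "$a" "$b"
}

# Throwaway test files: a and b are identical, c has the same size only.
demo=$(mktemp -d)
printf 'same content\n' > "$demo/a.txt"
printf 'same content\n' > "$demo/b.txt"
printf 'other stuff!\n' > "$demo/c.txt"

same_file "$demo/a.txt" "$demo/b.txt" && echo "a.txt and b.txt are duplicates"
same_file "$demo/a.txt" "$demo/c.txt" || echo "a.txt and c.txt differ"
```

Ordering the checks from cheapest to most expensive means most non-duplicates are rejected by the size test alone, without reading any file contents.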
Fdupes has numerous options that let you control the search and the subsequent deduplication. Initially, you will want to familiarize yourself with the tool by running the fdupes --help command. This will help you identify the options that suit your use case.
Test Run
For the test, I created an fdupes directory in the Documents directory and then created 10 text files whose content read "fdupes finds and removes duplicates." Listing 2 shows you how to do this quickly.
Listing 2
Create Multiple Text Files at the Same Time
mkdir /home/"$USER"/Documents/fdupes \
&& cd /home/"$USER"/Documents/fdupes \
&& for i in {1..10}; do echo \
"fdupes finds and removes duplicates." \
> "fdupes${i}.txt" ; done
A subsequent ls -l confirms that the files were created. The easiest way to search for duplicates in the new directory is to use the fdupes ~/Documents/fdupes command (Figure 1). By separating the paths with spaces, you can specify multiple directories at the same time. To search recursively in directories, you need to use the -r option, as in fdupes -r ~/Documents (Figure 2). In this case, the tool finds my 10 text files along with some other duplicates. Use the -R option instead to recurse only into the directories that follow it on the command line.
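To see how the recursion options differ, you can try them on a throwaway tree. The following is a minimal sketch: the directory and file names are made up for illustration, and the fdupes calls are skipped if the tool is not installed.

```shell
# Build a throwaway tree so nothing in your home directory is touched.
# All directory and file names here are invented for the demo.
demo=$(mktemp -d)
mkdir -p "$demo/photos/2023" "$demo/backup/old"
echo "holiday" > "$demo/photos/2023/beach.jpg"
echo "holiday" > "$demo/backup/old/beach-copy.jpg"

if command -v fdupes >/dev/null 2>&1; then
  # No recursion: only files directly inside the two directories are
  # compared, so the copies in the subdirectories are not reported.
  fdupes "$demo/photos" "$demo/backup"

  # -r follows subdirectories of every listed directory.
  fdupes -r "$demo/photos" "$demo/backup"

  # -R follows subdirectories only for the directories named after it.
  fdupes "$demo/photos/2023" -R "$demo/backup"
else
  echo "fdupes is not installed; skipping the demo" >&2
fi
```

The second and third calls should both report the two copies, whereas the first has nothing to compare because neither top-level directory contains files of its own.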
The -S (--size) option shows you the size of the hits. You can use -t or --time to find out when a file was last modified. -G or --minsize=SIZE and -L or --maxsize=SIZE let you further narrow down the selection by file size.
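A short sketch of these options in combination follows. It assumes an fdupes release that supports all four options named above; the test files and the 1024-byte threshold are arbitrary choices for the demo.

```shell
# Two small duplicates (6 bytes each) and two larger ones (4096 bytes each).
demo=$(mktemp -d)
printf 'small\n' > "$demo/s1.txt"
cp "$demo/s1.txt" "$demo/s2.txt"
head -c 4096 /dev/zero > "$demo/big1.bin"
cp "$demo/big1.bin" "$demo/big2.bin"

if command -v fdupes >/dev/null 2>&1; then
  # Show size and modification time for the duplicates found.
  fdupes -S -t "$demo"

  # Ignore files smaller than 1024 bytes: the small pair drops out.
  fdupes -S --minsize=1024 "$demo"

  # Ignore files larger than 1024 bytes: the large pair drops out.
  fdupes -S --maxsize=1024 "$demo"
else
  echo "fdupes is not installed; skipping the demo" >&2
fi
```

A minimum size is a convenient way to concentrate on the large photo, music, or video files that actually free up noticeable space.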
Be Careful When Removing
But finding is only the first part of the task; after all, the goal is to delete duplicates and clean up the hard disk. This is where the -d (--delete) option comes in. When using -d, always make sure that your path specification is correct, because files deleted with fdupes cannot be recovered. The command
fdupes -d ~/Documents/fdupes
first lists the files in a numbered list (Figure 3). Note that the number at the beginning of the line will not necessarily match the number in the file name. If you now enter numbers separated by commas, the corresponding files are tagged with a plus sign and remain intact, while the software removes all duplicates tagged with a minus sign.
If you make a mistake, the rg command cancels your previous entries. Pressing Delete applies your entries. If you want to remove all duplicates except the first one displayed in each set, use the command
fdupes -r -d -N /path
You do not need to press Delete here; the -N (--noprompt) option works without any confirmation.
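Because -N deletes without asking, it can be worth previewing which files would go first. The sketch below assumes fdupes's default output format (one path per line, sets separated by blank lines) and that a plain listing run shows the files in the same order that -d -N would use. would_delete is a hypothetical helper, and the sample paths are made up.

```shell
# would_delete is a hypothetical helper, not part of fdupes. It reads
# fdupes-style output on stdin and prints every path except the first
# one of each set, which is the file that -N would preserve.
would_delete() {
  awk 'BEGIN { first = 1 }
       /^$/  { first = 1; next }   # blank line: the next path starts a new set
       first { first = 0; next }   # skip the first path of each set
             { print }'
}

# Demo with made-up paths in the same layout that fdupes prints.
printf '%s\n' \
  '/data/a/photo.jpg' '/data/b/photo.jpg' '/data/c/photo.jpg' '' \
  '/data/a/song.mp3' '/data/b/song.mp3' '' \
  | would_delete

# Typical use: review the list before running the destructive command.
#   fdupes -r /path | would_delete
#   fdupes -r -d -N /path
```

If the preview contains a file you want to keep, fall back to the interactive -d mode without -N and choose by hand.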
Another way to select files after calling fdupes with the -d option relies on the sel command. You can select all files with a specific term in the path by typing sel <term>. To select all files whose path starts with the term, use selb <term>. Use sele <term> to select files whose path ends with the term. To select all files whose path corresponds exactly to the term, use the selm <term> command. After that, you can decide which of the candidates you want to keep. Further options are described by the help command, which displays the matching fdupes man page sections.
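The difference between the four selection commands is easiest to see side by side. The following sketch merely mirrors the matching rules as described above in plain shell. matches is a hypothetical helper and says nothing about how fdupes implements the commands internally.

```shell
# matches MODE TERM PATH succeeds if PATH would be selected under the
# rule described for MODE. Hypothetical helper for illustration only.
matches() {
  local mode=$1 term=$2 path=$3
  case $mode in
    sel)  case $path in *"$term"*) return 0 ;; esac ;;  # term anywhere in path
    selb) case $path in "$term"*)  return 0 ;; esac ;;  # path begins with term
    sele) case $path in *"$term")  return 0 ;; esac ;;  # path ends with term
    selm) case $path in "$term")   return 0 ;; esac ;;  # path equals term
  esac
  return 1
}

# Try all four rules against one example path and one term.
path=/home/user/Documents/fdupes/fdupes1.txt
for mode in sel selb sele selm; do
  if matches "$mode" ".txt" "$path"; then
    echo "$mode .txt: selected"
  else
    echo "$mode .txt: not selected"
  fi
done
```

With the term .txt, only sel and sele select the example path; selb would need a term such as /home/user, and selm the complete path.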