How to Delete Duplicate Files in Linux with Fdupes
Getting rid of duplicate files will save you a ton of disk space.
As important as it is to keep your disks clear of duplicate files, finding copies of files is a tiresome job and most people don’t want to do it. This isn’t a problem if all you have are tiny text files that take up a few kilobytes each. But media files, especially raw images and HD videos, can eat a lot of disk space, leaving you with less room for new data and apps.
Thankfully, the Fdupes command-line utility provides a faster and more efficient way of identifying duplicate files than manually combing through your folders. Released under the MIT License, this nifty tool finds duplicate files in the directories you specify. It works by comparing the file sizes and MD5 signatures of the files, followed by a byte-by-byte comparison to confirm that the matches really are identical.
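To make the hash-then-verify idea concrete, here is a rough shell sketch of the same approach built from standard tools (find, md5sum, sort and cmp). It is not how Fdupes is implemented internally, and find_dupes is a hypothetical helper name chosen purely for illustration. It assumes GNU coreutils and file names without newlines, backslashes or trailing spaces.

```shell
# Illustrative sketch only: NOT the Fdupes implementation.
# find_dupes is a hypothetical helper; it prints pairs of identical files.
find_dupes() (
    prev_hash=""
    prev_file=""
    # Hash every regular file, then sort so equal hashes land on adjacent lines.
    find "$1" -type f -exec md5sum {} + | sort | while read -r hash file; do
        # Same hash as the previous line: confirm with a byte-by-byte comparison.
        if [ "$hash" = "$prev_hash" ] && cmp -s "$prev_file" "$file"; then
            printf '%s\n%s\n\n' "$prev_file" "$file"
        fi
        prev_hash="$hash"
        prev_file="$file"
    done
)

# Try it on a throwaway directory with one duplicate pair and one unique file.
demo="$(mktemp -d)"
printf 'same content\n' > "$demo/a.txt"
cp "$demo/a.txt" "$demo/b.txt"
printf 'different\n' > "$demo/c.txt"
find_dupes "$demo"
```

With three or more identical files this sketch prints overlapping pairs rather than a single group, which is one of the conveniences a dedicated tool takes care of for you.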
In addition to tracking down duplicates, you can also use Fdupes to delete duplicate files, replace deleted files with links to the original, etc.
How to Install Fdupes for Linux
You’ll find Fdupes in the software repositories of most desktop distributions such as Ubuntu, Fedora, Arch, etc. The following instructions work on Ubuntu, Debian and other Linux flavors that are based on them (e.g., Mint and Raspberry Pi OS).
1. Update the package lists by entering this command in a terminal window.
$ sudo apt update
2. Install Fdupes.
$ sudo apt install fdupes
You can similarly install Fdupes on Fedora, or other rpm-based distributions, with the sudo dnf install fdupes command.
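Whichever package manager you used, a quick check along the following lines can confirm that the install worked. This is only a sketch: fdupes_status is a hypothetical variable name used here for illustration.

```shell
# Check whether the fdupes binary is on the PATH and report its version.
if command -v fdupes >/dev/null 2>&1; then
    fdupes --version
    fdupes_status="installed"
else
    echo "fdupes not found; install it with your distribution's package manager" >&2
    fdupes_status="missing"
fi
```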
How to Find Duplicate Files in Linux with Fdupes
Despite performing a seemingly straightforward task, Fdupes boasts a large number of useful features. As the utility can also delete duplicate files, we advise you to spend some time with the man page to familiarise yourself with the different command options. At a minimum, Fdupes expects the path to a directory in which to search for duplicates.
$ fdupes </path/to/directory>
- To identify duplicate files in a given directory, pass the directory's path to fdupes:
$ fdupes ~/Documents
- To recursively search through all sub-directories in the specified directory and identify all the duplicate files:
$ fdupes -r ~/Documents
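If you would rather experiment before pointing Fdupes at your real folders, something like the following sketch may help. It builds a throwaway directory tree (the file names are made up for illustration) and runs the recursive search on it, adding the -S option to show the size of each duplicate and -m to print a summary instead of a file list.

```shell
# Build a scratch directory so nothing in your real folders is touched.
scratch="$(mktemp -d)"
mkdir -p "$scratch/photos/backup"
printf 'holiday picture\n' > "$scratch/photos/beach.txt"
cp "$scratch/photos/beach.txt" "$scratch/photos/backup/beach-copy.txt"
printf 'unique notes\n' > "$scratch/notes.txt"

if command -v fdupes >/dev/null 2>&1; then
    # -r: recurse into sub-directories; -S: also show the size of the duplicates.
    fdupes -r -S "$scratch"
    # -m: summarize the duplicate information instead of listing each file.
    fdupes -r -m "$scratch"
else
    echo "fdupes is not installed" >&2
fi
```

Only the two beach files should be reported, since notes.txt has no copy.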
Both of the above commands only list the duplicate files on screen, without deleting them. You must use the -d option if you want Fdupes to also delete the duplicate files it identifies. Even then, Fdupes will ask you to confirm which of the identified files you wish to retain. You can keep a single file, provide a comma-separated list of the ones you wish to retain, or keep them all.
In our example, we opted to retain two files. Also, if your specified directory has multiple copies of different files, Fdupes refers to each group of copies of a single file as a set.
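Deletion is easier to trust once you have tried it somewhere harmless. The sketch below uses a scratch directory with made-up file names, lists the duplicates first, and then deletes them. To avoid the interactive prompt it adds the -N option, which, used together with -d, keeps the first file in each set and removes the rest without asking, so treat it with care on real data.

```shell
# Scratch directory with one original and two copies.
scratch="$(mktemp -d)"
printf 'report draft\n' > "$scratch/report.txt"
cp "$scratch/report.txt" "$scratch/report-copy1.txt"
cp "$scratch/report.txt" "$scratch/report-copy2.txt"

if command -v fdupes >/dev/null 2>&1; then
    # Dry run: list the duplicate sets and review them before deleting anything.
    fdupes -r "$scratch"
    # -d deletes; -N skips the prompt and preserves the first file in each set.
    fdupes -r -d -N "$scratch"
else
    echo "fdupes is not installed" >&2
fi

# Show what is left in the scratch directory.
ls "$scratch"
```

Dropping -N brings back the prompt described above, which is the safer choice when you care which copy survives.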
This is only a basic introduction to Fdupes. There’s still more that you can do, such as ignoring hidden files or following symbolic links.
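As a small taste of those extra options, the following sketch shows the effect of ignoring hidden files. It assumes the -A (--nohidden) and -s (--symlinks) options as described in the Fdupes man page, and again works on a scratch directory with made-up file names.

```shell
# Scratch directory where the only duplicate is a hidden backup file.
scratch="$(mktemp -d)"
printf 'config data\n' > "$scratch/settings.conf"
cp "$scratch/settings.conf" "$scratch/.settings.conf.bak"

if command -v fdupes >/dev/null 2>&1; then
    # Default behaviour: the hidden copy is reported as a duplicate.
    fdupes -r "$scratch"
    # -A (--nohidden): hidden files are excluded, so the pair is no longer reported.
    fdupes -r -A "$scratch"
    # -s (--symlinks): follow symlinks while searching (none exist in this demo).
    fdupes -r -s "$scratch"
else
    echo "fdupes is not installed" >&2
fi
```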
Shashank Sharma is a freelance writer for Tom’s Hardware US, where he writes about his triumphs and joys of working with the Linux CLI.
-
unis_torvalds
But how would I have ended up with duplicate files on my disk in the first place?
-
tommo1982
Why would you create duplicate files even. They talk about user directory only. Thank heavens the author didn't mention root dir. Newcomers would use Windows mentality and clean "the system", then complain on forums how Linux stopped working.
-
hotaru.hino
Even if the root directory was mentioned, unless you run this as sudo, it's not going to delete anything since user accounts typically don't own anything outside of their home directory by default.
And if you run this as sudo well, that's your fault.
-
jkflipflop98
tommo1982 said: Why would you create duplicate files even. They talk about user directory only. Thank heavens the author didn't mention root dir. Newcomers would use Windows mentality and clean "the system", then complain on forums how Linux stopped working.
Well, I mean go try to delete your system32 folder and see what happens. Windows won't allow it.
Windows wins. Again.
-
unis_torvalds
jkflipflop98 said: Windows won't allow it. Windows wins. Again.
And that's kind of the whole point of FOSS operating systems, isn't it. The vendor shouldn't win, the user should always win. But of course with closed proprietary OSes, whenever it comes down to user vs. vendor, it's MS (or Apple if you swing that way) who wins. In the FOSS world, users are in control of everything, period. It's like how we don't mandate speed governors on cars even though it means they can go over the speed limit. We just trust people to not be stupid. Do folks still get themselves into trouble? Of course. But such is the cost of freedom.
-
jkflipflop98 said: Well, I mean go try to delete your system32 folder and see what happens. Windows won't allow it. Windows wins. Again.
You'd have a hard time deleting duplicate files in system32 directory. Even if you created them intentionally, there might be some security policy preventing you from deleting it and Windows access rights are so convoluted, being an owner of a directory, an admin even, doesn't mean you can modify it.
I had to resort to linux to remove win7 from an old drive, because win10, which was somewhere else entirely, prevented me from deleting a broken install.
You see it's not black and white. Neither of these OS's are perfect. They cater to different people and this stupid, long lasting superiority argument is getting old.
-
hotaru.hino
unis_torvalds said: It's like how we don't mandate speed governors on cars even though it means they can go over the speed limit.
It's actually because if engines were designed with a maximum speed limit of the legal speed limit, they would be hitting close to the red line. Running an engine hard all the time is a great way to kill it faster.
And in some cases where there is a speed governor, like on trucks, it's usually for safety/maintenance reasons (do you really want a dual trailer semi packed to the brim running down the highway at 90MPH?)