
How to Delete Duplicate Files in Linux with Fdupes


As important as it is to keep your disks clear of duplicate files, finding copies of files is a tiresome job that most people don’t want to do. This isn’t a problem if all you have are tiny text files that take up a few kilobytes each. But media files, especially raw images and HD videos, can eat a lot of disk space, leaving you with less room for new data and apps.

Thankfully, the Fdupes command-line utility provides a faster and more efficient way of identifying duplicate files than manually combing through your folders. Released under the MIT License, this nifty tool finds duplicate files in the directories you specify. It works by comparing the sizes and MD5 signatures of the files, followed by a byte-by-byte comparison to confirm that files with matching signatures really are identical.
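To make the hash-then-verify idea concrete, here is a minimal shell sketch of it using the standard md5sum and cmp tools. It is not how Fdupes is implemented internally; find_dupes_sketch is a hypothetical helper name, and the sketch assumes file names without newlines or leading spaces.

```shell
# Sketch only: illustrates "hash first, then confirm byte by byte".
# find_dupes_sketch is a hypothetical name, not part of fdupes.
find_dupes_sketch() {
    # Hash every regular file, then sort so equal MD5 sums sit next to each other.
    find "$1" -type f -exec md5sum {} + | sort | {
        prev_sum=""
        prev_file=""
        while read -r sum file; do
            # Same hash as the previous file: confirm with a byte-by-byte check.
            if [ "$sum" = "$prev_sum" ] && cmp -s "$prev_file" "$file"; then
                printf '%s duplicates %s\n' "$file" "$prev_file"
            else
                # New hash (or a hash collision): this file starts a new group.
                prev_sum=$sum
                prev_file=$file
            fi
        done
    }
}
```

Running find_dupes_sketch ~/Documents would print one line per extra copy it finds. The hash pass is what keeps this cheap: cmp only runs on files whose signatures already match.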

In addition to tracking down duplicates, you can also use Fdupes to delete duplicate files, replace deleted files with links to the original, etc.

How to Install Fdupes for Linux 

You’ll find Fdupes in the software repositories of most desktop distributions, such as Ubuntu, Fedora and Arch. The following instructions work on Ubuntu, Debian and other Linux flavors that are based on them (e.g. Mint and Raspberry Pi OS).

1. Update the package lists by entering this command in a terminal window.

$ sudo apt update

2. Install Fdupes.

$ sudo apt install fdupes

You can similarly install Fdupes on Fedora or other RPM-based distributions with the sudo dnf install fdupes command.

How to Find Duplicate Files in Linux with Fdupes 

Although it performs a seemingly straightforward task, Fdupes offers a good number of useful options. As the utility can also delete duplicate files, we advise you to spend some time with the man page (man fdupes) to familiarise yourself with the different command options. At a minimum, Fdupes expects the path to a directory in which to search for duplicates.

$ fdupes </path/to/directory>

  • To identify duplicate files in a given directory, pass the directory to fdupes:

$ fdupes ~/Documents

  • To recursively search through all sub-directories in the specified directory and identify all the duplicate files, add the -r option:

$ fdupes -r ~/Documents
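Once you have a listing, it helps to know how much space the copies occupy. The sketch below combines the recursive search with the -S (show sizes) and -m (summarize) options from the Fdupes man page. The wrapper name list_dupes_with_sizes is our own invention for illustration, and the check for a missing fdupes binary is just a convenience.

```shell
# list_dupes_with_sizes is a hypothetical wrapper name; the fdupes options are real.
list_dupes_with_sizes() {
    dir=${1:-.}
    if ! command -v fdupes >/dev/null 2>&1; then
        echo "fdupes is not installed" >&2
        return 127
    fi
    # -r recurses into sub-directories, -S prints the size of the files in each set.
    fdupes -r -S "$dir"
    # -m replaces the file list with a short summary of the duplicates found.
    fdupes -r -m "$dir"
}
```

For example, list_dupes_with_sizes ~/Documents lists each set with its file size and then prints the summary. Nothing is deleted by either command.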


Both of the above commands only list the duplicate files on screen, without deleting them. You must add the -d option if you want Fdupes to also delete the duplicate files it identifies. Even then, Fdupes will ask you to confirm which of the identified files you wish to retain. You can keep a single file, provide a comma-separated list of the ones you wish to retain, or keep them all.


In our example, we opted to retain two files by entering their numbers as a comma-separated list. If your specified directory has copies of several different files, Fdupes groups the copies of each file into a set and asks which files to retain one set at a time.
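If you would rather not answer a prompt for every set, one cautious approach is to preview what would go and only then delete without prompting. The sketch below assumes the -f (omit the first file in each set) and -N (with -d, keep the first file in each set and delete the rest without asking) options described in the man page. The function name prune_dupes is hypothetical, and we are assuming that both runs order the files in each set the same way, so check the preview carefully.

```shell
# prune_dupes is a hypothetical helper name; the fdupes options are real.
prune_dupes() {
    dir=$1
    if ! command -v fdupes >/dev/null 2>&1; then
        echo "fdupes is not installed" >&2
        return 127
    fi
    # Preview: -f hides the first file of each set, so only the extra copies are listed.
    fdupes -r -f "$dir"
    printf 'Delete the files listed above? [y/N] '
    read -r answer
    case $answer in
        # -d deletes; -N keeps the first file in each set without prompting.
        y|Y) fdupes -r -d -N "$dir" ;;
        *)   echo "Nothing deleted." ;;
    esac
}
```

Call it as prune_dupes ~/Documents and answer y only once you are happy with the preview. Any other answer leaves every file in place.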

This is only a basic introduction to Fdupes. There’s much more that you can do, such as ignoring hidden files or following symbolic links.
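As a starting point for exploring those extras, here is a small sketch that skips hidden and empty files while searching. It relies on the -A (exclude hidden files) and -n (exclude zero-length files) options from the man page. The wrapper name scan_visible_files is hypothetical.

```shell
# scan_visible_files is a hypothetical wrapper name; the fdupes options are real.
scan_visible_files() {
    dir=${1:-.}
    if ! command -v fdupes >/dev/null 2>&1; then
        echo "fdupes is not installed" >&2
        return 127
    fi
    # -r recurses, -A ignores hidden files, -n ignores zero-length files.
    # Add -s as well if you also want fdupes to follow symlinked directories.
    fdupes -r -A -n "$dir"
}
```

Skipping hidden files is handy in a home directory, where dotfiles and application caches often contain identical files that you should leave alone.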


  • unis_torvalds
    But how would I have ended up with duplicate files on my disk in the first place?
  • tommo1982
    Why would you create duplicate files even. They talk about user directory only. Thank heavens the author didn't mention root dir. Newcomers would use Windows mentality and clean "the system", then complain on forums how Linux stopped working.
  • hotaru.hino
    Even if the root directory was mentioned, unless you run this as sudo, it's not going to delete anything since user accounts typically don't own anything outside of their home directory by default.

    And if you run this as sudo well, that's your fault.
  • jkflipflop98
    tommo1982 said:
    Why would you create duplicate files even. They talk about user directory only. Thank heavens the author didn't mention root dir. Newcomers would use Windows mentality and clean "the system", then complain on forums how Linux stopped working.

    Well, I mean go try to delete your system32 folder and see what happens. Windows won't allow it.
    Windows wins. Again.
  • unis_torvalds
    jkflipflop98 said:
    Windows won't allow it. Windows wins. Again.
    And that's kind of the whole point of FOSS operating systems, isn't it. The vendor shouldn't win, the user should always win. But of course with closed proprietary OSes, whenever it comes down to user vs. vendor, it's MS (or Apple if you swing that way) who wins. In the FOSS world, users are in control of everything, period. It's like how we don't mandate speed governors on cars even though it means they can go over the speed limit. We just trust people to not be stupid. Do folks still get themselves into trouble? Of course. But such is the cost of freedom.
  • tommo1982
    jkflipflop98 said:
    Well, I mean go try to delete your system32 folder and see what happens. Windows won't allow it.
    Windows wins. Again.
    You'd have a hard time deleting duplicate files in system32 directory. Even if you created them intentionally, there might be some security policy preventing you from deleting it and Windows access rights are so convoluted, being an owner of a directory, an admin even, doesn't mean you can modify it.
    I had to resort to linux to remove win7 from an old drive, because win10, which was somewhere else entirely, prevented me from deleting a broken install.

    You see it's not black and white. Neither of these OS's are perfect. They cater to different people and this stupid, long lasting superiority argument is getting old.
  • hotaru.hino
    unis_torvalds said:
    It's like how we don't mandate speed governors on cars even though it means they can go over the speed limit.
    It's actually because if engines were designed with a maximum speed limit of the legal speed limit, they would be hitting close to the red line. Running an engine hard all the time is a great way to kill it faster.

    And in some cases where there is a speed governor, like on trucks, it's usually for safety/maintenance reasons (do you really want a dual trailer semi packed to the brim running down the highway at 90MPH?)