How To Find Large Files on Linux

Find Large Files on Linux
(Image credit: Tom's Hardware)

We’ve all reached the point on a system where we start to run out of storage space. Do we buy more storage, perhaps one of the best SSDs, or do we quickly search for and find the largest files? In this how-to we will look at a few simple approaches to help us maintain and manage our filesystems.

All the commands in this article will work on most Linux machines. We’ve used an Ubuntu 20.04 install, but you could run this how-to on a Raspberry Pi. All of the how-to is performed via the Terminal. If you’re not already at the command line, you can open a terminal window on most Linux machines by pressing Ctrl+Alt+T.

Listing Files In Size Order Using the ls Command in Linux

(Image credit: Tom's Hardware)

The ls command is used to list the contents of a directory in Linux. By adding the -lS argument we can order the returned results according to file size. We have copied a collection of files into a test directory to demonstrate this command, but it can be run in any directory you choose.

To list the directory contents in descending file size order, use the ls command along with the -lS argument. You will see the larger files at the top of the list, descending to the smallest files at the bottom.

ls -lS

While this command is useful for a quick overview, the size column is shown in raw bytes, which is hard to read at a glance. So how can we identify the largest files in Linux and display their sizes in a more useful form?
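One quick improvement, assuming your ls supports the -h flag (GNU coreutils and most BSD implementations do), is to add it to the long listing so sizes are printed in human-readable units while keeping the size ordering. A minimal sketch, using a throwaway demo directory:

```shell
# Make a throwaway directory with two files of known sizes.
mkdir -p /tmp/ls-demo
truncate -s 5M /tmp/ls-demo/big.bin
truncate -s 1K /tmp/ls-demo/small.txt

# -l long listing, -S sort by size (largest first), -h human-readable sizes
ls -lhS /tmp/ls-demo
```

The 5MB file appears first, with its size shown as "5.0M" rather than a raw byte count.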

Identifying Files Larger Than a Specified Size in Linux

(Image credit: Tom's Hardware)

In another article, we explained how to find files in Linux using the find command to search based on a filename or part of a filename. We can also use the find command in combination with the -size argument to specify a size threshold: any file larger than the threshold will be returned.

1. Use find to search for any file larger than 100MB in the current directory. We are working inside our test directory, and the “.” indicates to search the current directory. The -type f argument specifies returning only files as results. Finally, the +100M argument specifies that the command will only return files larger than 100MB in size. We only have one file in our test folder, Baby_Yoda.obj, that is larger than 100MB.

find . -type f -size +100M
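On its own, find only prints the matching paths. A common way to see each file's size alongside its path is to hand the matches to ls via -exec; this is a sketch run against a hypothetical demo directory, not the article's test folder:

```shell
# List every file over 100MB under the current directory with a
# human-readable size column. The {} + form batches all matching
# paths into a single ls invocation.
find . -type f -size +100M -exec ls -lh {} +
```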

2. Use the same command, but this time specify a path to search. We can run the same command as in the previous section but replace the “.” with a specified path. This means we can search the test directory from the home directory.

cd
find ./test -type f -size +100M

Searching the Whole Linux Filesystem For Large Files 

(Image credit: Tom's Hardware)

It’s sometimes useful to search the whole Linux filesystem for large files. We may have some files hidden away in our home directory that need removing. To search the entire filesystem, we will need to run the command with sudo. We might also want to limit the search to the current filesystem with the -xdev argument, for example when we suspect the files we seek are on our main filesystem. Omitting -xdev will include results from other mounted filesystems, such as an attached USB drive.

1. Open a terminal.

2. Search the current filesystem for files larger than 100MB. As we are invoking root privileges using sudo we will need to input our password. Note that we are using / to set the command to search the entire filesystem from the root of the filesystem.

sudo find / -xdev -type f -size +100M

3. Search all filesystems for files larger than 100MB. For this example connect a USB drive with a collection of files on it including some that are over 100MB in size. You should be able to scroll through the returned results and see that the larger files on the pen drive have been included in the results.

sudo find / -type f -size +100M
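find can also print each file's size itself, which saves a second command. This is a sketch using GNU find's -printf (not available in BSD find), demonstrated on a hypothetical /tmp/demo directory; swap the path for / and add sudo to scan the whole filesystem:

```shell
# Create two oversized (sparse) demo files.
mkdir -p /tmp/demo
truncate -s 150M /tmp/demo/big_a.bin
truncate -s 101M /tmp/demo/big_b.bin

# %s prints the size in bytes, %p the path.
# sort -nr orders the results numerically, largest first.
find /tmp/demo -type f -size +100M -printf '%s %p\n' | sort -nr
```

The 150MB file is listed first, each line prefixed with its exact size in bytes.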

Finding the 10 Largest Linux Files on Your Drive

What are the top ten files or directories on our machine? How large are they, and where are they located? Using a little Linux command line magic, we can target these files with a single command line.

1. Open a terminal.

2. Use the du command to search all files and then use two pipes to format the returned data.

du -aBM will search all files and directories, returning their sizes in megabytes. 

/ is the root directory, the starting point for the search.

2>/dev/null will send any errors to /dev/null ensuring that no errors are printed to the screen. 

| sort -nr is a pipe that sends the output of the du command to be the input of sort, which sorts it numerically in reverse (descending) order so the largest entries come first.

| head -n 10 will list the top ten files/directories returned from the search.

sudo du -aBM / 2>/dev/null | sort -nr | head -n 10

3. Press Enter to run the command. It will take a little time to run as it needs to check every directory of the filesystem. Once complete it will return the top ten largest files / directories, their sizes and locations.
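Often it's a directory full of small files, rather than one big file, that eats the space. du can summarize one directory level at a time; this is a sketch using GNU du's --max-depth together with GNU sort's -h flag, which sorts the human-readable sizes that du -h emits:

```shell
# Total size of each immediate subdirectory of your home directory,
# largest first. Replace "$HOME" with / (and add sudo) to survey
# the whole system.
du -h --max-depth=1 "$HOME" 2>/dev/null | sort -hr | head -n 10
```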

(Image credit: Tom's Hardware)

With this collection of commands, you have several ways to identify and locate large files in Linux. It's extremely useful to be able to do this when you need to quickly select big files for deletion to free up precious storage space. As always, take care when poking around your filesystem to ensure you aren’t deleting something critical!

Jo Hinchliffe

Jo Hinchliffe is a UK-based freelance writer for Tom's Hardware US. His writing is focused on tutorials for the Linux command line.  

  • luizfnpinho
    I was wondering what would be the best way to list the most dense folder, in terms of disk occupation. I had already a situation where an application was (wrongly) creating thousands of small files, and it was hard to find out the disk space's offender.
  • DotNetMaster777
    Useful commands
  • domih
    luizfnpinho said:
    I was wondering what would be the best way to list the most dense folder, in terms of disk occupation. I had already a situation where an application was (wrongly) creating thousands of small files, and it was hard to find out the disk space's offender.

    Suggestion: cd to a directory you have access to, then run:

    find . -maxdepth 1 -type d -print0 | xargs -0 -I {} sh -c 'echo $(find "{}" -printf "\n" | wc -l) "{}"' | sort -nr | head -n 10

    This returns the top 10 folders with the most files (not necessarily the biggest folders in size).

    Example
    domih@trx:~$ find . -maxdepth 1 -type d -print0 | xargs -0 -I {} sh -c 'echo $(find "{}" -printf "\n" | wc -l) "{}"' | sort -nr | head -n 10
    257681 .
    86846 ./.cache
    26093 ./.mozilla
    19392 ./.config
    18697 ./.phoronix-test-suite
    15151 ./.gradle
    13930 ./clion-2021.3.4
    13289 ./.local
    10035 ./.cargo
    5962 ./.wine

    Ref: https://stackoverflow.com/questions/9157138/recursively-counting-files-in-a-linux-directory
    Otherwise, to manually peruse a filesystem with a GUI:

    https://www.systranbox.com/how-to-install-baobab-linux/
    Alternatively, use ncdu, which is more useful than the GUI-based Disk Usage Analyzer (baobab):

    sudo apt install ncdu
    # Go to the directory you want to analyze or pass the directory path as parameter
    ncdu ~
    # Read the ncdu man page to look at all the options and commands.

    Ref: comment in https://www.tomshardware.com/how-to/check-disk-usage-linux
  • ravenpi
    luizfnpinho said:
    I was wondering what would be the best way to list the most dense folder, in terms of disk occupation. I had already a situation where an application was (wrongly) creating thousands of small files, and it was hard to find out the disk space's offender.

    One thing that I tend to do in these circumstances -- which I don't have time to flesh out right now -- is to take a snapshot of whatever it is I'm looking at (in your case, directories), then take another look, say, 20 seconds later, and see which one has the biggest delta -- e.g., in your case, the most files created in that interval.