Finding Large Files on a Linux/UNIX System

Posted by JD 08/29/2011 at 05:00

If you are like me, you are always running low on disk storage. After all, a hard drive that isn’t nearly full isn’t living up to its full potential. About every month, I need to find the largest files and clean them up, taking me back to 75% full on the “temporary storage” HDD.

Here’s a little script to make finding the largest files on your system easy. I call it hogs.sh.

$ find /home -type f -size +2G -exec ls -lh {} \; | cut -d" " -f5,8 | sort -t" " | tee ~/hogs.txt

May I suggest:

  • Drop these commands into a bash/sh file (a sketch follows this list)
  • Call it from the root crontab weekly, perhaps at 3am on Mondays
  • Redirect the output into a file for later viewing as needed
  • Overwrite the file with every run
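
Here’s a rough sketch of what that could look like. The script body is just the one-liner above wrapped in a file; the output path, search directory, and size threshold are only placeholders to adjust for your own system.

#!/bin/sh
# hogs.sh - report files larger than 2 GB so the space hogs are easy to spot.
# Tweak the starting directory and the -size threshold for your machine.
OUT="$HOME/hogs.txt"

# ls -lh gives human-readable sizes; the output file is overwritten each run.
find /home -type f -size +2G -exec ls -lh {} \; \
    | cut -d" " -f5,8 \
    | sort -t" " > "$OUT"

And a root crontab entry for 3am Mondays (the install path is only an example):

0 3 * * 1 /usr/local/bin/hogs.sh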

The script isn’t perfect. The sort doesn’t actually order by size (the human-readable sizes compare as plain text), but it does group files nicely. It also takes a while to complete, so running it at 3am is nice. Tweak the directory and the files-larger-than parameter for your own system.
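
If your sort is new enough to understand -h (GNU coreutils), a variant like this should give a real size ordering, with the biggest files at the bottom; it also swaps cut for awk so the padded ls columns don’t throw the fields off. This is just a sketch and still chokes on filenames containing spaces:

$ find /home -type f -size +2G -exec ls -lh {} \; | awk '{print $5, $NF}' | sort -h | tee ~/hogs.txt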

In running this today, I’ve found a few VDI and VMDK files that I’d forgotten about.


3.1G /Data/VirtualBox/win7-x64-pro.iso
5.3G /Data/VirtualBox/Machines/Natty/Natty-new.vdi.gz
6.2G /Data/VirtualBox/Machines/Lubuntu.vdi
8.1G /Data/VirtualBox/Machines/Gnome3/Gnome3.vdi

That’s a little space that I can recover.

This script doesn’t find directories holding thousands of tiny files that together eat 10G of storage; that is a flaw. Perhaps we can build a du-cut-sort pipeline for that as well. Combining that with the old tree shell script would make some nice output.
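
A first stab at that idea, assuming GNU du and sort (the depth and the tail count are arbitrary): total each directory a couple of levels deep and show the twenty largest.

$ du -h --max-depth=2 /Data | sort -h | tail -n 20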

Finding larger directories, in the GB size range …

$ du -h /Data | grep "^[0-9]*.[0-9]G"

No sorting by size, but having the related directories together is very handy.
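
If you do want it ordered, the same GNU sort -h trick from above applies here too, at the cost of losing that grouping:

$ du -h /Data | grep "^[0-9]*.[0-9]G" | sort -h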