Thursday, January 14, 2010

[BLUG] Something to keep in mind when using du -b

I use the du command a lot. Usually I use it with just -s and
sometimes with -h, for human-readable sizes like 100M or 4.3G.

Recently, I was checking directory sizes of some home dirs to determine
how to split up a home partition for a new server. I ran du with the
-b option so that I would get the sizes in plain bytes, 1 byte per
block. What I hadn't realized when I read the man page is that -b is
not just shorthand for --block-size=1. It is equivalent to
--apparent-size --block-size=1, so you also get --apparent-size turned
on. What that does may not be so obvious. A file's apparent size is its
length as seen by a program reading it, and that can be more or less
than the space the file really takes up on disk. For instance, if you
make a sparse file like this:

# dd if=/dev/zero of=lookslikebigfile bs=1M count=1 seek=1024

You will make a file that appears to be about 1GB in size (1025MB, since
the 1MB of data is written after a 1024MB hole), but is actually only
taking up 1MB on the disk.
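
To see both numbers for that sparse file side by side, something like
the following should work. It is only a sketch: it assumes GNU
coreutils and a filesystem that supports sparse files, and the
temporary directory and variable names are made up for the example.

```shell
# Assumes GNU coreutils (dd, du, mktemp); names here are illustrative.
workdir=$(mktemp -d)
sparse="$workdir/lookslikebigfile"

# Write 1 MiB of zeros at an offset of 1024 MiB; the skipped range is a hole.
dd if=/dev/zero of="$sparse" bs=1M count=1 seek=1024 2>/dev/null

# Apparent size: the length of the file as a reader would see it.
apparent=$(du --apparent-size --block-size=1 "$sparse" | cut -f1)
# Disk usage: the space actually allocated to the file.
on_disk=$(du --block-size=1 "$sparse" | cut -f1)

echo "apparent: $apparent bytes, on disk: $on_disk bytes"
rm -rf "$workdir"
```

The on-disk figure depends on the filesystem, so expect it to be close
to 1MB rather than exactly 1048576.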

So I was very confused when I ran du -sb on a directory and got
490837261, but then -sh returned 862M. WTF?

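It turns out that this directory had lots of files that were smaller
than the block size of the filesystem, which is the default 4096 bytes
per block. Each of those files still takes up a whole block on disk, so
the disk usage ends up much larger than the sum of the file sizes.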

Here is the difference between the three commands for the same data:

# du -s --block-size=1 dir
903499776 dir
# du -sh dir
862M dir
# du -sb dir
490837261 dir
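
A quick way to convince yourself that -b really is the two options
combined is to compare it against the long form. This is a sketch
assuming GNU du; the scratch directory and its three small files are
just made up for the test.

```shell
# Assumes GNU coreutils; the directory and file names are illustrative.
workdir=$(mktemp -d)

# A few small files, each well under a typical 4096-byte block.
for i in 1 2 3; do
    dd if=/dev/zero of="$workdir/$i" bs=1 count=900 2>/dev/null
done

short=$(du -sb "$workdir" | cut -f1)
long=$(du -s --apparent-size --block-size=1 "$workdir" | cut -f1)

echo "-b: $short    --apparent-size --block-size=1: $long"
rm -rf "$workdir"
```

Both numbers should come out the same, whatever the filesystem.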

It's kinda annoying that the -b option adds that --apparent-size
option, but none of the other size display modifying options do (like
-h, -k, and -m). It's probably a performance issue or something.
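
If you do want apparent sizes with one of those other options, you can
pass --apparent-size yourself. A minimal sketch, assuming GNU du; the
file name "small" is made up for the example.

```shell
# Assumes GNU coreutils; names are illustrative.
workdir=$(mktemp -d)
dd if=/dev/zero of="$workdir/small" bs=1 count=900 2>/dev/null

# Disk usage in KiB (the result depends on the filesystem's block size).
used_k=$(du -k "$workdir/small" | cut -f1)
# Apparent size in KiB: 900 bytes rounds up to 1.
apparent_k=$(du -k --apparent-size "$workdir/small" | cut -f1)

echo "disk usage: ${used_k}K, apparent size: ${apparent_k}K"

# The same flag combines with -h for human-readable apparent sizes.
du -sh --apparent-size "$workdir"
rm -rf "$workdir"
```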


By the way, you can experiment with this problem on your own by creating
a test directory and then running this command to create a bunch of
files that are smaller than the block size:

# mkdir test
# cd test
# for i in $(seq 1 1000) ; do dd if=/dev/urandom of="$i" bs=1 count=900 ; done

So you'd think that 1000 files of 900 bytes each would only use 900KB
on the disk, but they actually use 4MB:

# du -sb test
920480 test
# du -sh test
4.0M test
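
The arithmetic behind those two numbers can be worked out by hand. The
sketch below assumes a 4096-byte block size and that every file gets at
least one whole block; the variable names are made up for the example.

```shell
files=1000        # number of files created in the loop above
size=900          # bytes per file
block=4096        # assumed filesystem block size

# Blocks needed per file, rounded up to a whole block.
blocks_per_file=$(( (size + block - 1) / block ))

apparent=$(( files * size ))
allocated=$(( files * blocks_per_file * block ))

echo "apparent:  $apparent bytes"
echo "allocated: $allocated bytes"
```

That gives 900000 bytes apparent and 4096000 bytes allocated, which is
roughly the 4.0M that du -sh reports. The extra 20480 bytes in the
du -sb output are presumably the directory itself.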

So be careful when using du -b.

BTW, 900 bytes is about the same size as a single-line email message.
;-)


--
Mark Krenz
Bloomington Linux Users Group
http://www.bloomingtonlinux.org/