sprocket i/o

thomas stromberg on technology, nature, and motorcycles

sprocket i/o header image 2

Monitoring filesystem activity under Linux with block_dump

May 27th, 2006 · Comments

kexWhen giving interviews, I tend to ask a lot of scenario type questions to determine how someone makes use of tools and logic to solve a system administration problem. One of my favorite questions was this, for whichever operating system that person was interviewing for:

The administrator for an application running on your machine is complaining about slow disk access. Tell me how you would determine the root cause of the issue?

I then get a bit deeper, and eventually lead them to the pinnacle question: How do you find out what files are being hammered with writes at that time? It’s not an easy question to answer, particularly if your operating system does not support dtrace. Until Linux 2.6.1, this question was not easily answered for Linux administrators.

In Linux 2.6.1, a widely unknown and poorly implemented feature was added that allows you to monitor filesystem activity on a global scale: /proc/sys/vm/block_dump. Rather than being a system utility in other operating systems, it dumps each filesystem event into syslog. This of course, generates more filesystem activity (especially if syslogd is writing in sync mode!). Regardless, block_flush can be enormously useful. To enable it, the easiest way is to use sysctl:

% sudo sysctl vm.block_dump=1

Now it should start dumping gobs of activity into your syslog files. If you don’t see anything, you may need to enable kern.debug logging in syslog.conf. It should look a lot like this:

% tail -f /var/log/debug
May 27 10:00:20 kex kernel: tail(11548): dirtied inode 14093100 (ld.so.cache) on sda1
May 27 10:00:20 kex kernel: tail(11548): dirtied inode 15269995 (libm.so.6) on sda1
May 27 10:00:20 kex kernel: tail(11548): dirtied inode 15270399 (libm-2.3.2.so) on sda1
...
May 27 10:00:20 kex kernel: tail(11548): dirtied inode 18154515 (locale-archive) on sda1
May 27 10:00:21 kex kernel: pdflush(140): WRITE block 76808312 on sda1
..
May 27 10:00:21 kex kernel: pdflush(140): WRITE block 211026104 on sda1

There are two main types of entries that block_flush generates: dirty inodes and block read/writes. Dirty inode entries just mean that something altered the inode for that file, most commonly it’s due to an update to that files atime (access time) or mtime (modification time). If you look at the beginning of the output above, you’ll see a lot of dirtied library inodes. These are the libraries linked against the tail command that was just executed, so their atime was just updated. The remainder, Block reads and writes, are just normal reads and writes to the hard disk.

Collecting some basic statistics from block_dump

Since it is logging everything to a file, it’s pretty easy to use some basic UNIX utilities to discover what processes are reading and writing the most blocks on a particular hard disk. This will tell you the top 10 processes reading from the sda1 hard disk:

% grep sda1 /var/log/debug | grep READ | cut -d: -f4 | sort | uniq -c | sort -rn | head
   1056  find(11566)
    247  mysqld(11855)
     43  mysqld(11751)
     26  mysqld(11790)
     21  tar(11717)
     16  mysqladmin(11748)
     13  mysqld(11899)
     13  grep(4560)
     12  mysqld(11863)
      6  mysqld(11760)

And writes, too. This time we will summarize based on process name instead of having one line per process id. This shows off an interesting bug you will see, though I’m not sure what component is responsible for the truncated process names:

  50948  kjournald
  19564  pdflush
     33  mysqld
      7  syslog-ng
      4  k)
      3  kjour)
      3  )
      2  kjou)
      2  kjo)
      2  kj)

Ahwell.. since when can we expect perfection from either syslog or Linux? :)

Who the heck deleted my file?

One question that came up at work this week was trying to discover who or what kept deleting files within a certain directory. Lacking any real kind of security auditing for file events, I figured that block_dump would be the best solution. The trick with deleted files is that block_dump doesn’t know the filename to log, so you have to know what the inode number was for the file in question:

% ls -li /localwww/demo/member_test.pyc
3735592 -rwxr-xr-x  1 frogdog eng 636 May 18 12:02 run_cgi.pyc

So, we know we are looking for something that alters inode 3735592. Lets get to work:

% sudo sysctl vm.block_dump=1
vm.block_dump = 1

Once the file gets deleted, grep for the inode number in your syslog output. Don’t forget to turn off block_dump to avoid filling up your hard disk:

% sudo sysctl vm.block_dump=0
vm.block_dump = 0
% grep -C5 3735592 /var/log/debug  | grep -v kjournald
May 27 10:34:01 kex kernel: zsh(12486): dirtied inode 4800591 (rm) on sda1
May 27 10:34:01 kex kernel: rm(12486): dirtied inode 3735592 (?) on sda1
May 27 10:34:01 kex kernel: zsh(12373): dirtied inode 13178003 (utmp) on sda1

So, we see that rm deleted the file, which was in turn launched by zsh. Since the pid’s for zsh and rm match (12486), you can tell that it was a fork+exec, so the trick now is to find what began 12486. Sadly, this is not exactly a science, but if you look back at the logs, you can help narrow down what might have happened. If we search for the remaining directory components (/local/www/demo), we can see that someone traversed the directories one at a time and ran ls:

May 27 10:33:18 kex kernel: zsh(12373): dirtied inode 2 (/) on sda1
May 27 10:33:19 kex kernel: zsh(12373): dirtied inode 2506753 (local) on sda1
May 27 10:33:19 kex kernel: ls(12455): dirtied inode 2654437 (www) on sda1
May 27 10:33:25 kex kernel: zsh(12466): dirtied inode 4800585 (ls) on sda1
May 27 10:33:25 kex kernel: ls(12466): dirtied inode 3735561 (demo) on sda1

The inodes that ls dirtied are of no help to us, because it does not tell us the master zsh process. However, the inodes that zsh dirties directly is of great help. These are directories that the user used “cd” to get to, which as an internal command, does not generate another process id. We can now search by the 12373 and find out more information:

% grep 12373 /var/log/debug
May 27 10:32:52 kex kernel: sshd(12373): dirtied inode 14094296 (motd) on sda1
May 27 10:32:52 kex kernel: sshd(12373): dirtied inode 4800568 (zsh4) on sda1
May 27 10:32:52 kex kernel: zsh(12373): dirtied inode 15269997 (libnsl.so.1) on sda1
May 27 10:32:52 kex kernel: zsh(12373): dirtied inode 15270402 (libnsl-2.3.2.so) on sda1

Ah ha! The user came in through sshd. Now it’s just a matter of tracking down the last logs for this time:

% last | grep 10:32
frogdog  pts/6        localhost.locald Sat May 27 10:32   still logged in

Doh, it’s just me. Still an interesting way to research filesystem events, no?

Tags: technology

Viewing 7 Comments

 

Trackbacks

(Trackback URL)

close Reblog this comment
blog comments powered by Disqus