2.1.79 disk quota lockup problem found

Jonathan Corbet (corbet@atd.ucar.edu)
Thu, 15 Jan 1998 15:11:17 -0700


As you may recall from yesterday's episode, I was having trouble getting
quotas working on a new partition. Once quotas were enabled, processes
would start hanging in 'D' wait states, never to run again. While this
did, in fact, keep users from using too much disk space, some of our more
sensitive users found it to be a bit heavy-handed.

So I set out to find a better solution. Clue #1 was that 'edquota -t'
would always cause the *next* quota-related activity (e.g. 'repquota') to
hang. Once that happened, lots of other things (e.g. 'sync') were doomed
as well.

Here's what happens. Something like 'edquota -t' modifies the quota
'entry' for UID 0, which then has to be written out. In fs/dquot.c, there
is a routine, write_dquot, which will happily do that. It comes down to this:

    if (filp->f_op->write(filp, (char *)&dquot->dq_dqb,
                          sizeof(struct dqblk), &offset) == sizeof(struct dqblk))
            dquot->dq_flags &= ~DQ_MOD;

Normally this works fine. However, the disk in question did not yet have a
quota entry for UID zero, so it had to allocate another block for the quota
file, which meant that the quota information for root had to be adjusted,
which meant that the root dquot entry had to be locked. But write_dquot
already has it locked, so everything stops.
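
The chain of events above can be modelled in a few lines of user-space C.
This is only a sketch of the locking pattern, not the kernel code: every
model_* name is hypothetical, the "lock" is a plain flag, and where the
kernel would put the process to sleep forever the model returns
MODEL_WOULD_DEADLOCK instead.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's dquot entry; only what the model needs. */
struct model_dquot {
    unsigned int id;   /* UID this entry describes */
    bool locked;       /* true while the entry is held */
    bool modified;     /* plays the role of DQ_MOD */
};

enum model_result { MODEL_OK = 0, MODEL_WOULD_DEADLOCK = 1 };

/* Non-recursive lock: a second attempt on a held entry can never succeed.
 * The kernel would sleep here ('D' state); the model reports it instead. */
static enum model_result model_lock(struct model_dquot *dq)
{
    if (dq->locked)
        return MODEL_WOULD_DEADLOCK;
    dq->locked = true;
    return MODEL_OK;
}

static void model_unlock(struct model_dquot *dq)
{
    assert(dq->locked);
    dq->locked = false;
}

/* Charging a newly allocated block to the quota file's owner needs the
 * owner's entry locked while its usage is adjusted. */
static enum model_result model_charge_block(struct model_dquot *owner)
{
    if (model_lock(owner) != MODEL_OK)
        return MODEL_WOULD_DEADLOCK;
    owner->modified = true;
    model_unlock(owner);
    return MODEL_OK;
}

/* Writing into the quota file allocates a block if none is there yet. */
static enum model_result model_file_write(struct model_dquot *file_owner,
                                          bool block_allocated)
{
    if (!block_allocated)
        return model_charge_block(file_owner);
    return MODEL_OK;
}

/* Same shape as the write_dquot excerpt: hold the entry, write it out,
 * and clear the modified flag only if the write went through. */
enum model_result model_write_dquot(struct model_dquot *dq,
                                    struct model_dquot *file_owner,
                                    bool block_allocated)
{
    enum model_result r;

    if (model_lock(dq) != MODEL_OK)
        return MODEL_WOULD_DEADLOCK;
    r = model_file_write(file_owner, block_allocated);
    if (r == MODEL_OK)
        dq->modified = false;
    model_unlock(dq);
    return r;
}
```

Calling model_write_dquot(&root, &root, false), with root standing for the
UID 0 entry and no block allocated yet, reports MODEL_WOULD_DEADLOCK. The
same call with the block already allocated, or for an entry other than the
file owner's, returns MODEL_OK.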

The workaround for this is easy: use 'dd' to be sure that the first block
of 'quota.user' is really allocated, or simply run 'quotacheck' before
turning quotas on (see below). I do, however, see this as a bug -- you
should not be able to lock your system in this way. I'm not quite sure
what the right fix for this would be; I may try to work that out a little
later on.
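
For what it's worth, the 'dd' half of that workaround could also be done
with a small C helper along the following lines. This is a sketch under
assumptions: ensure_first_block is a hypothetical name, the block size is
whatever the caller passes in, and an all-zero first block is assumed to be
harmless for the quota file. It only looks at the file size, so it would not
notice a file that is long enough but sparse. It should be run before quotas
are turned on for the partition.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Pad the file at 'path' with real zero bytes until it is at least
 * block_size long, so its first block is allocated on disk.
 * Existing contents are never overwritten (O_APPEND).
 * Returns 0 on success, -1 on any error. */
int ensure_first_block(const char *path, size_t block_size)
{
    struct stat st;
    char *zeros = NULL;
    size_t left, done = 0;
    int rc = -1;
    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0600);

    if (fd < 0)
        return -1;
    if (fstat(fd, &st) != 0)
        goto out;
    if (st.st_size >= (off_t)block_size) {
        rc = 0;                 /* already long enough, nothing to do */
        goto out;
    }

    left = block_size - (size_t)st.st_size;
    zeros = calloc(1, left);    /* zero-filled padding */
    if (zeros == NULL)
        goto out;

    /* write() may be partial, so loop until all padding is written */
    while (done < left) {
        ssize_t n = write(fd, zeros + done, left - done);
        if (n <= 0)
            goto out;
        done += (size_t)n;
    }
    if (fsync(fd) != 0)         /* push the new block out to disk */
        goto out;
    rc = 0;
out:
    free(zeros);
    if (close(fd) != 0)
        rc = -1;
    return rc;
}
```

A call such as ensure_first_block("quota.user", 1024) would be the intended
use; the 1024 is an assumed block size, not something taken from the kernel
source.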

Meanwhile: what was I doing running quotas without having run quotacheck?
Well... I *had* run it, via the Red Hat 'rc.sysinit' script. It all
happens automatically at boot time, so I thought I didn't need to worry
about it. However, they run quotacheck on the local partitions *before*
mounting them. Quotacheck gladly runs in this mode, since it seems to work
with the disk device directly. Then it writes 'quota.user' at the given
mount point. But, since the partition is not yet actually mounted,
quota.user ends up in the root partition, and doesn't do anybody any good.
This, too, is a bug, and I'll be sending Red Hat a note shortly.
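
One way to guard against writing 'quota.user' into an unmounted mount point
is to check first that something is really mounted there. The classic test
compares the device numbers of the directory and its parent. The helper
below is a sketch of that test, not anything taken from quotacheck or from
the Red Hat scripts; is_mount_point is a hypothetical name, and the check
will not notice a mount from the same device as the parent directory.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Returns 1 if 'dir' is a mount point, 0 if it is an ordinary directory,
 * and -1 on error (including 'dir' not being a directory). */
int is_mount_point(const char *dir)
{
    char parent[4096];
    struct stat self, up;
    int n = snprintf(parent, sizeof parent, "%s/..", dir);

    if (n < 0 || (size_t)n >= sizeof parent)
        return -1;              /* path too long for the buffer */
    if (stat(dir, &self) != 0 || stat(parent, &up) != 0)
        return -1;
    if (!S_ISDIR(self.st_mode))
        return -1;

    /* Different device than the parent: a filesystem is mounted here.
     * Same device and same inode as the parent: this is "/" itself. */
    return self.st_dev != up.st_dev || self.st_ino == up.st_ino;
}
```

A boot script or checking tool could refuse to write the quota file when
this returns 0 for the intended mount point.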

Sorry for the length of this, and thanks to those who sent me replies to my
first message...

jon

Jonathan Corbet
National Center for Atmospheric Research, Atmospheric Technology Division
corbet@atd.ucar.edu http://www.atd.ucar.edu/rdp/jmc.html