Re: [PATCH] 498+ days uptime

Bernhard Heidegger (bheide@hyperwave.com)
Fri, 28 Aug 1998 11:09:59 +0200 (MET DST)


>>>>> ">" == Eric W Biederman <ebiederm@inetnebr.com> writes:

>>>>> "BH" == Bernhard Heidegger <bheide@hyperwave.com> writes:
>>>>> ">" == Zlatko Calusic <Zlatko.Calusic@CARNet.hr> writes:
>>>> Bernhard Heidegger <bheide@hyperwave.com> writes:
>>>> >>>>> ">" == Zlatko Calusic <Zlatko.Calusic@CARNet.hr> writes:
>>>>
>>>> >> "H. Peter Anvin" <hpa@transmeta.com> writes:
>>>> >> >
>>>> >> > bdflush yes, but update is not obsolete.
>>>> >> >
>>>> >> > It is still needed if you want to make sure data (and metadata)
>>>> >> > eventually gets written to disk.
>>>> >> >
>>>> >> > Of course, you can run without update, but then don't complain if
>>>> >> > you lose a file in a system crash, even if you edited and saved it
>>>> >> > a few hours ago. :)
>>>> >> >
>>>> >> > Update is very important if you have lots of RAM in your computer.
>>>> >> >
>>>> >>
>>>> >> Oh. I guess my next question then is "why", as why can't this be done
>>>> >> by kflushd as well?
>>>> >>
>>>>
>>>> >> To tell you the truth, I'm not sure why, these days.
>>>>
>>>> >> I thought it was done this way (update running in userspace) so as
>>>> >> to control how often buffers get flushed. But I believe the bdflush
>>>> >> program had this functionality, and it is long gone (as you correctly
>>>> >> noticed).
>>>>
>>>> IMHO, update/bdflush (in user space) calls sys_bdflush regularly. This
>>>> function (fs/buffer.c) calls sync_old_buffers(), which itself calls
>>>> sync_supers and sync_inodes before it goes through the dirty buffer list
>>>> (to write some dirty buffers); kflushd only writes some dirty buffers,
>>>> depending on the sysctl parameters.
>>>> If I'm wrong, please feel free to correct me!
>>>>

>>>> You are not wrong.

>>>> Update flushes metadata blocks every 5 seconds, and data blocks every
>>>> 30 seconds.

BH> My version of update (something around Slackware 3.4) does the following:
BH> 1.) calls bdflush(1,0) (fs/buffer.c:sys_bdflush) which will call
BH> sync_old_buffers() and return
BH> 2.) only if the bdflush(1,0) fails (it returns < 0) it falls back to the
BH> old behavior of sync()ing every 30 seconds

BH> But case 2) should only happen on really old kernels; on newer kernels
BH> (I'm using 2.0.34) the bdflush() should never fail.

BH> But as I said, sync_old_buffers() does:
BH> 1.) sync_supers(0)
BH> 2.) sync_inodes(0)
BH> 3.) go through dirty buffer list and may flush some buffers

BH> Conclusion: the metadata gets synced every 5 seconds and some buffers may
BH> be flushed.
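
The two-step behavior of update described above can be sketched in C as follows. This is a minimal model, not the code of any real update daemon: `update_step`, `struct update_state`, and the stub functions are hypothetical names, the flush callback merely stands in for bdflush(1,0), and the 5-second interval is an assumption taken from the "every 5 seconds" figure quoted in this thread. Callbacks are used so that the sketch compiles on systems where the bdflush system call is not available.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the two ways update can push data out. */
typedef int  (*flush_fn)(void);  /* models bdflush(1,0): < 0 means failure */
typedef void (*sync_fn)(void);   /* models sync() */

enum { FLUSH_INTERVAL = 5, SYNC_INTERVAL = 30 };  /* seconds, per the thread */

struct update_state {
    int use_sync_fallback;       /* set once the flush call has failed */
};

/* Run one wakeup of the daemon; returns the seconds to sleep afterwards. */
int update_step(struct update_state *st, flush_fn flush, sync_fn do_sync)
{
    assert(st != NULL && flush != NULL && do_sync != NULL);

    if (!st->use_sync_fallback) {
        if (flush() >= 0)
            return FLUSH_INTERVAL;   /* case 1: the kernel did the work */
        st->use_sync_fallback = 1;   /* case 2: old kernel, stop trying */
    }
    do_sync();
    return SYNC_INTERVAL;
}

/* Demo stubs so the sketch can be exercised without a real kernel call. */
int sync_calls = 0;
int  flush_ok(void)   { return 0; }
int  flush_fail(void) { return -1; }
void count_sync(void) { sync_calls++; }
```

A real daemon would call `update_step` in a loop and sleep for the returned number of seconds. Note that once the fallback is taken, this model never retries the flush call, which matches the "old behavior" described in point 2 only under that assumption.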

>>>> The question is why this functionality can't be integrated into the
>>>> kernel, so we don't have to run yet another daemon?

>> We can do this in kernel thread but I don't see the win.

I don't have a problem with the user level thing (so I can decide to not
start it ;-)

BH> Good question, but I have another one: IMHO sync_old_buffers (especially
BH> the for loop) does similar things to kflushd. Why??

>> kflushd removes buffers only when we are low on memory, and unconditionally.

>> bdflush lets buffers sit for 30 seconds and every 5 seconds it checks
>> for buffers that are at least 30 seconds old and flushes them.

Ahh, is this bh->b_flushtime?
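
The age-based rule described above (let a dirty buffer sit for 30 seconds, and check every 5 seconds for buffers that are old enough) can be sketched like this. It is a simplified model and not the kernel's buffer code: `struct buf`, `mark_dirty`, and `flush_old` are hypothetical, and the `flushtime` field only assumes that a per-buffer deadline such as b_flushtime is recorded when the buffer first becomes dirty.

```c
#include <assert.h>
#include <stddef.h>

enum { BUFFER_AGE = 30 };    /* seconds a dirty buffer may sit, per the thread */

struct buf {
    int  dirty;
    long flushtime;          /* earliest time at which it should be written */
};

/* Record the deadline only on the clean -> dirty transition, so that
 * re-dirtying an already dirty buffer does not postpone its write. */
void mark_dirty(struct buf *b, long now)
{
    assert(b != NULL);
    if (!b->dirty) {
        b->dirty = 1;
        b->flushtime = now + BUFFER_AGE;
    }
}

/* One periodic pass: write every dirty buffer whose deadline has passed.
 * Returns the number of buffers written. */
size_t flush_old(struct buf *bufs, size_t n, long now)
{
    size_t written = 0;

    assert(bufs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (bufs[i].dirty && now >= bufs[i].flushtime) {
            bufs[i].dirty = 0;   /* stands in for the actual disk write */
            written++;
        }
    }
    return written;
}
```

Calling `flush_old` every 5 seconds with the current time gives the behavior quoted above: a buffer dirtied at time t is left alone until the first pass at or after t + 30.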

>> bdflush does most of the work.

Yes, I know :-(

BH> Is it possible to reduce the sync_old_buffers() routine to something like:

>> No. Major performance problem.

Why?

Imagine an application which has most of its (index) file pages in memory,
and many of those pages are dirty. bdflush will flush the pages regularly,
but they will become dirty again immediately.
If you can be sure that the power cannot fail, performance should be
much better without bdflush, because kflushd has to write pages only if
the system is running low on memory...
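
To put a rough number on this argument, here is a small simulation of a single hot page that the application dirties once per second. It is only a sketch under stated assumptions: `count_writes` is a hypothetical helper, time advances in whole seconds, the periodic pass runs every 5 seconds with a 30-second age limit (the figures quoted earlier in this thread), and memory pressure never occurs, so without the periodic pass nothing is written at all.

```c
#include <assert.h>

enum { CHECK_INTERVAL = 5, BUFFER_AGE = 30 };   /* seconds, per the thread */

/* Count how often one constantly re-dirtied page is written to disk
 * during `duration` simulated seconds (t = 0 .. duration - 1). */
long count_writes(long duration, int periodic_flush)
{
    long writes = 0;
    long flushtime = 0;
    int  dirty = 0;

    assert(duration >= 0);
    for (long t = 0; t < duration; t++) {
        /* The application touches the page every second. */
        if (!dirty) {
            dirty = 1;
            flushtime = t + BUFFER_AGE;
        }
        /* The periodic pass writes the page once it is old enough. */
        if (periodic_flush && t % CHECK_INTERVAL == 0 &&
            dirty && t >= flushtime) {
            dirty = 0;
            writes++;
        }
    }
    return writes;
}
```

In this model the page is written again roughly every 35 seconds for as long as the application keeps touching it, whereas `count_writes(duration, 0)` stays at zero. The flip side, which the model does not show, is that all of those unwritten changes are lost if the machine crashes.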

Bernhard

get my pgp key from a public keyserver (keyID=0x62446355)
-----------------------------------------------------------------------------
Bernhard Heidegger bheide@hyperwave.com
Hyperwave Software Research & Development
Schloegelgasse 9/1, A-8010 Graz
Voice: ++43/316/820918-25 Fax: ++43/316/820918-99
-----------------------------------------------------------------------------
