Eric Paris wrote on 13/08/2008 19:57:44:
It's clear from the protection model that you described that on 'read'
you want to wait until the scan is done before you give the data to
the process asking for it... and that's totally reasonable: "Do not give
out bad data" is a very clear line in terms of security.
For the "dirty" case it gets muddy. You clearly want to scan "some
time" after the write, from the principle of getting rid of malware
that's on the disk, but it's unclear if this HAS to be synchronous.
(obviously, synchronous behavior hurts performance bigtime so let's do
as little as we can of that without hurting the protection).
One advantage of doing the dirty case async (and a little time delayed)
is that repeated writes will get lumped up into one scan in practice,
saving a ton of performance.
(scan-on-close is just another way of implementing "delay the dirty
scan").
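
To make the "lumping" argument concrete, here is a minimal sketch of a
delayed dirty scan as a userspace simulation. Nothing here is an existing
kernel or inotify interface: the function name, the (timestamp, inode)
event format and the quiet-period rule are all assumptions made for
illustration only.

```python
def coalesce_dirty_events(events, delay_ms):
    """Simulate a delayed dirty scan (hypothetical model, not a kernel API).

    events:   list of (timestamp_ms, inode) write events, sorted by time.
    delay_ms: quiet period an inode must see before it is scanned.
    Returns a sorted list of (scan_time_ms, inode).
    """
    last_write = {}   # inode -> time of its most recent write
    scans = []
    for ts, inode in events:
        # Fire scans for inodes that stayed quiet for the full delay.
        for pending, written in list(last_write.items()):
            if ts - written >= delay_ms:
                scans.append((written + delay_ms, pending))
                del last_write[pending]
        # A new write (re)starts the quiet period for this inode.
        last_write[inode] = ts
    # End of trace: everything still dirty gets one final scan.
    for pending, written in last_write.items():
        scans.append((written + delay_ms, pending))
    return sorted(scans)


# Three quick writes to inode 1, a later write to inode 1, one to inode 2.
writes = [(0, 1), (100, 1), (200, 1), (5000, 1), (5100, 2)]
print(coalesce_dirty_events(writes, 1000))
# [(1200, 1), (6000, 1), (6100, 2)]
```

Five writes turn into three scans here; the three back-to-back writes to
inode 1 cost a single scan.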
Based on Alan's comments, to me this sounds like we should have an
efficient mechanism to notify userspace of "dirty events"; this is not
virus scan specific in any way or form. And this mechanism likely will
need to allow multiple subscribers.
I'm certainly willing to go down the inotify'ish path for async
notification of 'dirty' inodes instead of implementing my own async
mechanism if I can find a way to do it.
Do I understand correctly that everyone agrees scanning whenever an inode
gets dirty would be a terrible thing for performance?
Another thing we have here is that malware cannot necessarily be
identified until the very last write (PDF files are one example where
this will always be the case, I think).
So the whole question is at which point we should be performing an async
scan. Close seems like a natural point which should be ideal for the
majority of applications; I don't see how any time-based lumping/delaying
scheme can be better than close?
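
A rough way to compare the candidate scan points is to count scans over a
write trace. The sketch below is only a toy model under assumed names
(count_scans, the "write"/"close"/"delay" policies, the trace format); it
does not model mmap writes or files that are never closed.

```python
def count_scans(trace, policy, delay_ms=1000):
    """Count scans of one file under a given policy (hypothetical model).

    trace:  list of (timestamp_ms, op), op is "write" or "close", sorted.
    policy: "write" (scan on every write), "close" (scan on close if
            dirty) or "delay" (scan after delay_ms without writes).
    """
    scans = 0
    dirty = False
    last_write = None
    for ts, op in trace:
        # Delay policy: a long enough pause since the last write fires a scan.
        if policy == "delay" and dirty and ts - last_write >= delay_ms:
            scans += 1
            dirty = False
        if op == "write":
            if policy == "write":
                scans += 1
            else:
                dirty = True
                last_write = ts
        elif op == "close" and policy == "close" and dirty:
            scans += 1
            dirty = False
    # Delay policy: the timer for the final write still expires eventually.
    if policy == "delay" and dirty:
        scans += 1
    return scans


# A slow writer: one write every two seconds, then close.
slow = [(0, "write"), (2000, "write"), (4000, "write"),
        (6000, "write"), (6100, "close")]
for policy in ("write", "delay", "close"):
    print(policy, count_scans(slow, policy))
# write 4
# delay 4
# close 1
```

With a one second delay the slow writer is scanned four times, three of
them on a file that is still incomplete, while scan-on-close scans once
on the final content. For a tight burst of writes the two schemes both
end up with a single scan.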