Re: JFFS2 deadlock

From: David Woodhouse
Date: Mon Feb 01 2016 - 09:28:51 EST


On Thu, 2016-01-28 at 09:16 +0100, Thomas.Betker@xxxxxxxxxxxxxxxxx wrote:
>
> Subject: [PATCH] Revert "jffs2: Fix lock acquisition order bug in
> jffs2_write_begin"
> http://article.gmane.org/gmane.linux.drivers.mtd/62951
>
> This is a patch revising my original patch, which I sent to linux-mtd on
> 10-Nov-2015. I didn't see a response yet, but it's one of the outstanding
> patches above.

That looks necessary but not sufficient. I think we need this
(untested) patch on top of it, to ensure that we *always* take the page
lock before f->sem?

Please could you try what's in the tree at
http://git.infradead.org/users/dwmw2/jffs2-fixes.git
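
For illustration only, the ordering rule can be sketched in userspace with
pthread mutexes. Everything named demo_* below is a hypothetical stand-in,
not JFFS2 code: page_lock plays the role of the page cache page lock and
sem plays the role of f->sem. The write path takes the page lock first;
the GC-style path is entered with sem held, so it drops sem around the
page access and re-takes it afterwards.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for an inode; not the kernel's types. */
struct demo_inode {
	pthread_mutex_t page_lock;	/* stands in for the page lock */
	pthread_mutex_t sem;		/* stands in for f->sem */
	int page_data;
};

/* Per-thread flag, kept only so the ordering rule can be asserted. */
static _Thread_local bool holds_sem;

void demo_init(struct demo_inode *f)
{
	pthread_mutex_init(&f->page_lock, NULL);
	pthread_mutex_init(&f->sem, NULL);
	f->page_data = 0;
}

void demo_sem_lock(struct demo_inode *f)
{
	pthread_mutex_lock(&f->sem);
	holds_sem = true;
}

void demo_sem_unlock(struct demo_inode *f)
{
	holds_sem = false;
	pthread_mutex_unlock(&f->sem);
}

void demo_page_lock(struct demo_inode *f)
{
	/* The rule: never lock a page while this thread holds sem. */
	assert(!holds_sem);
	pthread_mutex_lock(&f->page_lock);
}

void demo_page_unlock(struct demo_inode *f)
{
	pthread_mutex_unlock(&f->page_lock);
}

/* Write path: page lock first, then sem. */
void demo_write(struct demo_inode *f, int value)
{
	demo_page_lock(f);
	demo_sem_lock(f);
	f->page_data = value;
	demo_sem_unlock(f);
	demo_page_unlock(f);
}

/* GC-style path: called with sem held, returns with sem held. */
int demo_gc_fetch(struct demo_inode *f)
{
	int value;

	assert(holds_sem);
	demo_sem_unlock(f);	/* drop sem before touching the page */
	demo_page_lock(f);
	value = f->page_data;	/* read only while sem is dropped */
	demo_page_unlock(f);
	demo_sem_lock(f);	/* re-take sem for the caller */
	return value;
}
```

A caller would take sem with demo_sem_lock(), call demo_gc_fetch(), and
release sem afterwards. The sketch does not model c->alloc_sem, which is
what keeps the data stable in the real code while f->sem is dropped.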

----
From: David Woodhouse <David.Woodhouse@xxxxxxxxx>
Subject: [PATCH] jffs2: Fix page lock / f->sem deadlock

With this fix, all code paths should now be obtaining the page lock before
f->sem.

Reported-by: Szabó Tamás <sztomi89@xxxxxxxxx>
Reported-by: Thomas Betker <thomas.betker@xxxxxxxxxxxxxxxxx>
Signed-off-by: David Woodhouse <David.Woodhouse@xxxxxxxxx>
Cc: stable@xxxxxxxxxxxxxxx
---
 fs/jffs2/README.Locking |  5 +----
 fs/jffs2/gc.c           | 17 ++++++++++-------
 2 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/fs/jffs2/README.Locking b/fs/jffs2/README.Locking
index 3ea3655..8918ac9 100644
--- a/fs/jffs2/README.Locking
+++ b/fs/jffs2/README.Locking
@@ -2,10 +2,6 @@
 JFFS2 LOCKING DOCUMENTATION
 ---------------------------
 
-At least theoretically, JFFS2 does not require the Big Kernel Lock
-(BKL), which was always helpfully obtained for it by Linux 2.4 VFS
-code. It has its own locking, as described below.
-
 This document attempts to describe the existing locking rules for
 JFFS2. It is not expected to remain perfectly up to date, but ought to
 be fairly close.
@@ -69,6 +65,7 @@ Ordering constraints:
    any f->sem held.
 2. Never attempt to lock two file mutexes in one thread.
    No ordering rules have been made for doing so.
+ 3. Never lock a page cache page with f->sem held.
 
 
 erase_completion_lock spinlock
diff --git a/fs/jffs2/gc.c b/fs/jffs2/gc.c
index 6fb0802..5919fef 100644
--- a/fs/jffs2/gc.c
+++ b/fs/jffs2/gc.c
@@ -1316,14 +1316,17 @@ static int jffs2_garbage_collect_dnode(struct jffs2_sb_info *c, struct jffs2_era
 BUG_ON(start > orig_start);
 }
 
- /* First, use readpage() to read the appropriate page into the page cache */
- /* Q: What happens if we actually try to GC the _same_ page for which commit_write()
-  *    triggered garbage collection in the first place?
-  * A: I _think_ it's OK. read_cache_page shouldn't deadlock, we'll write out the
-  *    page OK. We'll actually write it out again in commit_write, which is a little
-  *    suboptimal, but at least we're correct.
-  */
+ /* The rules state that we must obtain the page lock *before* f->sem, so
+  * drop f->sem temporarily. Since we also hold c->alloc_sem, nothing's
+  * actually going to *change* so we're safe; we only allow reading.
+  *
+  * It is important to note that jffs2_write_begin() will ensure that its
+  * page is marked Uptodate before allocating space. That means that if we
+  * end up here trying to GC the *same* page that jffs2_write_begin() is
+  * trying to write out, read_cache_page() will not deadlock. */
+ mutex_unlock(&f->sem);
 pg_ptr = jffs2_gc_fetch_page(c, f, start, &pg);
+ mutex_lock(&f->sem);
 
 if (IS_ERR(pg_ptr)) {
 pr_warn("read_cache_page() returned error: %ld\n",
-- 
2.5.0

--
David Woodhouse Open Source Technology Centre
David.Woodhouse@xxxxxxxxx Intel Corporation
