Re: [RFC 4/8] jbd, jbd2: Do not fail journal because of frozen_buffer allocation failure

From: Michal Hocko
Date: Wed Aug 12 2015 - 05:14:21 EST

Next message: Luis Henriques: "[PATCH 3.16.y-ckt 083/118] iio: adc: vf610: fix the adc register read fail issue"
Previous message: Wanpeng Li: "Re: [PATCH v2 5/5] mm/hwpoison: replace most of put_page in memory error handling by put_hwpoison_page"
Next in thread: Theodore Ts'o: "Re: [RFC 4/8] jbd, jbd2: Do not fail journal because of frozen_buffer allocation failure"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Wed 05-08-15 09:49:24, Greg Thelen wrote:
>
> mhocko@xxxxxxxxxx wrote:
>
> > From: Michal Hocko <mhocko@xxxxxxxx>
> >
> > Journal transaction might fail prematurely because the frozen_buffer
> > is allocated by GFP_NOFS request:
> > [ 72.440013] do_get_write_access: OOM for frozen_buffer
> > [ 72.440014] EXT4-fs: ext4_reserve_inode_write:4729: aborting transaction: Out of memory in __ext4_journal_get_write_access
> > [ 72.440015] EXT4-fs error (device sda1) in ext4_reserve_inode_write:4735: Out of memory
> > (...snipped....)
> > [ 72.495559] do_get_write_access: OOM for frozen_buffer
> > [ 72.495560] EXT4-fs: ext4_reserve_inode_write:4729: aborting transaction: Out of memory in __ext4_journal_get_write_access
> > [ 72.496839] do_get_write_access: OOM for frozen_buffer
> > [ 72.496841] EXT4-fs: ext4_reserve_inode_write:4729: aborting transaction: Out of memory in __ext4_journal_get_write_access
> > [ 72.505766] Aborting journal on device sda1-8.
> > [ 72.505851] EXT4-fs (sda1): Remounting filesystem read-only
> >
> > This wasn't a problem until "mm: page_alloc: do not lock up GFP_NOFS
> > allocations upon OOM" because small GPF_NOFS allocations never failed.
> > This allocation seems essential for the journal and GFP_NOFS is too
> > restrictive to the memory allocator so let's use __GFP_NOFAIL here to
> > emulate the previous behavior.
> >
> > jbd code has the very same issue so let's do the same there as well.
> >
> > Signed-off-by: Michal Hocko <mhocko@xxxxxxxx>
> > ---
> > fs/jbd/transaction.c | 11 +----------
> > fs/jbd2/transaction.c | 14 +++-----------
> > 2 files changed, 4 insertions(+), 21 deletions(-)
> >
> > diff --git a/fs/jbd/transaction.c b/fs/jbd/transaction.c
> > index 1695ba8334a2..bf7474deda2f 100644
> > --- a/fs/jbd/transaction.c
> > +++ b/fs/jbd/transaction.c
> > @@ -673,16 +673,7 @@ do_get_write_access(handle_t *handle, struct journal_head *jh,
> > jbd_unlock_bh_state(bh);
> > frozen_buffer =
> > jbd_alloc(jh2bh(jh)->b_size,
> > - GFP_NOFS);
> > - if (!frozen_buffer) {
> > - printk(KERN_ERR
> > - "%s: OOM for frozen_buffer\n",
> > - __func__);
> > - JBUFFER_TRACE(jh, "oom!");
> > - error = -ENOMEM;
> > - jbd_lock_bh_state(bh);
> > - goto done;
> > - }
> > + GFP_NOFS|__GFP_NOFAIL);
> > goto repeat;
> > }
> > jh->b_frozen_data = frozen_buffer;
> > diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
> > index ff2f2e6ad311..bff071e21553 100644
> > --- a/fs/jbd2/transaction.c
> > +++ b/fs/jbd2/transaction.c
> > @@ -923,16 +923,7 @@ do_get_write_access(handle_t *handle, struct journal_head *jh,
> > jbd_unlock_bh_state(bh);
> > frozen_buffer =
> > jbd2_alloc(jh2bh(jh)->b_size,
> > - GFP_NOFS);
> > - if (!frozen_buffer) {
> > - printk(KERN_ERR
> > - "%s: OOM for frozen_buffer\n",
> > - __func__);
> > - JBUFFER_TRACE(jh, "oom!");
> > - error = -ENOMEM;
> > - jbd_lock_bh_state(bh);
> > - goto done;
> > - }
> > + GFP_NOFS|__GFP_NOFAIL);
> > goto repeat;
> > }
> > jh->b_frozen_data = frozen_buffer;
> > @@ -1157,7 +1148,8 @@ int jbd2_journal_get_undo_access(handle_t *handle, struct buffer_head *bh)
> >
> > repeat:
> > if (!jh->b_committed_data) {
> > - committed_data = jbd2_alloc(jh2bh(jh)->b_size, GFP_NOFS);
> > + committed_data = jbd2_alloc(jh2bh(jh)->b_size,
> > + GFP_NOFS|__GFP_NOFAIL);
> > if (!committed_data) {
> > printk(KERN_ERR "%s: No memory for committed data\n",
> > __func__);
>
> Is this "if (!committed_data) {" check now dead code?
>
> I also see other similar suspected dead sites in the rest of the series.

You are absolutely right. I have updated the patches.

Thanks!
--
Michal Hocko
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Next message: Luis Henriques: "[PATCH 3.16.y-ckt 083/118] iio: adc: vf610: fix the adc register read fail issue"
Previous message: Wanpeng Li: "Re: [PATCH v2 5/5] mm/hwpoison: replace most of put_page in memory error handling by put_hwpoison_page"
Next in thread: Theodore Ts'o: "Re: [RFC 4/8] jbd, jbd2: Do not fail journal because of frozen_buffer allocation failure"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]