Re: [PATCH v10 08/14] mm: multi-gen LRU: support page table walks

From: Yu Zhao
Date: Fri Apr 15 2022 - 02:26:27 EST


On Thu, Apr 14, 2022 at 7:57 PM Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> On Thu, 14 Apr 2022 19:14:54 -0600 Yu Zhao <yuzhao@xxxxxxxxxx> wrote:
>
> > On Mon, Apr 11, 2022 at 8:16 PM Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
> > >
> > > On Wed, 6 Apr 2022 21:15:20 -0600 Yu Zhao <yuzhao@xxxxxxxxxx> wrote:
> > >
> > > > +static void update_batch_size(struct lru_gen_mm_walk *walk, struct folio *folio,
> > > > + int old_gen, int new_gen)
> > > > +{
> > > > + int type = folio_is_file_lru(folio);
> > > > + int zone = folio_zonenum(folio);
> > > > + int delta = folio_nr_pages(folio);
> > > > +
> > > > + VM_BUG_ON(old_gen >= MAX_NR_GENS);
> > > > + VM_BUG_ON(new_gen >= MAX_NR_GENS);
> > >
> > > > General rule: don't add new BUG_ONs, because they crash the kernel.
> > > It's better to use WARN_ON or WARN_ON_ONCE then try to figure out a way
> > > to keep the kernel limping along. At least so the poor user can gather logs.
> >
> > These are VM_BUG_ONs, which are BUILD_BUG_ONs except for (mostly MM) developers.
>
> I'm told that many production builds enable runtime VM_BUG_ONning.

Nobody wants to debug VM in production. Some distros that offer both
the latest/LTS kernels do enable CONFIG_DEBUG_VM in the former so the
latter can have better test coverage when it becomes available. Do
people use the former in production? Absolutely, otherwise we wouldn't
have enough test coverage. Are we supposed to avoid CONFIG_DEBUG_VM? I
don't think so, because it defeats the purpose of those distros
enabling it in the first place.

The bottom line is that none of RHEL 8.5, SLES 15, or Debian 11 enables
CONFIG_DEBUG_VM.
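
For context on why the config option matters here: without
CONFIG_DEBUG_VM, VM_BUG_ON() reduces to BUILD_BUG_ON_INVALID(), which
only type-checks the condition and generates no code. Below is a rough
userspace sketch of that behavior. All SKETCH_* names and the helper
functions are made up for illustration and are not the kernel's
definitions; the value of SKETCH_MAX_NR_GENS is a placeholder.

```c
#include <stdio.h>
#include <stdlib.h>

#define SKETCH_MAX_NR_GENS 4 /* placeholder value, for illustration only */

/* Hypothetical stand-in for BUG_ON(): report the condition and abort. */
#define SKETCH_BUG_ON(cond)                              \
    do {                                                 \
        if (cond) {                                      \
            fprintf(stderr, "BUG: %s\n", #cond);         \
            abort();                                     \
        }                                                \
    } while (0)

#ifdef SKETCH_DEBUG_VM
/* Debug builds: the check is real and fatal. */
#define SKETCH_VM_BUG_ON(cond) SKETCH_BUG_ON(cond)
#else
/*
 * Non-debug builds: the condition is parsed and type-checked inside
 * sizeof, but never evaluated, so no code is emitted for it.
 */
#define SKETCH_VM_BUG_ON(cond) ((void)sizeof((long)(cond)))
#endif

/* Counts how many times the condition was actually evaluated. */
static int evaluations;

static int gen_out_of_range(int gen)
{
    evaluations++;
    return gen >= SKETCH_MAX_NR_GENS;
}

/* Returns the number of runtime evaluations the check caused. */
static int sketch_check(int gen)
{
    evaluations = 0;
    SKETCH_VM_BUG_ON(gen_out_of_range(gen));
    return evaluations;
}
```

Built without -DSKETCH_DEBUG_VM, sketch_check() reports zero
evaluations even for an out-of-range generation; built with it, the
same out-of-range call aborts.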