Re: [RFC] LKML Archive in Maildir Format

From: Joey Pabalinas
Date: Sun Dec 16 2018 - 14:53:49 EST


On Sun, Dec 16, 2018 at 02:46:49PM -0500, Konstantin Ryabitsev wrote:
> On Sun, Dec 16, 2018 at 09:06:39AM -1000, Joey Pabalinas wrote:
> > I spent a lot of time trying to find an LKML archive in Maildir format
> > that I could use for local searches with nutmuch or something, but all
> > the links I was able to find were all dead.
> >
> > I ended up just compiling one myself and I currently host it at:
> >
> > https://alyptik.org/lkml.tar.xz
>
> You seem to have duplicated a lot of effort that has already been done
> to compile the archive on lore.kernel.org.

Absolutely correct, haha.

>
> > It's possible I'm the only weirdo who finds this kind of thing useful, but
> > I figured I should share it just in case I'm not.
>
> The maildir format is kind of terrible for LKML, because having millions
> of messages in a single directory is very hard on the underlying FS. If
> you break it up into multiple folders, then it becomes difficult to
> search. This is the main reason why we have chosen to go with the
> public-inbox format, which solves both of these problems and allows for
> a very efficient archive updating and replication using git.
>
> > It's about 1.1 million files, I was wondering if anyone had an idea of a
> > better way to host this? I've tried Github and GitLab, but they don't
> > appreciate repos with that many files, hah.
>
> Like I said, you seem to be going down the road we've already tried and
> rejected. :)

Yes, I had a strong suspicion I might be the only crazy person who prefers this
kind of format :)

My only comment on the public-mailbox choice is that the documentation
is very sparse and erratic. Myself and a couple other people just
couldn't figure out how to convert that format to Maildir or some other
format you could feed into a reader like neomutt.

Do you have any advice on how to convert those public-inbox files
correctly?

--
Cheers,
Joey Pabalinas

Attachment: signature.asc
Description: PGP signature