Re: Bug#996839: [PATCH v1] perf script flamegraph: Avoid d3-flame-graph package dependency

From: Salvatore Bonaccorso
Date: Sat Jan 14 2023 - 10:04:15 EST


Hi,

On Thu, Jan 12, 2023 at 02:28:59PM -0800, Ian Rogers wrote:
> On Wed, Jan 11, 2023 at 10:08 AM Andreas Gerstmayr
> <agerstmayr@xxxxxxxxxx> wrote:
> >
> > On 11.01.23 18:13, Ian Rogers wrote:
> > > On Wed, Jan 11, 2023 at 5:18 AM Andreas Gerstmayr <agerstmayr@xxxxxxxxxx> wrote:
> > >>
> > >> On 10.01.23 20:51, Ian Rogers wrote:
> > >>> On Tue, Jan 10, 2023 at 10:02 AM Andreas Gerstmayr
> > >>> <agerstmayr@xxxxxxxxxx> wrote:
> > >>>>
> > >>>> On 05.01.23 10:24, Ian Rogers wrote:
> > >>>>> On Wed, Jan 4, 2023 at 7:04 PM Ian Rogers <irogers@xxxxxxxxxx> wrote:
> > >>>>>>
> > >>>>>> Currently flame graph generation requires a d3-flame-graph template to
> > >>>>>> be installed. Unfortunately this is hard to come by for things like
> > >>>>>> Debian [1]. If the template isn't installed warn and download it from
> > >>>>>> jsdelivr CDN. If downloading fails generate a minimal flame graph
> > >>>>>> again with the javascript coming from jsdelivr CDN.
> > >>>>>>
> > >>>>>> [1] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=996839
> > >>>>>>
> > >>>>>> Signed-off-by: Ian Rogers <irogers@xxxxxxxxxx>
> > >>>>
> > >>>> I'm not entirely convinced that this perf script should make a network
> > >>>> request. In my opinion
> > >>>>
> > >>>> * best would be if this template gets packaged in Debian
> > >>>> * alternatively print instructions how to install the template:
> > >>>>
> > >>>> sudo mkdir /usr/share/d3-flame-graph
> > >>>> sudo wget -O /usr/share/d3-flame-graph/d3-flamegraph-base.html
> > >>>> https://cdn.jsdelivr.net/npm/d3-flame-graph@4/dist/templates/d3-flamegraph-base.html
> > >>>>
> > >>>> * or eventually just embed the entire template (66 KB) in the Python
> > >>>> file (not sure if this is permissible though, it's basically a minified
> > >>>> HTML/JS blob)?
> > >>>> * if the above options don't work, I'd recommend asking the user before
> > >>>> making the network request, and eventually persist the template
> > >>>> somewhere so it doesn't need to be downloaded every time
> > >>>>
> > >>>> What do you think?
> > >>>
> > >>> So the script following this patch is going to try 3 things:
> > >>> 1) look for a local template HTML,
> > >>> 2) if a local template isn't found warn and switch to downloading the
> > >>> CDN version,
> > >>> 3) if the CDN version fails to download then use a minimal (embedded
> > >>> in the script) HTML file with JS dependencies coming from the CDN.
> > >>>
> > >>> For (1) there are references in the generated HTML to www.w3.org,
> > >>> meaning the HTML isn't fully hermetic - although these references may
> > >>> not matter.
> > >>
> > >> The references to w3.org are XML namespace names. As far as I'm aware
> > >> they do not matter, as they are merely identifiers and are never
> > >> accessed [1,2]. Therefore the generated HTML file in its current form is
> > >> hermetic.
> > >>
> > >> This was a design decision from the start - the flame graph generation
> > >> and the resulting HTML file should not use any external resources loaded
> > >> over the network (the external host could be inaccessible or compromised
> > >> at any time). I understand that this requirement may not apply to every
> > >> user or company.
> > >>
> > >>> For (2) there will be a download from the CDN of the template during
> > >>> the execution of flamegraph. The template is better than (3) as it
> > >>> features additional buttons letting you zoom out, etc. The HTML will
> > >>> contain CDN references.
> > >>> For (3) the generated HTML isn't hermetic and will use the CDN.
> > >>>
> > >>> For (2) we could give a prompt before trying to download the template,
> > >>> or we could change it so that we give the wget command. I wouldn't
> > >>> have the wget command require root privileges, I'd say that the
> > >>> template could be downloaded and then the path of the download given
> > >>> as an option to the flamegraph program. Downloading the file and then
> > >>> adding an option to flamegraph seems inconvenient to the user,
> > >>> something that the use of urllib in the patch avoids - it is
> > >>> essentially just doing this for the user without storing the file on
> > >>> disk. I think I prefer to be less inconvenient, and so the state of
> > >>> the code at the moment looks best to me. Given that no choice results
> > >>> in a fully hermetic HTML file, they seem similar in this regard. Is it
> > >>> okay to try a download with urllib? Should there be a prompt? I think
> > >>> the warning is enough and allows scripts, etc. using flamegraph to
> > >>> more easily function across differing distributions.
> > >>
> > >> I fully agree, your changes make the flame graph generation more
> > >> convenient in case the template is not installed. Also, downloading the
> > >> template to a local directory and having to specify the path to the
> > >> template every time is an inconvenience.
> > >>
> > >> I think it is a tradeoff between convenience and security/privacy.
> > >> jsdelivr can get compromised, serving malicious JS (well, in that case,
> > >> printing instructions to jsdelivr is also flawed :|).
> > >> In the end it's up to the maintainers to decide.
> > >
> > > Agreed. It is something of an issue with the perf tool, I think noted
> > > by Arnaldo and Linus, that it has too many dependencies. We also have
> > > a problem that a number of dependencies aren't license compatible for
> > > distribution with perf's GPLv2. flamegraph.py is always installed but
> > > it carries a dependency that can't be satisfied on Debian and
> > > everything deriving from it. The prompting to install a package
> > > solution, as is generally used in the build, is one approach. Using
> > > http rather than files is the approach this change introduces, and is
> > > an approach advocated by the d3 flamegraph README.md. Perhaps package
> > > maintainers like Michael and Ben have opinions here.
> > >
> > >>> An aside, Namhyung pointed out to me that there is a "perf report -g
> > >>> folded" option, that was added in part to simplify creating
> > >>> flamegraphs:
> > >>> http://lkml.kernel.org/r/1447047946-1691-2-git-send-email-namhyung@xxxxxxxxxx
> > >>> Building off of perf report may improve/simplify the flamegraph code,
> > >>> or avoid the requirement that libpython be linked against perf.
> > >>
> > >> Yep, in this case another tool is required to generate the flame graph
> > >> visualization, for example [3].
> > >
> > > Thanks, perhaps [3] needs updating to use it as current processing
> > > appears to be done using perf script:
> > > https://github.com/brendangregg/FlameGraph/blob/master/stackcollapse-perf.pl
> > >
> > > I think users end up trying out flamegraph as they want something
> > > easier to read than perf report and the command ships with the tool.
> > > flamegraph is unique in being like this and it is a pity that the
> > > Debian half of the world has difficulty making it work.
> >
> > Maybe someone with Debian packager rights could package the template
> > [1]? Then it'd work the same way like with RPM-based distros, and it'll
> > also work in environments without network access.
>
> I chatted a little with Arnaldo and have sent out a v2 that adds the
> prompt as well as an md5sum check on the downloaded file. PTAL:
> https://lore.kernel.org/lkml/20230112220024.32709-1-irogers@xxxxxxxxxx/

Regarding having packaged the template: The request is here:
https://bugs.debian.org/1002492

Regards,
Salvatore