Re: [PATCH 2/4] of: DT quirks infrastructure

From: Pantelis Antoniou
Date: Fri Feb 20 2015 - 10:38:32 EST


Hi Peter,

> On Feb 20, 2015, at 17:24 , Peter Hurley <peter@xxxxxxxxxxxxxxxxxx> wrote:
>
> On 02/20/2015 10:02 AM, Pantelis Antoniou wrote:
>> Hi Peter,
>>
>>> On Feb 20, 2015, at 17:00 , Peter Hurley <peter@xxxxxxxxxxxxxxxxxx> wrote:
>>>
>>> On 02/20/2015 09:35 AM, Ludovic Desroches wrote:
>>>> On Fri, Feb 20, 2015 at 09:21:38AM -0500, Peter Hurley wrote:
>>>>> On 02/19/2015 12:38 PM, Pantelis Antoniou wrote:
>>>>>>
>>>>>>> On Feb 19, 2015, at 19:30 , Frank Rowand <frowand.list@xxxxxxxxx> wrote:
>>>>>>>
>>>>>>> On 2/19/2015 9:00 AM, Pantelis Antoniou wrote:
>>>>>>>> Hi Frank,
>>>>>>>>
>>>>>>>>> On Feb 19, 2015, at 18:48 , Frank Rowand <frowand.list@xxxxxxxxx> wrote:
>>>>>>>>>
>>>>>>>>> On 2/19/2015 6:29 AM, Pantelis Antoniou wrote:
>>>>>>>>>> Hi Mark,
>>>>>>>>>>
>>>>>>>>>>> On Feb 18, 2015, at 19:31 , Mark Rutland <mark.rutland@xxxxxxx> wrote:
>>>>>>>>>>>
>>>>>>>>>>>>>> +While this may in theory work, in practice it is very cumbersome
>>>>>>>>>>>>>> +for the following reasons:
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +1. The act of selecting a different boot device tree blob requires
>>>>>>>>>>>>>> +a reasonably advanced bootloader with some kind of configuration or
>>>>>>>>>>>>>> +scripting capabilities. Sadly this is not the case many times, the
>>>>>>>>>>>>>> +bootloader is extremely dumb and can only use a single dt blob.
>>>>>>>>>>>>>
>>>>>>>>>>>>> You can have several bootloader builds, or even a single build with
>>>>>>>>>>>>> something like appended DTB to get an appropriate DTB if the same binary
>>>>>>>>>>>>> will otherwise work across all variants of a board.
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> No, the same DTB will not work across all the variants of a board.
>>>>>>>>>>>
>>>>>>>>>>> I wasn't on about the DTB. I was on about the loader binary, in the case
>>>>>>>>>>> the FW/bootloader could be common even if the DTB couldn't.
>>>>>>>>>>>
>>>>>>>>>>> To some extent there must be a DTB that will work across all variants
>>>>>>>>>>> (albeit with limited utility) or the quirk approach wouldn't workâ
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thatâs not correct; the only part of the DTB that needs to be common
>>>>>>>>>> is the model property that would allow the quirk detection logic to fire.
>>>>>>>>>>
>>>>>>>>>> So, there is a base DTB that will work on all variants, but that only means
>>>>>>>>>> that it will work only up to the point that the quirk detector method
>>>>>>>>>> can work. So while in recommended practice there are common subsets
>>>>>>>>>> of the DTB that might work, they might be unsafe.
>>>>>>>>>>
>>>>>>>>>> For instance on the beaglebone the regulator configuration is different
>>>>>>>>>> between white and black, it is imperative you get them right otherwise
>>>>>>>>>> you risk board damage.
>>>>>>>>>>
>>>>>>>>>>>>> So it's not necessarily true that you need a complex bootloader.
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>> +2. On many instances boot time is extremely critical; in some cases
>>>>>>>>>>>>>> +there are hard requirements like having working video feeds in under
>>>>>>>>>>>>>> +2 seconds from power-up. This leaves an extremely small time budget for
>>>>>>>>>>>>>> +boot-up, as low as 500ms to kernel entry. The sanest way to get there
>>>>>>>>>>>>>> +is by removing the standard bootloader from the normal boot sequence
>>>>>>>>>>>>>> +altogether by having a very small boot shim that loads the kernel and
>>>>>>>>>>>>>> +immediately jumps to kernel, like falcon-boot mode in u-boot does.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Given my previous comments above I don't see why this is relevant.
>>>>>>>>>>>>> You're already passing _some_ DTB here, so if you can organise for the
>>>>>>>>>>>>> board to statically provide a sane DTB that's fine, or you can resort to
>>>>>>>>>>>>> appended DTB if it's not possible to update the board configuration.
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Youâre missing the point. I canât use the same DTB for each revision of the
>>>>>>>>>>>> board. Each board is similar but itâs not identical.
>>>>>>>>>>>
>>>>>>>>>>> I think you've misunderstood my point. If you program the board with the
>>>>>>>>>>> relevant DTB, or use appended DTB, then you will pass the correct DTB to
>>>>>>>>>>> the kernel without need for quirks.
>>>>>>>>>>>
>>>>>>>>>>> I understand that each variant is somewhat incompatible (and hence needs
>>>>>>>>>>> its own DTB).
>>>>>>>>>>
>>>>>>>>>> In theory it might work, in practice this does not. Ludovic mentioned that they
>>>>>>>>>> have 27 different DTBs in use at the moment. At a relatively common 60k per DTB
>>>>>>>>>> thatâs 27x60k = 1.6MB of DTBs, that need to be installed.
>>>>>>>>>
>>>>>>>>> < snip >
>>>>>>>>>
>>>>>>>>> Or you can install the correct DTB on the board. You trust your manufacturing line
>>>>>>>>> to install the correct resistors. You trust your manufacturing line to install the
>>>>>>>>> correct kernel version (eg an updated version to resolve a security issue).
>>>>>>>>>
>>>>>>>>> I thought the DT blob was supposed to follow the same standard that other OS's or
>>>>>>>>> bootloaders understood. Are you willing to break that? (This is one of those
>>>>>>>>> ripples I mentioned in my other emails.)
>>>>>>>>>
>>>>>>>>
>>>>>>>> Trust no-one.
>>>>>>>>
>>>>>>>> This is one of those things that the kernel community doesnât understand which makes people
>>>>>>>> who push product quite mad.
>>>>>>>>
>>>>>>>> Engineering a product is not only about meeting customer spec, in order to turn a profit
>>>>>>>> the whole endeavor must be engineered as well for manufacturability.
>>>>>>>>
>>>>>>>> Yes, you can always manually install files in the bootloader. For 1 board no problem.
>>>>>>>> For 10 doable. For 100 I guess you can hire an extra guy. For 1 million? Guess what,
>>>>>>>> instead of turning a profit youâre losing money if you only have a few cents of profit
>>>>>>>> per unit.
>>>>>>>
>>>>>>> I'm not installing physical components manually. Why would I be installing software
>>>>>>> manually? (rhetorical question)
>>>>>>>
>>>>>>
>>>>>> Because on high volume product runs the flash comes preprogrammed and is soldered as is.
>>>>>>
>>>>>> Having a single binary to flash to every revision of the board makes logistics considerably
>>>>>> easier.
>>>>>>
>>>>>> Having to boot and tweak the bootloader settings to select the correct dtb (even if itâs present
>>>>>> on the flash medium) takes time and is error-prone.
>>>>>>
>>>>>> Factory time == money, errors == money.
>>>>>>
>>>>>>>>
>>>>>>>> No knobs to tweak means no knobs to break. And a broken knob can have pretty bad consequences
>>>>>>>> for a few million units.
>>>>>>>
>>>>>>> And you produce a few million units before testing that the first one off the line works?
>>>>>>>
>>>>>>
>>>>>> The first one off the line works. The rest will get some burn in and functional testing if youâre
>>>>>> lucky. In many cases where the product is very cheap it might make financial sense to just ship
>>>>>> as is and deal with recalls, if youâre reasonably happy after a little bit of statistical sampling.
>>>>>>
>>>>>> Hardware is hard :)
>>>>>
>>>>> I'm failing to see how this series improves your manufacturing process at all.
>>>>>
>>>>> 1. Won't you have to provide the factory with different eeprom images for the
>>>>> White and Black? You _trust_ them to get that right, or more likely, you
>>>>> have process control procedures in place so that you don't get 1 million Blacks
>>>>> flashed with the White eeprom image.
>>>>>
>>>>> 2. The White and Black use different memory technology so it's not as if the
>>>>> eMMC from the Black will end up on the White SMT line (or vice versa).
>>>>>
>>>>> 3 For that matter, why wouldn't you worry that all the microSD cards intended
>>>>> for the White were accidentally assembled with the first 50,000 Blacks; at
>>>>> that point you're losing a lot more than a few cents of profit. And that has
>>>>> nothing to do with what image you provided.
>>>>>
>>>>
>>>> As you said, we can imagine many reasons to have a failure during the
>>>> production, having several DTB files will increase the risk.
>>>
>>> It's interesting that you don't see the added complexity of open-coding
>>> the i2c driver or mixing DTS fragments for different designs as increased risk
>>> (for us all).
>>>
>>>
>>
>> You donât have to use it.
>
>> diff --git a/arch/arm/mach-omap2/Makefile b/arch/arm/mach-omap2/Makefile
>> index 5d27dfd..02129e7 100644
>> --- a/arch/arm/mach-omap2/Makefile
>> +++ b/arch/arm/mach-omap2/Makefile
>> @@ -259,6 +259,11 @@ obj-$(CONFIG_MACH_CRANEBOARD) += board-am3517crane.o
>>
>> obj-$(CONFIG_MACH_SBC3530) += board-omap3stalker.o
>>
>> +# DT quirks
>> +ifneq ($(CONFIG_OF_DYNAMIC),)
>> +obj-$(CONFIG_SOC_AM33XX) += am33xx-dt-quirks.o
>> +endif
>
> Won't this automatically be included on my Black that supports DT overlays?
>

Yes it will. It is a grand total of 498 lines of code, and the total size of
the code segment is about 2.2K.

You do realize that youâre probably booting a multi-platform kernel on the
black right? Including things for all 2xxx/3xxx and 44xx platforms?
For instance:

> ~/ti/kernels/linux-github.git $ wc -l arch/arm/mach-omap2/*44xx*.c
> 443 arch/arm/mach-omap2/clockdomains44xx_data.c
> 526 arch/arm/mach-omap2/cminst44xx.c
> 251 arch/arm/mach-omap2/cpuidle44xx.c
> 250 arch/arm/mach-omap2/dpll44xx.c
> 4864 arch/arm/mach-omap2/omap_hwmod_44xx_data.c
> 295 arch/arm/mach-omap2/pm44xx.c
> 358 arch/arm/mach-omap2/powerdomains44xx_data.c
> 62 arch/arm/mach-omap2/prcm_mpu44xx.c
> 770 arch/arm/mach-omap2/prm44xx.c
> 210 arch/arm/mach-omap2/prminst44xx.c
> 117 arch/arm/mach-omap2/vc44xx_data.c
> 130 arch/arm/mach-omap2/voltagedomains44xx_data.c
> 104 arch/arm/mach-omap2/vp44xx_data.c
> 8380 total

I bet those things are far larger than 2.2K. And I also bet that in the
tradeoff analysis that the board maintainer did things came down to
increasing complexity so that he can consolidate the kernels for all the
other platforms he has to support besides the black.

>
>> Some people really do though. As for increased risk
>> I expect to see arguments instead of a statement.
>
> No one is wasting your time with random arguments. Please keep your tone civil.
>

A statement like 'increasing risk for all of us' is very open ended. What is
the risk? How much of it exists?

If I offended you Iâm really sorry though, I meant no such thing.

> Regards,
> Peter Hurley
>

Regards

â Pantelis

>
>>>>> 3. The factory is just as likely to use some other customer's image by accident,
>>>>> so you're just as likely to have the same failure rate if you have no test
>>>>> process at the factory.
>>>>>
>>>>> 4. If you're using offline programming, the image has to be tested after
>>>>> reflow anyway.
>>>>>
>>>>> IOW, your QA process will not change at all == same cost.
>>>>>
>>>>> Regards,
>>>>> Peter Hurley
>>>
>>
>> Regards
>>
>> â Pantelis

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/