Re: [Patch v2 2/2] x86/tsc: use logical_packages as a better estimation of socket numbers
From: Zhang, Rui
Date: Fri Jun 16 2023 - 05:19:33 EST
On Fri, 2023-06-16 at 10:10 +0200, Peter Zijlstra wrote:
> On Fri, Jun 16, 2023 at 10:02:31AM +0200, Peter Zijlstra wrote:
> > On Fri, Jun 16, 2023 at 06:53:21AM +0000, Zhang, Rui wrote:
> > > On Thu, 2023-06-15 at 11:20 +0200, Peter Zijlstra wrote:
> >
> > > > So I have at least two machines where I boot with 'possible_cpus=#'
> > > > because the BIOS MADT is reporting a stupid number of CPUs that aren't
> > > > actually there.
> > >
> > > Does the MADT report those CPUs as disabled but online capable?
> > > Can you send me a copy of the acpidump?
> >
> > Sent privately, it's a bit big.
>
> So if I remove 'possible_cpus=40' it does crazy shit like this:
>
> [ 1.268447] setup_percpu: NR_CPUS:512 nr_cpumask_bits:160 nr_cpu_ids:160 nr_node_ids:2
>
> [ 1.303567] pcpu-alloc: [0] 000 001 002 003 004 005 006 007
> [ 1.309871] pcpu-alloc: [0] 008 009 020 021 022 023 024 025
> [ 1.316172] pcpu-alloc: [0] 026 027 028 029 040 042 044 046
> [ 1.322475] pcpu-alloc: [0] 048 050 052 054 056 058 060 062
> [ 1.328777] pcpu-alloc: [0] 064 066 068 070 072 074 076 078
> [ 1.335084] pcpu-alloc: [0] 080 082 084 086 088 090 092 094
> [ 1.341387] pcpu-alloc: [0] 096 098 100 102 104 106 108 110
> [ 1.347688] pcpu-alloc: [0] 112 114 116 118 120 122 124 126
> [ 1.353992] pcpu-alloc: [0] 128 130 132 134 136 138 140 142
> [ 1.360293] pcpu-alloc: [0] 144 146 148 150 152 154 156 158
> [ 1.366596] pcpu-alloc: [1] 010 011 012 013 014 015 016 017
> [ 1.372900] pcpu-alloc: [1] 018 019 030 031 032 033 034 035
> [ 1.379201] pcpu-alloc: [1] 036 037 038 039 041 043 045 047
> [ 1.385504] pcpu-alloc: [1] 049 051 053 055 057 059 061 063
> [ 1.391806] pcpu-alloc: [1] 065 067 069 071 073 075 077 079
> [ 1.398109] pcpu-alloc: [1] 081 083 085 087 089 091 093 095
> [ 1.404411] pcpu-alloc: [1] 097 099 101 103 105 107 109 111
> [ 1.410714] pcpu-alloc: [1] 113 115 117 119 121 123 125 127
> [ 1.417016] pcpu-alloc: [1] 129 131 133 135 137 139 141 143
> [ 1.423319] pcpu-alloc: [1] 145 147 149 151 153 155 157 159
>
> [ 2.110382] smp: Bringing up secondary CPUs ...
> [ 2.112255] x86: Booting SMP configuration:
> [ 2.113253] .... node #0, CPUs: #1 #2 #3 #4 #5 #6 #7 #8 #9
> [ 2.221253] .... node #1, CPUs: #10
> [ 0.163522] smpboot: CPU 10 Converting physical 0 to logical die 1
> [ 2.337372] #11 #12 #13 #14 #15 #16 #17 #18 #19
> [ 2.504253] .... node #0, CPUs: #20 #21 #22 #23 #24 #25 #26 #27 #28 #29
> [ 2.563253] .... node #1, CPUs: #30 #31 #32 #33 #34 #35 #36 #37 #38 #39
> [ 2.662321] smp: Brought up 2 nodes, 40 CPUs
> [ 2.664257] smpboot: Max logical packages: 8
>
> It is an IVB-EP with *2* sockets, 10 cores and SMT, 40 is right, 160 is
> quite insane.
According to the MADT, there are indeed 40 valid CPUs, followed by 80
CPU entries with
    APIC ID        : FF
    enabled        : 0
    Online capable : 0
A dumb question: why are these CPUs added into the possible_mask?
I can dig into this later, but I don't have a quick answer at the
moment.
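To make the question concrete, here is a minimal sketch of how MADT Local
APIC entries could be bucketed by their flags. It assumes the ACPI 6.3
meaning of the flag bits (bit 0 = Enabled, bit 1 = Online Capable). The
LapicEntry type, the function names, and the APIC IDs of the 40 enabled
entries are made up for illustration. This is not the kernel's parsing
code, only the expectation behind the question: entries with both bits
clear would not count as possible CPUs.

```python
from dataclasses import dataclass

# Local APIC flag bits, assuming the ACPI 6.3 definitions.
MADT_ENABLED = 1 << 0         # processor is ready for use
MADT_ONLINE_CAPABLE = 1 << 1  # disabled now, but may be enabled at runtime


@dataclass
class LapicEntry:
    """Hypothetical, simplified view of one MADT Local APIC entry."""
    apic_id: int
    flags: int


def classify(entry):
    """Bucket one entry as 'present', 'hotpluggable' or 'unusable'."""
    if entry.flags & MADT_ENABLED:
        return "present"
    if entry.flags & MADT_ONLINE_CAPABLE:
        return "hotpluggable"
    return "unusable"


def count_entries(entries):
    """Count how many entries fall into each bucket."""
    counts = {"present": 0, "hotpluggable": 0, "unusable": 0}
    for entry in entries:
        counts[classify(entry)] += 1
    return counts


# Illustrative table shaped like the one described above: 40 enabled
# entries (APIC IDs invented here), then 80 entries with APIC ID 0xFF
# and both flag bits clear.
entries = [LapicEntry(apic_id=i, flags=MADT_ENABLED) for i in range(40)]
entries += [LapicEntry(apic_id=0xFF, flags=0) for _ in range(80)]
counts = count_entries(entries)
print(counts)
```

Under these assumptions only the "present" and "hotpluggable" buckets
would contribute to the possible CPU count, which is why the 160 above
looks wrong.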
thanks,
rui