Re: [v6 PATCH 06/21] x86/insn-eval: Add utility functions to get segment selector

From: Ricardo Neri
Date: Wed Apr 26 2017 - 16:47:36 EST


On Wed, 2017-04-26 at 13:44 -0700, Ricardo Neri wrote:
> >
> > > + */
> > > + for (i = 0; i < insn->prefixes.nbytes; i++) {
> > > + switch (insn->prefixes.bytes[i]) {
> > > + case SEG_CS:
> > > + return SEG_CS;
> > > + case SEG_SS:
> > > + return SEG_SS;
> > > + case SEG_DS:
> > > + return SEG_DS;
> > > + case SEG_ES:
> > > + return SEG_ES;
> > > + case SEG_FS:
> > > + return SEG_FS;
> > > + case SEG_GS:
> > > + return SEG_GS;
> >
> > So what happens if you're in 64-bit mode and you have CS, DS, ES, or SS?
> > Or is this what @get_default is supposed to do? But it doesn't look like
> > it, it still returns segments ignored in 64-bit mode.
>
> I see the role of this function as obtaining the segment selector,
> either from the prefixes or inferred from the operands. It is the
> caller's role to determine whether the segment selector should be
> ignored. So far the only caller is insn_get_seg_base() [1]. In long
> mode, it regards the segment base address as 0 unless the segment
> selector is FS or GS.
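To make the caller-side rule concrete, here is a minimal user-space sketch of it. This is not the code from [1]; the enum, the function name and the descriptor_base parameter are all made up for illustration.

```c
#include <stdbool.h>

/* Hypothetical selector identifiers; these are not the patch's SEG_* values. */
enum sketch_seg {
	SKETCH_SEG_CS,
	SKETCH_SEG_SS,
	SKETCH_SEG_DS,
	SKETCH_SEG_ES,
	SKETCH_SEG_FS,
	SKETCH_SEG_GS,
};

/*
 * Return the base address the caller should use.
 * descriptor_base is an assumed stand-in for whatever base the caller
 * looked up for the selector; how it is obtained is out of scope here.
 */
unsigned long sketch_seg_base(enum sketch_seg seg, bool in_64bit_mode,
			      unsigned long descriptor_base)
{
	/* In 64-bit mode only FS and GS keep a meaningful base. */
	if (in_64bit_mode && seg != SKETCH_SEG_FS && seg != SKETCH_SEG_GS)
		return 0;

	return descriptor_base;
}
```

With this split, the selector lookup can keep returning CS, DS, ES or SS from an override prefix, and the decision to ignore them stays in one place.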
> >
> > > + default:
> > > + return -EINVAL;
> > > + }
> > > + }
> > > +
> > > +default_seg:
> > > + /*
> > > + * If no overrides, use default selectors as described in the
> > > + * Intel documentation: SS for ESP or EBP. DS for all data references,
> > > + * except when relative to stack or string destination.
> > > + * Also, AX, CX and DX are not valid register operands in 16-bit
> > > + * address encodings.
> > > + * Callers must interpret the result correctly according to the type
> > > + * of instructions (e.g., use ES for string instructions).
> > > + * Also, some values of modrm and sib might seem to indicate the use
> > > + * of EBP and ESP (e.g., modrm_mod = 0, modrm_rm = 5) but actually
> > > + * they refer to cases in which only a displacement is used. These cases
> > > + * should be identified by the caller and not with this function.
> > > + */
> > > + switch (regoff) {
> > > + case offsetof(struct pt_regs, ax):
> > > + /* fall through */
> > > + case offsetof(struct pt_regs, cx):
> > > + /* fall through */
> > > + case offsetof(struct pt_regs, dx):
> > > + if (insn && insn->addr_bytes == 2)
> > > + return -EINVAL;
> > > + case -EDOM: /* no register involved in address computation */
> > > + case offsetof(struct pt_regs, bx):
> > > + /* fall through */
> > > + case offsetof(struct pt_regs, di):
> > > + /* fall through */
> >
> > return SEG_ES;
> >
> > ?
>
> I double-checked the latest version of the Intel Software Development
> Manual [2]. Table 3-5 in section 3.7.4 states that DS is the default
> segment for all data references, except string destinations. I tested
> this code with the UMIP-protected instructions, and whenever I use
> %edi the default segment is %ds.
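
For illustration only, the default-segment rule for 32-bit address encodings can be sketched in user space as below. It is not the patch code: the names are hypothetical, it works on raw ModRM/SIB bytes rather than pt_regs offsets, and it leaves out string instructions and 16-bit or 64-bit encodings.

```c
/* Hypothetical result codes for this sketch. */
enum sketch_default_seg {
	SKETCH_NO_MEMORY_OPERAND = -1,
	SKETCH_DEFAULT_DS = 0,
	SKETCH_DEFAULT_SS = 1,
};

/*
 * Default segment for a 32-bit address encoding, with no override prefix.
 * sib is only looked at when ModRM.rm == 4 (a SIB byte follows).
 */
enum sketch_default_seg sketch_default_seg32(unsigned char modrm,
					     unsigned char sib)
{
	unsigned int mod = (modrm >> 6) & 0x3;
	unsigned int rm = modrm & 0x7;
	unsigned int base = rm;

	/* mod == 3 encodes a register operand, so there is no segment. */
	if (mod == 3)
		return SKETCH_NO_MEMORY_OPERAND;

	/* rm == 4 means the base register comes from the SIB byte. */
	if (rm == 4)
		base = sib & 0x7;

	/* mod == 0 with base 5 is displacement-only, not EBP. */
	if (mod == 0 && base == 5)
		return SKETCH_DEFAULT_DS;

	/* ESP (4, reachable only via SIB) and EBP (5) default to SS. */
	if (base == 4 || base == 5)
		return SKETCH_DEFAULT_SS;

	/* Everything else, including EDI as a base, defaults to DS. */
	return SKETCH_DEFAULT_DS;
}
```

For example, ModRM 0x07 (mod = 0, rm = 7, i.e. (%edi)) gives DS, while ModRM 0x45 (mod = 1, rm = 5, i.e. disp8(%ebp)) gives SS.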


I forgot my references:

[1]. https://lkml.org/lkml/2017/3/7/876
[2]. https://software.intel.com/en-us/articles/intel-sdm#combined