TOMOYO Linux Cross Reference
Linux/Documentation/arch/x86/xstate.rst

Using XSTATE features in user space applications
================================================

The x86 architecture supports floating-point extensions which are
enumerated via CPUID. Applications consult CPUID and use XGETBV to
evaluate which features the kernel has enabled in XCR0.
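
As a rough illustration of the CPUID and XGETBV check described above, the
sketch below reads XCR0 and tests individual component bits. The helper names
xcr0_has() and read_xcr0() are made up for this example, and the inline
assembly assumes a GCC- or Clang-compatible compiler. On non-x86 builds
read_xcr0() simply reports failure.

```c
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* Return 1 if XSTATE component `nr` is set in an XCR0 value, else 0. */
static int xcr0_has(uint64_t xcr0, unsigned int nr)
{
    return nr < 64 && ((xcr0 >> nr) & 1);
}

/* Read XCR0 via XGETBV. Returns 0 on success, -1 if XGETBV is unusable. */
static int read_xcr0(uint64_t *xcr0)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;

    /* CPUID.1:ECX bit 27 (OSXSAVE) tells user space that XGETBV works. */
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx) || !(ecx & (1u << 27)))
        return -1;

    /* XGETBV with ECX = 0 returns XCR0 in EDX:EAX. */
    __asm__ volatile("xgetbv" : "=a"(eax), "=d"(edx) : "c"(0));
    *xcr0 = ((uint64_t)edx << 32) | eax;
    return 0;
#else
    (void)xcr0;
    return -1;      /* not an x86 build */
#endif
}
```

A caller could, for instance, use xcr0_has(xcr0, 18) to see whether TILE_DATA
is enabled in XCR0 before going on to the permission request described later
in this document.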

Up to AVX-512 and PKRU states, these features are automatically enabled by
the kernel if available. Features like AMX TILE_DATA (XSTATE component 18)
are enabled by XCR0 as well, but the first use of a related instruction is
trapped by the kernel because by default the required large XSTATE buffers
are not allocated automatically.

The purpose for dynamic features
--------------------------------

Legacy userspace libraries often have hard-coded, static sizes for
alternate signal stacks, often using MINSIGSTKSZ which is typically 2KB.
That stack must be able to store at *least* the signal frame that the
kernel sets up before jumping into the signal handler. That signal frame
must include an XSAVE buffer defined by the CPU.

However, that means that the size of signal stacks is dynamic, not static,
because different CPUs have differently-sized XSAVE buffers. A compiled-in
size of 2KB in existing applications is too small for new CPU features
like AMX. Instead of universally requiring a larger stack, dynamic
enabling lets the kernel enforce that userspace applications have
properly-sized altstacks.

Using dynamically enabled XSTATE features in user space applications
--------------------------------------------------------------------

The kernel provides an arch_prctl(2) based mechanism for applications to
request the usage of such features. The arch_prctl(2) options related to
this are:

-ARCH_GET_XCOMP_SUPP

 arch_prctl(ARCH_GET_XCOMP_SUPP, &features);

 ARCH_GET_XCOMP_SUPP stores the supported features in userspace storage of
 type uint64_t. The second argument is a pointer to that storage.

-ARCH_GET_XCOMP_PERM

 arch_prctl(ARCH_GET_XCOMP_PERM, &features);

 ARCH_GET_XCOMP_PERM stores the features for which the userspace process
 has permission in userspace storage of type uint64_t. The second argument
 is a pointer to that storage.

-ARCH_REQ_XCOMP_PERM

 arch_prctl(ARCH_REQ_XCOMP_PERM, feature_nr);

 ARCH_REQ_XCOMP_PERM requests permission for a dynamically enabled
 feature or a feature set. A feature set can be mapped to a facility, e.g.
 AMX, and can require one or more XSTATE components to be enabled.

 The feature argument is the number of the highest XSTATE component which
 is required for a facility to work.
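
The two query options can be combined to find features that are supported but
that the process has not yet been permitted to use. What follows is a minimal
sketch, assuming an x86-64 Linux build. xcomp_get(), xcomp_not_permitted() and
xcomp_pending() are hypothetical helper names, and the option values are
defined locally so that the fragment stands alone.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef ARCH_GET_XCOMP_SUPP
#define ARCH_GET_XCOMP_SUPP  0x1021
#endif
#ifndef ARCH_GET_XCOMP_PERM
#define ARCH_GET_XCOMP_PERM  0x1022
#endif

/* Run one of the ARCH_GET_XCOMP_* queries. Returns 0 on success, -1 on error. */
static int xcomp_get(int option, uint64_t *mask)
{
#ifdef SYS_arch_prctl
    return syscall(SYS_arch_prctl, option, mask) == 0 ? 0 : -1;
#else
    (void)option; (void)mask;
    errno = ENOSYS;             /* not an x86 Linux build */
    return -1;
#endif
}

/* Pure helper: bits set in `supp` that are still missing from `perm`. */
static uint64_t xcomp_not_permitted(uint64_t supp, uint64_t perm)
{
    return supp & ~perm;
}

/* Store the supported-but-not-permitted feature mask in *pending. */
static int xcomp_pending(uint64_t *pending)
{
    uint64_t supp = 0, perm = 0;

    if (xcomp_get(ARCH_GET_XCOMP_SUPP, &supp) ||
        xcomp_get(ARCH_GET_XCOMP_PERM, &perm))
        return -1;

    *pending = xcomp_not_permitted(supp, perm);
    return 0;
}
```

If bit 18 is set in the resulting mask, the process would still need an
ARCH_REQ_XCOMP_PERM call before using TILE_DATA. On kernels that lack these
options the queries fail and xcomp_pending() returns -1.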

When requesting permission for a feature, the kernel checks the
availability. The kernel ensures that sigaltstacks in the process's tasks
are large enough to accommodate the resulting large signal frame. It
enforces this both during ARCH_REQ_XCOMP_PERM and during any subsequent
sigaltstack(2) calls. If an installed sigaltstack is smaller than the
resulting sigframe size, ARCH_REQ_XCOMP_PERM results in -ENOSUPP. Also,
sigaltstack(2) results in -ENOMEM if the requested altstack is too small
for the permitted features.

Permission, when granted, is valid per process. Permissions are inherited
on fork(2) and cleared on exec(3).
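
The inheritance rule can be observed from user space by comparing the
permission mask in a parent with the one seen by a forked child. The sketch
below is only an illustration under the assumption of an x86-64 Linux build.
xcomp_perm() and perm_inherited_on_fork() are hypothetical names.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef ARCH_GET_XCOMP_PERM
#define ARCH_GET_XCOMP_PERM  0x1022
#endif

/* Fetch the permitted-feature mask. Returns 0 on success, -1 on error. */
static int xcomp_perm(uint64_t *mask)
{
#ifdef SYS_arch_prctl
    return syscall(SYS_arch_prctl, ARCH_GET_XCOMP_PERM, mask) == 0 ? 0 : -1;
#else
    (void)mask;
    errno = ENOSYS;             /* not an x86 Linux build */
    return -1;
#endif
}

/*
 * Returns 1 if a forked child sees the same permission mask as its parent,
 * 0 if the masks differ, and -1 if the query or the fork itself failed.
 */
static int perm_inherited_on_fork(void)
{
    uint64_t parent = 0, child = 0;
    int status;
    pid_t pid;

    if (xcomp_perm(&parent))
        return -1;

    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: report a match through the exit status. */
        _exit(xcomp_perm(&child) == 0 && child == parent ? 0 : 1);
    }

    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) == 0;
}
```

The exec(3) half of the rule is not exercised here. Checking it would need a
second program that queries the mask after being executed.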

The first use of an instruction related to a dynamically enabled feature is
trapped by the kernel. The trap handler checks whether the process has
permission to use the feature. If the process has no permission then the
kernel sends SIGILL to the application. If the process has permission then
the handler allocates a larger xstate buffer for the task so the large
state can be context switched. In the unlikely case that the allocation
fails, the kernel sends SIGSEGV.

AMX TILE_DATA enabling example
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The following example shows how a userspace application enables
TILE_DATA dynamically:

  1. The application first needs to query the kernel for AMX
     support::

        #include <asm/prctl.h>
        #include <sys/syscall.h>
        #include <stdio.h>
        #include <unistd.h>

        #ifndef ARCH_GET_XCOMP_SUPP
        #define ARCH_GET_XCOMP_SUPP  0x1021
        #endif

        #ifndef ARCH_XCOMP_TILECFG
        #define ARCH_XCOMP_TILECFG   17
        #endif

        #ifndef ARCH_XCOMP_TILEDATA
        #define ARCH_XCOMP_TILEDATA  18
        #endif

        #define MASK_XCOMP_TILE      ((1 << ARCH_XCOMP_TILECFG) | \
                                      (1 << ARCH_XCOMP_TILEDATA))

        unsigned long features;
        long rc;

        ...

        rc = syscall(SYS_arch_prctl, ARCH_GET_XCOMP_SUPP, &features);

        if (!rc && (features & MASK_XCOMP_TILE) == MASK_XCOMP_TILE)
            printf("AMX is available.\n");

  2. After determining support for AMX, an application must
     explicitly ask permission to use it::

        #ifndef ARCH_REQ_XCOMP_PERM
        #define ARCH_REQ_XCOMP_PERM  0x1023
        #endif

        ...

        rc = syscall(SYS_arch_prctl, ARCH_REQ_XCOMP_PERM, ARCH_XCOMP_TILEDATA);

        if (!rc)
            printf("AMX is ready for use.\n");

Note that this example does not include the sigaltstack preparation.
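
One possible shape for that preparation is sketched below. It sizes the
alternate stack at run time instead of relying on a compiled-in constant. The
auxiliary vector entry AT_MINSIGSTKSZ is not described in this document. The
sketch assumes that the kernel reports the required signal frame size through
it, and it falls back to MINSIGSTKSZ when the entry is absent. altstack_size()
and install_altstack() are hypothetical helper names.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/auxv.h>

#ifndef AT_MINSIGSTKSZ
#define AT_MINSIGSTKSZ 51       /* Linux auxiliary vector entry number */
#endif

/* Minimum frame size reported by the kernel, plus room for the handler. */
static size_t altstack_size(size_t headroom)
{
    unsigned long min = getauxval(AT_MINSIGSTKSZ);  /* 0 if not provided */

    if (min < (unsigned long)MINSIGSTKSZ)
        min = (unsigned long)MINSIGSTKSZ;
    return (size_t)min + headroom;
}

/* Allocate and install an alternate signal stack. Returns 0 or -1. */
static int install_altstack(size_t headroom)
{
    stack_t ss = { 0 };

    ss.ss_size = altstack_size(headroom);
    ss.ss_sp = malloc(ss.ss_size);
    if (!ss.ss_sp)
        return -1;
    ss.ss_flags = 0;

    if (sigaltstack(&ss, NULL) != 0) {
        free(ss.ss_sp);
        return -1;
    }
    return 0;
}
```

In this arrangement an application would call install_altstack() before the
ARCH_REQ_XCOMP_PERM request, so that the kernel's size check sees a stack that
is already large enough. The headroom value is a guess that depends on what
the signal handler itself needs.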

Dynamic features in signal frames
---------------------------------

Dynamically enabled features are not written to the signal frame upon signal
entry if the feature is in its initial configuration.  This differs from
non-dynamic features which are always written regardless of their
configuration.  Signal handlers can examine the XSAVE buffer's XSTATE_BV
field to determine if a feature was written.
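
A handler-side check might look like the following sketch. It assumes the
x86-64 signal frame layout, where the legacy FXSAVE area is 512 bytes long,
its software-reserved bytes start at offset 464 and carry a magic value
(FP_XSTATE_MAGIC1 in asm/sigcontext.h) when extended state is present, and the
XSAVE header with XSTATE_BV follows at offset 512. The function names and the
locally defined offsets are illustrative only.

```c
#include <stdint.h>
#include <string.h>

#define FXSAVE_SW_OFFSET  464           /* software-reserved bytes */
#define XSAVE_HDR_OFFSET  512           /* XSAVE header, XSTATE_BV first */
#define XSTATE_MAGIC1     0x46505853u   /* same value as FP_XSTATE_MAGIC1 */

/*
 * Copy XSTATE_BV out of a signal frame's FP state area.
 * Returns 0 on success, -1 if the frame carries no XSAVE extended state.
 */
static int sigframe_xstate_bv(const void *fpstate, uint64_t *bv)
{
    const unsigned char *p = fpstate;
    uint32_t magic;

    if (!p)
        return -1;

    memcpy(&magic, p + FXSAVE_SW_OFFSET, sizeof(magic));
    if (magic != XSTATE_MAGIC1)
        return -1;

    memcpy(bv, p + XSAVE_HDR_OFFSET, sizeof(*bv));
    return 0;
}

/* Returns 1 if component `nr` was written to the frame, 0 if not, -1 on error. */
static int sigframe_component_written(const void *fpstate, unsigned int nr)
{
    uint64_t bv;

    if (nr >= 64 || sigframe_xstate_bv(fpstate, &bv))
        return -1;
    return (int)((bv >> nr) & 1);
}
```

In an SA_SIGINFO handler on x86-64 with glibc, the pointer to pass would be
the fpregs member of uc_mcontext in the ucontext_t argument. A result of 0 for
component 18 then means that TILE_DATA was in its initial configuration when
the signal arrived.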

Dynamic features for virtual machines
-------------------------------------

The permission for the guest state component needs to be managed separately
from the host, as they are exclusive to each other. A couple of options
are extended to control the guest permission:

-ARCH_GET_XCOMP_GUEST_PERM

 arch_prctl(ARCH_GET_XCOMP_GUEST_PERM, &features);

 ARCH_GET_XCOMP_GUEST_PERM is a variant of ARCH_GET_XCOMP_PERM. So it
 provides the same semantics and functionality but for the guest
 components.

-ARCH_REQ_XCOMP_GUEST_PERM

 arch_prctl(ARCH_REQ_XCOMP_GUEST_PERM, feature_nr);

 ARCH_REQ_XCOMP_GUEST_PERM is a variant of ARCH_REQ_XCOMP_PERM. It has the
 same semantics for the guest permission. While providing a similar
 functionality, this comes with a constraint. Permission is frozen when the
 first VCPU is created. Any attempt to change permission after that point
 is going to be rejected. So, the permission has to be requested before the
 first VCPU creation.
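
The ordering constraint suggests that a VMM should request and verify the
guest permission early in its setup path. A minimal sketch, assuming an x86-64
Linux build, is shown below. guest_prctl() and vmm_enable_guest_amx() are
hypothetical names that do not correspond to any particular VMM, and the
option values are defined locally so that the fragment stands alone.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef ARCH_GET_XCOMP_GUEST_PERM
#define ARCH_GET_XCOMP_GUEST_PERM  0x1024
#endif
#ifndef ARCH_REQ_XCOMP_GUEST_PERM
#define ARCH_REQ_XCOMP_GUEST_PERM  0x1025
#endif
#ifndef ARCH_XCOMP_TILEDATA
#define ARCH_XCOMP_TILEDATA        18
#endif

/* Thin wrapper. Returns 0 on success or a negative errno value. */
static int guest_prctl(int option, unsigned long arg)
{
#ifdef SYS_arch_prctl
    return syscall(SYS_arch_prctl, option, arg) == 0 ? 0 : -errno;
#else
    (void)option; (void)arg;
    return -ENOSYS;             /* not an x86 Linux build */
#endif
}

/*
 * Request guest permission for AMX tile data and confirm it was granted.
 * Meant to run before the first VCPU is created, because the permission
 * is frozen from that point on.
 */
static int vmm_enable_guest_amx(void)
{
    uint64_t perm = 0;
    int ret;

    ret = guest_prctl(ARCH_REQ_XCOMP_GUEST_PERM, ARCH_XCOMP_TILEDATA);
    if (ret)
        return ret;

    ret = guest_prctl(ARCH_GET_XCOMP_GUEST_PERM, (unsigned long)&perm);
    if (ret)
        return ret;

    /* Treat a missing TILE_DATA bit as a refusal. */
    return ((perm >> ARCH_XCOMP_TILEDATA) & 1) ? 0 : -EPERM;
}
```

A return value of 0 indicates that guest TILE_DATA permission is in place. Any
negative value means the VMM should create its VCPUs without exposing AMX.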

Note that some VMMs may have already established a set of supported state
components. These options are not presumed to support any particular VMM.
