Linux/Documentation/arch/arm64/sve.rst

  1 ==============================================    
  2 Scalable Vector Extension support for AArch64     
  3 ==============================================    
  4                                                   
  5 Author: Dave Martin <Dave.Martin@arm.com>          
  6                                                   
  7 Date:   4 August 2017                             
  8                                                   
  9 This document outlines briefly the interface p    
 10 order to support use of the ARM Scalable Vecto    
 11 interactions with Streaming SVE mode added by     
 12 (SME).                                            
 13                                                   
 14 This is an outline of the most important featu    
 15 intended to be exhaustive.                        
 16                                                   
 17 This document does not aim to describe the SVE    
 18 model.  To aid understanding, a minimal descri    
 19 model features for SVE is included in Appendix    
 20                                                   
 21                                                   
 22 1.  General                                       
 23 -----------                                       
 24                                                   
 25 * SVE registers Z0..Z31, P0..P15 and FFR and t    
 26   tracked per-thread.                             
 27                                                   
 28 * In streaming mode FFR is not accessible unle    
 29   in the system, when it is not supported and     
 30   access streaming mode FFR is read and writte    
 31                                                   
 32 * The presence of SVE is reported to userspace    
 33   AT_HWCAP entry.  Presence of this flag impli    
 34   instructions and registers, and the Linux-sp    
 35   described in this document.  SVE is reported    
 36                                                   
 37 * Support for the execution of SVE instruction    
 38   detected by reading the CPU ID register ID_A    
 39   instruction, and checking that the value of     
 40                                                   
 41   It does not guarantee the presence of the sy    
 42   following sections: software that needs to v    
 43   present must check for HWCAP_SVE instead.       
 44                                                   
 45 * On hardware that supports the SVE2 extension    
 46   be reported in the AT_HWCAP2 aux vector entr    
 47   optional extensions to SVE2 may be reported     
 48                                                   
 49         HWCAP2_SVE2                               
 50         HWCAP2_SVEAES                             
 51         HWCAP2_SVEPMULL                           
 52         HWCAP2_SVEBITPERM                         
 53         HWCAP2_SVESHA3                            
 54         HWCAP2_SVESM4                             
 55         HWCAP2_SVE2P1                             
 56                                                   
 57   This list may be extended over time as the S    
 58                                                   
 59   These extensions are also reported via the C    
 60   which userspace can read using an MRS instru    
 61   cpu-feature-registers.rst for details.
 62                                                   
 63 * On hardware that supports the SME extensions    
 64   reported in the AT_HWCAP2 aux vector entry.     
 65   streaming mode which provides a subset of th    
 66   separate SME vector length and the same Z/V     
 67   for more details.                               
 68                                                   
 69 * Debuggers should restrict themselves to inte    
 70   NT_ARM_SVE regset.  The recommended way of d    
 71   is to connect to a target process first and     
 72   ptrace(PTRACE_GETREGSET, pid, NT_ARM_SVE, &i    
 73   present and streaming SVE mode is in use the    
 74   will be read via NT_ARM_SVE and NT_ARM_SVE w    
 75   in the target.                                  
 76                                                   
 77 * Whenever SVE scalable register values (Zn, P    
 78   between userspace and the kernel, the regist    
 79   an endianness-invariant layout, with bits [(    
 80   byte offset i from the start of the memory r    
 81   example the signal frame (struct sve_context    
 82   (struct user_sve_header) and associated data    
 83                                                   
 84   Beware that on big-endian systems this resul    
 85   for the FPSIMD V-registers, which are stored    
 86   values, with bits [(127 - 8 * i) : (120 - 8     
 87   byte offset i.  (struct fpsimd_context, stru    
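
As a rough illustration of the two layouts described above, the following C sketch computes which byte of a 128-bit value lands at memory offset i under each convention.  The function names are hypothetical and are not part of any kernel header; the value is passed as two 64-bit halves purely for convenience, and i is assumed to be in the range 0..15.

```c
#include <stdint.h>

/*
 * Illustrative helpers only: these names are assumptions made for this
 * sketch and do not exist in the kernel uapi headers.
 *
 * A 128-bit value is passed as two halves: lo = bits [63:0],
 * hi = bits [127:64].
 */

/* Extract bits [(8 * n + 7) : (8 * n)] of the 128-bit value (n in 0..15). */
static inline uint8_t byte_of_u128(uint64_t lo, uint64_t hi, unsigned int n)
{
	uint64_t half = (n < 8) ? lo : hi;

	return (uint8_t)(half >> (8 * (n % 8)));
}

/*
 * Endianness-invariant SVE layout: memory offset i holds
 * bits [(8 * i + 7) : (8 * i)], whatever the host byte order.
 */
static inline uint8_t sve_layout_byte(uint64_t lo, uint64_t hi, unsigned int i)
{
	return byte_of_u128(lo, hi, i);
}

/*
 * FPSIMD V-register layout on a big-endian system: memory offset i holds
 * bits [(127 - 8 * i) : (120 - 8 * i)], so the bytes appear reversed.
 */
static inline uint8_t fpsimd_be_layout_byte(uint64_t lo, uint64_t hi,
					    unsigned int i)
{
	return byte_of_u128(lo, hi, 15 - i);
}
```

On a little-endian system the two layouts coincide for bits [127:0], which is why the difference only matters to big-endian code that copies data between the FPSIMD and SVE views.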
 88                                                   
 89                                                   
 90 2.  Vector length terminology                     
 91 -----------------------------                     
 92                                                   
 93 The size of an SVE vector (Z) register is refe    
 94                                                   
 95 To avoid confusion about the units used to exp    
 96 adopts the following conventions:                 
 97                                                   
 98 * Vector length (VL) = size of a Z-register in    
 99                                                   
100 * Vector quadwords (VQ) = size of a Z-register    
101                                                   
102 (So, VL = 16 * VQ.)                               
103                                                   
104 The VQ convention is used where the underlying    
105 as in data structure definitions.  In most oth    
106 is used.  This is consistent with the meaning     
107 the SVE instruction set architecture.             
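
The relationship VL = 16 * VQ can be sketched in C as below.  The helper names are hypothetical stand-ins chosen for this example; real code would normally use the conversion macros provided by the uapi headers instead.

```c
#include <stdbool.h>

/* One quadword is 128 bits, i.e. 16 bytes. */
#define SKETCH_VQ_BYTES 16u

/* Hypothetical helper: vector length in bytes -> number of quadwords. */
static inline unsigned int sketch_vq_from_vl(unsigned int vl)
{
	return vl / SKETCH_VQ_BYTES;
}

/* Hypothetical helper: number of quadwords -> vector length in bytes. */
static inline unsigned int sketch_vl_from_vq(unsigned int vq)
{
	return vq * SKETCH_VQ_BYTES;
}

/*
 * A vector length is only meaningful if it is a non-zero multiple of 16
 * bytes.  This checks the arithmetic form only, not whether the system
 * actually supports the length.
 */
static inline bool sketch_vl_is_well_formed(unsigned int vl)
{
	return vl >= SKETCH_VQ_BYTES && vl % SKETCH_VQ_BYTES == 0;
}
```

For example, a 512-bit implementation has VL = 64 and VQ = 4.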
108                                                   
109                                                   
110 3.  System call behaviour                         
111 -------------------------                         
112                                                   
113 * On syscall, V0..V31 are preserved (as withou    
114   Z0..Z31 are preserved.  All other bits of Z0    
115   become zero on return from a syscall.           
116                                                   
117 * The SVE registers are not used to pass argum    
118   any syscall.                                    
119                                                   
120 * All other SVE state of a thread, including t    
121   length, the state of the PR_SVE_VL_INHERIT f    
122   length (if any), is preserved across all sys    
123   exceptions for execve() described in section    
124                                                   
125   In particular, on return from a fork() or cl    
126   process or thread share identical SVE config    
127   parent before the call.                         
128                                                   
129                                                   
130 4.  Signal handling                               
131 -------------------                               
132                                                   
133 * A new signal frame record sve_context encode    
134   delivery. [1]                                   
135                                                   
136 * This record is supplementary to fpsimd_conte    
137   are only present in fpsimd_context.  For con    
138   is duplicated between sve_context and fpsimd    
139                                                   
140 * The record contains a flag field which inclu    
141   if set indicates that the thread is in strea    
142   and register data (if present) describe the     
143   length.                                         
144                                                   
145 * The signal frame record for SVE always conta    
146   the thread's vector length (in sve_context.v    
147                                                   
148 * The SVE registers may or may not be included    
149   whether the registers are live for the threa    
150   and only if:                                    
151   sve_context.head.size >= SVE_SIG_CONTEXT_SIZ    
152                                                   
153 * If the registers are present, the remainder     
154   size and layout.  Macros SVE_SIG_* are defin    
155   the members.                                    
156                                                   
157 * Each scalable register (Zn, Pn, FFR) is stor    
158   layout, with bits [(8 * i + 7) : (8 * i)] st    
159   start of the register's representation in me    
160                                                   
161 * If the SVE context is too big to fit in sigc    
162   space is allocated on the stack, an extra_co    
163   __reserved[] referencing this space.  sve_co    
164   extra space.  Refer to [1] for further detai    
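
The presence test described in this section compares the record size against the size a full record would need for the thread's vector length.  The sketch below reproduces that arithmetic from the register sizes given in Appendix A.  It assumes a 16-byte sve_context header with the register data following immediately; all SKETCH_* and sketch_* names are hypothetical, and real code should use the SVE_SIG_* macros from [1] rather than hard-coded sizes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumptions for this sketch only; see the SVE_SIG_* macros in [1]. */
#define SKETCH_SVE_CONTEXT_HDR	16u	/* assumed sizeof(struct sve_context) */
#define SKETCH_NUM_ZREGS	32u	/* Z0..Z31 */
#define SKETCH_NUM_PREGS	16u	/* P0..P15 */

/* Bytes of register data for a given vq: Z0..Z31, P0..P15, then FFR. */
static inline uint32_t sketch_sve_regs_size(uint32_t vq)
{
	uint32_t zreg = 16u * vq;	/* VL bytes per Z-register */
	uint32_t preg = zreg / 8u;	/* one predicate bit per vector byte */

	return SKETCH_NUM_ZREGS * zreg + (SKETCH_NUM_PREGS + 1u) * preg;
}

/* Size of a complete record (header plus register data). */
static inline uint32_t sketch_sve_context_size(uint32_t vq)
{
	return SKETCH_SVE_CONTEXT_HDR + sketch_sve_regs_size(vq);
}

/* Register data is present only if the record is large enough to hold it. */
static inline bool sketch_sve_regs_present(uint32_t head_size, uint16_t vl)
{
	return head_size >= sketch_sve_context_size(vl / 16u);
}
```

A handler would pass sve_context.head.size and sve_context.vl from the record it located in the signal frame.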
165                                                   
166                                                   
167 5.  Signal return                                 
168 -----------------                                 
169                                                   
170 When returning from a signal handler:             
171                                                   
172 * If there is no sve_context record in the sig    
173   present but contains no register data as des    
174   then the SVE registers/bits become non-live     
175                                                   
176 * If sve_context is present in the signal fram    
177   data, the SVE registers become live and are     
178   data.  However, for backward compatibility r    
179   are always restored from the corresponding m    
180   and not from sve_context.  The remaining bit    
181                                                   
182 * Inclusion of fpsimd_context in the signal fr    
183   irrespective of whether sve_context is prese    
184                                                   
185 * The vector length cannot be changed via sign    
186   the signal frame does not match the current     
187   attempt is treated as illegal, resulting in     
188                                                   
189 * It is permitted to enter or leave streaming     
190   the SVE_SIG_FLAG_SM flag but applications sh    
191   when doing so sve_context.vl and any registe    
192   vector length in the new mode.                  
193                                                   
194                                                   
195 6.  prctl extensions                              
196 --------------------                              
197                                                   
198 Some new prctl() calls are added to allow prog    
199 length:                                           
200                                                   
201 prctl(PR_SVE_SET_VL, unsigned long arg)           
202                                                   
203     Sets the vector length of the calling thre    
204     arg == vl | flags.  Other threads of the c    
205                                                   
206     vl is the desired vector length, where sve    
207                                                   
208     flags:                                        
209                                                   
210         PR_SVE_VL_INHERIT                         
211                                                   
212             Inherit the current vector length     
213             vector length is reset to the syst    
214             Section 9.)                           
215                                                   
216         PR_SVE_SET_VL_ONEXEC                      
217                                                   
218             Defer the requested vector length     
219             performed by this thread.             
220                                                   
221             The effect is equivalent to implic    
222             call immediately after the next ex    
223                                                   
224                 prctl(PR_SVE_SET_VL, arg & ~PR    
225                                                   
226             This allows launching of a new pro    
227             length, while avoiding runtime sid    
228                                                   
229                                                   
230             Without PR_SVE_SET_VL_ONEXEC, the     
231             immediately.                          
232                                                   
233                                                   
234     Return value: a nonnegative value on success, or
235         EINVAL: SVE not supported, invalid vec    
236             invalid flags.                        
237                                                   
238                                                   
239     On success:                                   
240                                                   
241     * Either the calling thread's vector lengt    
242       to be applied at the next execve() by th    
243       PR_SVE_SET_VL_ONEXEC is present in arg),    
244       supported by the system that is less tha    
245       SVE_VL_MAX, the value set will be the la    
246       system.                                     
247                                                   
248     * Any previously outstanding deferred vect    
249       thread is cancelled.                        
250                                                   
251     * The returned value describes the resulti    
252       PR_SVE_GET_VL.  The vector length report    
253       current vector length for this thread if    
254       present in arg; otherwise, the reported     
255       vector length that will be applied at th    
256       thread.                                     
257                                                   
258     * Changing the vector length causes all of    
259       Z0..Z31 except for Z0 bits [127:0] .. Z3    
260       unspecified.  Calling PR_SVE_SET_VL with    
261       vector length, or calling PR_SVE_SET_VL     
262       flag, does not constitute a change to th    
263                                                   
264                                                   
265 prctl(PR_SVE_GET_VL)                              
266                                                   
267     Gets the vector length of the calling thre    
268                                                   
269     The following flag may be OR-ed into the r    
270                                                   
271         PR_SVE_VL_INHERIT                         
272                                                   
273             Vector length will be inherited ac    
274                                                   
275     There is no way to determine whether there    
276     vector length change (which would only nor    
277     fork() or vfork() and the corresponding ex    
278                                                   
279     To extract the vector length from the resu    
280     PR_SVE_VL_LEN_MASK.                           
281                                                   
282     Return value: a nonnegative value on succe    
283         EINVAL: SVE not supported.                
284                                                   
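
The argument and return-value encodings for PR_SVE_SET_VL and PR_SVE_GET_VL can be sketched as follows.  The PR_SVE_* constants are the ones from the uapi prctl header, with fallback definitions in case the installed headers predate them; the sketch_* helper names are hypothetical.  This is a minimal illustration and omits most error handling.

```c
#include <stdbool.h>
#ifdef __linux__
#include <sys/prctl.h>
#endif

/* Fallbacks for older headers; values as in include/uapi/linux/prctl.h. */
#ifndef PR_SVE_SET_VL
#define PR_SVE_SET_VL		50
#endif
#ifndef PR_SVE_GET_VL
#define PR_SVE_GET_VL		51
#endif
#ifndef PR_SVE_VL_LEN_MASK
#define PR_SVE_VL_LEN_MASK	0xffff
#endif
#ifndef PR_SVE_VL_INHERIT
#define PR_SVE_VL_INHERIT	(1 << 17)
#endif
#ifndef PR_SVE_SET_VL_ONEXEC
#define PR_SVE_SET_VL_ONEXEC	(1 << 18)
#endif

/* Build the PR_SVE_SET_VL argument: arg == vl | flags. */
static inline unsigned long sketch_set_vl_arg(unsigned int vl, bool inherit,
					      bool onexec)
{
	unsigned long arg = vl & PR_SVE_VL_LEN_MASK;

	if (inherit)
		arg |= PR_SVE_VL_INHERIT;
	if (onexec)
		arg |= PR_SVE_SET_VL_ONEXEC;
	return arg;
}

/* Extract the vector length (in bytes) from a successful return value. */
static inline unsigned int sketch_vl_from_ret(long ret)
{
	return (unsigned int)(ret & PR_SVE_VL_LEN_MASK);
}

/* Test whether the returned configuration has PR_SVE_VL_INHERIT set. */
static inline bool sketch_inherit_from_ret(long ret)
{
	return (ret & PR_SVE_VL_INHERIT) != 0;
}

/* Query the calling thread's configuration; negative means failure. */
static inline long sketch_get_vl(void)
{
#ifdef __linux__
	return prctl(PR_SVE_GET_VL);
#else
	return -1;
#endif
}
```

A caller that wants a 32-byte vector length for the next execve() only would pass sketch_set_vl_arg(32, false, true) to prctl(PR_SVE_SET_VL, ...) and then decode the result with sketch_vl_from_ret(), since the length actually granted may be smaller than the one requested.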
285                                                   
286 7.  ptrace extensions                             
287 ---------------------                             
288                                                   
289 * New regsets NT_ARM_SVE and NT_ARM_SSVE are d    
290   PTRACE_GETREGSET and PTRACE_SETREGSET. NT_AR    
291   streaming mode SVE registers and NT_ARM_SVE     
292   non-streaming mode SVE registers.               
293                                                   
294   In this description a register set is referr    
295   the target is in the appropriate streaming o    
296   using data beyond the subset shared with the    
297                                                   
298   Refer to [2] for definitions.                   
299                                                   
300 The regset data starts with struct user_sve_he    
301                                                   
302     size                                          
303                                                   
304         Size of the complete regset, in bytes.    
305         This depends on vl and possibly on oth    
306                                                   
307         If a call to PTRACE_GETREGSET requests    
308         size, the caller can allocate a larger    
309         read the complete regset.                 
310                                                   
311     max_size                                      
312                                                   
313         Maximum size in bytes that the regset     
314         thread.  The regset won't grow bigger     
315         thread changes its vector length etc.     
316                                                   
317     vl                                            
318                                                   
319         Target thread's current vector length,    
320                                                   
321     max_vl                                        
322                                                   
323         Maximum possible vector length for the    
324                                                   
325     flags                                         
326                                                   
327         at most one of                            
328                                                   
329             SVE_PT_REGS_FPSIMD                    
330                                                   
331                 SVE registers are not live (GE    
332                 non-live (SETREGSET).             
333                                                   
334                 The payload is of type struct     
335                 meaning as for NT_PRFPREG, sta    
336                 SVE_PT_FPSIMD_OFFSET from the     
337                                                   
338                 Extra data might be appended i    
339                 payload should be obtained usi    
340                                                   
341                 vq should be obtained using sv    
342                                                   
343                 or                                
344                                                   
345             SVE_PT_REGS_SVE                       
346                                                   
347                 SVE registers are live (GETREG    
348                 (SETREGSET).                      
349                                                   
350                 The payload contains the SVE r    
351                 SVE_PT_SVE_OFFSET from the sta    
352                 size SVE_PT_SVE_SIZE(vq, flags    
353                                                   
354         ... OR-ed with zero or more of the fol    
355         meaning and behaviour as the correspon    
356                                                   
357             SVE_PT_VL_INHERIT                     
358                                                   
359             SVE_PT_VL_ONEXEC (SETREGSET only).    
360                                                   
361         If neither FPSIMD nor SVE flags are pr    
362         payload is available, this is only pos    
363                                                   
364                                                   
365 * The effects of changing the vector length an    
366   those documented for PR_SVE_SET_VL.             
367                                                   
368   The caller must make a further GETREGSET cal    
369   actually set by SETREGSET, unless it is know
370   VL is supported.                                
371                                                   
372 * In the SVE_PT_REGS_SVE case, the size and la    
373   the header fields.  The SVE_PT_SVE_*() macro    
374   access to the members.                          
375                                                   
376 * In either case, for SETREGSET it is permissi    
377   case only the vector length and flags are ch    
378   consequences of those changes).                 
379                                                   
380 * In systems supporting SME when in streaming     
381   NT_ARM_SVE will return only the user_sve_hea
382   similarly a GETREGSET for NT_ARM_SSVE will n
383   when not in streaming mode.                     
384                                                   
385 * A GETREGSET for NT_ARM_SSVE will never retur    
386                                                   
387 * For SETREGSET, if an SVE_PT_REGS_SVE payload    
388   requested VL is not supported, the effect wi    
389   payload were omitted, except that an EIO err    
390   attempt is made to translate the payload dat    
391   for the vector length actually set.  The thr    
392   preserved, but the remaining bits of the SVE    
393   unspecified.  It is up to the caller to tran    
394   for the actual VL and retry.                    
395                                                   
396 * Where SME is implemented it is not possible     
397   state for normal SVE when in streaming mode,    
398   register state when in normal mode, regardle    
399   behaviour of the hardware for sharing data b    
400                                                   
401 * Any SETREGSET of NT_ARM_SVE will exit stream    
402   streaming mode and any SETREGSET of NT_ARM_S    
403   if the target was not in streaming mode.        
404                                                   
405 * The effect of writing a partial, incomplete     
406                                                   
407                                                   
408 8.  ELF coredump extensions                       
409 ---------------------------                       
410                                                   
411 * NT_ARM_SVE and NT_ARM_SSVE notes will be add    
412   each thread of the dumped process.  The cont    
413   data that would have been read if a PTRACE_G    
414   type were executed for each thread when the     
415                                                   
416 9.  System runtime configuration                  
417 --------------------------------                  
418                                                   
419 * To mitigate the ABI impact of expansion of t    
420   mechanism is provided for administrators, di    
421   to set the default vector length for userspa    
422                                                   
423 /proc/sys/abi/sve_default_vector_length           
424                                                   
425     Writing the text representation of an inte    
426     default vector length to the specified val    
427     using the same rules as for setting vector    
428                                                   
429     The result can be determined by reopening     
430     contents.                                     
431                                                   
432     At boot, the default vector length is init    
433     supported vector length, whichever is smal    
434     vector length of the init process (PID 1).    
435                                                   
436     Reading this file returns the current syst    
437                                                   
438 * At every execve() call, the new vector lengt    
439   the system default vector length, unless        
440                                                   
441     * PR_SVE_VL_INHERIT (or equivalently SVE_P    
442       calling thread, or                          
443                                                   
444     * a deferred vector length change is pendi    
445       PR_SVE_SET_VL_ONEXEC flag (or SVE_PT_VL_    
446                                                   
447 * Modifying the system default vector length d    
448   of any existing process or thread that does     
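
Reading the system default back after writing it, as suggested above, might look like the following sketch.  The helper names are hypothetical, and the parser assumes the file contains a single decimal integer optionally followed by a newline.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

#define SKETCH_DEFAULT_VL_PATH "/proc/sys/abi/sve_default_vector_length"

/* Parse the text form of a vector length; returns -1 on malformed input. */
static inline long sketch_parse_vl_text(const char *text)
{
	char *end;
	long val;

	errno = 0;
	val = strtol(text, &end, 10);
	if (errno != 0 || end == text || val < 0)
		return -1;
	if (*end != '\0' && *end != '\n')
		return -1;
	return val;
}

/* Read the default vector length from path; returns -1 on any failure. */
static inline long sketch_read_default_vl(const char *path)
{
	char buf[32];
	long val = -1;
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;	/* e.g. the kernel does not provide the file */
	if (fgets(buf, sizeof(buf), f))
		val = sketch_parse_vl_text(buf);
	fclose(f);
	return val;
}
```

Calling sketch_read_default_vl(SKETCH_DEFAULT_VL_PATH) after a write reveals the value the kernel actually stored, which may differ from the value written because of the constraining rules described above.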
449                                                   
450 10.  Perf extensions                              
451 --------------------
452                                                   
453 * The arm64 specific DWARF standard [5] added     
454   at index 46. This register is used for DWARF    
455   SVE registers are pushed onto the stack.        
456                                                   
457 * Its value is equivalent to the current SVE v    
458   by 64.                                          
459                                                   
460 * The value is included in Perf samples in the    
461   PERF_SAMPLE_REGS_USER is set and the sample_    
462                                                   
463 * The value is the current value at the time t    
464   change over time.                               
465                                                   
466 * If the system doesn't support SVE when perf_    
467   settings, the event will fail to open.          
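
The relationship between the sampled register value and the vector length can be sketched as below, assuming the register counts 64-bit granules (that is, the vector length in bits divided by 64).  The helper names are hypothetical.

```c
/* Vector length in bytes -> number of 64-bit granules. */
static inline unsigned int sketch_vg_from_vl(unsigned int vl_bytes)
{
	return (vl_bytes * 8u) / 64u;
}

/* Number of 64-bit granules -> vector length in bytes. */
static inline unsigned int sketch_vl_from_vg(unsigned int vg)
{
	return (vg * 64u) / 8u;
}
```

Under that assumption, a sample taken while the thread runs with a 32-byte vector length would carry the value 4.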
468                                                   
469 Appendix A.  SVE programmer's model (informati    
470 ==============================================    
471                                                   
472 This section provides a minimal description of    
473 ARMv8-A programmer's model that are relevant t    
474                                                   
475 Note: This section is for information only and    
476 to replace any architectural specification.       
477                                                   
478 A.1.  Registers                                   
479 ---------------                                   
480                                                   
481 In A64 state, SVE adds the following:             
482                                                   
483 * 32 8VL-bit vector registers Z0..Z31             
484   For each Zn, Zn bits [127:0] alias the ARMv8    
485                                                   
486   A register write using a Vn register name ze    
487   Zn except for bits [127:0].                     
488                                                   
489 * 16 VL-bit predicate registers P0..P15           
490                                                   
491 * 1 VL-bit special-purpose predicate register     
492                                                   
493 * a VL "pseudo-register" that determines the s    
494                                                   
495   The SVE instruction set architecture provide    
496   Instead, it can be modified only by EL1 and     
497   system registers.                               
498                                                   
499 * The value of VL can be configured at runtime    
500   16 <= VL <= VLmax, where VL must be a multip    
501                                                   
502 * The maximum vector length is determined by t    
503   16 <= VLmax <= 256.                             
504                                                   
505   (The SVE architecture specifies 256, but per    
506   revisions to raise this limit.)                 
507                                                   
508 * FPSR and FPCR are retained from ARMv8-A, and    
509   operations in a similar way to the way in wh    
510   floating-point operations::                     
511                                                   
512          8VL-1                       128          
513         +----          ////            -------    
514      Z0 |                               :         
515       :                                           
516      Z7 |                               :         
517      Z8 |                               :         
518       :                                           
519     Z15 |                               :         
520     Z16 |                               :         
521       :                                           
522     Z31 |                               :         
523         +----          ////            -------    
524                                                   
525          VL-1                  0                  
526         +----       ////      --+          FPS    
527      P0 |                       |                 
528       : |                       |         *FPC    
529     P15 |                       |                 
530         +----       ////      --+                 
531     FFR |                       |                 
532         +----       ////      --+            V    
533                                                   
534                                                   
535 (*) callee-save:                                  
536     This only applies to bits [63:0] of Z-/V-r    
537     FPCR contains callee-save and caller-save     
538                                                   
539                                                   
540 A.2.  Procedure call standard                     
541 -----------------------------                     
542                                                   
543 The ARMv8-A base procedure call standard is ex    
544 the additional SVE register state:                
545                                                   
546 * All SVE register bits that are not shared wi    
547                                                   
548 * Z8 bits [63:0] .. Z15 bits [63:0] are callee    
549                                                   
550   This follows from the way these bits are map    
551   save in the base procedure call standard.       
552                                                   
553                                                   
554 Appendix B.  ARMv8-A FP/SIMD programmer's mode    
555 ==============================================    
556                                                   
557 Note: This section is for information only and    
558 to replace any architectural specification.       
559                                                   
560 Refer to [4] for more information.                
561                                                   
562 ARMv8-A defines the following floating-point /    
563                                                   
564 * 32 128-bit vector registers V0..V31             
565 * 2 32-bit status/control registers FPSR, FPCR    
566                                                   
567 ::                                                
568                                                   
569          127           0  bit index               
570         +---------------+                         
571      V0 |               |                         
572       : :               :                         
573      V7 |               |                         
574    * V8 |               |                         
575    :  : :               :                         
576    *V15 |               |                         
577     V16 |               |                         
578       : :               :                         
579     V31 |               |                         
580         +---------------+                         
581                                                   
582                  31    0                          
583                 +-------+                         
584            FPSR |       |                         
585                 +-------+                         
586           *FPCR |       |                         
587                 +-------+                         
588                                                   
589 (*) callee-save:                                  
590     This only applies to bits [63:0] of V-regi    
591     FPCR contains a mixture of callee-save and    
592                                                   
593                                                   
594 References                                        
595 ==========                                        
596                                                   
597 [1] arch/arm64/include/uapi/asm/sigcontext.h      
598     AArch64 Linux signal ABI definitions          
599                                                   
600 [2] arch/arm64/include/uapi/asm/ptrace.h          
601     AArch64 Linux ptrace ABI definitions          
602                                                   
603 [3] Documentation/arch/arm64/cpu-feature-regis    
604                                                   
605 [4] ARM IHI0055C                                  
606     http://infocenter.arm.com/help/topic/com.a    
607     http://infocenter.arm.com/help/topic/com.a    
608     Procedure Call Standard for the ARM 64-bit    
609                                                   
610 [5] https://github.com/ARM-software/abi-aa/blo    
                                                      
