~ [ source navigation ] ~ [ diff markup ] ~ [ identifier search ] ~

TOMOYO Linux Cross Reference
Linux/Documentation/arch/arm/kernel_mode_neon.rst

Version: ~ [ linux-6.12-rc7 ] ~ [ linux-6.11.7 ] ~ [ linux-6.10.14 ] ~ [ linux-6.9.12 ] ~ [ linux-6.8.12 ] ~ [ linux-6.7.12 ] ~ [ linux-6.6.60 ] ~ [ linux-6.5.13 ] ~ [ linux-6.4.16 ] ~ [ linux-6.3.13 ] ~ [ linux-6.2.16 ] ~ [ linux-6.1.116 ] ~ [ linux-6.0.19 ] ~ [ linux-5.19.17 ] ~ [ linux-5.18.19 ] ~ [ linux-5.17.15 ] ~ [ linux-5.16.20 ] ~ [ linux-5.15.171 ] ~ [ linux-5.14.21 ] ~ [ linux-5.13.19 ] ~ [ linux-5.12.19 ] ~ [ linux-5.11.22 ] ~ [ linux-5.10.229 ] ~ [ linux-5.9.16 ] ~ [ linux-5.8.18 ] ~ [ linux-5.7.19 ] ~ [ linux-5.6.19 ] ~ [ linux-5.5.19 ] ~ [ linux-5.4.285 ] ~ [ linux-5.3.18 ] ~ [ linux-5.2.21 ] ~ [ linux-5.1.21 ] ~ [ linux-5.0.21 ] ~ [ linux-4.20.17 ] ~ [ linux-4.19.323 ] ~ [ linux-4.18.20 ] ~ [ linux-4.17.19 ] ~ [ linux-4.16.18 ] ~ [ linux-4.15.18 ] ~ [ linux-4.14.336 ] ~ [ linux-4.13.16 ] ~ [ linux-4.12.14 ] ~ [ linux-4.11.12 ] ~ [ linux-4.10.17 ] ~ [ linux-4.9.337 ] ~ [ linux-4.4.302 ] ~ [ linux-3.10.108 ] ~ [ linux-2.6.32.71 ] ~ [ linux-2.6.0 ] ~ [ linux-2.4.37.11 ] ~ [ unix-v6-master ] ~ [ ccs-tools-1.8.12 ] ~ [ policy-sample ] ~
Architecture: ~ [ i386 ] ~ [ alpha ] ~ [ m68k ] ~ [ mips ] ~ [ ppc ] ~ [ sparc ] ~ [ sparc64 ] ~

  1 ================
  2 Kernel mode NEON
  3 ================
  4 
  5 TL;DR summary
  6 -------------
  7 * Use only NEON instructions, or VFP instructions that don't rely on support
  8   code
  9 * Isolate your NEON code in a separate compilation unit, and compile it with
 10   '-march=armv7-a -mfpu=neon -mfloat-abi=softfp'
 11 * Put kernel_neon_begin() and kernel_neon_end() calls around the calls into your
 12   NEON code
 13 * Don't sleep in your NEON code, and be aware that it will be executed with
 14   preemption disabled
 15 
 16 
 17 Introduction
 18 ------------
 19 It is possible to use NEON instructions (and in some cases, VFP instructions) in
 20 code that runs in kernel mode. However, for performance reasons, the NEON/VFP
 21 register file is not preserved and restored at every context switch or taken
 22 exception like the normal register file is, so some manual intervention is
 23 required. Furthermore, special care is required for code that may sleep [i.e.,
 24 may call schedule()], as NEON or VFP instructions will be executed in a
 25 non-preemptible section for reasons outlined below.
 26 
 27 
 28 Lazy preserve and restore
 29 -------------------------
 30 The NEON/VFP register file is managed using lazy preserve (on UP systems) and
 31 lazy restore (on both SMP and UP systems). This means that the register file is
 32 kept 'live', and is only preserved and restored when multiple tasks are
 33 contending for the NEON/VFP unit (or, in the SMP case, when a task migrates to
 34 another core). Lazy restore is implemented by disabling the NEON/VFP unit after
 35 every context switch, resulting in a trap when subsequently a NEON/VFP
 36 instruction is issued, allowing the kernel to step in and perform the restore if
 37 necessary.
 38 
 39 Any use of the NEON/VFP unit in kernel mode should not interfere with this, so
 40 it is required to do an 'eager' preserve of the NEON/VFP register file, and
 41 enable the NEON/VFP unit explicitly so no exceptions are generated on first
 42 subsequent use. This is handled by the function kernel_neon_begin(), which
 43 should be called before any kernel mode NEON or VFP instructions are issued.
 44 Likewise, the NEON/VFP unit should be disabled again after use to make sure user
 45 mode will hit the lazy restore trap upon next use. This is handled by the
 46 function kernel_neon_end().
 47 
 48 
 49 Interruptions in kernel mode
 50 ----------------------------
 51 For reasons of performance and simplicity, it was decided that there shall be no
 52 preserve/restore mechanism for the kernel mode NEON/VFP register contents. This
 53 implies that interruptions of a kernel mode NEON section can only be allowed if
 54 they are guaranteed not to touch the NEON/VFP registers. For this reason, the
 55 following rules and restrictions apply in the kernel:
 56 * NEON/VFP code is not allowed in interrupt context;
 57 * NEON/VFP code is not allowed to sleep;
 58 * NEON/VFP code is executed with preemption disabled.
 59 
 60 If latency is a concern, it is possible to put back to back calls to
 61 kernel_neon_end() and kernel_neon_begin() in places in your code where none of
 62 the NEON registers are live. (Additional calls to kernel_neon_begin() should be
 63 reasonably cheap if no context switch occurred in the meantime)
 64 
 65 
 66 VFP and support code
 67 --------------------
 68 Earlier versions of VFP (prior to version 3) rely on software support for things
 69 like IEEE-754 compliant underflow handling etc. When the VFP unit needs such
 70 software assistance, it signals the kernel by raising an undefined instruction
 71 exception. The kernel responds by inspecting the VFP control registers and the
 72 current instruction and arguments, and emulates the instruction in software.
 73 
 74 Such software assistance is currently not implemented for VFP instructions
 75 executed in kernel mode. If such a condition is encountered, the kernel will
 76 fail and generate an OOPS.
 77 
 78 
 79 Separating NEON code from ordinary code
 80 ---------------------------------------
 81 The compiler is not aware of the special significance of kernel_neon_begin() and
 82 kernel_neon_end(), i.e., that it is only allowed to issue NEON/VFP instructions
 83 between calls to these respective functions. Furthermore, GCC may generate NEON
 84 instructions of its own at -O3 level if -mfpu=neon is selected, and even if the
 85 kernel is currently compiled at -O2, future changes may result in NEON/VFP
 86 instructions appearing in unexpected places if no special care is taken.
 87 
 88 Therefore, the recommended and only supported way of using NEON/VFP in the
 89 kernel is by adhering to the following rules:
 90 
 91 * isolate the NEON code in a separate compilation unit and compile it with
 92   '-march=armv7-a -mfpu=neon -mfloat-abi=softfp';
 93 * issue the calls to kernel_neon_begin(), kernel_neon_end() as well as the calls
 94   into the unit containing the NEON code from a compilation unit which is *not*
 95   built with the GCC flag '-mfpu=neon' set.
 96 
 97 As the kernel is compiled with '-msoft-float', the above will guarantee that
 98 both NEON and VFP instructions will only ever appear in designated compilation
 99 units at any optimization level.
100 
101 
102 NEON assembler
103 --------------
104 NEON assembler is supported with no additional caveats as long as the rules
105 above are followed.
106 
107 
108 NEON code generated by GCC
109 --------------------------
110 The GCC option -ftree-vectorize (implied by -O3) tries to exploit implicit
111 parallelism, and generates NEON code from ordinary C source code. This is fully
112 supported as long as the rules above are followed.
113 
114 
115 NEON intrinsics
116 ---------------
117 NEON intrinsics are also supported. However, as code using NEON intrinsics
118 relies on the GCC header <arm_neon.h>, (which #includes <stdint.h>), you should
119 observe the following in addition to the rules above:
120 
121 * Compile the unit containing the NEON intrinsics with '-ffreestanding' so GCC
122   uses its builtin version of <stdint.h> (this is a C99 header which the kernel
123   does not supply);
124 * Include <arm_neon.h> last, or at least after <linux/types.h>

~ [ source navigation ] ~ [ diff markup ] ~ [ identifier search ] ~

kernel.org | git.kernel.org | LWN.net | Project Home | SVN repository | Mail admin

Linux® is a registered trademark of Linus Torvalds in the United States and other countries.
TOMOYO® is a registered trademark of NTT DATA CORPORATION.

sflogo.php