Linux/Documentation/userspace-api/mfd_noexec.rst


.. SPDX-License-Identifier: GPL-2.0

==================================
Introduction of non-executable mfd
==================================
:Author:
    Daniel Verkamp <dverkamp@chromium.org>
    Jeff Xu <jeffxu@chromium.org>

:Contributor:
        Aleksa Sarai <cyphar@cyphar.com>

Since Linux introduced the memfd feature, memfds have always had their
execute bit set, and the memfd_create() syscall doesn't allow setting
it differently.

However, in a secure-by-default system such as ChromeOS (where all
executables should come from the rootfs, which is protected by verified
boot), the executable nature of memfd opens a door for NoExec bypass
and enables a “confused deputy attack”. For example, in VRP bug [1], the
cros_vm process created a memfd to share content with an external
process, but the memfd was overwritten and then used to execute
arbitrary code and escalate to root. [2] lists more VRP bugs of this kind.

On the other hand, executable memfds have legitimate uses: runc uses
memfd’s seal and executable features to copy the contents of a binary
and then execute them. For such a system, we need a way to differentiate
runc's use of executable memfds from an attacker's [3].

To address the above:
 - Let memfd_create() set the X bit at creation time.
 - Let a memfd be sealed against modification of the X bit when NX is set.
 - Add a new pid namespace sysctl, vm.memfd_noexec, to help applications
   migrate to and enforce non-executable MFD.

User API
========
``int memfd_create(const char *name, unsigned int flags)``

``MFD_NOEXEC_SEAL``
        When the MFD_NOEXEC_SEAL bit is set in ``flags``, the memfd is
        created with NX. F_SEAL_EXEC is set and the memfd can't be modified
        to add X later. MFD_ALLOW_SEALING is also implied.
        This is the most common case for applications that use memfd.

``MFD_EXEC``
        When the MFD_EXEC bit is set in ``flags``, the memfd is created with X.

Note:
        ``MFD_NOEXEC_SEAL`` implies ``MFD_ALLOW_SEALING``. If an app
        doesn't want further sealing, it can add F_SEAL_SEAL after creation.
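
A minimal user-space sketch of the flags above, assuming a kernel and C
library recent enough to know them. The fallback ``#define`` values are the
ones used by the Linux UAPI headers; the helper names
(``create_noexec_memfd``, ``memfd_is_noexec_sealed``,
``memfd_forbid_more_seals``) are hypothetical and exist only for this
illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Fallbacks for older libc headers; values match the Linux UAPI headers. */
#ifndef MFD_NOEXEC_SEAL
#define MFD_NOEXEC_SEAL 0x0008U
#endif
#ifndef MFD_EXEC
#define MFD_EXEC 0x0010U
#endif
#ifndef F_SEAL_EXEC
#define F_SEAL_EXEC 0x0020
#endif

/*
 * Create a memfd with NX and F_SEAL_EXEC applied at creation time.
 * Returns the fd, or -1 with errno set. Kernels that predate the flag
 * are expected to reject it (typically with EINVAL).
 */
int create_noexec_memfd(const char *name)
{
    assert(name != NULL);
    return memfd_create(name, MFD_NOEXEC_SEAL);
}

/*
 * Return 1 if the fd carries F_SEAL_EXEC and has no execute bits,
 * 0 if not, and -1 if the seals or mode could not be read.
 */
int memfd_is_noexec_sealed(int fd)
{
    struct stat st;
    int seals = fcntl(fd, F_GET_SEALS);

    if (seals < 0 || fstat(fd, &st) < 0)
        return -1;
    return (seals & F_SEAL_EXEC) && !(st.st_mode & 0111);
}

/*
 * MFD_NOEXEC_SEAL implies MFD_ALLOW_SEALING, so an application that wants
 * no further seals can add F_SEAL_SEAL itself. Returns 0 on success.
 */
int memfd_forbid_more_seals(int fd)
{
    return fcntl(fd, F_ADD_SEALS, F_SEAL_SEAL);
}
```

A caller would typically check the fd returned by ``create_noexec_memfd()``
and treat a failure as "this kernel does not support the flag" before
deciding on a fallback.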


Sysctl:
========
``pid namespaced sysctl vm.memfd_noexec``

The new pid namespaced sysctl vm.memfd_noexec has 3 values:

 - 0: MEMFD_NOEXEC_SCOPE_EXEC
        memfd_create() with neither MFD_EXEC nor MFD_NOEXEC_SEAL acts as if
        MFD_EXEC was set.

 - 1: MEMFD_NOEXEC_SCOPE_NOEXEC_SEAL
        memfd_create() with neither MFD_EXEC nor MFD_NOEXEC_SEAL acts as if
        MFD_NOEXEC_SEAL was set.

 - 2: MEMFD_NOEXEC_SCOPE_NOEXEC_ENFORCED
        memfd_create() without MFD_NOEXEC_SEAL will be rejected.

The sysctl allows finer control of memfd_create() for old software that
sets neither MFD_EXEC nor MFD_NOEXEC_SEAL; for example, in a container
with vm.memfd_noexec=1, old software creates non-executable memfds by
default, while new software can still create executable memfds by setting
MFD_EXEC.
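
The three sysctl values can be read as a small decision table. The sketch
below is a hypothetical user-space model of that table, not kernel code:
``resolve_memfd_flags`` is an invented name, and the handling of a call
with neither flag at level 2 (it is given the same default as level 1, so
only an explicit MFD_EXEC is refused) is an assumption about how the
levels combine rather than something the text above spells out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Flag values as used by the Linux UAPI headers. */
#ifndef MFD_NOEXEC_SEAL
#define MFD_NOEXEC_SEAL 0x0008U
#endif
#ifndef MFD_EXEC
#define MFD_EXEC 0x0010U
#endif

/* Names and values of the three vm.memfd_noexec levels. */
enum memfd_noexec_scope {
    MEMFD_NOEXEC_SCOPE_EXEC = 0,
    MEMFD_NOEXEC_SCOPE_NOEXEC_SEAL = 1,
    MEMFD_NOEXEC_SCOPE_NOEXEC_ENFORCED = 2,
};

/*
 * Model of how the sysctl level and the caller's flags combine.
 * Returns true and stores the effective flags in *effective when the
 * call would be accepted; returns false when it would be rejected.
 */
bool resolve_memfd_flags(int scope, unsigned int flags, unsigned int *effective)
{
    assert(effective != NULL);

    /* Old callers set neither flag: the sysctl picks the default. */
    if (!(flags & (MFD_EXEC | MFD_NOEXEC_SEAL))) {
        if (scope >= MEMFD_NOEXEC_SCOPE_NOEXEC_SEAL)
            flags |= MFD_NOEXEC_SEAL;   /* assumption: also applies at level 2 */
        else
            flags |= MFD_EXEC;
    }

    /* Level 2: anything that is still not MFD_NOEXEC_SEAL is refused. */
    if (scope >= MEMFD_NOEXEC_SCOPE_NOEXEC_ENFORCED && !(flags & MFD_NOEXEC_SEAL))
        return false;

    *effective = flags;
    return true;
}
```

Under this model, a legacy caller in a container with vm.memfd_noexec=1
ends up with MFD_NOEXEC_SEAL, while a caller that passes MFD_EXEC
explicitly keeps an executable memfd.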

The value of vm.memfd_noexec is passed to a child namespace at creation
time. In addition, the setting is hierarchical, i.e. during memfd_create(),
the kernel searches from the current ns up to the root ns and uses the most
restrictive setting.
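
The hierarchical lookup can be pictured as a walk up the namespace chain
that keeps the highest value seen. The following is a hypothetical
user-space model only: ``struct pidns_model`` and
``effective_memfd_noexec`` are invented for this illustration and do not
correspond to kernel data structures.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a pid namespace: its own sysctl value and its parent. */
struct pidns_model {
    int memfd_noexec;                 /* 0, 1 or 2 */
    const struct pidns_model *parent; /* NULL for the root namespace */
};

/*
 * Walk from the given namespace up to the root and return the most
 * restrictive (numerically largest) vm.memfd_noexec value on the path.
 */
int effective_memfd_noexec(const struct pidns_model *ns)
{
    int scope = 0;

    assert(ns != NULL);
    for (; ns != NULL; ns = ns->parent) {
        if (ns->memfd_noexec > scope)
            scope = ns->memfd_noexec;
    }
    return scope;
}
```

In this model a child namespace that sets 0 under a parent that sets 1
still resolves to 1, so a child cannot loosen what an ancestor enforces.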

[1] https://crbug.com/1305267

[2] https://bugs.chromium.org/p/chromium/issues/list?q=type%3Dbug-security%20memfd%20escalation&can=1

[3] https://lwn.net/Articles/781013/
