=================================================
Linux API for read access to z/VM Monitor Records
=================================================

Date  : 2004-Nov-26

Author: Gerald Schaefer (geraldsc@de.ibm.com)


Description
===========
This item delivers a new Linux API in the form of a misc char device that is
usable from user space and allows read access to the z/VM Monitor Records
collected by the `*MONITOR` System Service of z/VM.


User Requirements
=================
The z/VM guest on which you want to access this API needs to be configured in
order to allow IUCV connections to the `*MONITOR` service, i.e. it needs the
IUCV `*MONITOR` statement in its user entry. If the monitor DCSS to be used is
restricted (likely), you also need the NAMESAVE <DCSS NAME> statement.
This item will use the IUCV device driver to access the z/VM services, so you
need a kernel with IUCV support. You also need z/VM version 4.4 or 5.1.

There are two options for being able to load the monitor DCSS (examples assume
that the monitor DCSS begins at 144 MB and ends at 152 MB). You can query the
location of the monitor DCSS with the Class E privileged CP command Q NSS MAP
(the values BEGPAG and ENDPAG are given in units of 4K pages).

See also "CP Command and Utility Reference" (SC24-6081-00) for more information
on the DEF STOR and Q NSS MAP commands, as well as "Saved Segments Planning
and Administration" (SC24-6116-00) for more information on DCSSes.

1st option:
-----------
You can use the CP command DEF STOR CONFIG to define a "memory hole" in your
guest virtual storage around the address range of the DCSS.

Example: DEF STOR CONFIG 0.140M 200M.200M

This defines two blocks of storage, the first is 140MB in size and begins at
address 0MB, the second is 200MB in size and begins at address 200MB,
resulting in a total storage of 340MB.
Note that the first block should
always start at 0 and be at least 64MB in size.

2nd option:
-----------
Your guest virtual storage has to end below the starting address of the DCSS
and you have to specify the "mem=" kernel parameter in your parmfile with a
value greater than the ending address of the DCSS.

Example::

	DEF STOR 140M

This defines 140MB storage size for your guest. The parameter "mem=160M" is
then added to the parmfile.


User Interface
==============
The char device is implemented as a kernel module named "monreader",
which can be loaded via the modprobe command, or it can be compiled into the
kernel instead. There is one optional module (or kernel) parameter, "mondcss",
to specify the name of the monitor DCSS. If the module is compiled into the
kernel, the kernel parameter "monreader.mondcss=<DCSS NAME>" can be specified
in the parmfile.

The default name for the DCSS is "MONDCSS" if none is specified. If other
users are already connected to the `*MONITOR` service (e.g.
Performance Toolkit), the monitor DCSS is already defined and you have to use
the same DCSS. The CP command Q MONITOR (Class E privileged) shows the name
of the monitor DCSS, if already defined, and the users connected to the
`*MONITOR` service.
Refer to the "z/VM Performance" book (SC24-6109-00) on how to create a monitor
DCSS if your z/VM doesn't have one already. You need Class E privileges to
define and save a DCSS.

Example:
--------

::

	modprobe monreader mondcss=MYDCSS

This loads the module and sets the DCSS name to "MYDCSS".

NOTE:
-----
This API provides no interface to control the `*MONITOR` service, e.g. to
specify which data should be collected. This can be done with the CP command
MONITOR (Class E privileged), see "CP Command and Utility Reference".

Device nodes with udev:
-----------------------
After loading the module, a char device will be created along with the device
node /<udev directory>/monreader.

Device nodes without udev:
--------------------------
If your distribution does not support udev, a device node will not be created
automatically and you have to create it manually after loading the module.
To do so, you need to know the major and minor numbers of the device. These
numbers can be found in /sys/class/misc/monreader/dev.

Typing cat /sys/class/misc/monreader/dev will give an output of the form
<major>:<minor>. The device node can be created via the mknod command: enter
mknod <name> c <major> <minor>, where <name> is the name of the device node
to be created.

Example:
--------

::

	# modprobe monreader
	# cat /sys/class/misc/monreader/dev
	10:63
	# mknod /dev/monreader c 10 63

This loads the module with the default monitor DCSS (MONDCSS) and creates a
device node.

File operations:
----------------
The following file operations are supported: open, release, read, poll.
There are two alternative methods for reading: either non-blocking read in
conjunction with polling, or blocking read without polling. IOCTLs are not
supported.

Read:
-----
Reading from the device provides a 12 Byte monitor control element (MCE),
followed by a set of one or more contiguous monitor records (similar to the
output of the CMS utility MONWRITE without the 4K control blocks). The MCE
contains information on the type of the following record set (sample/event
data), the monitor domains contained within it and the start and end address
of the record set in the monitor DCSS. The start and end address can be used
to determine the size of the record set; the end address is the address of the
last byte of data.
The start address is needed to handle "end-of-frame" records
correctly (domain 1, record 13), i.e. it can be used to determine the record
start offset relative to a 4K page (frame) boundary.

See "Appendix A: `*MONITOR`" in the "z/VM Performance" document for a description
of the monitor control element layout. The layout of the monitor records can
be found here (z/VM 5.1): https://www.vm.ibm.com/pubs/mon510/index.html

The layout of the data stream provided by the monreader device is as follows::

	...
	<0 byte read>
	<first MCE>              \
	<first set of records>    |
	...                       |- data set
	<last MCE>                |
	<last set of records>    /
	<0 byte read>
	...

There may be more than one combination of MCE and corresponding record set
within one data set and the end of each data set is indicated by a successful
read with a return value of 0 (0 byte read).
Any received data must be considered invalid until a complete set was
read successfully, including the closing 0 byte read. Therefore you should
always read the complete set into a buffer before processing the data.

The maximum size of a data set can be as large as the size of the
monitor DCSS, so design the buffer adequately or use dynamic memory allocation.
The size of the monitor DCSS will be printed into syslog after loading the
module. You can also use the (Class E privileged) CP command Q NSS MAP to
list all available segments and information about them.

As with most char devices, error conditions are indicated by returning a
negative value for the number of bytes read. In this case, the errno variable
indicates the error condition:

EIO:
     reply failed, read data is invalid and the application
     should discard the data read since the last successful read with 0 size.
EFAULT:
     copy_to_user failed, read data is invalid and the application should
     discard the data read since the last successful read with 0 size.
EAGAIN:
     occurs on a non-blocking read if there is no data available at the
     moment. There is no data missing or corrupted, just try again or rather
     use polling for non-blocking reads.
EOVERFLOW:
     message limit reached, the data read since the last successful
     read with 0 size is valid but subsequent records may be missing.

In the last case (EOVERFLOW) there may be missing data, in the first two cases
(EIO, EFAULT) there will be missing data. It's up to the application if it will
continue reading subsequent data or rather exit.

Open:
-----
Only one user is allowed to open the char device. If it is already in use, the
open function will fail (return a negative value) and set errno to EBUSY.
The open function may also fail if an IUCV connection to the `*MONITOR` service
cannot be established. In this case errno will be set to EIO and an error
message with an IPUSER SEVER code will be printed into syslog. The IPUSER SEVER
codes are described in the "z/VM Performance" book, Appendix A.

NOTE:
-----
As soon as the device is opened, incoming messages will be accepted and they
will account for the message limit, i.e. opening the device without reading
from it will provoke the "message limit reached" error (EOVERFLOW error code)
eventually.
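
Example:
--------
The reading rules described under "Read" (buffer everything up to the closing
0 byte read, discard on EIO or EFAULT, keep the data on EOVERFLOW) can be
sketched in user space roughly as follows. This is a minimal illustration in
Python using blocking reads, not part of the driver. The function names
``read_data_set`` and ``split_record_sets`` are hypothetical, and the MCE field
offsets (start address at byte 4, end address at byte 8, both big-endian
32-bit) are an assumption that is not stated in this document and should be
checked against Appendix A of "z/VM Performance".

```python
import errno
import os
import struct

MCE_SIZE = 12  # from the text: each record set is preceded by a 12 byte MCE

# ASSUMPTION (not stated in this document): the record set start and end
# addresses are big-endian 32-bit values at MCE offsets 4 and 8.
MCE_START_OFFSET = 4
MCE_END_OFFSET = 8


def read_data_set(fd, chunk_size=4096, read=os.read):
    """Collect one data set with blocking reads; return (data, overflow).

    Data is returned only after the closing 0 byte read. On EIO or EFAULT
    the partial data is dropped and the OSError propagates. On EOVERFLOW
    the data read so far is returned with overflow=True, because
    subsequent records may be missing.
    """
    chunks = []
    while True:
        try:
            chunk = read(fd, chunk_size)
        except OSError as exc:
            if exc.errno == errno.EOVERFLOW:
                return b"".join(chunks), True
            raise  # EIO, EFAULT, EAGAIN, ...: caller decides what to do
        if not chunk:  # 0 byte read marks the end of the data set
            return b"".join(chunks), False
        chunks.append(chunk)


def split_record_sets(data_set):
    """Split a complete data set into a list of (mce, records) pairs."""
    pairs = []
    pos = 0
    while pos < len(data_set):
        mce = data_set[pos:pos + MCE_SIZE]
        if len(mce) < MCE_SIZE:
            raise ValueError("truncated MCE")
        (start,) = struct.unpack_from(">I", mce, MCE_START_OFFSET)
        (end,) = struct.unpack_from(">I", mce, MCE_END_OFFSET)
        size = end - start + 1  # end is the address of the last data byte
        records = data_set[pos + MCE_SIZE:pos + MCE_SIZE + size]
        if size <= 0 or len(records) < size:
            raise ValueError("truncated record set")
        pairs.append((mce, records))
        pos += MCE_SIZE + size
    return pairs
```

With a real device, ``fd`` would come from something like
``os.open("/dev/monreader", os.O_RDONLY)``, using the device node created in
the mknod example. The ``read`` parameter exists only so that the loop can be
exercised without a z/VM guest.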