====================
Userspace MAD access
====================

Device files
============

Each port of each InfiniBand device has a "umad" device and an
"issm" device attached.  For example, a two-port HCA will have two
umad devices and two issm devices, while a switch will have one
device of each type (for switch port 0).

Creating MAD agents
===================

A MAD agent can be created by filling in a struct ib_user_mad_reg_req
and then calling the IB_USER_MAD_REGISTER_AGENT ioctl on a file
descriptor for the appropriate device file.  If the registration
request succeeds, a 32-bit id will be returned in the structure.
For example::

    struct ib_user_mad_reg_req req = { /* ... */ };
    ret = ioctl(fd, IB_USER_MAD_REGISTER_AGENT, (char *) &req);
    if (!ret)
        my_agent = req.id;
    else
        perror("agent register");

Agents can be unregistered with the IB_USER_MAD_UNREGISTER_AGENT
ioctl.  Also, all agents registered through a file descriptor will
be unregistered when the descriptor is closed.
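
Explicit unregistration can be sketched as follows. This is only an
illustration: the helper name umad_unregister_agent is hypothetical,
and the sketch assumes that the ioctl argument is a pointer to the
32-bit agent id and that the rdma/ib_user_mad.h UAPI header is
installed.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <rdma/ib_user_mad.h>

/*
 * Hypothetical helper (not part of any library): unregister one agent.
 * Assumes the ioctl takes a pointer to the 32-bit id that was returned
 * in req.id at registration time.
 * Returns 0 on success, or -1 with errno set by ioctl().
 */
static int umad_unregister_agent(int fd, uint32_t agent_id)
{
	return ioctl(fd, IB_USER_MAD_UNREGISTER_AGENT, &agent_id);
}
```

A caller would typically pass the umad file descriptor and the saved
my_agent value, and report failures with perror() as in the
registration example.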

2014
    A new registration ioctl (IB_USER_MAD_REGISTER_AGENT2) is now
    provided which allows additional fields to be supplied during
    registration.  Users of this registration call implicitly enable
    the use of pkey_index (see below).

Receiving MADs
==============

MADs are received using read().  The receive side now supports
RMPP.  The buffer passed to read() must be at least one
struct ib_user_mad + 256 bytes.

If the buffer passed is not large enough to hold the received
MAD (RMPP), errno is set to ENOSPC and the length of the
buffer needed is set in mad.hdr.length.

Example for normal MAD (non RMPP) reads::

    struct ib_user_mad *mad;
    mad = malloc(sizeof *mad + 256);
    ret = read(fd, mad, sizeof *mad + 256);
    if (ret != sizeof *mad + 256) {
        perror("read");
        free(mad);
    }

Example for RMPP reads::

    struct ib_user_mad *mad;
    mad = malloc(sizeof *mad + 256);
    ret = read(fd, mad, sizeof *mad + 256);
    if (ret < 0 && errno == ENOSPC) {
        /* buffer too small: retry with the length the kernel reported */
        length = mad->hdr.length;
        free(mad);
        mad = malloc(sizeof *mad + length);
        ret = read(fd, mad, sizeof *mad + length);
    }
    if (ret < 0) {
        perror("read");
        free(mad);
    }

In addition to the actual MAD contents, the other struct ib_user_mad
fields will be filled in with information on the received MAD.  For
example, the remote LID will be in mad.hdr.lid.

If a send times out, a receive will be generated with mad.hdr.status
set to ETIMEDOUT.  Otherwise, when a MAD has been successfully
received, mad.hdr.status will be 0.

poll()/select() may be used to wait until a MAD can be read.

Sending MADs
============

MADs are sent using write().  The agent ID for sending should be
filled into the id field of the MAD, the destination LID should be
filled into the lid field, and so on.  The send side supports
RMPP, so MADs of arbitrary length can be sent.  For example::

    struct ib_user_mad *mad;

    mad = malloc(sizeof *mad + mad_length);

    /* fill in mad->data */

    mad->hdr.id  = my_agent;    /* req.id from agent registration */
    mad->hdr.lid = my_dest;     /* in network byte order... */
    /* etc. */

    /* pass the buffer itself, not the address of the pointer */
    ret = write(fd, mad, sizeof *mad + mad_length);
    if (ret != sizeof *mad + mad_length)
        perror("write");

Transaction IDs
===============

Users of the umad devices can use the lower 32 bits of the
transaction ID field (that is, the least significant half of the
field in network byte order) in MADs being sent to match
request/response pairs.  The upper 32 bits are reserved for use by
the kernel and will be overwritten before a MAD is sent.
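
The split of the transaction ID can be sketched as follows. This is a
minimal illustration, not kernel code: the helper names and the
MAD_TID_OFFSET constant are hypothetical, and the sketch assumes the
64-bit transaction ID sits at byte offset 8 of the MAD, after the
common header fields.

```c
#include <stdint.h>

/* Assumed byte offset of the 64-bit transaction ID within a MAD. */
#define MAD_TID_OFFSET 8

/*
 * Hypothetical helper: store a caller-chosen 32-bit value in the lower
 * half of the transaction ID, in network byte order.  The upper half is
 * zeroed here because the kernel overwrites it before sending anyway.
 */
static void mad_set_user_tid(uint8_t *mad_data, uint32_t user_tid)
{
	uint8_t *tid = mad_data + MAD_TID_OFFSET;

	tid[0] = tid[1] = tid[2] = tid[3] = 0;
	tid[4] = (uint8_t)(user_tid >> 24);
	tid[5] = (uint8_t)(user_tid >> 16);
	tid[6] = (uint8_t)(user_tid >> 8);
	tid[7] = (uint8_t)user_tid;
}

/*
 * Hypothetical helper: recover the caller's 32 bits from a received
 * MAD, ignoring the kernel-owned upper half.
 */
static uint32_t mad_get_user_tid(const uint8_t *mad_data)
{
	const uint8_t *tid = mad_data + MAD_TID_OFFSET;

	return ((uint32_t)tid[4] << 24) | ((uint32_t)tid[5] << 16) |
	       ((uint32_t)tid[6] << 8) | (uint32_t)tid[7];
}
```

A sender might call mad_set_user_tid((uint8_t *) mad->data, seq) before
write() and compare mad_get_user_tid() on each received MAD against its
outstanding sequence numbers.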

P_Key Index Handling
====================

The old ib_umad interface did not allow setting the P_Key index for
MADs that are sent and did not provide a way for obtaining the P_Key
index of received MADs.  A new layout for struct ib_user_mad_hdr
with a pkey_index member has been defined; however, to preserve binary
compatibility with older applications, this new layout will not be used
unless one of the IB_USER_MAD_ENABLE_PKEY or IB_USER_MAD_REGISTER_AGENT2
ioctls is called before a file descriptor is used for anything else.

In September 2008, the IB_USER_MAD_ABI_VERSION will be incremented
to 6, the new layout of struct ib_user_mad_hdr will be used by
default, and the IB_USER_MAD_ENABLE_PKEY ioctl will be removed.

Setting IsSM Capability Bit
===========================

To set the IsSM capability bit for a port, simply open the
corresponding issm device file.  If the IsSM bit is already set,
then the open call will block until the bit is cleared (or return
immediately with errno set to EAGAIN if the O_NONBLOCK flag is
passed to open()).  The IsSM bit will be cleared when the issm file
is closed.  No read, write or other operations can be performed on
the issm file.

/dev files
==========

To create the appropriate character device files automatically with
udev, a rule like::

    KERNEL=="umad*", NAME="infiniband/%k"
    KERNEL=="issm*", NAME="infiniband/%k"

can be used.  This will create device nodes named::

    /dev/infiniband/umad0
    /dev/infiniband/issm0

for the first port, and so on.  The InfiniBand device and port
associated with these devices can be determined from the files::

    /sys/class/infiniband_mad/umad0/ibdev
    /sys/class/infiniband_mad/umad0/port

and::

    /sys/class/infiniband_mad/issm0/ibdev
    /sys/class/infiniband_mad/issm0/port
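
Looking up the device and port for a given node can be sketched as
follows. The helper name read_sysfs_attr is hypothetical, and the
sketch assumes each attribute file holds a single line of text.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: read the first line of a sysfs attribute such as
 * /sys/class/infiniband_mad/umad0/ibdev into buf, without the newline.
 * Returns 0 on success, -1 if the path is too long or the file cannot
 * be opened or read.
 */
static int read_sysfs_attr(const char *class_dev, const char *attr,
			   char *buf, size_t len)
{
	char path[256];
	FILE *f;
	int n;

	if (len == 0)
		return -1;
	n = snprintf(path, sizeof path, "/sys/class/infiniband_mad/%s/%s",
		     class_dev, attr);
	if (n < 0 || n >= (int)sizeof path)
		return -1;
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (!fgets(buf, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	buf[strcspn(buf, "\n")] = '\0';	/* strip the trailing newline */
	return 0;
}
```

For example, read_sysfs_attr("umad0", "ibdev", name, sizeof name)
followed by read_sysfs_attr("umad0", "port", port, sizeof port) would
identify the device and port behind /dev/infiniband/umad0.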