Thread: [SSI-devel] Infiniband ICS - curious ICS payload addresses?
From: Smith, S. <sta...@in...> - 2006-08-30 00:42:09
Greetings,

The Infiniband verbs ICS/x86 is working reasonably well until one travels down the migration path. If RDMA.read operations are excluded, migration works fine, the same as ICS over Ethernet. When RDMA.reads are enabled, they work fine for CFS and onnode-type operations but then deliver incorrect data for migration operations. The IB RDMA.read operations all complete successfully, although a debug CRC reveals source and sink data mismatches.

Upon further investigation I noticed that the source side of migration sends numerous 4KB chunks/pages located in the highmem address range, 0xff8f5000 for example. The problem seems to be in the DMA mapping of the source page. Although the page addresses are greater than VMALLOC_START, the pages, when kmap()'ed, produce the same virtual address and eventually a CRC mismatch.

Perplexing. Can you provide some insight into how the highmem-ranged pages are created?

Thanks,

Stan.
From: John B. <joh...@hp...> - 2006-08-30 17:25:27
Smith, Stan wrote:
> Greetings,
> The Infiniband verbs ICS/x86 is working reasonably well until one
> travels down the migration path. If RDMA.read operations are excluded,
> migration works fine - same as ICS over Ethernet. When RDMA.reads are
> enabled, they work fine for CFS & onnode type operations and then
> proceed to deliver incorrect data for migration operations. The IB
> RDMA.read operations are all successful although a debug CRC reveals
> source & sink data mismatches. Upon further investigation I noticed the
> source side of migration sends numerous 4KB chunks/pages which are
> located in the highmem address range, 0xff8f5000 for example. The
> problem seems to be in DMA mapping of the source page. Although the page
> addresses are greater-than VMALLOC_START, the pages when kmap()'ed
> produce the same address and eventually a CRC mismatch?
>
> Perplexing, can you provide some insight as to how the highmem ranged
> pages are created?
>
> Thanks,
>
> Stan.

On a standard x86 Linux kernel, the kernel has 1GB of virtual address space available to it; so, on a machine with more than 1GB of physical memory (actually about 960MB after overhead, as I recall), there is no way the kernel can have permanent kernel mappings to all the physical memory in the system. So the kernel uses the kmap/kunmap calls to temporarily map such pages into a small range of reserved mappings when it needs to access them directly. (The duplicated addresses you see come about as the mappings are re-used. Note that for pages that are permanently mapped by the kernel [low memory], kmap just returns the permanent virtual address.)

These high pages are used for user application pages, so anyone allocating pages for application data should use them (by passing the GFP_HIGHUSER flag to alloc_page()).

I didn't grab a copy of your patches when you last put them out on ftp.intel.com, so I have no clue how you might be passing the addresses to IB.
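[For readers following the thread, John's point about temporary highmem mappings can be sketched in kernel-style C. This is an illustrative, non-buildable fragment, not code from the SSI patches; the helper names touch_page() and alloc_user_page() are hypothetical, while alloc_page(), kmap(), kunmap(), and GFP_HIGHUSER are standard Linux kernel APIs of that era:

```c
#include <linux/mm.h>
#include <linux/gfp.h>
#include <linux/highmem.h>

/* Hypothetical helper: access a (possibly highmem) page safely.
 * A highmem page has no permanent kernel mapping, so kmap()
 * installs a temporary one from a small reserved window; after
 * kunmap() that virtual slot is re-used, which is why two
 * different pages can appear at the same kmap() address. */
static void touch_page(struct page *pg)
{
	void *vaddr = kmap(pg);	/* temporary mapping for highmem,
				   permanent address for lowmem */
	memset(vaddr, 0, PAGE_SIZE);
	kunmap(pg);		/* slot may now be handed to another page */
}

/* Hypothetical helper: allocate a page for application data.
 * GFP_HIGHUSER lets the allocator return a highmem page, as
 * described above. */
static struct page *alloc_user_page(void)
{
	return alloc_page(GFP_HIGHUSER);
}
```

The key consequence for the bug in this thread: a kmap() address is only meaningful between the kmap() and the matching kunmap(), so any address captured outside that window may point at a different page by the time it is used.]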
I have also not played with the DMA interfaces in the Linux kernel, so I'm just guessing: if you need to pass a kernel virtual address to IB, I'd think you would do this (and the transfer) between the kmap/kunmap calls in svr_encode_var_ool_as_pg_info_p_p()/as_pg_info_free_data_callback(). (You'd also need to generate the CRC in this window, with a separate kmap/kunmap if you needed it sooner.) On the client side, you would need to do everything in cli_decode_var_ool_as_pg_info_p_p(). If you are calling DMA interfaces, dma_map_page() looks to be the one to call.

If you want to drop your code again, I could take a look.

John
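[A minimal sketch of the dma_map_page() approach John suggests: map the struct page itself for DMA rather than handing the HCA a kmap()'ed virtual address, since dma_map_page() takes a struct page and therefore works for highmem pages. This is a non-buildable illustration, not code from the SSI patches; the helper names and the choice of DMA direction are assumptions, and dma_mapping_error() is shown with its single-argument 2.6-era signature (later kernels add a struct device argument):

```c
#include <linux/dma-mapping.h>
#include <linux/mm.h>

/* Hypothetical helper for the migration source side: the remote
 * node's RDMA.read makes the local HCA *read* this page, hence
 * DMA_TO_DEVICE. 'dev' would be the IB HCA's struct device. */
static dma_addr_t map_for_rdma_read(struct device *dev, struct page *pg)
{
	dma_addr_t bus = dma_map_page(dev, pg, 0, PAGE_SIZE,
				      DMA_TO_DEVICE);
	if (dma_mapping_error(bus))
		return 0;	/* caller must treat 0 as failure */
	return bus;
}

/* Hypothetical helper: tear down the mapping once the RDMA
 * operation has completed. */
static void unmap_after_rdma(struct device *dev, dma_addr_t bus)
{
	dma_unmap_page(dev, bus, PAGE_SIZE, DMA_TO_DEVICE);
}
```

The design point is that the bus address returned by dma_map_page() stays valid until dma_unmap_page(), independent of any kmap() window, which sidesteps the re-used-mapping problem described earlier in the thread.]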