[01:39:45] *** Joins: dlw1 (~Thunderbi@114.255.44.143)
[01:39:58] *** Quits: dlw (~Thunderbi@114.255.44.143) (Read error: Connection reset by peer)
[01:39:58] *** dlw1 is now known as dlw
[01:40:27] *** Quits: stefanha` (~stefanha@yuzuki.vmsplice.net) (Ping timeout: 240 seconds)
[01:40:34] *** Joins: stefanha (~stefanha@yuzuki.vmsplice.net)
[03:45:18] *** Quits: dlw (~Thunderbi@114.255.44.143) (Ping timeout: 256 seconds)
[04:14:59] jimharris: replied. I'll try to come up with some extra vhost checks tomorrow
[06:27:04] thanks darsto - i remembered the qemu/nvdimm memory registration issues you hit - i just didn't remember how exactly they got "resolved"
[06:55:36] FYI community meeting in just over an hour from now. Different WebEx info than usual, see https://trello.com/b/DvM7XayJ/spdk-community-meeting-agenda for details
[07:57:56] *** Joins: tomzawadzki (~tomzawadz@192.55.54.44)
[08:01:15] *** Joins: tkulasek (~tkulasek@134.134.139.75)
[08:38:38] *** Joins: tzawadzki (tomzawadzk@nat/intel/x-njyfzxkzgdrzticn)
[08:38:38] *** Quits: tomzawadzki (~tomzawadz@192.55.54.44) (Remote host closed the connection)
[08:48:47] drv, I'm having a bit of a Linux kernel driver issue with QAT, if you have some thoughts... I have 2 different systems that were both working with QAT until I crashed them for various reasons. On reboot, neither system seems to be able to reload the kernel modules.
[08:48:54] i get a bunch of errors like this in dmesg: qat_c62x: Unknown symbol adf_devmgr_add_dev (err 0)
[08:49:10] and I confirmed the module's kernel version matches the kernel I'm running
[08:50:37] and when I manually try to insmod the modules I get "invalid symbol in module" or some other nonsense. Have a bunch of meetings, but if you or anyone else has any ideas, that'd be great
[08:51:20] oh, and all of the adf* dmesg errors represent function names in the module I'm trying to load...
[09:28:03] *** Quits: tzawadzki (tomzawadzk@nat/intel/x-njyfzxkzgdrzticn) (Remote host closed the connection)
[09:28:12] *** Joins: tzawadzki (~tomzawadz@192.55.54.44)
[09:39:21] tkulasek, you there?
[09:42:42] yes
[09:47:57] wrt your session creation comment, are you suggesting that I create my own pool of "created sessions" up front and then grab one and init it w/each crypto operation? Because regardless I do need a unique one for every outstanding IO, correct? (I can just re-use without recreating is what I understood)
[09:52:02] peluse: the QAT driver thing sounds like maybe the QAT modules are compiled against a different version of the kernel or something like that?
[09:52:17] are the qat drivers upstream or are they a separate package that you have to build?
[09:55:46] You need a pair of sessions, one for encoding and one for decoding. Once you create a session and initialize it with a crypto device and xform, you may reuse it. As for the driver issue, I haven't worked with QAT in more than a year, but it was from 01.org, as I remember.
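For context, a minimal sketch of the session pattern tkulasek describes, against the DPDK 17.11-era cryptodev API (the generation in use at the time). The device id, session mempool, cipher algorithm, and key/IV parameters here are illustrative assumptions, not details taken from the discussion:

    #include <rte_crypto.h>
    #include <rte_cryptodev.h>
    #include <rte_mempool.h>

    /*
     * Create one reusable session for a given cipher direction. The idea is
     * to build two of these up front (one RTE_CRYPTO_CIPHER_OP_ENCRYPT, one
     * RTE_CRYPTO_CIPHER_OP_DECRYPT) and reuse them, rather than creating a
     * session per outstanding I/O.
     */
    static struct rte_cryptodev_sym_session *
    create_cipher_session(uint8_t dev_id, struct rte_mempool *session_pool,
                          enum rte_crypto_cipher_operation op,
                          uint8_t *key, uint16_t key_len)
    {
            struct rte_crypto_sym_xform cipher_xform = {
                    .type = RTE_CRYPTO_SYM_XFORM_CIPHER,
                    .next = NULL,
                    .cipher = {
                            .op = op,
                            .algo = RTE_CRYPTO_CIPHER_AES_CBC,
                            .key = { .data = key, .length = key_len },
                            /* IV lives in each rte_crypto_op's private area. */
                            .iv = {
                                    .offset = sizeof(struct rte_crypto_op) +
                                              sizeof(struct rte_crypto_sym_op),
                                    .length = 16,
                            },
                    },
            };
            struct rte_cryptodev_sym_session *session;

            /* Allocate an uninitialized session from the session mempool. */
            session = rte_cryptodev_sym_session_create(session_pool);
            if (session == NULL) {
                    return NULL;
            }

            /* Bind the session to one device using the cipher xform. */
            if (rte_cryptodev_sym_session_init(dev_id, session, &cipher_xform,
                                               session_pool) != 0) {
                    rte_cryptodev_sym_session_free(session);
                    return NULL;
            }

            return session;
    }

Per-operation state (buffer addresses, IV bytes) then lives in each rte_crypto_op, which is attached to the shared session via rte_crypto_op_attach_sym_session(); this is why a unique session per outstanding I/O shouldn't be needed.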
[09:58:02] it looks like at least some of the qat drivers are in the upstream kernel
[09:58:12] drivers/crypto/qat in the source tree
[09:58:20] so I'm not sure if peluse is using those or a different package
[10:00:53] *** Joins: travis-ci (~travis-ci@ec2-54-198-47-146.compute-1.amazonaws.com)
[10:00:54] (spdk/master) bdev/qos: Break out code to destroy the qos into a separate function (Ben Walker)
[10:00:54] Diff URL: https://github.com/spdk/spdk/compare/1d168901c6f2...6cd524d87c3a
[10:00:54] *** Parts: travis-ci (~travis-ci@ec2-54-198-47-146.compute-1.amazonaws.com) ()
[10:03:33] drv, yeah the strange thing is everything was working until I rebooted and I didn't change any of the drivers. I didn't build them originally either, they were just there w/the kernel
[10:03:58] ok, if they're part of the normal kernel build, that sounds like it should "just work" then
[10:04:17] and I didn't update anything before reboot either. I did go through all the steps of unbinding, enabling VFs and binding to the DPDK drivers though
[10:05:07] tkulasek, but I need a pair for each outstanding IO, right?
[10:07:14] looks like that symbol you mentioned above is part of intel_qat.ko - does it work if you manually load that first?
[10:08:08] let me check
[10:09:08] tkulasek, the reason I mention that is that one of the params to rte_cryptodev_sym_session_init() is the cipher_xform, which includes the address of the individual crypto operation
[10:11:46] I need to look in the code to make sure
[10:16:28] tkulasek, ok, thanks. No hurry as I've still got a list of TODO items :) Appreciate the inputs!
[10:24:25] which parameter?
[10:24:26] drv, yeah that's the first module that I was trying to load. Note that when I first plugged this card in, I didn't have to insmod anything. I did 'lsmod | grep qa' and it showed up
[10:25:40] * peluse is about ready to remove the card, reboot, add it back in and see if it comes up. Akin to pissing on a spark plug (drv must get that reference...)
[10:30:49] jimharris: darsto posted a follow-up comment on https://review.gerrithub.io/#/c/spdk/spdk/+/410071/ (the RTE_BAD_IOVA patch)
[10:31:56] yeah - i saw it - i'm not sure what darsto has planned to fix it though - looking forward to seeing it :)
[10:32:40] I'm not sure I really understand the issue
[10:33:24] is it that rte_mem_virt2phy() returns 0, and then we try to load from the virtual address and it crashes?
[10:33:25] darsto was playing with QEMU, vhost and Clear Containers a while back - it will pass an emulated NVDIMM to the VM
[10:33:43] and QEMU will send that memory region in the SET_MEM_TABLE vhost message
[10:34:01] seems like if we get passed an address in memory registration that can't be dereferenced, then there was a problem somewhere earlier in the chain
[10:34:28] agreed - right now (before my patch), it "works" because DPDK returns BAD_IOVA and we're checking for 0
[10:34:44] so we don't try to touch the address and just return failure
[10:35:05] i agree - we need to get to the bottom of why we can't register the NVDIMM region in the vhost process
[10:37:04] is there a version where DPDK switched from returning 0 to returning BAD_PHYS_ADDR/BAD_IOVA, or was 0 always the wrong thing to look for?
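For reference on the 0-vs-BAD_IOVA point: a hypothetical translation check (not SPDK's actual registration code) showing why testing the sentinel rather than 0 matters. Older DPDK returned 0 from rte_mem_virt2phy() on failure; newer DPDK returns RTE_BAD_IOVA (equal to RTE_BAD_PHYS_ADDR, i.e. all bits set):

    #include <errno.h>
    #include <rte_memory.h>

    /*
     * Hypothetical registration-time check. Comparing against 0 misses the
     * failure on newer DPDK (which returns RTE_BAD_IOVA) and would also
     * misclassify a legitimate physical address of 0 as an error.
     */
    static int
    check_translation(const void *vaddr)
    {
            phys_addr_t paddr = rte_mem_virt2phy(vaddr);

            if (paddr == RTE_BAD_IOVA) {    /* == RTE_BAD_PHYS_ADDR */
                    /* Translation failed; don't dereference or map this region. */
                    return -EFAULT;
            }

            /* ... proceed with registration using paddr ... */
            return 0;
    }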
[10:37:18] (0 as a physical address could theoretically be valid)
[10:42:57] *** Quits: tkulasek (~tkulasek@134.134.139.75) (Ping timeout: 240 seconds)
[10:55:50] yes - some guy named bwalker pushed patches to DPDK to change it from 0 to BAD_IOVA
[10:56:05] :-)
[10:59:11] sounds like something he would do
[11:00:33] you should ask him why - I'm curious to know
[12:49:39] bwalker: this intermittent failure looks related to the recent QoS changes: https://ci.spdk.io/spdk/builds/review/24b6526d7c9de70e8a880a71eab319cb6ef92761.1525807871/ubuntu16.04/build.log
[13:15:13] i need someone to hit me with a clue bat
[13:15:35] how does the nvmf default 128K max io size interoperate with the bdev...
[13:15:46] ...and as I type it, I know the answer
[13:15:56] nvmf has its own buffer pools
[13:20:21] yes
[13:20:34] currently, we don't use the bdev data buffer pools at all in nvmf
[13:21:17] yeah - looking at this github report on larger MDTS
[13:22:16] yeah, that one needs some better clarification - most of the replies are just confusing matters
[13:22:45] I think the patch that the submitter posted changes more stuff than necessary to enable MDTS of 512 KB
[13:23:46] (and the patch actually changes it to 2 MB, not 512 KB)
[13:25:53] it should be possible to just set MaxIOSize in the conf file with no code changes, then run this nvme-cli test
[13:27:36] (if we want to test with the SPDK NVMe-oF host code, that would need more code changes, but I don't think that's what the submitter is testing)
[13:40:28] oh - i see the problem - the 512KB gets through nvmf and bdev OK - but the backing NVMe SSD rejects it since its MDTS is 256K
[13:40:50] well, no - the nvme driver should split it in that case
[13:51:32] we should at least put together a simpler repro script - it should be possible to do it with SoftRoCE
[13:52:05] I'm set up to run in loopback with soft roce if needed
[13:52:19] trying to get through a few code reviews before I plow through the github issues
[13:52:19] oh - no, the nvme driver won't split it if it's IO passthru
[13:59:51] yeah, I'm leaning toward not supporting NVMe I/O passthru commands at all in nvmf
[14:00:00] it really seems like we can't do it correctly
[14:01:50] almost tempting to put back in Direct mode or something equivalent to it and remove NVMe passthru from the bdev controller
[14:02:18] and limit direct mode so that it directly exposes the mdts, etc. of the underlying controller and can only be attached from a single host at a time
[14:07:33] *** Quits: tzawadzki (~tomzawadz@192.55.54.44) (Remote host closed the connection)
[14:25:37] I think we need to document which commands will be automatically split and which won't
[14:25:39] at a minimum
[14:29:13] hmmm, so on the initiator it's doing nvme io-passthru, but once it gets to the target won't we just treat it as a normal write?
[14:30:09] he's not doing writes - he's doing vendor specific commands I think
[14:30:19] if you look at his original posting
[14:30:28] originally he was - but in the latest post he's doing OPC=0x1
[14:31:14] in the real write case, it's going to call spdk_bdev_write_blocks
[14:31:19] which will translate to a regular nvme write call
[14:31:22] no passthru involved
[14:31:27] and the driver should split it
[14:31:30] yep
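To make the split-vs-passthru distinction concrete, a hedged sketch of the same 512 KB payload issued both ways through the bdev API; the 512-byte block size, offsets, and LBA fields are illustrative assumptions:

    #include "spdk/stdinc.h"
    #include "spdk/bdev.h"
    #include "spdk/nvme_spec.h"

    static void
    io_done(struct spdk_bdev_io *bdev_io, bool success, void *cb_arg)
    {
            spdk_bdev_free_io(bdev_io);
    }

    static void
    submit_both_ways(struct spdk_bdev_desc *desc, struct spdk_io_channel *ch,
                     void *buf)
    {
            /*
             * Regular write: translated to NVMe write commands by the nvme
             * bdev module, and split by the driver if 512 KB exceeds the
             * controller's MDTS.
             */
            spdk_bdev_write_blocks(desc, ch, buf, 0 /* offset */,
                                   1024 /* blocks: 512 KB at 512 B each */,
                                   io_done, NULL);

            /*
             * Passthru: the raw command is forwarded to the device unmodified,
             * so the SSD itself rejects it if the transfer exceeds its MDTS.
             */
            struct spdk_nvme_cmd cmd = {
                    .opc = SPDK_NVME_OPC_WRITE, /* OPC=0x1, as in the report */
                    /* cdw10/cdw11 = starting LBA, cdw12 = NLB - 1, etc. */
            };
            spdk_bdev_nvme_io_passthru(desc, ch, &cmd, buf, 512 * 1024,
                                       io_done, NULL);
    }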
[14:50:05] bwalker: for the nvme-of perf data you collected with vishal last week - how many queue pairs was that spread across?
[14:50:16] and it was raw nvme - no lvol?
[14:50:26] for the 4.2M number?
[14:50:33] yes
[14:51:12] 4 subsystems each with 1 namespace (malloc bdev). 4 initiators in a 1:1 mapping to the subsystems, each with 1 qpair
[14:51:36] ok
[14:52:39] the system only has 8 NVMe devices, so it can't get up to 4.2M using real SSDs
[14:53:23] doing the I/O to the SSDs is cheaper than malloc in a number of ways, so if it had enough PCIe bandwidth and NVMe SSDs attached I have no doubt it could get that done.
[14:53:30] I don't have IOAT enabled
[16:56:10] *** Joins: Shuhei (caf6fc61@gateway/web/freenode/ip.202.246.252.97)
[17:07:49] ugh
[18:19:00] *** Joins: dlw (~Thunderbi@114.255.44.143)
[19:11:34] *** Quits: Shuhei (caf6fc61@gateway/web/freenode/ip.202.246.252.97) (Ping timeout: 260 seconds)
[22:58:01] *** Joins: dlw1 (~Thunderbi@114.255.44.143)
[22:58:01] *** Quits: dlw (~Thunderbi@114.255.44.143) (Read error: Connection reset by peer)
[22:58:02] *** dlw1 is now known as dlw
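For reference, the 4-subsystem malloc setup described above, plus the MaxIOSize knob from the earlier MDTS discussion, maps to a target config along these lines. This is a hypothetical sketch in the 2018-era nvmf_tgt conf-file format; section and option names are from memory and all values are illustrative:

    [Nvmf]
      # 512 KB max I/O per the MDTS discussion; the default was 128 KB
      MaxIOSize 524288

    [Malloc]
      NumberOfLuns 4
      LunSizeInMB 64

    [Subsystem1]
      NQN nqn.2016-06.io.spdk:cnode1
      Listen RDMA 192.168.0.1:4420
      SN SPDK00000000000001
      Namespace Malloc0

    # Subsystem2-4 repeat the same pattern with Malloc1-Malloc3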