[00:33:22] *** Joins: vmysak (~vmysak@192.55.54.45)
[00:34:43] *** Quits: vmysak (~vmysak@192.55.54.45) (Client Quit)
[01:49:07] *** Joins: tomzawadzki (uid327004@gateway/web/irccloud.com/x-moopmitvntxpnvvd)
[01:52:55] *** Joins: travis-ci (~travis-ci@ec2-54-198-43-132.compute-1.amazonaws.com)
[01:52:56] (spdk/master) test: Use ut_multithread framework in nvmf/ctrlr (Ben Walker)
[01:52:56] Diff URL: https://github.com/spdk/spdk/compare/c26bd15881ab...06a0d9ab3319
[01:52:56] *** Parts: travis-ci (~travis-ci@ec2-54-198-43-132.compute-1.amazonaws.com) ()
[01:55:26] *** Joins: travis-ci (~travis-ci@ec2-54-91-100-53.compute-1.amazonaws.com)
[01:55:27] (spdk/master) vagrant: add examples about installing virtualbox in README.md (yidong0635)
[01:55:28] Diff URL: https://github.com/spdk/spdk/compare/06a0d9ab3319...23a95386c86c
[01:55:28] *** Parts: travis-ci (~travis-ci@ec2-54-91-100-53.compute-1.amazonaws.com) ()
[02:15:08] *** Joins: wzh (~wzh@114.255.44.139)
[02:16:58] Following up on my GerritHub question from yesterday: after I sent a message to GerritHub support, they flushed the account caches, and they don't call me coward anymore. :-)
[02:19:57] *** Quits: wzh (~wzh@114.255.44.139) (Quit: WeeChat 1.9.1)
[02:42:29] *** Quits: guerby (~guerby@april/board/guerby) (Remote host closed the connection)
[02:45:35] *** Joins: guerby (~guerby@april/board/guerby)
[05:14:15] *** Joins: gila (~gila@5ED74129.cm-7-8b.dynamic.ziggo.nl)
[08:18:24] wzh, LOL
[08:19:12] Karol/all, I see https://review.gerrithub.io/c/spdk/spdk/+/421083/2 is in the Jenkins queue as it matches our query criteria, but take a look, it has a merge conflict. Seems like we ought to ignore those, ya?
[08:19:21] klateck, sethhowe see above
[08:20:21] or we could just let them run and get a -1 I suppose....
[08:31:50] peluse: I can't actually recall running into this issue before. We don't do any checking for is:mergeable in the CTP.
[08:33:22] peluse: Typically when people upload a new patchset they will rebase on top of master before doing so, forcing them to remove any merge conflicts. If we do decide to start only testing mergeable patches, how will we convey that to users?
[08:40:13] Do we even need to check that? A merge conflict is only a problem in case you want to merge/rebase.
[08:40:34] For tests we just check out the change, so there should not be any conflicts
[08:40:40] There would be if we cherry-picked them
[08:41:05] I mean... we could filter this out to narrow down the scope of triggered patches
[08:41:12] That'd be an upside
[08:49:40] I think it's only a "thing" right now because we ran this for the first time and picked up stuff that Jenkins had never seen before, and because they're so old they have conflicts. Probably fine to leave it the way it is
[08:50:08] FYI I'll be offline for a bit while in transit....
[09:01:09] Time to hit the gym, finally! Bye! :)
[09:46:03] *** Joins: travis-ci (~travis-ci@ec2-54-159-180-104.compute-1.amazonaws.com)
[09:46:04] (spdk/v18.10.x) nvmf: Improve error handling in spdk_nvmf_transport_poll_group_create (Evgeniy Kochetov)
[09:46:04] Diff URL: https://github.com/spdk/spdk/compare/8ca38e04b1d2...59c5be6231cc
[09:46:04] *** Parts: travis-ci (~travis-ci@ec2-54-159-180-104.compute-1.amazonaws.com) ()
[09:50:10] *** Joins: travis-ci (~travis-ci@ec2-54-167-161-33.compute-1.amazonaws.com)
[09:50:11] (spdk/v18.10.x) changelog: sum up changes going into 18.10.1 (Tomasz Zawadzki)
[09:50:11] Diff URL: https://github.com/spdk/spdk/compare/59c5be6231cc...51fc40e2d042
[09:50:11] *** Parts: travis-ci (~travis-ci@ec2-54-167-161-33.compute-1.amazonaws.com) ()
[09:55:52] bwalker: Noting the merge of some commits to v18.10.x: has there been some consensus among the maintainers with respect to 18.10.1 and the testing difference between that and master?
[09:56:05] yes - sending mail to mailing list atm
[09:56:15] k
[09:56:55] *** Quits: guerby (~guerby@april/board/guerby) (Ping timeout: 252 seconds)
[10:06:35] tomzawadzki: could you take a look at these two patches from ziye?
[10:06:42] https://review.gerrithub.io/#/c/spdk/spdk/+/436901/
[10:06:43] VPP related
[10:06:56] https://review.gerrithub.io/#/c/spdk/spdk/+/436909/
[10:08:51] *** Quits: tomzawadzki (uid327004@gateway/web/irccloud.com/x-moopmitvntxpnvvd) (Quit: Connection closed for inactivity)
[10:11:03] *** Joins: travis-ci (~travis-ci@ec2-54-226-31-27.compute-1.amazonaws.com)
[10:11:04] (spdk/master) test/common: add stop.sh in ceph setup testting (yidong0635)
[10:11:04] Diff URL: https://github.com/spdk/spdk/compare/ebac2feba548...6907c36f2cd6
[10:11:04] *** Parts: travis-ci (~travis-ci@ec2-54-226-31-27.compute-1.amazonaws.com) ()
[10:12:11] *** Joins: travis-ci (~travis-ci@ec2-54-159-241-172.compute-1.amazonaws.com)
[10:12:12] (spdk/master) test/iscsi_tgt: Add iscsi performance test.
(Pawel Niedzwiecki)
[10:12:13] Diff URL: https://github.com/spdk/spdk/compare/23a95386c86c...ebac2feba548
[10:12:13] *** Parts: travis-ci (~travis-ci@ec2-54-159-241-172.compute-1.amazonaws.com) ()
[10:27:12] *** Joins: travis-ci (~travis-ci@ec2-54-211-223-16.compute-1.amazonaws.com)
[10:27:13] (spdk/master) event: reactor loop delay configurable from cmdline (Wojciech Malikowski)
[10:27:13] Diff URL: https://github.com/spdk/spdk/compare/6907c36f2cd6...d1a6901c6176
[10:27:13] *** Parts: travis-ci (~travis-ci@ec2-54-211-223-16.compute-1.amazonaws.com) ()
[11:12:34] https://review.gerrithub.io/#/c/spdk/spdk/+/437875/ fixes check_format.sh errors on master
[11:22:27] *** Joins: travis-ci (~travis-ci@ec2-54-211-223-16.compute-1.amazonaws.com)
[11:22:28] (spdk/master) bdev/qos: enable and disable when the QoS thread is not set (GangCao)
[11:22:28] Diff URL: https://github.com/spdk/spdk/compare/d1a6901c6176...70a34886574a
[11:22:28] *** Parts: travis-ci (~travis-ci@ec2-54-211-223-16.compute-1.amazonaws.com) ()
[11:37:49] Attempting to run a variant of nvmf_lvol.sh (18.10.x version). When I run 'lsblk' by hand (as is used in waitforblk), I see:
[11:38:14] complaints of the form: lsblk: nvme0c1n2: unknown device name
[11:38:47] shouldn't the device just be exposed as nvme0n2?
[11:38:56] That's what I would expect.
[11:38:59] maybe the multipathing stuff is kicking in or something, hmm
[11:39:36] Are there any rpc cmds I might issue that would surface these unusually named entities?
[11:39:41] the kernel tries to merge devices that it thinks are the same, but we should be reporting separate uuids for each one
[11:39:52] you can do a get_bdevs on the target after it is configured
[11:42:44] I ran that, but I'm not seeing anything even remotely named like that. In fact, there's nothing named "nvmeXXX" at all in the output of get_bdevs.
[11:43:07] oh, sorry - the names are entirely chosen by the kernel
[11:43:24] but you can confirm that none of the bdevs have the same serial number (uuid)
[11:45:24] They're all unique.
[11:45:58] ok, well then there isn't any multipathing weirdness happening at least
[11:46:17] not sure what scenario makes the kernel try to name things "nvmeXcYnZ" though
[11:46:51] I've never seen that cY fragment before at all.
[11:47:43] A 'dmesg | grep nvme' fails to show any such entity in the kernel log.
[11:48:21] Do you know how these things get named? udev?
[11:48:49] I think it's in the nvme driver in the kernel actually
[11:48:57] but I don't know where specifically
[11:49:23] I walked through /etc/udev. No appearance of "nvme" therein.
[11:52:05] Think you are on to something about multipath.
[11:53:36] In drivers/nvme/host/multipath.c, nvme_set_disk_name() has some code that looks like: sprintf(disk_name, "nvme%dc%dn%d", ctrl->subsys->instance, ctrl->cntlid, ns->head->instance);
[12:00:06] Looking at the output from get_bdevs, I see entities for three (3) malloc disks (Malloc0, Malloc1, Malloc2), one pooled device (Raid0), and then six (6) logical volumes. The lsblk complaints with the 'c' fragment list 6 names. Is it safe to assume those are 1 for 1 with the logical volumes?
[12:09:20] Re-examining the get_bdevs output, I see that each Logical Volume, with its own unique uuid, does share the same 'lvol_store_uuid'. Think that's what's triggering the multipath stuff?
[12:20:58] *** Joins: travis-ci (~travis-ci@ec2-54-167-161-33.compute-1.amazonaws.com)
[12:20:59] (spdk/master) test: mark iscsi_initiator.sh as executable (Jim Harris)
[12:20:59] Diff URL: https://github.com/spdk/spdk/compare/70a34886574a...dbcbddbb22e6
[12:20:59] *** Parts: travis-ci (~travis-ci@ec2-54-167-161-33.compute-1.amazonaws.com) ()
[12:35:58] *** Joins: guerby (~guerby@april/board/guerby)
[12:36:49] *** Quits: guerby (~guerby@april/board/guerby) (Remote host closed the connection)
[12:40:19] *** Joins: guerby (~guerby@april/board/guerby)
[13:11:34] this patch from Seth baffles me: https://review.gerrithub.io/#/c/spdk/spdk/+/432211/
[13:11:53] it has 2 +2 votes, +1 verified vote, and Gerrit reports no merge conflicts
[13:12:03] yet I'm not able to merge the patch
[13:12:09] I can't even rebase it
[13:12:55] hmmm - i figured out how to rebase it
[13:13:24] this patch's parent had been abandoned
[14:22:34] *** Quits: guerby (~guerby@april/board/guerby) (Remote host closed the connection)
[14:26:17] *** Joins: guerby (~guerby@april/board/guerby)
[14:27:07] *** Quits: guerby (~guerby@april/board/guerby) (Remote host closed the connection)
[14:29:39] *** Joins: guerby (~guerby@april/board/guerby)
[14:37:32] *** Joins: travis-ci (~travis-ci@ec2-54-160-155-226.compute-1.amazonaws.com)
[14:37:33] (spdk/v18.10.x) nvmf/rdma: Fix refcnt check on RDMA QP destroy (Evgeniy Kochetov)
[14:37:34] Diff URL: https://github.com/spdk/spdk/compare/51fc40e2d042...2f1ac3644a4d
[14:37:34] *** Parts: travis-ci (~travis-ci@ec2-54-160-155-226.compute-1.amazonaws.com) ()
[14:37:37] lhodev: wish I knew more about the multipath stuff that is in the kernel now. I'm not sure exactly what criteria triggers it
[14:38:53] bwalker: Thx. I'm trying to play around with it. I found there's a boolean kernel param in nvme_core called multipath. By default it's true. I'm rebooting now with it set to false and will see if anything changes.
[14:43:37] *** Quits: gila (~gila@5ED74129.cm-7-8b.dynamic.ziggo.nl) (Quit: My Mac Pro has gone to sleep. ZZZzzz…)
[15:48:08] *** Joins: travis-ci (~travis-ci@ec2-54-156-114-1.compute-1.amazonaws.com)
[15:48:09] (spdk/master) pkgdep: Handle SPDK CLI dep packages not available in old distros (Paul Luse)
[15:48:09] Diff URL: https://github.com/spdk/spdk/compare/dbcbddbb22e6...3d4c012a2e5b
[15:48:09] *** Parts: travis-ci (~travis-ci@ec2-54-156-114-1.compute-1.amazonaws.com) ()
[15:49:43] *** Joins: travis-ci (~travis-ci@ec2-54-204-105-177.compute-1.amazonaws.com)
[15:49:44] (spdk/master) nvme.c: break out parsing from trid parse (Seth Howell)
[15:49:44] Diff URL: https://github.com/spdk/spdk/compare/3d4c012a2e5b...672115fef411
[15:49:44] *** Parts: travis-ci (~travis-ci@ec2-54-204-105-177.compute-1.amazonaws.com) ()
[16:07:10] bwalker: FYI- setting the kernel cmdline arg, nvme_core.multipath=0, on my 4.14 (+ more) kernel, eliminated the problematic nvme blk entities.
[16:27:48] Run of './test/nvmf/fio/nvmf_fio.py 262144 64 randwrite 10 verify' issues warnings: "fio: verification read phase will never start because write phase uses all of runtime"
[16:28:42] Does that mean no verify can be done; i.e. unable to do a read to verify after a write?
[16:33:15] lhodev: I don't think so. I saw that error and started following it back a few months ago, and when I got to the bottom of it I remember thinking "Oh ok, that's not what it means." I can't remember what it was actually doing, though. Just that I was all set and ready to change something, then realized I didn't have to.
[16:34:28] lhodev: Also, watch out for the multipath stuff in kernels before about 4.16. It was pretty broken for a while. Several fixes have gone into the kernel recently.
[16:35:02] Thx sethhowe. Trying to see if I can replicate the failure when running the nvmf_lvol test with nvmf_fio.py (instead of bdevperf).
[16:35:17] lhodev on master?
[16:35:28] No, on v18.10.x
[16:36:23] Taken me a while to get around the multipath stuff to which you alluded. Ugh. See note above about setting kernel cmdline param.
[16:36:35] ah. Yeah, I can't replicate it at all on my system.
[16:36:52] The most recent failure on 18.10.x in the build pool is interesting though.
[16:37:00] http://spdk.intel.com/public/spdk/builds/review/555a69a83e91df96b32dfd5d0cec3d1b6671d8ec.1545243667/fedora-03/build.log
[16:37:03] Which kernel are you running on your system?
[16:37:05] http://spdk.intel.com/public/spdk/builds/review/555a69a83e91df96b32dfd5d0cec3d1b6671d8ec.1545243667/fedora-03/dmesg.log
[16:38:00] I have 4.18.16, just slightly newer than what's on the build pool machine.
[16:38:15] I can't access those URLs you provided.
[16:38:48] Ah, sorry! https://ci.spdk.io/spdk/builds/review/555a69a83e91df96b32dfd5d0cec3d1b6671d8ec.1545243667/fedora-03/build.log
[16:39:12] Much better ;-)
[16:39:15] https://ci.spdk.io/spdk/builds/review/555a69a83e91df96b32dfd5d0cec3d1b6671d8ec.1545243667/fedora-03/dmesg.log
[16:39:45] It's interesting because this is a failure with the fix for destroying the rqpairs properly.
[16:41:49] My assumption was that if the qpairs were being destroyed through the normal path, they would wait for the tracer I/O to go through before calling spdk_nvmf_rdma_qpair_destroy. In that case, we wouldn't have any outstanding I/O when the rqpair is destroyed.
[16:42:55] So the fix was in master and cherry-picked to v18.10.x?
[16:43:17] Yeah. It was part of the backport.
[16:44:36] It doesn't fix this latent failure, but I think it gives us an idea of what might be going on. I think the fact that we see those errors about outstanding I/O indicates that we aren't taking the normal path.
[16:44:58] When I ran ./test/nvmf/fio/nvmf_fio.py, I altered the runtime from 10s to an hour. I stopped it after about 10 minutes. Didn't see any of those errors.
[16:45:34] When was the last time you pulled that branch?
[16:45:57] It was just merged this afternoon.
[16:46:41] It's been a while. I think multiple days.
[16:47:06] lhodev: Did you stop the target yourself, or did you hit the error?
[16:47:58] I hit CTRL-C to stop it. Was starting to wonder, given that I hadn't hit an error, if there was possibly something up with my config owing to that warning we discussed.
[16:52:06] lhodev: Gotcha, yeah. I don't think that there is anything wrong with your config. You can also just run sudo ./test/nvmf/lvol/nvmf_lvol.sh iso directly and it will do the setup for you and spit out the config file and everything.
[16:55:23] I'm going to do a pull of v18.10.x and rebuild, then relaunch my test. See if I observe anything different.
[17:08:29] Interesting....since pulling latest v18.10.x, rebuilding, and re-running, I quickly saw failures.
[17:08:48] fio: io_u error on file /dev/nvme0n2: Input/output error: read offset=262144, buflen=262144
[17:09:13] fio: pid=31951, err=5/file:io_u.c:1833, func=io_u error, error=Input/output error
[17:10:17] By "quickly", I think the first error occurred after about 2m40s of runtime.
[18:42:51] *** Joins: travis-ci (~travis-ci@ec2-54-156-114-1.compute-1.amazonaws.com)
[18:42:52] (spdk/master) net: make the net initialization in a correct way (Ziye Yang)
[18:42:53] Diff URL: https://github.com/spdk/spdk/compare/672115fef411...ef8e47571baa
[18:42:53] *** Parts: travis-ci (~travis-ci@ec2-54-156-114-1.compute-1.amazonaws.com) ()
[18:54:33] *** Joins: travis-ci (~travis-ci@ec2-54-160-155-226.compute-1.amazonaws.com)
[18:54:34] (spdk/master) vpp: Change the type of sa into sockaddr_storage (Ziye Yang)
[18:54:34] Diff URL: https://github.com/spdk/spdk/compare/ef8e47571baa...c9c4a3346d7a
[18:54:34] *** Parts: travis-ci (~travis-ci@ec2-54-160-155-226.compute-1.amazonaws.com) ()