[04:13:00] *** Joins: gila (~gila@5ED74129.cm-7-8b.dynamic.ziggo.nl)
[04:17:24] *** Quits: gila (~gila@5ED74129.cm-7-8b.dynamic.ziggo.nl) (Ping timeout: 246 seconds)
[06:18:31] @lhodev - NVMeOF tests got stuck and did not reset properly. I took care of it over the weekend, so it should be fine now. Looking into the problem, as the servers should reboot after a timeout instead of getting stuck
[06:53:31] klateck, thanks!
[07:01:08] It's NVMeOF HW again, uh :/
[09:38:31] klateck: Thx for looking into the issue.
[09:56:26] bwalker: jimharris: Can someone please remove the Chandler vote on https://review.gerrithub.io/#/c/spdk/spdk/+/437196/ and retrigger?
[09:58:33] done
[09:58:37] Thanks
[09:58:41] but there's something fishy with these failures
[09:59:00] That particular failure? Seen before?
[09:59:07] Paul and I briefly discussed it over the weekend.
[09:59:20] Some kind of verify failure during lvol fio testing?
[09:59:29] the spec file patch (earlier in this same patch series) shows the same failure
[09:59:54] as does patch version 5 of the changelog patch in this series
[10:00:03] bwalker: can you take a look at these?
[10:00:08] or sethhowe
[10:08:38] so it's using the kernel initiator and the SPDK target
[10:09:34] and the kernel block device is reporting an I/O error
[10:09:48] right before that occurred, the SPDK target reported that all of the connections closed
[10:09:53] jimharris: Have these types of failures been observed in master? Is it possible we're missing a commit that needed to be cherry-picked to v18.10.x?
[10:12:51] and the SPDK target only disconnects connections down that path for a couple of reasons
[10:13:09] the main one being that the RDMA event channel told it that the connection was closed
[10:13:57] we could try turning on the debug logging and running it through the pool 10 times to see if it gets a hit
[10:16:00] there were lots of bugs like this in the 18.10 release, lhodev - we think we have them all fixed on master. We backported all of the relevant fixes we could to 18.10.x
[10:16:22] it's possible that we haven't closed all of the holes just yet
[10:23:56] General question: when running initiator/target tests in the CI systems, are all of them done in loopback? That is, the initiator and target are both running on the same host? Any done between VMs (or host-to-VM), or bare metal to bare metal?
[10:24:06] all loopback
[10:24:26] the performance tests are bare metal to bare metal
[10:24:27] but don't run per-patch
[10:29:34] bwalker: are you proceeding with the proposed test run 10 times with debug logging turned on?
[10:29:44] I haven't gone that far just yet
[10:29:54] we're looking through failures on master to see if we've ever seen this before
[10:30:07] Ah, ok. Thx.
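
Context for the "verify failure" discussed around 09:59: the sketch below is illustrative only, not the actual lvol fio test. It shows, in the spirit of fio's verify option, what a verify failure means: the data read back from the block device does not match the pattern that was written. /dev/nvme0n1 is a placeholder name for a scratch device; do not run this against a device holding real data.

/* Minimal write/read-back verification sketch (illustrative, destructive to the device). */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
	char wbuf[4096], rbuf[4096];
	int fd = open("/dev/nvme0n1", O_RDWR);   /* placeholder scratch device */

	if (fd < 0) {
		perror("open");
		return 1;
	}

	memset(wbuf, 0x5a, sizeof(wbuf));
	if (pwrite(fd, wbuf, sizeof(wbuf), 0) != (ssize_t)sizeof(wbuf) ||
	    pread(fd, rbuf, sizeof(rbuf), 0) != (ssize_t)sizeof(rbuf)) {
		perror("I/O");   /* an I/O error here corresponds to the kernel block device error seen in the log */
		close(fd);
		return 1;
	}

	if (memcmp(wbuf, rbuf, sizeof(wbuf)) != 0) {
		fprintf(stderr, "verify failure: data read back does not match what was written\n");
	}

	close(fd);
	return 0;
}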
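
For the RDMA event channel mechanism mentioned at 10:13: the sketch below is a minimal librdmacm event loop, not SPDK's actual target code. It shows how a target typically learns that a connection was closed - it blocks on rdma_get_cm_event() and reacts to RDMA_CM_EVENT_DISCONNECTED, which is the kind of event that leads the target to tear down the connection.

/* Minimal librdmacm CM-event loop sketch (illustrative, not SPDK code). */
#include <stdio.h>
#include <rdma/rdma_cma.h>

int main(void)
{
	struct rdma_event_channel *channel;
	struct rdma_cm_event *event;

	channel = rdma_create_event_channel();
	if (channel == NULL) {
		perror("rdma_create_event_channel");
		return 1;
	}

	/* A real target would also set up a listener (rdma_create_id/rdma_bind_addr/
	 * rdma_listen) on this channel; only the event handling is shown here. */
	while (rdma_get_cm_event(channel, &event) == 0) {
		switch (event->event) {
		case RDMA_CM_EVENT_DISCONNECTED:
			/* The peer or the fabric closed the connection; a target
			 * would now drain and destroy the associated qpair. */
			printf("connection %p disconnected\n", (void *)event->id);
			break;
		case RDMA_CM_EVENT_DEVICE_REMOVAL:
			printf("RDMA device removed\n");
			break;
		default:
			break;
		}
		rdma_ack_cm_event(event);
	}

	rdma_destroy_event_channel(channel);
	return 0;
}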
[11:02:42] *** Joins: gila (~gila@5ED74129.cm-7-8b.dynamic.ziggo.nl)
[12:18:59] clorox
[12:19:01] cl0r0xw1p
[12:19:11] cl0r0xw1p3s
[12:20:26] dang, window manager....sorry
[12:46:51] *** Joins: travis-ci (~travis-ci@ec2-54-80-244-107.compute-1.amazonaws.com)
[12:46:52] (spdk/master) test/iscsi_tgt: fixup bytes to str for python 3.5 (Tomasz Kulasek)
[12:46:53] Diff URL: https://github.com/spdk/spdk/compare/04d09f920731...c68c2d28c13f
[12:46:53] *** Parts: travis-ci (~travis-ci@ec2-54-80-244-107.compute-1.amazonaws.com) ()
[17:52:16] bwalker: Are you by chance removing Chandler votes on some proposed changes in v18.10.x, thus triggering re-runs?
[17:54:13] Seeing updates to 437195 and 437196. Inspection on GerritHub shows that in spite of having a +1 vote previously, someone (or something?) is removing those votes and re-runs are occurring.
[21:05:30] *** Joins: travis-ci (~travis-ci@ec2-54-87-54-244.compute-1.amazonaws.com)
[21:05:31] (spdk/master) nvmf: Do not set the error state of the qpair (Ziye Yang)
[21:05:32] Diff URL: https://github.com/spdk/spdk/compare/c68c2d28c13f...9d11abfd0ee2
[21:05:32] *** Parts: travis-ci (~travis-ci@ec2-54-87-54-244.compute-1.amazonaws.com) ()