[00:37:35] JoeGruher: check two rpc commands `rpc.py save_config` and `rpc.py load_config`. Loading configuration directly from JSON file is WIP. [03:03:00] *** Joins: travis-ci (~travis-ci@ec2-54-161-234-43.compute-1.amazonaws.com) [03:03:01] (spdk/master) test: handle align=0 in test_env spdk_malloc (Jim Harris) [03:03:01] Diff URL: https://github.com/spdk/spdk/compare/08d36a55f9b0...8c50e8c10632 [03:03:01] *** Parts: travis-ci (~travis-ci@ec2-54-161-234-43.compute-1.amazonaws.com) () [04:27:03] *** Joins: tomzawadzki (uid327004@gateway/web/irccloud.com/x-vafioycotjvetuqs) [04:41:17] *** Joins: BelAzi (~BelAzi@193.158.83.227) [04:42:20] *** Quits: BelAzi (~BelAzi@193.158.83.227) (Remote host closed the connection) [08:58:47] jimharris, yeah I saw them last night. Hard for me to tell the exact correct sequence of steps because I don't believe any of the other modules are doing everything they're suppposed to. I could be wrong though, will push something soon if I can get it working the way I think it should [08:59:38] i just posted another comment on your patch - most of the modules are doing it right currently, at least nvme, virtio-blk/scsi, error, split, gpt, rbd [08:59:57] the fix is simpler than i originally proposed though - see the comment on your patch [09:00:07] i think raid has it wrong currently [09:00:28] passthru doesn't register any io devices [09:01:01] yeah, that's the direction I was because if I do it in the RPC delete call I have to save my RPC callback and args somewhere so that I can continue to pass them on after my iodevice unregister callback [09:01:11] yeas, I'm aware of that ;) [09:01:40] I'll go through a list of modules that look to me to be inconsistent and email them out to the dist list [09:01:57] which ones looked inconsistent, besides raid? [09:02:28] I'll put together a list when I have time, too many things happening at once for my small brain :) [09:04:15] aio is correct, although it does it differently than the others (could be simplified a bit) [09:05:10] null and pmem do io_device_unregisters, but it's a global io_device [09:11:07] BTW I'm sure you already noticed wrt your first comment I was already freeing the bdev in the desctuct CB [09:20:40] jimharris, patch updated: https://review.gerrithub.io/#/c/spdk/spdk/+/433194 [09:25:08] raid was the one that stuck out to me as being wrong. the others were just inconsistent and can easily confuse anyone trying to use them as an example. Do you wanna file a github issue for RAID or do you want me to? [09:25:51] also, I think we should an io_device registration to the PT module and make it clear in comments that it's not actually needed for PT but there to show the proper way to reg/unred the io_device. What do you think? [09:27:22] bwalker, jimharris - when you get a chance please take a gander that the configure/pkg_dep patch cfor the compression vbdev module. Rebased after Ed's changes.... https://review.gerrithub.io/c/spdk/spdk/+/429523 [09:29:39] i'll file the raid github issue [09:29:52] besides aio, what kind of inconsistencies did you see? [09:30:49] sethhowe, if you could look at that one above too I'd appreciate it [09:33:35] jimharris, I don't remember but inconsistencies doesn't mean wrong the way I was using it, I just couldn't find one canonical example which is why I think we should io_dev reg to PT since that's our 'template' for bdevs. I'll look through them real quick and see if I notice anything you didn't already mention in the patch [09:33:37] thanks!! [09:39:40] one thing: I only see a few modules calling spdk_bdev_destruct_done() - it seems like I need this in my unregister callback also? [09:42:11] pmem is also different in terms of where unregister is done - its done in finish() instead of destruct [09:42:15] *** Joins: JoeGruher (86868b48@gateway/web/freenode/ip.134.134.139.72) [09:42:31] you don't need it - you don't have anything asynchronous in your remove path [09:42:47] pmem is different because it doesn't have per-bdev io_devices [09:42:49] same for null [09:45:31] yeah, it all makes sense with additional explanation. [09:52:41] if I start nvmf_tgt without any config file, and then configure bdevs, raid, lvstores, lvs, and nvmeof subsystems with rpc.py, is there a command to then dump that out to a config file i can feed to nvmf_tgt next time i start it? [09:56:04] ./scripts/rpc.py save_config [09:56:09] and ./scripts/rpc.py load_config [09:56:21] perfect, thanks! [09:56:40] that's replacing the config file entirely in either the next release or the release after [09:56:59] so it replaces the -c option on nvmf_tgt? [09:57:07] the file format is the same for both however? [09:57:11] yes - everything will be done via spdkcli (or rpc.py) [09:57:21] the file format is different [09:57:32] k [09:57:53] if you haven't looked at spdkcli, it's worth a look [09:57:59] yeah, i didn't know there was an spdkcli, is that in the current release? [09:58:00] that's our future command line configuration tool [09:58:30] yes it is checked in [09:58:49] if you run it without a target available it throws an exception which is ugly - I'll fix that and give it a better message [09:58:57] but if you have the nvmf target up and you run it, it will work [09:59:10] scripts/spdkcli.py [09:59:50] ok, I see it, interesting [10:01:28] by the way, is this message a concern: EAL: No free hugepages reported in hugepages-1048576kB [10:01:36] it doesn't seem to affect functionality or performance [10:02:01] just telling you that you don't have 1G hugepages allocated [10:02:16] but it's presumably using the 2M hugepages you have [10:02:25] k [10:32:58] bwalker: Was just scanning some of the IRC chatter, and noticed the part about the '-c cfg' going away "...in either the next release or the release after." Is there a compelling reason to eliminate that option? [10:34:57] just code support burden [10:35:14] we're moving to a model where you start the target, configure it using spdkcli [10:35:17] then you can save the configuration [10:35:19] and load it again later [10:35:32] and "saving" the configuration writes out a file that's just all the JSON RPC calls it has to make in sequence [10:35:37] and "loading" is replaying them [10:35:52] so it's all one code path - no separate configuration file parser and such [10:43:13] jimharris, you around? [10:50:14] in a meeting until 11 [10:50:20] K [10:56:16] uhoh, i have a nvmf_tgt seg fault: https://pastebin.com/F2z5bvqE [10:57:16] accompanying dmesg output: https://pastebin.com/3ET6FaqY [10:57:45] bwalker: can you review https://review.gerrithub.io/#/c/spdk/spdk/+/433194/? [10:58:11] I'm walking through that with peluse now [11:00:09] jimharris, yeah, it failed CI which is why I pinged you. bwalker sugegsted moving the claim release and close to the unregister callback (I was doing them first) so I just pushed that. I have 4-5 other patches that depend on this so we'll get a good number of tests [11:00:53] PS: it wasn't that exact patch that failed, it was one in the series that incldued those changes. So damned intermittent [11:01:43] cool - fyi, i put together some patches to help catch cases where we forget to free io_devices - https://review.gerrithub.io/#/c/spdk/spdk/+/433201/ [11:04:20] sweet! I have to leave in 20 min for a couple of hours but will check them out for sure! I have 6 total patches now scheduled w/Jenkins that have the latest fix in them. If all 6 pass w/that specific error I think we're good. [11:05:15] excellent [11:11:28] seg fault seems to be reproducible https://pastebin.com/mJppYmNB [11:25:23] bwalker: I'd like to advocate retention of the '-c cfg' option. I can envision at least one group's deployment model where they would prefer any initial manual steps via the build of the cfg via cmdline/RPC and saving it before proceeding with other deployments. [11:26:26] s/prefer/prefer to avoid/ [11:33:44] seg fault seems induced by running FIO workload against the nvmeof target with 128k blocksize and sequential pattern, 4k blocksize with random pattern runs fine [12:12:49] *** Quits: JoeGruher (86868b48@gateway/web/freenode/ip.134.134.139.72) (Quit: Page closed) [12:36:50] you can build the config once on a different machine and then simply copy the generated file to a new machine for deployments lhodev [12:36:53] would that not be suitable? [12:37:04] the new format is also JSON, so you can write it by hand [12:40:27] bwalker: I'll float it by the groups. It wouldn't surprise me if at least one of them might choose to craft the JSON by hand (or maybe altogether some other tool of their own). [12:40:53] So to confirm: in the future, the only format that will be supported is JSON? Not INI ? [12:52:20] *** Quits: tomzawadzki (uid327004@gateway/web/irccloud.com/x-vafioycotjvetuqs) (Quit: Connection closed for inactivity) [12:59:06] lhodev: ping [12:59:19] only JSON will be supported - but Jim just pointed out that I wasn't totally clear. We'll still support the -c option, but you'll pass to it the JSON output of the save_config RPC [12:59:25] we'll also support the load RPC [12:59:56] so the only part going away is the ini file format [13:11:20] bwalker: cool, thx [13:34:21] *** Joins: travis-ci (~travis-ci@ec2-54-167-65-17.compute-1.amazonaws.com) [13:34:22] (spdk/master) thread: print error for io_devices at lib fini (Jim Harris) [13:34:22] Diff URL: https://github.com/spdk/spdk/compare/8c50e8c10632...61385feeac8e [13:34:22] *** Parts: travis-ci (~travis-ci@ec2-54-167-65-17.compute-1.amazonaws.com) () [15:02:26] anyone ever seen this error in a log: "nbd.c: 928:spdk_nbd_start: *ERROR*: ioctl(NBD_SET_SOCK) failed: Device or resource busy" [16:20:26] *** Joins: travis-ci (~travis-ci@ec2-54-235-21-32.compute-1.amazonaws.com) [16:20:27] (spdk/master) bdev/crypto: unregister io_device on failure in examine callback (Paul Luse) [16:20:27] Diff URL: https://github.com/spdk/spdk/compare/61385feeac8e...df79ac689192 [16:20:27] *** Parts: travis-ci (~travis-ci@ec2-54-235-21-32.compute-1.amazonaws.com) () [17:04:21] *** Quits: gila (~gila@5ED74129.cm-7-8b.dynamic.ziggo.nl) (Quit: My Mac Pro has gone to sleep. ZZZzzz…) [18:06:24] *** Joins: travis-ci (~travis-ci@ec2-54-242-244-52.compute-1.amazonaws.com) [18:06:24] (spdk/master) reduce_ut: add test for null backing_dev fn_ptrs (Jim Harris) [18:06:25] Diff URL: https://github.com/spdk/spdk/compare/df79ac689192...fba24e2fdb3e [18:06:25] *** Parts: travis-ci (~travis-ci@ec2-54-242-244-52.compute-1.amazonaws.com) () [18:07:10] *** Joins: travis-ci (~travis-ci@ec2-54-242-244-52.compute-1.amazonaws.com) [18:07:11] (spdk/master) reduce: generate uuid if user doesn't pass one (Jim Harris) [18:07:11] Diff URL: https://github.com/spdk/spdk/compare/fba24e2fdb3e...38259b998276 [18:07:11] *** Parts: travis-ci (~travis-ci@ec2-54-242-244-52.compute-1.amazonaws.com) () [21:27:55] anyone else on the bug scrub yet, I'm gussing WebEx is working but it only shows me and "Call in User 1" :) [21:28:08] that was a question actually [21:28:58] OK, I see more people coming.... [21:46:05] jimharris, I remember now why there's that odd duplication of code in crypto post claim function - I can clean that up for sure. It has to do with how the very early version of cryptodev API was being init'd, its evolved many times since then.... [21:53:30] *** Joins: travis-ci (~travis-ci@ec2-54-205-185-67.compute-1.amazonaws.com) [21:53:31] (spdk/master) conf: Transport should be explictly configured in conf file (Ziye Yang) [21:53:31] Diff URL: https://github.com/spdk/spdk/compare/38259b998276...6134d778d46e [21:53:31] *** Parts: travis-ci (~travis-ci@ec2-54-205-185-67.compute-1.amazonaws.com) () [21:56:13] cool [21:56:28] so one of my patches just hit the io_device_unregister bug again - but in the passthru path :) [22:01:41] I just submitted a patch for that! [22:02:05] jimharris, https://review.gerrithub.io/c/spdk/spdk/+/433383 [22:02:25] * peluse is done for tonight....