[00:13:57] *** Quits: felipef (~felipef@109.144.208.131) (Remote host closed the connection)
[00:42:47] *** Quits: ziyeyang_ (~ziyeyang@192.55.46.36) (Quit: Leaving)
[01:48:34] *** Joins: amit1234 (c6af4424@gateway/web/freenode/ip.198.175.68.36)
[01:48:41] Hi Amit!
[01:48:54] *** Joins: tomzawadzki (uid327004@gateway/web/irccloud.com/x-zjestbfdiiyiresu)
[01:49:11] Hi
[01:50:00] Together with @pwodkowx we were thinking about the names of the notifications
[01:50:49] ok
[01:50:50] We don't want to duplicate the code and introduce too many new entities, so I think we could reuse the RPC command names as event names
[01:52:38] So, for example, if a malloc bdev is deleted - a delete_malloc_bdev notification is generated
[01:54:25] That approach has an advantage: potentially you could recreate a json config that leads to the same configuration based only on notifications
[01:54:27] so, you mean prepend the type of event like add, delete, update with the object name
[01:55:48] I mean that we should name the event the same as the rpc call, so instead of add - construct_malloc_bdev (example), instead of delete - delete_malloc_bdev
[01:55:50] for e.g. a malloc bdev, potentially we might have three types, e.g. add_malloc_bdev, delete_malloc_bdev and update_malloc_bdev - is my understanding correct?
[01:55:56] yes
[01:55:58] exactly
[01:56:55] besides the name, you would also receive the notification object name - for bdevs that parameter is obvious, it's just the bdev name
[01:57:23] we still have to figure out if we want to pass other parameters with that json response
[01:58:38] ok, so the json response might look something like -- { name of the notification, object name, uuid }
[01:59:20] yes
[02:10:21] amit1234: we are thinking about mapping events directly to existing RPC methods
[02:11:27] so the event response will look like { "method": , "params": }
[02:13:08] so existing methods and documentation will still be valid
[02:14:02] I'm not sure i understand it correctly... Can you please give an example?
[02:15:05] you have an NVMe bdev called "Nvme0n1". You create it using the "construct_nvme_bdev" RPC call
[02:15:36] you need to provide some parameters to create it, yes?
[02:17:19] The response will contain the information we mentioned before - { name of the notification, object name, uuid }, but the actual format of the response will be compatible with the JSON-RPC spec
[02:17:34] what about the events which are not explicitly triggered or mapped to any RPC methods, e.g. an NVMe bdev failure or a QoS change detected by the SPDK layer itself
[02:19:39] So when the NVMe bdev is created you will get the same data as an event
[02:19:40] { "method": "construct_nvme_bdev", "params": { "name": "Nvme1", "traddr": "0000:01:00.0", "trtype": "pcie" } }
[02:20:32] For events that can't be mapped to existing RPC calls we will have to add them
[02:21:28] so, what will be the json response format for those events which can't be mapped to existing RPC calls?
[02:23:07] This is just an idea, but I think it will be very similar to the one above.
[02:24:11] I guess that for those events which can't be mapped to an existing one we need to define a new type
[02:24:26] Amit, could you provide an example of such an event?
[02:26:07] e.g. during run time SPDK detects an NVMe bdev failure (due to write IO errors etc.), so it wants to send an update event for the NVMe bdev to the upper layer
[02:26:38] on receiving the event the upper layer can query the bdev and figure out the reason for the failure.
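A rough sketch of how a consumer of such notifications might dispatch on the event name, assuming the `{ "method": ..., "params": ... }` shape from the example above (the function and handler names here are hypothetical, not an SPDK API):

```python
import json

# Hypothetical sketch: the event shape mirrors the "construct_nvme_bdev"
# example from the discussion above; nothing here is actual SPDK code.
def handle_notification(raw):
    """Dispatch a JSON-RPC-style event whose method name mirrors an RPC call."""
    event = json.loads(raw)
    method = event["method"]          # e.g. "construct_nvme_bdev"
    params = event.get("params", {})  # same params as the RPC that caused it
    return method, params.get("name")

method, obj = handle_notification(
    '{"method": "construct_nvme_bdev",'
    ' "params": {"name": "Nvme1", "traddr": "0000:01:00.0", "trtype": "pcie"}}'
)
```

Because the event carries the same method name and params as the originating RPC, replaying the received events in order would reconstruct the same JSON config, which is the advantage mentioned earlier in the discussion.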
[02:28:16] ok, that makes sense
[02:29:35] amit1234: we will have to provide an additional event type
[02:29:55] for events like error detection
[02:30:45] ok
[04:08:45] *** Joins: travis-ci (~travis-ci@ec2-54-80-96-200.compute-1.amazonaws.com)
[04:08:46] (spdk/master) nvmf: destroy mutex on controller destruction (Maciej Szwed)
[04:08:46] Diff URL: https://github.com/spdk/spdk/compare/9e2f1bff8146...6569a529d68f
[04:08:46] *** Parts: travis-ci (~travis-ci@ec2-54-80-96-200.compute-1.amazonaws.com) ()
[04:57:51] *** Quits: amit1234 (c6af4424@gateway/web/freenode/ip.198.175.68.36) (Ping timeout: 256 seconds)
[04:58:12] anyone know what happened with Jenkins over the last 12 hours? Was there an outage of some kind? See the dist list discussion on TPT; Jim commented that it was down at some point. I just looked and it seems like it's working now
[05:00:34] reminder: the Euro community meeting is coming up in about 3 hours from now. We are not using WebEx, and the Community Meeting page has not been updated yet as we're still on a "trial basis" with the new conf tool
[05:00:57] Dial-in number (US): (605) 472-5422
[05:00:57] Access code: 652734#
[05:00:57] International dial-in numbers: https://fccdl.in/i/paul_e_luse
[05:00:57] Online meeting ID: paul_e_luse
[05:00:57] Join the online meeting: https://join.freeconferencecall.com/paul_e_luse
[05:09:35] *** Quits: darsto (~darsto@89-78-174-111.dynamic.chello.pl) (Ping timeout: 240 seconds)
[05:33:14] Hi peluse. Yeah, it was offline. The VM with the service hung and I had to reboot it. Bad luck I guess.
[06:29:06] *** Joins: travis-ci (~travis-ci@ec2-54-159-157-78.compute-1.amazonaws.com)
[06:29:06] (spdk/master) vhost/nvme: remove VHOST_USER_NVME_IO_CMD socket message (Changpeng Liu)
[06:29:06] Diff URL: https://github.com/spdk/spdk/compare/6569a529d68f...2bebd09bd724
[06:29:06] *** Parts: travis-ci (~travis-ci@ec2-54-159-157-78.compute-1.amazonaws.com) ()
[06:54:10] klateck, thanks.
Can you make sure that everyone who has access/the ability to perform those steps has clear instructions on how to identify the situation as well as rectify it?
[06:55:45] Yeah, I think so. Should be in the document I sent out some time ago. Will double check.
[07:03:47] thanks!
[07:18:59] klateck, forgot to ask you about this one: https://trello.com/c/QsgCjiab Looks like it's WIP, any ETA?
[07:37:10] peluse - Yes, still WIP. With darsto we managed to cut down some time on the initial sources fetch, and in the meantime we're looking for a way to speed up the submodules.
[07:38:18] cool, thanks
[07:39:38] Improving the initial fetch was more critical because it added more time to the total build time. Cloning submodules is not that bad, but improving this would still be nice
[07:50:01] *** Joins: felipef (~felipef@62.254.187.233)
[07:59:26] klateck, sethhowe FYI I just updated all of the todo and WIP cards in Trello for CI. Please take a look, there are many that could use updates from one or both of you... thanks!!!
[08:01:00] FYI, starting the community meeting now... well, within the next min or so :)
[08:10:32] *** Joins: pzedlews_ (~pzedlews@192.55.54.40)
[08:11:50] *** Quits: felipef (~felipef@62.254.187.233) (Remote host closed the connection)
[08:33:09] All right, peluse, updated / commented on some trello stuff. Will continue tomorrow with the rest :)
[08:41:58] fantastic, thanks!
No big hurry, just want to get it cleaned up and make sure nothing slips through the cracks as we plan the actual switchover to Jenkins
[08:46:15] *** Quits: pzedlews_ (~pzedlews@192.55.54.40) (Ping timeout: 268 seconds)
[10:39:34] *** Joins: felipef (~felipef@92.40.249.53.threembb.co.uk)
[10:57:00] *** Quits: tomzawadzki (uid327004@gateway/web/irccloud.com/x-zjestbfdiiyiresu) (Quit: Connection closed for inactivity)
[11:14:24] bwalker_: ping
[11:15:22] lhodev: bwalker_ is out of the office until later this week
[11:17:15] jimharris: Perhaps you can respond to my inquiry, as it appears you did participate in the review of a change Ben wrote.
[11:18:51] sethhowe: saw your e-mail - for this issue you're still seeing on the initiator side, this is just a bdevperf test with an nvme-of namespace? all of the lvol stuff is on the target side?
[11:19:15] lhodev: ok - i'll try :-)
[11:19:47] *** Quits: felipef (~felipef@92.40.249.53.threembb.co.uk) (Remote host closed the connection)
[11:20:16] jimharris: The git hash is: af4d1bdc608 The commit adds the namespace ("ns:") option in nvme/perf.
[11:21:23] ok - i have it up
[11:21:59] The primary transport ID parsing library function, spdk_nvme_transport_id_parse(), only recognizes "trtype", "adrfam", "traddr", "trsvcid" and "subnqn". It logs an error (or warning) on anything else, including "ns".
[11:23:32] it prints an error but still returns success?
[11:23:53] Correct, and it needs to so that Ben's change can support the 'ns' as currently written.
[11:24:22] It kinda bugs me the way that works; i.e. it logs an error.
[11:24:33] agreed
[11:24:55] I'd like to propose that spdk_nvme_transport_id_parse() either silently accepts and ignores the "ns" key, -OR-
[11:24:57] add_trid should process the ns first I think, then move the rest of the string such that the ns part doesn't get passed to spdk_nvme_transport_id_parse
[11:25:23] perf is changed to accept a namespace ID arg separate from the '-r string'
[11:26:01] Or, we could do it as you just suggested, though I won't deny that still kinda bothers me ;-)
[11:26:49] jimharris: correct. Each lvol is being exposed as an NVMe controller on the fabric. From the initiator side, it thinks it is connecting to a bunch of NVMe drives.
[11:26:56] i don't think a separate namespace id would work - you couldn't really do multiple trids with that approach
[11:27:26] jimharris: Ah, I hadn't considered that scenario.
[11:28:27] That leaves the suggestion you proposed, or altering spdk_nvme_transport_id_parse() such that it at least doesn't log an error.
[11:29:36] I'm happy to implement either, as long as it's changed in some way to eliminate the confusing/worrisome appearance of an error getting logged.
[11:31:14] i'm ok with a special case (that has a clear comment) for key = "ns" in spdk_nvme_transport_id_parse() which squashes the ERRLOG
[11:32:02] Ok, then I'll do that and submit a patch. Thank you!
[11:34:23] sethhowe: can you instrument memory_hotplug_cb (in lib/env_dpdk/memory.c) to dump each allocation?
[11:36:15] jimharris: will do.
[11:41:40] jimharris: It looks like I get a bunch of 4MB allocations from the initiator.
[11:42:10] and you end up with a buffer that spans two of these allocations?
[11:42:19] ugh
[11:51:19] sethhowe: how big are the I/Os in your test?
[12:27:11] The I/Os are 65K.
[12:27:37] Er, 64K sorry.
[12:32:34] jimharris: Yeah, I get several that fail to submit.
[13:06:46] sethhowe: ping
[13:07:00] jimharris: pong
[13:07:20] 14s ping time. I'll get better haha :)
[13:07:28] lol
[13:09:09] I am doing some deeper digging.
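One of the two options floated above - having the caller pull the "ns" key out of the -r string before handing the remainder to spdk_nvme_transport_id_parse() - could be sketched like this (Python used purely for illustration; perf itself is C, and the helper name is hypothetical):

```python
# Hypothetical sketch of splitting the "ns" key out of a perf-style transport
# ID string so the rest can be parsed without triggering the unknown-key
# error log discussed above. Fields are whitespace-separated "key:value"
# pairs; partition(":") splits at the first colon only, so values containing
# colons (like a PCI traddr) survive intact.
def split_ns(trid_str):
    keep, ns = [], None
    for field in trid_str.split():
        key, _, value = field.partition(":")
        if key == "ns":
            ns = int(value)
        else:
            keep.append(field)
    return " ".join(keep), ns

rest, ns = split_ns("trtype:PCIe traddr:0000:01:00.0 ns:1")
```

The alternative actually agreed on at [11:31:14] - a commented special case for the "ns" key inside the parser that squashes the ERRLOG - keeps the parsing in one place instead.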
The memory addresses I am getting failures on appear to be in the middle of memory regions... which is not good at all. I am now printing out how we are mapping those into the rdma mem map
[13:09:11] so i'm not set up to try to reproduce your nvmf initiator findings on my system
[13:09:28] Do you not have an rdma nic?
[13:09:30] yeah - i'm not following how that could fail - the mem_map_translate always returns a value of at least 2MB
[13:09:36] i don't
[13:10:12] can you print out the sge_length and the mr_length when it fails?
[13:10:52] I am doing that. I am also printing out the virtual address of the buffer.
[13:11:53] It should be that I can do a little arithmetic and find that part of the buffer is in one allocation and another part is in a second allocation, but that math doesn't work. So something else is going wrong.
[13:14:39] right now buffer_vaddr + sge_length < dpdk_allocation_address + dpdk_mapping_length for those addresses that are failing. Which doesn't make any sense at all. We must be splitting it up somehow before we register with the NIC.
[13:18:16] is it specifically the mr_length < sge_length check that is failing?
[13:20:20] Yes.
[13:23:52] what do mr_length and sge_length look like when it fails?
[13:25:44] sge_length is always 65536, which makes sense. We are allocating those buffers in bdevperf. mr_length is all over the place, from 4352 to 60864.
[13:26:45] jimharris, duh... yeah I guess you're here :)
[13:29:33] and you're running master + my patch with nothing else?
[13:31:57] does the failing buffer cross a 2MB boundary?
[13:34:42] peluse: yeah - i'm here but will be on and off based on what's going on in this f2f
[13:35:11] np, I'll just plug through what I can and spend more time learning instead of asking questions....
[13:37:50] sethhowe: i don't see how mr_length could be anything but an even multiple of 2MB
[13:39:24] The buffers aren't 2MB aligned.
You get back the remaining length from the start of your buffer to the end of the mr.
[13:41:46] no - do they cross a 2MB boundary?
[13:41:56] can you give me an example of a buffer that's failing?
[13:42:23] spdk_mem_map translates at a 2MB granularity, so my hunch is that the buffer crosses a 2MB boundary
[13:42:25] jimharris: I found the issue. I hadn't registered a callback for checking if two translations were contiguous (since we didn't need it before), so with Darek's patch they were failing every time they crossed a 2MB boundary because the callback wasn't being invoked.
[13:43:18] ok - i'm still not sure about the weird mr_length though - when I read the spdk_mem_map_translate code, it looks like it always puts some multiple of VALUE_2MB into the size variable
[13:43:52] jimharris: It does on master. Darek's patch fixes that. That is what I was using.
[13:44:10] oh - that's what i was missing
[13:44:14] https://review.gerrithub.io/#/c/spdk/spdk/+/433076/
[13:45:25] I am going to try a few different I/O patterns to see if I can break this implementation. Like, if we allocate buffers that aren't a factor of 2 or 4MB, can we still go over an MR boundary?
[13:45:35] I think the answer to that might be yes.
[13:47:37] i don't think so - even when you do an rte_malloc of 64KB, under the covers DPDK adds a little bit to that
[13:47:52] so you're kind of already doing things that aren't a factor of 2
[13:48:06] by all means, keep doing more testing - but my guess is the answer is "no" :-)
[13:49:17] Oh, like for the elem object that is associated with the allocation?
[13:49:23] yes
[13:50:18] Gotcha. That makes sense. I'll still stress it a little to try to convince myself.
[13:50:48] But I believe you lol
[13:51:19] i think we should get darek's patch (with a couple of small mods) and your patch to add a contiguous callback merged asap
[13:52:32] OK, I will rebase ours on top of yours and see if we can get this passing on the test pool.
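The 2MB-boundary condition being debugged above can be sketched numerically (VALUE_2MB mirrors the SPDK constant; the helper itself is illustrative, not SPDK code):

```python
# spdk_mem_map translates at 2MB granularity, so a buffer that straddles a
# 2MB boundary can map to two separate translations (and two MRs on the NIC),
# which is exactly the case the contiguous-translation callback has to handle.
VALUE_2MB = 2 * 1024 * 1024
MASK_2MB = VALUE_2MB - 1

def crosses_2mb_boundary(vaddr, length):
    """Return True if [vaddr, vaddr + length) spans more than one 2MB page."""
    first_page = vaddr & ~MASK_2MB
    last_page = (vaddr + length - 1) & ~MASK_2MB
    return first_page != last_page

# A 64 KiB buffer starting 4 KiB before a 2MB boundary spans two pages,
# while one starting at a 2MB boundary does not.
crosses = crosses_2mb_boundary(VALUE_2MB - 4096, 65536)  # True
aligned = crosses_2mb_boundary(0, 65536)                 # False
```

This also illustrates why the observed mr_length values (4352 to 60864) were all under 65536: with unaligned buffers, the remaining length from the buffer start to the end of its 2MB translation unit varies per allocation.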
[13:53:27] could you add all of these details to the github issue, and just post a short reply to the mailing list that you have updates and that folks should check github for more info?
[13:53:43] Will do.
[13:53:53] thanks! good work
[14:01:47] nice!
[14:31:13] jimharris, OK I give up. Which is the right makefile to link the reduce lib so I can make calls from the compress vbdev module?
[14:40:43] right makefile?
[14:40:49] lib/reduce/Makefile
[14:41:05] i think i'm not understanding the question though :)
[14:41:59] I'm making calls to the reduce lib from my module, of course, and I'm assuming I need to link it in somewhere? As opposed to just including the header?
[14:42:30] collect2: error: ld returned 1 exit status
[14:42:33] "/home/peluse/spdk/build/lib/libspdk_bdev_compress.a(vbdev_compress.o): In function `vbdev_init_reduce':
[14:42:33] collect2: error: ld returned 1 exit status"
[14:42:45] crap, not copying and pasting very well....
[14:43:03] one more time: "vbdev_compress.c:1225: undefined reference to `spdk_reduce_vol_init'"
[14:43:31] should just be "reduce"
[14:43:53] OK, what should just be "reduce"?
[14:44:02] the name of the library you need to link in
[14:44:41] so I do this in my vbdev module makefile with 'LIBS += -lreduce' then? I am clueless when it comes to Makefile magic
[14:44:52] i'm guessing you've already added your bdev library to mk/spdk.modules/mk?
[14:45:03] sorry - mk/spdk.modules.mk
[14:45:03] yes
[14:45:24] no - not -lreduce, that's for system libraries (not spdk libraries)
[14:45:38] just add "reduce" after bdev_compress
[14:46:11] oh, OK. I'll try that. I was trying shit like 'LIBS += -L$(SPDK_ROOT_DIR)/lib/reduce' and just having no luck
[14:47:59] much better, bunch of pmdk linker errors now but I'm on the right track again.
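The fix jimharris describes could look roughly like this in mk/spdk.modules.mk - a hedged sketch, since the exact list variable may differ; the point is that SPDK-internal libraries are linked by naming them in the module list, not via LIBS += -l<name> (which is reserved for system libraries):

```make
# Hypothetical sketch - variable name is illustrative. Listing "reduce"
# after bdev_compress makes the build link libspdk_reduce.a wherever the
# compress vbdev module is linked, resolving spdk_reduce_vol_init etc.
BLOCKDEV_MODULES_LIST += bdev_compress reduce
```
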
thanks
[14:52:12] cool
[15:38:12] *** Joins: Shuhei (caf6fc61@gateway/web/freenode/ip.202.246.252.97)
[17:00:21] *** Joins: pniedzwx_ (~pniedzwx_@89-77-161-93.dynamic.chello.pl)
[17:05:50] *** Quits: pniedzwx_ (~pniedzwx_@89-77-161-93.dynamic.chello.pl) (Ping timeout: 268 seconds)
[17:47:38] *** Joins: felipef (~felipef@109.144.216.131)
[17:47:46] *** Quits: felipef (~felipef@109.144.216.131) (Remote host closed the connection)
[20:01:41] *** Quits: Shuhei (caf6fc61@gateway/web/freenode/ip.202.246.252.97) (Ping timeout: 256 seconds)
[20:55:40] *** Joins: felipef (~felipef@109.144.217.107)
[23:05:16] *** Joins: darsto (~darsto@89-78-174-111.dynamic.chello.pl)