|
From: Vladislav B. <vs...@vl...> - 2015-05-02 02:10:12
|
Hi,
Yes, such race is possible and deliberate. I really-really-really doubt it can be
noticeable on practice, especially lead to such consequences, because the race window
very tiny, in order of a ten or two of CPU instructions. So, to exploit it, you need to
have many billions of incoming commands per second. It is impossible as such, plus all
the refused with BUSY commands are refused after 50ms delays (see scst_init_cmd()).
Your solution is not good, because it is breaking the whole lockless mechanics behind
LUNs translation. You are now racing with SCST_FLAG_SUSPENDED flag setting. Your
solution will work only with a lock around all related places.
Thanks,
Vlad
Shyam Kaushik wrote on 05/01/2015 03:28 AM:
> Hi All,
>
> We are using SCST 2.2.1 to expose ~200 LUNs to ~200 initiators. Some LUNs
> are exposed to multiple initiators(~10-15), while others are to one
> initiator. Periodically we add/remove LUN exposure to initiators based on
> needs.
>
> As part of the add/remove LUN exposure to initiator, SCST does the global
> suspend/resume. This typically completes in a short-while (like 1-2 secs)
> but periodically we see that SCST suspend takes 1-minute or higher. During
> this entire period all new IOs are rejected with BUSY/QUEUE_FULL. We have
> confirmed that we don't have any backend-side issues to complete the
> in-progress IO's (they complete very fast in our environment). We have
> logging below SCST to capture IO times & don't see any issue in terms of
> speedy completion of IO's.
>
> Looking at SCST code, we see a race condition:
> # When a command comes in, we have __scst_init_cmd() call in
> scst_translate_lun(). However scst_translate_lun() is bad that it first
> increments a per-cpu cmd counter indicating that there are commands to
> process
> cmd->cpu_cmd_counter = scst_get();
>
> & only then checks the suspension flag like
> if (likely(!test_bit(SCST_FLAG_SUSPENDED, &scst_flags))) {
> } else {
>
> & if we are suspended, it drops the command-counter
> TRACE_MGMT_DBG("%s", "FLAG SUSPENDED set, skipping");
> scst_put(cmd->cpu_cmd_counter);
> res = 1;
>
> These commands with res==1 commands will be rejected later with
> scst_set_busy()
>
> So if we have several of such incoming commands that are BUSY'd out to
> several initiators, we will constantly make the per-cpu counter to stay up
> resulting in the suspend to not finish its wait
> scst_susp_wait()
> res = wait_event_interruptible_timeout(scst_dev_cmd_waitQ,
> (scst_get_cmd_counter() == 0),
> SCST_SUSPENDING_TIMEOUT);
>
> We see the same behavior with SCST 3.0. Can you please confirm this race?
> Below are logs from one of the occurrence + a proposed patch.
>
> Can you pls advise if you approve the below patch/suggest variations to
> the patch? Thanks.
>
> log from one of the occurrence:
> Apr 30 09:56:31 zserver kernel: [11785390.311229] scst[1749]
> scst_suspend_activity[787]: Waiting for 18 active commands to complete...
> This might take few minutes for disks or few hours for tapes, if you use
> long executed commands, like REWIND or FORMAT. In case, if you have a hung
> user space device (i.e. made using scst_user module) not responding to any
> commands, if might take virtually forever until the corresponding user
> space program recovers and starts responding or gets killed.
>
> After this we have all initiator IO's replied with BUSY/QUEUE_FULL
> Apr 30 09:56:31 zserver kernel: [11785390.311657] scst: Sending QUEUE_FULL
> status to initiator iqn.hostvm:vm1 (cmds count 14, queue_type 1,
> sess->init_phase 3)
> Apr 30 09:56:32 zserver kernel: [11785390.333606] scst: Sending BUSY
> status to initiator iqn.hostvm:vm2 (cmds count 1, queue_type 1,
> sess->init_phase 3)
> Apr 30 09:56:32 zserver kernel: [11785390.338958] scst: Sending QUEUE_FULL
> status to initiator iqn.hostvm:vm6 (cmds count 2, queue_type 1,
> sess->init_phase 3)
> Apr 30 09:56:32 zserver kernel: [11785390.644106] scst: Sending BUSY
> status to initiator iqn.hostvm:vm3 (cmds count 1, queue_type 1,
> sess->init_phase 3)
> Apr 30 09:56:32 zserver kernel: [11785390.644490] scst: Sending BUSY
> status to initiator iqn.hostvm:vm4 (cmds count 1, queue_type 1,
> sess->init_phase 3)
> Apr 30 09:56:32 zserver kernel: [11785390.668050] scst: Sending BUSY
> status to initiator iqn.hostvm:vm5 (cmds count 1, queue_type 1,
> sess->init_phase 3)
> Apr 30 09:56:32 zserver kernel: [11785390.672271] scst: Sending BUSY
> status to initiator iqn.hostvm:vm6 (cmds count 1, queue_type 1,
> sess->init_phase 3)
>
> & continues till
> Apr 30 09:57:50 zserver kernel: [11785469.100147] scst: Sending BUSY
> status to initiator iqn.hostvm:vm7 (cmds count 1, queue_type 1,
> sess->init_phase 3)
> Apr 30 09:57:50 zserver kernel: [11785469.116207] scst: Sending BUSY
> status to initiator iqn.hostvm:vm8 (cmds count 1, queue_type 1,
> sess->init_phase 3)
>
> When scst-suspend completes
> Apr 30 09:57:50 zserver kernel: [11785469.119975] scst[1749]
> scst_suspend_activity[811]: All active commands completed
>
>
> PROPOSED PATCH:
> --- a/scst/scst/src/scst_targ.c
> +++ b/scst/scst/src/scst_targ.c
> @@ -3825,11 +3825,12 @@ static int scst_translate_lun(struct scst_cmd
> *cmd)
>
> TRACE_ENTRY();
>
> - cmd->cpu_cmd_counter = scst_get();
> -
> if (likely(!test_bit(SCST_FLAG_SUSPENDED, &scst_flags))) {
> - struct list_head *head =
> -
> &cmd->sess->sess_tgt_dev_list[SESS_TGT_DEV_LIST_HASH_FN(cmd->lun)];
> + struct list_head *head;
> +
> + cmd->cpu_cmd_counter = scst_get();
> + head =
> &cmd->sess->sess_tgt_dev_list[SESS_TGT_DEV_LIST_HASH_FN(cmd->lun)];
> +
> TRACE_DBG("Finding tgt_dev for cmd %p (lun %lld)", cmd,
> (long long unsigned int)cmd->lun);
> res = -1;
> @@ -3865,7 +3866,6 @@ static int scst_translate_lun(struct scst_cmd *cmd)
> }
> } else {
> TRACE_MGMT_DBG("%s", "FLAG SUSPENDED set, skipping");
> - scst_put(cmd->cpu_cmd_counter);
> res = 1;
> }
>
> Thanks.
>
> --Shyam
>
> ------------------------------------------------------------------------------
> One dashboard for servers and applications across Physical-Virtual-Cloud
> Widest out-of-the-box monitoring support with 50+ applications
> Performance metrics, stats and reports that give you Actionable Insights
> Deep dive visibility with transaction tracing using APM Insight.
> http://ad.doubleclick.net/ddm/clk/290420510;117567292;y
> _______________________________________________
> Scst-devel mailing list
> https://lists.sourceforge.net/lists/listinfo/scst-devel
>
|