From: Lev V. <le...@za...> - 2023-06-20 13:07:44
|
Hello, � isert-scst has a dedicated workqueue where RDMA completion handler (isert_cq_comp_work_cb) runs. New commands arrive to isert_cq_comp_work_cb() that eventually checks (in scst_translate_lun()) if SCST is suspended: � isert_cq_comp_work_cb -> isert_poll_cq -> isert_pdu_rx -> cmnd_rx_start -> scsi_cmnd_start -> scst_cmd_init_stage1_done -> scst_cmd_init_done -> scst_init_cmd -> __scst_init_cmd -> scst_translate_lun If SCST is suspended, scst_init_cmd() sleeps 50ms in order to "keep initiator away from too many BUSY commands". The problem is that this sleep sticks the workqueue thread, that is responsible also to complete "old" requests, that were received _before_ scst_suspend_activity(). As a result suspend itself may take much time, because "old" commands do not get thread context to be finished. For example in my environment with ~80 outstanding commands suspend takes ~3sec. If I comment out msleep(50) in scst_init_cmd(), it takes just ~100ms. Another less important but still annoying issue is with log in scst_suspend_activity(). It prints the number of active commands, but it calls scst_get_cmd_counter() -> percpu_ref_read(&scst_cmd_count) immediately after percpu_ref_kill(). percpu_ref_kill() calls __percpu_ref_switch_to_atomic() that _asynchronously_ invokes percpu_ref_switch_to_atomic_rcu(), only there scst_cmd_count becomes really atomic. Until percpu_ref_switch_to_atomic_rcu() is finished, the value returned by scst_get_cmd_counter() is wrong and confusing. In my tests it takes ~20ms after percpu_ref_kill() until scst_get_cmd_counter() returns real command count. I'm using SCST 3.6 but on a [pretty old] kernel 4.14. ಚ Thanks, -Lev. |