From: Ross S. W. W. <RW...@me...> - 2009-10-26 22:44:48
Arne Redlich [mailto:arn...@go...] wrote:
>
> On Friday, 16.10.2009, at 18:08 -0600, Abdullah Reza wrote:
> > I have a test script that creates a target, logs in to the target (from
> > a remote machine) using open-iscsi, logs out from the target, and then
> > deletes the target. This process is repeated in an infinite loop. I have
> > been running the script for two days now and have hit the following
> > problem twice.
> >
> > The target delete command (ietadm --op delete --tid=<tid>), which is the
> > fourth step in the above test procedure, fails with the following error
> > message:
> >
> > ietd_response_recv 170 0 -5
> > Input/output error.
> >
> > The ietd daemon is dead (hence the above error message). Here are the
> > last few lines of the ietd log before it died:
> >
> > Oct 16 15:39:22 qye-serv1 ietd: conn_take_fd: 10 0 1 0 1aca00003b3d0200
> > Oct 16 15:39:22 qye-serv1 ietd: connection closed
> > Oct 16 15:39:30 qye-serv1 ietd: 1 106 0 0 0
> > Oct 16 15:39:30 qye-serv1 ietd: target_del: target 106 still has sessions
> >
> > And here are the last few lines of the kernel log:
> >
> > Oct 16 15:39:30 qye-serv1 kernel: [2082244.860295] iscsi_trgt: session_free(80) 0x1aca00003b3d0200
> > Oct 16 15:39:30 qye-serv1 kernel: [2082244.860623] iscsi_trgt: target_destroy(230) 106
> >
> > I went through the code and found (I hope) the cause of the problem. The
> > logout command is received by the kernel on the IET side, which stops the
> > connection and the session and sends an E_CONN_CLOSE event to the ietd
> > daemon to notify it. The daemon may or may not accept and process
> > the event right away. The kernel has finished its part and replies
> > success to the logout command. Thus, my script issues the command
> > 'ietadm --op delete --tid=106'. This is processed by the function
> > target_del() in target.c.
> > This function issues an ioctl to the kernel
> > telling it to delete the target, and that succeeds (since the kernel
> > thinks that the session is already removed). Then this function removes
> > the target from ietd's data structure. Then it checks whether
> > the sessions_list is empty for the target; if not, it exits. Normally,
> > the session list should be empty by now, since the daemon usually gets
> > the E_CONN_CLOSE event by this time and has emptied the session list for
> > the target. In my scenario, the E_CONN_CLOSE event took a while to reach
> > the daemon and the target delete command was issued first. This is really
> > a timing problem. And as I mentioned, I hit the problem only twice in the
> > last two days of repeated testing. Here is the target_del function where
> > the daemon dies.
> >
> > int target_del(u32 tid)
> > {
> > 	struct target *target = target_find_by_id(tid);
> > 	int err = ki->target_destroy(tid);
> >
> > 	if (err < 0 && errno != ENOENT)
> > 		return -errno;
> > 	else if (!err && !target)
> > 		/* A leftover kernel object was cleaned up - don't complain. */
> > 		return 0;
> >
> > 	if (!target)
> > 		return -ENOENT;
> >
> > 	remque(&target->tlist);
> >
> > 	if (!list_empty(&target->sessions_list)) {
> > 		log_error("%s: target %u still has sessions\n", __FUNCTION__,
> > 			  tid);
> > 		exit(-1);
> > 	}
> >
> > 	all_accounts_del(tid, AUTH_DIR_INCOMING);
> > 	all_accounts_del(tid, AUTH_DIR_OUTGOING);
> >
> > 	isns_target_deregister(target->name);
> > 	free(target);
> >
> > 	return 0;
> > }
>
> *Very* nice problem description and debugging, thanks a lot.
>
> > My proposed solution is to move the last if block of the above function
> > to the beginning of the function (specifically, before the call to
> > ki->target_destroy). ietd should make sure that its conditions are met
> > before it invokes the kernel and tells it to delete its target.
> > Also, if the condition (if (!list_empty(&target->sessions_list))) is
> > satisfied, the function should simply return -EBUSY instead of exiting.
>
> The exit() of course has to go away. Not sure about moving the test for
> sessions up - we could also just try to clean up the remaining sessions
> instead of exit()ing, because their kernel part cannot exist anymore if
> ki->target_destroy() was successful.

My new ioctl code will allow a target to be deleted while there are
active sessions/connections: target deletion will iterate through the
sessions, deleting them, and session deletion will iterate through the
connections, deleting them. So that whole function could be rewritten as:

int target_del(u32 tid)
{
	struct target *target = target_find_by_id(tid);
	int err = ki->target_destroy(tid);

	if (errno == EINTR || errno == EBUSY)
		return err;

	remque(&target->tlist);

	all_accounts_del(tid, AUTH_DIR_INCOMING);
	all_accounts_del(tid, AUTH_DIR_OUTGOING);

	isns_target_deregister(target->name);
	free(target);

	return 0;
}

-Ross
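
Editorial sketch, not part of the original message and not actual ietd code: as posted, the rewritten target_del() appears to read errno without first checking that ki->target_destroy() failed, and it dereferences target even when target_find_by_id() returns NULL. The standalone C sketch below keeps the error handling of the original function quoted earlier and, following Arne's suggestion, drops leftover daemon-side sessions instead of calling exit(). Everything the thread does not name is a hypothetical stand-in introduced only so the example compiles by itself: the registry array, target_add(), fake_target_destroy() (standing in for ki->target_destroy()), fake_kernel_errno, and the nr_sessions counter (standing in for sessions_list).

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_TARGETS 8

/* Hypothetical stand-in for ietd's struct target. */
struct target {
	uint32_t tid;
	int nr_sessions;	/* stand-in for target->sessions_list */
};

/* Toy replacement for the daemon's target list. */
static struct target *registry[MAX_TARGETS];

/* Test knob: 0 makes the fake kernel call succeed, otherwise it fails
 * with this errno value. */
static int fake_kernel_errno;

static int find_slot(uint32_t tid)
{
	for (int i = 0; i < MAX_TARGETS; i++)
		if (registry[i] && registry[i]->tid == tid)
			return i;
	return -1;
}

struct target *target_find_by_id(uint32_t tid)
{
	int slot = find_slot(tid);

	return slot < 0 ? NULL : registry[slot];
}

/* Hypothetical helper: register a target with some daemon-side sessions. */
int target_add(uint32_t tid, int nr_sessions)
{
	for (int i = 0; i < MAX_TARGETS; i++) {
		if (registry[i])
			continue;
		registry[i] = malloc(sizeof(*registry[i]));
		if (!registry[i])
			return -ENOMEM;
		registry[i]->tid = tid;
		registry[i]->nr_sessions = nr_sessions;
		return 0;
	}
	return -ENOSPC;
}

/* Stand-in for ki->target_destroy(): returns -1 and sets errno on failure. */
static int fake_target_destroy(uint32_t tid)
{
	(void)tid;
	if (fake_kernel_errno) {
		errno = fake_kernel_errno;
		return -1;
	}
	return 0;
}

int target_del_sketch(uint32_t tid)
{
	int slot = find_slot(tid);
	int err = fake_target_destroy(tid);

	/* errno is only meaningful when the call actually failed. */
	if (err < 0 && errno != ENOENT)
		return -errno;

	/* No daemon-side object: either a leftover kernel object was
	 * cleaned up (success) or the target never existed. */
	if (slot < 0)
		return err < 0 ? -ENOENT : 0;

	/* The kernel side is gone, so any sessions still recorded here are
	 * stale; drop them instead of exiting the daemon. */
	registry[slot]->nr_sessions = 0;

	free(registry[slot]);
	registry[slot] = NULL;
	return 0;
}
```

With this shape, the race described in the thread (the delete arriving before the E_CONN_CLOSE event has been processed) ends in an ordinary successful delete, and a kernel-side failure such as EBUSY leaves the daemon's bookkeeping untouched so the caller can retry.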