
#19 Solaris Dynamo Performance Drops On Net Error

closed-accepted
None
5
2004-02-13
2004-01-21
Doug Haigh
No

----- Original Message -----

Iometer running on two different Intel machines
Dynamo running on Solaris 8 on Ultra 60 SMP
Iometer version 2003-05-10.

Using the default access, I kill all disk workers and start
a network worker on the Intel machine with a
destination on the Solaris machine. Start the test and
wait. Everything is going fine until something triggers a 2
second delay in the send/receive after a group of IOs.
Performance goes into the tank.

Using snoop I initially see everything coming and going
properly. When the performance drops, I see what
appears to be the Solaris machine waiting for a message
from the Intel machine. Most messages from the Intel
machine have 0 bytes of data. This message that comes
after two seconds and restarts the sending process has
8 bytes associated with it.

Since there are no archives, I could not tell if this had
been posted before. Any ideas? I did not see anything
obvious in the IO* files other than maybe the
Complete_IO is being called with a 1000ms timeout
twice. I am trying to set up a machine to recompile
dynamo for debug purposes.

------- Update ----

I found out what is causing the problem. A read/write
error occurs where the number of bytes transferred is
equal to -1 and errno is 0. When this propagates up, it
results in no IOs in the UNIX AIO list, but an IO is still
assumed to be in the dynamo transfer list.

Specifically, GetQueuedCompletionStatus in
IOCompletionQ.cpp contains the following code:

if (cqid->element_list[i].done == TRUE)
{
    // IO operation completed with either success or failure.
    cqid->element_list[i].done = FALSE;
    cqid->last_freed = i;
    cqid->position = i+1;
    *bytes_transferred = cqid->element_list[i].bytes_transferred;
    // We are returning the status of this aio. Set it to NULL to
    // free the slot.
    cqid->aiocb_list[i] = 0;
    if ((long)*bytes_transferred < 0)
    {
        *bytes_transferred = 0;
        *completion_key = 0;
        if (cqid->element_list[i].error)
            SetLastError(cqid->element_list[i].error);
        return(FALSE);
    }
    else
    {
        *completion_key = cqid->element_list[i].completion_key;
        *lpOverlapped = (LPOVERLAPPED)cqid->element_list[i].data;
        return(TRUE);
    }
}

Problems with this code include:
1) If bytes_transferred is < 0, the completion key is not
returned even though the transfer is off the list.
2) If bytes_transferred is < 0 and the error is 0,
SetLastError is not called, so the error value the caller
sees is whatever the previous value happened to be.

As a result, GetStatus (the function that calls
GetQueuedCompletionStatus) always returns
'ReturnTimeout', because the timeout was the last value
set using SetLastError. All following calls to
GetQueuedCompletionStatus result in WAIT_TIMEOUT
because there are no other IOs on the list.
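
The stale-error effect can be reproduced in isolation. The following is only a sketch, not dynamo code: SetLastErrorSim, GetLastErrorSim, Element, and CompleteOriginal are hypothetical stand-ins, and the value 258 is WAIT_TIMEOUT as defined on Windows (the value in dynamo's UNIX headers is an assumption here).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for dynamo's Win32 emulation layer on UNIX.
static const uint32_t kWaitTimeout = 258;  // WAIT_TIMEOUT on Windows; assumed here
static uint32_t g_last_error = 0;
void SetLastErrorSim(uint32_t e) { g_last_error = e; }
uint32_t GetLastErrorSim() { return g_last_error; }

// Simplified queue element (hypothetical, reduced to the fields that matter).
struct Element {
    bool done;
    long bytes_transferred;
    int error;
};

// Mirrors the error branch of the original code: the last error is only
// updated when the stored error is nonzero.
bool CompleteOriginal(Element &e) {
    assert(e.done);
    e.done = false;
    if (e.bytes_transferred < 0) {
        if (e.error)
            SetLastErrorSim(static_cast<uint32_t>(e.error));
        return false;
    }
    return true;
}

// Returns the error a caller would observe for a failed IO (-1 bytes,
// errno 0) that completes after an earlier poll has timed out.
uint32_t ObservedErrorAfterTimeout() {
    SetLastErrorSim(kWaitTimeout);   // earlier poll found nothing and timed out
    Element failed = {true, -1, 0};  // the failing IO described above
    bool ok = CompleteOriginal(failed);
    return ok ? 0 : GetLastErrorSim();
}
```

Calling ObservedErrorAfterTimeout() yields the timeout value rather than the failed IO's own error, which is why the caller cannot tell a real failure from an empty queue.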

I suggest changing it to

if (cqid->element_list[i].done == TRUE)
{
    // IO operation completed with either success or failure.
    // Always set completion key
    *completion_key = cqid->element_list[i].completion_key;
    // Always set overlap data
    *lpOverlapped = (LPOVERLAPPED)cqid->element_list[i].data;
    cqid->element_list[i].done = FALSE;
    cqid->last_freed = i;
    cqid->position = i+1;
    *bytes_transferred = cqid->element_list[i].bytes_transferred;
    // We are returning the status of this aio. Set it to NULL to
    // free the slot.
    cqid->aiocb_list[i] = 0;
    if ((long)*bytes_transferred < 0)
    {
        *bytes_transferred = 0;
        // Always set error code
        SetLastError(cqid->element_list[i].error);
        return(FALSE);
    }
    else
    {
        return(TRUE);
    }
}

In this code, the completion key is always returned
when the IO is taken off the list, and the error returned
is always the current value. This lets the callers remove
the bad IO and see the actual error code. After these
changes, my IO Error Count increased every time I ran
into this problem, but my IO performance stayed where
it should be.
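
The behaviour of the suggested change can be checked against a stripped-down model of the queue. This is a sketch under stated assumptions, not the patch itself: SimElement, SimResult, and DequeueFixed are hypothetical names, and the stale_error parameter stands in for whatever SetLastError last recorded.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Simplified model of one completion-queue slot (hypothetical layout).
struct SimElement {
    bool done;
    long bytes_transferred;
    int error;
    uintptr_t completion_key;
    void *data;
};

// What a caller gets back from one dequeue attempt.
struct SimResult {
    bool ok;
    unsigned long bytes;
    uintptr_t key;
    void *overlapped;
    int last_error;
};

// Models the suggested logic: key and overlapped data are filled in for
// every completed slot, and the error is overwritten even when it is 0.
SimResult DequeueFixed(SimElement *list, size_t n, int stale_error) {
    assert(list != nullptr);
    SimResult r = {false, 0, 0, nullptr, stale_error};
    for (size_t i = 0; i < n; ++i) {
        if (!list[i].done)
            continue;
        r.key = list[i].completion_key;    // always set completion key
        r.overlapped = list[i].data;       // always set overlap data
        list[i].done = false;              // free the slot
        if (list[i].bytes_transferred < 0) {
            r.last_error = list[i].error;  // always set error code
            r.ok = false;
        } else {
            r.bytes = static_cast<unsigned long>(list[i].bytes_transferred);
            r.ok = true;
        }
        return r;
    }
    return r;  // nothing completed: ok stays false, stale error is unchanged
}
```

With a failed slot whose error is 0, the result carries the slot's completion key and data together with error 0, so a caller can match the failure to its IO and count it instead of treating it as a timeout.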

This patch was made against the 12/16/2003 code.

Discussion

  • Daniel Scheibli - 2004-02-13
    • status: open --> closed

  • Daniel Scheibli - 2004-02-13
    • assigned_to: nobody --> xca1019
    • status: closed --> closed-accepted
