Hi,
I think I have found a nasty bug in MRTd.
I have come across situations in which the UII (telnet interface) accepts a connection but then does not respond. I am compiling with the pthread library on Linux (Red Hat 6.0). (This does not always happen, but I can reproduce it when there are many threads running.)
I've done some tracing and found that the bug lies within mrt_thread_create(). A context switch can occur right after pthread_create(), so that the thread function runs before the rest of mrt_thread_create() has run and returned. Thus the thread may run before all of the bookkeeping is done. This particular bug happens because the following code (in mrt_thread_create()) has not yet run when the thread begins:
    if (schedule)
        schedule->self = thread;
In uii_send_buffer(),
    if (uii->schedule->self != pthread_self ()) {
        buffer_t *temp = New_Buffer (buffer_data_len (buffer));
        /* when called from other threads, copy the buffer */
        buffer_puts (buffer_data (buffer), temp);
        /* schedule myself to be run by my own thread */
        schedule_event2 ("uii_send_buffer_del", uii->schedule,
                         (event_fn_t) uii_send_buffer_del, 2, uii, temp);
        return (0);
    }
This block is executed even though it should not be. uii->schedule->self should be equal to pthread_self(), but it is not, because the thread function (start_uii()) starts running before schedule->self is set in mrt_thread_create().
As a result, uii_send_buffer() returns 0, which makes the UII thread stop responding.
My current workaround is to comment out the code listed above, and it works. I am not sure whether this bug affects other MRTd functionality. Otherwise, there should be some mechanism that makes sure mrt_thread_create() has finished its bookkeeping and returned before the thread function is run.
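To make the second idea concrete, here is a rough sketch of one possible mechanism: a "start gate" mutex that the creator holds across pthread_create() and the bookkeeping, and that the new thread must pass through before it calls the real thread function. This is not MRTd code. All names in it (demo_schedule_t, start_gate_t, gated_start, demo_thread_create, demo_thread_fn, demo_run) are hypothetical, and the schedule structure is reduced to the one field that matters here. It also compares thread IDs with pthread_equal(), which is the portable way to do so.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for MRTd's schedule structure. */
typedef struct {
    pthread_t self;   /* owning thread, like schedule->self */
    int owner_ok;     /* demo only: did the thread see itself as owner? */
} demo_schedule_t;

/* Hypothetical start gate shared by the creator and the new thread. */
typedef struct {
    pthread_mutex_t lock;
    void *(*fn) (void *);
    void *arg;
} start_gate_t;

/* Wrapper passed to pthread_create(); waits at the gate, then runs fn. */
static void *gated_start (void *p)
{
    start_gate_t *gate = p;
    void *(*fn) (void *) = gate->fn;
    void *arg = gate->arg;

    /* Blocks here until the creator has finished its bookkeeping. */
    pthread_mutex_lock (&gate->lock);
    pthread_mutex_unlock (&gate->lock);

    /* The creator no longer touches the gate once it has unlocked it. */
    pthread_mutex_destroy (&gate->lock);
    free (gate);
    return fn (arg);
}

/* Assumed replacement for the create-and-record step; returns 0 on success. */
static int demo_thread_create (demo_schedule_t *schedule,
                               void *(*fn) (void *), void *arg,
                               pthread_t *thread)
{
    start_gate_t *gate = malloc (sizeof *gate);
    int rc;

    if (gate == NULL)
        return -1;
    if (pthread_mutex_init (&gate->lock, NULL) != 0) {
        free (gate);
        return -1;
    }
    gate->fn = fn;
    gate->arg = arg;

    pthread_mutex_lock (&gate->lock);       /* close the gate */
    rc = pthread_create (thread, NULL, gated_start, gate);
    if (rc == 0 && schedule != NULL)
        schedule->self = *thread;           /* bookkeeping the thread relies on */
    pthread_mutex_unlock (&gate->lock);     /* open the gate */

    if (rc != 0) {                          /* no thread exists, clean up here */
        pthread_mutex_destroy (&gate->lock);
        free (gate);
        return -1;
    }
    return 0;
}

/* Demo thread function: performs the same ownership test as uii_send_buffer(). */
static void *demo_thread_fn (void *arg)
{
    demo_schedule_t *schedule = arg;

    schedule->owner_ok = pthread_equal (schedule->self, pthread_self ()) != 0;
    return NULL;
}

/* Returns 1 if the new thread saw schedule->self already set to itself. */
static int demo_run (void)
{
    demo_schedule_t schedule;
    pthread_t thread;

    schedule.owner_ok = 0;
    if (demo_thread_create (&schedule, demo_thread_fn, &schedule, &thread) != 0)
        return -1;
    pthread_join (thread, NULL);
    return schedule.owner_ok;
}
```

With the gate in place, demo_run() should return 1 no matter when the context switch happens, because the thread function cannot start until schedule->self has been written. A simpler alternative, under the same assumptions, would be for the wrapper itself to store pthread_self() into schedule->self before calling the thread function.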
Comments?
Tony Cheung