This is such a subtle "bug" that it was really hard to
track.
The problem is that if some CallContext variables are in use, whether
they are ILogicalThreadAffinative or not, an exception in the
RemotingAppender.SendBuffer method (when the messages are actually sent)
causes the remoting infrastructure to clear the CallContext. The caller
thus loses information that may be critical for the application (as it
was in our case).
Worse, as the logger never throws to the caller, you never realize the
problem is in the appender; the application simply starts getting errors
when accessing those CallContext.GetData variables that are no longer
present.
The solution is really simple, but finding the source of the error cost
me a day of work :S ... Here it is (in RemotingAppender):
override protected void SendBuffer(LoggingEvent[] events)
{
    ...
    // Send the events: COMMENTED OUT!
    //m_sinkObj.LogEvents(events);

    // Send asynchronously to avoid interfering with the caller's context.
    System.Threading.ThreadPool.QueueUserWorkItem(
        new System.Threading.WaitCallback(DoSendEvents),
        events);
}
private void DoSendEvents(object state)
{
    m_sinkObj.LogEvents((LoggingEvent[]) state);
}
It has a positive side effect: sending asynchronously means that the I/O
hit is not paid by the caller issuing the logging event.
If an exception happens in that asynchronous method, it's silently
swallowed without disturbing anyone (you may want to add internal
logging in that case...).
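For instance, something along these lines (a sketch; the try/catch is my
addition on top of the patch, assuming log4net 1.2's LogLog helper):

private void DoSendEvents(object state)
{
    try
    {
        m_sinkObj.LogEvents((LoggingEvent[]) state);
    }
    catch (System.Exception ex)
    {
        // Report through log4net's internal logging rather than letting
        // the exception vanish silently on the ThreadPool thread.
        log4net.Util.LogLog.Error(
            "RemotingAppender: failed to deliver events to the sink", ex);
    }
}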
Logged In: YES
user_id=1166926
Sorry Daniel, that is not how I see it.
The long and the short of it is that the remoting appender in log4net is
not thread-safe if the client and the server are on the same machine.
The client locks a piece of code in the append and then calls the
server's append. However, because the server and the client are in the
same address space and the log is a static member, the same call gets
made to the append that is already locked.
A classic deadlock!!
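Something like this sketch is what I am describing (assuming log4net's
RemotingAppender.IRemoteLoggingSink interface; the class name is made up):

using log4net;
using log4net.Core;

// Hypothetical in-process sink: it handles LogEvents in the same address
// space and itself logs through the same static logger.
class InProcessSink : System.MarshalByRefObject,
                      log4net.Appender.RemotingAppender.IRemoteLoggingSink
{
    static readonly ILog log = LogManager.GetLogger(typeof(InProcessSink));

    public void LogEvents(LoggingEvent[] events)
    {
        // The client thread still holds the appender's lock while it
        // waits for this call to return; this nested Debug call blocks
        // on that same lock, so neither side can proceed.
        log.Debug("received " + events.Length + " events");
    }
}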
The real problems were the really cryptic things in the debugger, as it
never seemed to be breaking in the same place.
Your solution just stops the logger logging on the server.
I think the solution is to run the remote object in a different app
domain. I will post more when I have done it.
Logged In: YES
user_id=222270
First, this was detected in a multi-machine scenario: not only
WebFarm + App Server, but the remoting appender "server" was yet another
machine.
Even so, I fail to follow your reasoning. The remoting listener is run,
by definition and implementation, as a separate application, a console
app (or any other host you create), which obviously runs in a completely
different address space and process. Therefore, it's impossible for the
server append to EVER be locked against the client append.
I spent an entire day debugging and chasing this problem. Although it
was quite some time ago, I clearly remember this as one of those issues
we had to patch in log4net in order for it to be usable in our scenario.
This is the way to repro (as far as a year-old bug remains in memory):
1- Set up a machine that is the "logging source". This must be a
remoting server application that is called from a client.
2- The client passes some information in the CallContext. The server may
also use the CallContext to store its own data.
3- Enable the remoting appender. This can point to any URI; the point is
that it must not be reachable (a remoting exception will be thrown).
4- Have the server application use the CallContext *after* performing a
log operation. As the remoting appender will throw an exception, the
CallContext is cleared, for some reason I couldn't track into the .NET
Framework by using Reflector alone. See the sketch below.
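Roughly, the repro looks like this (a sketch from memory; the URI and
class names are made up):

using System;
using System.Runtime.Remoting.Messaging;
using log4net;

class Repro
{
    static readonly ILog log = LogManager.GetLogger(typeof(Repro));

    static void Main()
    {
        // Assumes log4net is configured (e.g. in app.config) with a
        // RemotingAppender pointing at an unreachable sink such as
        // tcp://no-such-host:8085/LoggingSink.
        CallContext.SetData("MyData", "Hello World");

        // The appender throws internally; log4net swallows it...
        log.Debug("Destroying my call context :S");

        // ...but the CallContext has been cleared as a side effect.
        Console.WriteLine(CallContext.GetData("MyData") == null); // True
    }
}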
That's the reason why I propose to make the call to the
remoting appender asynchronous.
Logged In: YES
user_id=1166926
Apologies Daniel,
I misconstrued your original post. I thought that your bug report was a
continuation of this mail:
http://sourceforge.net/mailarchive/message.php?msg_id=6539061
Having just spent a very similar day fishing around and hoping your
patch would fix my similar issue, I was a 'tad' disappointed when it
didn't. Though that is not your fault, and next time I promise I will
RTFP.
For the record, I think we have two different issues here, and mine may
relate more to Thierry's problems.
My understanding is that yours is about data not getting passed back
from the remoting server, and you are correct: your patch does solve
that.
My problem occurs when I create a remoting channel server and client
application in one application, as described above with the locking.
I have now proved to myself that hosting it in a separate application
domain solves the issue.
http://www.gotdotnet.com/team/clr/AppdomainFAQ.aspx
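Roughly what I ended up doing (a sketch; MyLoggingSink stands in for
whatever sink implementation you host):

using System;
using System.Runtime.Remoting;
using System.Runtime.Remoting.Channels;
using System.Runtime.Remoting.Channels.Tcp;

static class IsolatedLoggingHost
{
    // Runs inside the child AppDomain: the sink's calls are serviced in
    // that domain, not in the client's, so they cannot re-enter the
    // client's locked appender.
    static void RegisterSink()
    {
        ChannelServices.RegisterChannel(new TcpChannel(8085));
        RemotingConfiguration.RegisterWellKnownServiceType(
            typeof(MyLoggingSink),   // hypothetical sink implementation
            "LoggingSink",
            WellKnownObjectMode.SingleCall);
    }

    public static void Start()
    {
        AppDomain serverDomain = AppDomain.CreateDomain("LoggingServer");
        serverDomain.DoCallBack(new CrossAppDomainDelegate(RegisterSink));
    }
}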
If it is any consolation, your original bug and solution had enough
pointers for me to resolve the problem. Thank you.
The big(?) question is: does the inability to host a client and a server
on the same box count as a bug?
Logged In: YES
user_id=222270
You're still missing one very important point. The problem is NOT that
data is passed back to the server. The problem is that the data EXISTING
just before the call to the logger gets completely REMOVED from the
CallContext as a consequence of the RemotingAppender exception. So in
effect the application can't continue to work if it relies on passing
values through the CallContext (even if NOT to the client, but among its
very own modules inside the SAME appdomain).
Note that the CallContext is not necessarily tied to remoting-only
scenarios (passing data back and forth with a server). You can also use
it INSIDE a single app domain to pass out-of-band data for your
framework to use. This was exactly our use case, and the one that
triggered this thread.
So, I still believe it's VERY important to fix this by applying my patch
(maybe together with yours?). It doesn't have any side effects other
than taking a thread from the pool just to perform the remoting call.
So, the REAL problem (once more, just to be clear) is that the
CallContext gets wrecked right after the call to the remoting appender
if it happens to throw a remoting exception. This should NOT happen, as
it means the application/runtime state is affected as a result of a
logging operation, which should NEVER be the case. Like I said, even if
the logging operation is just an Information message, the application
can't continue working unless the call to the remoting appender
succeeds.
Logged In: YES
user_id=1166926
OK, I think I am getting this...
On the client:

int x = 1;
log.Debug(x);

On the server:

void LogEvents(LoggingEvent[] evt)
{
    int x = int.Parse(evt[0].RenderedMessage);
    x = x + y;  // y not defined: error (or, in my case, a lock)
    m_sinkObj.LogEvents(evt);
}

Back on the client:

x = x + 2;

will not eval properly because x doesn't exist anymore.
If I am following this correctly, then it explains why moving
application domains solved the problem for me: the RemotingAppender no
longer errored, therefore no issue. If my server had failed for another
reason, then I would have been back to square one.
Thanks again for the time you have put in to explain this, as it is
important to understand the frailties of this type of architecture.
Logged In: YES
user_id=222270
Well, the use case I had was actually:

CallContext.SetData("MyData", "Hello World");
log.Debug("Destroying my call context :S");
object mydata = CallContext.GetData("MyData");
// mydata is NULL if the remoting appender fails to connect to the
// remoting server. Otherwise, it works OK.
Logged In: YES
user_id=278872
I suspect that you two are discussing different issues.
Daniel's issue with the destruction of the CallContext has been resolved
in the CVS version by delivering the remoting calls from a ThreadPool
worker. The CallContext is not flowed into a ThreadPool worker. I
suspect that this issue is caused by an error in the framework: a remote
call throwing an error should not destroy the local CallContext.
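A quick sketch that shows the behavior the fix relies on (on the current
framework; ILogicalThreadAffinative data may flow differently in future
runtime versions):

using System;
using System.Runtime.Remoting.Messaging;
using System.Threading;

class FlowCheck
{
    static void Main()
    {
        CallContext.SetData("MyData", "Hello World");
        ThreadPool.QueueUserWorkItem(new WaitCallback(Worker));
        Thread.Sleep(1000);  // crude wait for the worker, sketch only
    }

    static void Worker(object state)
    {
        // Prints True: the worker thread sees no "MyData" entry, so an
        // exception thrown here cannot touch the caller's CallContext.
        Console.WriteLine(CallContext.GetData("MyData") == null);
    }
}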
t0nt, if you have a deadlock problem then that is a separate issue we
should look at. Can you post the situation and error you are seeing on
the log4net-users list (to subscribe see
http://logging.apache.org/log4net/support.html). Also, if you have a
sample that demonstrates the issue, that would be most welcome.
Thanks.
----
Nicko Cadell