From: Vasiljevic Z. <zv...@ar...> - 2007-09-11 08:55:40
void
Ns_Fatal(CONST char *fmt, ...)
{
    va_list ap;

    va_start(ap, fmt);
    LogAdd(Fatal, fmt, ap);
    va_end(ap);

    _exit(1); /**** <<<<< WHY THIS ????? */
}

Why not abort()? This makes it very hard for us
to nail down problems such as:

nsd(17299,0xa000ed88) malloc: *** vm_allocate(size=471502848) failed (error code=3)
nsd(17299,0xa000ed88) malloc: *** error: can't allocate region
nsd(17299,0xa000ed88) malloc: *** set a breakpoint in szone_error to debug
[11/Sep/2007:09:28:43][17299.2684415368][-main-] Notice: unable to alloc 471499774
[11/Sep/2007:09:33:56][17299.2684415368][-main-] Fatal: unable to alloc 471499774 bytes

I recall Ns_Fatal bringing down the server and producing a core dump.
Now it just exits, and we have no clue where the fatal
message is coming from....
From: Stephen D. <sd...@gm...> - 2007-09-11 10:38:45
On 9/11/07, Vasiljevic Zoran <zv...@ar...> wrote:
> void
> Ns_Fatal(CONST char *fmt, ...)
> {
>     va_list ap;
>
>     va_start(ap, fmt);
>     LogAdd(Fatal, fmt, ap);
>     va_end(ap);
>
>     _exit(1); /**** <<<<< WHY THIS ????? */
> }
>
> Why not abort()?

It's called from many different places, and a clean shutdown is often
a reasonable thing to do..?

> This makes it very hard for us
> to nail down problems such as:
>
> nsd(17299,0xa000ed88) malloc: *** vm_allocate(size=471502848) failed (error code=3)
> nsd(17299,0xa000ed88) malloc: *** error: can't allocate region
> nsd(17299,0xa000ed88) malloc: *** set a breakpoint in szone_error to debug
> [11/Sep/2007:09:28:43][17299.2684415368][-main-] Notice: unable to alloc 471499774
> [11/Sep/2007:09:33:56][17299.2684415368][-main-] Fatal: unable to alloc 471499774 bytes
>
> I recall Ns_Fatal bringing down the server and producing a core dump.
> Now it just exits, and we have no clue where the fatal
> message is coming from....
>

void *
ns_malloc(size_t size)
{
    return ckalloc(size);
}

Tcl handles the checking for allocation failure and calls Tcl_Panic().

void
NsInitLog(void)
{
    Ns_MutexSetName(&lock, "ns:log");
    Ns_TlsAlloc(&tls, FreeCache);
    AddClbk(LogToFile, (void*)STDERR_FILENO, NULL);
    Tcl_SetPanicProc(Panic);
}

static void
Panic(CONST char *fmt, ...)
{
    va_list ap;

    va_start(ap, fmt);
    LogAdd(Fatal, fmt, ap);
    va_end(ap);

    abort();
}

If a memory routine failed, I would expect the process to abort...

(Hmm, maybe there should be a flush after that LogAdd()...?)
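The difference between the two termination paths discussed above can be observed from a parent process. The following is only a sketch, not code from the server: `die_with_exit`, `die_with_abort` and `run_in_child` are hypothetical names made up for illustration. It relies on two documented behaviours: `_exit(1)` ends the process with a normal exit status, while `abort()` raises SIGABRT, whose default action is to terminate the process and, where the system permits it, write a core file.

```c
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Terminates the way Ns_Fatal() does in the code quoted above. */
void die_with_exit(void)
{
    _exit(1);
}

/* Terminates the way Panic() does in the code quoted above. */
void die_with_abort(void)
{
    abort();
}

/*
 * Hypothetical helper: run fn in a forked child and store the raw
 * wait status in *status. Returns 0 on success, -1 if fork() or
 * waitpid() failed.
 */
int run_in_child(void (*fn)(void), int *status)
{
    pid_t pid = fork();

    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        fn();
        _exit(0); /* reached only if fn returns */
    }
    if (waitpid(pid, status, 0) < 0) {
        return -1;
    }
    return 0;
}
```

After `run_in_child(die_with_exit, &st)`, `WIFEXITED(st)` is true and `WEXITSTATUS(st)` is 1. After `run_in_child(die_with_abort, &st)`, `WIFSIGNALED(st)` is true and `WTERMSIG(st)` is `SIGABRT`. Whether a core file actually appears still depends on the core size limit (`ulimit -c`) in effect for the process.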
From: Vasiljevic Z. <zv...@ar...> - 2007-09-11 10:58:18
On 11.09.2007 at 12:38, Stephen Deasey wrote:

> If a memory routine failed, I would expect the process to abort...
>

Me too. That's why I was confused. If, however, some package
does not use the Tcl allocator but calls malloc directly,
we could end up in this situation. I will need to check this.

>
> (Hmm, maybe there should be a flush after that LogAdd()...?)

Strictly speaking, yes. But that does not help here.
We never got to that abort() call. That is the problem.

All right, now I see what the idea was. "Fatal" is just
another log level and should not bring down the server.
Panic() should bring down the server. But Panic()
is called only from within Tcl, and there is no way of
calling it from other modules. So how should modules
other than Tcl trigger a server dump when they detect
an inconsistent state?
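For a package that calls malloc directly, the same fail-hard behaviour can be approximated with a small wrapper. This is a minimal sketch under stated assumptions: `checked_malloc` is a hypothetical name, not a function from the server or from Tcl, and the message merely imitates the "unable to alloc" line in the log above.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical wrapper around malloc(): on allocation failure it
 * reports the requested size and aborts, so a core dump can be
 * produced at the point of failure. malloc(0) may legitimately
 * return NULL, so a zero-size request is not treated as an error.
 */
void *checked_malloc(size_t size)
{
    void *p = malloc(size);

    if (p == NULL && size > 0) {
        fprintf(stderr, "Fatal: unable to alloc %zu bytes\n", size);
        fflush(stderr); /* make sure the message is out before dying */
        abort();
    }
    return p;
}
```

Callers use it exactly like malloc and release the memory with free(). The point of the design is that the process stops inside the failing call, so the stack in the core file shows who asked for the memory.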
From: Stephen D. <sd...@gm...> - 2007-09-11 11:16:53
On 9/11/07, Vasiljevic Zoran <zv...@ar...> wrote:
>
> On 11.09.2007 at 12:38, Stephen Deasey wrote:
>
> > If a memory routine failed, I would expect the process to abort...
> >
>
> Me too. That's why I was confused. If, however, some package
> does not use the Tcl allocator but calls malloc directly,
> we could end up in this situation. I will need to check this.
>
> >
> > (Hmm, maybe there should be a flush after that LogAdd()...?)
>
> Strictly speaking, yes. But that does not help here.
> We never got to that abort() call. That is the problem.
>
> All right, now I see what the idea was. "Fatal" is just
> another log level and should not bring down the server.
> Panic() should bring down the server. But Panic()
> is called only from within Tcl, and there is no way of
> calling it from other modules. So how should modules
> other than Tcl trigger a server dump when they detect
> an inconsistent state?

Tcl_Panic is public. I think it's the right thing to call if you want
to abort with a message from some other module. We call Tcl_Panic
ourselves a few times.
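A module-side consistency check along these lines might look like the sketch below. To keep the example free of Tcl headers, `module_panic` is a hypothetical stand-in that only imitates the log-then-abort shape of Panic(); in a real module the call would presumably be Tcl_Panic(fmt, ...) instead. `check_refcount` and its condition are likewise invented for illustration. The explicit flush reflects the remark earlier in the thread: the C standard leaves it implementation-defined whether abort() flushes buffered output.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for Tcl_Panic(): print the message, flush
 * all output streams, then abort so a core dump can be produced.
 */
void module_panic(const char *fmt, ...)
{
    va_list ap;

    va_start(ap, fmt);
    fputs("Fatal: ", stderr);
    vfprintf(stderr, fmt, ap);
    fputc('\n', stderr);
    va_end(ap);

    fflush(NULL); /* abort() is not required to flush buffered output */
    abort();
}

/*
 * Hypothetical consistency check: returns the value unchanged when
 * it is valid and does not return at all when it is not.
 */
int check_refcount(int refcount)
{
    if (refcount < 0) {
        module_panic("inconsistent state: refcount=%d", refcount);
    }
    return refcount;
}
```

A valid value passes straight through, for example `check_refcount(3)` returns 3, while a negative value ends the process with SIGABRT at the point where the inconsistency was detected.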