From: David P G. <gr...@us...> - 2006-06-20 18:55:45
|
last night's regression run was not very pretty.

A number of new failures, many of which take the form of either assertion failures during GC stack scanning or segfaults during GC scanning of the heap (which could also be a symptom of stack scanning problems). If we're lucky, this all has the same root cause, so fixing the assertion problem could fix the other problems as well.

I'm fairly sure the switch to classpath 0.91 was not the problem, since I'd been testing that in the middle of last week and didn't see any problems. So some combination of GC changes + other changes is the most likely thing to look at. I'm going to go ahead and test this theory by running my exact system against 0.90 and 0.91, but unless I post something else assume it is not the classpath version that is the problem.

I'd like to suggest that people try to get these problems straightened out before we do much more forward development. With this last batch of failures I think we've tipped over to the point where the system is unstable enough that running regression tests against new functionality will not be sufficient to catch bugs, since new problems can be masked by the existing crashes.

--dave |
From: Steve B. <Ste...@an...> - 2006-06-20 21:07:22
|
Sorry Dave, I suspect you are right about the cause of the problem. I spent a lot of time testing the refactoring (two solid half days); I'm surprised I missed this. I'll commit to getting that bug out, hopefully by the end of today (in time for tomorrow's regressions). I don't think it will be too hard.

--Steve

David P Grove wrote:
> last night's regression run was not very pretty.
> [...]
>
> _______________________________________________
> Jikesrvm-core mailing list
> Jik...@li...
> https://lists.sourceforge.net/lists/listinfo/jikesrvm-core
|
From: Ian R. <ian...@ma...> - 2006-06-20 21:17:57
|
Hi,

Just to note that the regressions appear ok to me with 1 processor, but with -X:processors=2 I get the same failures. I've noticed that AWT/Swing isn't stable with -X:processors=2 too. There's also a problem with powersaving affecting RDTSC on AMD64x2 processors. I don't know if this can help with anything, but the traces look familiar to ones I've seen here running with >1 processor.

Thanks,

Ian

David P Grove wrote:
> last night's regression run was not very pretty.
> [...]
|
From: Steve B. <Ste...@an...> - 2006-06-20 21:39:50
|
Thanks Ian. If you can investigate that further that would be great.

When I saw Dave's bug report I realized that I had not tested with t3GT3. However, I did test the MMTk refactoring with processors=2 and it seemed very stable (on the benchmarks I was evaluating :-/). So there may be more than one problem here, but I'm sure Dave is right about the refactoring introducing a bug with native code. The handling of native code was one of the targets of the refactoring, so it seems highly suspicious.

--Steve

Ian Rogers wrote:
> Just to note that the regressions appear ok to me with 1 processor, but
> with -X:processors=2 I get the same failures.
> [...]
|
From: David P G. <gr...@us...> - 2006-06-21 13:12:19
|
Steve's last commit seems to have cleared most of the problems. So, it's probably ok to resume some forward development of "small" changes.

I'm still seeing assertion failures/crashes when running some of the dacapo benchmarks on a development image. They aren't 100% reproducible (run the tests again, and it moves from one benchmark to another), but I am seeing the stack dump below on 1 or 2 of the dacapo subtests on every run I do.

--dave

-- Stack --
Lcom/ibm/JikesRVM/VM; _assertionFailure(Ljava/lang/String;Ljava/lang/String;)V at line 577
Lcom/ibm/JikesRVM/VM; _assert(ZLjava/lang/String;Ljava/lang/String;)V at line 558
Lcom/ibm/JikesRVM/VM; _assert(Z)V at line 538
Lcom/ibm/JikesRVM/VM_CompiledMethods; getCompiledMethod(I)Lcom/ibm/JikesRVM/VM_CompiledMethod; at line 79
Lorg/mmtk/vm/ScanThread; setUpFrame(I)Z at line 357
Lorg/mmtk/vm/ScanThread; scanFrame(I)Lorg/vmmagic/unboxed/Address; at line 317
Lorg/mmtk/vm/ScanThread; scanThreadInternal(Lorg/vmmagic/unboxed/Address;I)V at line 240
Lorg/mmtk/vm/ScanThread; startScan(Lorg/mmtk/plan/TraceLocal;ZLcom/ibm/JikesRVM/VM_Thread;Lorg/vmmagic/unboxed/Address;Lorg/vmmagic/unboxed/Address;Lorg/vmmagic/unboxed/Address;Lorg/vmmagic/unboxed/Address;Lorg/vmmagic/unboxed/Address;)V at line 200
Lorg/mmtk/vm/ScanThread; scanThread(Lcom/ibm/JikesRVM/VM_Thread;Lorg/mmtk/plan/TraceLocal;ZLorg/vmmagic/unboxed/Address;Lorg/vmmagic/unboxed/Address;)V at line 166
Lorg/mmtk/vm/ScanThread; scanThread(Lcom/ibm/JikesRVM/VM_Thread;Lorg/mmtk/plan/TraceLocal;Z)V at line 131
Lorg/mmtk/vm/Scanning; computeAllRoots(Lorg/mmtk/plan/TraceLocal;)V at line 236
Lorg/mmtk/plan/StopTheWorldCollector; collectionPhase(IZ)V at line 82
Lorg/mmtk/plan/generational/GenCollector; collectionPhase(IZ)V at line 118
Lorg/mmtk/plan/generational/marksweep/GenMSCollector; collectionPhase(IZ)V at line 151
Lorg/mmtk/plan/SimplePhase; delegatePhase()V at line 123
Lorg/mmtk/plan/Phase; delegatePhase(Lorg/mmtk/plan/Phase;)V at line 154
Lorg/mmtk/plan/Phase; delegatePhase(I)V at line 140
Lorg/mmtk/plan/ComplexPhase; delegatePhase()V at line 95
Lorg/mmtk/plan/Phase; delegatePhase(Lorg/mmtk/plan/Phase;)V at line 154
Lorg/mmtk/plan/Phase; delegatePhase(I)V at line 140
Lorg/mmtk/plan/ComplexPhase; delegatePhase()V at line 95
Lorg/mmtk/plan/Phase; delegatePhase(Lorg/mmtk/plan/Phase;)V at line 154
Lorg/mmtk/plan/StopTheWorldCollector; collect()V at line 57
Lcom/ibm/JikesRVM/memoryManagers/mmInterface/VM_CollectorThread; run()V at line 342
Lcom/ibm/JikesRVM/VM_Thread; startoff()V at line 781 |
From: Eliot M. <mo...@cs...> - 2006-06-20 21:53:49
|
>>>>> "Steve" == Steve Blackburn <Ste...@an...> writes:

Steve> Thanks Ian. If you can investigate that further that would be great.
Steve> When I saw Dave's bug report I realized that I had not tested with
Steve> t3GT3.

Maybe I'm being dense, but I don't recognize "t3GT3". Could someone explain for me? Thanks -- Eliot |
From: Steve B. <Ste...@an...> - 2006-06-20 22:03:16
|
It is one of the JNI tests. I obviously should have run the JNI tests as part of my pre-commit regression.

$RVM_ROOT/src/examples/jni/t3GT3.java

--Steve

Eliot Moss wrote:
> Maybe I'm being dense, but I don't recognize "t3GT3". Could someone explain
> for me? Thanks -- Eliot
|
From: Steve B. <Ste...@an...> - 2006-06-21 04:12:18
|
Daniel and I have fixed this bug now. It was a simple result of some of the recent MMTk refactoring. :-/

Ian, it will be interesting to see whether you continue to see AWT/Swing problems...

--Steve

David P Grove wrote:
> last night's regression run was not very pretty.
> [...]

--
--Steve
Research Fellow, Australian National University
phone: +61 2 6125 4821 fax: +61 2 6125 0010
http://cs.anu.edu.au/~Steve.Blackburn
|
From: Ian R. <ian...@ma...> - 2006-06-21 10:57:16
|
Steve Blackburn wrote:
> Ian, it will be interesting to see whether you continue to see AWT/Swing
> problems...

Great work! Sorry to have misled things; the big multi-processor test I want to run is SpecJBB 2005. It works with 1 processor, but dies with more. With 4 processors on a 4 CPU Pentium 4 I get trace [1]. With Swing/AWT I tend to get frequent locking up even with modest 2 processor tests (even just something like ScribbleFrame). 2 processors will worry MMTk and ScribbleFrame as shown by trace [2]. The RDTSC problems worry MMTk on dual core AMD64x2 processors. If I can generate any more possibly enlightening traces (more like [1] than [2]) I'll post them.

Thanks,

Ian

[1] Traces generated with current CVS head, Classpath 0.91 and patch 1509601.

Exception in thread "VM_CollectorThread": java.lang.NullPointerException
   at org.mmtk.vm.ScanThread.scanThreadInternal(ScanThread.java:239)
   at org.mmtk.vm.ScanThread.startScan(ScanThread.java:200)
   at org.mmtk.vm.ScanThread.scanThread(ScanThread.java:166)
   at org.mmtk.vm.ScanThread.scanThread(ScanThread.java:131)
   at org.mmtk.vm.Scanning.computeAllRoots(Scanning.java:236)
   at org.mmtk.plan.StopTheWorldCollector.collectionPhase(StopTheWorldCollector.java:82)
   at org.mmtk.plan.generational.GenCollector.collectionPhase(GenCollector.java:118)
   at org.mmtk.plan.generational.marksweep.GenMSCollector.collectionPhase(GenMSCollector.java:151)
   at org.mmtk.plan.SimplePhase.delegatePhase(SimplePhase.java:123)
   at org.mmtk.plan.Phase.delegatePhase(Phase.java:154)
   at org.mmtk.plan.Phase.delegatePhase(Phase.java:140)
   at org.mmtk.plan.ComplexPhase.delegatePhase(ComplexPhase.java:95)
   at org.mmtk.plan.Phase.delegatePhase(Phase.java:154)
   at org.mmtk.plan.Phase.delegatePhase(Phase.java:140)
   at org.mmtk.plan.ComplexPhase.delegatePhase(ComplexPhase.java:95)
   at org.mmtk.plan.Phase.delegatePhase(Phase.java:154)
   at org.mmtk.plan.StopTheWorldCollector.collect(StopTheWorldCollector.java:57)
   at com.ibm.JikesRVM.memoryManagers.mmInterface.VM_CollectorThread.run(VM_CollectorThread.java:342)

vm internal error at:
-- Stack --
Lcom/ibm/JikesRVM/VM; sysFail(Ljava/lang/String;)V at line 1079
Lcom/ibm/JikesRVM/VM; _assertionFailure(Ljava/lang/String;Ljava/lang/String;)V at line 577
Lcom/ibm/JikesRVM/VM; _assert(ZLjava/lang/String;Ljava/lang/String;)V at line 558
Lcom/ibm/JikesRVM/VM; _assert(Z)V at line 538
Lcom/ibm/JikesRVM/VM_Thread; terminate()V at line 903
Lcom/ibm/JikesRVM/VM_Runtime; deliverException(Ljava/lang/Throwable;Lcom/ibm/JikesRVM/VM_Registers;)V at line 902
Lcom/ibm/JikesRVM/VM_Runtime; deliverHardwareException(II)V at line 659
<hardware trap>
Lorg/mmtk/vm/ScanThread; scanThreadInternal(Lorg/vmmagic/unboxed/Address;I)V at line 239
Lorg/mmtk/vm/ScanThread; startScan(Lorg/mmtk/plan/TraceLocal;ZLcom/ibm/JikesRVM/VM_Thread;Lorg/vmmagic/unboxed/Address;Lorg/vmmagic/unboxed/Address;Lorg/vmmagic/unboxed/Address;Lorg/vmmagic/unboxed/Address;Lorg/vmmagic/unboxed/Address;)V at line 200
Lorg/mmtk/vm/ScanThread; scanThread(Lcom/ibm/JikesRVM/VM_Thread;Lorg/mmtk/plan/TraceLocal;ZLorg/vmmagic/unboxed/Address;Lorg/vmmagic/unboxed/Address;)V at line 166
Lorg/mmtk/vm/ScanThread; scanThread(Lcom/ibm/JikesRVM/VM_Thread;Lorg/mmtk/plan/TraceLocal;Z)V at line 131
Lorg/mmtk/vm/Scanning; computeAllRoots(Lorg/mmtk/plan/TraceLocal;)V at line 236
Lorg/mmtk/plan/StopTheWorldCollector; collectionPhase(IZ)V at line 82
Lorg/mmtk/plan/generational/GenCollector; collectionPhase(IZ)V at line 118
Lorg/mmtk/plan/generational/marksweep/GenMSCollector; collectionPhase(IZ)V at line 151
Lorg/mmtk/plan/SimplePhase; delegatePhase()V at line 123
Lorg/mmtk/plan/Phase; delegatePhase(Lorg/mmtk/plan/Phase;)V at line 154
Lorg/mmtk/plan/Phase; delegatePhase(I)V at line 140
Lorg/mmtk/plan/ComplexPhase; delegatePhase()V at line 95
Lorg/mmtk/plan/Phase; delegatePhase(Lorg/mmtk/plan/Phase;)V at line 154
Lorg/mmtk/plan/Phase; delegatePhase(I)V at line 140
Lorg/mmtk/plan/ComplexPhase; delegatePhase()V at line 95
Lorg/mmtk/plan/Phase; delegatePhase(Lorg/mmtk/plan/Phase;)V at line 154
Lorg/mmtk/plan/StopTheWorldCollector; collect()V at line 57
Lcom/ibm/JikesRVM/memoryManagers/mmInterface/VM_CollectorThread; run()V at line 342
Lcom/ibm/JikesRVM/VM_Thread; startoff()V at line 781

[2]
Exception in thread "VM_CollectorThread":
ERROR: Suspiciously large offset of interior pointer from object base
object base = 0x4b00000c
interior reference = 0x00000000
offset = 0xb4fffff4
interior ref loc = 0x65028f90
--- Start Of Stack Scan ---

ERROR: Suspiciously large offset of interior pointer from object base
object base = 0x4b00000c
interior reference = 0x00000000
offset = 0xb4fffff4
interior ref loc = 0x65028f90
topFrame = 0x00000000
ip = 0x4b0eab24
fp = 0x57033320
registers.ip = 0x4b0eab24

* * dump of JNIEnvironment JniRefs Stack * *
* JNIRefs = 0x6502670c * JNIRefsTop = 0 * JNIRefsSavedFP = 0.
*
0 0x6502670c REF=NULL

* * end of dump * *

--- METHOD (BASELINE) Lcom/ibm/JikesRVM/memoryManagers/mmInterface/VM_CollectorThread;.run ()V --- fp = 0x57033320 code base = 0x4b0eaa10 code offset = 0x00000114
--- 20 words of stack frame with fp = 0x57033320
REF=0x570332f0 TIB=0x00000000 STATUS=0x470001dc (INVALID TIB: CLASS NOT ACCESSIBLE)
REF=0x4b0ff800 TIB=0x448b0024 STATUS=0x97ff0024
We got an uncaught exception while (recursively) handling 1 uncaught exception.
Exception in thread "VM_CollectorThread": |
From: Ian R. <ian...@ma...> - 2006-06-21 14:33:49
|
Here are a few more traces that I observe when running with processors >1 (btw: I've applied Andrew Dinn's Lock patch to improve the situation in getting through boot with >2 processors). [1] is typical of having many processors. [2] happens with >1 in AWT/Swing code. [3] happens intermittently with many processors.

Our threading code leads us to have special non-blocking (polling) IO code. I've observed this doesn't allow us to get as far with JDWP debugging as Classpath's regular blocking IO. I believe we're also behind normal Classpath VMs on Mauve tests because of it.

I thought it would be interesting to see if the behavior I see for AWT/Swing/SpecJBB05 went away with a NoGC plan; it did (except for [3]). It's quite fun watching JFreeChart charting memory usage as it consumes it, the impending sense of danger as the line plunges toward an out of memory error. I thought it'd be fun to try this for GCSpy but I couldn't build it (I'm not sure if we have enough infrastructure to run GCSpy on the Jikes RVM).

Regards,

Ian

[1]
Exception in thread "VM_CollectorThread":
GC Warning: Barrier wait has reached 3.01 seconds. Called from 4200. myOrder = 0 count is 1 waiting for 2
GC Warning: Barrier wait has reached 6.03 seconds. Called from 4200. myOrder = 0 count is 1 waiting for 2
GC Warning: Barrier wait has reached 9.04 seconds. Called from 4200. myOrder = 0 count is 1 waiting for 2
GC Warning: Barrier wait has reached 12.07 seconds. Called from 4200. myOrder = 0 count is 1 waiting for 2

[2]
Exception in thread "VM_CollectorThread": We got an uncaught exception while (recursively) handling 1 uncaught exception.
Exception in thread "VM_CollectorThread": java.lang.NullPointerException
   at org.mmtk.vm.ScanThread.scanThreadInternal(ScanThread.java:239)
   at org.mmtk.vm.ScanThread.startScan(ScanThread.java:200)
   at org.mmtk.vm.ScanThread.scanThread(ScanThread.java:166)
   at org.mmtk.vm.ScanThread.scanThread(ScanThread.java:131)
   at org.mmtk.vm.Scanning.computeAllRoots(Scanning.java:236)
   at org.mmtk.plan.StopTheWorldCollector.collectionPhase(StopTheWorldCollector.java:82)
   at org.mmtk.plan.generational.GenCollector.collectionPhase(GenCollector.java:118)
   at org.mmtk.plan.generational.marksweep.GenMSCollector.collectionPhase(GenMSCollector.java:151)
   at org.mmtk.plan.SimplePhase.delegatePhase(SimplePhase.java:123)
   at org.mmtk.plan.Phase.delegatePhase(Phase.java:154)
   at org.mmtk.plan.Phase.delegatePhase(Phase.java:140)
   at org.mmtk.plan.ComplexPhase.delegatePhase(ComplexPhase.java:95)
   at org.mmtk.plan.Phase.delegatePhase(Phase.java:154)
   at org.mmtk.plan.Phase.delegatePhase(Phase.java:140)
   at org.mmtk.plan.ComplexPhase.delegatePhase(ComplexPhase.java:95)
   at org.mmtk.plan.Phase.delegatePhase(Phase.java:154)
   at org.mmtk.plan.StopTheWorldCollector.collect(StopTheWorldCollector.java:57)
   at com.ibm.JikesRVM.memoryManagers.mmInterface.VM_CollectorThread.run(VM_CollectorThread.java:342)

ERROR: Suspiciously large offset of interior pointer from object base
object base = 0x4b00000c
interior reference = 0x00000000
offset = 0xb4fffff4
interior ref loc = 0x65032f90
--- Start Of Stack Scan ---

ERROR: Suspiciously large offset of interior pointer from object base
object base = 0x4b00000c
interior reference = 0x00000000
offset = 0xb4fffff4
interior ref loc = 0x65032f90
topFrame = 0x00000000
ip = 0x4b0eb034
fp = 0x5706d110
registers.ip = 0x4b0eb034

* * dump of JNIEnvironment JniRefs Stack * *
* JNIRefs = 0x6502a70c * JNIRefsTop = 0 * JNIRefsSavedFP = 0.
*
0 0x6502a70c REF=NULL

* * end of dump * *

--- METHOD (BASELINE) Lcom/ibm/JikesRVM/memoryManagers/mmInterface/VM_CollectorThread;.run ()V --- fp = 0x5706d110 code base = 0x4b0eaf20 code offset = 0x00000114
--- 20 words of stack frame with fp = 0x5706d110
REF=0x5706d0e0 TIB=0x00000000 STATUS=0x470001dc (INVALID TIB: CLASS NOT ACCESSIBLE)
REF=0x4b0ffd10 TIB=0x448b0024 STATUS=0x97ff0024
We got an uncaught exception while (recursively) handling 2 uncaught exceptions.
Exception in thread "VM_CollectorThread":

vm internal error at: java.lang.NullPointerException
-- Stack --
Lcom/ibm/JikesRVM/VM; sysFail(Ljava/lang/String;)V at line 1079
Lcom/ibm/JikesRVM/VM; _assertionFailure(Ljava/lang/String;Ljava/lang/String;)V at line 577
Lcom/ibm/JikesRVM/VM; _assert(ZLjava/lang/String;Ljava/lang/String;)V at line 558
Lcom/ibm/JikesRVM/VM; _assert(Z)V at line 538
Lcom/ibm/JikesRVM/VM_Thread; terminate()V at line 903
Lcom/ibm/JikesRVM/VM_Runtime; deliverException(Ljava/lang/Throwable;Lcom/ibm/JikesRVM/VM_Registers;)V at line 902
Lcom/ibm/JikesRVM/VM_Runtime; deliverHardwareException(II)V at line 659
<hardware trap>
Lorg/mmtk/vm/ScanThread; scanThreadInternal(Lorg/vmmagic/unboxed/Address;I)V at line 239
Lorg/mmtk/vm/ScanThread; startScan(Lorg/mmtk/plan/TraceLocal;ZLcom/ibm/JikesRVM/VM_Thread;Lorg/vmmagic/unboxed/Address;Lorg/vmmagic/unboxed/Address;Lorg/vmmagic/unboxed/Address;Lorg/vmmagic/unboxed/Address;Lorg/vmmagic/unboxed/Address;)V at line 200
Lorg/mmtk/vm/ScanThread; scanThread(Lcom/ibm/JikesRVM/VM_Thread;Lorg/mmtk/plan/TraceLocal;ZLorg/vmmagic/unboxed/Address;Lorg/vmmagic/unboxed/Address;)V at line 166
Lorg/mmtk/vm/ScanThread; scanThread(Lcom/ibm/JikesRVM/VM_Thread;Lorg/mmtk/plan/TraceLocal;Z)V at line 131
Lorg/mmtk/vm/Scanning; computeAllRoots(Lorg/mmtk/plan/TraceLocal;)V at line 236
Lorg/mmtk/plan/StopTheWorldCollector; collectionPhase(IZ)V at line 82
Lorg/mmtk/plan/generational/GenCollector; collectionPhase(IZ)V at line 118
Lorg/mmtk/plan/generational/marksweep/GenMSCollector; collectionPhase(IZ)V at line 151
Lorg/mmtk/plan/SimplePhase; delegatePhase()V at line 123
Lorg/mmtk/plan/Phase; delegatePhase(Lorg/mmtk/plan/Phase;)V at line 154
Lorg/mmtk/plan/Phase; delegatePhase(I)V at line 140
Lorg/mmtk/plan/ComplexPhase; delegatePhase()V at line 95
Lorg/mmtk/plan/Phase; delegatePhase(Lorg/mmtk/plan/Phase;)V at line 154
Lorg/mmtk/plan/Phase; delegatePhase(I)V at line 140
Lorg/mmtk/plan/ComplexPhase; delegatePhase()V at line 95
Lorg/mmtk/plan/Phase; delegatePhase(Lorg/mmtk/plan/Phase;)V at line 154
Lorg/mmtk/plan/StopTheWorldCollector; collect()V at line 57
Lcom/ibm/JikesRVM/memoryManagers/mmInterface/VM_CollectorThread; run()V at line 342
Lcom/ibm/JikesRVM/VM_Thread; startoff()V at line 781

[3]
JikesRVM: TROUBLE. Got a signal (Segmentation fault; #11) from outside the VM's address space.
JikesRVM: UNRECOVERABLE trapped signal 11 (Segmentation fault)
handler stack 0x0806814c
si->si_addr 0x00000000
gs 0x00000033
fs 0x00000000
es 0xc010007b
ds 0x0000007b
edi -- JTOC? 0x470001dc
esi -- PR/VP 0x0805fd64
ebp -- FP? 0x5b0fff6c
esp -- SP 0x5b0fff3c
ebx 0x40019548
edx -- T1? 0x00000000
ecx -- S0? 0x40686c1c
eax -- T0? 0xfffffffe
trapno 0x0000000e
err 0x00000004
eip 0x4001809e
cs 0x00000073
eflags 0x00210206
esp_at_signal 0x5b0fff3c
ss 0x0000007b
fpstate 0x08068264
oldmask 0x00020000
cr2 0x00000000
fp0 0x00000000000000000000
fp1 0x00000000000000000000
fp2 0x00000000000000000000
fp3 0x00000000000000000000
fp4 0x00000000000000000000
fp5 0x00000000ffffffffc01d
fp6 0x00000000000080003fff
fp7 0x000000000000a8004003
JikesRVM: internal error
invalid vp address (not an address - high nibble 0) |
From: Ian R. <ian...@ma...> - 2006-06-21 14:46:18
|
[3] is happening in libsyswrap. There must be a race condition when there are many threads. I'll try to solve it.

Ian

> [3]
> JikesRVM: TROUBLE. Got a signal (Segmentation fault; #11) from outside
> the VM's address space.
> JikesRVM: UNRECOVERABLE trapped signal 11 (Segmentation fault)
> [...]
|
From: Ian R. <ian...@ma...> - 2006-06-22 07:37:08
|
With regard to the AWT/Swing lock ups with >1 processor, it seems to me that there must be a thread (possibly a daemon thread) that is listening for the AWT/Swing events. What I observe is that I don't seem to be able to get the events through, although often the application will carry on merrily working.

I was wondering if anyone had some code to check if all threads are getting scheduled in some sensible interval? ie a check called from the boot image runner's processTimerTick that checks all VM_Threads' totalCycles are increasing. My hypothesis is that some threads are getting pinged between the VM_Processors but never scheduled, or that a VM_Thread is asleep on some VM_Processor that's given up trying to schedule it. It'd be nice in such a case to get a dump of all the processors and threads, then to reason about why it broke.

Thanks,

Ian

Ian Rogers wrote:
> Great work! Sorry to have mislead things, the big multi-processor test I
> want to run is SpecJBB 2005. It works with 1 processor, but dies with
> more.
> [...]
|
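[Editorial note: the progress check Ian asks for can be sketched in isolation. processTimerTick and VM_Thread.totalCycles are the Jikes RVM names from the mail; the SchedulingWatchdog class, its tick method, and the thread-id-to-cycles map below are hypothetical stand-ins, not real RVM code.]

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Standalone model of the watchdog Ian describes: on each timer tick
// (cf. the boot image runner's processTimerTick), compare every thread's
// cycle counter (cf. VM_Thread.totalCycles) with the value recorded at
// the previous tick, and report any thread that has made no progress.
class SchedulingWatchdog {
    private final Map<Integer, Long> lastSeen = new HashMap<>();

    /** Returns ids of threads whose cycle count did not advance since the last tick. */
    public List<Integer> tick(Map<Integer, Long> totalCycles) {
        List<Integer> stalled = new ArrayList<>();
        for (Map.Entry<Integer, Long> e : totalCycles.entrySet()) {
            Long prev = lastSeen.get(e.getKey());
            if (prev != null && e.getValue() <= prev) {
                stalled.add(e.getKey()); // never scheduled between ticks
            }
            lastSeen.put(e.getKey(), e.getValue());
        }
        return stalled;
    }
}
```

A real hook would, on a non-empty stalled list, dump all the VM_Processors and their thread queues as Ian suggests.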
From: Ian R. <ian...@ma...> - 2006-06-22 08:29:38
|
Hi Steve, I've "fixed" the number [2] stack trace. It seems the NPE is occurring when scanning the collector thread. By making the scanning avoid scanning the collector thread then the NPE is avoided. I added: if (thread instanceof VM_CollectorThread) continue; after line 234 of MMTk/ext/vm/JikesRVM/org/mmtk/vm/Scanning.java Not scanning the collector thread strikes me as potentially dangerous, but I thought this information maybe able to give you a better way to fix the problem. I find I get trace [2] by just setting the processors to a high value (say 8) on bytecodeTests. Regards, Ian > [2] > Exception in thread "VM_CollectorThread": We got an uncaught exception > while (recursively) handling 1 uncaught exception. > Exception in thread "VM_CollectorThread": java.lang.NullPointerException > at org.mmtk.vm.ScanThread.scanThreadInternal(ScanThread.java:239) > at org.mmtk.vm.ScanThread.startScan(ScanThread.java:200) > at org.mmtk.vm.ScanThread.scanThread(ScanThread.java:166) > at org.mmtk.vm.ScanThread.scanThread(ScanThread.java:131) > at org.mmtk.vm.Scanning.computeAllRoots(Scanning.java:236) > at > org.mmtk.plan.StopTheWorldCollector.collectionPhase(StopTheWorldCollector.java:82) > at > org.mmtk.plan.generational.GenCollector.collectionPhase(GenCollector.java:118) > at > org.mmtk.plan.generational.marksweep.GenMSCollector.collectionPhase(GenMSCollector.java:151) > at org.mmtk.plan.SimplePhase.delegatePhase(SimplePhase.java:123) > at org.mmtk.plan.Phase.delegatePhase(Phase.java:154) > at org.mmtk.plan.Phase.delegatePhase(Phase.java:140) > at org.mmtk.plan.ComplexPhase.delegatePhase(ComplexPhase.java:95) > at org.mmtk.plan.Phase.delegatePhase(Phase.java:154) > at org.mmtk.plan.Phase.delegatePhase(Phase.java:140) > at org.mmtk.plan.ComplexPhase.delegatePhase(ComplexPhase.java:95) > at org.mmtk.plan.Phase.delegatePhase(Phase.java:154) > at > org.mmtk.plan.StopTheWorldCollector.collect(StopTheWorldERCROR: > Suspiociously larlge offset ofl interior peointer fromc object 
base > object base = 0x4b00000c > interior reference = 0x00000000 > offset = 0xb4fffff4 > interior ref loc = 0x65032f90 > --- Start Of Stack Scan --- > > ERROR: Suspiciously large offset of interior pointer from object base > object base = 0x4b00000c > interior reference = 0x00000000 > offset = 0xb4fffff4 > interior ref loc = 0x65032f90 > topFrame = 0x00000000 > ip = 0x4b0eb034 > fp = 0x5706d110 > registers.ip = 0x4b0eb034 > > * * dump of JNIEnvironment JniRefs Stack * * > * JNIRefs = 0x6502a70c * JNIRefsTop = 0 * JNIRefsSavedFP = 0. > * > 0 0x6502a70c REF=NULL > > * * end of dump * * > > --- METHOD (BASELINE) > Lcom/ibm/JikesRVM/memoryManagers/mmInterface/VM_CollectorThread;.run ()V > --- fp = 0x5706d110 code base = 0x4b0eaf20 code offset = 0x00000114 > --- 20 words of stack frame with fp = 0x5706d110 > REF=0x5706d0e0 TIB=0x00000000 STATUS=0x470001dc (INVALID TIB: CLASS NOT > ACCESSIBLE) > REF=0x4b0ffd10 TIB=0x448b0024 STATUS=0x97ff0024 > We got an uncaught exception while (recursively) handling 2 uncaught exceptions. 
> Exception in thread "VM_CollectorThread": java.lang.NullPointerException > at org.mmtk.plan.StopTheWorldCollector.collect(StopTheWorldCollector.java:57) > at > com.ibm.JikesRVM.memoryManagers.mmInterface.VM_CollectorThread.run(VM_CollectorThread.java:342) > vm internal error at: java.lang.NullPointerException > -- Stack -- > > Lcom/ibm/JikesRVM/VM; sysFail(Ljava/lang/String;)V at > line 1079 > Lcom/ibm/JikesRVM/VM; > _assertionFailure(Ljava/lang/String;Ljava/lang/String;)V at line 577 > Lcom/ibm/JikesRVM/VM; > _assert(ZLjava/lang/String;Ljava/lang/String;)V at line 558 > Lcom/ibm/JikesRVM/VM; _assert(Z)V at line 538 > Lcom/ibm/JikesRVM/VM_Thread; terminate()V at line 903 > Lcom/ibm/JikesRVM/VM_Runtime; > deliverException(Ljava/lang/Throwable;Lcom/ibm/JikesRVM/VM_Registers;)V > at line 902 > Lcom/ibm/JikesRVM/VM_Runtime; deliverHardwareException(II)V at > line 659 > <hardware trap> > Lorg/mmtk/vm/ScanThread; > scanThreadInternal(Lorg/vmmagic/unboxed/Address;I)V at line 239 > Lorg/mmtk/vm/ScanThread; > startScan(Lorg/mmtk/plan/TraceLocal;ZLcom/ibm/JikesRVM/VM_Thread;Lorg/vmmagic/unboxed/Address;Lorg/vmmagic/unboxed/Address;Lorg/vmmagic/unboxed/Address;Lorg/vmmagic/unboxed/Address;Lorg/vmmagic/unboxed/Address;)V > at line 200 > Lorg/mmtk/vm/ScanThread; scanThread(Lcom/ibm/JikesRVM/VM_Thread;Lorg/mmtk/plan/TraceLocal;ZLorg/vmmagic/unboxed/Address;Lorg/vmmagic/unboxed/Address;)V > at line 166 > Lorg/mmtk/vm/ScanThread; > scanThread(Lcom/ibm/JikesRVM/VM_Thread;Lorg/mmtk/plan/TraceLocal;Z)V > at line 131 > Lorg/mmtk/vm/Scanning; > computeAllRoots(Lorg/mmtk/plan/TraceLocal;)V at line 236 > Lorg/mmtk/plan/StopTheWorldCollector; collectionPhase(IZ)V at > line 82 > Lorg/mmtk/plan/generational/GenCollector; collectionPhase(IZ)V > at line 118 > Lorg/mmtk/plan/generational/marksweep/GenMSCollector; > collectionPhase(IZ)V at line 151 > Lorg/mmtk/plan/SimplePhase; delegatePhase()V at line 123 > Lorg/mmtk/plan/Phase; delegatePhase(Lorg/mmtk/plan/Phase;)V at > 
line 154 > Lorg/mmtk/plan/Phase; delegatePhase(I)V at line 140 > Lorg/mmtk/plan/ComplexPhase; delegatePhase()V at line 95 > Lorg/mmtk/plan/Phase; delegatePhase(Lorg/mmtk/plan/Phase;)V at line 154 > Lorg/mmtk/plan/Phase; delegatePhase(I)V at line 140 > Lorg/mmtk/plan/ComplexPhase; delegatePhase()V at line 95 > Lorg/mmtk/plan/Phase; delegatePhase(Lorg/mmtk/plan/Phase;)V at > line 154 > Lorg/mmtk/plan/StopTheWorldCollector; collect()V at line 57 > Lcom/ibm/JikesRVM/memoryManagers/mmInterface/VM_CollectorThread; > run()V at line 342 > Lcom/ibm/JikesRVM/VM_Thread; startoff()V at line 781 |
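As an aside for anyone reading the dump above: the "suspiciously large offset" is exactly what 32-bit wrap-around produces when the interior reference is NULL, which lines up with the NullPointerException. A minimal sketch of the arithmetic, using the values from the dump (variable names are mine, not from the scanning code):

```java
public class OffsetCheck {
    public static void main(String[] args) {
        int objectBase = 0x4b00000c;   // "object base" from the dump
        int interiorRef = 0x00000000;  // "interior reference" is NULL
        // 32-bit subtraction wraps mod 2^32, producing the huge offset
        int offset = interiorRef - objectBase;
        // prints offset = 0xb4fffff4, the value reported in the dump
        System.out.printf("offset = 0x%08x%n", offset);
    }
}
```

So the error message is a secondary symptom: the scan computed an offset from a NULL interior pointer rather than a genuinely out-of-range one.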
From: Steve B. <Ste...@an...> - 2006-06-22 11:40:41
|
Thanks very much Ian. This helps. It seems that we did not properly fix the bug the other day. :-/ We'll get it fixed. We thought we had such an elegant refactoring of this work. We also thought we'd tested it well. Hmmmmm... I smell some freshly baked humble pie... --Steve Ian Rogers wrote: > Hi Steve, > > I've "fixed" the number [2] stack trace. It seems the NPE is occurring > when scanning the collector thread. By making the scanning avoid > scanning the collector thread, the NPE is avoided. I added: > > if (thread instanceof VM_CollectorThread) continue; > > after line 234 of MMTk/ext/vm/JikesRVM/org/mmtk/vm/Scanning.java > Not scanning the collector thread strikes me as potentially dangerous, > but I thought this information may be able to give you a better way to > fix the problem. I find I get trace [2] by just setting the processors > to a high value (say 8) on bytecodeTests. > > Regards, > > Ian > > >> [2] >> Exception in thread "VM_CollectorThread": We got an uncaught exception >> while (recursively) handling 1 uncaught exception. 
>> Exception in thread "VM_CollectorThread": java.lang.NullPointerException >> at org.mmtk.vm.ScanThread.scanThreadInternal(ScanThread.java:239) >> at org.mmtk.vm.ScanThread.startScan(ScanThread.java:200) >> at org.mmtk.vm.ScanThread.scanThread(ScanThread.java:166) >> at org.mmtk.vm.ScanThread.scanThread(ScanThread.java:131) >> at org.mmtk.vm.Scanning.computeAllRoots(Scanning.java:236) >> at >> org.mmtk.plan.StopTheWorldCollector.collectionPhase(StopTheWorldCollector.java:82) >> at >> org.mmtk.plan.generational.GenCollector.collectionPhase(GenCollector.java:118) >> at >> org.mmtk.plan.generational.marksweep.GenMSCollector.collectionPhase(GenMSCollector.java:151) >> at org.mmtk.plan.SimplePhase.delegatePhase(SimplePhase.java:123) >> at org.mmtk.plan.Phase.delegatePhase(Phase.java:154) >> at org.mmtk.plan.Phase.delegatePhase(Phase.java:140) >> at org.mmtk.plan.ComplexPhase.delegatePhase(ComplexPhase.java:95) >> at org.mmtk.plan.Phase.delegatePhase(Phase.java:154) >> at org.mmtk.plan.Phase.delegatePhase(Phase.java:140) >> at org.mmtk.plan.ComplexPhase.delegatePhase(ComplexPhase.java:95) >> at org.mmtk.plan.Phase.delegatePhase(Phase.java:154) >> at >> org.mmtk.plan.StopTheWorldCollector.collect(StopTheWorldCollector.java:57) >> ERROR: >> Suspiciously large offset of interior pointer from object base >> object base = 0x4b00000c >> interior reference = 0x00000000 >> offset = 0xb4fffff4 >> interior ref loc = 0x65032f90 >> --- Start Of Stack Scan --- >> >> ERROR: Suspiciously large offset of interior pointer from object base >> object base = 0x4b00000c >> interior reference = 0x00000000 >> offset = 0xb4fffff4 >> interior ref loc = 0x65032f90 >> topFrame = 0x00000000 >> ip = 0x4b0eb034 >> fp = 0x5706d110 >> registers.ip = 0x4b0eb034 >> >> * * dump of JNIEnvironment JniRefs Stack * * >> * JNIRefs = 0x6502a70c * JNIRefsTop = 0 * JNIRefsSavedFP = 0. 
>> * >> 0 0x6502a70c REF=NULL >> >> * * end of dump * * >> >> --- METHOD (BASELINE) >> Lcom/ibm/JikesRVM/memoryManagers/mmInterface/VM_CollectorThread;.run ()V >> --- fp = 0x5706d110 code base = 0x4b0eaf20 code offset = 0x00000114 >> --- 20 words of stack frame with fp = 0x5706d110 >> REF=0x5706d0e0 TIB=0x00000000 STATUS=0x470001dc (INVALID TIB: CLASS NOT >> ACCESSIBLE) >> REF=0x4b0ffd10 TIB=0x448b0024 STATUS=0x97ff0024 >> We got an uncaught >> exception while (recursively) handling 2 uncaught exceptions. >> Exception in thread "VM_CollectorThread": java.lang.NullPointerException >> at org.mmtk.plan.StopTheWorldCollector.collect(StopTheWorldCollector.java:57) >> at >> com.ibm.JikesRVM.memoryManagers.mmInterface.VM_CollectorThread.run(VM_CollectorThread.java:342) >> vm internal error at: >> java.lang.NullPointerException >> -- Stack -- >> >> Lcom/ibm/JikesRVM/VM; sysFail(Ljava/lang/String;)V at >> line 1079 >> Lcom/ibm/JikesRVM/VM; >> _assertionFailure(Ljava/lang/String;Ljava/lang/String;)V at line 577 >> Lcom/ibm/JikesRVM/VM; >> _assert(ZLjava/lang/String;Ljava/lang/String;)V at line 558 >> Lcom/ibm/JikesRVM/VM; _assert(Z)V at line 538 >> Lcom/ibm/JikesRVM/VM_Thread; terminate()V at line 903 >> Lcom/ibm/JikesRVM/VM_Runtime; >> deliverException(Ljava/lang/Throwable;Lcom/ibm/JikesRVM/VM_Registers;)V >> at line 902 >> Lcom/ibm/JikesRVM/VM_Runtime; deliverHardwareException(II)V at >> line 659 >> <hardware trap> >> Lorg/mmtk/vm/ScanThread; >> scanThreadInternal(Lorg/vmmagic/unboxed/Address;I)V at line 239 >> Lorg/mmtk/vm/ScanThread; >> startScan(Lorg/mmtk/plan/TraceLocal;ZLcom/ibm/JikesRVM/VM_Thread;Lorg/vmmagic/unboxed/Address;Lorg/vmmagic/unboxed/Address;Lorg/vmmagic/unboxed/Address;Lorg/vmmagic/unboxed/Address;Lorg/vmmagic/unboxed/Address;)V >> at line 200 >> Lorg/mmtk/vm/ScanThread; scanThread(Lcom/ibm/JikesRVM/ >> VM_Thread;Lorg/mmtk/plan/TraceLocal;ZLorg/vmmagic/unboxed/Address;Lorg/vmmagic/unboxed/Address;)V >> at line 166 >> Lorg/mmtk/vm/ScanThread; >> 
scanThread(Lcom/ibm/JikesRVM/VM_Thread;Lorg/mmtk/plan/TraceLocal;Z)V >> at line 131 >> Lorg/mmtk/vm/Scanning; >> computeAllRoots(Lorg/mmtk/plan/TraceLocal;)V at line 236 >> Lorg/mmtk/plan/StopTheWorldCollector; collectionPhase(IZ)V at >> line 82 >> Lorg/mmtk/plan/generational/GenCollector; collectionPhase(IZ)V >> at line 118 >> Lorg/mmtk/plan/generational/marksweep/GenMSCollector; >> collectionPhase(IZ)V at line 151 >> Lorg/mmtk/plan/SimplePhase; delegatePhase()V at line 123 >> Lorg/mmtk/plan/Phase; delegatePhase(Lorg/mmtk/plan/Phase;)V at >> line 154 >> Lorg/mmtk/plan/Phase; delegatePhase(I)V at line 140 >> Lorg/mmtk/plan/ComplexPhase; delegatePhase()V at line 95 >> Lorg/mmtk/plan/Phase; delegatePhase >> (Lorg/mmtk/plan/Phase;)V at line 154 >> Lorg/mmtk/plan/Phase; delegatePhase(I)V at line 140 >> Lorg/mmtk/plan/ComplexPhase; delegatePhase()V at line 95 >> Lorg/mmtk/plan/Phase; delegatePhase(Lorg/mmtk/plan/Phase;)V at >> line 154 >> Lorg/mmtk/plan/StopTheWorldCollector; collect()V at line 57 >> Lcom/ibm/JikesRVM/memoryManagers/mmInterface/VM_CollectorThread;. >> run()V at line 342 >> Lcom/ibm/JikesRVM/VM_Thread; startoff()V at line 781 >> > > |
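For readers following the thread: Ian's workaround is an instanceof filter in the loop that enumerates threads for root scanning. A self-contained sketch of the pattern, with stand-in types rather than the actual Jikes RVM/MMTk classes (the real loop is in org.mmtk.vm.Scanning.computeAllRoots):

```java
import java.util.ArrayList;
import java.util.List;

// Stand-ins for VM_Thread and VM_CollectorThread
class VMThread {
    final String name;
    VMThread(String n) { name = n; }
}

class CollectorThread extends VMThread {
    CollectorThread(String n) { super(n); }
}

public class ScanDemo {
    // Returns the names of the threads that would actually be scanned
    static List<String> computeAllRoots(VMThread[] threads) {
        List<String> scanned = new ArrayList<>();
        for (VMThread t : threads) {
            if (t == null) continue;                  // empty slot in the thread table
            if (t instanceof CollectorThread) continue; // Ian's guard: skip collector threads
            scanned.add(t.name);
        }
        return scanned;
    }

    public static void main(String[] args) {
        VMThread[] threads = {
            new VMThread("main"), new CollectorThread("gc-0"),
            null, new VMThread("worker")
        };
        System.out.println(computeAllRoots(threads)); // prints [main, worker]
    }
}
```

As Ian says, simply not scanning the collector thread is dangerous if that thread's stack can hold live references; the sketch only illustrates where the guard sits, not whether it is the right fix.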
From: David P G. <gr...@us...> - 2006-06-22 14:55:33
|
I was able to reproduce crashes this morning fairly easily via RunSanityTests. One thing that might make it easier is that I'm seeing a higher frequency of crashes in a prototype image than in a development one. This rules out the opt compiler as a source of problems and also allows faster turnaround time. I talked to Daniel and he says they plan to debug via cvs replay from a checked-out version of the system. So additional cvs commits are not likely to cause them a problem in debugging. However, it would probably be a good idea to hold off on changes to MMTk, or "large" changes to the rest of the system that require serious testing, until this gets cleared up. Daniel didn't think they needed a complete freeze on checkins, so go ahead and commit things, but proceed carefully and nothing too exciting.... --dave |
From: Steve B. <Ste...@an...> - 2006-06-23 07:04:52
|
Hi Dave, Daniel and I think we've fixed the problems and have restarted the regressions. While we were doing this, we noticed a fairly savage bug that shows up with numprocs 8. This seems to kill the VM very early on, pretty brutally. That bug predates our changes (we tested it against a snapshot from just before the 0.91 work). Cheers, --Steve David P Grove wrote: > I was able to reproduce crashes this morning fairly easily via > RunSanityTests. One thing that might make it easier is that I'm seeing a > higher frequency of crashes in a prototype image than in a development one. > This rules out the opt compiler as a source of problems and also allows > faster turnaround time. > > I talked to Daniel and he says they plan to debug via cvs replay from a > checked-out version of the system. So additional cvs commits are not > likely to cause them a problem in debugging. However, it would probably > be a good idea to hold off on changes to MMTk, or "large" changes to the > rest of the system that require serious testing, until this gets cleared > up. Daniel didn't think they needed a complete freeze on checkins, so go > ahead and commit things, but proceed carefully and nothing too > exciting.... > > --dave > -- --Steve Research Fellow, Australian National University phone: +61 2 6125 4821 fax: +61 2 6125 0010 http://cs.anu.edu.au/~Steve.Blackburn |
From: Steve B. <Ste...@an...> - 2006-06-23 07:22:50
|
I have (just) applied Andrew Dinn's fix. I thought I'd already done that. I haven't applied the syswrap patch. I can do so if you would prefer me to. --Steve Ian Rogers wrote: > Hi Steve, > > did you apply the syswrap patch and Andrew Dinn's patch to make the > initial org.mmtk.vm.Lock.SLOW_THRESHOLD smaller? > > Ian > > Steve Blackburn wrote: > >> Hi Dave, >> >> Daniel and I think we've fixed the problems and have restarted the >> regressions. >> >> While we were doing this, we noticed a fairly savage bug that shows up >> with numprocs 8. This seems to kill the VM very early on, pretty >> brutally. That bug predates our changes (we tested it against a >> snapshot from just before the 0.91 work). >> >> Cheers, >> >> --Steve >> >> > > -- --Steve Research Fellow, Australian National University phone: +61 2 6125 4821 fax: +61 2 6125 0010 http://cs.anu.edu.au/~Steve.Blackburn |
From: Ian R. <ian...@ma...> - 2006-06-23 07:29:18
|
Steve Blackburn wrote: > I have (just) applied Andrew Dinn's fix. I thought I'd already done that. > > I haven't applied the syswrap patch. I can do so if you would prefer me to. > > --Steve Short answer, yes :-) The syswrap patch fixes a problem that shows up when we have many pthreads: our rewrite of pthread_lock (to a version that won't block and instead calls java.lang.Thread.yield) was sometimes calling Thread.yield when JNI wasn't available. This only happens with a large number of threads. The result of the bug was a SEGV - you can't trace this bug through gdb as it uses a different thread library. Thanks, Ian |
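To make the failure mode Ian describes concrete: a lock slow path that yields by calling back into the VM must first check that the VM can service the call, otherwise it crashes. A toy illustration of that guard, with entirely invented names (the actual fix is whatever the syswrap patch does in native code, not this sketch):

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class LockDemo {
    // Stand-in for "is JNI / the VM available yet?"
    static final AtomicBoolean vmFullyBooted = new AtomicBoolean(false);

    // Slow path taken when a lock is contended
    static String slowPathYield() {
        if (!vmFullyBooted.get()) {
            // Safe fallback: spin rather than call into a VM that
            // cannot service the call yet (the SEGV in the bug report)
            return "spin";
        }
        Thread.yield(); // only yield once the runtime can handle it
        return "yield";
    }

    public static void main(String[] args) {
        System.out.println(slowPathYield()); // prints spin
        vmFullyBooted.set(true);
        System.out.println(slowPathYield()); // prints yield
    }
}
```

The point of the sketch is only the ordering constraint: with many pthreads the contended slow path is hit before boot completes, which is why the bug needs a large thread count to show up.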
From: Ian R. <ian...@ma...> - 2006-06-23 07:10:16
|
Hi Steve, did you apply the syswrap patch and Andrew Dinn's patch to make the initial org.mmtk.vm.Lock.SLOW_THRESHOLD smaller? Ian Steve Blackburn wrote: > Hi Dave, > > Daniel and I think we've fixed the problems and have restarted the > regressions. > > While we were doing this, we noticed a fairly savage bug that shows up > with numprocs 8. This seems to kill the VM very early on, pretty > brutally. That bug predates our changes (we tested it against a > snapshot from just before the 0.91 work). > > Cheers, > > --Steve > |