#82 waitForCancelled and waitForAnnounced can cause bad behavior

closed-fixed
networking (37)
5
2014-11-06
2011-01-04
No

In areas of shaky WiFi coverage where the WiFi goes up and down a lot, we sometimes get into a state where one thread is waiting for a cancel and another thread is waiting for an announce, but neither ever succeeds.

This is particularly bad because the waitForCanceled and waitForAnnounced functions are spinning while loops, which can easily peg the CPU and bring the system to a halt.

Is there any plan to remove the blocking code and replace it with an event-driven model? That way, if something gets into a bad state, only JmDNS would be affected and not the whole machine. Spinning while loops are a bad thing.
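To illustrate the kind of non-spinning wait being asked for, here is a minimal sketch. It assumes a hypothetical AnnounceGate class that is not part of JmDNS: the waiter parks on a java.util.concurrent.CountDownLatch with a timeout rather than polling in a loop, so a wait that never completes costs no CPU and eventually returns.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a stateful object; not JmDNS code. */
class AnnounceGate {
    private final CountDownLatch announced = new CountDownLatch(1);

    /** Called by whichever task completes the announce step. */
    void markAnnounced() {
        announced.countDown();
    }

    /**
     * Parks the calling thread until announced or until the timeout expires.
     * Returns false on timeout so the caller can decide what to do next.
     */
    boolean awaitAnnounced(long timeoutMillis) throws InterruptedException {
        return announced.await(timeoutMillis, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        AnnounceGate gate = new AnnounceGate();
        // Nobody has announced yet, so this wait times out.
        System.out.println("announced before signal: " + gate.awaitAnnounced(50));
        // Signal from another thread; the next wait returns once it has run.
        new Thread(gate::markAnnounced).start();
        System.out.println("announced after signal: " + gate.awaitAnnounced(1000));
    }
}
```

This only covers the CPU usage side of the problem. A bounded wait still has to be followed by some recovery decision when it returns false.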

I'm still looking into this. It's really hard to reproduce, and I'm not sure what event actually triggers this behavior. It seems that calling recoverTask from any other step will cause this problem: it tries to cancel, but cancelling runs on the same stateTimer. The Prober ends up waiting for the canceller, but the canceller can never start because the Prober is blocking the stateTimer thread.
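The suspected mechanism can be reproduced in isolation. This is a sketch using a plain java.util.Timer, not the JmDNS classes: the class name, the method name, and the "prober"/"canceller" roles are illustrative assumptions. A Timer runs all its tasks on one thread, so a task that waits for a second task queued on the same Timer cannot see it run until the first task returns.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Illustrative only: mimics a task waiting on another task of the same Timer. */
class SameTimerWait {

    /** Returns true if the "canceller" ran while the "prober" was still waiting. */
    static boolean cancellerRanDuringWait(long waitMillis) throws InterruptedException {
        Timer stateTimer = new Timer("State.Timer", true);
        CountDownLatch canceled = new CountDownLatch(1);
        CountDownLatch proberDone = new CountDownLatch(1);
        boolean[] sawCancel = new boolean[1];

        // "Prober": queues the canceller on its own timer, then waits for it.
        stateTimer.schedule(new TimerTask() {
            @Override public void run() {
                stateTimer.schedule(new TimerTask() {
                    @Override public void run() { canceled.countDown(); }
                }, 0);
                try {
                    // The timer thread is busy right here, so the canceller cannot start.
                    sawCancel[0] = canceled.await(waitMillis, TimeUnit.MILLISECONDS);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                proberDone.countDown();
            }
        }, 0);

        proberDone.await();
        stateTimer.cancel();
        return sawCancel[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("canceller ran during wait: " + cancellerRanDuringWait(200));
    }
}
```

The bounded wait in this sketch always times out. With an unbounded or spinning wait in its place, the first task would never return and the timer thread would be stuck, which matches the behavior described above.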

These are stack traces from the misbehaving threads:

"JmDNS(192-168-1-3).State.Timer" prio=1 tid=13 RUNNABLE
| group="main" sCount=1 dsCount=0 s=Y obj=0x47cccdf8 self=0x254ac8
| sysTid=4176 nice=19 sched=0/0 cgrp=bg_non_interactive handle=2753408
| schedstat=( 150118100032 192908634375 146559 )
at java.util.concurrent.locks.AbstractQueuedSynchronizer.getState(AbstractQueuedSynchronizer.java:~497)
at java.util.concurrent.locks.ReentrantLock$Sync.nonfairTryAcquire(ReentrantLock.java:106)
at java.util.concurrent.locks.ReentrantLock$NonfairSync.tryAcquire(ReentrantLock.java:189)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireNanos(AbstractQueuedSynchronizer.java:1199)
at java.util.concurrent.locks.ReentrantLock.tryLock(ReentrantLock.java:415)
at javax.jmdns.impl.DNSStatefulObject$DefaultImplementation.waitForCanceled(DNSStatefulObject.java:266)
at javax.jmdns.impl.HostInfo.waitForCanceled(HostInfo.java:395)
at javax.jmdns.impl.JmDNSImpl.waitForCanceled(JmDNSImpl.java:456)
at javax.jmdns.impl.JmDNSImpl.recover(JmDNSImpl.java:1369)
at javax.jmdns.impl.tasks.state.Prober.recoverTask(Prober.java:144)
at javax.jmdns.impl.tasks.state.DNSStateTask.run(DNSStateTask.java:145)
at java.util.Timer$TimerImpl.run(Timer.java:289)

"SyncService worker" prio=1 tid=8 RUNNABLE
| group="main" sCount=1 dsCount=0 s=Y obj=0x47c950a8 self=0x261808
| sysTid=3099 nice=19 sched=0/0 cgrp=bg_non_interactive handle=2573336
| schedstat=( 209572919166 231530661529 214000 )
at java.util.concurrent.locks.AbstractQueuedSynchronizer.getState(AbstractQueuedSynchronizer.java:~497)
at java.util.concurrent.locks.ReentrantLock$Sync.nonfairTryAcquire(ReentrantLock.java:106)
at java.util.concurrent.locks.ReentrantLock$NonfairSync.tryAcquire(ReentrantLock.java:189)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireNanos(AbstractQueuedSynchronizer.java:1199)
at java.util.concurrent.locks.ReentrantLock.tryLock(ReentrantLock.java:415)
at javax.jmdns.impl.DNSStatefulObject$DefaultImplementation.waitForAnnounced(DNSStatefulObject.java:233)
at javax.jmdns.impl.ServiceInfoImpl.waitForAnnounced(ServiceInfoImpl.java:929)
at javax.jmdns.impl.JmDNSImpl.registerService(JmDNSImpl.java:793)
at com.doubleTwist.sync.SyncService.startZeroconf(SyncService.java:566)
at com.doubleTwist.sync.SyncService.startServer(SyncService.java:604)
at com.doubleTwist.sync.SyncService.startOrStopServer(SyncService.java:493)
at com.doubleTwist.sync.SyncService.access$800(SyncService.java:79)
at com.doubleTwist.sync.SyncService$SyncHandler.handleMessage(SyncService.java:1025)
at android.os.Handler.dispatchMessage(Handler.java:99)
at android.os.Looper.loop(Looper.java:123)
at com.doubleTwist.androidPlayer.MusicUtils$BackgroundWorker.run(MusicUtils.java:1842)
at java.lang.Thread.run(Thread.java:1096)

Discussion

  • Pierre Frisch

    Pierre Frisch - 2011-01-20

    Integrated.

     
  • SourceForge Robot

    This Tracker item was closed automatically by the system. It was
    previously set to a Pending status, and the original submitter
    did not respond within 14 days (the time period specified by
    the administrator of this Tracker).

     
  • SourceForge Robot

    • status: pending-fixed --> closed-fixed
     