Thread: [irda-users] IrCOMM trouble: ircomm_tty_discovery_indication() implementation question

Brought to you by: claudiuc, jean2, sambau

irda-users

[irda-users] IrCOMM trouble: ircomm_tty_discovery_indication() implementation question

From: Ryan R. <ry...@gm...> - 2007-05-25 01:54:45

All,

Kernel version: 2.6.19

I've been having some connection issues with IrCOMM.  It seems that a
new ircomm_tty_cb object is created and cached in a hashbin every time
a new /dev/ircomm? device is open.  I start seeing problems after more
than one ircomm device node has been opened by the application (i.e.
/dev/ircomm0 and /dev/ircomm1).  After some analysis of the problem it
seems that I if I update the implementation of
ircomm_tty_attach.c:ircomm_tty_discovery_indication() to just use the
object passed back to the callback via the priv variable, the problems
seem to go away.

Is there a reason that this callback loops through the objects in the
hash_bin and processes the IRCOMM_TTY_DISCOVERY_INDICATION event for
each object rather than just using the structure passed into the
callback?  It doesn't seem that the search is necessary, but I
certainly don't understand the code well enough to assert that.

Here is basically what I have done.  (Sorry, not at my Linux box right
now, so I don't have a diff.)  Can somebody tell me what the drawbacks
of doing this would be?

352 /*
353  * Function ircomm_tty_discovery_indication (discovery)
354  *
355  *    Remote device is discovered, try query the remote IAS to see which
356  *    device it is, and which services it has.
357  *
358  */
359 static void ircomm_tty_discovery_indication(discinfo_t *discovery,
360                                             DISCOVERY_MODE mode,
361                                             void *priv)
362 {
363         struct ircomm_tty_cb *self;
364         struct ircomm_tty_info info;
365
366         IRDA_DEBUG(2, "%s()\n", __FUNCTION__ );
367
368         /* Important note :
369          * We need to drop all passive discoveries.
370          * The LSAP management of IrComm is deficient and doesn't deal
371          * with the case of two instance connecting to each other
372          * simultaneously (it will deadlock in LMP).
373          * The proper fix would be to use the same technique as in IrNET,
374          * to have one server socket and separate instances for the
375          * connecting/connected socket.
376          * The workaround is to drop passive discovery, which drastically
377          * reduce the probability of this happening.
378          * Jean II */
379         if(mode == DISCOVERY_PASSIVE)
380                 return;
381
382         info.daddr = discovery->daddr;
383         info.saddr = discovery->saddr;
384
--->         self = (struct ircomm_tty_cb *) priv;
--->         ircomm_tty_do_event(self, IRCOMM_TTY_DISCOVERY_INDICATION,
--->                                     NULL, &info);
--->         #if 0
385         /* FIXME. We have a locking problem on the hashbin here.
386          * We probably need to use hashbin_find_next(), but we first
387          * need to ensure that "line" is unique. - Jean II */
388         self = (struct ircomm_tty_cb *) hashbin_get_first(ircomm_tty);
389         while (self != NULL) {
390                 IRDA_ASSERT(self->magic == IRCOMM_TTY_MAGIC, return;);
391
392                 ircomm_tty_do_event(self, IRCOMM_TTY_DISCOVERY_INDICATION,
393                                     NULL, &info);
394
395                 self = (struct ircomm_tty_cb *)
hashbin_get_next(ircomm_tty);
396         }
--->         #endif
397 }

Thanks for any info on why the current function is implemented the way
it is and if my change is reasonable or not.

-- Ryan

Re: [irda-users] IrCOMM trouble: ircomm_tty_discovery_indication() implementation question

From: Samuel O. <sa...@so...> - 2007-05-28 00:59:54

Hi Ryan,

On Thu, May 24, 2007 at 09:54:45PM -0400, Ryan Reading wrote:
> All,
> 
> Kernel version: 2.6.19
> 
> I've been having some connection issues with IrCOMM.  It seems that a
> new ircomm_tty_cb object is created and cached in a hashbin every time
> a new /dev/ircomm? device is open.  I start seeing problems after more
> than one ircomm device node has been opened by the application (i.e.
> /dev/ircomm0 and /dev/ircomm1).  After some analysis of the problem it
> seems that I if I update the implementation of
> ircomm_tty_attach.c:ircomm_tty_discovery_indication() to just use the
> object passed back to the callback via the priv variable, the problems
> seem to go away.
> 
> Is there a reason that this callback loops through the objects in the
> hash_bin and processes the IRCOMM_TTY_DISCOVERY_INDICATION event for
> each object rather than just using the structure passed into the
> callback?  It doesn't seem that the search is necessary, but I
> certainly don't understand the code well enough to assert that.
Well, it seems to me that you're right. However, could you describe more
precisely the problem you get with the current code ?

Cheers,
Samuel.


> Here is basically what I have done.  (Sorry, not at my Linux box right
> now, so I don't have a diff.)  Can somebody tell me what the drawbacks
> of doing this would be?
> 
> 352 /*
> 353  * Function ircomm_tty_discovery_indication (discovery)
> 354  *
> 355  *    Remote device is discovered, try query the remote IAS to see which
> 356  *    device it is, and which services it has.
> 357  *
> 358  */
> 359 static void ircomm_tty_discovery_indication(discinfo_t *discovery,
> 360                                             DISCOVERY_MODE mode,
> 361                                             void *priv)
> 362 {
> 363         struct ircomm_tty_cb *self;
> 364         struct ircomm_tty_info info;
> 365
> 366         IRDA_DEBUG(2, "%s()\n", __FUNCTION__ );
> 367
> 368         /* Important note :
> 369          * We need to drop all passive discoveries.
> 370          * The LSAP management of IrComm is deficient and doesn't deal
> 371          * with the case of two instance connecting to each other
> 372          * simultaneously (it will deadlock in LMP).
> 373          * The proper fix would be to use the same technique as in IrNET,
> 374          * to have one server socket and separate instances for the
> 375          * connecting/connected socket.
> 376          * The workaround is to drop passive discovery, which drastically
> 377          * reduce the probability of this happening.
> 378          * Jean II */
> 379         if(mode == DISCOVERY_PASSIVE)
> 380                 return;
> 381
> 382         info.daddr = discovery->daddr;
> 383         info.saddr = discovery->saddr;
> 384
> --->         self = (struct ircomm_tty_cb *) priv;
> --->         ircomm_tty_do_event(self, IRCOMM_TTY_DISCOVERY_INDICATION,
> --->                                     NULL, &info);
> --->         #if 0
> 385         /* FIXME. We have a locking problem on the hashbin here.
> 386          * We probably need to use hashbin_find_next(), but we first
> 387          * need to ensure that "line" is unique. - Jean II */
> 388         self = (struct ircomm_tty_cb *) hashbin_get_first(ircomm_tty);
> 389         while (self != NULL) {
> 390                 IRDA_ASSERT(self->magic == IRCOMM_TTY_MAGIC, return;);
> 391
> 392                 ircomm_tty_do_event(self, IRCOMM_TTY_DISCOVERY_INDICATION,
> 393                                     NULL, &info);
> 394
> 395                 self = (struct ircomm_tty_cb *)
> hashbin_get_next(ircomm_tty);
> 396         }
> --->         #endif
> 397 }
> 
> Thanks for any info on why the current function is implemented the way
> it is and if my change is reasonable or not.
> 
> -- Ryan
> 
> -------------------------------------------------------------------------
> This SF.net email is sponsored by DB2 Express
> Download DB2 Express C - the FREE version of DB2 express and take
> control of your XML. No limits. Just data. Click to get it now.
> http://sourceforge.net/powerbar/db2/
> _______________________________________________
> irda-users mailing list
> ird...@li...
> http://lists.sourceforge.net/lists/listinfo/irda-users

Re: [irda-users] IrCOMM trouble: ircomm_tty_discovery_indication() implementation question

From: Ryan R. <ry...@gm...> - 2007-05-28 21:18:07

Samuel,

Thanks for the response.

I have application code that essentially does the following (this is
pseudo-code):

fd = open( "/dev/ircomm0", O_NONBLOCK );
if( fd != -1 )
{
  if( select( fd, 5_sec ) ) /* Wait for port to be ready for data */
    return SUCCESS;
}

fd = open( "/dev/ircomm1", O_NONBLOCK );
if( fd != -1 )
{
  if( select( fd, 5_sec ) ) /* Wait for port to be ready for data */
    return SUCCESS;
}

... and so on.

As long as this code always finds the device on /dev/ircomm0, then
this code works fine.  If /dev/ircomm0 can't be opened for whatever
reason, /dev/ircomm1 will be opened as well.  This creates 2 entries
in the ircomm_tty hashbin.

If there is only one entry in the hashbin, everything works without
problem.  As soon as there are two entries in the hashbin, this code
stops working and it won't work again until ircomm is unloaded and
reloaded.

If there are two entries, the open() call will succeed, but the
select() call will timeout.  At the protocol level, the CONN_CMD is
sent twice. I believe the CONN_CMDs have different SLSAPs.  A response
to the first CONN_CMD is received *after* the second CONN_CMD is sent,
but then a disconnect response is sent and the connection is never
actually established.

The only difference I could see between the working case and the
broken case was the two CONN_CMD being sent rather than just one.
After digging into the code, I found the multiple CONN_CMD were being
sent b/c of the ircomm_tty_discovery_indication() implementation.  I
changed the implementation to what I have below, and the problems I
was having seem to go away.  I still have more testing to do over the
next week or so, but I wanted to get a sanity check from some experts
because I'm obviously speculating on the validity of the patch.

-- Ryan


On 5/27/07, Samuel Ortiz <sa...@so...> wrote:
> Hi Ryan,
>
> On Thu, May 24, 2007 at 09:54:45PM -0400, Ryan Reading wrote:
> > All,
> >
> > Kernel version: 2.6.19
> >
> > I've been having some connection issues with IrCOMM.  It seems that a
> > new ircomm_tty_cb object is created and cached in a hashbin every time
> > a new /dev/ircomm? device is open.  I start seeing problems after more
> > than one ircomm device node has been opened by the application (i.e.
> > /dev/ircomm0 and /dev/ircomm1).  After some analysis of the problem it
> > seems that I if I update the implementation of
> > ircomm_tty_attach.c:ircomm_tty_discovery_indication() to just use the
> > object passed back to the callback via the priv variable, the problems
> > seem to go away.
> >
> > Is there a reason that this callback loops through the objects in the
> > hash_bin and processes the IRCOMM_TTY_DISCOVERY_INDICATION event for
> > each object rather than just using the structure passed into the
> > callback?  It doesn't seem that the search is necessary, but I
> > certainly don't understand the code well enough to assert that.
> Well, it seems to me that you're right. However, could you describe more
> precisely the problem you get with the current code ?
>
> Cheers,
> Samuel.
>
>
> > Here is basically what I have done.  (Sorry, not at my Linux box right
> > now, so I don't have a diff.)  Can somebody tell me what the drawbacks
> > of doing this would be?
> >
> > 352 /*
> > 353  * Function ircomm_tty_discovery_indication (discovery)
> > 354  *
> > 355  *    Remote device is discovered, try query the remote IAS to see which
> > 356  *    device it is, and which services it has.
> > 357  *
> > 358  */
> > 359 static void ircomm_tty_discovery_indication(discinfo_t *discovery,
> > 360                                             DISCOVERY_MODE mode,
> > 361                                             void *priv)
> > 362 {
> > 363         struct ircomm_tty_cb *self;
> > 364         struct ircomm_tty_info info;
> > 365
> > 366         IRDA_DEBUG(2, "%s()\n", __FUNCTION__ );
> > 367
> > 368         /* Important note :
> > 369          * We need to drop all passive discoveries.
> > 370          * The LSAP management of IrComm is deficient and doesn't deal
> > 371          * with the case of two instance connecting to each other
> > 372          * simultaneously (it will deadlock in LMP).
> > 373          * The proper fix would be to use the same technique as in IrNET,
> > 374          * to have one server socket and separate instances for the
> > 375          * connecting/connected socket.
> > 376          * The workaround is to drop passive discovery, which drastically
> > 377          * reduce the probability of this happening.
> > 378          * Jean II */
> > 379         if(mode == DISCOVERY_PASSIVE)
> > 380                 return;
> > 381
> > 382         info.daddr = discovery->daddr;
> > 383         info.saddr = discovery->saddr;
> > 384
> > --->         self = (struct ircomm_tty_cb *) priv;
> > --->         ircomm_tty_do_event(self, IRCOMM_TTY_DISCOVERY_INDICATION,
> > --->                                     NULL, &info);
> > --->         #if 0
> > 385         /* FIXME. We have a locking problem on the hashbin here.
> > 386          * We probably need to use hashbin_find_next(), but we first
> > 387          * need to ensure that "line" is unique. - Jean II */
> > 388         self = (struct ircomm_tty_cb *) hashbin_get_first(ircomm_tty);
> > 389         while (self != NULL) {
> > 390                 IRDA_ASSERT(self->magic == IRCOMM_TTY_MAGIC, return;);
> > 391
> > 392                 ircomm_tty_do_event(self, IRCOMM_TTY_DISCOVERY_INDICATION,
> > 393                                     NULL, &info);
> > 394
> > 395                 self = (struct ircomm_tty_cb *)
> > hashbin_get_next(ircomm_tty);
> > 396         }
> > --->         #endif
> > 397 }
> >
> > Thanks for any info on why the current function is implemented the way
> > it is and if my change is reasonable or not.
> >
> > -- Ryan
> >
> > -------------------------------------------------------------------------
> > This SF.net email is sponsored by DB2 Express
> > Download DB2 Express C - the FREE version of DB2 express and take
> > control of your XML. No limits. Just data. Click to get it now.
> > http://sourceforge.net/powerbar/db2/
> > _______________________________________________
> > irda-users mailing list
> > ird...@li...
> > http://lists.sourceforge.net/lists/listinfo/irda-users
>

Re: [irda-users] IrCOMM trouble: ircomm_tty_discovery_indication() implementation question

From: Samuel O. <sa...@so...> - 2007-06-02 00:56:44

On Mon, May 28, 2007 at 05:18:08PM -0400, Ryan Reading wrote:
> If there is only one entry in the hashbin, everything works without
> problem.  As soon as there are two entries in the hashbin, this code
> stops working and it won't work again until ircomm is unloaded and
> reloaded.
> 
> If there are two entries, the open() call will succeed, but the
> select() call will timeout.  
You mean the 2nd open(), right ?

> At the protocol level, the CONN_CMD is
> sent twice. I believe the CONN_CMDs have different SLSAPs.
I believe this is an IAS connection request, since this is what the 2
IrCOMM instances are supposed to do after having been notified of a
discovery event: checking for the peer's IrCOMM support.

> The only difference I could see between the working case and the
> broken case was the two CONN_CMD being sent rather than just one.
> After digging into the code, I found the multiple CONN_CMD were being
> sent b/c of the ircomm_tty_discovery_indication() implementation.  I
> changed the implementation to what I have below, and the problems I
> was having seem to go away.
I quickly tried to reproduce this bug with 2 IrDA dongles, and only one of them
seeing an IrCOMM capable phone. As expected both of them were sending
the IAS request, but I didn't see any further issues. I will try again
this weekend, but meanwhile could you do the following on your side:

1) echo 5 > /proc/sys/net/irda/debug
2) Run your application until you hit the failing condition
3) echo 0 > /proc/sys/net/irda/debug
4) Send me the corresponding dmesg output.


> I still have more testing to do over the
> next week or so, but I wanted to get a sanity check from some experts
> because I'm obviously speculating on the validity of the patch.
The patch looks OK to me.
The current code looks valid as well, even though not optimal. So, I'm
trying to understand why it fails on you.

Cheers,
Samuel.

Re: [irda-users] IrCOMM trouble: ircomm_tty_discovery_indication() implementation question

From: Samuel O. <sa...@so...> - 2007-07-01 22:34:01

Hi Ryan,

On Sat, Jun 02, 2007 at 03:56:39AM +0300, Samuel Ortiz wrote:
> > I still have more testing to do over the
> > next week or so, but I wanted to get a sanity check from some experts
> > because I'm obviously speculating on the validity of the patch.
> The patch looks OK to me.
> The current code looks valid as well, even though not optimal. So, I'm
> trying to understand why it fails on you.
Have you got more time to test your patch ?
It works fine for me, but I'd like to double check with you before
pushing it upstream.

Cheers,
Samuel.


> Cheers,
> Samuel.
> 
> 
> -------------------------------------------------------------------------
> This SF.net email is sponsored by DB2 Express
> Download DB2 Express C - the FREE version of DB2 express and take
> control of your XML. No limits. Just data. Click to get it now.
> http://sourceforge.net/powerbar/db2/
> _______________________________________________
> irda-users mailing list
> ird...@li...
> http://lists.sourceforge.net/lists/listinfo/irda-users

Re: [irda-users] IrCOMM trouble: ircomm_tty_discovery_indication() implementation question

From: Ryan R. <ry...@gm...> - 2007-07-03 01:04:16

Samuel,

I've done ad hoc testing and have been using it day to day and it has
worked very well for me.  I wanted to send debug traces back to the
list but have been swamped at work.  When I get time I'll try and do
some more formal testing.  Thanks!

-- Ryan


On 7/1/07, Samuel Ortiz <sa...@so...> wrote:
> Hi Ryan,
>
> On Sat, Jun 02, 2007 at 03:56:39AM +0300, Samuel Ortiz wrote:
> > > I still have more testing to do over the
> > > next week or so, but I wanted to get a sanity check from some experts
> > > because I'm obviously speculating on the validity of the patch.
> > The patch looks OK to me.
> > The current code looks valid as well, even though not optimal. So, I'm
> > trying to understand why it fails on you.
> Have you got more time to test your patch ?
> It works fine for me, but I'd like to double check with you before
> pushing it upstream.
>
> Cheers,
> Samuel.
>
>
> > Cheers,
> > Samuel.
> >
> >
> > -------------------------------------------------------------------------
> > This SF.net email is sponsored by DB2 Express
> > Download DB2 Express C - the FREE version of DB2 express and take
> > control of your XML. No limits. Just data. Click to get it now.
> > http://sourceforge.net/powerbar/db2/
> > _______________________________________________
> > irda-users mailing list
> > ird...@li...
> > http://lists.sourceforge.net/lists/listinfo/irda-users
>