Re: [mh] Insteon links and scenes clue request (and a bit of rambling )


Quoting Brian Warren (1/24/08 12:59 AM):
> On Wednesday 23 January 2008 05:35:47 pm Gregg Liming wrote:
> 
>> ... because there were additional bugs that I didn't test for.  So, what
>> I've done is:
>>
>> 1) Change my primary keypadlinc "load" from an IPLD (which is the only
>> way it used to possibly work) to an IPLL.  This better simulates what
>> you've been trying to do w/ your non-load SLs.
>> 2) unlinked and then (re-) linked my new IPLL w/ group num 01.
>> 3) unlinked and then (re-) linked button #7 (group 07) on the same
>> keypadlinc.
>>
>> Of course, I did this after discovering and fixing the bugs.  On the
>> plus side, the code seems to do reasonably well even when partial links
>> (i.e., only the device side or plm side) exist as I managed to trash
>> things a bit while resolving the problems.
>>
> 
> Currently running revision 1314.
> I have things a bit screwed up myself.  I factory defaulted the $patio_sl 

I'm hoping that you deliberately wanted to factory default the patio_sl.
I've personally been very hesitant to do so as I was always able to
find another way to do things.  Definitely, I've been able to clean up
orphaned links (see below) w/o resorting to this.  So, it's good "for
the cause" that you've done so as it reveals additional problems; I just
wouldn't recommend it.

> before remembering that mh did not yet sync the links between devices. 

I'll hopefully get around to testing this soon and removing the constraint.

> The 
> intent was to test the familyrm_recessed_scene from scratch.

Actually, I was hoping that you would continue testing as you had been,
to confirm that "link to interface" and "unlink with interface" were
now working for you.  But, on to sync links.

> I ended up 
> with an unknown device in the link table.

I've been somewhat slow to fully test all of the sync links because I
didn't want to leave orphaned links about.  So, that caused me to add
the remove orphaned links function attached to the plm.  It does not
yet catch every possible case (as the number of possible orphan
permutations across all device combinations is very large); but, it
does a great deal.  And, the scope of its application continues to
increase.  It's very conservative; so, I would suggest using it instead
of factory resets.

> Here is the link table prior to reset:

[...snip...]

> And after factory defaulting and scanning the links:
> 01/23/08 23:16:43 [Insteon_Device] link table for $patio_sl (devcat: 0101):
> 01/23/08 23:16:43 [Insteon_Device] $patio_sl adlb [0x0FF8] is empty

... which is correct.

> I also ran 'unlink with interface' which cleaned up the plm side of the link 
> to patio_sl.

Ahh... ok, so you did use a part of the unlink/link to interface.  Good.

> Next I ran 'sync links' for the scene.
> 
> 01/23/08 23:18:46 menu_run: g=mh m=Insteon i=16 s=4 => action: familyrm 
> recessed scene 'sync links'
> 01/23/08 23:18:46 Running: familyrm recessed scene sync links
> 01/23/08 23:18:46 [Insteon_Device] adding link record $patio_sl light level 
> controlled by $plm and group: 50 with on level: 100 and ramp rate: 0.1
> 01/23/08 23:18:46 [Insteon_Device] $patio_sl address: 0FF8 found for device: 
> 093836 and group: 50
> 01/23/08 23:18:46 [Insteon_Device] adding link record $entry_kp_B light level 
> controlled by $plm and group: 50 with on level: 100 and ramp rate: 0.1
> 01/23/08 23:18:46 [Insteon_Device] WARN: $entry_kp_B write_link failure: no 
> address could be found for device: 093836 and group: 50 and is_controller: 0

I see what is going on; but, I'll have to give some thought to the fix.
Up till now, I've been somewhat conservative in testing, adding only one
scene member at a time--which works for me.  Currently, a new spare
address is added only at the end of an add.  Since the first add hasn't
completed yet, the second scene member fails because no spare address
is known.

Until I get this resolved, you should be able to build up scenes
incrementally: run sync links repeatedly, waiting until each run fully
completes, and allow the runs to partially fail until the last member
is added.  I'll let you know once I have a fix for this.
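
To illustrate the workaround, here is a minimal Python sketch.  It is
not mh code: sync_once is a hypothetical stand-in for running 'sync
links' and waiting for the run to finish, and one_at_a_time is a toy
model that assumes only one member gets linked per run.

```python
def sync_until_complete(members, sync_once, max_runs=10):
    """Run sync_once repeatedly until every scene member is linked.

    sync_once is a hypothetical callable: it attempts to link all pending
    members, returns only after the run has fully completed, and reports
    the set of members that were actually linked in that run.
    """
    pending = list(members)
    runs = 0
    while pending and runs < max_runs:
        linked = sync_once(pending)
        # Members that failed this time stay pending for the next run.
        pending = [m for m in pending if m not in linked]
        runs += 1
    return runs, pending


def one_at_a_time(pending):
    """Toy model (assumption): only the first pending add succeeds per run."""
    return {pending[0]}


runs, leftover = sync_until_complete(["patio_sl", "entry_kp_B"], one_at_a_time)
print(runs, leftover)
```

With two scene members and one successful add per run, the loop needs
two runs and leaves nothing pending.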

> And so on.  Now I'll bet you'll want to see all those logs after this next 
> bit, but I'm wondering if we should take this off list due to the amount of 
> logging involved.  Your call there.  

That's fine with me.

> Here is the interesting part.  After 
> this completed I scanned links on the patio_sl:
> 
> 01/23/08 23:19:37 [Insteon_Device] link table for $patio_sl (devcat: 0101):
> 01/23/08 23:19:37 [Insteon_Device] $patio_sl adlb [0x0FF8] responder record to 
> 093800(00): onlevel=100% and ramp=540s
> 01/23/08 23:19:37 [Insteon_Device] $patio_sl adlb [0x0FF0] is empty
> 
> 093800 is not one of mine, but the first 4 are the same as the plm.

On occasion, I have seen an incorrectly reported device.  Rerunning scan
(usually just once) will reveal the correct value.  Now, it is possible
that something in the scan code is amiss; although the fact that
rerunning it can give good results makes me wonder a bit more about
insteon collision issues.  Anyway, please rerun scan links to see
whether this rather goofy link really exists or whether the scanned
values are inaccurate.  If it really is goofed up, then delete orphaned
links will delete it; but, don't run it until you're sure it's wrong.

> The logs show that the overnight runs have completed with all devices.
> When run from the web interface, I got the same distance in 2 instances.
> 
> 01/23/08 22:45:11 Running: Scan all link tables
> 01/23/08 22:45:11 [Scan all link tables] Now scanning: $plm
> 01/23/08 22:45:30 [Scan all link tables] Now scanning: $livingrm_light_sl
> 01/23/08 22:45:53 [Scan all link tables] Now scanning: $kitchen_fluorescent_rl
> 01/23/08 22:46:23 [Scan all link tables] Now scanning: $porch_light_sl
> 01/23/08 22:46:31 [Scan all link tables] Now scanning: $mbr_lampb_ll
> 01/23/08 22:46:43 [Scan all link tables] Now scanning: 
> $stairway_top_entry_light_kp
> 
> After that run, I tried again and it completed.  Given that it's halting on a 
> keypad, I'm wondering about the devcat discovery you mentioned previously.  

That only matters for creating proper links--not scanning.  I would be
more apt to "blame" collisions.  If you do a (get) status, what hop
count do you see (should be "hops left:x")?

> Both instances, wherein it did not complete the scan, were run shortly after 
> startup.  After all the status request messages were finished.  Perhaps this 
> is just the impatience getting me again.

Well, I would definitely wait another 30 seconds or so after the
initial startup.  Also, I have Insteon_PLM_max_queue_time=20 set; the
default value is 15 seconds.  This seems about right for my number of
insteon devices during startup to not start retrying before they should
(due to the very large number of queued status requests).  You're the
only user (that I'm aware of) with an equally large (actually larger)
number of devices.  You might try setting it as I have and possibly
larger.  Ideally, this value would be a true maximum, and the actual
requeue time would be adjusted dynamically based on queue volume so
that it could be much more responsive when the queue is short.  But,
that's more of a luxury/nice to have at this point.  It is on my
"TO-DO" list however, since the occasional collision that can't be
handled through peer hopping forces me to wait up to 20 seconds before
mh retries (my goal is to allow dynamic adjustment to get it as low as
5 seconds).
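
A rough Python sketch of that idea follows.  It is not how mh works
today; the function name and the per-message allowance of 0.5 seconds
are assumptions chosen purely for illustration, while the 5 and 20
second bounds come from the numbers above.

```python
def requeue_timeout(queue_len, min_s=5.0, max_s=20.0, per_msg_s=0.5):
    """Scale the retry wait with queue length, clamped to [min_s, max_s].

    per_msg_s is a made-up allowance per queued message (assumption).
    """
    return min(max_s, min_s + per_msg_s * queue_len)


print(requeue_timeout(0))    # empty queue: shortest wait
print(requeue_timeout(10))   # moderate queue
print(requeue_timeout(100))  # startup burst: capped at the maximum
```

With an empty queue the wait drops to the 5 second floor, and a large
startup burst is capped at the configured maximum.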

Gregg