GetBestMatchingSegment()

2013-05-03
2013-05-09
  • Michael Ferrier

    Michael Ferrier - 2013-05-03

    This is more of a design question than an implementation question, so I've put it in its own thread.

    GetBestMatchingSegment() counts all synapses from active cells, not just those active synapses that are connected, when determining how well a segment matches current (or previous) activity. In the words of the Numenta paper, "This routine is aggressive in finding the best match. The permanence value of synapses is allowed to be below connectedPerm."

    One possible problem with this (that is showing up in my tests) is as follows. Say there is a cell that is part of the representation of 'C', and one of its segments has a pattern of synapses on it that represents the context of 'B' occurring immediately before 'C'. If 'C' is preceded by 'B' with too low of a probability (relative to the other patterns represented on that same segment) then the synapses making up the pattern of 'B' cells will over time be decremented more than they are incremented, and so those synapses will always be disconnected, and have permanence near 0. However, when 'C' is active after 'B', GetBestMatchingSegment() will nonetheless consider this segment to be a good match for the previous context of 'B', and so that segment's cell will be chosen as the learning cell, even though that segment can never be activated by 'B' and so 'B' will thus never lead to a prediction of 'C'.

    If GetBestMatchingSegment() were to only count connected cells, however, then it would not see this as a good segment to represent the prior context of 'B', and so a different segment and/or cell would be found or created to represent it, one where the pattern of synapses on the segment could actually stay connected and activate its cell.

    Does this make sense? Is there an alternative way around this problem that I'm missing? And does anyone have a sense of what the advantage is of having GetBestMatchingSegment() also count disconnected synapses, given that Numenta's design has it working that way?
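
    The difference between the two matching rules can be shown with a minimal sketch. It is written in Python rather than OpenHTM's C#, and every name in it (Synapse, Segment, best_matching_segment, CONNECTED_PERM, the min_threshold rule) is hypothetical and chosen for illustration; it is not the OpenHTM or Numenta implementation.

```python
from dataclasses import dataclass, field

CONNECTED_PERM = 0.2  # assumed connection threshold, for illustration only


@dataclass
class Synapse:
    source_cell: str   # id of the presynaptic cell
    permanence: float


@dataclass
class Segment:
    name: str
    synapses: list = field(default_factory=list)

    def match_count(self, active_cells, connected_only):
        """Count synapses whose source cell is active.

        With connected_only=False, synapses below CONNECTED_PERM are
        counted too (the "aggressive" behaviour described in the post).
        """
        return sum(
            1
            for s in self.synapses
            if s.source_cell in active_cells
            and (not connected_only or s.permanence >= CONNECTED_PERM)
        )


def best_matching_segment(segments, active_cells, min_threshold, connected_only):
    """Return the segment with the most matching synapses, or None.

    Assumption: a segment qualifies only if its count is at least
    min_threshold.
    """
    best, best_count = None, min_threshold - 1
    for seg in segments:
        count = seg.match_count(active_cells, connected_only)
        if count > best_count:
            best, best_count = seg, count
    return best


# A segment whose 'B' synapses have decayed to near zero, while the
# synapses for some other pattern 'X' stay connected.
seg = Segment("seg0", [
    Synapse("b1", 0.01), Synapse("b2", 0.02), Synapse("b3", 0.0),
    Synapse("x1", 0.5), Synapse("x2", 0.6),
])
b_cells = {"b1", "b2", "b3"}

aggressive = best_matching_segment([seg], b_cells, 2, connected_only=False)
strict = best_matching_segment([seg], b_cells, 2, connected_only=True)
print(aggressive.name if aggressive else None, strict)
```

    Under these assumptions, the aggressive rule keeps selecting seg0 for the context of 'B' even though none of its 'B' synapses are connected, while the connected-only rule returns nothing, so a different segment would have to be found or created.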

     
  • David Ragazzi

    David Ragazzi - 2013-05-03

    Say there is a cell that is part of the representation of 'C', and one of its segments has a pattern of synapses on it that represents the context of 'B' occurring immediately before 'C'. If 'C' is preceded by 'B' with too low of a probability (relative to the other patterns represented on that same segment) then the synapses making up the pattern of 'B' cells will over time be decremented more than they are incremented, and so those synapses will always be disconnected, and have permanence near 0. However, when 'C' is active after 'B', GetBestMatchingSegment() will nonetheless consider this segment to be a good match for the previous context of 'B', and so that segment's cell will be chosen as the learning cell, even though that segment can never be activated by 'B' and so 'B' will thus never lead to a prediction of 'C'.

    Maybe this also explains what happened in some tests I ran over the last few days.

    I tested the sequence 'ABBCBBA', which after a few steps can equally be read as 'CBBAABB' ('ABB' is moved to the end).

    My results:
    - 'C' always correctly predicts 'B'. The CLA doesn't show any other likely prediction, because a 'C' is always followed by a 'B'.
    - The 'B' after 'C' predicts both 'B' and 'A'. Given the context (the last input was 'C', not another 'B'), it should predict only 'B'. However, besides also predicting 'A', the CLA gives the synapses predicting 'A' higher permanence values (strength/probability) than those predicting 'B', which suggests that 'A' is more likely to come next than 'B'. That would be fine if the segments predicting 'A' then had their synapse permanences decremented because of the false predictions. But that is not what happens: even once learning is stable, the CLA always predicts 'A' along with the first 'B', and more strongly than the second 'B'.

    C-B => A (not B)

    Looking closely in the 3D viewer, I noticed that the segments connecting the first 'B' to 'A' and the segments connecting the first 'B' to the second 'B' were correctly placed in different cells of the same column.

     

    Last edit: David Ragazzi 2013-05-03
    • softnhard

      softnhard - 2013-05-03

      I had a similar problem until I removed segment creation when calling updateSegmentActiveSynapses in phase 2 of the temporal pooler. As I said in another topic, many tests have been run since this removal, and the network always predicts correctly, even if I set PermanenceDecVal to zero. I'm now using it to store phrases with up to 10 words, and the network memorizes the whole phrase without any problem.

       
      • Uwe Kirschenmann

        hi softnhard, do you mean you deactivated the temporal pooler? Or do you just create one segment in the beginning for the first time step and then add always new synapses to that segment?

         
        • softnhard

          softnhard - 2013-05-03

          Uwe, How do you think that one can get predictions by disabling the temporal pooler? What I did is discussed here

           
          • Michael Ferrier

            Michael Ferrier - 2013-05-03

            The changes that softnhard is referring to:

            1- In part b of phase 2 of the temporal pooler (currently line 788 of OpenHTM.CLA_Numenta.region.cs), remove the last true from cell.UpdateSegmentActiveSynapses(true, predictiveSegment, true);
            2- Change the way segment activity is measured to the percentage of active synapses. I've chosen for a segment to be activated if 15% of its synapses are active, instead of counting the active synapses and comparing the count to 15.
            3- As a consequence of the rule above, remove any synapse whose permanence has decreased to zero.
            4- Choose an equal value (experimentally 0.05) for both PermanenceIncValue and PermanenceDecValue.

            It seems to me that the first change would eliminate the CLA's ability to make predictions with a time step greater than 1, because it would no longer be able to create segments that predict when a prediction will occur. Is that what you are seeing with that change, softnhard: that all predictions are 1-step predictions?

            The other changes would seem to enforce that only one pattern can be stored per segment, which is a similar effect to what I was proposing above, by having GetBestActiveSegment() not count synapses that are not connected.
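
            Changes 2 to 4 in the list above can be sketched roughly as follows. This is Python, not the OpenHTM C# code, and all names are hypothetical. The sketch assumes a segment is a plain mapping from source cell to permanence, that every synapse with an active source counts toward the percentage regardless of its permanence, and that every synapse with an inactive source is decremented; none of those details are stated in the thread.

```python
ACTIVE_FRACTION = 0.15  # change 2: 15% of a segment's synapses
PERM_INC = 0.05         # change 4: equal increment ...
PERM_DEC = 0.05         # ... and decrement (hypothetical names)


def is_active_by_fraction(segment, active_cells, fraction=ACTIVE_FRACTION):
    """Change 2: activate on the share of active synapses, not a fixed count."""
    if not segment:
        return False
    active = sum(1 for source in segment if source in active_cells)
    return active / len(segment) >= fraction


def adapt(segment, active_cells):
    """Changes 3 and 4: reinforce active synapses, weaken the rest,
    and drop any synapse whose permanence has fallen to zero."""
    updated = {}
    for source, perm in segment.items():
        if source in active_cells:
            perm = min(1.0, perm + PERM_INC)
        else:
            perm = max(0.0, perm - PERM_DEC)
        if perm > 0.0:
            updated[source] = perm
    return updated


segment = {"a": 0.05, "b": 0.3, "c": 0.1}
after_one_step = adapt(segment, {"b"})
print(sorted(after_one_step))

# A 10-synapse segment with 2 active synapses passes the 15% rule,
# although it could never reach an absolute count of 15.
small = {f"s{i}": 0.3 for i in range(10)}
print(is_active_by_fraction(small, {"s0", "s1"}))
```

            With pruning in place, synapses belonging to a pattern that is weakened more often than it is reinforced disappear from the segment altogether, which is why the segment tends to end up holding a single pattern.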

             
            • David Ragazzi

              David Ragazzi - 2013-05-03

              Michael:The other changes would seem to enforce that only one pattern can be stored per segment, which is a similar effect to what I was proposing above, by having GetBestActiveSegment() not count synapses that are not connected.

              I also agree that this approach is worth testing. In the tests I did, it seems that a single segment is used at distinct time steps for different patterns. The reason could be that while one pattern decrements the permanence of some synapses on this segment, another pattern can increment the permanence of other synapses on it.

              David: That would be fine if the segments predicting 'A' then had their synapse permanences decremented because of the false predictions. But that is not what happens: even once learning is stable, the CLA always predicts 'A' along with the first 'B', and more strongly than the second 'B'.

              As the IDE still doesn't have good tools for analysing segments and synapses closely, it is hard to see what is really happening. But I believe that the segment predicting 'A' has some synapses incremented by another pattern and other synapses decremented because they falsely predict 'A' rather than 'B'; since GetBestActiveSegment() is not able to handle this, the predictions get confused. Please correct me if I'm wrong.

               

              Last edit: David Ragazzi 2013-05-03
  • softnhard

    softnhard - 2013-05-03

    You are right: my changes cause the network to predict just 1 step ahead, but at least there is no ambiguity in the predictions, and my tests show that even 14 continuous sequences can be stored with no problem.
    About storing just one pattern in each segment: this limitation can be ignored, because there are multiple cells per column, each one capable of forming connections with other cells (in both the same column and other columns in my implementation). A large number of cells per column increases resource usage exponentially, while a small number gives low pattern memorization capability. To avoid both problems, I've used an adaptive algorithm that increases the cell count only for the columns with more activity.

     

    Last edit: softnhard 2013-05-03
    • Itay

      Itay - 2013-05-03

      The performance you say you are getting also goes hand in hand with Numenta's original performance. I say we should explore this full force ahead. I will do it tomorrow (unless some other dev from OpenHTM gets there before me).

       
  • David Ragazzi

    David Ragazzi - 2013-05-03

    I decided to restrict the number of segments used for predicting patterns that will occur at t+n, and fortunately ALL predictions came out perfectly. It seems that the segments used for predicting further steps ahead were causing the context forking, although it is curious how they could be causing conflicts.

    I simply changed this line in the Segment class so that the CLA predicts only for t+1:

    public static readonly int MaxTimeSteps = 1; // It was 10
    

    I still don't think that we should discard NumberPredictionSteps, but at least we know where the problem is.

    Please run the tests, starting with ABBCBBA, JumpingBall, AAAX, and others.

     

    Last edit: David Ragazzi 2013-05-04
    • Itay

      Itay - 2013-05-04

      I tried it, and it is still noisy and confuses contexts.
      Please commit your changes with the exact configuration you are using, or send them to me as a separate file by email.
      I also tried softnhard's changes, and it still confuses contexts. If you guys find a way to solve the context forking problem, please upload a complete file with the source code so that we can investigate.

       
      • David Ragazzi

        David Ragazzi - 2013-05-04

        Actually I didn't change anything else; I just set the MaxTimeSteps constant to 1 to restrict the creation of sequence segments. The example I tested was ABBCBBA_20x20, which is in the current repository and shows context problems when MaxTimeSteps > 1.

        I also tested this with old commits (April and March), and the predictions worked well there too when I restricted MaxTimeSteps to 1.

         
        • Itay

          Itay - 2013-05-04

          what??
          I just tested a build from 10/04/2013, and it performs even worse than recent builds when the max time steps of a segment is set to 1.
          Here is the build I tested with: http://sourceforge.net/p/openhtm/code/ci/a97f2a2cf2b189dcf8877753d2c302cd61b1a312/tree/

           
        • Barry Matt

          Barry Matt - 2013-05-07

          I am a bit confused. Setting MaxTimeSteps to 1 should not have any effect on prediction. When we look at prediction accuracy we usually only consider t+1 predictions anyway. That is, we ignore all the other t+2,3,4,...n segments. The t>1 segments should be completely isolated from the t+1 segments; there should be no effects between the two. So if you prevent the formation of t>1 segments, you are only preventing predictions for t>1. The t+1 predictions should be exactly the same with or without the presence of the t>1 segments.

          If there is some difference in the t+1 prediction then that is a symptom of something wrong with the code. That is worth looking into if indeed the case. But if the code is working as designed, there should be no difference in t+1 prediction when limiting MaxTimeSteps to 1.

          However this got me to thinking about your other proposal in phase 3 of the temporal pooler:
          else if (!cell.IsPredicting && cell.WasPredicted)

          This statement will indeed take effect in cells being predicted for any time step. The idea here is that if a cell is predicted for say t+5, then it will stay in the predicting state for the next 5 time steps (if prediction is correct). If the prediction fails at any point, then the cell's pending segments are decremented. As there is no restriction on time steps, this can affect all types of segments.

          So if you are seeing different results when limiting to t+1 only, this line is likely making the difference. I have not done a detailed analysis of the cases, but that is certainly where I would focus.

          It is worth experimenting with your proposed alternative:
          else if (!cell.IsActive && cell.WasPredicted)

          But as always with changes like these, you should be careful because there are always possible side effects in other cases. With that said, your proposed change makes some sense on the surface so it is worth exploring the behavior.

           
          • David Ragazzi

            David Ragazzi - 2013-05-08

            When you test the ABBCBBA sequence with MaxTimeSteps = 1, within the first few steps the CLA stops showing false predictions and shows only one correct prediction per time step. However, when MaxTimeSteps > 1, at some time steps it shows 2 predicted patterns, of which the stronger one is wrong. I don't know why this is happening, but it is happening.

            About phase 3 of the temporal pooler:
            else if (!cell.IsPredicting && cell.WasPredicted)

            I was trying to find which part of the code is responsible for decrementing the permanences of bad predictions, and I stopped at this line. The first time I saw it, I thought it could really be wrong (ALTHOUGH it is present in the CLA white paper). Then I tested with a sentence like "My name is David" and voila, after the first few steps ALL letters were predicted well. In the beginning the predictions are confused and noisy; however, as synapse permanences are incremented and decremented, the segment formation is refined and the predictions become almost perfect.

            An alternative that avoids side effects is:
            else if ((!cell.IsActive && !cell.IsPredicting) && cell.WasPredicted)
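
            To see exactly where the three conditions discussed in this sub-thread differ, they can be tabulated over every combination of cell state. This is a minimal Python sketch, not OpenHTM code: the function names are made up for illustration, each function models only the quoted `else if` test, and the branch that precedes it in phase 3 is not modelled.

```python
from itertools import product


def original(is_active, is_predicting, was_predicted):
    # else if (!cell.IsPredicting && cell.WasPredicted)
    return (not is_predicting) and was_predicted


def alt_active(is_active, is_predicting, was_predicted):
    # else if (!cell.IsActive && cell.WasPredicted)
    return (not is_active) and was_predicted


def alt_both(is_active, is_predicting, was_predicted):
    # else if ((!cell.IsActive && !cell.IsPredicting) && cell.WasPredicted)
    return (not is_active) and (not is_predicting) and was_predicted


def firing_states(condition):
    """All (is_active, is_predicting, was_predicted) states where the
    condition would trigger negative reinforcement."""
    return {s for s in product([False, True], repeat=3) if condition(*s)}


print("active predicting was_predicted | original alt_active alt_both")
for state in product([False, True], repeat=3):
    row = [original(*state), alt_active(*state), alt_both(*state)]
    print(state, "|", row)

# States where the variants disagree with the original condition.
only_alt_active = firing_states(alt_active) - firing_states(original)
only_original = firing_states(original) - firing_states(alt_both)
```

            The variant with !cell.IsActive alone additionally fires when a cell is inactive but still predicting, while the combined variant fires in a strict subset of the original cases: it drops only the case where the cell is active and no longer predicting.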

            I will upload a video to YouTube so that you can see the result.

             

            Last edit: David Ragazzi 2013-05-08
            • Michael Ferrier

              Michael Ferrier - 2013-05-08

              For the ABBCBBA example, I think that the problem #1 I described here may explain why setting MaxPredictionSteps to 1 would correct its 1-step predictions. This is because, when predictions with many steps are allowed, eventually C's pattern will always be either in a predictive state or active; it will never be inactive. So the synapses that cause ABBCB[B]A to predict C will incorrectly always be strengthened, rather than weakened.

              When ABBCB[B]A is active, it will cause C's pattern to enter the predictive state, with time step 1. A SegmentUpdate will be created for C's cells, to update the synapses that put C in a predictive state. Then, in the next time step, A is activated instead of C. Since the prediction of C coming next was proven incorrect, we'd expect that SegmentUpdate to be processed now, weakening those synapses. But this doesn't happen, because C remains in the predictive state: it's now predicting with time step 4. C then stays in the predictive state until 4 steps later, when it actually becomes active. At that point all of the SegmentUpdates of C's cells are positively reinforced, including the incorrect one predicting C to occur after ABBCB[B]A.

              In that other post I described the fix I made for that. Basically, if C is predicted in 1 time step, and then in the next time step C remains predictive but is now predicted for >1 time steps, then the SegmentUpdate that predicted C for 1 time step gets negatively reinforced. That fix got ABBCBBA working much better in my tests.
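
              The scenario and the fix described in this post can be simulated with a small Python sketch. It is not the OpenHTM C# code: the Cell and SegmentUpdate classes, the resolve method, the segment names, and the bookkeeping are all hypothetical, and only the rule for failed 1-step predictions is modelled.

```python
from dataclasses import dataclass, field


@dataclass
class SegmentUpdate:
    segment_id: str
    prediction_steps: int  # how many steps ahead the segment predicted


@dataclass
class Cell:
    pending: list = field(default_factory=list)  # queued SegmentUpdates
    log: list = field(default_factory=list)      # (segment_id, +1 or -1)

    def resolve(self, is_active, is_predicting, apply_fix):
        if is_active:
            # Prediction came true: reinforce everything still queued.
            self.log += [(u.segment_id, +1) for u in self.pending]
            self.pending = []
        elif not is_predicting:
            # Prediction dropped entirely: weaken everything queued.
            self.log += [(u.segment_id, -1) for u in self.pending]
            self.pending = []
        elif apply_fix:
            # Still predictive, but the cell did not fire: any queued
            # 1-step prediction has just been proven wrong.
            failed = [u for u in self.pending if u.prediction_steps == 1]
            self.log += [(u.segment_id, -1) for u in failed]
            self.pending = [u for u in self.pending if u.prediction_steps != 1]


def run(apply_fix):
    c = Cell()
    # ABBCB[B]A is active: C is predicted 1 step ahead.
    c.pending.append(SegmentUpdate("C_after_B", 1))
    # Next step: A is active instead of C, and C is still predictive.
    c.resolve(is_active=False, is_predicting=True, apply_fix=apply_fix)
    c.pending.append(SegmentUpdate("C_in_4_steps", 4))
    for _ in range(3):
        c.resolve(is_active=False, is_predicting=True, apply_fix=apply_fix)
    # Four steps after A, C finally becomes active.
    c.resolve(is_active=True, is_predicting=False, apply_fix=apply_fix)
    return c.log


print("without fix:", run(apply_fix=False))
print("with fix:   ", run(apply_fix=True))
```

              In this simplified model, the incorrect 1-step update is positively reinforced without the fix, because it survives until C finally fires; with the fix it is negatively reinforced at the step where A appears instead of C.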

               

              Last edit: Michael Ferrier 2013-05-08
              • Uwe Kirschenmann

                I made the same observation. My guess is that it has to do with the symmetry of the input pattern. If you negatively reinforce the sequence segment, you might confuse the temporal pooler for patterns that are not symmetrical. Did you also test this with simple sequences?

                 
                • Michael Ferrier

                  Michael Ferrier - 2013-05-09

                  Hi Uwe, I did test it with sequences of multiple lengths, with and without symmetry and repetition, and the change seems to work fine. It only negatively reinforces a 1-step segment if the 1-step prediction made by that segment is shown to be incorrect, by being replaced by a >1 step prediction.

                   
  • David Ragazzi

    David Ragazzi - 2013-05-04

    Note: When I restricted MaxPredictionSteps to 1, I didn't isolate the problem with segments; I just mitigated it. Even with t+1 only, there may still be noise and some confusion, depending on the kind of input (letters, for example). Because one single segment can be used by more than one pattern, it can activate different predictions; however, with inputs like those used in ABBCBBA, restricting the steps mitigates the problem.

     
    • Itay

      Itay - 2013-05-04

      Yes, that's what I was saying: it still has noise and confusion with complex sequences like the jumping ball.
      Check my latest experimental build, which shows very little noise on the jumping ball and other complex sequences with these parameters: 30 cells per column, 12 new synapses, and an active segment threshold of 4 or 5. This is the kind of result I am aiming for and want to see in Numenta's temporal pooler.

       
