Balloon again

Developers
2013-11-19
2013-11-27
1 2 > >> (Page 1 of 2)
  • Guus Bonnema
    Guus Bonnema
    2013-11-19

    HI Paul,

    I may have stumbled on an error in Balloon.java. It looks like it queues the last packet containing ENDPACKET and then does a drop after the dequeue. This of course is an error, because receive() does an increase in packetCount and drop() does a decrease of packetCount (in Component). However, ENDPACKET never gets received()! It does get dropped though.

    So, I am left wondering why Balloon just works in trunk, while in branch 2.7 it reports a -1 in packetCount (which is correct, but leads to an exception).

    After the error occurred I also changed packetCount into a private AtomicInteger and provided the methods increasePacketCount, decreasePacketCount and initializePacketCount in order to make access somewhat restricted.

    Can you help me verify if my conclusions are correct? Is this an error as I suspect?

    The only thing I changed in between is making Network.java more threadsafe. It is not finished yet.

    I changed nothing in the packetCounting area. I am left wondering whether the counting error doesn't occur in trunk because some fields are not "properly published" in Network.

    Kind regards, Guus.

     
  • Guus Bonnema
    Guus Bonnema
    2013-11-19

    I have committed a version with the Balloon error in it, and right after that committed the corrected Balloon version (with an incrementPacketCount() in the right place -- where the ENDPACKET gets queued --.

    Regards, Guus.

     
  • Sorry, Guus, I guess I missed this question! I took a look at the new Balloon, and I like the name changes, and I'm sure you're right about the count problem. However, I am wondering if a cleaner solution might not be to use

    final Packet ENDPACKET = create(Integer.MAX_VALUE, null);
    

    instead of the "new" - then hopefully you don't need to add an incrementPacketCount(). I haven't tested it, and I have to buzz out now, but we can talk later today - let me know what you think!

    Regards,

    Paul

     
  • Guus Bonnema
    Guus Bonnema
    2013-11-21

    It works, because you do not give the ENDPACKET an owner. The question is whether that is ok with your counting of packets. Because, shouldn't you call all your packets? Including the ones you generate for clarity?

    Suppose this component is part of a subnet and gets executed like, oh say a 1000 times, then you would miss out on a 1000 ENDPACKET's you generated.

    So, it all depends on what exactly you want to count: including generated packets or excluding?

    Regards, Guus.

     
    • Sorry, didn't really understand your comment! I changed the new for ENDPACKET to create (removing final and the incrementPacketCount()), and TestBalloon works fine.

      I have promoted my change as the easiest way for you to see what I did - hope you don't mind - we can certainly change it if there is a better way! :-)

       
  • Guus Bonnema
    Guus Bonnema
    2013-11-21

    You could look at Balloon in another way. In stead of implementing it as an equal to others component, this feature could be part of the internal system. That would exclude any packet counting, because it is not a component, but a feature.

    Imagine a feature that when defining a connection at network specification time, you have the option to say, capacity=50 (or something like that). In response the system would create a connection with the same advanced locking queue you use in Balloon, but it would do it in connection. That way, you provide extra facility to the JavaFBP system and the Balloon component is unnecessary, because its a network.define() option.

    What do you think?

    Regards, Guus.

     
  • That's an interesting idea - I would tend to lean away from it unless (until?) it becomes a major problem for many users. This is because philosophically I like simple connections with the complex stuff in components. In my book I talk about the idea of many different kinds of connections, and try to show that IMO they make the mental model more complex, without really buying you much. Still... I guess we could add it in with appropriate warnings!

     
  • Guus Bonnema
    Guus Bonnema
    2013-11-22

    Rationale
    There are two reasons why I had this idea:

    1. The component works a new thread, while the component itself is also running a new thread. This is complexity that we have JavaFBP for to avoid. We actually want to have the concurrency issues within the framework, while avoiding them from the components. In other words we are shielding the component builder from concurrency issues. I picked up on this from one of John Cowan's remarks, and I must say I like that idea.

    2. Enabling this feature from JavaFBP, shields the user (point 1), but also frees us to implement it in the connection, or anywhere else we find it is responsible and useful. The component users need know nothing about the connection or its complexities, so the mental model for the component builder should not change! Of course the network designer does need to know buffering is an option (see next points).

    options for the interface boundingqueue
    When checking out balloon, on internet I found the following table that gives an overview of the options available.

    action Throws Exception Special Value Blocks Times Out
    Insert add(o) offer(o) put(o) offer(o, timeout, timeunit)
    Remove remove(o) poll(o) take(o) poll(timeout, timeunit)
    Examine element(o) peek(o)

    Looking at these options, I wonder why Balloon uses the poll(timeout, secs) in stead of the take(o) or even poll(o). AFAICS it does a stacktrace and continues. I feel that using put and take would simplify the mental model without trading in options. put blocks when the queue is full, take blocks whens the queue is empty.

    Users
    Yes, we need to know whether network designers (or should we say application designers?) would like this feature. We can reason all day, but without their input, we are guessing.

    Having said that, we need to decide, so here goes my reasoning.

    Architectural option
    Of course the network builder has a more architectural job and should be aware of buffering features as they may influence throughput and response times. Not offering this option, makes the network simpeler, but also takes away options that the network designer should consider when building non-trivial networks.

    I lean towards offering a buffering feature that defaults to 1. Combine this with a proper manual where we describe its use and when to use or not to use.

    Kind Regards, Guus.

     
    • Hi Guus, I stand by my earlier remark! In 40 years, we never needed this facility, even when processing millions of records in a single run (we just used the file technique as shown in p. 342 of the 2nd edition, where the subnets saying "Hold until Sort finished" are actually sequential files). However, Mike Beckerle did need something like ballooning in his FBP(-like) apps, and the solution he suggests is quite a bit more sophisticated than our Balloon component - see p. 169 and Fig. 16.15 of the 2nd edition. If you haven't read this, you may find it interesting...

      That said, you are a co-developer of JavaFBP, and if you want to add a ballooning facility into the infrastructure, you're very welcome - but, as Matthew Lai keeps saying, you may want to wait until you have to develop an app which clearly needs such a facility. This is in keeping with the YAGNI principle, which I had reason to mention on a different topic in the group earlier today!

      Good discussion!

      All the best, and thanks for all your hard work!

      Paul

      PS I see you like longer labels! :-)

       
      • Guus Bonnema
        Guus Bonnema
        2013-11-23

        It is clear: we only implement this facility if either user(s) need it or we need it ourselves.

        I did favour implementing it, but the YAGNI principle (looked it up on internet) seems a pretty solid extreme programming principle. There is enough to do in the area of concurrency (threadsafe objects) and object oriented principles to go around for a while.

        Thanks for your consistently constructive comments. For now let's leave it. When we need it, we might prefer internal processing to a special component.

        Regards, Guus.

         
  • Guus Bonnema
    Guus Bonnema
    2013-11-22

    P.S. Did you see my first post in this thread where I stated that trunk runs fine? Only by thread safing Network did this error in counting surface! And I am not finished it doing that yet. I want to continue that until all that I can find in Network is thread safe or thread confined.

     
  • I think I finally got it!

    1) I was stupid! create vs. new makes no difference, as incrementing packetcount is in new Packet, not in create - create is just a shell for new Packet + the count of creates, so my change makes no difference.

    2) The reason you thought you had to add an incrementPacketCount() was that your new Packet(Integer.MAX_VALUE, null, this) was outside the execute(), not inside - I'm guessing that the count is zeroized just before executing execute()...

    3) I stepped through the old Balloon and it worked fine several times. :-) I think that the problem that you ran into was that incrementing and decrementing the component's packet count was not synchronized, so the two threads sometimes conflicted - this is the only component that I can remember where I tried putting a second thread inside a component!

    So another alternative would have been to make your new methods incrementPacketCount() and decrementPacketCount() (and incrementPacketCount()) synchronized, and I think this would have worked. However, I just noticed that the count is now an AtomicInteger, so this takes care of that case. The odd thing is that this one component (Balloon) makes it necessary to make the count an AtomicInteger, which probably increases the overhead slightly for all components! Do you think that the added cost will be significant? I guess the alternative would be to find a different way to do ballooning... :-)

    Thanks for spotting and fixing this!

     
    Last edit: J. Paul Morrison 2013-11-22
    • Guus Bonnema
      Guus Bonnema
      2013-11-23

      About step 3: When I encountered the counting error in balloon, I had not changed one syllable in Balloon. What I had done was making Network.java more thread secure. So the error (running fine was the error where it should have aborted due to mismatch in counted packets) was there before I added AtomicInteger.

      But it was certainly caused by concurrency errors. That convinces me that making objects threadsafe is useful.



      BTW, If possible I will look for the "cheapest" way to make an object thread safe. For instance if for certain fields I can get away with just making sure it is thread confined, then that is enough. But anyone can break any kind of thread safety easily with just one little change, so care for thread safety is very necessary.

      After we find that JavaFBP works in a thread safe way, the only way to stay thread safe is to carefully review each change. This means that the person doing the review should not be the programmer.

      This we should engrain in the JavaFBP development process.

      But for now, I will just continue threadsafing Network.

      Kind regards, Guus.

       
  • I totally agree with your last two posts! However, I just want to clarify that in 3) I was trying to make the point that Balloon is the only component containing an extra thread, so it doesn't surprise me that there was a concurrency error - I am glad you found it, as it could have caused alarm and despondency if anyone had actually tried to use it in production (I don't believe they have, but you can never be sure!). I am still convinced that, overall, JavaFBP is thread-safe - in fact, as you said, it may be over-synchronized!

    My last question still stands: given that Balloon is the only component that requires the packet count to be an AtomicInteger, is there a performance hit - and if so, is there any way around it?

    TIA

    Paul

     
  • Guus Bonnema
    Guus Bonnema
    2013-11-23

    I checked your assertion: it is true. Only the "home" thread of the object Component alters packetCount, but the object Packet does a decrement of packetCount, so I added a check in the decrement that it is indeed the same thread. And it is.

    The other setters and the getter are defined private and thus must be the object thread.

    Conclusion
    I returned packetCount to "private int", made all getters and setters private except decrementPacketCount, which I made protected, but added a check that it is called within the object thread.

    So: happyness all around: Look ma, no locks! :)

    Remarks
    1. Many of the items are unprotected for concurrent access. I am not at all sure that JavaFBP as it stands in trunk is indeed ThreadSafe. For Network.java at least, I now know it is not. I will get back to you on this when I checked out the other classes. I admit, it is a slow process because I have to check and recheck. However, I have found a way to compare the object's thread to the current thread, and that makes things a lot easier! Subsequently I document stuff with annotations like @ThreadConfined or @GuardedBy("this").
    2. I haven't committed yet, but I will try to do so tonight, so you can check out some stuff tomorrow.

    Kind regards, Guus.

     
  • Guus Bonnema
    Guus Bonnema
    2013-11-23

    I was wrong on one point: Packet calls decrement and increment. So I made both protected and added a check for threadconfined to both.

     
  • Hi Guus, I am afraid I still don't understand your changes - surely incrementPacketCount and decrementPacketCount in this one situation (Balloon) are running in different threads, so they could clash. So IMO this explains the mismatch that you saw, but I didn't. How does your solution resolve this?

    TIA

    Paul

     
  • Guus Bonnema
    Guus Bonnema
    2013-11-24

    Hi Paul,

    Thanks for staying alert. I totally forgot what it was all about: Balloon!

    I "solved" while checking with copy1 instead of Balloon, which of course runs fabulously.

    My apologies. I will revert the change back to AtomicInteger.

    About your real question (the one I didn't answer):

    "given that Balloon is the only component that requires the packet count to be an AtomicInteger, is there a performance hit - and if so, is there any way around it?"

    I don't know yet. Let me try and figure it out and I will get back to you.

    Kind Regards, Guus.

     
  • Guus Bonnema
    Guus Bonnema
    2013-11-24

    Hi Paul,

    Checking internet I found the following comment on a trial run with unprotected, volatile, AtomicInteger and synchronized, and these were the conclusions of the author:

    These show that only the AtomicInteger and locked (synchronized) approaches actually work. However, there is a price to pay. The simplest (and completely broken) approach is 10 times faster than using a volatile. The first approach which actually works is AtomicInteger which is 3 to 4 times slower than volatile. Using a synchronized method is very much slower (around 7 times) again.

    This really shows the benefit of the AtomicInteger class.

    If you want to check for your self, this is the URL:
    http://nerds-central.blogspot.nl/2011/11/atomicinteger-volatile-synchronized-and.html?escaped_fragment#!

    Conclusion
    For me this means the performance impact of AtomicInteger is limited but noticable: a difference of factor 30 - 40 with respect to raw int. This is acceptible if cross-threading is necessary and a lot cheaper than synchronizing, but .... is it necessary?

    How to continue?
    So actually, it is a choice: which do we want?

    My feelings are that if we are going to use ballooning, we might as well build it in as an option within the engine. In that case we need AtomicInteger.

    However, if we would not do that (YAGNI is still a pretty persuasive argument) then falling back to int with threadconfined checks might be the way to go ... for now.

    So: what will it be?

    Kind regards, Guus.

     
  • Guus Bonnema
    Guus Bonnema
    2013-11-24

    The second part of your query: "is there a way around it?". I don't think so: crossing thread boundaries requires some kind of locking. If there were a way, I guess it would be massively used.

    However, if we could find a way of making sure the increment is always thread confined even in the case of Ballooning, then yes, there is a way around it: make sure the increment is always part of the component's thread.

    Not sure this helps, regards, Guus.

     
  • I came up with an idea and tried it out - I modified version 622, tried it out, and have put it up as revision 624. It probably violates all the precepts of guarding, etc., but a) it works, b) it does not involve every API call paying the overhead of AtomicInteger, and c) the changes are totally within Balloon, so that there should be no impact on other components, networks, etc.

    What I didn't know was the meaning of the annotations you added, so I would appreciate it if you could look at the changed areas in Component and Balloon, and see whether annotations have to be changed, removed, added etc.

    You'll probably disapprove of my code changes - this may be one of those cases where we have to "hold our noses" as we say, unless there is a better way... :-)

    Thanks again for all the hard work you are putting into this - want to tackle C#FBP after this? :-)

    Best regards,

    Paul

     
  • Guus Bonnema
    Guus Bonnema
    2013-11-24

    I understand what you are doing, and it works ... but not quite. Every increment and decrement of your local value is correct. The transfer to component is to put it bluntly: a bug. It doesn't work. At least it doesn't work at all times under all conditions. Everything up to the transfer is fine, but the last move (transferring the count) is thread unsafe.

    I do understand where you are coming from: it seems a waste to add AtomicInteger when it is only necessary for one component, that we don't even know of whether people use it. Probably not.

    If you would ask me
    My suggestion would be to forget about Balloon for now and concentrate on getting the system thread safe. I will keep the int as it is now, and add checks for thread confinement. In practice this means no syncing at all, only a small check that we are indeed transferring within the component's thread. That rules out Balloon, for the time being.

    That would be my priority, if you would ask me.

    Things I dare not say
    Actually, in this respect, I have been thinking for a while now that the way we attack multithreading in the engine may be the wrong way. Of course we will get things to work properly, but it will take a lot of time and effort and once correctly functioning it will be fragile.

    Every new maintainer who doesn't have a deep and complete understanding of concurrency can and probably will break thread safety.

    Hold on to your stockings
    And here comes what I was thinking: what if we would replace the current communication system? What if we would use packets in stead of shared mutable fields? We could have an internal network that enables internal communication across threads?

    The drawback is, that we need to redesign everything. And that may prove a challange to say the least. On the other hand, if it is possible and we succeed, then the engine will be much easier to maintain and have less points of concurrency problems. I have no illusions that we can completely eradicate shared mutable fields, but we should be able to minimize them.

    Undertaking this might require a clean slate. A project with a different name, because it might or might not succeed, so it should not interfere with javafbp.
    And whether succeeds or not, should not affect javafbp either.

    In the meantime, while thread proofing javaFBP, I learn a lot about the internal workings that I could later use. Also: I do not want to push this, it is just a suggestion, something to think about. I have no intention to rush anything, just thought I would finally mention this idea -- I had from almost the start -- and see what happens.

    Kind regards, Guus Bonnema.

     
  • Fascinating! And I agree with your first point. Re the second, I think you are right - but I feel we are pretty close to a working solution! Re the third, wow!, but definitely back of the burner stuff! We would of course need a (very) low-overhead technique, but it would be a lot of fun!

    But anyway I agree about Balloon - perhaps you could add some well-chosen words: use at your own risk, burn before reading, whatever!

    Thanks again, as always! Onwards and upwards!

    Paul

     
  • Guus Bonnema
    Guus Bonnema
    2013-11-25

    Hi Paul,

    Balloon
    If you agree with thread confinement, that means Balloon will no longer work, because Component will actively check for thread confinement. What it actually means is: "don't create new threads in a user component: that's what the engine is for".

    Packet communication
    Being non-native english: what does "back of the burner stuff" mean?
    Not sure whether you agree or actually don't agree. xD

    Regards, Guus.

     
  • Guus Bonnema
    Guus Bonnema
    2013-11-25

    I think I now understand the expression "back of the burner": does it mean the same as what I said? "something to think about"? On internet I found the reference that pots on a stove earn a place at the back of the stove when they don't need immediate attention. Am I right? Is that what it means?

    Sharing datum by transfer of ownership
    Anyway, I wanted to explain how I got to this idea. In the first place it seems intuitively conflicting that the system that supports FBP cannot use FBP for its apparent "components": Network, SubNet, Component, Connections.

    Secondly, while reading about concurrency in the book I own (Java Concurrency in Practice by Brian Goetz) I found comments on avoiding the need to share mutable items. The first way is of course to create only constants or effectively constant objects.

    The second way is to "transfer ownership of data items" under the strict condition that the "old" owner does not access the item once it is transferred.
    And this is very much the way that FBP handles packets. It transfers ownership.

    So my thinking is to integrate Information Packets into the engine so that we can minimize sharing of mutable items. This would aleviate a lot of the locking we currently need (packetCount is only a very small portion of it). In the current implementation, even if we lock efficiently, the amount of lock processing is massive due to the parallel nature of our engine.

    I will leave it at this for now, and we can get back to this later. In the meantime I will continue with checking Component source for thread safety.

    On to the back burner! Or something like that.

    Regards, Guus.

     
1 2 > >> (Page 1 of 2)