
#2509 osaf: Support self scale-out

5.17.11
fixed
None
enhancement
osaf
-
major
False
2017-10-30
2017-06-21
No

Ticket [#1453] added support for scale-out, which allows scale-out from an initial cluster containing at least one node. This ticket adds support for scaling out from a cluster containing zero nodes, or alternatively, a cluster where the active node is not a configured node. The use cases are as follows:

  • Support loading a backup that was created on a different cluster where none of the new nodes have the same name as any of the nodes in the old cluster.
  • Support cluster restart on a system where nodes don't have persistent local storage (or persistent host names / node names), i.e. a system where a node reboot will always cause a scale-in followed by a scale-out.
  • Make scaling more robust, e.g. imagine a case when a one-node cluster is scaled out by adding a second node, but then the original node is removed before scale-out of the new node has completed.

One possible solution is that when a node boots up and becomes the newly elected active system controller (i.e. the first active system controller after a cluster restart), it checks whether the node itself exists as a configured node in the IMM database. If it does not, then at first only IMM is started and the scale-out script (added in ticket [#1453]) is called to scale out the current node. Once the current node has been added to IMM, the rest of the OpenSAF services are started.
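
The boot-time decision described above can be sketched roughly as follows. This is only an illustrative model, not OpenSAF code: the function name `boot_active_controller`, the `configured_nodes` set standing in for the IMM database, and the `scale_out` callback standing in for the scale-out script are all hypothetical. For simplicity the sketch always lists IMM as the first step, including for nodes that are already configured.

```python
from typing import Callable, List, Set


def boot_active_controller(
    node_name: str,
    configured_nodes: Set[str],        # hypothetical stand-in for the IMM database
    scale_out: Callable[[str], None],  # hypothetical stand-in for the scale-out script
) -> List[str]:
    """Return the ordered startup steps taken by a newly elected active controller."""
    steps = ["start IMM"]
    if node_name not in configured_nodes:
        # The node is not a configured node: scale out the node itself
        # before any other service is started.
        scale_out(node_name)
        steps.append("scale out " + node_name)
        if node_name not in configured_nodes:
            raise RuntimeError("scale-out did not add " + node_name + " to IMM")
    # The node is now known to IMM, so the remaining services can start.
    steps.append("start remaining services")
    return steps


if __name__ == "__main__":
    empty_cluster: Set[str] = set()
    for step in boot_active_controller("new-node", empty_cluster, empty_cluster.add):
        print(step)
```

Called on a cluster with zero configured nodes, the sketch runs the scale-out step between starting IMM and starting the remaining services. On an already configured node the scale-out step is skipped.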

Related

Tickets: #1453
Wiki: ChangeLog-5.17.11

Discussion

  • Anders Widell

    Anders Widell - 2017-07-01
    • Milestone: 5.17.08 --> 5.17.10
     
  • Anders Widell

    Anders Widell - 2017-07-18
    • status: assigned --> accepted
     
  • Anders Widell

    Anders Widell - 2017-07-24
    • status: accepted --> review
     
  • Anders Widell

    Anders Widell - 2017-07-31
    • status: review --> fixed
     
  • Anders Widell

    Anders Widell - 2017-07-31

    commit 4c1bc429712be376a81856f5d636919e2c82d95d (HEAD -> develop, origin/develop, ticket-2509)
    Author: Anders Widell an..@..com
    Date: Mon Jul 31 12:54:43 2017 +0200

    clm: Make it possible for a node to scale out itself using autoscaling [#2509]
    
    Ticket [#1453] added support for autoscaling, which allows scale-out from an
    initial cluster containing at least one node. This commit adds support for
    scaling out from a cluster containing zero nodes, or alternatively, a cluster
    where the active node is not a configured node. The use cases are as follows:
    
    * Support loading a backup that was created on a different cluster where none of
      the new nodes have the same name as any of the nodes in the old cluster.
    * Support cluster restart on a system where nodes don't have persistent local
      storage (or persistent host names / node names) - i.e. a system where a node
      reboot will always result in a scale-in followed by a scale-out
    * Make scaling more robust, e.g. imagine a case when a one-node cluster is
      scaled out by adding a second node, but then the original node is removed
      before scale-out of the new node has completed.
    

    commit 0e1884bdd2cac0dbc1932eb304f7278996330bcc
    Author: Anders Widell an..@..com
    Date: Mon Jul 31 12:54:43 2017 +0200

    ntf: Re-try initializing CLM on unconfigured nodes [#2509]
    
    Re-try initializing the CLM API when it returns SA_AIS_ERR_UNAVAILABLE, so that
    the NTF service properly waits for the node to become configured by the
    autoscaling functionality.
    

    commit f47463fb9f0610c431f5802e1ff77945f6e0ea6a
    Author: Anders Widell an..@..com
    Date: Mon Jul 31 12:54:43 2017 +0200

    log: Re-try initializing CLM on unconfigured nodes [#2509]
    
    Re-try initializing the CLM API when it returns SA_AIS_ERR_UNAVAILABLE. This
    error code is returned if the LOG service has been started on an unconfigured
    node, which may happen for a while when the autoscaling feature is used.
    

    commit 49d658ee54abb3c8e6c78ddd9954aa5edf3dd285
    Author: Anders Widell an..@..com
    Date: Mon Jul 31 12:54:43 2017 +0200

    amf: Log CLM initialization error only once on unconfigured nodes [#2509]
    
    Avoid spamming the syslog with more than one log message in case CLM returns
    SA_AIS_ERR_UNAVAILABLE (i.e. we are running on a currently unconfigured node).
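
The retry behaviour described in the ntf, log and amf commits above can be illustrated with a minimal sketch. It is not the actual OpenSAF implementation: `initialize_with_retry`, its parameters, and the string constants standing in for the SA_AIS return codes are assumptions made for illustration, and the `initialize` callback stands in for the real CLM API initialization call.

```python
import time
from typing import Callable

# Placeholder values for illustration only; these are not the real numeric
# SA_AIS return codes.
SA_AIS_OK = "SA_AIS_OK"
SA_AIS_ERR_UNAVAILABLE = "SA_AIS_ERR_UNAVAILABLE"


def initialize_with_retry(
    initialize: Callable[[], str],          # hypothetical stand-in for CLM initialization
    delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    log: Callable[[str], None] = print,
) -> str:
    """Retry while CLM reports that the node is not yet configured."""
    logged = False
    while True:
        rc = initialize()
        if rc != SA_AIS_ERR_UNAVAILABLE:
            # Success, or an error this loop is not meant to handle.
            return rc
        if not logged:
            # Log only once to avoid flooding the syslog while waiting.
            log("CLM unavailable: node not yet configured, retrying")
            logged = True
        sleep(delay_s)
```

The loop returns as soon as the callback reports anything other than the unavailable code, so other errors are passed to the caller unchanged. The single log message mirrors the intent of the amf commit.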
    
     
