#3019 imm: return try-again on write requests if FS is unresponsive

5.19.07
fixed
nobody
None
enhancement
imm
nd
major
False
2019-07-03
2019-03-18
No

With the current design, when the file system (FS) is unavailable or unresponsive, a CCB apply on the IMM OM side is likely to get the SA_AIS_ERR_TIMEOUT error code, because osafpbed may be stuck writing to the sqlite3 database and therefore cannot respond to IMM in time.

This ticket proposes two new administrative operations (set/clear) towards IMM, used to inform IMM whether the file system is unavailable or healthy. Based on that information, IMM rejects write requests early with the error code SA_AIS_ERR_TRY_AGAIN while the file system is unavailable.

In addition, a runtime attribute, saImmFileSystemStatus, is added and owned by IMM to show the status of the underlying file system.
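
The behaviour described above can be pictured with a small sketch. It is not the IMMND implementation: every `sketch_`/`SKETCH_` name is hypothetical, the status values are assumptions (the ticket does not state which values saImmFileSystemStatus takes), and the return codes are local stand-ins for the constants that real code would take from saAis.h.

```c
#include <stdbool.h>

/* Local stand-ins for the AIS return codes named in this ticket. */
typedef enum {
    SKETCH_AIS_OK = 1,
    SKETCH_AIS_ERR_TIMEOUT = 5,
    SKETCH_AIS_ERR_TRY_AGAIN = 6
} sketch_ais_error_t;

/* Assumed two-state view of the file system, as an attribute such as
 * saImmFileSystemStatus might expose it. */
typedef enum {
    SKETCH_FS_HEALTHY = 0,
    SKETCH_FS_UNAVAILABLE = 1
} sketch_fs_status_t;

typedef struct {
    sketch_fs_status_t fs_status;
} sketch_imm_state_t;

/* "set" admin operation: the operator reports the FS as unavailable. */
void sketch_admin_set_fs_unavailable(sketch_imm_state_t *state)
{
    state->fs_status = SKETCH_FS_UNAVAILABLE;
}

/* "clear" admin operation: the operator reports the FS as healthy again. */
void sketch_admin_clear_fs_unavailable(sketch_imm_state_t *state)
{
    state->fs_status = SKETCH_FS_HEALTHY;
}

/* Gate a write request: if it needs persistence and the FS is flagged
 * unavailable, answer TRY_AGAIN at once instead of waiting for a timeout. */
sketch_ais_error_t sketch_handle_write(const sketch_imm_state_t *state,
                                       bool needs_persistence)
{
    if (needs_persistence && state->fs_status == SKETCH_FS_UNAVAILABLE) {
        return SKETCH_AIS_ERR_TRY_AGAIN;
    }
    return SKETCH_AIS_OK; /* the request would proceed as usual */
}
```

The point of the sketch is the ordering: the check happens before any work is handed to the persistent back end, so the caller gets a prompt, retryable answer rather than a timeout.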

Related

Tickets: #3024
Tickets: #3030
Wiki: ChangeLog-5.19.07

Discussion

  • Vu Minh Nguyen

    Vu Minh Nguyen - 2019-03-18
    • Description has changed:

    Diff:

    --- old
    +++ new
    @@ -2,4 +2,4 @@
    
     With this ticket, we propose to introduce 02 new admin operations (set/clear) towards IMM; using these operations to inform IMM if the file system is unavailable or in healthy state. Based on that data, IMM will reject the write request earlier with error code SA_AIS_ERR_TRY_AGAIN if the file system is unavailable.
    
    -Besides, an runtime attribute `saImmFileSystemStatus` is added and owned by IMM to show the status of underlying file system.
    +Besides, a runtime attribute `saImmFileSystemStatus` is added and owned by IMM to show the status of underlying file system.
    
     
  • Vu Minh Nguyen

    Vu Minh Nguyen - 2019-03-18
    • status: unassigned --> assigned
    • assigned_to: Vu Minh Nguyen
     
  • Vu Minh Nguyen

    Vu Minh Nguyen - 2019-03-18
    • summary: imm: return try-again on write requests when file system is unresponsive --> imm: return try-again on write requests when FS is unresponsive
    • Description has changed:

    Diff:

    --- old
    +++ new
    @@ -1,4 +1,4 @@
    -With current design, when the file system is unavailable or not responsive, CCB apply on IMM OM side likely get SA_AIS_ERR_TIMEOUT error code as osafpbed may stuck at the activity of writing to sqlite3 database and therefore is not able to response to IMM in time.
    +With current design, when the file system (FS) is either unavailable or responsive, CCB apply on IMM OM side likely get SA_AIS_ERR_TIMEOUT error code as osafpbed may be stuck at the activity of writing to sqlite3 database and therefore is not able to response to IMM in time.
    
     With this ticket, we propose to introduce 02 new admin operations (set/clear) towards IMM; using these operations to inform IMM if the file system is unavailable or in healthy state. Based on that data, IMM will reject the write request earlier with error code SA_AIS_ERR_TRY_AGAIN if the file system is unavailable.
    
     
  • Vu Minh Nguyen

    Vu Minh Nguyen - 2019-03-18
    • Description has changed:

    Diff:

    --- old
    +++ new
    @@ -1,4 +1,4 @@
    -With current design, when the file system (FS) is either unavailable or responsive, CCB apply on IMM OM side likely get SA_AIS_ERR_TIMEOUT error code as osafpbed may be stuck at the activity of writing to sqlite3 database and therefore is not able to response to IMM in time.
    +With current design, when the file system (FS) is neither available nor responsive, CCB apply on IMM OM side likely get SA_AIS_ERR_TIMEOUT error code as osafpbed may be stuck at the activity of writing to sqlite3 database and therefore is not able to response to IMM in time.
    
     With this ticket, we propose to introduce 02 new admin operations (set/clear) towards IMM; using these operations to inform IMM if the file system is unavailable or in healthy state. Based on that data, IMM will reject the write request earlier with error code SA_AIS_ERR_TRY_AGAIN if the file system is unavailable.
    
     
  • Vu Minh Nguyen

    Vu Minh Nguyen - 2019-03-18
    • Description has changed:

    Diff:

    --- old
    +++ new
    @@ -1,4 +1,4 @@
    -With current design, when the file system (FS) is neither available nor responsive, CCB apply on IMM OM side likely get SA_AIS_ERR_TIMEOUT error code as osafpbed may be stuck at the activity of writing to sqlite3 database and therefore is not able to response to IMM in time.
    +With current design, when the file system (FS) is neither available nor responsive, CCB apply on IMM OM side likely get SA_AIS_ERR_TIMEOUT error code as osafpbed may be stuck at the activity of writing to sqlite3 database and therefore is not able to respond to IMM in time.
    
     With this ticket, we propose to introduce 02 new admin operations (set/clear) towards IMM; using these operations to inform IMM if the file system is unavailable or in healthy state. Based on that data, IMM will reject the write request earlier with error code SA_AIS_ERR_TRY_AGAIN if the file system is unavailable.
    
     
  • Gary Lee

    Gary Lee - 2019-03-26
    • Milestone: 5.19.03 --> 5.19.06
     
  • Vu Minh Nguyen

    Vu Minh Nguyen - 2019-03-26
    • status: assigned --> review
     
  • Vu Minh Nguyen

    Vu Minh Nguyen - 2019-04-11
    • summary: imm: return try-again on write requests when FS is unresponsive --> imm: return try-again on write requests if FS is unresponsive
    • status: review --> fixed
    • assigned_to: Vu Minh Nguyen --> nobody
     
  • Vu Minh Nguyen

    Vu Minh Nguyen - 2019-04-11

    commit ecbdd454813cb2e5994143aa202535374d119392 (HEAD -> develop, origin/develop, ticket-3019)
    Author: Vu Minh Nguyen vu.m.nguyen@dektech.com.au
    Date: Thu Apr 11 15:13:50 2019 +0700

    imm: return try-again on write requests if fs is unavailable [#3019]
    
    When the underlying file system is unresponsive to pbe write requests, all IMM
    write requests that need their changes to be persistent, such as apply of a ccb,
    updates to persistent runtime attributes, creation/deletion of classes, or
    creation/deletion of persistent runtime objects, are likely to get SA_AIS_ERR_TIMEOUT.
    
    This patch introduces two administrative operations to let the user inform IMM about
    the availability of the file system. If the server (IMMND) detects the logical
    unavailability of the file system through this variable, such write requests will
    get the honest code SA_AIS_ERR_TRY_AGAIN rather than SA_AIS_ERR_TIMEOUT.
    
    Besides, a new IMM attribute, saImmFileSystemStatus, is added to the
    SaImmMngt class; its value shows the current status of the file system
    as seen by IMM.
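
From the caller's side, SA_AIS_ERR_TRY_AGAIN is a signal to repeat the request later, which is what makes it more useful than a timeout here. A minimal sketch of such a retry loop follows; it does not call the real IMM OM API, and all `sketch_`/`SKETCH_` names, the callback shape, and the attempt limit are assumptions made for illustration.

```c
#include <stddef.h>

/* Local stand-ins for the AIS return codes named in this ticket. */
typedef enum {
    SKETCH_AIS_OK = 1,
    SKETCH_AIS_ERR_TIMEOUT = 5,
    SKETCH_AIS_ERR_TRY_AGAIN = 6
} sketch_ais_error_t;

/* Hypothetical callback that wraps one write attempt (e.g. a CCB apply). */
typedef sketch_ais_error_t (*sketch_write_fn)(void *ctx);

/* Hypothetical callback that waits between attempts; may be NULL. */
typedef void (*sketch_pause_fn)(void);

/* Repeat the write while it returns TRY_AGAIN, up to max_attempts times.
 * Any other return code, success or failure, is handed back immediately. */
sketch_ais_error_t sketch_retry_write(sketch_write_fn write_fn, void *ctx,
                                      unsigned max_attempts,
                                      sketch_pause_fn pause_fn)
{
    sketch_ais_error_t rc = SKETCH_AIS_ERR_TRY_AGAIN;

    for (unsigned attempt = 0; attempt < max_attempts; attempt++) {
        rc = write_fn(ctx);
        if (rc != SKETCH_AIS_ERR_TRY_AGAIN) {
            return rc;
        }
        /* Wait before the next attempt, but not after the last one. */
        if (pause_fn != NULL && attempt + 1 < max_attempts) {
            pause_fn();
        }
    }
    return rc;
}

/* Demo write: fails with TRY_AGAIN until the counter in ctx reaches zero,
 * imitating a file system that is flagged unavailable for a while. */
sketch_ais_error_t sketch_flaky_write(void *ctx)
{
    int *failures_left = ctx;
    if (*failures_left > 0) {
        (*failures_left)--;
        return SKETCH_AIS_ERR_TRY_AGAIN;
    }
    return SKETCH_AIS_OK;
}
```

A caller would pass its own write wrapper in place of `sketch_flaky_write` and pick an attempt limit and pause that suit how long the file system is expected to stay unavailable.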
    
     
