HA-JDBC/Postgres; can't get fail-over working

Help
MohaBash
2009-05-21
2012-09-28
  • MohaBash

    MohaBash - 2009-05-21

    Hi all,

    I need some help setting up HA-JDBC with PostgreSQL. I have a setup with two databases: one is the primary db and the other is a secondary (weight=0) in the cluster config.

    The problem is that I am trying to simulate a network failure while committing writes. Whenever I shut down the link to the secondary db, the Java app just hangs and does not write anything to the primary db until the link to the secondary db is restored.

    Can anyone please advise?

    cheers.

     
    • Paul Ferraro

      Paul Ferraro - 2009-05-27

      What exactly do you mean by "shut down the link"?
      HA-JDBC doesn't have any special powers beyond the PostgreSQL driver's ability to detect failures. The PostgreSQL driver should notice the connectivity loss immediately via a broken socket connection.
      Do you by any chance have a connection pooling layer beneath HA-JDBC that might be iterating indefinitely through a pool of broken connections until it gets a valid one?

       
    • MohaBash

      MohaBash - 2009-05-28

      Hi,

      By "shutting down link" I meant "yanking the cable to the secondary site." I did not use any connection pooling.

      I noticed that the PostgreSQL JDBC driver does not raise any error when the link goes down - that was when I used it without HA-JDBC. Is there any way to specify a timeout after which HA-JDBC abandons a db operation and assumes that the database is down?

      cheers

       
      • Paul Ferraro

        Paul Ferraro - 2009-05-28

        OK - I see. The client socket input stream stays open until it times out (per your OS settings). For now, you can try tuning your OS's socket settings so your client socket times out faster.
        e.g. on Linux:
        /proc/sys/net/ipv4/tcp_keepalive_probes
        /proc/sys/net/ipv4/tcp_keepalive_intvl
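
        Besides the OS settings, recent versions of the PostgreSQL JDBC driver accept timeout parameters on the connection URL: connectTimeout and socketTimeout (both in seconds) and tcpKeepAlive, which turns on keepalive for the socket so that the kernel settings above apply to it. A minimal sketch that appends them to a URL is below. The class name, method name, host and database are made up for illustration, and whether a given driver version supports each parameter should be checked against its documentation.

```java
// Hypothetical helper, not part of HA-JDBC or the PostgreSQL driver.
final class PgTimeoutUrl {

    // Appends driver-level timeout parameters to a PostgreSQL JDBC URL.
    // connectTimeout and socketTimeout are given in seconds.
    static String withTimeouts(String baseUrl, int connectSeconds, int socketSeconds) {
        // Use '?' for the first parameter, '&' if the URL already has some.
        String separator = baseUrl.contains("?") ? "&" : "?";
        return baseUrl + separator
                + "connectTimeout=" + connectSeconds
                + "&socketTimeout=" + socketSeconds
                + "&tcpKeepAlive=true";
    }
}
```

        For example, PgTimeoutUrl.withTimeouts("jdbc:postgresql://db2.example.com/app", 5, 30) yields a URL that could be used for the secondary database. Two caveats: socketTimeout applies to every read on the connection, so a legitimately long-running query will also be aborted once it exceeds the limit, and if the URL is placed in an XML config file the & characters need to be written as &amp;.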

        HA-JDBC ought to be more proactive than this. There is a failure-detect-schedule configuration option that defines a cron schedule for establishing a connection and executing a simple query to detect failures. When a failure is detected via this mechanism, we deactivate the database - however, any statements already executing against that database can still hang like this. HA-JDBC should be able to notify each thread executing against the deactivated database so that it can interrupt and send the appropriate Statement.cancel(). Let me think about this a bit more.
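
        Until something like that exists inside HA-JDBC, the same idea can be approximated in application code. The sketch below runs the blocking JDBC call on a worker thread and, if it does not finish in time, calls the standard Statement.cancel() and interrupts the worker. StatementWatchdog and runWithTimeout are hypothetical names, not HA-JDBC API. Whether cancel() itself returns promptly on a dead link depends on the driver; the PostgreSQL driver, as far as I understand, sends the cancel request over a separate connection, which may also need a connect timeout.

```java
import java.sql.SQLException;
import java.sql.Statement;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of HA-JDBC or the PostgreSQL driver.
final class StatementWatchdog {

    // Runs 'work' (which is assumed to execute 'stmt') on a worker thread.
    // If it does not complete within the timeout, cancels the statement
    // and rethrows the TimeoutException to the caller.
    static <T> T runWithTimeout(Statement stmt, Callable<T> work, long timeout, TimeUnit unit)
            throws InterruptedException, ExecutionException, TimeoutException, SQLException {
        ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "stmt-worker");
            t.setDaemon(true); // do not keep the JVM alive if the driver never returns
            return t;
        });
        try {
            Future<T> result = worker.submit(work);
            try {
                return result.get(timeout, unit);
            } catch (TimeoutException e) {
                // Ask the driver to abort the statement, then interrupt the worker.
                stmt.cancel();
                result.cancel(true);
                throw e;
            }
        } finally {
            worker.shutdownNow();
        }
    }
}
```

        A caller would wrap an update as runWithTimeout(stmt, () -> stmt.executeUpdate(sql), 10, TimeUnit.SECONDS) and treat a TimeoutException as a sign that the database should be considered down. A thread blocked in a socket read may not react to the interrupt, which is why the worker is a daemon thread here.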

         
