Nate Woodward
2010-12-03 21:46:14 UTC
I'm trying to achieve high availability without data loss. I have an HDR
pair configured with a connection manager on each box for redundancy.
Clients are pointed at an sqlhosts group with the connection managers in
it, and the connection managers are configured to redirect the clients
to the current primary server in the pair. The connection managers are
also configured for failover, with FOC = HDR,10, so that the secondary
takes over the primary's job if it fails.
Now, I'm worried about this scenario:
- primary gets disconnected from network
- secondary is brought to primary mode
- primary regains network connectivity
- two primaries are on the network -- what now?
I'm considering writing an ALARMPROGRAM script to bring down the primary
if it loses network connectivity, so that HDR can be re-established with
the old secondary as the new primary. Is there a better way to do what
I'm trying to accomplish?
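
For illustration, the self-fencing idea described above could look roughly like the Python sketch below. It is a sketch under stated assumptions, not a tested solution: the "witness" host (for example the default gateway) is a hypothetical third address that is not part of the setup described here, the `ping -c 1 -W` flags assume Linux, and the local server is taken offline with `onmode -ky`. How the check gets invoked (ALARMPROGRAM, cron, or otherwise) is not shown.

```python
import subprocess


def can_reach(host, timeout_s=2):
    """Return True if one ICMP echo to `host` succeeds (Linux ping flags)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def should_fence(is_primary, peer_reachable, witness_reachable):
    """Fence only an isolated primary.

    If the witness answers but the peer does not, the peer is probably the
    one that is down, so this node keeps running.
    """
    return is_primary and not peer_reachable and not witness_reachable


def fence_if_isolated(is_primary, peer_host, witness_host,
                      reach=can_reach, shutdown=None):
    """Check connectivity and take the local server offline if isolated."""
    if shutdown is None:
        # Assumption: onmode is on PATH and the Informix environment is set.
        shutdown = lambda: subprocess.run(["onmode", "-ky"], check=True)
    if should_fence(is_primary, reach(peer_host), reach(witness_host)):
        shutdown()
        return True
    return False
```

On host1 this might be called as `fence_if_isolated(True, "host2", "<gateway address>")`. Checking a witness as well as the peer keeps the primary from shutting itself down when it is only the secondary that has gone away.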
More info:
Informix 11.70.UC1GE (Linux 32-bit, although it'll be 64-bit in
production if I get a working setup)
sqlhosts (same on both boxes):
barmgr group - -
primbar1 onsoctcp host1 foobar_1526 g=barmgr
primbar2 onsoctcp host2 foobar_1526 g=barmgr
barcluster group - - i=10
#foobar onsoctcp host1 foobar_1526 g=barcluster
foobar1 onsoctcp host1 foobar1_1527 g=barcluster
foobar2 onsoctcp host2 foobar2_1528 g=barcluster
cmsm.cfg on host1:
NAME barmgr1
SLA primbar1=primary
DEBUG 1
LOGFILE connmgr.log
FOC HDR,10
cmsm.cfg on host2:
NAME barmgr2
SLA primbar2=primary
DEBUG 1
LOGFILE connmgr.log
FOC HDR,10
'onstat -g dri' on host1:
IBM Informix Dynamic Server Version 11.70.UC1GE -- On-Line (Prim) -- Up 02:23:24 -- 354840 Kbytes
Data Replication at 0x841a81a0:
  Type           State        Paired server        Last DR CKPT (id/pg)    Supports Proxy Writes
  primary        on           foobar2              417 / 40                NA
DRINTERVAL -1
DRTIMEOUT 30
DRAUTO 3
DRLOSTFOUND /usr/informix/etc/dr.lostfound
DRIDXAUTO 1
ENCRYPT_HDR 0
Backlog 0
'onstat -g dri' on host2:
IBM Informix Dynamic Server Version 11.70.UC1GE -- Read-Only (Sec) -- Up 01:46:53 -- 354840 Kbytes
Data Replication at 0x841ae1a0:
  Type           State        Paired server        Last DR CKPT (id/pg)    Supports Proxy Writes
  HDR Secondary  on           foobar1              417 / 40                N
DRINTERVAL -1
DRTIMEOUT 30
DRAUTO 3
DRLOSTFOUND /usr/informix/etc/dr.lostfound
DRIDXAUTO 1
ENCRYPT_HDR 0
Backlog 0
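
To make the "two primaries" case detectable from the output above, a monitoring script could read the Type, State, and Paired server fields out of `onstat -g dri`. The Python sketch below assumes the row layout shown in this post (type, then state, then paired server name). The function names are made up for illustration, and only the `primary` and `HDR Secondary` types that appear here are recognised.

```python
import re

# Matches the data row under "Type State Paired server ...", e.g.
#   primary on foobar2 417 / 40
#   HDR Secondary on foobar1 417 / 40
_DRI_ROW = re.compile(
    r"^\s*(primary|HDR\s+Secondary)\s+(\S+)\s+(\S+)",
    re.IGNORECASE | re.MULTILINE,
)


def parse_dri(text):
    """Extract type, state, and paired server from `onstat -g dri` output.

    Returns None if no recognisable data row is found.
    """
    match = _DRI_ROW.search(text)
    if match is None:
        return None
    return {
        "type": " ".join(match.group(1).split()).lower(),
        "state": match.group(2).lower(),
        "paired": match.group(3),
    }


def split_brain(dri_a, dri_b):
    """True when both servers report themselves as primary."""
    return (
        dri_a is not None
        and dri_b is not None
        and dri_a["type"] == "primary"
        and dri_b["type"] == "primary"
    )
```

Feeding the two outputs above through `parse_dri` gives a `primary` paired with `foobar2` and an `hdr secondary` paired with `foobar1`, so `split_brain` is false for the healthy pair.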