Discussion:
ER issue
Laurie Gustin
2013-03-06 22:19:14 UTC
I have two IDS 11.7FC7 servers connected by Enterprise Replication (ER).
All data changes are made on server 1. Recently, data appears to have
stopped moving to server 2.

The sendq on server 1 continues to grow, while the recvq on server 2
appears to be stuck. Any clues?

from server 2:
[***@psdb10rpr ~]$ onstat -g rqm recvq | head -30

IBM Informix Dynamic Server Version 11.70.FC5 -- On-Line -- Up 00:35:05 -- 20724404 Kbytes

CDR Reliable Queue Manager (RQM) Statistics:

RQM Statistics for Queue (0x69c46028) trg_receive
Transaction Spool Name: trg_receive_stxn
Insert Stamp: 11672278
Communal Stamp: 0
Flags: RECV_Q, SPOOLED, PROGRESS_TABLE
Txns in queue: 4277
Txns in memory: 4277
Txns in spool only: 0
Txns spooled: 0
Unspooled bytes: 0
Size of Data in queue: 84675870 Bytes
Real memory in use: 84675870 Bytes
Pending Txn Buffers: 0
Pending Txn Data: 0 Bytes
Max Real memory data used: 84675870 (83886080) Bytes
Max Real memory hdrs used 2381888 (3869788976) Bytes
Total data queued: 84681895 Bytes
Total Txns queued: 4292
Total Txns spooled: 0
Total Txns restored: 0
Total Txns recovered: 0
Spool Rows read: 0
Total Txns deleted: 15
Total Txns duplicated: 0
Total Txn Lookups: 10333

from server 1:

[***@psdb01spr bci_licensing]$ onstat -g rqm sendq | head -30

IBM Informix Dynamic Server Version 11.70.FC7 -- On-Line -- Up 9 days 07:41:32 -- 20724404 Kbytes

CDR Reliable Queue Manager (RQM) Statistics:

RQM Statistics for Queue (0x6b4aa028) trg_send
Transaction Spool Name: trg_send_stxn
Insert Stamp: 5417
Flags: SEND_Q, SPOOLED, PROGRESS_TABLE, NEED_ACK
Txns in queue: 5401
Log Events in queue: 0
Txns in memory: 5401
Txns in spool only: 0
Txns spooled: 0
Unspooled bytes: 26126352
Size of Data in queue: 26126352 Bytes
Real memory in use: 22975580 Bytes
Pending Txn Buffers: 0
Pending Txn Data: 0 Bytes
Max Real memory data used: 22975580 (83886080) Bytes
Max Real memory hdrs used 2883104 (83886080) Bytes
Total data queued: 26182638 Bytes
Total Txns queued: 5417
Total Txns spooled: 0
Total Txns restored: 0
Total Txns recovered: 0
Spool Rows read: 0
Total Txns deleted: 16
Total Txns duplicated: 0
Total Txn Lookups: 212443
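
To confirm that the send queue really is growing rather than just large, it may help to sample the "Txns in queue" figure at intervals and compare the numbers. The following is a minimal sketch, assuming the `onstat -g rqm` output layout shown above; the function name `rqm_txns_in_queue` is hypothetical.

```shell
# Print the first "Txns in queue" count found in `onstat -g rqm <queue>`
# output read from stdin. Assumes the count is the last field on that line,
# as in the output pasted in this thread.
rqm_txns_in_queue() {
    awk '/Txns in queue:/ { print $NF; exit }'
}
```

On a live server this could be run as `onstat -g rqm sendq | rqm_txns_in_queue` a few minutes apart; a steadily rising number on the sender alongside a flat one on the receiver would match the symptom described here.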
Laurie Gustin
2013-03-06 23:13:44 UTC
Yes, the basic checks look good, although onstat -g nif shows RUN,BLOCK on
server 1:
[***@psdb01spr logs]$ onstat -g nif

IBM Informix Dynamic Server Version 11.70.FC7 -- On-Line -- Up 9 days 08:37:50 -- 20724404 Kbytes

NIF anchor Block: 0x7d2f0ad8
nifGState RUN
RetryTimeout 300

CDR connections:
Id   Name          State      Version  Sent   Received
---------------------------------------------------------------------------
10   g_psprod10rf  RUN,BLOCK  10       19341  22


My next step is to blow away all the replication and start over. Ugh, I
hate having to do that, so I was hoping for an easier answer.
Good afternoon. I assume you’ve done all the standard checks, which should
show something along these lines. For example, the following command should
show that both sides of your replication are not just up, but connected to
the other side:

cdr view servers

SERVERS
Server     Peer       ID   State   Status     Queue  Connection Changed
---------------------------------------------------------------------------
pin22_dba  pin22_dba  551  Active  Local      0
           pin23_dba  552  Active  Connected  0      Mar 6 07:56:40
pin23_dba  pin22_dba  551  Active  Connected  0      Mar 6 07:56:40
           pin23_dba  552  Active  Local      0

===========================================================================================
Also, make sure to check the following on both servers to ensure that the
status is RUN, and not RUN,BLOCK:

onstat -g nif

IBM Informix Dynamic Server Version 11.50.FC6W4X8 -- On-Line -- Up 13 days 00:24:40 -- 2415100 Kbytes

NIF anchor Block: 5f0d0678
nifGState RUN
RetryTimeout 300

CDR connections:
Id   Name       State  Version  Sent   Received
---------------------------------------------------------------------------
551  pin22_dba  RUN    9        18754  43

===========================================================================================
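
The RUN versus blocked check described above can be scripted so it does not depend on eyeballing the State column. This is only a sketch, assuming the `onstat -g nif` column order shown in this thread (Id, Name, State, Version, Sent, Received) and that each connection prints on one line; the function name `nif_blocked` is hypothetical.

```shell
# Read `onstat -g nif` output on stdin and print "Id Name State" for every
# CDR connection whose State column contains BLOCK.
# Assumes connection rows follow the dashed rule under the column header.
nif_blocked() {
    awk '
        /^-+[ \t]*$/ { in_table = 1; next }   # dashed rule: rows start after it
        in_table && NF >= 3 && $3 ~ /BLOCK/ { print $1, $2, $3 }
    '
}
```

Run as `onstat -g nif | nif_blocked` on each server. No output means no connection is reporting a blocked state.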
I have an alias in my informix .profile file that gives me a general state
of health for replication on my servers. I named it ‘erhealth’ and it is
as follows:

alias -x erhealth="cdr view servers;cdr list repl|grep QUEUE;cdr list repl|grep STATE;onstat -g rqm|grep 'Txns in queue';onstat -g nif|grep 551"

where the only thing you have to change is the grep for the group ID of the
other server, which will be in your $INFORMIXDIR/etc/sqlhosts file. The
output looks like the following:

SERVERS
Server     Peer       ID   State   Status     Queue  Connection Changed
---------------------------------------------------------------------------
pin22_dba  pin22_dba  551  Active  Local      0
           pin23_dba  552  Active  Connected  0      Mar 6 07:56:40
pin23_dba  pin22_dba  551  Active  Connected  0      Mar 6 07:56:40
           pin23_dba  552  Active  Local      0

QUEUE SIZE: 0
QUEUE SIZE: 0
QUEUE SIZE: 0
QUEUE SIZE: 0
QUEUE SIZE: 0
QUEUE SIZE: 0
QUEUE SIZE: 0
QUEUE SIZE: 0
STATE: Active ON:pin23_dba
STATE: Active ON:pin23_dba
STATE: Active ON:pin23_dba
STATE: Active ON:pin23_dba
STATE: Active ON:pin23_dba
STATE: Active ON:pin23_dba
STATE: Active ON:pin23_dba
STATE: Active ON:pin23_dba
Txns in queue: 0
Txns in queue: 0
Txns in queue: 0
Txns in queue: 0
Txns in queue: 0
551  pin22_dba  RUN  9  18761  43

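Since the peer group ID is the only part of the alias that changes between servers, the same checks could also be written as a shell function that takes the ID as an argument. This is a sketch rather than a drop-in replacement: it runs the same `cdr` and `onstat` commands as the alias, but matches the ID against the first column with awk instead of a plain grep, which is a substitution of mine.

```shell
# erhealth PEER_GROUP_ID
# Same checks as the alias, with the peer's group ID passed as $1
# (551 in the example output above).
erhealth() {
    : "${1:?usage: erhealth PEER_GROUP_ID}"
    cdr view servers
    cdr list repl | grep QUEUE
    cdr list repl | grep STATE
    onstat -g rqm | grep 'Txns in queue'
    # Compare against the Id column only, so 551 does not also match 5510.
    onstat -g nif | awk -v id="$1" '$1 == id'
}
```

For example, `erhealth 551` on pin23_dba would report on the connection to pin22_dba.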
Hope some of this is useful. It always helps me to document some of this
for times when other DBAs are on call. Please feel free to let me know if
any of this isn’t clear.

Thanks,

Floyd Bortz
Database Analyst
Enterprise Data Management
Ocotillo Corporate Center | 2600 S. Price Rd., 4th floor | Chandler, AZ 85286
MAC S3929-047
Tel 480-724-6854 | Cell 602-541-1189