When I perform a MUPIP EXTRACT using binary format on a very large database (350 GB) I immediately get a %SYSTEM-E-ENO14 Bad Address error.
Here's a transcript of the session along with some info about the database and the environment:
$ mupip extract -format=BIN -select=changeset cs.extract
%SYSTEM-E-ENO14, Bad address
MUPIP is not able to complete the extract due to the above error
WARNING!!! Extract file cs.extract is incomplete!
$ ls -al changeset.dat
-rw-rw-rw- 1 xxx users 387589793280 Oct 20 07:26 changeset.dat
$ ls -alh changeset.dat
-rw-rw-rw- 1 xxx users 361G Oct 20 07:26 changeset.dat
$ dse
File changeset.dat
Region CHANGESET
DSE> d -f
File changeset.dat
Region CHANGESET
Date/Time 20-OCT-2011 07:27:02 [$H=62384,26822]
Access method BG Global Buffers 64
Reserved Bytes 0 Block size (in bytes) 65024
Maximum record size 4080 Starting VBN 129
Maximum key size 255 Total blocks 0x005AF40E
Null subscripts NEVER Free blocks 0x0029D100
Standard Null Collation FALSE Free space 0x00000000
Last Record Backup 0x0000000000000001 Extension Count 65535
Last Database Backup 0x0000000000000001 Number of local maps 11643
Last Bytestream Backup 0x0000000000000001 Lock space 0x00000028
In critical section 0x00000000 Timers pending 0
Cache freeze id 0x00000000 Flush timer 00:00:01:00
Freeze match 0x00000000 Flush trigger 60
Current transaction 0x00000005247789C0 No. of writes/flush 7
Maximum TN 0xFFFFFFFFE3FFFFFF Certified for Upgrade to V5
Maximum TN Warn 0xFFFFFFFF73FFFFFF Desired DB Format V5
Master Bitmap Size 112 Blocks to Upgrade 0x00000000
Create in progress FALSE Modified cache blocks 0
Reference count 1 Wait Disk 0
Journal State [inactive] ON Journal Before imaging TRUE
Journal Allocation 2048 Journal Extension 2048
Journal Buffer Size 2159 Journal Alignsize 2048
Journal AutoSwitchLimit 8386560 Journal Epoch Interval 300
Journal Yield Limit 8 Journal Sync IO FALSE
Journal File: changeset.mjl
Mutex Hard Spin Count 128 Mutex Sleep Spin Count 128
Mutex Spin Sleep Time 2048 KILLs in progress 0
Replication State OFF Region Seqno 0x0000000000000001
Zqgblmod Seqno 0x0000000000000000 Zqgblmod Trans 0x0000000000000000
Endian Format LITTLE Commit Wait Spin Count 16
Database file encrypted FALSE
DSE> exit
$
GTM>w $zv
GT.M V5.4-001 Linux x86_64
GTM>h
Further investigation suggests that this is somehow related to the blocksize of the database.
Extract works for all my databases that have a 4k blocksize, but gives the ENO14 error for all my databases that have a 64k blocksize.
Is this a known problem with this version of GT.M?
It perhaps means that a shared memory quota is not large enough. Remember that the size of the shared memory segment needed is the block size times the number of global buffers plus some overhead for control structures. In any case, a block size of 65,024 certainly is not tested as thoroughly as more common block sizes. Also, if I remember correctly, we have a recommendation not to use before image journaling on block sizes larger than either 16KB or 32KB, I forget which.
If you really need such a large block size, using the MM access method may be a cleaner alternative. But it does restrict you to NOBEFORE journaling.
Thanks, Bhaskar
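For illustration, switching a segment to the MM access method in GDE would look roughly like this (a sketch only; the segment name below is a placeholder, and an existing database file may also need a corresponding MUPIP SET -ACCESS_METHOD=MM):
$ $gde
GDE> change -segment CHANGESET -access_method=MM
GDE> exit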
I really do need the 64k blocksize, unless I somehow split my globals across databases.
I've been using before image journalling without any problem, so far. That includes performing multiple rollbacks without any trouble. There's a warning about stream backups with 64k blocksizes, but I've not read anything about a limitation on before image journaling:
$ mupip create -reg=C
%GTM-W-MUNOSTRMBKUP, Database c.dat has a block size larger than 32256 and thus cannot use stream (incremental) backup
The shared memory requirement is for only 64 global buffers, so 64 x 64k = 4MB. But I have shmmax set to 4GB which ought to be enough. Are there any other kernel settings that should be increased?
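For reference, these are the shared-memory limits I know how to check on Linux (just a sketch of the usual knobs, not anything GT.M-specific):
$ cat /proc/sys/kernel/shmmax   # largest single shared memory segment, in bytes
$ cat /proc/sys/kernel/shmall   # system-wide shared memory limit, in pages
$ cat /proc/sys/kernel/shmmni   # maximum number of shared memory segments
$ ipcs -m                       # segments currently allocated, including the database's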
On my laptop, I just successfully created a database with a 65,024-byte block size and extracted data from it in both BIN and ZWR formats. Of course, I had far, far less data than your 361GB. To help debug, could you please:
1. Verify that you have adequate space in your file system.
2. Try with a small database - maybe just a few nodes - but a 65,024 byte block size.
Thank you very much.
Bhaskar
Bhaskar,
1) Plenty of free space on the file system: 226GB.
2) I tried creating a fresh database and populated it with just a small amount of data. I get exactly the same problem. Here's the transcript for verification that I didn't do anything silly:
/data$ $gde
%GDE-I-LOADGD, Loading Global Directory file
/data/xapi.gld
%GDE-I-VERIFY, Verification OK
GDE> add -seg test -file=test.dat
GDE> cha -seg test -block=65024
GDE> sh -seg test
*** SEGMENTS ***
Segment File (def ext: .dat) Acc Typ Block Alloc Exten Options
-------------------------------------------------------------------------------------------
TEST test.dat BG DYN 65024 100 100 GLOB=1024
LOCK= 40
RES = 0
ENCR=OFF
GDE> add -reg test -dyn=test
GDE> add -name test -reg=test
GDE> exit
%GDE-I-VERIFY, Verification OK
%GDE-I-GDUPDATE, Updating Global Directory file
/data/xapi.gld
/data$
/data$
/data$ $mupip create -reg="TEST"
%GTM-W-MUNOSTRMBKUP, Database /iscsidata/xapi/data/test.dat has a block size larger than 32256 and thus cannot use stream (incremental) backup
Created file /data/test.dat
/data$ $gtm
GT.M Rock solid, lightning fast
GT.M V5.4-001 Linux x86_64
GTM>zwr ^test
%GTM-E-GVUNDEF, Global variable undefined: ^test
GTM>f x=1:1:10000 s ^test(x)=$j("",100)
GTM>h
/data$ $mupip extract -format=bin -select="^test"
Output File: test.bin
%SYSTEM-E-ENO14, Bad address
MUPIP is not able to complete the extract due to the above error
WARNING!!! Extract file test.bin is incomplete!
/data$
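For completeness, a ZWR-format extract of the same small global would show whether the failure is specific to -format=BIN (a sketch against the test database created above; the output file name is arbitrary and I haven't run this yet):
/data$ $mupip extract -format=ZWR -select="^test" test.zwr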