----------------------------------------------------------------------
ELOQUENCE B.07.00 - patch 0406030
----------------------------------------------------------------------

This patch fixes a defect of the eloqdb6 program as released with
Eloquence B.07.00. This patch will be integrated in the Eloquence
B.07.00 release.

Eloquence B.07.00 must be installed before applying this patch.


Severity:

PE70-0406030: BUG FIX
PE70-0401160: CRITICAL (server abort, corruption)
PE70-0401130: CRITICAL (server abort, corruption)
PE70-0312080: BUG FIX
PE70-0311180: ENHANCEMENT
PE70-0310180: BUG FIX
PE70-0310082: BUG FIX
PE70-0309300: BUG FIX
PE70-0307161: CRITICAL (server abort, corruption)
PE70-0305270: BUG FIX
PE70-0304041: CRITICAL (database inconsistencies)


Patch PE70-0403050
------------------

Platforms: ALL

* The eloqdb6 process could abort with an internal failure when
  searching an index with multiple string segments (#2219).

  An internal error message like below was written to the log file:

    Assertion failed: len_a >= 0 && len_b >= 0
    server panic: Aborting on internal failure, file nls.c, line 366

  This was caused by a defect in the Eloquence DBFIND modes 6/7 that
  are used in the Eloquence image3k library.

* Performing a DBFIND on a master set that was not created could
  result in an abort of the eloqdb6 process with an error message
  like below (#2263):

    Assertion failed: node
    server panic: Aborting on internal failure, file nodecore.c, line 1273

  eloqdb6 has been fixed to return a status 80 in case an index is
  not created.

* The http status could cause a temporary hang of the eloqdb6 server
  process in case of a network communication problem with the
  browser (#2315). This was observed with connections through a VPN
  gateway.

* A new dbctl function was added to allow changing the sync mode
  without restarting the server process.

    usage: dbctl syncmode {ON|OFF}

    > dbctl -udba syncmode on
    Enables sync mode.
    Equivalent to SyncMode=1

    > dbctl -udba syncmode off
    Disables sync mode.
    Equivalent to SyncMode=0

Platforms: Windows

* Due to a build problem the number of concurrent connections on the
  Windows platform might be limited to 62. Additional connections
  would hang while the connection was being established.

Platforms: HP-UX

* An improper timeout value could cause a performance problem when
  using TCP/IP communication and few connections are active.

Platforms: Linux

* eloqdb6 was not compatible with the glibc2.3 library due to
  directly accessing the errno variable. This resulted in a warning
  on SUSE Linux 9.0 and a start failure with SUSE Linux 9.1.

* The -s command line argument, which specifies the service or port
  number, was corrupted during startup on Linux (#2317). This caused
  a failure if the ELOQDB6_SERVICE[] variable was specified in the
  startup configuration.

Notes / Related patches:

- When using this patch, installation of dblogreset patch
  PE70-0401161 (or newer) and dbrecover patch PE70-0401162 (or newer)
  is recommended.


Patch PE70-0401160
------------------

Platforms: ALL

* A server abort due to an internal error during a dbrestore
  operation could result in disk space not being released and stale
  node entries (#2187,#2192).

  When the eloqdb6 process aborted during a dbrestore operation, in
  some cases the disk space used by the partial dbrestore was not
  released. In addition, stale node entries could remain that could
  result in warnings when running dbfsck.

  The eloqdb6 database server has been modified to recover disk space
  from an aborted dbrestore operation and to delete all non-committed
  node entries during restart.

* In case of a server abort due to an internal error, forward logging
  is disabled and a warning message is written to the log file
  (#2198).

  In case the eloqdb6 process aborted and performs a crash recovery,
  the forward log files may not fully describe the volume changes.
  Consequently, forward logging is disabled. To recover, a backup
  must be performed and forward logging needs to be re-enabled
  manually.

* The eloqdb6 process could abort with an internal error message
  like below (#2190):

    Assertion failed: cluster
    panic: Aborting on internal failure, file volpool.c, line 3182

* Insufficient disk space during a dbrestore operation could cause
  the eloqdb6 process to abort with an internal failure
  (#2189,#2192):

    Pool_Commit() failed on Node_AddItemToFlist()
    Assertion failed: Tlog_Commit() failed on Tlog_Action()
    panic: Aborting on internal failure, file voltxn.c, line 749

    Assertion failed: !FPool_ValidAnchor(&meta->do_log_anchor)
    panic: Aborting on internal failure, file voltxn.c, line 734

* A server recovery after a previous server abort due to an internal
  error could result in an internal failure message like below
  (#1169):

    Assertion failed: f_bhp->id.node_id == node_id
    panic: Aborting on internal failure, file mpool.c, line 408

  This was caused by a defect that could release transaction journal
  pages too early. In rare cases this could corrupt the transaction
  journal and could potentially lead to data corruption.

When using this patch, installation of dblogreset patch PE70-0401161
(or newer) and dbrecover patch PE70-0307162 (or newer) is recommended.


Patch PE70-0401130
------------------

Platforms: ALL

* The eloqdb6 process could abort with an internal failure when
  searching an index with multiple key segments of different types
  when providing a search key that covered multiple segments (#2175).

  An internal error message like below was written to the log file:

    Assertion failed: idx_op->seg_len == idx_op->val_sz
    server panic: Aborting on internal failure, file find.c, line 933

  This was caused by a defect in the Eloquence DBFIND modes 6/7 that
  are used in the Eloquence image3k library.

* A problem with collating sequences and the Eloquence DBFIND modes
  6/7 was fixed (#2175).

* The eloqdb6 server process could abort with a message like below
  when a dbrestore operation failed due to an invalid archive
  (#2181):

    server panic: Fatal problem detected in bf_flags
    Assertion failed: !(bhp->flags & BUF_NEWPAGE)
    server panic: Aborting on internal failure, file mpool.c, line 2095

  The eloqdb6 code has been fixed to recover correctly from a
  dbrestore failure in this case.

* A rare race condition has been fixed that could result in
  corrupted node meta data during a dbrestore operation (#2182).

* The http status has been modified to use relative links (#2183).
  Previous versions created absolute links (including the host IP
  address) that could cause a problem on systems with multiple
  network addresses and in NAT environments.

* A server abort during a dbrestore operation could result in a
  circular page link in the server node directory (#2186).

  The server recovery procedure has been modified to avoid and fix
  circular page links in the node directory. In case a circular page
  link in the server node directory is recognized, an informational
  message like below is output:

    N1: Node_CommitNodePage(page=0xe396): page is already present - ignored

* The eloqdb6 server could abort with an internal failure if the
  available disk space (as defined by the volume files) was
  exhausted during a dbrestore operation. This could result in
  volume corruption (#2187).

* An ongoing dbrestore operation could block concurrent requests or
  cause large delays (#2188).


Patch PE70-0312080
------------------

Platforms: ALL

* DBLOCK could ignore a conflicting lock.

  In rare cases DBLOCK could ignore a granted lock when granting a
  previously blocked lock request. This could cause status -12 on a
  subsequent database write (#2167).

* DBGET mode 1 returned status 12 if no current record existed. It
  has been modified to return status 17 (#2168).

* DBFIND mode 1 on a search item (chained access) could result in a
  memory leak if the corresponding master set was not created.

* Database restructuring could fail to initialize new items (#2171).

* The status return for DBFIND mode 1 on index items was inconsistent
  with the documentation.

  Status element 8 returned the first matching record number and
  status element 10 the last matching record number. The documented
  behavior is to return the first matching record number in status
  element 10 and the last matching record number in status element 8.
  DBFIND mode 1 on index items now follows the documented behavior.

* The Eloquence DBFIND modes 6 and 7 may be used with detail search
  items. This enables the use of TurboIMAGE btree modes with the
  image3k library ("superchains").

* The data set /INDEXED flag is now returned to the image3k library
  (#1358).

  This enables the TurboIMAGE DBINFO modes 113 and 209 to return
  information on master sets with "attached" indexes. It also
  enables the use of DBFIND mode 1 (if the btreemode1 flag is set)
  to locate records through the master set index ("superchains").

  Please note that index access (other than DBFIND mode 1) is always
  enabled with Eloquence, regardless of whether the /INDEXED flag is
  set for a master set.


Patch PE70-0311180
------------------

Platforms: ALL

* The maximum number of concurrent connections has been increased to
  4000 on the HP-UX platform (PA-RISC and Itanium). Windows and
  Linux are still limited to 1000 concurrent connections.

* The performance for larger configurations has been improved.

  - The eloqdb6 process has been changed so that a large number of
    configured threads does not affect the performance.

  - The performance of the eloqdb6 process has been improved when a
    large number of (mostly) idle connections is used.

  - The number of system calls has been reduced.

* The Eloquence DBFIND modes 6 and 7 may be used with master sets.
  The previous implementation was limited to detail sets.

  To use the image3k DBFIND with master sets, installation of patch
  PE70-0311181 (or newer) is required.


Patch PE70-0310180
------------------

Platforms: ALL

* A deadlock situation could cause the eloqdb6 process to abort with
  a panic message (#2136):

    D0: Assertion failed: tcp->marker != current_marker
    D0: server panic: Aborting on internal failure, file thread.c, line 362

  This panic is caused by a deadlock situation that is not detected
  immediately and only becomes apparent after another process
  executes a DBUNLOCK. However, at this time the deadlock resolution
  code is bypassed instead of flagging one of the processes involved
  in the deadlock.

  The resolution is to notify one of the threads involved of the
  deadlock and return a db status to the application.


Patch PE70-0310082
------------------

Platforms: ALL

* The eloqdb6 process could abort with an internal failure when
  on-line backup is initiated (dbctl backup start) (#2128).

  eloqdb6 could abort with the message below:

    D0: server panic: Fatal problem detected in Fwr_Setup
    D0: Assertion failed: rc == 0
    D0: server panic: Aborting on internal failure, file volfwr.c, line 541

  When forward logging is used but currently disabled due to an error
  (e.g. disk full, permission problem) and is subsequently re-enabled
  (dbctl forwardlog enable), writing to the log file is delayed until
  the server is restarted or an on-line backup is performed.

  Initiating on-line backup resulted in an abort of the server
  process because the previous failure status was not reset properly.


Patch PE70-0309300
------------------

Platforms: ALL

* The eloqdb6 server process could abort with a SIGSEGV when forward
  logging is used and a log file reached its maximum size (#2126).

* The maximum number of concurrent connections has been increased
  from 1000 to 2000 on the HP-UX platform (#2125).


Patch PE70-0307161
------------------

Platforms: All

* DBGET mode 6 after an image3k DBFIND 1/21 on an index could return
  status 14 (#2046).

  A DBGET mode 6 after an image3k DBFIND mode 1/21 on an index should
  retrieve the last matching record in index order.
  DBGET mode 6 failed with status 14 when the record was the highest
  key in the index.

* The eloqdb6 process could abort with an internal failure or cause
  database corruption (#1966, #2030):

    Assertion failed: f_bhp->id.node_id == node_id

  Restructuring a database with dbutil could cause the server process
  to abort with an internal error. In rare cases it could cause
  corruption of the database catalog during crash recovery.

* Restructuring a database could cause an internal failure during
  forward recovery (#2051).

  Restructuring a database might cause an internal failure in the
  dbrecover utility when applying the forward log files. The
  dbrecover utility could abort with an error message like below:

    panic: Fatal problem detected in FixRec_FinalCommitUpdatePut
    Assertion failed: *flag_ptr == FixRec_USED || *flag_ptr == FixRec_DELETED


Patch PE70-0305270
------------------

Platforms: All

* The eloqdb6 process could abort with an internal failure (#1973).

  eloqdb6 panics with the message:

    panic: current_task->waiting_for == NULL (thread.c, line ~413)

  This was caused by a race condition which could happen when a
  number of application processes are terminated at the same time
  while accessing the database. The eloqdb6 server has been modified
  to solve this problem.

  The problem can be triggered in the following scenario:

  - Application thread A holds a lock. It is in the process of
    releasing database locks.

  - Application thread B is waiting for the lock held by thread A.

  - Application thread C is waiting for the lock held by thread A
    (and thread B). Thread C gets interrupted (application process
    has been killed) during the wait.

  - There is a situation where thread A might become runnable before
    thread C during the same scheduler quantum and might unexpectedly
    update a scheduler variable used in the deadlock detection code
    ("waiting_for", which is used to track dependencies among
    application threads).

* Connecting to the database server could fail with status -700:-6
  when EnableIPC=2 is used on HP-UX (#1982).

  A message like below is printed:

    P0: Unable to down semaphore
    P0: semop(DOWN): Identifier removed (errno 36)
    P0: Failure during wait on server response

  This was caused by a missing initialization when a shared memory
  segment was re-used by a subsequent process. Due to a race
  condition, a new connection could be disconnected by the server
  immediately.

* Recovery in case of abnormal server termination has been improved.
  An incomplete transaction journal could corrupt the database
  volume(s) for btree and catalog node types (#1966).

* In case of inconsistent meta information (such as number of
  records) the eloqdb6 process could abort with an internal error
  message like below (#1941):

    D0: server panic: Fatal problem detected
    D0: Assertion failed: meta->num_records == 1
    D0: server panic: Aborting on internal failure, file volfrec.c, line 3191

  This situation is now handled more gracefully and no longer causes
  a server abort.

* When forward logging is enabled, the eloqdb6 process will refuse to
  overwrite an existing file (#1952). An existing file is considered
  a logging failure and will either result in disabling forward
  logging or an internal abort (depending on the configuration).

  When using forward recovery, restoring the volume files from the
  backup and then erroneously starting the eloqdb6 process could
  overwrite forward log segments and potentially cause data loss.
  The eloqdb6 server has been modified to refuse overwriting existing
  files.

* When forward logging is enabled and the eloqdb6 process is killed
  using kill -9 (e.g. because patch PE70-0305090 is not installed),
  it might fail to flag the volume file(s) as consistent. This might
  cause the server to refuse to start (#1985).

  A message like below is printed:

    failed to open volume: volume #1 has inconsistent generation
    count 8 (should be 9) volume.c line 828

  The eloqdb6 server code has been modified to handle this situation
  correctly.


Patch PE70-0304041
------------------

Platforms: All

* Changing the data set structure with dbutil could result in an
  inconsistent internal highest record number. As a consequence, new
  records which are added subsequently could become inaccessible.
  This could cause database status -96 on DBPUT or status 17 and
  status 18 on DBGET, depending on the use of the particular data
  set (#1880).

  Please note:

  Installing this patch will prevent this problem from occurring in
  the future. However, it will not correct your existing database
  environment(s).

  In case you used the Eloquence B.07.00 dbutil program to modify the
  database structure, we recommend checking and correcting your
  database environment(s) with the dbfsck utility:

  1) Shut down the eloqdb6 database server instance(s).

  2) Write access to the volume files is required. On HP-UX and
     Linux, log on as root or as the user configured in the server
     configuration file.

  3) Check and correct your database environment(s) with dbfsck:

       dbfsck -way

     This will automatically fix any inconsistencies caused by dbutil
     data set restructuring.

     You may need to specify the -c /path/to/configfile command line
     option in case multiple database server instances are used.


Installation:
-------------

UNIX:

In order to install this patch, you need to unpack it with gzip.
Gzip is included with HP-UX and Linux. Installation requires root
privileges.

  cd /opt/eloquence6
  gzip -dc /path/to/PE70-0406030-hpux.tar.gz | tar xf -

Files:

  bin/eloqdb6
  share/doc/PE70-0406030-README


Windows XP/2000/NT:

This patch should *only* be installed if you previously installed the
Eloquence server components on your system.

Download the PE70-0406030-win32.zip file and unpack its contents with
WinZip or PKUNZIP. Installation requires administrative capabilities.

PLEASE MAKE SURE THE eloqdb6 SERVICE has been STOPPED previously
(in the Service Control Manager or with NET STOP eloqdb6).

Please copy the eloqdb6.exe file into the WINDOWS SYSTEM DIRECTORY
(for example C:\Windows\System32).

Please copy the PE70-0406030-README.txt file into the share\doc
subdirectory of your Eloquence installation
(for example C:\Programs\Eloquence\share\doc).

Files:

  eloqdb6.exe
  PE70-0406030-README.txt
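

Example (UNIX installation sketch):

The UNIX installation steps above can be wrapped in a small shell
function. This is only a sketch under stated assumptions: the function
name install_patch and the backup file name are hypothetical and not
part of the patch, and keeping a copy of the previous eloqdb6 binary
is an added precaution, not a documented step. The archive path must
be absolute because the function changes into the installation
directory before unpacking.

```shell
#!/bin/sh
# Sketch only: install_patch is a hypothetical helper, not shipped
# with Eloquence. It mirrors the two documented commands
#   cd /opt/eloquence6
#   gzip -dc /path/to/PE70-0406030-hpux.tar.gz | tar xf -
install_patch() {
    prefix=$1    # installation directory, e.g. /opt/eloquence6
    archive=$2   # absolute path to the patch archive (.tar.gz)

    [ -d "$prefix" ] || { echo "no such directory: $prefix" >&2; return 1; }
    [ -f "$archive" ] || { echo "no such archive: $archive" >&2; return 1; }

    # Assumption (not in the README): keep a copy of the current
    # binary so it can be put back by hand if needed.
    if [ -f "$prefix/bin/eloqdb6" ]; then
        cp -p "$prefix/bin/eloqdb6" \
              "$prefix/bin/eloqdb6.pre-PE70-0406030" || return 1
    fi

    # Unpack relative to the installation directory, as documented.
    (cd "$prefix" && gzip -dc "$archive" | tar xf -)
}
```

A call such as the following, run with root privileges, would then
correspond to the documented procedure:

  install_patch /opt/eloquence6 /path/to/PE70-0406030-hpux.tar.gz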