----------------------------------------------------------------------
ELOQUENCE B.08.20 - patch PE82-1907090
----------------------------------------------------------------------

This patch adds enhancements or fixes defects of the eloqdb server as
released with Eloquence B.08.20. This patch will be integrated in the
Eloquence B.08.20 release.

Eloquence B.08.20 must be installed before applying this patch.

Severity:

 PE82-1907090: BUG FIX

Superseded patches:

 PE82-1906240: BUG FIX
 PE82-1903040: BUG FIX
 PE82-1902130: BUG FIX
 PE82-1711140: BUG FIX, ENHANCEMENT
 PE82-1710260: BUG FIX
 PE82-1710200: BUG FIX
 PE82-1707110: BUG FIX
 PE82-1706230: BUG FIX, ENHANCEMENT
 PE82-1704210: BUG FIX
 PE82-1704120: BUG FIX
 PE82-1704070: BUG FIX
 PE82-1703300: BUG FIX
 PE82-1703240: BUG FIX
 PE82-1703140: BUG FIX
 PE82-1612190: BUG FIX
 PE82-1612070: BUG FIX
 PE82-1611170: BUG FIX
 PE82-1610260: BUG FIX
 PE82-1609130: BUG FIX
 PE82-1606270: BUG FIX
 PE82-1606020: BUG FIX
 PE82-1510050: BUG FIX
 PE82-1507270: BUG FIX
 PE82-1506230: BUG FIX
 PE82-1501300: BUG FIX
 PE82-1411100: BUG FIX
 PE82-1410010: BUG FIX
 PE82-1404150: BUG FIX
 PE82-1312160: BUG FIX
 PE82-1312040: BUG FIX


Patch PE82-1907090
------------------

Platforms: All

* Fixed a rare race condition where enumerating the internal threads
  could access a terminating thread (#4061). This could in some cases
  result in a failed lock or removal operation. Threads are
  enumerated when using the http status or the dbctl list
  functionality. An error message as below was logged:

   T0: [#] pthread_mutex_destroy(p_mutex) failed (errno 16)
   T0: [#] pthread_mutex_lock(tcp->p_mutex) failed (errno 22)
   D0: [#] server panic: Fatal problem detected in thread__lock

* Sync forward log file changes to disk on a checkpoint. In case of a
  system abort this limits data loss in the forward log to the last
  checkpoint and ensures replication can be resumed.


Patch PE82-1906240
------------------

Platforms: All

* Fixed a problem where under rare conditions a replication slave
  server could abort with a message like below when replication is
  resumed (#4283):

   Fwr_PageHashLookup() failed: key ... not found
   ...
   server panic: Fatal problem detected in Fwr_PageHash__Lookup
   Assertion failed: Fwr_PageHashLookup() failed: key not found
   server panic: Aborting on internal failure, file volfwr.c
   ...

* Fixed a problem where a replication slave server could abort with a
  message like below while processing a large transaction (#4292):

   Fwr_PageHashAdd() failed: key ... already present
   ...
   server panic: Fatal problem detected in Fwr_PageHash__Add
   Assertion failed: Fwr_PageHashAdd() failed: key already present
   server panic: Aborting on internal failure, file volfwr.c
   ...

Notes / Related patches:

* Patch PE82-1906241 or superseding (dbrecover utility) fixes a
  related problem and should be installed with this patch.


Patch PE82-1903040
------------------

Platforms: All

* Fixed a database server internal deadlock condition that could in
  rare cases result in a hanging server process (#4285).

  A deadlock condition was fixed in the database server cache
  management that could result in a (partially) hanging server
  process. A thread status (dbctl list thread or the thread status
  dumped on shutdown) indicates a block in the mpool subsystem.

  This was caused by a potential lock ordering problem when a cache
  buffer was reused by a concurrent connection.
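
  For reference, the thread status mentioned above can be obtained
  from a running server with the dbctl utility. A minimal sketch
  (the dba user name follows the examples elsewhere in this document;
  the exact output depends on the server version and configuration):

   $ dbctl -u dba list thread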

Patch PE82-1902130
------------------

Platforms: All

* Fixed a problem where bad syntax in an FTS range search could
  trigger a panic. A specific range syntax could trigger an internal
  consistency check instead of returning an invalid syntax status.

* Fixed a problem where FTS range search results could be
  inconsistent if the FTS index was modified concurrently.

* Revised some FTS diagnostic log messages.

* Fixed a potential deadlock condition which could happen on a
  replication slave server if an FTS query was executed while FTS
  updates were replicated (#4263).

* Fixed dbctl volume limit to reset volume flags. Changing the
  volume size limit with dbctl volume limit did not allow volume
  files to grow if a size limit had been reached before.

* Fixed a potential eloqdb panic when a transaction is committed but
  no space is left in the transaction log:

   Assertion failed: FixRec_CommitUpdatePut() failed on Tlog_WriteRecord()

* Fixed a problem where rolling back a transaction could fail due to
  insufficient space in the transaction log.

* Fixed a problem where configuring TransactionSizeLimit = 0 could
  cause the eloqdb to panic due to insufficient space in the
  transaction log.

* Fixed a problem where the transaction journal could grow beyond
  the configured CheckPtSize, subsequently causing concurrent
  transactions to fail due to insufficient transaction log space.
  Long running client requests, such as dbrestore, use transaction
  log space but did not monitor the configured CheckPtSize.

* Fixed a replication stop/start race condition resulting in an
  eloqdb abort (#4280). The slave server eloqdb process was aborted
  with a message as below after replication was started:

   Assertion failed: fwr.recovery.repl_flags & FWREPL_STARTED

  This problem was caused by allowing a new replication session
  while the previous replication session was being wound down.

* Fixed a problem where the reported replication lag could be
  negative due to the timer interval.

* The dbctl list thread output was changed to include the operating
  system LWPID. The "blocked" column was removed and the operating
  system LWPID (OSID) is output instead.

* Fixed a problem where the UseKeepAlive configuration setting was
  not functional.

* The database encryption functionality now supports recent OpenSSL
  crypto library versions.

* Linux: Added a workaround for a glibc (nptl pthread library)
  defect. glibc versions 2.26 and later may cause the eloqdb server
  to hang under load.


Patch PE82-1711140
------------------

Platforms: All

* Fixed a problem in the index retrieval procedure that could under
  rare conditions cause partial results if the index is updated
  concurrently. The problem could be observed in FTS range searches
  if the dictionary index was updated concurrently. Under certain
  conditions, incomplete search results could be returned.

* Changed the FTS memory allocation to grow the result buffer
  differently for large FTS results.

* Added a config option to limit the number of FTS search results
  for a session. The [Config] SessionFtsLimit config item may be
  used to specify the maximum FTS result size:

   SessionFtsLimit = 100m

  A trailing "m" indicates millions, a trailing "k" indicates
  thousands of results. If the number of search results exceeds the
  limit, a status -813:11 is returned.

  To ensure full backwards compatibility, no FTS limit is set by
  default. When using FTS indexes it is recommended to specify a
  reasonable limit to avoid arbitrarily complex searches.

* Added the dbctl "fts limit" command to obtain or specify the FTS
  result limit:

   $ dbctl -u dba fts limit 10m
   fts limit 10000000
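
  For illustration, a corresponding configuration excerpt might look
  as below. This is a minimal sketch: the server configuration file
  name (assumed here to be eloqdb.cfg) and the 10m value are
  examples, not part of this patch.

   # Excerpt from the eloqdb server configuration file (eloqdb.cfg).
   # Caps FTS results at 10 million per session; searches exceeding
   # the limit return status -813:11.
   [Config]
   SessionFtsLimit = 10m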

Patch PE82-1710260
------------------

Platforms: All

* Changed the FTS syntax parser to support an empty range.

* Changed the FTS range search to assume a numeric range if a
  numeric end value is specified and an open range is used. If the
  FTS "NR" option is not set, an optimized numeric search is now
  performed, matching the number of digits of the end value.


Patch PE82-1710200
------------------

Platforms: All

* Improved FTS search performance for a leading wild card expression
  in a text field.

* Fixed a problem where repeating an interrupted startup recovery
  could result in an empty forward-log "bridge" segment (#4255).

  After an abnormal termination, the database server (or the
  dblogreset utility) performs a startup recovery. During this
  process, a special forward-log "bridge" segment is created that
  allows a dbrecover, dbrepl or fwutil process to continue across
  the previous abnormal termination.

  Interrupting and then repeating the startup recovery should resume
  on an existing "bridge" segment. However, because an internal
  state was not correctly initialized, the resulting "bridge"
  segment could be empty.


Patch PE82-1707110
------------------

Platforms: Windows

* Fixed a problem which could happen when starting the eloqdb
  service on Windows. A saved command line for the eloqdb service
  could sometimes become overwritten internally. As a consequence,
  the eloqdb service could fail to start. This problem was recently
  observed on Windows 10 version 1703.


Patch PE82-1706230
------------------

Platforms: All

* Support combining record-specific FTS searches from different sets
  in the same set group.

Platforms: HP-UX PA-RISC

* Work around a compiler defect that could result in a crash when
  using FTS indexes with the PA-RISC eloqdb64.


Patch PE82-1704210
------------------

Platforms: All

* Fixed an internal race condition which, in rare cases and under
  high load, could accidentally modify resources not owned by the
  invoking session (#4247, #4248). In turn, the server process
  issued a subsequent panic. The following log messages were
  observed in this context:

   server panic: Fatal problem detected in bf_get_page
   Assertion failed: bhp->id.node_id == node_id
   server panic: Aborting on internal failure, file mpool.c, line 3582

   server panic: Fatal problem detected in btree_FinalCommit
   Assertion failed: lrec != NULL && lrec->arg == 0
   server panic: Aborting on internal failure, file btree.c, line 2114

   mutex(lr#...:LP) is not locked, caller volfrec_fts.c:255
   server panic: Fatal problem detected in thread_mutex_unlock
   Assertion failed: thread_mutex_unlock failed
   server panic: Aborting on internal failure, file thread.c, line 1565

   ** Caught signal #11
   signo = 11 errno = 0 code = 1 addr = ...
   server panic: Fatal problem detected in tsignal_crash_handler
   Assertion failed: Fatal signal encountered
   ** traceback follows:
   (0) 0x40000000000f4c70 eq__assert_fail + 0x110
   (1) 0x400000000045b5b0 tsignal_crash_handler + 0x390
   (2) 0xe000000120043440 ---- Signal 11 (SIGSEGV) delivered ----
   (3) 0x40000000001239e0 Node_MetaData + 0x20
   (4) 0x400000000013d650 Tlog__LogrecUnlock + 0x30
   (5) 0x40000000001fb300 FixRecFTS_GetLock + 0x2e0
   ...
   server panic: Aborting on internal failure, file tmain.c, line 385

Patch PE82-1704120
------------------

Platforms: All

* Fixed a problem during FTS transaction rollback where under high
  write load an FTS record could be reused before all references
  were removed (#4244). This could result in an internal deadlock,
  and a message like below was logged by the server process:

   K0: fts__free_keywd_ref: keyword not deleted: db=..., adr0=...
   K0: fts__on_rollback: BUG: keyword reference inconsistent, adr0=... [631]

* Fixed a problem where an FTS transaction commit or rollback could
  in some cases cause a subsequent server panic (#4245). A message
  like below was logged by the server process:

   server panic: Fatal problem detected in Node_BeginAdvLock
   Assertion failed: !advlck->save_locks
   server panic: Aborting on internal failure, file nodetxrec.c, line 1789


Patch PE82-1704070
------------------

Platforms: All

* Fixed FTS transaction commit/rollback issues (#4242, #4244) which
  under high write load could cause a deadlock or result in log
  messages as below during rollback:

   K0: fts__adj_keywd_ref: lookup failed: db=..., adr0=...
   K0: fts__on_rollback: BUG: keyword not found, adr0=... [571]


Patch PE82-1703300
------------------

Platforms: All

* Fixed a regression introduced with patch PE82-1703240 where an FTS
  transaction rollback could in some cases trigger a server panic
  due to a failed consistency test (#4243). A message like below was
  logged by the server process:

   server panic: Fatal problem detected in idb__fts_tx_end
   Assertion failed: level == 1
   server panic: Aborting on internal failure, file runutil.c, line 1221


Patch PE82-1703240
------------------

Platforms: All

* Fixed an internal FTS locking problem that could in rare cases
  result in FTS index inconsistencies when deleting keywords
  (#4242).

  A keyword is locked while an update is pending. However, this lock
  was released before the FTS commit was completed, which could
  expose an intermediate FTS update result to a concurrent session
  under heavy write load. This could result in FTS diagnostic
  messages in the server message log, similar to below:

   K0: fts__read_ref failed: adr=0:9999 status=17/2 [929]
   K0: fts_upd_index: failed [251]


Patch PE82-1703140
------------------

Platforms: All

* Fixed a problem where an FTS search could fail with status -813.
  An error message like below was logged by the server process:

   FTS FAILED:set_entries failed (-1)

  This could be reproduced by searching an FTS key resulting in
  detail records and then refining this search on an aggregated key
  when a wildcard was used as a search argument.

* An FTS search that failed with an internal error could in rare
  cases result in a double free of a memory block. On the Linux
  platform this could abort the server process.


Patch PE82-1612190
------------------

Platforms: All

* Fixed FTS transaction issues (#4235). When using transactions,
  updates to FTS keywords might not work correctly in some cases.
  This could result in a warning message in the eloqdb log.
  Applications not using transactions were not affected.

* Fixed some FTS search expression parser issues.

* Fixed a potential memory leak if a database could not be opened
  due to insufficient permissions.


Patch PE82-1612070
------------------

Platforms: All

* Fixed a potential FTS memory leak. Memory used to hold FTS search
  results might not be properly released in some cases if an FTS
  search returned no result but had intermediate results.

* Fixed a potential server abort caused by a failed consistency test
  in the FTS search expression parser.

  The FTS search expression parser was changed to fail more
  gracefully when encountering an internal issue. A syntax error is
  returned and the search expression is logged in the eloqdb message
  log. A message as below is logged:

   K1: FTS parser failed: search expression [###]


Patch PE82-1611170
------------------

Platforms: All

* Substantially improved FTS search for a numeric range in a text
  field.

Patch PE82-1610260
------------------

Platforms: HP-UX IA64

* Fixed a problem where the panic handler did not produce a stack
  dump on HP-UX IA64.


Patch PE82-1609130
------------------

Platforms: All

* Fixed a problem where write operations could be stalled while
  on-line backup mode is being stopped (#4228). If stopping on-line
  backup mode takes long enough that a checkpoint operation is
  started, an internal lock could cause concurrent write operations
  to block until on-line backup mode has stopped.

* Fixed a problem where in a rare case a newly created database
  could get (partially) lost if the database is created while
  on-line backup mode is active and afterwards the database server
  is restarted while still in on-line backup mode (#4224).

* Fixed a minor problem when adding/deleting records in a data set
  and shortly afterwards stopping the database server before a
  checkpoint operation has been performed (#4225). In this case, the
  next time a new record is added to that data set, the database
  server has to skip redundant entries in the list of free record
  numbers. The dbfsck utility also warns about these records when
  run with -vvv verbosity.

* Enhanced the thread status to provide additional information when
  a thread is blocked on a mutex lock.


Patch PE82-1606270
------------------

Platforms: All

* Substantially improved FTS performance when searching a range.

* Changed the FTS range search to assume a numeric range if a
  numeric start value is specified and an open range is used.

  If the FTS "NR" option was not set, this was previously considered
  a string search. Now a numeric upper boundary is assumed in this
  case, matching the number of digits of the start value. For
  example, searching for 2016: now implies a search for 2016:9999.

* Fixed a potential buffer overflow in the http status.

* Fixed a potential locking inconsistency when adding FTS keywords.
  The server process might incorrectly assume a deadlock situation
  and return a status code. A message as below is logged:

   T1: [10] Deadlock detected between tasks 10 and 10
   K0: [10] fts__lock_ref failed: adr=0:9900024 status=-803/27 [981]


Patch PE82-1606020
------------------

Platforms: All

* Fixed a problem that could affect FTS backwards compatibility (ODX
  calls). The number of results might incorrectly be returned as
  zero after a previous search with negative results (unary NOT).

* Some FTS related consistency checks were relaxed to return a
  database status.


Patch PE82-1510050
------------------

Platforms: All

* Fixed an internal consistency check that could result in a panic
  message as below:

   Assertion failed: m_recno == fts_mrecno_new

  This could happen when updating a search item in a detail record
  with an aggregated FTS index.

* Changed the memory allocation algorithm for huge FTS results. A
  large number of FTS results could previously take many allocations
  to grow the result list, which could be inefficient.


Patch PE82-1507270
------------------

Platforms: All

* Fixed a memory leak where the last FTS search results were not
  released when the database was closed.

* Fixed a potential race condition where a busy server could fail
  with an internal error when allocating new records (#4060):

   Assertion failed: freelist_miss == 0

  After enlarging a table, all records could be used up by
  concurrent threads in some corner case condition. This would then
  trigger an internal consistency check. The algorithm was changed
  to ensure that one record is now reserved.

Patch PE82-1506230
------------------

Platforms: All

* A problem was fixed which could cause the database server to abort
  during database restructuring with an internal error (#4211). A
  message as below was logged:

   server panic: Fatal problem detected in cv_zoned
   Assertion failed: rc >= 0
   server panic: Aborting on internal failure, file restruct_set.c

  This problem was caused by corrupted P or Z item values that could
  in some cases result in an unexpected conversion error.

  The database restructure process was changed to no longer fail due
  to corrupted item values. When a corrupted value is encountered,
  the item is set to a default value (zero for numeric items). A
  message is written to the log for every item for which a
  conversion problem was encountered:

   WARNING: corrupted value during size conversion (item 'SET.ITEM')
   WARNING: precision loss during size conversion (item 'SET.ITEM')
   data set 'SET' has been restructured (### records)
   NOTE: ### conversion problem(s) encountered


Patch PE82-1501300
------------------

Platforms: All

* A problem was fixed which could cause a slave server to not resume
  replication if the master server was shut down while on-line
  backup mode was active and the following on-line backup recovery
  was aborted (#4186).

  After the eloqdb is shut down while on-line backup mode is active,
  the eloqdb (or the dblogreset utility) performs a recovery from
  on-line backup mode. This may take considerable time, depending on
  the amount of data redirected to the log volume while on-line
  backup mode was active. Interrupting and then repeating this
  recovery could result in a wrong volume generation count, causing
  a slave server to stop replication.

* Improved the interoperability of the dbrecover utility with
  database replication (#4201).

  After using the dbrecover utility on a replication master server,
  the resulting database environment can be used to synchronize a
  slave server, regardless of whether dbrecover completely processed
  the last forward-log generation or a point-in-time recovery was
  performed where the last forward-log generation was processed only
  partially.

  If a master or slave server detects a previous point-in-time
  recovery, the volume generation count is incremented and a message
  like below is logged:

   L0: Note: dbrecover point-in-time recovery detected, volume generation set to ...

  Please note: It is not supported to use the dbrecover utility on a
  slave server to perform a point-in-time recovery, then later
  resume replication. Doing so would very likely result in slave
  server data corruption and/or would cause the slave server to
  crash.

Notes / Related patches:

* Patch PE82-1501301 or superseding (dblogreset utility) should be
  installed with this patch.


Patch PE82-1411100
------------------

Platforms: Windows

* Fixed a potential event handle leak when a database session is
  closed (#4196).

* Fixed misleading LogFile output on the config page of the HTTP
  status display and in the dbctl logfile command if LogFile is set
  to syslog (#4197).
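
  For reference, the effective LogFile setting can be inspected at
  run time with the dbctl logfile command mentioned above. A minimal
  sketch (the dba user name follows the examples elsewhere in this
  document; the output depends on the configuration):

   $ dbctl -u dba logfile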

Patch PE82-1410010
------------------

Platforms: All

* Fixed a potential race condition where under rare conditions
  dbopen could succeed although the database was already opened
  exclusively by another session purging the same database (#4188).

  As a consequence, the eloqdb server process could abort on a
  failed database consistency check because the database was
  accessed while being purged, for example:

   server panic: Fatal problem detected in Node_Delete
   Assertion failed: !node->ref_count

   server panic: Fatal problem detected in Pool_ReadAcc
   Assertion failed: Pool_TestFlag(Pool_BLOCKUSED, &header)

   server panic: Fatal problem detected in FixRec_GetLock
   Assertion failed: node

* Fixed a potential race condition where under rare conditions
  accessing a database on a replicated slave server was not
  sufficiently locked against a concurrently replicated dbpurge or
  dbctl dbrestore (#4188).

  As a consequence, the eloqdb server process could abort on a
  failed database consistency check, for example:

   server panic: Fatal problem detected in FixRec_GetClusterAddress
   Assertion failed: meta->ulist_cache

   server panic: Fatal problem detected in Pool_ReadAcc
   Assertion failed: Pool_TestFlag(Pool_BLOCKUSED, &header)


Patch PE82-1404150
------------------

Platforms: All

* Fixed a potential lock starvation issue with starting on-line
  backup (#4161). Under some conditions, starting the on-line backup
  mode or switching the forward-log file could take an undefined
  time.

* Shutting down the eloqdb server process could in some cases result
  in messages as below (#3623):

   epoll_ctl: write failed. [9] Bad file number

  This could happen if worker threads terminate after the internal
  event thread. The messages are harmless and have no further
  implications.

* Fixed a potential race condition where a new forward-log
  generation could start with pending information from the previous
  segment (#4047). This could unexpectedly require that the previous
  segment be present when a recovery or replication is started.


Patch PE82-1312160
------------------

Platforms: HP-UX

* Fixed a potential performance regression on HP-UX introduced with
  the "vnode" locking changes in patch PE82-1312040.


Patch PE82-1312040
------------------

Platforms: All

* Fixed a lock starvation issue that could happen under a specific
  workload. This might cause eloqdb threads to be blocked for an
  undefined period waiting on the internal "vnode" lock. This should
  only affect large systems under high load.

* Improved scalability of the internal "vnode" lock. The internal
  "vnode" lock could impact performance and CPU consumption under
  some workloads if the vnode lock is contended.

* Fixed a B.08.10 interoperability problem that caused a status
  -700:-7 when opening a B.08.20 database using B.08.10.


Installation:
-------------

Please download the patch archive that corresponds with the
installed release. The patch files follow the conventions below:

 PE82-1907090-hpux-ia64.tar.gz
 ^            ^    ^
 |            |    Architecture / OS specific build
 |            Operating system
 Patch ID

HP-UX:

In order to install this patch, you need to unpack it with gzip and
tar. Gzip is included with HP-UX. Installation requires root
privileges.

 cd /opt/eloquence/8.2
 gzip -dc /path/to/PE82-1907090-hpux.tar.gz | tar xf -

Files:

 bin/eloqdb32  (32 bit database server)
 bin/eloqdb64  (64 bit database server, not available on hpux-pa11)
 share/doc/PE82-1907090-README

Linux:

In order to install this patch, you need to unpack it with tar.
Installation requires root privileges.

 cd /opt/eloquence/8.2
 tar xzf /path/to/PE82-1907090-linux.tar.gz

Files:

 bin/eloqdb32  (32 bit database server, only available on linux-i686)
 bin/eloqdb64  (64 bit database server, not available on linux-i686)
 share/doc/PE82-1907090-README
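
To review the contents of a patch archive before unpacking it, the
tar listing mode may be used. A minimal sketch (the archive path is
an example):

 # list the files contained in the patch archive without extracting
 tar tzf /path/to/PE82-1907090-linux.tar.gz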

Windows:

Two options are available for patch installation. The patch is
available as a self-extracting archive for automatic installation
and as a zip archive for manual installation. Both patches are
equivalent. Installation requires administrative capabilities.

For automatic installation of this patch, please download the patch
file PE82-1907090-win32.exe. Before installation, please consider
stopping the database server, then execute the patch installation
program. Installation does not require a reboot unless the patched
files were active.

For a manual installation of the patch, please download the patch
file PE82-1907090-win32.zip and unpack its contents. Then perform
the following steps (a command-line sketch of these steps is given
at the end of this document):

* Please make sure the eloqdb service is stopped before installing
  the patch (in the Service Control Manager or with net stop
  eloqdb).

* Please copy the eloqdb32.exe file into the Eloquence bin
  directory.
  (Default location: C:\Program Files\Eloquence\8.2\bin)

* Please copy the eloqdb64.exe file into the Eloquence bin64
  directory.
  (Default location: C:\Program Files\Eloquence\8.2\bin64)

* Please copy the PE82-1907090-README.txt file into the Eloquence
  share\doc directory.
  (Default location: C:\Program Files\Eloquence\8.2\share\doc)

Files:

 eloqdb32.exe  (32 bit database server)
 eloqdb64.exe  (64 bit database server)
 PE82-1907090-README.txt
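
For convenience, the manual installation steps above might be
performed from an elevated command prompt as sketched below. This
sketch assumes the default installation paths and that the zip
archive was unpacked into the current directory; restarting the
service with net start is only needed if it should run again
immediately.

 rem stop the eloqdb service before replacing its binaries
 net stop eloqdb
 rem copy the patched files to the default installation locations
 copy eloqdb32.exe "C:\Program Files\Eloquence\8.2\bin"
 copy eloqdb64.exe "C:\Program Files\Eloquence\8.2\bin64"
 copy PE82-1907090-README.txt "C:\Program Files\Eloquence\8.2\share\doc"
 rem restart the service (only if it should run again immediately)
 net start eloqdb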