----------------------------------------------------------------------
ELOQUENCE B.08.20 - patch PE82-1907090
----------------------------------------------------------------------

This patch adds enhancements or fixes defects of the eloqdb server as
released with Eloquence B.08.20. This patch will be integrated in the
Eloquence B.08.20 release.

Eloquence B.08.20 must be installed before applying this patch.

Severity:

 PE82-1907090: BUG FIX

Superseded patches:

 PE82-1906240: BUG FIX
 PE82-1903040: BUG FIX
 PE82-1902130: BUG FIX
 PE82-1711140: BUG FIX, ENHANCEMENT
 PE82-1710260: BUG FIX
 PE82-1710200: BUG FIX
 PE82-1707110: BUG FIX
 PE82-1706230: BUG FIX, ENHANCEMENT
 PE82-1704210: BUG FIX
 PE82-1704120: BUG FIX
 PE82-1704070: BUG FIX
 PE82-1703300: BUG FIX
 PE82-1703240: BUG FIX
 PE82-1703140: BUG FIX
 PE82-1612190: BUG FIX
 PE82-1612070: BUG FIX
 PE82-1611170: BUG FIX
 PE82-1610260: BUG FIX
 PE82-1609130: BUG FIX
 PE82-1606270: BUG FIX
 PE82-1606020: BUG FIX
 PE82-1510050: BUG FIX
 PE82-1507270: BUG FIX
 PE82-1506230: BUG FIX
 PE82-1501300: BUG FIX
 PE82-1411100: BUG FIX
 PE82-1410010: BUG FIX
 PE82-1404150: BUG FIX
 PE82-1312160: BUG FIX
 PE82-1312040: BUG FIX


Patch PE82-1907090
------------------

Platforms: All

* Fixed a rare race condition where enumerating the internal threads
  could access a terminating thread (#4061). This could in some cases
  result in a failed lock or removal operation. Threads are
  enumerated when using the http status or the dbctl list
  functionality. An error message as below was logged:

   T0: [#] pthread_mutex_destroy(p_mutex) failed (errno 16)
   T0: [#] pthread_mutex_lock(tcp->p_mutex) failed (errno 22)
   D0: [#] server panic: Fatal problem detected in thread__lock

* Sync forward log file changes to disk on a checkpoint. In case of a
  system abort this limits data loss in the forward log to the last
  checkpoint and ensures replication can be resumed.


Patch PE82-1906240
------------------

Platforms: All

* Fixed a problem where under rare conditions a replication slave
  server could abort with a message like below when replication is
  resumed (#4283):

   Fwr_PageHashLookup() failed: key ... not found
   ...
   server panic: Fatal problem detected in Fwr_PageHash__Lookup
   Assertion failed: Fwr_PageHashLookup() failed: key not found
   server panic: Aborting on internal failure, file volfwr.c
   ...

* Fixed a problem where a replication slave server could abort with a
  message like below while processing a large transaction (#4292):

   Fwr_PageHashAdd() failed: key ... already present
   ...
   server panic: Fatal problem detected in Fwr_PageHash__Add
   Assertion failed: Fwr_PageHashAdd() failed: key already present
   server panic: Aborting on internal failure, file volfwr.c
   ...

Notes / Related patches:

* Patch PE82-1906241 or superseding (dbrecover utility) fixes a
  related problem and should be installed with this patch.


Patch PE82-1903040
------------------

Platforms: All

* Fixed a database server internal deadlock condition that could in
  rare cases result in a hanging server process (#4285).

  A deadlock condition was fixed in the database server cache
  management that could result in a (partially) hanging server
  process. A thread status (dbctl list thread or the thread status
  dumped on shutdown) indicates a block in the mpool subsystem.

  This was caused by a potential lock ordering problem when a cache
  buffer was reused by a concurrent connection.
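
  For reference, the thread status mentioned above can be obtained
  from a running server with the dbctl utility. A minimal sketch
  (the dba user name follows the examples elsewhere in this document;
  the exact output depends on the server version and configuration):

   $ dbctl -u dba list thread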

Patch PE82-1902130
------------------

Platforms: All

* Fixed a problem where bad syntax in an FTS range search could
  trigger a panic. A specific range syntax could trigger an internal
  consistency check instead of returning an invalid syntax status.

* Fixed a problem where FTS range search results could be
  inconsistent if the FTS index was modified concurrently.

* Revised some FTS diagnostic log messages.

* Fixed a potential deadlock condition which could happen on a
  replication slave server if an FTS query was executed while FTS
  updates were replicated (#4263).

* Fixed dbctl volume limit to reset volume flags. Changing the
  volume size limit with dbctl volume limit did not allow volume
  files to grow if a size limit had been reached before.

* Fixed a potential eloqdb panic when a transaction is committed but
  no space is left in the transaction log:

   Assertion failed: FixRec_CommitUpdatePut() failed on Tlog_WriteRecord()

* Fixed a problem where rolling back a transaction could fail due to
  insufficient space in the transaction log.

* Fixed a problem where configuring TransactionSizeLimit = 0 could
  cause the eloqdb to panic due to insufficient space in the
  transaction log.

* Fixed a problem where the transaction journal could grow beyond
  the configured CheckPtSize, subsequently causing concurrent
  transactions to fail due to insufficient transaction log space.
  Long running client requests, such as dbrestore, use transaction
  log space but did not monitor the configured CheckPtSize.

* Fixed a replication stop/start race condition resulting in an
  eloqdb abort (#4280). The slave server eloqdb process was aborted
  with a message as below after replication was started:

   Assertion failed: fwr.recovery.repl_flags & FWREPL_STARTED

  This problem was caused by allowing a new replication session
  while the previous replication session was being wound down.

* Fixed a problem where the reported replication lag could be
  negative due to the timer interval.

* The dbctl list thread output was changed to include the operating
  system LWPID. The "blocked" column was removed and the operating
  system LWPID (OSID) is output instead.

* Fixed a problem where the UseKeepAlive configuration setting was
  not functional.

* The database encryption functionality now supports recent OpenSSL
  crypto library versions.

* Linux: Added a workaround for a glibc (nptl pthread library)
  defect. glibc versions 2.26 and later may cause the eloqdb server
  to hang under load.


Patch PE82-1711140
------------------

Platforms: All

* Fixed a problem in the index retrieval procedure that could under
  rare conditions cause partial results if the index is updated
  concurrently. The problem could be observed in FTS range searches
  if the dictionary index was updated concurrently. Under certain
  conditions, incomplete search results could be returned.

* Changed the FTS memory allocation to grow the result buffer
  differently for large FTS results.

* Added a config option to limit the number of FTS search results
  for a session. The [Config] SessionFtsLimit config item may be
  used to specify the maximum FTS result size:

   SessionFtsLimit = 100m

  A trailing "m" indicates millions, a trailing "k" indicates
  thousands of results. If the number of search results exceeds the
  limit, a status -813:11 is returned.

  To ensure full backwards compatibility, no FTS limit is set by
  default. When using FTS indexes it is recommended to specify a
  reasonable limit to avoid arbitrarily complex searches.

* Added the dbctl "fts limit" command to obtain or specify the FTS
  result limit:

   $ dbctl -u dba fts limit 10m
   fts limit 10000000
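
  For illustration, a corresponding configuration excerpt might look
  as below. This is a minimal sketch: the server configuration file
  name (assumed here to be eloqdb.cfg) and the 10m value are
  examples, not part of this patch.

   # Excerpt from the eloqdb server configuration file (eloqdb.cfg).
   # Caps FTS results at 10 million per session; searches exceeding
   # the limit return status -813:11.
   [Config]
   SessionFtsLimit = 10m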

Patch PE82-1710260
------------------

Platforms: All

* Changed the FTS syntax parser to support an empty range.

* Changed the FTS range search to assume a numeric range if a
  numeric end value is specified and an open range is used. If the
  FTS "NR" option is not set, an optimized numeric search is now
  performed, matching the number of digits of the end value.


Patch PE82-1710200
------------------

Platforms: All

* Improved FTS search performance for a leading wild card expression
  in a text field.

* Fixed a problem where repeating an interrupted startup recovery
  could result in an empty forward-log "bridge" segment (#4255).

  After an abnormal termination, the database server (or the
  dblogreset utility) performs a startup recovery. During this
  process, a special forward-log "bridge" segment is created that
  allows a dbrecover, dbrepl or fwutil process to continue across
  the previous abnormal termination.

  Interrupting and then repeating the startup recovery should resume
  on an existing "bridge" segment. However, because an internal
  state was not correctly initialized, the resulting "bridge"
  segment could be empty.


Patch PE82-1707110
------------------

Platforms: Windows

* Fixed a problem which could happen when starting the eloqdb
  service on Windows. A saved command line for the eloqdb service
  could sometimes become overwritten internally. As a consequence,
  the eloqdb service could fail to start. This problem was recently
  observed on Windows 10 version 1703.


Patch PE82-1706230
------------------

Platforms: All

* Support combining record-specific FTS searches from different sets
  in the same set group.

Platforms: HP-UX PA-RISC

* Work around a compiler defect that could result in a crash when
  using FTS indexes with the PA-RISC eloqdb64.


Patch PE82-1704210
------------------

Platforms: All

* Fixed an internal race condition which, in rare cases and under
  high load, could accidentally modify resources not owned by the
  invoking session (#4247, #4248). In turn, the server process
  issued a subsequent panic. The following log messages were
  observed in this context:

   server panic: Fatal problem detected in bf_get_page
   Assertion failed: bhp->id.node_id == node_id
   server panic: Aborting on internal failure, file mpool.c, line 3582

   server panic: Fatal problem detected in btree_FinalCommit
   Assertion failed: lrec != NULL && lrec->arg == 0
   server panic: Aborting on internal failure, file btree.c, line 2114

   mutex(lr#...:LP) is not locked, caller volfrec_fts.c:255
   server panic: Fatal problem detected in thread_mutex_unlock
   Assertion failed: thread_mutex_unlock failed
   server panic: Aborting on internal failure, file thread.c, line 1565

   ** Caught signal #11
   signo = 11 errno = 0 code = 1 addr = ...
   server panic: Fatal problem detected in tsignal_crash_handler
   Assertion failed: Fatal signal encountered
   ** traceback follows:
   (0) 0x40000000000f4c70 eq__assert_fail + 0x110
   (1) 0x400000000045b5b0 tsignal_crash_handler + 0x390
   (2) 0xe000000120043440 ---- Signal 11 (SIGSEGV) delivered ----
   (3) 0x40000000001239e0 Node_MetaData + 0x20
   (4) 0x400000000013d650 Tlog__LogrecUnlock + 0x30
   (5) 0x40000000001fb300 FixRecFTS_GetLock + 0x2e0
   ...
   server panic: Aborting on internal failure, file tmain.c, line 385

Patch PE82-1704120
------------------

Platforms: All

* Fixed a problem during FTS transaction rollback where under high
  write load an FTS record could be reused before all references
  were removed (#4244). This could result in an internal deadlock,
  and a message like below was logged by the server process:

   K0: fts__free_keywd_ref: keyword not deleted: db=..., adr0=...
   K0: fts__on_rollback: BUG: keyword reference inconsistent, adr0=... [631]

* Fixed a problem where an FTS transaction commit or rollback could
  in some cases cause a subsequent server panic (#4245). A message
  like below was logged by the server process:

   server panic: Fatal problem detected in Node_BeginAdvLock
   Assertion failed: !advlck->save_locks
   server panic: Aborting on internal failure, file nodetxrec.c, line 1789


Patch PE82-1704070
------------------

Platforms: All

* Fixed FTS transaction commit/rollback issues (#4242, #4244) which
  under high write load could cause a deadlock or result in log
  messages as below during rollback:

   K0: fts__adj_keywd_ref: lookup failed: db=..., adr0=...
   K0: fts__on_rollback: BUG: keyword not found, adr0=... [571]


Patch PE82-1703300
------------------

Platforms: All

* Fixed a regression introduced with patch PE82-1703240 where an FTS
  transaction rollback could in some cases trigger a server panic
  due to a failed consistency test (#4243). A message like below was
  logged by the server process:

   server panic: Fatal problem detected in idb__fts_tx_end
   Assertion failed: level == 1
   server panic: Aborting on internal failure, file runutil.c, line 1221


Patch PE82-1703240
------------------

Platforms: All

* Fixed an internal FTS locking problem that could in rare cases
  result in FTS index inconsistencies when deleting keywords
  (#4242).

  A keyword is locked while an update is pending. However, this lock
  was released before the FTS commit was completed, which could
  expose an intermediate FTS update result to a concurrent session
  under heavy write load. This could result in FTS diagnostic
  messages in the server message log, similar to below:

   K0: fts__read_ref failed: adr=0:9999 status=17/2 [929]
   K0: fts_upd_index: failed [251]


Patch PE82-1703140
------------------

Platforms: All

* Fixed a problem where an FTS search could fail with status -813.
  An error message like below was logged by the server process:

   FTS FAILED:set_entries failed (-1)

  This could be reproduced by searching an FTS key resulting in
  detail records and then refining this search on an aggregated key
  when a wildcard was used as a search argument.

* An FTS search that failed with an internal error could in rare
  cases result in a double free of a memory block. On the Linux
  platform this could abort the server process.


Patch PE82-1612190
------------------

Platforms: All

* Fixed FTS transaction issues (#4235). When using transactions,
  updates to FTS keywords might not work correctly in some cases.
  This could result in a warning message in the eloqdb log.
  Applications not using transactions were not affected.

* Fixed some FTS search expression parser issues.

* Fixed a potential memory leak if a database could not be opened
  due to insufficient permissions.


Patch PE82-1612070
------------------

Platforms: All

* Fixed a potential FTS memory leak. Memory used to hold FTS search
  results might not be properly released in some cases if an FTS
  search returned no result but had intermediate results.

* Fixed a potential server abort caused by a failed consistency test
  in the FTS search expression parser.

  The FTS search expression parser was changed to fail more
  gracefully when encountering an internal issue. A syntax error is
  returned and the search expression is logged in the eloqdb message
  log. A message as below is logged:

   K1: FTS parser failed: search expression [###]


Patch PE82-1611170
------------------

Platforms: All

* Substantially improved FTS search for a numeric range in a text
  field.

Patch PE82-1610260
------------------

Platforms: HP-UX IA64

* Fixed a problem where the panic handler did not produce a stack
  dump on HP-UX IA64.


Patch PE82-1609130
------------------

Platforms: All

* Fixed a problem where write operations could be stalled while
  on-line backup mode is being stopped (#4228). If stopping on-line
  backup mode takes long enough that a checkpoint operation is
  started, an internal lock could cause concurrent write operations
  to block until on-line backup mode has stopped.

* Fixed a problem where in a rare case a newly created database
  could get (partially) lost if the database is created while
  on-line backup mode is active and afterwards the database server
  is restarted while still in on-line backup mode (#4224).

* Fixed a minor problem when adding/deleting records in a data set
  and shortly afterwards stopping the database server before a
  checkpoint operation has been performed (#4225). In this case, the
  next time a new record is added to that data set, the database
  server has to skip redundant entries in the list of free record
  numbers. The dbfsck utility also warns about these records when
  run with -vvv verbosity.

* Enhanced the thread status to provide additional information when
  a thread is blocked on a mutex lock.


Patch PE82-1606270
------------------

Platforms: All

* Substantially improved FTS performance when searching a range.

* Changed the FTS range search to assume a numeric range if a
  numeric start value is specified and an open range is used.

  If the FTS "NR" option was not set, this was previously considered
  a string search. Now a numeric upper boundary is assumed in this
  case, matching the number of digits of the start value. For
  example, searching for 2016: now implies a search for 2016:9999.

* Fixed a potential buffer overflow in the http status.

* Fixed a potential locking inconsistency when adding FTS keywords.
  The server process might incorrectly assume a deadlock situation
  and return a status code. A message as below is logged:

   T1: [10] Deadlock detected between tasks 10 and 10
   K0: [10] fts__lock_ref failed: adr=0:9900024 status=-803/27 [981]


Patch PE82-1606020
------------------

Platforms: All

* Fixed a problem that could affect FTS backwards compatibility (ODX
  calls). The number of results might incorrectly be returned as
  zero after a previous search with negative results (unary NOT).

* Some FTS related consistency checks were relaxed to return a
  database status.


Patch PE82-1510050
------------------

Platforms: All

* Fixed an internal consistency check that could result in a panic
  message as below:

   Assertion failed: m_recno == fts_mrecno_new

  This could happen when updating a search item in a detail record
  with an aggregated FTS index.

* Changed the memory allocation algorithm for huge FTS results. A
  large number of FTS results could previously take many allocations
  to grow the result list, which could be inefficient.


Patch PE82-1507270
------------------

Platforms: All

* Fixed a memory leak where the last FTS search results were not
  released when the database was closed.

* Fixed a potential race condition where a busy server could fail
  with an internal error when allocating new records (#4060):

   Assertion failed: freelist_miss == 0

  After enlarging a table, all records could be used up by
  concurrent threads in some corner case condition. This would then
  trigger an internal consistency check. The algorithm was changed
  to ensure that one record is now reserved.

Patch PE82-1506230
------------------

Platforms: All

* A problem was fixed which could cause the database server to abort
  during database restructuring with an internal error (#4211). A
  message as below was logged:

   server panic: Fatal problem detected in cv_zoned
   Assertion failed: rc >= 0
   server panic: Aborting on internal failure, file restruct_set.c

  This problem was caused by corrupted P or Z item values that could
  in some cases result in an unexpected conversion error.

  The database restructure process was changed to no longer fail due
  to corrupted item values. When a corrupted value is encountered,
  the item is set to a default value (zero for numeric items). A
  message is written to the log for every item for which a
  conversion problem was encountered:

   WARNING: corrupted value during size conversion (item 'SET.ITEM')
   WARNING: precision loss during size conversion (item 'SET.ITEM')
   data set 'SET' has been restructured (### records)
   NOTE: ### conversion problem(s) encountered


Patch PE82-1501300
------------------

Platforms: All

* A problem was fixed which could cause a slave server to not resume
  replication if the master server was shut down while on-line
  backup mode was active and the following on-line backup recovery
  was aborted (#4186).

  After the eloqdb is shut down while on-line backup mode is active,
  the eloqdb (or the dblogreset utility) performs a recovery from
  on-line backup mode. This may take considerable time, depending on
  the amount of data redirected to the log volume while on-line
  backup mode was active. Interrupting and then repeating this
  recovery could result in a wrong volume generation count, causing
  a slave server to stop replication.

* Improved the interoperability of the dbrecover utility with
  database replication (#4201).

  After using the dbrecover utility on a replication master server,
  the resulting database environment can be used to synchronize a
  slave server, regardless of whether dbrecover completely processed
  the last forward-log generation or a point-in-time recovery was
  performed where the last forward-log generation was processed only
  partially.

  If a master or slave server detects a previous point-in-time
  recovery, the volume generation count is incremented and a message
  like below is logged:

   L0: Note: dbrecover point-in-time recovery detected, volume generation set to ...

  Please note: It is not supported to use the dbrecover utility on a
  slave server to perform a point-in-time recovery, then later
  resume replication. Doing so would very likely result in slave
  server data corruption and/or would cause the slave server to
  crash.

Notes / Related patches:

* Patch PE82-1501301 or superseding (dblogreset utility) should be
  installed with this patch.


Patch PE82-1411100
------------------

Platforms: Windows

* Fixed a potential event handle leak when a database session is
  closed (#4196).

* Fixed misleading LogFile output on the config page of the HTTP
  status display and in the dbctl logfile command if LogFile is set
  to syslog (#4197).
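
  For reference, the effective LogFile setting can be inspected at
  run time with the dbctl logfile command mentioned above. A minimal
  sketch (the dba user name follows the examples elsewhere in this
  document; the output depends on the configuration):

   $ dbctl -u dba logfile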

Patch PE82-1410010
------------------

Platforms: All

* Fixed a potential race condition where under rare conditions
  dbopen could succeed although the database was already opened
  exclusively by another session purging the same database (#4188).

  As a consequence, the eloqdb server process could abort on a
  failed database consistency check because the database was
  accessed while being purged, for example:

   server panic: Fatal problem detected in Node_Delete
   Assertion failed: !node->ref_count

   server panic: Fatal problem detected in Pool_ReadAcc
   Assertion failed: Pool_TestFlag(Pool_BLOCKUSED, &header)

   server panic: Fatal problem detected in FixRec_GetLock
   Assertion failed: node

* Fixed a potential race condition where under rare conditions
  accessing a database on a replicated slave server was not
  sufficiently locked against a concurrently replicated dbpurge or
  dbctl dbrestore (#4188).

  As a consequence, the eloqdb server process could abort on a
  failed database consistency check, for example:

   server panic: Fatal problem detected in FixRec_GetClusterAddress
   Assertion failed: meta->ulist_cache

   server panic: Fatal problem detected in Pool_ReadAcc
   Assertion failed: Pool_TestFlag(Pool_BLOCKUSED, &header)


Patch PE82-1404150
------------------

Platforms: All

* Fixed a potential lock starvation issue with starting on-line
  backup (#4161). Under some conditions, starting the on-line backup
  mode or switching the forward-log file could take an undefined
  time.

* Shutting down the eloqdb server process could in some cases result
  in messages as below (#3623):

   epoll_ctl: write failed. [9] Bad file number

  This could happen if worker threads terminate after the internal
  event thread. The messages are harmless and have no further
  implications.

* Fixed a potential race condition where a new forward-log
  generation could start with pending information from the previous
  segment (#4047). This could unexpectedly require that the previous
  segment be present when a recovery or replication is started.


Patch PE82-1312160
------------------

Platforms: HP-UX

* Fixed a potential performance regression on HP-UX introduced with
  the "vnode" locking changes in patch PE82-1312040.


Patch PE82-1312040
------------------

Platforms: All

* Fixed a lock starvation issue that could happen under a specific
  workload. This might cause eloqdb threads to be blocked for an
  undefined period waiting on the internal "vnode" lock. This should
  only affect large systems under high load.

* Improved scalability of the internal "vnode" lock. The internal
  "vnode" lock could impact performance and CPU consumption under
  some workloads if the vnode lock is contended.

* Fixed a B.08.10 interoperability problem that caused a status
  -700:-7 when opening a B.08.20 database using B.08.10.


Installation:
-------------

Please download the patch archive that corresponds with the
installed release. The patch files follow the conventions below:

 PE82-1907090-hpux-ia64.tar.gz
 ^            ^    ^
 |            |    Architecture / OS specific build
 |            Operating system
 Patch ID

HP-UX:

In order to install this patch, you need to unpack it with gzip and
tar. Gzip is included with HP-UX. Installation requires root
privileges.

 cd /opt/eloquence/8.2
 gzip -dc /path/to/PE82-1907090-hpux.tar.gz | tar xf -

Files:

 bin/eloqdb32  (32 bit database server)
 bin/eloqdb64  (64 bit database server, not available on hpux-pa11)
 share/doc/PE82-1907090-README

Linux:

In order to install this patch, you need to unpack it with tar.
Installation requires root privileges.

 cd /opt/eloquence/8.2
 tar xzf /path/to/PE82-1907090-linux.tar.gz

Files:

 bin/eloqdb32  (32 bit database server, only available on linux-i686)
 bin/eloqdb64  (64 bit database server, not available on linux-i686)
 share/doc/PE82-1907090-README
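
To review the contents of a patch archive before unpacking it, the
tar listing mode may be used. A minimal sketch (the archive path is
an example):

 # list the files contained in the patch archive without extracting
 tar tzf /path/to/PE82-1907090-linux.tar.gz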

Windows:

Two options are available for patch installation. The patch is
available as a self-extracting archive for automatic installation
and as a zip archive for manual installation. Both patches are
equivalent. Installation requires administrative capabilities.

For automatic installation of this patch, please download the patch
file PE82-1907090-win32.exe. Before installation, please consider
stopping the database server, then execute the patch installation
program. Installation does not require a reboot unless the patched
files were active.

For a manual installation of the patch, please download the patch
file PE82-1907090-win32.zip and unpack its contents. Then perform
the following steps (a command-line sketch of these steps is given
at the end of this document):

* Please make sure the eloqdb service is stopped before installing
  the patch (in the Service Control Manager or with net stop
  eloqdb).

* Please copy the eloqdb32.exe file into the Eloquence bin
  directory.
  (Default location: C:\Program Files\Eloquence\8.2\bin)

* Please copy the eloqdb64.exe file into the Eloquence bin64
  directory.
  (Default location: C:\Program Files\Eloquence\8.2\bin64)

* Please copy the PE82-1907090-README.txt file into the Eloquence
  share\doc directory.
  (Default location: C:\Program Files\Eloquence\8.2\share\doc)

Files:

 eloqdb32.exe  (32 bit database server)
 eloqdb64.exe  (64 bit database server)
 PE82-1907090-README.txt
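
For convenience, the manual installation steps above might be
performed from an elevated command prompt as sketched below. This
sketch assumes the default installation paths and that the zip
archive was unpacked into the current directory; restarting the
service with net start is only needed if it should run again
immediately.

 rem stop the eloqdb service before replacing its binaries
 net stop eloqdb
 rem copy the patched files to the default installation locations
 copy eloqdb32.exe "C:\Program Files\Eloquence\8.2\bin"
 copy eloqdb64.exe "C:\Program Files\Eloquence\8.2\bin64"
 copy PE82-1907090-README.txt "C:\Program Files\Eloquence\8.2\share\doc"
 rem restart the service (only if it should run again immediately)
 net start eloqdb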