----------------------------------------------------------------------

ELOQUENCE B.07.00 - patch 0409140

----------------------------------------------------------------------

This patch adds enhancements or fixes defects of the eloqdb6 program 
as released with Eloquence B.07.00. This patch will be integrated in 
the Eloquence B.07.00 release.

Eloquence B.07.00 must be installed before applying this patch.

Severity:
 PE70-0409140: CRITICAL (data loss, corruption)

Superseded patches:
 PE70-0408310: BUG FIX
 PE70-0407280: ENHANCEMENT
 PE70-0406240: BUG FIX
 PE70-0406030: BUG FIX
 PE70-0401160: CRITICAL (server abort, corruption)
 PE70-0401130: CRITICAL (server abort, corruption)
 PE70-0312080: BUG FIX
 PE70-0311180: ENHANCEMENT
 PE70-0310180: BUG FIX
 PE70-0310082: BUG FIX
 PE70-0309300: BUG FIX
 PE70-0307161: CRITICAL (server abort, corruption)
 PE70-0305270: BUG FIX
 PE70-0304041: CRITICAL (database inconsistencies)


Patch PE70-0409140
------------------

Platforms: ALL

* A DBUPDATE on a detail set could fail to update the user accessible
  part of the record data when changing a search item.
  While the user accessible part of the record data was not updated,
  the chain links were updated, possibly causing a subsequent status 
  -18 (broken chain) or -96 (corrupt chain pointer).
  This was caused by a side effect of the changes in beta patch 
  PE70-0407280.

  This could happen when using DBUPDATE mode 2 (or DBUPDATE mode 1 
  with the CIUPDATE flag from the image3k library) and updating a 
  detail record while changing a search or sort item.
  eloqdb6 patch levels before PE70-0407280 are not affected.


Notes / Related patches:

- This patch should immediately be installed to replace beta patches
  PE70-0407280 (beta) or PE70-0408310 (beta).

- Please review the enhancements in PE70-0407280 below.

- To retain the previous SyncMode behavior the SyncerJournalFlushInterval
  configuration item must be added to the eloqdb6 configuration file with
  a value of zero. In the section [limits] please add a line like below:
  
    [limits]
    SyncerJournalFlushInterval = 0

- Installation of dblogreset patch PE70-0408311 (or newer) and 
  dbrecover patch PE70-0408312 (or newer) is required with this
  version of eloqdb6.
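
As a minimal sketch of the SyncerJournalFlushInterval note above, the
Python script below reports which flush interval a configuration
fragment would yield and how many commits fall into one interval. It
assumes the eloqdb6 configuration file can be read with Python's
configparser module (an INI-style reading that this README does not
state). The function names and the commit rate are hypothetical. The
500 msec default is the one given in the PE70-0407280 notes.

```python
import configparser

# Default stated in the PE70-0407280 notes (msec).
DEFAULT_FLUSH_INTERVAL_MS = 500


def effective_flush_interval(config_text: str) -> int:
    """Return the SyncerJournalFlushInterval (msec) found in [limits],
    or the documented default when the item is absent."""
    # Assumption: the file is INI-like enough for configparser.
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(config_text)
    return parser.getint("limits", "SyncerJournalFlushInterval",
                         fallback=DEFAULT_FLUSH_INTERVAL_MS)


def commits_in_window(commits_per_second: float, interval_ms: int) -> float:
    """Rough count of commits made within one flush interval,
    assuming a steady commit rate (hypothetical figure)."""
    return commits_per_second * interval_ms / 1000.0


if __name__ == "__main__":
    fragment = "[limits]\nSyncerJournalFlushInterval = 0\n"
    interval = effective_flush_interval(fragment)
    print("flush interval:", interval, "msec")
    print("commits per window at 200/s:", commits_in_window(200, interval))
```

The estimate only covers the window added by the flush interval. Even
with a value of zero the last committed transactions may be lost in a
system crash, as described for the previous SyncMode behavior.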


Patch PE70-0408310
------------------

Platforms: ALL

* A rare race condition was fixed that could result in the wrong
  status code returned to the application (#2373).
  In case a database call that modified the database (eg. DBPUT)
  failed and the internal rollback operation became blocked the 
  status code could be corrupted by a concurrent thread. A wrong 
  status code would be returned to the application.

* A rare race condition was fixed that could cause a DBPUT call
  on a detail set to fail with status code 43 ("duplicate key value"). 
  Due to an internal locking problem there was a small window of
  opportunity where multiple threads could attempt to concurrently
  add the same automatic master entry (#2458).
  
* The eloqdb6 server could exceed the valid range of the idle thread 
  semaphore that is used internally to signal outstanding events
  (#2441).
  The semaphore value overflow could only happen in a rare case when 
  the idle thread did not block for some time due to a large number 
  of outstanding disk i/o completion (TIO) events.

  In case the idle semaphore overflowed some database sessions could
  terminate with status -700/-6 when shared memory communication was
  configured (EnableIPC set to 1 or 2). The server log contained a
  message like below:

    semop(UP): Numerical result out of range (errno 34)

  The internal event processing was changed to avoid overflowing the 
  valid semaphore value range.

* The eloqdb6 crash recovery was improved when the volume file was
  truncated (#2462, #2467, #2468). 
  In case of a system crash a recent increase of a data volume might 
  be lost. Increasing the data volume is now recorded in the 
  transaction journal and volume size is recovered during crash recovery.

* The eloqdb6 crash recovery was improved to recover the unique record
  id number that is used with indexes on secondary keys (#2466).
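
To check whether a server ran into the idle semaphore overflow
described above (#2441), its log can be searched for the quoted
semop(UP) message. The Python sketch below is an illustration only:
the function name is hypothetical, the log file location is not given
in this README, and the pattern is searched anywhere in the line
because log lines may carry a prefix.

```python
import re
from typing import Iterable, List, Tuple

# Message quoted in the PE70-0408310 notes for the semaphore overflow.
SEMOP_OVERFLOW = re.compile(
    r"semop\(UP\): Numerical result out of range \(errno 34\)")


def find_semop_overflow(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line number, text) for each log line with the message."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if SEMOP_OVERFLOW.search(line):
            hits.append((number, line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    # Both sample messages are taken from this README.
    sample = [
        "semop(UP): Numerical result out of range (errno 34)\n",
        "P0: semop(DOWN): Identifier removed (errno 36)\n",
    ]
    for number, text in find_semop_overflow(sample):
        print(number, text)
```

An open log file can be passed directly, since a file object yields
its lines when iterated.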


Patch PE70-0407280
------------------

Platforms: ALL

* The amount of data that was written to the forward log was reduced
  substantially (#2398). This was done by using a different strategy 
  to log changes to linked records caused by a PUT or DELETE operation.

* A new SyncerJournalFlushInterval configuration item was added that
  may be used to substantially improve performance in "sync" write 
  mode (#2400).

  Previously, when sync write mode was used (SyncMode=1), the 
  transaction journal was flushed to disk for every commit operation
  (either explicitly when using transactions or implicitly for each
  database call). In case of a system crash only the last committed
  transactions could potentially be lost and the volume integrity is 
  typically maintained.
  
  The new SyncerJournalFlushInterval configuration item adds the option
  to specify the time (in msec) after which changes to the transaction 
  journal are pushed to disk. In case of a system crash any transactions
  committed in the period specified by SyncerJournalFlushInterval are
  lost but the volume integrity is typically maintained.

  The default value for the SyncerJournalFlushInterval configuration
  item is 500 ms (a half second).
  
  NOTE: This default modifies the behavior of the current SyncMode and
  in case of a system crash may result in losing any transactions 
  committed within the last 500 ms.
  To retain the previous SyncMode behavior the SyncerJournalFlushInterval
  configuration item must be added to the eloqdb6 configuration file with
  a value of zero. In the section [limits] please add a line like below:
  
    [limits]
    SyncerJournalFlushInterval = 0

  NOTE: Specifying a low SyncerJournalFlushInterval value may have a
  detrimental effect on performance. This will result in additional
  load on the I/O subsystem and also the operating system may block 
  concurrent access to the lock volume(s) during a sync operation.
  
* The dbctl syncmode command was enhanced to allow setting and
  returning the current status of the sync mode configuration item 
  and the current value of SyncerJournalFlushInterval.

  usage: dbctl syncmode {ON [msec]|OFF|STATUS}

  > dbctl syncmode status
  Returns status of sync mode. In case the sync mode is used and
  the SyncerJournalFlushInterval is nonzero it also returns the
  current value of SyncerJournalFlushInterval in msec.

  > dbctl syncmode on 500
  Enables sync write mode and sets SyncerJournalFlushInterval to
  500 msec.
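
The syncmode usage line above can be turned into a small helper that
assembles the dbctl arguments and rejects combinations the usage line
does not allow. This is a sketch only: the helper name is
hypothetical, it does not run dbctl, and further dbctl options (such
as the -udba shown in the PE70-0406030 examples) would have to be
added by the caller.

```python
from typing import List, Optional


def syncmode_command(action: str, msec: Optional[int] = None) -> List[str]:
    """Build the argument list for: dbctl syncmode {ON [msec]|OFF|STATUS}"""
    action = action.lower()
    if action not in ("on", "off", "status"):
        raise ValueError("action must be ON, OFF or STATUS")
    # Per the usage line, the msec value is only accepted with ON.
    if msec is not None and action != "on":
        raise ValueError("msec may only be given with ON")
    command = ["dbctl", "syncmode", action]
    if msec is not None:
        command.append(str(msec))
    return command


if __name__ == "__main__":
    print(" ".join(syncmode_command("on", 500)))
    print(" ".join(syncmode_command("status")))
```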


Patch PE70-0406240
------------------

Platforms: ALL

* The eloqdb6 process could abort with an internal failure when a 
  client was aborted. A message like below was written to 
  the log (#2366):

    tcp_send: send failed: writecount -1, [9] Bad file descriptor
    tcp_async_mode: fcntl(F_GETFL): Bad file descriptor (errno 9)
    server panic: Fatal problem detected in close_connection
    Assertion failed: poll_ofs > 2 && poll_ofs < poll_set_cnt
    server panic: Aborting on internal failure, file event.c, line 489

  This could only happen when patch PE70-0406030 was installed and
  the connection used the TCP communication method (connection
  from a remote system or EnableIPC=0) and the database call was 
  blocked during execution (for example a DBLOCK with a wait option).
  In this case a cleanup operation could be executed twice which
  caused a problem with the changes in PE70-0406030.

* The dbctl syncmode command was enhanced to return the current 
  status of the sync mode configuration item (#2371).

  usage: dbctl syncmode {ON|OFF|STATUS}

  > dbctl syncmode status
  Returns status of sync mode.

* Changing the server sync mode using dbctl now updates the status in
  the HTTP status.


Notes / Related patches:

- Installation of dblogreset patch PE70-0401161 (or newer) and 
  dbrecover patch PE70-0401162 (or newer) is recommended.
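
The "(or newer)" wording above implies an ordering of patch ids. A
minimal Python sketch follows, assuming the seven digits after
"PE70-" increase with newer patches of the same program. This is an
inference from the ids listed in this README, not a documented rule,
and the function names are hypothetical.

```python
import re

_PATCH_ID = re.compile(r"PE70-(\d{7})")


def patch_number(patch_id: str) -> int:
    """Return the numeric part of a PE70 patch id."""
    match = _PATCH_ID.fullmatch(patch_id.strip())
    if match is None:
        raise ValueError(f"not a PE70 patch id: {patch_id!r}")
    return int(match.group(1))


def is_same_or_newer(installed: str, required: str) -> bool:
    """True if 'installed' is the required patch or a later one.
    Only meaningful for two patches of the same program."""
    return patch_number(installed) >= patch_number(required)


if __name__ == "__main__":
    # Both ids are named as dblogreset patches in this README.
    print(is_same_or_newer("PE70-0408311", "PE70-0401161"))
```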


Patch PE70-0406030
------------------

Platforms: ALL

* The eloqdb6 process could abort with an internal failure when 
  searching an index with multiple string segments (#2219).
  
  An internal error message like below was written to the log file:
   Assertion failed: len_a >= 0 && len_b >= 0
   server panic: Aborting on internal failure, file nls.c, line 366

  This was caused by a defect in the Eloquence DBFIND modes 6/7
  that are used in the Eloquence image3k library.

* Performing a DBFIND on a master set that was not created could 
  result in an abort of the eloqdb6 process with an error message 
  like below (#2263):
  
    Assertion failed: node
    server panic: Aborting on internal failure, file nodecore.c, line 1273

  eloqdb6 has been fixed to return a status 80 in case an index is
  not created.

* The http status could cause a temporary hang of the eloqdb6 server 
  process in case of a network communication problem with the browser
  (#2315). This was observed with connections through a VPN gateway.

* A new dbctl function was added to allow changing of sync mode 
  without restarting the server process.

  usage: dbctl syncmode {ON|OFF}

  > dbctl -udba syncmode on
  Enables sync mode. Equivalent to SyncMode=1

  > dbctl -udba syncmode off
  Disables sync mode. Equivalent to SyncMode=0


Platforms: Windows

* Due to a build problem the number of concurrent connections on 
  the Windows platform might be limited to 62. Additional connections
  would hang during establishing the connection.


Platforms: HP-UX

* An improper timeout value could cause a performance problem when 
  using TCP/IP communication and few connections are active.


Platforms: Linux

* eloqdb6 was not compatible with the glibc2.3 library due to directly
  accessing the errno variable. This resulted in a warning on SUSE
  Linux 9.0 and a start failure with SUSE Linux 9.1.

* The -s command line argument to specify the service or port number
  from the command line was corrupted during startup on Linux (#2317).
  This caused a failure if the ELOQDB6_SERVICE[] variable was specified
  in the startup configuration.


Notes / Related patches:

- When using this patch, installation of dblogreset patch PE70-0401161
  (or newer) and dbrecover patch PE70-0401162 (or newer) is recommended.


Patch PE70-0401160
------------------

Platforms: ALL

* A server abort due to an internal error during a dbrestore operation
  could result in disk space not being released and stale node 
  entries (#2187,#2192).

  When the eloqdb6 process aborted during a dbrestore operation in 
  some cases the disk space used by the partial dbrestore was not 
  released. In addition stale node entries could remain that could
  result in warnings when running dbfsck.

  The eloqdb6 database server has been modified to recover disk
  space from an aborted dbrestore operation and to delete all
  non-committed node entries during restart.

* In case of a server abort due to an internal error forward logging
  is disabled and a warning message is written to the log file (#2198).

  In case the eloqdb6 process aborts and performs a crash recovery
  the forward log files may not fully describe the volume changes.
  Consequently, forward logging is disabled.
  To recover, a backup must be performed and forward logging needs
  to be re-enabled manually.

* The eloqdb6 process could abort with an internal error message
  like below (#2190):

   Assertion failed: cluster
   panic: Aborting on internal failure, file volpool.c, line 3182

* Insufficient disk space during a dbrestore operation could cause 
  the eloqdb6 process to abort with an internal failure (#2189,#2192):

    Pool_Commit() failed on Node_AddItemToFlist()
    Assertion failed: Tlog_Commit() failed on Tlog_Action()
    panic: Aborting on internal failure, file voltxn.c, line 749

    Assertion failed: !FPool_ValidAnchor(&meta->do_log_anchor)
    panic: Aborting on internal failure, file voltxn.c, line 734
    
* A server recovery after a previous server abort due to an internal
  error could result in an internal failure message like below (#1169):
  
   Assertion failed: f_bhp->id.node_id == node_id
   panic: Aborting on internal failure, file mpool.c, line 408

  This was caused by a defect that could release transaction journal 
  pages too early. In rare cases this could corrupt the transaction 
  journal and could potentially lead to data corruption.

When using this patch, installation of dblogreset patch PE70-0401161
(or newer) and dbrecover patch PE70-0307162 (or newer) is recommended.


Patch PE70-0401130
------------------

Platforms: ALL

* The eloqdb6 process could abort with an internal failure when searching
  an index with multiple key segments of different types when providing a 
  search key that covered multiple segments (#2175).
  
  An internal error message like below was written to the log file:
   Assertion failed: idx_op->seg_len == idx_op->val_sz
   server panic: Aborting on internal failure, file find.c, line 933

  This was caused by a defect in the Eloquence DBFIND modes 6/7
  that are used in the Eloquence image3k library.

* A problem with collating sequences and the Eloquence DBFIND 
  modes 6/7 was fixed (#2175).

* The eloqdb6 server process could abort with a message like below 
  when a dbrestore operation failed due to an invalid archive (#2181):

   server panic: Fatal problem detected in bf_flags
   Assertion failed: !(bhp->flags & BUF_NEWPAGE)
   server panic: Aborting on internal failure, file mpool.c, line 2095

  The eloqdb6 code has been fixed to recover correctly from a dbrestore
  failure in this case.

* A rare race condition has been fixed that could result in corrupted 
  node meta data during a dbrestore operation (#2182).
  
* The http status has been modified to use relative links (#2183).
  Previous versions created absolute links (including host ip address) 
  that could cause a problem on systems with multiple network addresses
  and in NAT environments.

* A server abort during a dbrestore operation could result in a circular 
  page link in the server node directory (#2186).
  The server recovery procedure has been modified to avoid and fix 
  circular page links in the node directory.

  In case a circular page link in the server node directory is recognized
  an informational message like below is output:
  N1: Node_CommitNodePage(page=0xe396): page is already present - ignored

* The eloqdb6 server could abort with an internal failure if the 
  available disk space (as defined by the volume files) was exhausted
  during a dbrestore operation. This could result in volume corruption
  (#2187).

* An ongoing dbrestore operation could block concurrent requests or
  cause large delays (#2188).


Patch PE70-0312080
------------------

Platforms: ALL

* DBLOCK could ignore a conflicting lock. In rare cases DBLOCK
  could ignore a granted lock when granting a previously blocked
  lock request. This could cause status -12 on a subsequent 
  database write (#2167).

* DBGET mode 1 returned status 12 if no current record existed.
  It has been modified to return status 17 (#2168).

* DBFIND mode 1 on a search item (chained access) could result in 
  a memory leak if the corresponding master set was not created.

* Database restructuring could fail to initialize new items (#2171)

* The status return for DBFIND mode 1 on index items was inconsistent 
  with the documentation. Status element 8 returned the first 
  matching record number and status element 10 the last matching 
  record number. The documented behavior is to return the first 
  matching record number in status element 10 and the last matching 
  record number in status element 8. DBFIND mode 1 on index items 
  now follows the documented behavior.

* The Eloquence DBFIND modes 6 and 7 may be used with detail search
  items.  This enables the use of TurboIMAGE btree modes with the
  image3k library ("superchains").

* The data set /INDEXED flag is now returned to the image3k library
  (#1358). This enables the TurboIMAGE DBINFO modes 113 and 209 to 
  return information on master sets with "attached" indexes. It also
  enables the use of DBFIND mode 1 (if the btreemode1 flag is set)
  to locate records through the master set index ("superchains").
  Please note that index access (other than DBFIND mode 1) is always
  enabled with Eloquence, regardless of whether the /INDEXED flag is
  set for a master set.


Patch PE70-0311180
------------------

Platforms: ALL

* The maximum number of concurrent connections has been increased 
  to 4000 on the HP-UX platform (PA-RISC and Itanium). 
  Windows and Linux are still limited to 1000 concurrent connections.

* The performance for larger configurations has been improved.
  -  The eloqdb6 process has been changed so that a large number 
  of configured threads does not affect the performance.
  -  The performance of the eloqdb6 process has been improved 
  when a large number of (mostly) idle connections is used.
  -  The number of system calls has been reduced.

* The Eloquence DBFIND modes 6 and 7 may be used with master sets.
  The previous implementation was limited to detail sets.
  To use the image3k DBFIND with master sets, installation of
  patch PE70-0311181 (or newer) is required.


Patch PE70-0310180
------------------

Platforms: ALL

* A deadlock situation could cause the eloqdb6 process to abort
  with a panic message (#2136):

  D0: Assertion failed: tcp->marker != current_marker
  D0: server panic: Aborting on internal failure, file thread.c, line 362

  This panic is caused by a deadlock situation that is not detected 
  immediately and only becomes apparent after another process
  executes a DBUNLOCK. However at this time, the deadlock resolving 
  code is bypassed instead of flagging one of the processes involved 
  in the deadlock. 
  The resolution is to notify one of the threads involved of the
  deadlock and return a db status to the application.


Patch PE70-0310082
------------------

Platforms: ALL

* The eloqdb6 process could abort with an internal failure when
  on-line backup is initiated (dbctl backup start) (#2128)

  eloqdb6 could abort with the message below:
  D0: server panic: Fatal problem detected in Fwr_Setup
  D0: Assertion failed: rc == 0
  D0: server panic: Aborting on internal failure, file volfwr.c, line 541

  When forward logging is used but currently disabled due to an error
  (eg. disk full, permission problem) and is re-enabled subsequently
  (dbctl forwardlog enable) writing to the log file is delayed until
  the server is restarted or an on-line backup is performed.
  Initiating on-line backup resulted in an abort of the server process
  because the previous failure status was not reset properly.


Patch PE70-0309300
------------------

Platforms: ALL

* The eloqdb6 server process could abort with a SIGSEGV when
  forward logging is used and a log file reached its maximum 
  size (#2126).

* The maximum number of concurrent connections has been increased 
  from 1000 to 2000 on the HP-UX platform (#2125).


Patch PE70-0307161
------------------

Platforms: All

* DBGET mode 6 after an image3k DBFIND 1/21 on an index could return 
  status 14 (#2046).
  A DBGET mode 6 after an image3k DBFIND mode 1/21 on an index should 
  retrieve the last matching record in index order. DBGET mode 6 failed 
  with status 14 when the record was the highest key in the index.

* The eloqdb6 process could abort with an internal failure or cause 
  database corruption (#1966, #2030).

    Assertion failed: f_bhp->id.node_id == node_id

  Restructuring a database with dbutil could cause the server process
  to abort with an internal error. In rare cases it could cause 
  corruption of the database catalog during crash recovery.

* Restructuring a database could cause an internal failure during 
  forward recovery (#2051)
  
  Restructuring a database might cause an internal failure in the 
  dbrecover utility when applying the forward log files. 
  The dbrecover utility could abort with an error message like below:
  panic: Fatal problem detected in FixRec_FinalCommitUpdatePut
  Assertion failed: *flag_ptr == FixRec_USED || *flag_ptr == FixRec_DELETED


Patch PE70-0305270
------------------

Platforms: All

* The eloqdb6 process could abort with an internal failure (#1973)

    eloqdb6 panics with the message:
    panic: current_task->waiting_for == NULL (thread.c, line ~413)

  This was caused by a race condition which could happen when a 
  number of application processes are terminated at the same time 
  while accessing the database.
  The eloqdb6 server has been modified to solve this problem.
  
  The problem can be triggered in the following scenario:
  - Application thread A holds a lock. It is in the process of
    releasing database locks.
  - Application thread B is waiting for the lock held by thread A.
  - Application thread C is waiting for the lock held by thread A
    (and thread B). Thread C gets interrupted (application process
    has been killed) during the wait.
  - There is a situation where thread A might become runnable
    before thread C during the same scheduler quantum and might
    unexpectedly update a scheduler variable used in the deadlock
    detection code ("waiting_for" which is used to track dependencies
    among application threads).

* Connecting to the database server could fail with status -700:-6
  when EnableIPC=2 is used on HP-UX (#1982).
  A message like below is printed:
  
    P0: Unable to down semaphore
    P0: semop(DOWN): Identifier removed (errno 36)
    P0: Failure during wait on server response

  This was caused by a missing initialization when a shared memory 
  segment was re-used by a subsequent process. Due to a race condition, 
  a new connection could become disconnected by the server immediately.

* Recovery in case of abnormal server termination has been improved.
  An incomplete transaction journal could corrupt the database volume(s)
  for btree and catalog node types (#1966).

* In case of inconsistent meta information (such as number of records)
  the eloqdb6 process could abort with an internal error message like 
  below (#1941).

    D0: server panic: Fatal problem detected
    D0: Assertion failed: meta->num_records == 1
    D0: server panic: Aborting on internal failure, file volfrec.c, line 3191

  This situation is now handled more gracefully and no longer
  causes a server abort.

* When forward logging is enabled, the eloqdb6 process will refuse 
  to overwrite an existing file (#1952).

  An existing file is considered a logging failure and will either 
  result in disabling forward logging or an internal abort (depending
  on the configuration).
  When using forward recovery, restoring the volume files from the 
  backup and then erroneously starting the eloqdb6 process could 
  overwrite forward log segments and cause potential data loss.
  The eloqdb6 server has been modified to refuse overwriting existing
  files.

* When forward logging is enabled and the eloqdb6 process is killed
  using kill -9 (e.g. because patch PE70-0305090 is not installed), it 
  might fail to flag the volume file(s) as consistent. This might 
  cause the server to refuse to start (#1985).

  A message like below is printed:

     failed to open volume: volume #1 has inconsistent generation 
     count 8 (should be 9) volume.c line 828

  The eloqdb6 server code has been modified to handle this 
  situation correctly.


Patch PE70-0304041
------------------

Platforms: All

* Changing the data set structure with dbutil could result in an
  inconsistent internal highest record number. As a consequence,
  new records which are added subsequently could become inaccessible.
  This could cause database status -96 on DBPUT or status 17 and
  status 18 on DBGET, depending on the use of the particular data set
  (#1880).

  Please note: Installing this patch will prevent this problem from
  occurring in the future. However, it will not correct your existing
  database environment(s).

  In case you used the Eloquence B.07.00 dbutil program to modify
  the database structure we recommend checking and correcting your 
  database environment(s) with the dbfsck utility:
  1) Shut down the eloqdb6 database server instance(s).
  2) Write access to the volume files is required. On HP-UX and 
     Linux log on as root or the user configured in the server
     configuration file.
  3) Check and correct your database environment(s) with dbfsck:
     dbfsck -way
     This will automatically fix any inconsistencies caused by
     dbutil data set restructuring. You may need to specify the
     -c /path/to/configfile command line option in case multiple
     database server instances are used.


Installation:
-------------

UNIX:

In order to install this patch, you need to unpack it with gzip.
Gzip is included with HP-UX and Linux.
Installation requires root privileges.

cd /opt/eloquence6
gzip -dc /path/to/PE70-0409140-hpux.tar.gz | tar xf -

Files:

   bin/eloqdb6
   share/doc/PE70-0409140-README


Windows XP/2000/NT:

This patch should *only* be installed if you previously installed
the Eloquence server components on your system.

Download the PE70-0409140-win32.zip file and unpack its contents
with WinZip or PKUNZIP.
Installation requires administrative capabilities.

PLEASE MAKE SURE THE eloqdb6 SERVICE has been STOPPED previously
(in the Service Control Manager or with NET STOP eloqdb6).

Please copy the eloqdb6.exe file into the WINDOWS SYSTEM DIRECTORY
(for example C:\Windows\System32).

Please copy the PE70-0409140-README.txt file into the share\doc
subdirectory of your Eloquence installation (for example
C:\Programs\Eloquence\share\doc).

Files:

   eloqdb6.exe
   PE70-0409140-README.txt