The checkpoint-based transactional store consists of two files:
- page.db (database)
- transaction.log (journal)
It combines high-performance random access and FIFO access to message pages.
The transactional store is page-oriented with a page size of 2 KB. The
pages are stored in a RandomAccessFile named page.db. Every message
uses at least one page. Larger messages are distributed across several pages.
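The page addressing described above can be sketched with a bit of arithmetic. This is a minimal illustration, assuming a hypothetical per-page header; the actual page layout of SwiftMQ is not specified here.

```java
// Sketch of page-oriented addressing with a fixed 2 KB page size, as
// described above. The per-page header size is an assumption for
// illustration, not SwiftMQ's actual layout.
public class PageMath {
    static final int PAGE_SIZE = 2048;              // 2 KB pages
    static final int HEADER_SIZE = 16;              // hypothetical per-page header
    static final int PAYLOAD = PAGE_SIZE - HEADER_SIZE;

    // Byte offset of page n inside page.db
    static long offset(int pageNo) {
        return (long) pageNo * PAGE_SIZE;
    }

    // Number of pages a message of the given size occupies (at least 1)
    static int pagesFor(int messageSize) {
        return Math.max(1, (messageSize + PAYLOAD - 1) / PAYLOAD);
    }
}
```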
Messages are accessed via a two-level index, consisting of the root index
and one index per queue. Every queue index contains message keys with the
respective message page references. An index is sorted. When a message
key is inserted into a full index page, a page split takes place. New
elements are always inserted at the end of an index, while elements are
mostly deleted at the beginning of a queue index. Therefore, every queue
index keeps its first and last index page directly accessible, so that
lookups are very fast. In addition, all index pages are stored within the
cache.
The transactional store includes a cache that keeps frequently used pages in main memory. The decision whether a page is relocated is based on its hit count and its last access time. The cache is controlled by a cache manager, through which every page access by other components is performed. The cache acts as both a read and a write cache, since pages are not flushed to disk until a checkpoint. This yields very high throughput for both read and write transactions.
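The relocation decision based on hit count and last access time could look like the following sketch. The field names and the exact ordering (fewest hits first, ties broken by oldest access) are assumptions for illustration, not SwiftMQ's actual policy.

```java
import java.util.*;

// Sketch of a relocation decision driven by hit count and last access
// time, as described above. Weighting and tie-breaking are assumptions.
public class CacheEntry {
    final int pageNo;
    final int hits;           // how often the page was accessed
    final long lastAccess;    // timestamp of the last access

    CacheEntry(int pageNo, int hits, long lastAccess) {
        this.pageNo = pageNo;
        this.hits = hits;
        this.lastAccess = lastAccess;
    }

    // Choose the relocation victim: fewest hits, then oldest access
    static CacheEntry victim(List<CacheEntry> entries) {
        return Collections.min(entries,
            Comparator.comparingInt((CacheEntry e) -> e.hits)
                      .thenComparingLong(e -> e.lastAccess));
    }
}
```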
Every access to the transactional store, whether reading or writing, is
performed in the form of store transactions, which are controlled by a
store transaction manager. Every transaction has a journal in which any
change within a page is recorded at byte level. Page accesses within a
transaction begin with a pin of the page, which ensures that the page is
not flushed before the transaction is closed.
A commit or a rollback always terminates a transaction. Both actions are
performed by a recovery manager, which transfers a log record to the log
manager. In the case of a rollback, it first rolls back the changes on
the pages in the cache. The log manager works asynchronously with an
input queue. Log records are placed into this queue by the recovery
manager and processed asynchronously by the log manager. Because of this
asynchronous implementation, the log manager can process several log
records in one go (a so-called group commit). The log manager writes the
log record of the transaction into a RandomAccessFile named
transaction.log and notifies the transaction, which then unpins all of
its pages. With that, the transaction is finished. In the case of a
system failure, all changes can be reconstructed from the transaction
log. This procedure is called "write-ahead logging": it guarantees that
the log record is stored before the corresponding changes to the actual
data are written into the "stable store".
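The group-commit idea above can be sketched as follows: the log manager drains whatever records have accumulated in its input queue and writes them out in one batch with a single sync. Names and structure are illustrative, not SwiftMQ's internal API.

```java
import java.util.*;
import java.util.concurrent.*;

// Sketch of asynchronous group commit: drain all pending log records
// from the input queue at once, so one disk sync covers the whole batch.
public class GroupCommit {
    static int drainAndCount(BlockingQueue<String> queue) {
        List<String> batch = new ArrayList<>();
        queue.drainTo(batch);   // take every pending log record in one go
        // ... append the batch to transaction.log, then a single sync ...
        return batch.size();    // number of records committed as a group
    }
}
```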
Automatic Message Page Referencing
The Store Swiftlet is able to detect whether a message has already been stored and automatically references new inserts of the same message within the same transaction (composite transaction) to the message already stored. This is currently supported for non-XA transactions of durable subscribers, so that only one copy of the message is stored for all durable subscribers.
The synchronization between page.db and transaction.log is performed by
the log manager when the configured maximum size of the transaction log
is reached. The log manager then initiates a so-called checkpoint. This
works as follows: the log manager determines that the transaction log has
reached the maximum size and informs the store transaction manager, which
blocks every new transaction request and waits until all current
transactions are finished. The store transaction manager then puts a
checkpoint ready event into the log manager's queue. The log manager
receives the event and instructs the cache manager to flush the whole
cache. The cache manager writes all changed pages into page.db and
executes a disk sync. All data is now consistent. The log manager then
resets the transaction log and forwards a checkpoint finished event to
the store transaction manager, which again allows new transactions.
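The blocking handshake described above can be sketched as a simple gate: new transactions are held back while a checkpoint is in progress, and the checkpoint waits until all active transactions have finished before flushing. This is an illustrative monitor-based sketch, not SwiftMQ's actual event mechanism.

```java
// Sketch of the checkpoint handshake: block new transactions, wait for
// active ones to finish, flush, then reopen. Illustrative only.
public class CheckpointGate {
    private int active = 0;
    private boolean checkpointing = false;

    synchronized void begin() throws InterruptedException {
        while (checkpointing) wait();   // new transactions blocked during a checkpoint
        active++;
    }

    synchronized void end() {
        if (--active == 0) notifyAll(); // last transaction wakes a waiting checkpoint
    }

    synchronized void checkpoint(Runnable flush) throws InterruptedException {
        checkpointing = true;
        while (active > 0) wait();      // wait until all current transactions finish
        flush.run();                    // flush cache to page.db, sync, reset the log
        checkpointing = false;
        notifyAll();                    // allow new transactions again
    }
}
```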
A checkpoint is also performed automatically when a router is shut down regularly. If the router terminates for other reasons (e.g. a process failure), a recovery phase runs during the startup of the Store Swiftlet, performed by the recovery manager. It processes the transaction log and produces a consistent state.
XA transactions are distributed transactions in which several resource managers may be involved. A resource manager manages one resource, e.g. a database system or a message server. Hence, SwiftMQ is a resource. The handling of XA transactions follows the two-phase-commit protocol, in short 2PC. The first phase of this protocol is the prepare phase. For this, the JTA transaction manager (e.g. an application server) sends a prepare request to all resource managers. A resource manager has to guarantee that the transaction's changes are stored persistently. This is done by writing a prepare log record. The transaction is not yet closed; all locks remain. The second phase comprises either a commit or a rollback. The prepare phase guarantees that all participating resource managers commit or abort their transactions. If a resource manager fails between prepare and commit, the transaction is in an in-doubt state, a so-called "in-doubt transaction". After the resource manager restarts, either the JTA transaction manager instructs it to perform a commit or a rollback, or this has to be done manually via administrative functions. Subsequently, consistency is restored.
During the prepare phase, the Store Swiftlet writes a prepare log record into an internal log queue. Thereafter, the transaction is completely committed. However, the locks remain, so the changes are not visible to other transactions. If a commit follows in the next phase, only the prepare log record is deleted. In the case of a rollback, the whole transaction is aborted and the prepare log record is deleted at the end. Only these operations guarantee the consistency required for 2PC. If the router terminates between prepare and commit, all prepare log records are read during a restart, and all in-doubt transactions that were active at the time of termination are reconstructed, including all their locks. This part is handled by the XA Resource Manager Swiftlet, which also provides functions for a manual commit/rollback of these in-doubt transactions.
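The bookkeeping role of the prepare log records can be sketched as follows: a record is added in phase 1 and removed on commit or rollback, so whatever survives a crash identifies the in-doubt transactions. The class and method names are illustrative, not SwiftMQ's API.

```java
import java.util.*;

// Sketch of in-doubt bookkeeping via prepare log records, as described
// above. Records still present after a restart mark in-doubt transactions.
public class PrepareLog {
    private final Set<String> prepared = new HashSet<>();

    void prepare(String txId)  { prepared.add(txId); }     // phase 1: write prepare record
    void commit(String txId)   { prepared.remove(txId); }  // phase 2: delete prepare record
    void rollback(String txId) { prepared.remove(txId); }  // abort, then delete record

    // After a restart, whatever is left is in doubt
    Set<String> inDoubt() { return new HashSet<>(prepared); }
}
```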
Illustration of the Transactional Store
If a queue cache is full, the swap store serves for swapping out messages. In this case, a RandomAccessFile is created per queue into which messages are swapped out. For every stored message, the file position is returned and stored within the queue message index, so access performance is maximal. Swap files are deleted automatically when all entries are marked as deleted, as well as during a router shutdown. For swap files, a roll-over size may be specified to ensure that files do not grow too large. When this size is reached, a new swap file is created and the previous one persists until it is empty.
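Swapping a message out and retrieving it again by file position could look like the sketch below. The record layout (a length prefix followed by the message bytes) is an assumption for illustration; SwiftMQ's actual swap file format is not specified here.

```java
import java.io.*;

// Sketch of swapping a message out to a per-queue swap file; the returned
// file position is what the queue message index would store. The length-
// prefix layout is an assumption.
public class SwapFile {
    static long swapOut(RandomAccessFile raf, byte[] message) throws IOException {
        long pos = raf.length();
        raf.seek(pos);
        raf.writeInt(message.length);   // length prefix
        raf.write(message);
        return pos;                     // stored in the queue message index
    }

    static byte[] swapIn(RandomAccessFile raf, long pos) throws IOException {
        raf.seek(pos);                  // direct access via the stored position
        byte[] msg = new byte[raf.readInt()];
        raf.readFully(msg);
        return msg;
    }
}
```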
Durable Subscriber Store
Durable subscriber definitions are stored in single files. The file name contains the client-id and the durable subscriber name.