One of the major new features in the MongoDB 4.0 release is ACID transactions. For a quick refresher, here is what Wikipedia had to say about ACID transactions at the time of writing:
In computer science, ACID (Atomicity, Consistency, Isolation, Durability) is a set of properties of database transactions intended to guarantee validity even in the event of errors, power failures, etc. In the context of databases, a sequence of database operations that satisfies the ACID properties, and thus can be perceived as a single logical operation on the data, is called a transaction. For example, a transfer of funds from one bank account to another, even involving multiple changes such as debiting one account and crediting another, is a single transaction.
In this post I am going to cover three topics. First, a brief review of transaction syntax in other databases. Next, a quick overview of MongoDB transactions. Finally, a demonstration of MongoDB transactions using the Java driver.
If you are unfamiliar with database transactions it may be helpful to review how some of the popular relational databases handle them before we get into the MongoDB transactions. Let's take a look at the structure around MySQL, Oracle, and PostgreSQL for transactions. I will leave the more advanced topics for the reader to review. Perhaps I will prepare a deep dive in a later post if there is interest.
Example from the MySQL docs:
Example from the Oracle docs:
Example from the PostgreSQL docs:
Now, let us turn our attention to MongoDB transactions. For more information on MongoDB transactions, please see the official documentation.
There are a few things from the documentation that I feel are very important to keep in mind.
For transactions:
- You can specify read/write (CRUD) operations on existing collections. The collections can be in different databases.
- You cannot read/write to collections in the config, admin, or local databases.
- You cannot write to system.* collections.
- You cannot return the supported operation’s query plan (i.e. explain).
- For cursors created outside of transactions, you cannot call getMore inside a transaction.
- For cursors created in a transaction, you cannot call getMore outside the transaction.
The following operations are not allowed in multi-document transactions:
- Operations that affect the database catalog, such as creating or dropping a collection or an index. For example, a multi-document transaction cannot include an insert operation that would result in the creation of a new collection.
- The listCollections and listIndexes commands and their helper methods are also excluded.
- Non-CRUD and non-informational operations, such as createUser, getParameter, count, etc. and their helpers.
The following code examples are from a MongoDB 4.0 transactions demo I wrote during the 3.7 beta testing and may be found here. This code base should be considered as demo/example quality.
First, let’s take a look at the transaction structure, as we did with the other databases. In this case, I am going to use the MongoDB Java driver syntax, as there is no direct SQL translation.
Once you get past the Java syntax, the structure is pretty much the same as in the other databases. Let's take a
few of the key elements out and display them in less verbose pseudo code. (And yes, the MySQL, Oracle, or Postgres
Java code is just as ugly if you are using the ODBC/JDBC drivers directly.)
In short, a MongoDB transaction has the same shape as the transactions in the other databases: start the transaction, do the work, then either commit or abort.
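To make that start, work, commit-or-abort shape concrete without needing a running database, here is a minimal sketch in plain Java. Everything in it is an assumption for illustration: ToySession, its method names, and the "inventory.qty" and "shipment.qty" keys are hypothetical stand-ins and are not part of the MongoDB driver or of the demo code. Writes are staged in memory and only become visible on commit, which is the all-or-nothing behavior we want from a transaction.

```java
import java.util.HashMap;
import java.util.Map;

public class ToyTransactionDemo {

    /** Hypothetical in-memory stand-in for a database session; not a driver class. */
    static class ToySession {
        private final Map<String, Integer> store;
        private final Map<String, Integer> staged = new HashMap<>();
        private boolean active = false;

        ToySession(Map<String, Integer> store) {
            this.store = store;
        }

        void startTransaction() {
            staged.clear();
            active = true;
        }

        /** Stages a write; nothing reaches the store until commit. */
        void write(String key, int value) {
            if (!active) {
                throw new IllegalStateException("no transaction in progress");
            }
            staged.put(key, value);
        }

        /** Applies every staged write to the store in one step. */
        void commitTransaction() {
            if (!active) {
                throw new IllegalStateException("no transaction in progress");
            }
            store.putAll(staged);
            staged.clear();
            active = false;
        }

        /** Discards every staged write. */
        void abortTransaction() {
            staged.clear();
            active = false;
        }
    }

    /** Runs one toy transaction and returns the resulting store. */
    static Map<String, Integer> run(boolean failMidway) {
        Map<String, Integer> store = new HashMap<>();
        store.put("inventory.qty", 500);
        ToySession session = new ToySession(store);
        try {
            session.startTransaction();
            session.write("inventory.qty", 400);
            if (failMidway) {
                throw new RuntimeException("simulated failure between the two writes");
            }
            session.write("shipment.qty", 100);
            session.commitTransaction();
        } catch (RuntimeException e) {
            session.abortTransaction();
        }
        return store;
    }

    public static void main(String[] args) {
        System.out.println("committed: " + run(false));
        System.out.println("aborted:   " + run(true));
    }
}
```

In the aborted run the inventory quantity stays at 500 and no shipment entry appears, because the failure happens before the commit.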
We will begin by taking a look at the main class that runs the demo. One line to note is:
DemoMongoConnector dmc = new DemoMongoConnector()
This class wraps MongoClient for convenience. The code may be found here.
The first thing we are doing after instantiating the connector is attaching a change stream watcher:
Change streams were introduced in MongoDB 3.6 and allow one to have an open query against the changes in the database. They also make several of my blog posts on oplog tailing mostly old news. (sad face)
Change streams are outside the scope of this post, but we will use them to watch when the database accepts the writes during transactions. The change stream code may be found here.
Next, we set up the database environment for the transactional tests. The collections to be used during a transaction must be created prior to the start of the transaction. For this testing scenario we will begin by dropping the existing "test" database and then recreating the inventory and shipment collections. We will then insert an inventory document with a quantity value of 500.
Now it’s time to begin the transactions! The launch method delegates the actual work to the submit method that follows it, but the forEach calling future.get() is important here. Each future.get() is a blocking call, so the main thread waits until every transaction thread has completed.
We’re almost to the good part, I swear! The submitTransactionThreads method is creating four transaction threads with the TransactionRetryModule that are going to increase the quantity of our item by one hundred, and four that will decrease the value by one hundred. Thus, a total of eight transaction threads will be created and placed into the changeStreamExecutorService. The second argument to the iterateTransactions will be used in a loop to fire the transaction multiple times. In this case, each thread will do its transaction five times. With the two combined arguments, each thread will increase or decrease the value by five hundred. This is done for testing only.
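The thread fan-out and the blocking future.get() calls described above can be sketched with nothing but java.util.concurrent. This is a simplified model under stated assumptions, not the demo's code: the class name, the iterate helper, and the pool size are hypothetical, and an AtomicInteger stands in for the inventory document, so every "transaction" here succeeds on the first try.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

public class ThreadFanOutDemo {

    /** Builds a task that applies delta to qty the given number of times. */
    static Runnable iterate(AtomicInteger qty, int delta, int iterations) {
        return () -> {
            for (int i = 0; i < iterations; i++) {
                qty.addAndGet(delta);
            }
        };
    }

    /** Submits four increasing and four decreasing tasks, waits for all, returns the final quantity. */
    static int runDemo() {
        AtomicInteger qty = new AtomicInteger(500); // stand-in for the inventory document
        ExecutorService pool = Executors.newFixedThreadPool(8);
        List<Future<?>> futures = new ArrayList<>();
        try {
            for (int t = 0; t < 4; t++) {
                futures.add(pool.submit(iterate(qty, 100, 5)));  // +500 per thread
                futures.add(pool.submit(iterate(qty, -100, 5))); // -500 per thread
            }
            // Blocking call: the calling thread waits here until each task has finished.
            for (Future<?> future : futures) {
                future.get();
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
        return qty.get();
    }

    public static void main(String[] args) {
        System.out.println("final quantity: " + runDemo()); // prints "final quantity: 500"
    }
}
```

Because the four increasing threads add 2000 in total and the four decreasing threads remove 2000, the quantity should end where it started, at 500. That makes the final value an easy sanity check for the real demo as well.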
Retries are very important for the MongoDB transaction logic. The documentation gives examples for multiple languages.
From the MongoDB transaction documentation:
The individual write operations inside the transaction are not retryable, regardless of whether retryWrites is set to true. If an operation encounters an error, the returned error may have an errorLabels array field. If the error is a transient error, the errorLabels array field contains "TransientTransactionError" as an element and the transaction as a whole can be retried.
I highly recommend reading up on the circuit breaker pattern if you implement retry patterns in production code. The Netflix Hystrix library is one of the more popular implementations that I have worked with. It is, however, rather heavy-weight for this demo. A good post from DZone on retries can be found here.
We are finally ready to review the TransactionRetryModule. The entry point into the class is the iterateTransactions method. This method creates a runnable lambda that is used in our multi-threaded model. Next, we create a new DemoMongoConnector. While one could use the connector from the main class, I wanted to imitate the transactions coming from multiple distinct applications. Next, we generate a random name for logging to show the order in which the transactions are being executed. Then, we loop and fire a new transaction for the number of iterations we passed into the runnable. Finally, the method makes the call that begins the transaction session.
The handleTransactionClientSession method starts the ClientSession and enters the transaction retry loop.
The transactionRetryLoop method begins by instantiating a Retry object that is used to control the loop. Next, we enter the loop and attempt the transaction. If successful, we mark the retry loop as complete. If an error is thrown from the transaction, the transaction will be aborted and retried. The sysouts are for demo only. Friends don't let friends sysout in production. Finally, if the retry loop exits unsuccessfully after maximum attempts, an error will be thrown.
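As a rough illustration of that retry loop, here is a self-contained sketch. It is not the demo's Retry class: LabeledException, failingFirst, and runWithRetry are hypothetical names I made up so the example runs without the driver. The only detail borrowed from MongoDB is the "TransientTransactionError" label string quoted from the documentation above.

```java
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;

public class RetryLoopDemo {

    static final String TRANSIENT_LABEL = "TransientTransactionError";

    /** Hypothetical stand-in for a driver exception that carries error labels. */
    static class LabeledException extends RuntimeException {
        private final Set<String> labels;

        LabeledException(String message, Set<String> labels) {
            super(message);
            this.labels = labels;
        }

        boolean hasErrorLabel(String label) {
            return labels.contains(label);
        }
    }

    /** Builds a fake transaction that fails with a transient error on its first N calls. */
    static Runnable failingFirst(int failures) {
        AtomicInteger calls = new AtomicInteger();
        return () -> {
            if (calls.incrementAndGet() <= failures) {
                throw new LabeledException("simulated write conflict",
                        Collections.singleton(TRANSIENT_LABEL));
            }
        };
    }

    /** Retries only transient failures, up to maxAttempts; returns the attempt that succeeded. */
    static int runWithRetry(Runnable transaction, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                transaction.run();
                return attempt;
            } catch (LabeledException e) {
                if (!e.hasErrorLabel(TRANSIENT_LABEL)) {
                    throw e; // not transient, so retrying would not help
                }
                // Transient: real code would abort the transaction here before looping again.
            }
        }
        throw new IllegalStateException("transaction failed after " + maxAttempts + " attempts");
    }

    public static void main(String[] args) {
        int attempt = runWithRetry(failingFirst(2), 5);
        System.out.println("succeeded on attempt " + attempt); // prints "succeeded on attempt 3"
    }
}
```

Capping the number of attempts matters: without a limit, a persistently failing transaction would spin forever, which is the kind of situation the circuit breaker pattern is meant to handle more gracefully.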
We made it! The doTransaction method is where the magic finally happens. We begin the transaction, we perform an update on the inventory collection, and make an insert to the shipment collection. Finally, we commit the transaction. In the preceding code block we checked for a MongoException with a specific label MongoException.TRANSIENT_TRANSACTION_ERROR_LABEL in the catch statement. This condition signifies that the transaction has failed due to a TransientTransactionError.
Please note the code block that updates a document in the inventory collection. This is the "shared" document for all of the transactions in this demo and is the key to raising the transaction errors we are checking for as each thread attempts to update the qty field for sku "abc123".
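To see why a single shared document produces these errors, here is a deliberately simplified model in plain Java. It is an assumption-laden sketch, not a description of how the server works internally: SharedDocument and its methods are hypothetical, and the only behavior it imitates is that a second transaction writing to a document with an uncommitted write from another transaction fails right away instead of waiting.

```java
public class WriteConflictDemo {

    /** Hypothetical model of one shared document; not how the server stores data. */
    static class SharedDocument {
        private String writer;    // transaction holding an uncommitted write, or null
        private int pendingDelta; // uncommitted change to qty
        private int qty;

        SharedDocument(int qty) {
            this.qty = qty;
        }

        /** Returns false immediately if another open transaction already wrote here. */
        synchronized boolean tryWrite(String txn, int delta) {
            if (writer != null && !writer.equals(txn)) {
                return false; // conflict: the caller should abort and retry
            }
            writer = txn;
            pendingDelta += delta;
            return true;
        }

        /** Makes the pending change visible and releases the document. */
        synchronized void commit(String txn) {
            if (txn.equals(writer)) {
                qty += pendingDelta;
                pendingDelta = 0;
                writer = null;
            }
        }

        /** Discards the pending change and releases the document. */
        synchronized void abort(String txn) {
            if (txn.equals(writer)) {
                pendingDelta = 0;
                writer = null;
            }
        }

        synchronized int qty() {
            return qty;
        }
    }

    public static void main(String[] args) {
        SharedDocument doc = new SharedDocument(500);
        System.out.println("A writes +100: " + doc.tryWrite("A", 100));  // true
        System.out.println("B writes -100: " + doc.tryWrite("B", -100)); // false, A is still open
        doc.commit("A");
        System.out.println("B retries:     " + doc.tryWrite("B", -100)); // true
        doc.commit("B");
        System.out.println("final qty:     " + doc.qty());               // 500
    }
}
```

With eight threads all targeting the same document, collisions like the one between A and B above are frequent, which is what keeps the retry loop busy in the demo output.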
The standard out for a run of the demo can be found here. It is rather verbose to paste into this post, but if you examine it you will see the flow of aborted transactions and how the change streams show the successful ones.
Transactions are a big step forward for MongoDB. In the 4.0 release, they are limited to replica sets, but sharded clusters are on the roadmap. As with most new additions to a technology, I highly recommend doing extensive testing yourself prior to running it in production. Remember, even with transactions, it is your responsibility to ensure the integrity of your data. Any time your data model becomes more complex, so does the array of issues that can arise.
I hope this review has been helpful, and feel free to test out the code for yourself.
Jai Hirsch
Senior Systems Architect
CARFAX
jai.hirsch@gmail.com