One of the major new features in the MongoDB 4.0 release is ACID transactions. For a quick refresher, here is what Wikipedia has to say on ACID transactions at the time of this article:
In computer science, ACID (Atomicity, Consistency, Isolation, Durability) is a set of properties of database transactions intended to guarantee validity even in the event of errors, power failures, etc. In the context of databases, a sequence of database operations that satisfies the ACID properties, and thus can be perceived as a single logical operation on the data, is called a transaction. For example, a transfer of funds from one bank account to another, even involving multiple changes such as debiting one account and crediting another, is a single transaction.
In this post I am going to cover three topics. First, a brief review of transaction syntax in other databases.
Next, a quick overview of MongoDB transactions. Finally, a demonstration of MongoDB transactions using the Java
driver.
A brief review of transaction syntax
If you are unfamiliar with database transactions, it may be helpful to review how some of the popular relational databases handle them before we get into MongoDB transactions. Let's take a look at the basic transaction structure shared by MySQL, Oracle, and PostgreSQL. I will leave the more advanced topics for the reader to review; perhaps I will prepare a deep dive in a later post if there is interest.
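The syntax differs only slightly across the three: MySQL and PostgreSQL use START TRANSACTION/BEGIN, COMMIT, and ROLLBACK, while Oracle starts transactions implicitly and ends them with COMMIT or ROLLBACK. Since the rest of this post is Java, here is a rough JDBC sketch of that shared structure; the connection details, table, and column names are made up for illustration.

```java
import java.math.BigDecimal;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class JdbcTransferExample {

    // The accounts table, its columns, and the connection details are illustrative only.
    public static void transfer(String url, String user, String password,
                                long fromId, long toId, BigDecimal amount) throws SQLException {
        try (Connection conn = DriverManager.getConnection(url, user, password)) {
            conn.setAutoCommit(false);                        // effectively START TRANSACTION / BEGIN
            try (PreparedStatement debit = conn.prepareStatement(
                         "UPDATE accounts SET balance = balance - ? WHERE id = ?");
                 PreparedStatement credit = conn.prepareStatement(
                         "UPDATE accounts SET balance = balance + ? WHERE id = ?")) {

                debit.setBigDecimal(1, amount);
                debit.setLong(2, fromId);
                debit.executeUpdate();

                credit.setBigDecimal(1, amount);
                credit.setLong(2, toId);
                credit.executeUpdate();

                conn.commit();                                // COMMIT: both updates or neither
            } catch (SQLException e) {
                conn.rollback();                              // ROLLBACK on any failure
                throw e;
            }
        }
    }
}
```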
Now, let us turn our attention to MongoDB transactions. For more information on MongoDB transactions, please see
the official
documentation.
There are a few things from the documentation that I feel are very important to keep in mind.
For transactions:
You can specify read/write (CRUD) operations on existing collections. The collections can be in different databases.
You cannot read/write to collections in the config, admin, or local databases.
You cannot write to system.* collections.
You cannot return the supported operation’s query plan (i.e. explain).
For cursors created outside of transactions, you cannot call getMore inside a transaction.
For cursors created in a transaction, you cannot call getMore outside the transaction.
The following operations are not allowed in multi-document transactions:
Operations that affect the database catalog, such as creating or dropping a collection or an index. For
example, a multi-document transaction cannot include an insert operation that would result in the creation of a new collection.
The listCollections and listIndexes commands and their helper methods are also excluded.
Non-CRUD and non-informational operations, such as createUser, getParameter, count, etc. and their helpers.
The following code examples are from a MongoDB 4.0 transactions demo I wrote during the 3.7 beta testing and may
be found
here.
This code base should be considered demo/example quality.
First, let's take a look at the transaction structure as we did with the other databases. In this case, I am going to use the MongoDB Java driver syntax, as there is not a direct SQL translation.
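The demo's real classes live in the repository linked above; the following is a stand-alone sketch of the basic structure only, with a placeholder connection string and collection names.

```java
import com.mongodb.MongoException;
import com.mongodb.client.ClientSession;
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoDatabase;
import com.mongodb.client.model.Filters;
import com.mongodb.client.model.Updates;
import org.bson.Document;

public class TransactionStructureExample {

    public static void main(String[] args) {
        // Transactions require a replica set; the connection string is a placeholder.
        try (MongoClient client = MongoClients.create("mongodb://localhost:27017/?replicaSet=rs0")) {
            MongoDatabase db = client.getDatabase("test");
            try (ClientSession session = client.startSession()) {
                session.startTransaction();                              // begin
                try {
                    db.getCollection("inventory").updateOne(session,
                            Filters.eq("sku", "abc123"),
                            Updates.inc("qty", -100));
                    db.getCollection("shipment").insertOne(session,
                            new Document("sku", "abc123").append("qty", 100));
                    session.commitTransaction();                         // commit
                } catch (MongoException e) {
                    session.abortTransaction();                          // rollback
                    throw e;
                }
            }
        }
    }
}
```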
Once you get past the Java syntax, the structure is pretty much the same as in the other databases. Let's take a few of the key elements out and display them in less verbose pseudocode. (And yes, the MySQL, Oracle, or Postgres Java code is just as ugly if you are using the ODBC/JDBC drivers directly.)
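Boiled down to the key elements, the skeleton is roughly the following (not compilable on its own; the session and collections are assumed to already exist):

```java
session.startTransaction();
try {
    inventory.updateOne(session, filter, update);   // every operation passes the session
    shipment.insertOne(session, document);
    session.commitTransaction();
} catch (MongoException e) {
    session.abortTransaction();
    throw e;
}
```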
The structure of MongoDB transactions follows the same pattern as the other examples.
A demonstration of MongoDB transactions using the Java driver
We will begin by taking a look at the main class that runs the demo. One line to note is: DemoMongoConnector dmc = new DemoMongoConnector(). This class wraps MongoClient for convenience. The code may be found here.
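I will not reproduce the repository's class here, but a minimal wrapper along the following lines is enough to follow the rest of the post. The connection string and helper methods are my assumptions, not necessarily what the demo code does.

```java
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.MongoDatabase;
import org.bson.Document;

// Hypothetical sketch only; see the repository for the real DemoMongoConnector.
public class DemoMongoConnector {

    private final MongoClient mongoClient;

    public DemoMongoConnector() {
        this.mongoClient = MongoClients.create("mongodb://localhost:27017/?replicaSet=rs0");
    }

    public MongoClient getMongoClient() {
        return mongoClient;
    }

    public MongoCollection<Document> getCollection(String name) {
        return mongoClient.getDatabase("test").getCollection(name);
    }
}
```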
The first thing we are doing after instantiating the connector is attaching a change stream watcher:
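The real watcher code is in the repository linked below; roughly, the idea is to open a change stream on a background thread and print each event as it arrives. The collection name and the threading here are assumptions.

```java
// Sketch only: watch the inventory collection on a background thread.
ExecutorService watcherExecutor = Executors.newSingleThreadExecutor();
watcherExecutor.submit(() -> {
    MongoCursor<ChangeStreamDocument<Document>> cursor = dmc.getCollection("inventory")
            .watch()
            .fullDocument(FullDocument.UPDATE_LOOKUP)   // include the updated document in update events
            .iterator();
    while (cursor.hasNext()) {
        System.out.println("change stream event: " + cursor.next().getFullDocument());
    }
});
```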
Change streams
were introduced in MongoDB 3.6 and allow one to have an open query against the changes in the database. They
also make several of my blog posts on oplog tailing mostly old news. (sad face)
Change streams are outside the scope of this post, but we will use them to watch when the database accepts the
writes during transactions. The change stream code may be found
here.
Next, we set up the database environment for the transactional tests. The collections to be used during a transaction must be created prior to the start of the transaction. For this testing scenario, we will begin by dropping the existing "test" database and then recreating the inventory and shipment collections. We will then insert an inventory document with a quantity value of 500.
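In code, the setup step is roughly the following (a sketch; the demo's helper methods may differ):

```java
MongoDatabase test = dmc.getMongoClient().getDatabase("test");
test.drop();                            // start from a clean "test" database
test.createCollection("inventory");     // collections must exist before a transaction touches them
test.createCollection("shipment");
test.getCollection("inventory")
    .insertOne(new Document("sku", "abc123").append("qty", 500));
```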
Now it's time to begin the transactions! The launch method uses the subsequent submit method to do the actual work, but the forEach calling future.get() is important here. This blocking call keeps the main thread waiting until each of the transaction threads has completed.
We're almost to the good part, I swear! The submitTransactionThreads method creates four transaction threads with the TransactionRetryModule that will increase the quantity of our item by one hundred, and four that will decrease it by one hundred. Thus, a total of eight transaction threads are created and placed into the changeStreamExecutorService. The second argument to iterateTransactions is used in a loop to fire the transaction multiple times; in this case, each thread will run its transaction five times. With the two arguments combined, each thread will increase or decrease the value by a total of five hundred. This is done for testing only.
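A condensed sketch of the launch and submit logic described above; exception handling is omitted, and the names approximate rather than copy the repository's code.

```java
ExecutorService executor = Executors.newFixedThreadPool(8);
List<Future<?>> futures = new ArrayList<>();
TransactionRetryModule module = new TransactionRetryModule();

for (int i = 0; i < 4; i++) {
    futures.add(executor.submit(module.iterateTransactions(100, 5)));    // four "increase" threads
    futures.add(executor.submit(module.iterateTransactions(-100, 5)));   // four "decrease" threads
}

for (Future<?> future : futures) {
    future.get();   // block the main thread until every transaction thread has finished
}
executor.shutdown();
```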
Retries are very important for the MongoDB transaction logic. The documentation gives
examples
for multiple languages.
The individual write operations inside the transaction are not retryable, regardless of whether retryWrites is
set to true.
If an operation encounters an error, the returned error may have an errorLabels array field. If the error is a
transient error, the errorLabels array field contains "TransientTransactionError" as an element and the
transaction as a whole can be retried.
I highly recommend reading up on the
circuit breaker pattern if you
implement retry patterns in production code. The
Netflix Hystrix library is one of the more popular
implementations that I have worked with. It is, however, rather heavy-weight for this demo. A good post
from DZone on retries can be found
here.
We are finally ready to review the
TransactionRetryModule. The entry point into the class is the iterateTransactions method. This
method creates a runnable lambda that is used in our multi-threaded model. Next, we create a new
DemoMongoConnector. While one could use the connector from the main class, I wanted to imitate the
transactions coming from multiple distinct applications. Next, we generate a random name for logging to show the
order in which the transactions are being executed. Then, we loop and fire a new transaction for the number of
iterations we passed into the runnable. Finally, the method calls handleTransactionClientSession, which will begin the transaction session.
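A sketch of what iterateTransactions might look like based on that description; the random-name logging and the exact signature are my guesses.

```java
public Runnable iterateTransactions(final int delta, final int iterations) {
    return () -> {
        // A separate connector per runnable imitates distinct applications.
        DemoMongoConnector dmc = new DemoMongoConnector();
        String name = UUID.randomUUID().toString().substring(0, 8);
        for (int i = 0; i < iterations; i++) {
            System.out.println(name + " firing transaction " + (i + 1));   // demo logging only
            handleTransactionClientSession(dmc, delta);
        }
    };
}
```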
The handleTransactionClientSession method starts the ClientSession and enters the transaction retry loop.
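That step is small; roughly, it opens the session in a try-with-resources block and hands it to the retry loop:

```java
private void handleTransactionClientSession(DemoMongoConnector dmc, int delta) {
    try (ClientSession session = dmc.getMongoClient().startSession()) {
        transactionRetryLoop(dmc, session, delta);
    }
}
```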
The transactionRetryLoop method begins by instantiating a
Retry
object that is used to control the loop. Next, we enter the loop and attempt the transaction. If successful, we
mark the retry loop as complete. If an error is thrown from the transaction, the transaction will be aborted and
retried. The sysouts are for demo only. Friends don't let friends sysout in production. Finally, if the retry loop
exits unsuccessfully after maximum attempts, an error will be thrown.
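Here is a simplified sketch of that loop. The demo's Retry helper is collapsed into a plain attempt counter, and the maximum attempt count is a placeholder.

```java
private void transactionRetryLoop(DemoMongoConnector dmc, ClientSession session, int delta) {
    final int maxAttempts = 3;   // placeholder; the demo's Retry object controls this
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
        try {
            doTransaction(dmc, session, delta);
            return;   // success: the retry loop is complete
        } catch (MongoException e) {
            boolean isTransient = e.hasErrorLabel(MongoException.TRANSIENT_TRANSACTION_ERROR_LABEL);
            if (!isTransient || attempt == maxAttempts) {
                throw e;   // non-transient error, or we are out of attempts
            }
            System.out.println("TransientTransactionError, retrying...");   // demo only
        }
    }
}
```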
We made it! The doTransaction method is where the magic finally happens. We begin the transaction, we
perform an update on the inventory collection, and make an insert to the shipment collection. Finally, we commit
the transaction. In the preceding code block we checked for a MongoException with a specific label
MongoException.TRANSIENT_TRANSACTION_ERROR_LABEL
in the catch statement. This condition signifies that the transaction has failed due to a TransientTransactionError.
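A sketch of doTransaction that follows the structure described above; the collection names and document shape mirror the earlier setup step, and the repository's code may differ in the details.

```java
private void doTransaction(DemoMongoConnector dmc, ClientSession session, int delta) {
    MongoDatabase db = dmc.getMongoClient().getDatabase("test");
    session.startTransaction();
    try {
        db.getCollection("inventory").updateOne(session,
                Filters.eq("sku", "abc123"),
                Updates.inc("qty", delta));                   // the shared inventory document
        db.getCollection("shipment").insertOne(session,
                new Document("sku", "abc123").append("qty", delta));
    } catch (MongoException e) {
        session.abortTransaction();                           // abort this attempt and let the caller decide
        throw e;
    }
    session.commitTransaction();
}
```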
Please note the update to the inventory collection in the code above. The inventory document is the "shared" document for all of the transactions in this demo and is the key to raising the transaction errors we are checking for, as each thread attempts to update the qty field for sku "abc123".
The standard out for a run of the demo can be found
here. It is
rather verbose to paste into this post, but if you examine it you will see the flow of aborted transactions and how
the change streams show the successful ones.
Conclusion
Transactions are a big step forward for MongoDB. In the 4.0 release, they are limited to replica sets, but sharded
clusters are on the road map. As with most new additions to a technology, I highly recommend doing extensive
testing yourself prior to running it in production. Remember, even with transactions, it is your responsibility to
ensure the integrity of your data. Any time your data model becomes more complex, so does the array of issues that
can arise.
I hope this review has been helpful, and feel free to test out the code for yourself.
While at MongoDB World 2016 I was involved in multiple conversations about unit testing and test driven development
of data access objects that wrap the MongoDB Java driver. Not two days after returning to work from MongoDB World
one of the teams I work with was complaining that the 3.x Java driver broke a lot of their testing patterns and that
they were having difficulties mocking out the driver. I had a few meetings canceled so I decided to sit down and
create a testing pattern that could be easily implemented and extended using the Mockito libraries.
A side note before we get started: if you are mocking at the driver layer, you may want to take a step back and consider why you are doing it, since much of the following code violates the principle of "Don't mock types you don't own". While I am not going to dive deep into the philosophy of unit testing, I think these testing patterns may be of use to some. My general rule of
thumb is to only test logic, avoid the file system, and assume that the external systems work as advertised. Avoid
the pitfalls of only testing mocks for the sake of testing and make sure you set up integration layers to verify
that the external systems do, in fact, work as advertised. That being said, all the code for the following examples
may be found here.
Let's begin by setting up our base mocks and injecting them into the class we will be working on.
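A minimal version of that setup might look like the following. DriverWrapper is the class under test, and the "test" database and "people" collection names are placeholders for whatever your wrapper uses.

```java
import static org.mockito.Mockito.when;
import static org.mockito.MockitoAnnotations.initMocks;

import com.mongodb.MongoClient;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.MongoDatabase;
import org.bson.Document;
import org.junit.Before;
import org.mockito.InjectMocks;
import org.mockito.Mock;

public class DriverWrapperTest {

    @Mock
    private MongoClient mongoClient;
    @Mock
    private MongoDatabase database;
    @Mock
    private MongoCollection<Document> collection;

    @InjectMocks
    private DriverWrapper wrapper;   // the class under test

    @Before
    public void setUp() {
        initMocks(this);
        // Wire up the mocked driver chain: client -> database -> collection.
        when(mongoClient.getDatabase("test")).thenReturn(database);
        when(database.getCollection("people")).thenReturn(collection);
    }
}
```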
I am using Mockito's @InjectMocks annotation, which will inject the mock MongoClient into the DriverWrapper class. DriverWrapper is the class under test in this case. Using the JUnit @Before annotation reduces the amount of boilerplate code needed in each test, as this method will run before each test.
For the first test, let's try something simple: a test for a method that finds documents by a last name field.
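Assuming the setup above plus the usual static imports (Mockito's mock, when, and any, and JUnit's assertEquals), a first pass at the test might look like this:

```java
@Test
@SuppressWarnings("unchecked")
public void findsDocumentsByLastName() {
    Document expected = new Document("firstName", "Ada").append("lastName", "Lovelace");

    // Stub the whole driver chain: find(...) -> FindIterable -> MongoCursor.
    FindIterable<Document> findIterable = mock(FindIterable.class);
    MongoCursor<Document> cursor = mock(MongoCursor.class);
    when(collection.find(any(Bson.class))).thenReturn(findIterable);
    when(findIterable.iterator()).thenReturn(cursor);
    when(cursor.hasNext()).thenReturn(true, false);   // one document, then stop
    when(cursor.next()).thenReturn(expected);

    List<Document> actual = wrapper.findByLastName("Lovelace");

    assertEquals(1, actual.size());
    assertEquals(expected, actual.get(0));
}
```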
And here is the method under test:
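This is my reconstruction of a plausible first implementation rather than the exact code from the repository; it walks the cursor by hand:

```java
public class DriverWrapper {

    private final MongoClient mongoClient;

    public DriverWrapper(MongoClient mongoClient) {
        this.mongoClient = mongoClient;
    }

    public List<Document> findByLastName(String lastName) {
        List<Document> results = new ArrayList<>();
        MongoCollection<Document> people =
                mongoClient.getDatabase("test").getCollection("people");
        MongoCursor<Document> cursor =
                people.find(Filters.eq("lastName", lastName)).iterator();
        while (cursor.hasNext()) {
            results.add(cursor.next());
        }
        return results;
    }
}
```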
So this works, but the code feels pretty bulky. Following the red/green/refactor principles, let's clean up the implementation of the findByLastName method. Looking over the Java API, there is an into method that will replace the body of the code above:
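With into, the method can collapse to a single statement along these lines:

```java
public List<Document> findByLastName(String lastName) {
    return mongoClient.getDatabase("test").getCollection("people")
            .find(Filters.eq("lastName", lastName))
            .into(new ArrayList<>());
}
```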
Well, that reduced the line count of the method by a bit... It did, however, break the test, since we now need to mock out different parts of the driver API; the test needs to be updated to reflect the new implementation:
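An updated version of the test that stubs find and into instead of the cursor might look like this:

```java
@Test
@SuppressWarnings("unchecked")
public void findsDocumentsByLastName() {
    Document expected = new Document("firstName", "Ada").append("lastName", "Lovelace");
    ArrayList<Document> results = new ArrayList<>();
    results.add(expected);

    FindIterable<Document> findIterable = mock(FindIterable.class);
    when(collection.find(any(Bson.class))).thenReturn(findIterable);
    when(findIterable.into(any(ArrayList.class))).thenReturn(results);

    List<Document> actual = wrapper.findByLastName("Lovelace");

    assertEquals(results, actual);
}
```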
While the method has been reduced to one line, the test still has a lot of boilerplate that will need to be "duplicated" in the next test. I am a fan of the builder pattern for problems like this, so let's take a look at a way to implement this pattern as a more generic tool for setting up the tests.
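One possible shape for such a helper; the class and method names here are mine, not necessarily the repository's, and the same Mockito static imports as before are assumed.

```java
// Test helper that hides the repetitive stubbing of the driver chain.
@SuppressWarnings("unchecked")
public class MongoCollectionMocker {

    private final MongoCollection<Document> collection;
    private final FindIterable<Document> findIterable = mock(FindIterable.class);
    private final MongoCursor<Document> cursor = mock(MongoCursor.class);

    public MongoCollectionMocker(MongoCollection<Document> collection) {
        this.collection = collection;
    }

    // collection.find(<any filter>) returns the internal FindIterable mock.
    public MongoCollectionMocker find() {
        when(collection.find(any(Bson.class))).thenReturn(findIterable);
        return this;
    }

    // findIterable.into(<any list>) returns the given documents.
    public MongoCollectionMocker into(Document... documents) {
        when(findIterable.into(any(ArrayList.class)))
                .thenReturn(new ArrayList<>(Arrays.asList(documents)));
        return this;
    }

    // Cursor-style stubbing for code that iterates instead of calling into().
    public MongoCollectionMocker iterate(Document document) {
        when(findIterable.iterator()).thenReturn(cursor);
        when(cursor.hasNext()).thenReturn(true, false);
        when(cursor.next()).thenReturn(document);
        return this;
    }
}
```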
Using this approach we are passing the mock MongoCollection we created in the test as a constructor parameter.
We then set collection as a field. We internalize the mock MongoCursor and FindIterable classes that we needed to
set up per test. Finally we use the builder pattern to define the mocking of the find and into methods. The test
may now be rewritten as:
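With that helper in place, the test shrinks to something like:

```java
@Test
public void findsDocumentsByLastName() {
    Document expected = new Document("firstName", "Ada").append("lastName", "Lovelace");

    new MongoCollectionMocker(collection)
            .find()
            .into(expected);

    List<Document> actual = wrapper.findByLastName("Lovelace");

    assertEquals(1, actual.size());
    assertEquals(expected, actual.get(0));
}
```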
Using this new builder it becomes much easier to create a test with multiple Documents returned by the query.
The main difference in this test is the use of the Hamcrest library's IsIterableContainingInOrder class. This
allows us to test the order of a returned list against an expected list in a direct assertion.
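For example, assuming static imports of Hamcrest's MatcherAssert.assertThat and IsIterableContainingInOrder.contains:

```java
@Test
public void findsMultipleDocumentsInOrder() {
    Document ada = new Document("firstName", "Ada").append("lastName", "Lovelace");
    Document annabella = new Document("firstName", "Annabella").append("lastName", "Lovelace");

    new MongoCollectionMocker(collection)
            .find()
            .into(ada, annabella);

    List<Document> actual = wrapper.findByLastName("Lovelace");

    // contains(...) verifies both the elements and their order in one assertion.
    assertThat(actual, contains(ada, annabella));
}
```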
This concludes my basic implementation of using Mockito to test implementations of the MongoDB Java driver. I hope you have found it useful. There are some other patterns and tests in the source code that I am working on, and I may publish more on this later.