This section is designed to give you a high-level overview of MongoDB 4.x. In this section, you are introduced to the first of three realistic scenarios, a fictitious company called Sweets Complete, Inc. These scenarios are representative of real-life companies and are designed to show how MongoDB can be incorporated into various types of businesses.
In this section, you'll learn about the key differences between MongoDB 3 and 4. You will learn not only how to install MongoDB 4.x for a future customer, but also how to set up an environment based upon Docker technology that you can use throughout the remainder of the book in order to run the examples. You will also learn basic MongoDB administration using the mongo shell.
In this book, we cover how to work with a MongoDB 4.x database, starting with the simplest concepts and moving on to more complex ones. The book is divided into parts or sections, each of which looks at a different scenario.
In this chapter, you are given a general introduction to MongoDB 4.x with a focus on new features and a brief high-level overview of the technology. We also discuss security enhancements, along with backward-incompatible changes that might cause an application written for MongoDB 3 to break after an upgrade to MongoDB 4.x.
In the next chapter, a simple scenario is introduced: a fictitious company called Sweets Complete Inc. that sells confections online to a small base of international customers. In the next two parts that follow, you are introduced to BookSomething.com, another fictitious company with a large database of hotel listings worldwide. Finally, in the last part, you are introduced to BigLittle Micro Finance Ltd., a fictitious company that connects lenders with borrowers and deals with a massive volume of geographically dispersed data.
In this chapter, the following topics are covered:
- A high-level technology overview of MongoDB 4.x
- Significant new features introduced in MongoDB 4.x
- Important security enhancements
- Spotting and avoiding potential problems when migrating from MongoDB 3 to 4
High-level technology overview of MongoDB 4.x
When it was first introduced in 2009, MongoDB took the database world by storm, and since that time it has rapidly gained in popularity. According to the 2019 StackOverflow developer survey (https://insights.stackoverflow.com/survey/2019#technology-_-databases), MongoDB is ranked fifth, with 26% of professional developers and 25.5% of all respondents saying they use MongoDB. DB-Engines (https://db-engines.com/en/ranking) also ranks MongoDB as the fifth most widely used database, using an algorithm that takes into account the frequency of search, DBA Stack Exchange and StackOverflow references, and the frequency with which MongoDB appears in job postings. What is of even more interest is that the trend graph generated by DB-Engines shows that the score (and therefore ranking) of MongoDB has grown by 200% since 2013. You can refer to https://db-engines.com/en/ranking_trend for more details. In 2013, MongoDB was not even in the top 10!
There are many key features of MongoDB that account for its rise in popularity. Subsequent chapters in this book cover the most important of these features in detail. In this section, we present you with a brief, big-picture overview of three key aspects of MongoDB.
MongoDB is based upon documents
One of the most important distinctions between MongoDB and the traditional relational database management systems (RDBMS) is that instead of tables, rows, and columns, the basis for storage in MongoDB is a document. In a certain sense, you can think of the traditional RDBMS system as two dimensional, whereas MongoDB is three dimensional. Documents are typically modeled using JSON formatting and then inserted into the database where they are converted to a binary format for storage (more on that in later chapters!).
Related to the document basis for storage is the fact that MongoDB documents have no fixed schema. The main benefit of this is vastly reduced overhead. Database restructuring is a piece of cake, and doesn't cause the massive problems, website crashes, and security breaches seen in applications reliant upon a traditional RDBMS database restructuring.
The really great news for developers is that most modern programming applications are based on classes representing information that needs to be stored. This has spawned the creation of a large number of object-relational mapping (ORM) libraries for the various programming languages. In MongoDB, on the other hand, the need for a complex ORM infrastructure is completely eliminated as programmatic objects can be directly stored in the database as-is:
So instead of columns, MongoDB documents have fields. Instead of tables, there are collections of documents. Let's now have a look at replication in MongoDB.
High availability
Another feature that causes MongoDB to stand out from other database technologies is its ability to ensure high availability through a process known as replication. A server running MongoDB can have copies of its databases duplicated across two more servers. These copies are known as replica sets. Replica sets are organized through an election process whereby the members of the replica vote on which server becomes the primary. Other servers are then assigned the role of secondary.
This arrangement not only ensures that the database is continuously available, but that it can also be used by application code by way of read preferences. A read preference tells the replica set which servers in the replica set are preferred. If the read preferences are set less restrictively, then the first server in the set to respond might be able to satisfy the request, thereby implementing a form of parallel process that has the potential to greatly enhance performance. This setup is illustrated in the following diagram:
This topic is covered in extensive detail in Chapter 13, Deploying a Replica Set. Lastly, we have a look at sharding.
Horizontal scaling
One more feature, among many, is MongoDB's ability to handle a massive amount of data. This is accomplished by splitting up a sizeable collection across multiple servers, creating a sharded cluster. In the process of splitting the collection, a shard key is chosen from among the fields present with the collection's documents. The shard key is then used by the sharded cluster balancer to determine the appropriate distribution of documents. Application program code is then able, by its knowledge of the value of the shard key, to direct queries to specific members of the sharded cluster, achieving potentially enormous performance gains. This setup is illustrated in the following diagram:
This topic is covered in extensive detail in Chapter 15, Deploying a Sharded Cluster. In the next section of this chapter, we have a look at the major differences between MongoDB 3 and MongoDB 4.x.
Discovering what's new and different in MongoDB 4.x
What's new and different in the MongoDB 4.x release can be broken down into two main categories: new features and internal enhancements. Let's look at the most significant new features first.
Significant new features
The most significant new features introduced in MongoDB 4.x include the following:
- Multidocument ACID transaction support
- Nonblocking secondary replica reads
- In-progress index build interruption
Transactions, secondary replica reads, and the aggregation pipeline are covered in detail in later chapters of this book. For an excellent brief overview of the major changes from MongoDB 3 to 4, go to
https://www.mongodb.com/blog/post/mongodb-40-release-candidate-0-has-landed.
Multidocument ACID transaction support
In the database world, a transaction is a block of database operations that should be treated as if the entire block of commands was just a single command. An example would be where your application is performing end-of-the-month payroll processing. In order to maintain the integrity of the database, and your accounting files, you would need this set of operations to be safeguarded in the event of a failure. ACID is an acronym that stands for atomicity, consistency, isolation, and durability. It represents a set of principles that the database needs to follow in order to safeguard a block of database updates. For more information on ACID, you can refer to https://en.wikipedia.org/wiki/ACID.
In MongoDB 3, a write operation on a single docum...