Mastering MongoDB 4.x
eBook - ePub

Mastering MongoDB 4.x

Expert techniques to run high-volume and fault-tolerant database solutions using MongoDB 4.x, 2nd Edition

Alex Giamas

  1. 394 pages
  2. English
  3. ePUB (mobile friendly)
  4. Available on iOS & Android
eBook - ePub

Mastering MongoDB 4.x

Expert techniques to run high-volume and fault-tolerant database solutions using MongoDB 4.x, 2nd Edition

Alex Giamas

Book details
Book preview
Table of contents
Citations

About This Book

Leverage the power of MongoDB 4.x to build and administer fault-tolerant database applications

Key Features

  • Master the new features and capabilities of MongoDB 4.x
  • Implement advanced data modeling, querying, and administration techniques in MongoDB
  • Includes rich case-studies and best practices followed by expert MongoDB developers

Book Description

MongoDB is the best platform for working with non-relational data and is considered to be the smartest tool for organizing data in line with business needs. The recently released MongoDB 4.x supports ACID transactions and makes the technology an asset for enterprises across the IT and fintech sectors.

This book provides expertise in advanced and niche areas of managing databases (such as modeling and querying databases) along with various administration techniques in MongoDB, thereby helping you become a successful MongoDB expert. The book helps you understand how the newly added capabilities function with the help of some interesting examples and large datasets. You will dive deeper into niche areas such as high-performance configurations, optimizing SQL statements, configuring large-scale sharded clusters, and many more. You will also master best practices in overcoming database failover, and master recovery and backup procedures for database security.

By the end of the book, you will have gained a practical understanding of administering database applications both on premises and on the cloud; you will also be able to scale database applications across all servers.

What you will learn

  • Perform advanced querying techniques such as indexing and expressions
  • Configure, monitor, and maintain a highly scalable MongoDB environment
  • Master replication and data sharding to optimize read/write performance
  • Administer MongoDB-based applications on premises or on the cloud
  • Integrate MongoDB with big data sources to process huge amounts of data
  • Deploy MongoDB on Kubernetes containers
  • Use MongoDB in IoT, mobile, and serverless environments

Who this book is for

This book is ideal for MongoDB developers and database administrators who wish to become successful MongoDB experts and build scalable and fault-tolerant applications using MongoDB. It will also be useful for database professionals who wish to become certified MongoDB professionals. Some understanding of MongoDB and basic database concepts is required to get the most out of this book.

Frequently asked questions

How do I cancel my subscription?
Simply head over to the account section in settings and click on “Cancel Subscription” - it’s as simple as that. After you cancel, your membership will stay active for the remainder of the time you’ve paid for. Learn more here.
Can/how do I download books?
At the moment all of our mobile-responsive ePub books are available to download via the app. Most of our PDFs are also available to download and we're working on making the final remaining ones downloadable now. Learn more here.
What is the difference between the pricing plans?
Both plans give you full access to the library and all of Perlego’s features. The only differences are the price and subscription period: With the annual plan you’ll save around 30% compared to 12 months on the monthly plan.
What is Perlego?
We are an online textbook subscription service, where you can get access to an entire online library for less than the price of a single book per month. With over 1 million books across 1000+ topics, we’ve got you covered! Learn more here.
Do you support text-to-speech?
Look out for the read-aloud symbol on your next book to see if you can listen to it. The read-aloud tool reads text aloud for you, highlighting the text as it is being read. You can pause it, speed it up and slow it down. Learn more here.
Is Mastering MongoDB 4.x an online PDF/ePUB?
Yes, you can access Mastering MongoDB 4.x by Alex Giamas in PDF and/or ePUB format, as well as other popular books in Informatik & Desktop-Anwendungen. We have over one million books available in our catalogue for you to explore.

Information

Year
2019
ISBN
9781789611380
Edition
2

Section 1: Basic MongoDB – Design Goals and Architecture

In this section, we will go through the history of databases and how we arrived at the need for non-relational databases. We will also learn how to model our data so that storage and retrieval from MongoDB can be as efficient as possible. Even though MongoDB is schemaless, designing how data will be organized into documents can have a great effect in terms of performance.
This section consists of the following chapters:
  • Chapter 1, MongoDB – A Database for Modern Web
  • Chapter 2, Schema Design and Data Modeling

MongoDB – A Database for Modern Web

In this chapter, we will lay the foundations for understanding MongoDB, and how it claims to be a database that's designed for the modern web. Learning in the first place is as important as knowing how to learn. We will go through the references that have the most up-to-date information about MongoDB, for both new and experienced users. We will cover the following topics:
  • SQL and MongoDB's history and evolution
  • MongoDB from the perspective of SQL and other NoSQL technology users
  • MongoDB's common use cases and why they matter
  • MongoDB's configuration and best practices

Technical requirements

You will require MongoDB version 4+, Apache Kafka, Apache Spark and Apache Hadoop installed to smoothly sail through the chapter. The codes that have been used for all the chapters can be found at: https://github.com/PacktPublishing/Mastering-MongoDB-4.x-Second-Edition.

The evolution of SQL and NoSQL

Structured Query Language (SQL) existed even before the WWW. Dr. E. F. Codd originally published the paper, A Relational Model of Data for Large Shared Data Banks, in June, 1970, in the Association of Computer Machinery (ACM) journal, Communications of the ACM. SQL was initially developed at IBM by Chamberlin and Boyce, in 1974. Relational Software (now Oracle Corporation) was the first to develop a commercially available implementation of SQL, targeted at United States governmental agencies.
The first American National Standards Institute (ANSI) SQL standard came out in 1986. Since then, there have been eight revisions, with the most recent being published in 2016 (SQL:2016).
SQL was not particularly popular at the start of the WWW. Static content could just be hardcoded into the HTML page without much fuss. However, as the functionality of websites grew, webmasters wanted to generate web page content driven by offline data sources, in order to generate content that could change over time without redeploying code.
Common Gateway Interface (CGI) scripts, developing Perl or Unix shells, were driving early database-driven websites in Web 1.0. With Web 2.0, the web evolved from directly injecting SQL results into the browser to using two-tier and three-tier architectures that separated views from the business and model logic, allowing for SQL queries to be modular and isolated from the rest of the web application.
On the other hand, Not only SQL (NoSQL) is much more modern and supervened web evolution, rising at the same time as Web 2.0 technologies. The term was first coined by Carlo Strozzi in 1998, for his open source database that did not follow the SQL standard, but was still relational.
This is not what we currently expect from a NoSQL database. Johan Oskarsson, a developer at Last.fm at the time, reintroduced the term in early 2009, in order to group a set of distributed, non-relational data stores that were being developed. Many of them were based on Google's Bigtable and MapReduce papers, or Amazon's DynamoDB, a highly available key-value based storage system.
NoSQL's foundations grew upon relaxed atomicity, consistency, isolation, and durability (ACID) properties, which guarantee the performance, scalability, flexibility, and reduced complexity. Most NoSQL databases have gone one way or another in providing as many of the previously mentioned qualities as possible, even offering adjustable guarantees to the developer. The following diagram describes the evolution of SQL and NoSQL:

The evolution of MongoDB

10gen started to develop a cloud computing stack in 2007 and soon realized that the most important innovation was centered around the document-oriented database that they built to power it, which was MongoDB. MongoDB was initially released on August 27, 2009.
Version 1 of MongoDB was pretty basic in terms of features, authorization, and ACID guarantees but it made up for these shortcomings with performance and flexibility.
In the following sections, we will highlight the major features of MongoDB, along with the version numbers with which they were introduced.

Major feature set for versions 1.0 and 1.2

The different features of versions 1.0 and 1.2 are as follows:
  • Document-based model
  • Global lock (process level)
  • Indexes on collections
  • CRUD operations on documents
  • No authentication (authentication was handled at the server level)
  • Master and slave replication
  • MapReduce (introduced in v1.2)
  • Stored JavaScript functions (introduced in v1.2)

Version 2

The different features of version 2.0 are as follows:
  • Background index creation (since v1.4)
  • Sharding (since v1.6)
  • More query operators (since v1.6)
  • Journaling (since v1.8)
  • Sparse and covered indexes (since v1.8)
  • Compact commands to reduce disk usage
  • Memory usage more efficient
  • Concurrency improvements
  • Index performance enhancements
  • Replica sets are now more configurable and data center aware
  • MapReduce improvements
  • Authentication (since 2.0, for sharding and most database commands)
  • Geospatial features introduced
  • Aggregation framework (since v2.2) and enhancements (since v2.6)
  • TTL collections (since v2.2)
  • Concurrency improvements, among which is DB-level locking (since v2.2)
  • Text searching (since v2.4) and integration (since v2.6)
  • Hashed indexes (since v2.4)
  • Security enhancements and role-based access (since v2.4)
  • V8 JavaScript engine instead of SpiderMonkey (since v2.4)
  • Query engine improvements (since v2.6)
  • Pluggable storage engine API
  • WiredTiger storage engine introduced, with document-level locking, while previous storage engine (now called MMAPv1) supports collection-level locking

Version 3

The different features of version 3.0 are as follows:
  • Replication and sharding enhancements (since v3.2)
  • Document validation (since v3.2)
  • Aggregation framework enhanced operations (since v3.2)
  • Multiple storage engines (since v3.2, only in Enterprise Edition)
  • Query language and indexes collation (since v3.4)
  • Read-only database views (since v3.4)
  • Linearizable read concern (since v3.4)

Version 4

The different features of version 4.0 are as follows:
  • Multi-document ACID transactions
  • Change streams
  • MongoDB tools (Stitch, Mobile, Sync, and Kubernetes Operator)
The following diagram shows MongoDB's evolution:
As we can observe, version 1 was pretty basic, whereas version 2 introduced most of the features present in the current version, such as sharding, usable and special indexes, geospatial features, and memory and concurrency improvements.
On the way from version 2 to version 3, the aggregation framework was introduced, mainly as a supplement to the ageing (and never up to par with dedicated frameworks, such as Hadoop) MapReduce framework. Then, text search was added, and slowly but surely, the framework was improving performance, stability, and security, to adapt to the increasing enterprise load of customers using MongoDB.
With WiredTiger's introduction in version 3, locking became much less of an issue for MongoDB, as it was brought down from the process (global lock) to the document level, almost the most granular level possible.
Version 4 marked a major transition, bridging the SQL and NoSQL world with the introduction of multi-document ACID transactions. This allowed for a wider range of applications to use MongoDB, especially applications that require a strong real-time consistency guarantee. Further, the introduction of change streams allowed for a faster time to market for real-time applications using MongoDB. A series of tools have also been introduced, to facilitate serverless, mobile, and Internet of Things (IoT) development.
In its current state, MongoDB is a database that can handle loads ranging from start up MVPs and POCs to enterprise applications with hundreds of servers.

MongoDB for SQL developers

MongoDB was developed in the Web 2.0 era. By then, most developers had been using SQL or o...

Table of contents