eBook - ePub

Mastering MongoDB 4.x

Name: Mastering MongoDB 4.x
ISBN: 9781789611380

Expert techniques to run high-volume and fault-tolerant database solutions using MongoDB 4.x, 2nd Edition

Alex Giamas,

394 pages
English
ePUB (mobile friendly)
Available on iOS & Android

About this book

Leverage the power of MongoDB 4.x to build and administer fault-tolerant database applications

Key Features

Master the new features and capabilities of MongoDB 4.x
Implement advanced data modeling, querying, and administration techniques in MongoDB
Includes rich case-studies and best practices followed by expert MongoDB developers

Book Description

MongoDB is the best platform for working with non-relational data and is considered to be the smartest tool for organizing data in line with business needs. The recently released MongoDB 4.x supports ACID transactions and makes the technology an asset for enterprises across the IT and fintech sectors.

This book provides expertise in advanced and niche areas of managing databases (such as modeling and querying databases) along with various administration techniques in MongoDB, thereby helping you become a successful MongoDB expert. The book helps you understand how the newly added capabilities function with the help of some interesting examples and large datasets. You will dive deeper into niche areas such as high-performance configurations, optimizing SQL statements, configuring large-scale sharded clusters, and many more. You will also master best practices in overcoming database failover, and master recovery and backup procedures for database security.

By the end of the book, you will have gained a practical understanding of administering database applications both on premises and on the cloud; you will also be able to scale database applications across all servers.

What you will learn

Perform advanced querying techniques such as indexing and expressions
Configure, monitor, and maintain a highly scalable MongoDB environment
Master replication and data sharding to optimize read/write performance
Administer MongoDB-based applications on premises or on the cloud
Integrate MongoDB with big data sources to process huge amounts of data
Deploy MongoDB on Kubernetes containers
Use MongoDB in IoT, mobile, and serverless environments

Who this book is for

This book is ideal for MongoDB developers and database administrators who wish to become successful MongoDB experts and build scalable and fault-tolerant applications using MongoDB. It will also be useful for database professionals who wish to become certified MongoDB professionals. Some understanding of MongoDB and basic database concepts is required to get the most out of this book.

Information

Publisher

Packt Publishing

Year

2019

eBook ISBN

9781789611380

Edition

2nd Edition

Topic

Computer Science

Subtopic

Data Modelling & Design

Section 1: Basic MongoDB – Design Goals and Architecture

In this section, we will go through the history of databases and how we arrived at the need for non-relational databases. We will also learn how to model our data so that storage and retrieval from MongoDB can be as efficient as possible. Even though MongoDB is schemaless, designing how data will be organized into documents can have a great effect in terms of performance.

This section consists of the following chapters:

Chapter 1, MongoDB – A Database for Modern Web
Chapter 2, Schema Design and Data Modeling

MongoDB – A Database for Modern Web

In this chapter, we will lay the foundations for understanding MongoDB, and how it claims to be a database that's designed for the modern web. Learning in the first place is as important as knowing how to learn. We will go through the references that have the most up-to-date information about MongoDB, for both new and experienced users. We will cover the following topics:

SQL and MongoDB's history and evolution
MongoDB from the perspective of SQL and other NoSQL technology users
MongoDB's common use cases and why they matter
MongoDB's configuration and best practices

Technical requirements

You will need MongoDB version 4+, Apache Kafka, Apache Spark, and Apache Hadoop installed to follow this chapter smoothly. The code used in all of the chapters can be found at: https://github.com/PacktPublishing/Mastering-MongoDB-4.x-Second-Edition.

The evolution of SQL and NoSQL

Structured Query Language (SQL) existed even before the WWW. Dr. E. F. Codd originally published the paper, A Relational Model of Data for Large Shared Data Banks, in June 1970, in the Association for Computing Machinery (ACM) journal, Communications of the ACM. SQL was initially developed at IBM by Chamberlin and Boyce, in 1974. Relational Software (now Oracle Corporation) was the first to develop a commercially available implementation of SQL, targeted at United States governmental agencies.

The first American National Standards Institute (ANSI) SQL standard came out in 1986. Since then, there have been eight revisions, with the most recent being published in 2016 (SQL:2016).

SQL was not particularly popular at the start of the WWW. Static content could just be hardcoded into the HTML page without much fuss. However, as the functionality of websites grew, webmasters wanted to generate web page content driven by offline data sources, in order to generate content that could change over time without redeploying code.

Common Gateway Interface (CGI) scripts, written in Perl or as Unix shell scripts, drove the early database-driven websites of Web 1.0. With Web 2.0, the web evolved from directly injecting SQL results into the browser to using two-tier and three-tier architectures that separated views from the business and model logic, allowing SQL queries to be modular and isolated from the rest of the web application.

On the other hand, Not only SQL (NoSQL) is much more recent and followed the evolution of the web, rising at the same time as Web 2.0 technologies. The term was first coined by Carlo Strozzi in 1998, for his open source database that did not follow the SQL standard, but was still relational.

This is not what we currently expect from a NoSQL database. Johan Oskarsson, a developer at Last.fm at the time, reintroduced the term in early 2009 to group a set of distributed, non-relational data stores that were being developed. Many of them were based on Google's Bigtable and MapReduce papers, or on Amazon's Dynamo, a highly available key-value storage system.

NoSQL's foundations grew upon relaxed atomicity, consistency, isolation, and durability (ACID) properties, which were traded for performance, scalability, flexibility, and reduced complexity. Most NoSQL databases have gone one way or another in providing as many of these qualities as possible, even offering adjustable guarantees to the developer. The following diagram describes the evolution of SQL and NoSQL:

The evolution of MongoDB

10gen started to develop a cloud computing stack in 2007 and soon realized that the most important innovation was centered around the document-oriented database that they built to power it, which was MongoDB. MongoDB was initially released on August 27, 2009.

Version 1 of MongoDB was pretty basic in terms of features, authorization, and ACID guarantees but it made up for these shortcomings with performance and flexibility.

In the following sections, we will highlight the major features of MongoDB, along with the version numbers with which they were introduced.

Major feature set for versions 1.0 and 1.2

The different features of versions 1.0 and 1.2 are as follows:

Document-based model
Global lock (process level)
Indexes on collections
CRUD operations on documents
No authentication (authentication was handled at the server level)
Master and slave replication
MapReduce (introduced in v1.2)
Stored JavaScript functions (introduced in v1.2)

Version 2

The different features of version 2.0 are as follows:

Background index creation (since v1.4)
Sharding (since v1.6)
More query operators (since v1.6)
Journaling (since v1.8)
Sparse and covered indexes (since v1.8)
Compact command to reduce disk usage
More efficient memory usage
Concurrency improvements
Index performance enhancements
Replica sets are now more configurable and data center aware
MapReduce improvements
Authentication (since 2.0, for sharding and most database commands)
Geospatial features introduced
Aggregation framework (since v2.2) and enhancements (since v2.6)
TTL collections (since v2.2)
Concurrency improvements, among which is DB-level locking (since v2.2)
Text searching (since v2.4) and integration (since v2.6)
Hashed indexes (since v2.4)
Security enhancements and role-based access (since v2.4)
V8 JavaScript engine instead of SpiderMonkey (since v2.4)
Query engine improvements (since v2.6)

Version 3

The different features of version 3.0 are as follows:

Pluggable storage engine API (since v3.0)
WiredTiger storage engine introduced (since v3.0), with document-level locking, while the previous storage engine (now called MMAPv1) supports collection-level locking

Replication and sharding enhancements (since v3.2)
Document validation (since v3.2)
Aggregation framework enhanced operations (since v3.2)
Multiple storage engines (since v3.2, only in Enterprise Edition)
Collation for the query language and indexes (since v3.4)
Read-only database views (since v3.4)
Linearizable read concern (since v3.4)

Version 4

The different features of version 4.0 are as follows:

Multi-document ACID transactions
Change streams
MongoDB tools (Stitch, Mobile, Sync, and Kubernetes Operator)

The following diagram shows MongoDB's evolution:

As we can observe, version 1 was pretty basic, whereas version 2 introduced most of the features present in the current version, such as sharding, usable and special indexes, geospatial features, and memory and concurrency improvements.

On the way from version 2 to version 3, the aggregation framework was introduced, mainly as a supplement to the ageing MapReduce framework, which was never on a par with dedicated frameworks such as Hadoop. Then, text search was added and, slowly but surely, MongoDB improved its performance, stability, and security to adapt to the increasing enterprise load of customers using it.

With WiredTiger's introduction in version 3, locking became much less of an issue for MongoDB, as it was brought down from the process (global lock) to the document level, almost the most granular level possible.
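
The effect of lock granularity can be illustrated with a small, self-contained Python sketch. This is a simplified model and not MongoDB's actual locking code: the `Write` tuple, `lock_key`, and `conflicts` are hypothetical names invented for this illustration, and real storage engines use more elaborate schemes (intent locks, optimistic concurrency control). It only shows why two writes to different documents in the same collection block each other at coarser granularities but not at the document level.

```python
from typing import NamedTuple


class Write(NamedTuple):
    """A write operation, identified by the document it targets (hypothetical model)."""
    database: str
    collection: str
    doc_id: int


# Lock scopes mentioned in the text, from coarsest to finest.
GRANULARITIES = ("global", "database", "collection", "document")


def lock_key(op: Write, granularity: str) -> tuple:
    """Return the resource a write would need to lock at the given granularity."""
    if granularity == "global":
        return ()  # one lock for the whole process
    if granularity == "database":
        return (op.database,)
    if granularity == "collection":
        return (op.database, op.collection)
    if granularity == "document":
        return (op.database, op.collection, op.doc_id)
    raise ValueError(f"unknown granularity: {granularity}")


def conflicts(a: Write, b: Write, granularity: str) -> bool:
    """Two writes block each other when they need the same lock."""
    return lock_key(a, granularity) == lock_key(b, granularity)


# Two writes to different documents in the same collection.
first = Write("shop", "orders", 1)
second = Write("shop", "orders", 2)

for granularity in GRANULARITIES:
    print(granularity, conflicts(first, second, granularity))
```

In this toy model, the two writes contend for the same lock at the global, database, and collection levels, and only stop contending once the lock is scoped to a single document.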

Version 4 marked a major transition, bridging the SQL and NoSQL world with the introduction of multi-document ACID transactions. This allowed for a wider range of applications to use MongoDB, especially applications that require a strong real-time consistency guarantee. Further, the introduction of change streams allowed for a faster time to market for real-time applications using MongoDB. A series of tools have also been introduced, to facilitate serverless, mobile, and Internet of Things (IoT) development.
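
To make the all-or-nothing property of a multi-document transaction concrete, here is a minimal in-memory sketch in Python. It does not use MongoDB or any driver API; the `transfer` function, the `InsufficientFunds` exception, and the account documents are all hypothetical and exist only to show the semantics: either every document involved is updated, or none is.

```python
import copy


class InsufficientFunds(Exception):
    """Raised to abort the toy transaction (hypothetical error type)."""


def transfer(accounts: dict, src: str, dst: str, amount: int) -> dict:
    """Move `amount` between two account documents, all-or-nothing.

    Changes are staged on a private copy. The caller only sees them if
    every step succeeds, which mimics commit versus abort.
    """
    staged = copy.deepcopy(accounts)
    staged[src]["balance"] -= amount
    if staged[src]["balance"] < 0:
        # Abort: the staged copy is discarded, the original is untouched.
        raise InsufficientFunds(src)
    staged[dst]["balance"] += amount
    return staged  # Commit: the caller swaps in the new state.


accounts = {"alice": {"balance": 100}, "bob": {"balance": 20}}

# A transfer that succeeds updates both documents together.
accounts = transfer(accounts, "alice", "bob", 30)

# A transfer that fails leaves both documents exactly as they were.
try:
    accounts = transfer(accounts, "bob", "alice", 500)
except InsufficientFunds:
    pass

print(accounts)
```

Without such a guarantee, a failure between the debit and the credit could leave one document updated and the other not, which is the kind of inconsistency that multi-document transactions are meant to rule out.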

In its current state, MongoDB is a database that can handle loads ranging from startup minimum viable products (MVPs) and proofs of concept (POCs) to enterprise applications with hundreds of servers.

MongoDB for SQL developers

MongoDB was developed in the Web 2.0 era. By then, most developers had been using SQL or o...

Title Page
Copyright and Credits
About Packt
Contributors
Preface
Section 1: Basic MongoDB – Design Goals and Architecture
MongoDB – A Database for Modern Web
Schema Design and Data Modeling
Section 2: Querying Effectively
MongoDB CRUD Operations
Advanced Querying
Multi-Document ACID Transactions
Aggregation
Indexing
Section 3: Administration and Data Management
Monitoring, Backup, and Security
Storage Engines
MongoDB Tooling
Harnessing Big Data with MongoDB
Section 4: Scaling and High Availability
Replication
Sharding
Fault Tolerance and High Availability
Other Books You May Enjoy