eBook - ePub

Learning Elastic Stack 7.0

Name: Learning Elastic Stack 7.0
Author: Pranav Shukla, Sharath Kumar M N

Distributed search, analytics, and visualization using Elasticsearch, Logstash, Beats, and Kibana, 2nd Edition

Pranav Shukla, Sharath Kumar M N

474 pages
English

About This Book

A beginner's guide to storing, managing, and analyzing data with the updated features of Elastic 7.0

Key Features

Gain access to new features and updates introduced in Elastic Stack 7.0
Grasp the fundamentals of Elastic Stack including Elasticsearch, Logstash, and Kibana
Explore useful tips for using Elastic Cloud and deploying Elastic Stack in production environments

Book Description

The Elastic Stack is a powerful combination of tools for techniques such as distributed search, analytics, logging, and visualization of data. Elastic Stack 7.0 encompasses new features and capabilities that will enable you to find unique insights into analytics using these techniques. This book will give you a fundamental understanding of what the stack is all about, and help you use it efficiently to build powerful real-time data processing applications.

The first few sections of the book will help you understand how to set up the stack by installing tools, and exploring their basic configurations. You'll then get up to speed with using Elasticsearch for distributed searching and analytics, Logstash for logging, and Kibana for data visualization. As you work through the book, you will discover the technique of creating custom plugins using Kibana and Beats. This is followed by coverage of the Elastic X-Pack, a useful extension for effective security and monitoring. You'll also find helpful tips on how to use Elastic Cloud and deploy Elastic Stack in production environments.

By the end of this book, you'll be well versed with the fundamental Elastic Stack functionalities and the role of each component in the stack to solve different data processing problems.

What you will learn

Install and configure an Elasticsearch architecture
Solve the full-text search problem with Elasticsearch
Discover powerful analytics capabilities through aggregations using Elasticsearch
Build a data pipeline to transfer data from a variety of sources into Elasticsearch for analysis
Create interactive dashboards for effective storytelling with your data using Kibana
Learn how to secure and monitor Elastic Stack, and how to use its alerting and reporting capabilities
Take applications to an on-premise or cloud-based production environment with Elastic Stack

Who this book is for

This book is for entry-level data professionals, software engineers, e-commerce developers, and full-stack developers who want to learn about Elastic Stack and how its real-time processing and search engine work in business analytics and enterprise search applications. Previous experience with Elastic Stack is not required; however, knowledge of data warehousing and database concepts will be helpful.

Information

Publisher: Packt Publishing
Year: 2019
ISBN: 9781789958539
Edition: 2nd
Topic: Computer Science
Subtopic: Data Processing
Index: Computer Science

Section 1: Introduction to Elastic Stack and Elasticsearch

This section covers the basics of Elasticsearch and Elastic Stack. It highlights the importance of the distributed and scalable search and analytics that Elastic Stack offers. It introduces concepts such as indexes, types, nodes, and clusters, provides insights into the REST API, which can be used to perform essential operations, and covers datatypes and mappings.

This section includes the following chapters:

Chapter 1, Introducing Elastic Stack
Chapter 2, Getting Started with Elasticsearch

Introducing Elastic Stack

The emergence of the web, mobile devices, social networks, blogs, and photo sharing has created a massive amount of data in recent years. These new data sources create information that cannot be handled using traditional data storage technology, typically relational databases. As an application developer or business intelligence developer, your job is to fulfill the search and analytics needs of the application.

A number of data stores, capable of big data scale, have emerged in the last few years. These include Hadoop ecosystem projects, several NoSQL databases, and search and analytics engines such as Elasticsearch.

The Elastic Stack is a rich ecosystem of components serving as a full search and analytics stack. The main components of the Elastic Stack are Kibana, Logstash, Beats, X-Pack, and Elasticsearch.

Elasticsearch is at the heart of the Elastic Stack, providing storage, search, and analytical capabilities. Kibana, also referred to as a window into the Elastic Stack, is a user interface for the Elastic Stack with great visualization capabilities. Logstash and Beats help get the data into the Elastic Stack. X-Pack provides powerful features including monitoring, alerting, security, graph, and machine learning to make your system production-ready. Since Elasticsearch is at the heart of the Elastic Stack, we will cover the stack inside-out, starting from the heart and moving on to the surrounding components.

In this chapter, we will cover the following topics:

What is Elasticsearch, and why use it?
A brief history of Elasticsearch and Apache Lucene
Elastic Stack components
Use cases of Elastic Stack

We will look at what Elasticsearch is and why you should consider it as your data store. Once you know the key strengths of Elasticsearch, we will look at the history of Elasticsearch and its underlying technology, Apache Lucene. We will then provide an overview of the Elastic Stack's components and look at some of its use cases.

What is Elasticsearch, and why use it?

Since you are reading this book, you probably already know what Elasticsearch is. For the sake of completeness, let's define Elasticsearch:

Elasticsearch is a real-time, distributed search and analytics engine that is horizontally scalable and capable of solving a wide variety of use cases. At the heart of the Elastic Stack, it centrally stores your data so you can discover the expected and uncover the unexpected.

Elasticsearch is at the core of the Elastic Stack, playing the central role of a search and analytics engine. It is built on Apache Lucene, a radically different technology that sets Elasticsearch apart from traditional relational databases and other NoSQL solutions. Let's look at the key benefits of using Elasticsearch as your data store:

Schemaless, document-oriented
Searching
Analytics
Rich client library support and the REST API
Easy to operate and easy to scale
Near real-time
Lightning-fast
Fault-tolerant

Let's look at each benefit one by one.

Schemaless and document-oriented

Elasticsearch does not impose a strict structure on your data; you can store any JSON documents. JSON documents are first-class citizens in Elasticsearch as opposed to rows and columns in a relational database. A document is roughly equivalent to a record in a relational database table. Traditional relational databases require a schema to be defined beforehand to specify a fixed set of columns and their data types and sizes. Often the nature of data is very dynamic, requiring support for new or dynamic columns. JSON documents naturally support this type of data. For example, take a look at the following document:

{ "name": "John Smith", "address": "121 John Street, NY, 10010", "age": 40 }

This document may represent a customer's record. Here the record has the name, address, and age fields of the customer. Another record may look like the following:

{ "name": "John Doe", "age": 38, "email": "[email protected]" }

Note that the second customer doesn't have the address field but, instead, has an email address. In fact, other customer documents may have completely different sets of fields. This provides a tremendous amount of flexibility in terms of what can be stored.
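To make the point about differing field sets concrete, here is a minimal Python sketch that compares the two customer documents above. It does not talk to Elasticsearch; it only shows which fields are shared and which are optional. The helper name field_sets is hypothetical, and the email value is a placeholder, since the address in the preview text is obscured.

```python
import json

# The two customer documents from the text. The email value is a placeholder
# (assumption): the original address is obscured in the source.
customer_a = {"name": "John Smith", "address": "121 John Street, NY, 10010", "age": 40}
customer_b = {"name": "John Doe", "age": 38, "email": "john.doe@example.com"}


def field_sets(docs):
    """Return (fields present in every doc, fields present in only some docs)."""
    key_sets = [set(doc) for doc in docs]
    common = set.intersection(*key_sets)
    optional = set.union(*key_sets) - common
    return common, optional


common, optional = field_sets([customer_a, customer_b])
print(sorted(common))    # fields every document has
print(sorted(optional))  # fields only some documents have

# Both documents serialize to valid JSON even though their fields differ.
for doc in (customer_a, customer_b):
    print(json.dumps(doc))
```

A relational table would need nullable address and email columns declared up front to hold both records; here each document simply carries the fields it has.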

Searching capability

The core strength of Elasticsearch lies in its text-processing capabilities. Elasticsearch is great at searching, especially full-text searches. Let's understand what a full-text search is:

Full-text search means searching through all the terms of all the documents available in the database. This requires the entire contents of all documents to be parsed and stored beforehand. When you hear full-text search, think of Google Search. You can enter any search term and Google looks through all of the web pages on the internet to find the best-matching web pages. This is quite different from simple SQL queries run against columns of type string in relational databases. Normal SQL queries with a WHERE clause and an equals (=) or LIKE operator attempt an exact or wildcard match against the underlying data. SQL queries can, at best, just match the search term to a sub-string within the text column.

When you want to perform a search similar to a Google search on your own data, Elasticsearch is your best bet. You can index emails, text documents, PDF files, web pages, or practically any unstructured text documents and search across all your documents with search terms.

At a high level, Elasticsearch breaks up text data into terms and makes every term searchable by building Lucene indexes. You can build your own fast and flexible Google-like search for your application.
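The idea of breaking text into terms and making every term searchable can be sketched as a toy inverted index in Python. This is only an illustration of the general technique under simplifying assumptions: the tokenizer is a crude lowercase split, there is no stemming or relevance scoring, and the function names are hypothetical. Lucene's actual indexes are far more sophisticated.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms (a crude analyzer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(docs):
    """Map each term to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return IDs of documents that contain every term in the query (AND semantics)."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Illustrative documents (assumption: made up for this sketch).
docs = {
    1: "Elasticsearch is a distributed search engine",
    2: "Kibana visualizes data stored in Elasticsearch",
    3: "Logstash and Beats ship data into the stack",
}
index = build_inverted_index(docs)
print(search(index, "elasticsearch data"))  # IDs of documents containing both terms
```

Because the lookup goes from term to documents, a query never scans document text at search time, which is the main difference from a LIKE-style substring match.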

In addition to supporting text data, Elasticsearch also supports other data types such as numbers, dates, geolocations, IP addresses, and many more. We will take an in-depth look at searching in Chapter 3, Searching - What is Relevant.

Analytics

Apart from searching, the second most important functional strength of Elasticsearch is analytics. Yes, what was originally known as just a full-text search engine is now used as an analytics engine in a variety of use cases. Many organizations are running analytics solutions powered by Elasticsearch in production.

Conducting a search is like zooming in and finding a needle in a haystack, that is, locating precisely what is needed within huge amounts of data. Analytics is exactly the opposite of a search; it is about zooming out and taking a look at the bigger picture. For example, you may want to know how many visitors on your website are from the United States as opposed to every other country, or you may want to know how many of your website's visitors use macOS, Windows, or Linux.

Elasticsearch supports a wide variety of aggregations for analytics. Elasticsearch aggregations are quite powerful and can be applied to various data types. We will take a look at the analytics capabilities of Elasticsearch in Chapter 4, Analytics with Elasticsearch.
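The "zooming out" questions above (visitors by country, visitors by operating system) amount to counting documents per distinct field value. The following Python sketch shows that bucketing on a handful of in-memory records. The records, field names, and the helper terms_count are assumptions made for illustration; this is not the Elasticsearch aggregation API.

```python
from collections import Counter

# Hypothetical page-view records; the field names are illustrative only.
visits = [
    {"country": "US", "os": "macOS"},
    {"country": "US", "os": "Windows"},
    {"country": "IN", "os": "Linux"},
    {"country": "DE", "os": "Windows"},
    {"country": "US", "os": "Linux"},
]


def terms_count(records, field):
    """Count records per distinct value of `field`, most frequent first."""
    return Counter(record[field] for record in records).most_common()


# How many visitors use each operating system?
by_os = terms_count(visits, "os")

# How many visitors are from the United States versus everywhere else?
us_vs_rest = Counter("US" if record["country"] == "US" else "other" for record in visits)

print(by_os)
print(dict(us_vs_rest))
```

Elasticsearch's terms aggregation performs comparable bucketing on the server side, across all the nodes that hold the data, rather than in application memory.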

Rich client library support and the REST API

Elasticsearch has very rich client library support to make it accessible to many programming languages. There are client libraries available for Java, C#, Python, JavaScript, PHP, Perl, Ruby, and many more. Apart from the official client libraries, there are community-driven libraries for more than 20 programming languages.

Additionally, Elasticsearch has a very rich REST (Representational State Transfer) API, which works over HTTP. The REST API is very well documented and quite comprehensive, making all operations available over HTTP.

All this means that Elasticsearch is very easy to integrate into any application to fulfill your search and analytics needs.
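As a rough illustration of what "all operations available over HTTP" looks like from application code, the sketch below builds a search request with Python's standard library and prints it without sending it. The index name customers and the field name are hypothetical, and the URL assumes a node listening on the default local address http://localhost:9200.

```python
import json
from urllib.request import Request


def build_search_request(base_url, index, field, text):
    """Build (but do not send) an HTTP request for a full-text match query."""
    body = {"query": {"match": {field: text}}}
    return Request(
        url=f"{base_url}/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Assumptions: a local node on the default port and a hypothetical "customers" index.
req = build_search_request("http://localhost:9200", "customers", "name", "john")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))

# To send it against a running node: urllib.request.urlopen(req)
```

Because the interface is plain JSON over HTTP, the same request could be issued from any language or from a command-line HTTP client; the official client libraries wrap calls like this one.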

Easy to operate and easy to scale

Elasticsearch can run on a single node and easily scale out to hundreds of nodes. Starting a single-node instance is very easy: it works out of the box without any configuration changes.

Horizontal scalability is the ability to scale a system horizontally by starting up multiple instances of the same type rather than making one instance more and more powerful. Vertical scaling is about upgrading a single instance by adding more processing power (by increasing the number of CPUs or CPU cores), memory, or storage capacity. There is a practical limit to how much a system can be scaled vertically due to cos...

Title Page
Copyright and Credits
About Packt
Contributors
Preface
Section 1: Introduction to Elastic Stack and Elasticsearch
Introducing Elastic Stack
Getting Started with Elasticsearch
Section 2: Analytics and Visualizing Data
Searching - What is Relevant
Analytics with Elasticsearch
Analyzing Log Data
Building Data Pipelines with Logstash
Visualizing Data with Kibana
Section 3: Elastic Stack Extensions
Elastic X-Pack
Section 4: Production and Server Infrastructure
Running Elastic Stack in Production
Building a Sensor Data Analytics Application
Monitoring Server Infrastructure
Other Books You May Enjoy