eBook - ePub

Advanced Elasticsearch 7.0

Name: Advanced Elasticsearch 7.0
Author: Wai Tak Wong

A practical guide to designing, indexing, and querying advanced distributed search engines

Wai Tak Wong

Share book

560 pages
English
ePUB (mobile friendly)
Available on iOS & Android

eBook - ePub

Advanced Elasticsearch 7.0

A practical guide to designing, indexing, and querying advanced distributed search engines

Wai Tak Wong

Book details

Book preview

Table of contents

Citations

About This Book

Master the intricacies of Elasticsearch 7.0 and use it to create flexible and scalable search solutions

Key Features

Master the latest distributed search and analytics capabilities of Elasticsearch 7.0
Perform searching, indexing, and aggregation of your data at scale
Discover tips and techniques for speeding up your search query performance

Book Description

Building enterprise-grade distributed applications and executing systematic search operations call for a strong understanding of Elasticsearch and expertise in using its core APIs and latest features. This book will help you master the advanced functionalities of Elasticsearch and understand how you can develop a sophisticated, real-time search engine confidently. In addition to this, you'll also learn to run machine learning jobs in Elasticsearch to speed up routine tasks.

You'll get started by learning to use Elasticsearch features on Hadoop and Spark and make search results faster, thereby improving the speed of query results and enhancing the customer experience. You'll then get up to speed with performing analytics by building a metrics pipeline, defining queries, and using Kibana for intuitive visualizations that help provide decision-makers with better insights. The book will later guide you through using Logstash with examples to collect, parse, and enrich logs before indexing them in Elasticsearch.

By the end of this book, you will have comprehensive knowledge of advanced topics such as Apache Spark support, machine learning using Elasticsearch and scikit-learn, and real-time analytics, along with the expertise you need to increase business productivity, perform analytics, and get the very best out of Elasticsearch.

What you will learn

Pre-process documents before indexing in ingest pipelines
Learn how to model your data in the real world
Get to grips with using Elasticsearch for exploratory data analysis
Understand how to build analytics and RESTful services
Use Kibana, Logstash, and Beats for dashboard applications
Get up to speed with Spark and Elasticsearch for real-time analytics
Explore the basics of Spring Data Elasticsearch, and understand how to index, search, and query in a Spring application

Who this book is for

This book is for Elasticsearch developers and data engineers who want to take their basic knowledge of Elasticsearch to the next level and use it to build enterprise-grade distributed search applications. Prior experience of working with Elasticsearch will be useful to get the most out of this book.

Frequently asked questions

How do I cancel my subscription?

Simply head over to the account section in settings and click on “Cancel Subscription” - it’s as simple as that. After you cancel, your membership will stay active for the remainder of the time you’ve paid for. Learn more here.

Can/how do I download books?

At the moment all of our mobile-responsive ePub books are available to download via the app. Most of our PDFs are also available to download and we're working on making the final remaining ones downloadable now. Learn more here.

What is the difference between the pricing plans?

Both plans give you full access to the library and all of Perlego’s features. The only differences are the price and subscription period: With the annual plan you’ll save around 30% compared to 12 months on the monthly plan.

What is Perlego?

We are an online textbook subscription service, where you can get access to an entire online library for less than the price of a single book per month. With over 1 million books across 1000+ topics, we’ve got you covered! Learn more here.

Do you support text-to-speech?

Look out for the read-aloud symbol on your next book to see if you can listen to it. The read-aloud tool reads text aloud for you, highlighting the text as it is being read. You can pause it, speed it up and slow it down. Learn more here.

Is Advanced Elasticsearch 7.0 an online PDF/ePUB?

Yes, you can access Advanced Elasticsearch 7.0 by Wai Tak Wong in PDF and/or ePUB format, as well as other popular books in Computer Science & Web Development. We have over one million books available in our catalogue for you to explore.

Information

Publisher

Packt Publishing

Year

2019

ISBN

9781789956566

Edition

Topic

Computer Science

Subtopic

Web Development

Index

Computer Science

Section 1: Fundamentals and Core APIs

In this section, you will get an overview of Elasticsearch 7 by looking into various concepts and examining Elasticsearch services and core APIs. You will also look at the new distributed, scalable, real-time search and analytics engine.

This section is comprised the following chapters:

Chapter 1, Overview of Elasticsearch 7
Chapter 2, Index APIs
Chapter 3, Document APIs
Chapter 4, Mapping APIs
Chapter 5, Anatomy of an Analyzer
Chapter 6, Search APIs

Overview of Elasticsearch 7

Welcome to Advanced Elasticsearch 7.0. Elasticsearch quickly evolved from version 1.0.0, released in February 2014, to version 6.0.0 GA, released in November 2017. Nonetheless, we will use 7.0.0 release as the base of this book. Without making any assumptions about your knowledge of Elasticsearch, this opening chapter provides setup instructions with the Elasticsearch development environment. To help beginners complete some basic features within a few minutes, several steps are given to launch the new version of an Elasticsearch server. An architectural overview and some core concepts will help you to understand the workflow within Elasticsearch. It will help you straighten your learning path.

Keep in mind that you can learn the potential benefits by reading the API conventions section and becoming familiar with it. The section New features following this one is a list of new features you can explore in the new release. Because major changes are often introduced between major versions, you must check to see whether it breaks the compatibility and affects your application. Go through the Migration between versions section to find out how to minimize the impact on your upgrade project.

In this chapter, you'll learn about the following topics:

Preparing your environment
Running Elasticsearch
Talking to Elasticsearch
Elasticsearch architectural overview
Key concepts
API conventions
New features
Breaking changes
Migration between versions

Preparing your environment

The first step of the novice is to set up the Elasticsearch server, while an experienced user may just need to upgrade the server to the new version. If you are going to upgrade your server software, read through the Breaking changes section and the Migration between versions section to discover the changes that require your attention.

Elasticsearch is developed in Java. As of writing this book, it is recommended that you use a specific Oracle JDK, version 1.8.0_131. By default, Elasticsearch will use the Java version defined by the JAVA_HOME environment variable. Before installing Elasticsearch, please check the installed Java version.

Elasticsearch is supported on many popular operating systems such as RHEL, Ubuntu, Windows, and Solaris. For information on supported operating systems and product compatibility, see the Elastic Support Matrix at https://www.elastic.co/support/matrix. The installation instructions for all the supported platforms can be found in the Installing Elasticsearch documentation (https://www.elastic.co/guide/en/elasticsearch/reference/7.0/install-elasticsearch.html). Although there are many ways to properly install Elasticsearch on different operating systems, it'll be simple and easy to run Elasticsearch from the command line for novices. Please follow the instructions on the official download site (https://www.elastic.co/downloads/past-releases/elasticsearch-7-0-0). In this book, we'll use the Ubuntu 16.04 operating system to host Elasticsearch Service. For example, use the following command line to check the Java version on Ubuntu 16.04:

java -version
java version "1.8.0_181"
java(TM) SE Runtime Environment(build 1.8.0_181-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.181-b13, mixed mode)

The following is a step-by-step guide for installing the preview version from the official download site:

Select the correct package for your operating system (WINDOWS, MACOS, LINUX, DEB, RPM, or MSI (BETA)) and download the 7.0.0 release. For Linux, the filename is elasticsearch-7.0.0-linux-x86_64.tar.gz.
Extract the GNU zipped file into the target directory, which will generate a folder called elasticsearch-7.0.0 using the following command:

tar -zxvf elasticsearch-7.0.0-linux-86_64.tar.gz

Go to the folder and run Elasticsearch with the -p parameter to create a pid file at the specified path:

cd elasticsearch-7.0.0
./bin/elasticsearch -p pid

Elasticsearch runs in the foreground when it runs with the command line above. If you want to shut it down, you can stop it by pressing Ctrl + C, or you can use the process ID from the pid file in the working directory to terminate the process:

kill -15 `cat pid`

Check the log file to make sure the process is closed. You will see the text Native controller process has stopped, stopped, closing, closed near the end of file:

tail logs/elasticsearch.log

To run Elasticsearch as a daemon in background mode, specify -d on the command line:

./bin/elasticsearch -d -p pid

In the next section, we will show you how to run an Elasticsearch instance.

Running Elasticsearch

Elasticsearch does not start automatically after installation. On Windows, to start it automatically at boot time, you can install Elasticsearch as a service. On Ubuntu, it's best to use the Debian package, which installs everything you need to configure Elasticsearch as a service. If you're interested, please refer to the official website (https://www.elastic.co/guide/en/elasticsearch/reference/master/deb.html).

Basic Elasticsearch configuration

Elasticsearch 7.0 has several configuration files located in the config directory, shown as follows. Basically, it provides good defaults, and it requires very little configuration from developers:

ls config

The output will be similar to the following:

elasticsearch.keystore elasticsearch.yml jvm.options log4j2.properties role_mapping.yml roles.yml users users_roles

Let's take a quick look at the elasticsearch.yml, jvm.options, and log4j2.properties files:

elasticsearch.yml: The main configuration file. This configuration file contains settings for the clusters, nodes, and paths. If you specify an item, comment out the line. We'll explain the terminology in the Elasticsearch architectur...