Mastering PostgreSQL 10

Name: Mastering PostgreSQL 10
Author: Hans-Jurgen Schonig

428 pages
English

About This Book

Master the capabilities of PostgreSQL 10 to efficiently manage and maintain your database

• Your one-stop guide to mastering advanced concepts in PostgreSQL 10 with ease
• Master query optimization, replication, and high availability with PostgreSQL
• Extend the functionalities of your PostgreSQL instance to suit your organizational needs with minimal effort

Who This Book Is For

If you are a PostgreSQL data architect or an administrator and want to understand how to implement advanced functionalities and master complex administrative tasks with PostgreSQL 10, then this book is perfect for you. Prior experience of administrating a PostgreSQL database and a working knowledge of SQL are required to make the best use of this book.

What You Will Learn

• Get to grips with the advanced features of PostgreSQL 10 and handle advanced SQL
• Make use of the indexing features in PostgreSQL and fine-tune the performance of your queries
• Work with stored procedures and manage backup and recovery
• Master replication and failover techniques
• Troubleshoot your PostgreSQL instance for solutions to common and not-so-common problems
• Learn how to migrate your database from MySQL and Oracle to PostgreSQL without any hassle

In Detail

PostgreSQL is an open source database used for handling large datasets (big data) and as a JSON document database. This book highlights the newly introduced features in PostgreSQL 10, and shows you how you can build better PostgreSQL applications, and administer your PostgreSQL database more efficiently. We begin by explaining advanced database design concepts in PostgreSQL 10, along with indexing and query optimization. You will also see how to work with event triggers and perform concurrent transactions and table partitioning, along with exploring SQL and server tuning. We will walk you through implementing advanced administrative tasks such as server maintenance and monitoring, replication, recovery, high availability, and much more. You will understand common and not-so-common troubleshooting problems and how you can overcome them. By the end of this book, you will have an expert-level command of advanced database functionalities and will be able to implement advanced administrative tasks with PostgreSQL 10.

Style and approach

This mastering-level guide delves into the advanced functionalities of PostgreSQL 10.


Information

Publisher: Packt Publishing
Year: 2018
ISBN: 9781788470537
Topic: Computer Science
Subtopic: Databases

Making Use of Indexes

In Chapter 2, Understanding Transactions and Locking, you learned about concurrency and locking. In this chapter, it is time to attack indexing head on. The importance of this topic cannot be stressed enough: indexing is (and will most likely remain) one of the most important topics in the life of every database engineer.

After 18 years of professional, full-time PostgreSQL consulting and PostgreSQL 24x7 support (www.cybertec-postgresql.com), I can say one thing for sure—bad indexing is the main source of bad performance. Of course, it is important to adjust memory parameters and all that. However, it is all in vain if indexes are not used properly. There is simply no replacement for a missing index.

Therefore, I have dedicated an entire chapter to indexing alone to give you as many insights as possible.

In this chapter, you will learn these topics:

When does PostgreSQL use indexes?
How does an optimizer handle things?
What types of indexes are there and how do they work?
Using your own indexing strategies

At the end of the chapter, you will be able to understand how indexes can be used beneficially in PostgreSQL.

Understanding simple queries and the cost model

In this section, we will get started with indexes. To show how things work, some test data is needed. The following code snippet shows how data can be created easily:

test=# CREATE TABLE t_test (id serial, name text); 
CREATE TABLE 
test=# INSERT INTO t_test (name) SELECT 'hans' 
 FROM generate_series(1, 2000000); 
INSERT 0 2000000 
test=# INSERT INTO t_test (name) SELECT 'paul' 
 FROM generate_series(1, 2000000); 
INSERT 0 2000000

In the first line, a simple table is created. Two columns are used: an autoincrement column that just keeps creating numbers and a column that will be filled with static values.

The generate_series function will generate numbers from 1 to 2 million. So, in this example, 2 million static values for hans and 2 million static values for paul are created.

In all, 4 million rows have been added:

test=# SELECT name, count(*) FROM t_test GROUP BY 1;
 name |  count
------+---------
 hans | 2000000
 paul | 2000000
(2 rows)

These 4 million rows have some nice properties. IDs are ascending and there are only two distinct names.
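Both properties can be checked directly, and they are also what the optimizer's statistics record. The following is a minimal sketch, assuming the t_test table created above. The pg_stats view and its n_distinct and correlation columns are standard PostgreSQL, but the values reported depend on your installation, so no output is shown here.

```sql
-- Check the properties by brute force (this scans the whole table).
SELECT min(id), max(id), count(DISTINCT name) FROM t_test;

-- Refresh the optimizer statistics for the table.
ANALYZE t_test;

-- n_distinct:  estimated number of distinct values (a negative value
--              is a fraction of the row count instead of a count).
-- correlation: how closely the physical row order follows the
--              column's sort order (values near 1 mean ascending).
SELECT attname, n_distinct, correlation
FROM pg_stats
WHERE tablename = 't_test';
```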

Let's run a simple query now:

test=# \timing
Timing is on.
test=# SELECT * FROM t_test WHERE id = 432332;
   id   | name
--------+------
 432332 | hans
(1 row)

Time: 176.949 ms

In this case, the \timing command tells psql to show the runtime of each query. Note that this is not the real execution time on the server, but the time measured by psql. For very short queries, network latency can be a substantial part of the total time, so this has to be taken into account.

Making use of EXPLAIN

In this example, reading 4 million rows has taken more than 100 milliseconds. From a performance point of view, this is a total disaster for a query that returns a single row. To figure out what is going wrong, PostgreSQL offers the EXPLAIN command:

test=# \h EXPLAIN 
Command: EXPLAIN 
Description: show the execution plan of a statement 
Syntax: 
EXPLAIN [ ( option [, ...] ) ] statement 
EXPLAIN [ ANALYZE ] [ VERBOSE ] statement 

where option can be one of: 

 ANALYZE [ boolean ] 
 VERBOSE [ boolean ] 
 COSTS [ boolean ] 
 BUFFERS [ boolean ] 
 TIMING [ boolean ] 
 FORMAT { TEXT | XML | JSON | YAML }

When you have a feeling that a query is not performing well, EXPLAIN will help you to reveal the real performance problem.

Here is how it works:

test=# EXPLAIN SELECT * FROM t_test WHERE id = 432332;
                          QUERY PLAN
---------------------------------------------------------------
 Gather  (cost=1000.00..43463.92 rows=1 width=9)
   Workers Planned: 2
   ->  Parallel Seq Scan on t_test  (cost=0.00..42463.82 rows=1 width=9)
         Filter: (id = 432332)
(4 rows)

What you see in this listing is an execution plan. In PostgreSQL, a SQL statement will be executed in four stages. The following components are at work:

The parser will check for syntax errors and obvious problems
The rewrite system takes care of rules (views and other things)
The optimizer will figure out how to execute a query in the most efficient way and work out a plan
The plan provided by the optimizer will be used by the executor to finally create the result

The purpose of EXPLAIN is to show what the planner (another name for the optimizer) has come up with to run the query efficiently. In my example, PostgreSQL will use a parallel sequential scan. This means that two workers will cooperate and work on the filter condition together. The partial results are then combined by a gather node, which was introduced in PostgreSQL 9.6 (it is part of the parallel query infrastructure). If you look at the plan more closely, you will see how many rows PostgreSQL expects at each stage of the plan (in this example, rows=1, that is, one row will be returned).
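The options listed in the syntax summary can be combined in the parenthesized form. As a hedged sketch, assuming the same t_test table: ANALYZE actually executes the statement and reports times measured on the server, which complements the client-side figure printed by \timing, and BUFFERS adds block-level I/O counters. Output is omitted because it varies from system to system.

```sql
-- Plan only: the statement is not executed.
EXPLAIN SELECT * FROM t_test WHERE id = 432332;

-- Execute the statement and show actual row counts and timings
-- next to the estimates, plus buffer hits and reads.
EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM t_test WHERE id = 432332;

-- Same plan without cost estimates, emitted as JSON.
EXPLAIN (COSTS false, FORMAT JSON) SELECT * FROM t_test WHERE id = 432332;
```

Because ANALYZE really runs the statement, a data-modifying statement should be wrapped in BEGIN and ROLLBACK if its effects are not wanted.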

In PostgreSQL 9.6 and 10.0, the number of parallel workers will be determined by the size of the table. The larger an operation is, the more parallel workers PostgreSQL will fire up. For a very small table, parallelism is not used as it would create too much overhead.
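To see the inputs to this decision on your own system, the relevant settings and the table size can be inspected. This is only a sketch, assuming PostgreSQL 10 (where the size threshold is called min_parallel_table_scan_size; in 9.6 the corresponding setting is min_parallel_relation_size) and the t_test table from above. The value 4 in the ALTER TABLE statement is purely illustrative.

```sql
-- Minimum amount of table data that must be scanned before a
-- parallel scan is considered at all.
SHOW min_parallel_table_scan_size;

-- Upper limit on the workers a single Gather node may use.
SHOW max_parallel_workers_per_gather;

-- Size of the table the planner is looking at.
SELECT pg_size_pretty(pg_relation_size('t_test'));

-- Optional: override the size-based choice for one table
-- (still capped by max_parallel_workers_per_gather).
ALTER TABLE t_test SET (parallel_workers = 4);

-- Return to the default, size-based behavior.
ALTER TABLE t_test RESET (parallel_workers);
```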

Parallelism is not a must. It is always possible to reduce the number of parallel workers to mimic pre-PostgreSQL 9.6 behavior by setting the following variable to 0:

test=# SET max_parallel_workers_per_gather TO 0; 
SET

Note that this change has no side effects, as it applies only to your session. Of course, you can also make the change in the postgresql.conf file, but I would not advise you to do this, as you might lose quite a lot of the performance provided by parallel queries.
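The scope of such a change can be made explicit. The following sketch, which assumes the same session and table as before, shows how to inspect the setting, how to return to the configured default, and how to confine the change to a single transaction with SET LOCAL.

```sql
-- Disable parallel query for the current session only.
SET max_parallel_workers_per_gather TO 0;
SHOW max_parallel_workers_per_gather;

-- Go back to the configured default.
RESET max_parallel_workers_per_gather;

-- Limit the change to a single transaction.
BEGIN;
SET LOCAL max_parallel_workers_per_gather TO 0;
EXPLAIN SELECT * FROM t_test WHERE id = 432332;
COMMIT;  -- the setting reverts when the transaction ends
```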

Digging into the PostgreSQL cost model

If only one CPU is used, the execution plan will look like this:

test=# EXPLAIN SELECT * FROM t_test WHERE id = 432332; 
  QUERY PLAN 
---------------------------------------------------------- 
 Seq Scan o...