Mastering PostgreSQL 10
eBook - ePub

Hans-Jurgen Schonig

About This Book

Master the capabilities of PostgreSQL 10 to efficiently manage and maintain your database.
  • Your one-stop guide to mastering advanced concepts in PostgreSQL 10 with ease
  • Master query optimization, replication, and high availability with PostgreSQL
  • Extend the functionalities of your PostgreSQL instance to suit your organizational needs with minimal effort

Who This Book Is For

If you are a PostgreSQL data architect or an administrator and want to understand how to implement advanced functionalities and master complex administrative tasks with PostgreSQL 10, then this book is perfect for you. Prior experience of administrating a PostgreSQL database and a working knowledge of SQL are required to make the best use of this book.

What You Will Learn

  • Get to grips with the advanced features of PostgreSQL 10 and handle advanced SQL
  • Make use of the indexing features in PostgreSQL and fine-tune the performance of your queries
  • Work with stored procedures and manage backup and recovery
  • Master replication and failover techniques
  • Troubleshoot your PostgreSQL instance for solutions to common and not-so-common problems
  • Learn how to migrate your database from MySQL and Oracle to PostgreSQL without any hassle

In Detail

PostgreSQL is an open source database used for handling large datasets (big data) and as a JSON document database. This book highlights the newly introduced features in PostgreSQL 10 and shows you how you can build better PostgreSQL applications and administer your PostgreSQL database more efficiently. We begin by explaining advanced database design concepts in PostgreSQL 10, along with indexing and query optimization. You will also see how to work with event triggers and perform concurrent transactions and table partitioning, along with exploring SQL and server tuning. We will walk you through implementing advanced administrative tasks such as server maintenance and monitoring, replication, recovery, high availability, and much more. You will understand common and not-so-common troubleshooting problems and how you can overcome them. By the end of this book, you will have an expert-level command of advanced database functionalities and will be able to implement advanced administrative tasks with PostgreSQL 10.

Style and approach

This mastering-level guide delves into the advanced functionalities of PostgreSQL 10


Information

Year: 2018
ISBN: 9781788470537

Making Use of Indexes

In Chapter 2, Understanding Transactions and Locking, you learned about concurrency and locking. In this chapter, it is time to attack indexing head-on. The importance of this topic cannot be stressed enough: indexing is (and will most likely remain) one of the most important topics in the life of every database engineer.
After 18 years of professional, full-time PostgreSQL consulting and PostgreSQL 24x7 support (www.cybertec-postgresql.com), I can say one thing for sure: bad indexing is the main source of bad performance. Of course, it is important to adjust memory parameters and the like. However, all of that is in vain if indexes are not used properly. There is simply no replacement for a missing index.
Therefore, I have dedicated an entire chapter to indexing alone to give you as many insights as possible.
In this chapter, you will learn about the following topics:
  • When does PostgreSQL use indexes?
  • How does an optimizer handle things?
  • What types of indexes are there and how do they work?
  • Using your own indexing strategies
By the end of this chapter, you will understand how indexes can be used beneficially in PostgreSQL.

Understanding simple queries and the cost model

In this section, we will get started with indexes. To show how things work, some test data is needed. The following code snippet shows how data can be created easily:
test=# CREATE TABLE t_test (id serial, name text); 
CREATE TABLE
test=# INSERT INTO t_test (name) SELECT 'hans'
FROM generate_series(1, 2000000);
INSERT 0 2000000
test=# INSERT INTO t_test (name) SELECT 'paul'
FROM generate_series(1, 2000000);
INSERT 0 2000000
In the first line, a simple table is created. Two columns are used: an autoincrement column that just keeps creating numbers and a column that will be filled with static values.
The generate_series function will generate numbers from 1 to 2 million. So, in this example, 2 million rows containing the static value hans and 2 million rows containing paul are created.
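To get a feeling for what generate_series returns on its own, here is a tiny illustration of my own (not part of the original example); the output shown is what PostgreSQL produces for this call:
test=# SELECT * FROM generate_series(1, 5);
 generate_series
-----------------
               1
               2
               3
               4
               5
(5 rows)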
In all, 4 million rows have been added:
test=# SELECT name, count(*) FROM t_test GROUP BY 1; 
 name |  count
------+---------
 hans | 2000000
 paul | 2000000
(2 rows)
These 4 million rows have some nice properties. IDs are ascending and there are only two distinct names.
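If you want to verify those two properties yourself, a quick sanity check (a sketch of my own, not part of the original example) can be run against the table. Given the inserts above, it should report a minimum ID of 1, a maximum of 4000000, and exactly two distinct names:
test=# SELECT min(id), max(id), count(DISTINCT name) FROM t_test;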
Let's run a simple query now:
test=# \timing
Timing is on.
test=# SELECT * FROM t_test WHERE id = 432332;
   id   | name
--------+------
 432332 | hans
(1 row)
Time: 176.949 ms
In this case, the \timing command tells psql to display the runtime of every query. Note that this is not the real execution time on the server, but the time measured by psql. In the case of very short queries, network latency can be a substantial part of the total time, so this has to be taken into account.
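One simple way to estimate how much of such a number is round-trip overhead rather than actual query work (again, a sketch of my own) is to time a statement that does practically nothing; whatever \timing reports for it is almost entirely network latency plus client overhead:
test=# SELECT 1;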

Making use of EXPLAIN

In this example, reading through 4 million rows to return a single one has taken more than 100 milliseconds. From a performance point of view, this is a total disaster. To figure out what is going wrong, PostgreSQL offers the EXPLAIN command:
test=# \h EXPLAIN 
Command: EXPLAIN
Description: show the execution plan of a statement
Syntax:
EXPLAIN [ ( option [, ...] ) ] statement
EXPLAIN [ ANALYZE ] [ VERBOSE ] statement

where option can be one of:

ANALYZE [ boolean ]
VERBOSE [ boolean ]
COSTS [ boolean ]
BUFFERS [ boolean ]
TIMING [ boolean ]
FORMAT { TEXT | XML | JSON | YAML }
When you have a feeling that a query is not performing well, EXPLAIN will help you to reveal the real performance problem.
Here is how it works:
test=# EXPLAIN SELECT * FROM t_test WHERE id = 432332; 
                          QUERY PLAN
---------------------------------------------------------------
 Gather  (cost=1000.00..43463.92 rows=1 width=9)
   Workers Planned: 2
   ->  Parallel Seq Scan on t_test
         (cost=0.00..42463.82 rows=1 width=9)
         Filter: (id = 432332)
(4 rows)
What you see in this listing is an execution plan. In PostgreSQL, a SQL statement will be executed in four stages. The following components are at work:
  • The parser will check for syntax errors and obvious problems
  • The rewrite system takes care of rules (views and other things)
  • The optimizer will figure out how to execute a query in the most efficient way and work out a plan
  • The plan provided by the optimizer will be used by the executor to finally create the result
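If you want to watch these stages at work, PostgreSQL ships with configuration parameters that log the output of the parser, the rewrite system, and the planner. The following snippet is only a sketch of how they might be switched on for a single session; the resulting LOG messages are very verbose and are meant for learning and debugging, not for production:
test=# SET client_min_messages TO log;
SET
test=# SET debug_print_parse TO on;
SET
test=# SET debug_print_rewritten TO on;
SET
test=# SET debug_print_plan TO on;
SET
Running any statement afterwards will print its parse tree, the rewritten tree, and the final plan.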
The purpose of EXPLAIN is to see what the planner has come up with to run the query efficiently. In my example, PostgreSQL will use a parallel sequential scan. This means that two workers will cooperate and work on the filter condition together. The partial results are then united through a so-called gather node, which was introduced in PostgreSQL 9.6 (it is part of the parallel query infrastructure). If you look at the plan more closely, you will see how many rows PostgreSQL expects at each stage of the plan (in this example, rows=1; that is, one row will be returned).
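If you want to compare such estimates with what actually happens, EXPLAIN can also execute the statement and report the measured row counts and timings. This is just a pointer (a sketch, not part of the original example); keep in mind that the statement really is executed:
test=# EXPLAIN (ANALYZE) SELECT * FROM t_test WHERE id = 432332;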
In PostgreSQL 9.6 and 10.0, the number of parallel workers will be determined by the size of the table. The larger an operation is, the more parallel workers PostgreSQL will fire up. For a very small table, parallelism is not used as it would create too much overhead.
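The size-based heuristic can also be overridden for an individual table by means of the parallel_workers storage parameter (available since PostgreSQL 9.6). The value used here is merely an example, not a recommendation:
test=# ALTER TABLE t_test SET (parallel_workers = 4);
ALTER TABLE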
Parallelism is not a must. It is always possible to reduce the number of parallel workers to mimic pre-PostgreSQL 9.6 behavior by setting the following variable to 0:
test=# SET max_parallel_workers_per_gather TO 0; 
SET
Note that this change has no lasting side effects, as it only applies to your current session. Of course, you can also make the change globally in the postgresql.conf file, but I would not advise you to do so, as you might lose quite a lot of the performance provided by parallel queries.
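If a per-session SET is too narrow but editing postgresql.conf is too broad, the parameter can also be attached to a single database or role; the role name below is purely hypothetical, and RESET brings the current session back to the configured default:
test=# ALTER DATABASE test SET max_parallel_workers_per_gather TO 0;
ALTER DATABASE
test=# ALTER ROLE app_user SET max_parallel_workers_per_gather TO 2;
ALTER ROLE
test=# RESET max_parallel_workers_per_gather;
RESET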

Digging into the PostgreSQL cost model

If only one CPU is used, the execution plan will look like this:
test=# EXPLAIN SELECT * FROM t_test WHERE id = 432332; 
QUERY PLAN
----------------------------------------------------------
Seq Scan o...
