eBook - ePub

Cassandra 3.x High Availability - Second Edition

Name: Cassandra 3.x High Availability - Second Edition
ISBN: 9781786460578

Robbie Strickland

196 pages
English

About this book

Achieve scalability and high availability without compromising on performance

About This Book

• See how to get 100 percent uptime with your Cassandra applications using this easy-to-follow guide
• Learn how to avoid common and not-so-common mistakes while working with Cassandra using this highly practical guide
• Get familiar with the intricacies of working with Cassandra for high availability in your work environment with this go-to guide

Who This Book Is For

If you are a developer or DevOps engineer who has basic familiarity with Cassandra and you want to become an expert at creating highly available, fault-tolerant systems using Cassandra, this book is for you.

What You Will Learn

• Understand how the core architecture of Cassandra enables highly available applications
• Use replication and tunable consistency levels to balance consistency, availability, and performance
• Set up multiple data centers to enable failover, load balancing, and geographic distribution
• Add capacity to your cluster with zero downtime
• Take advantage of high availability features in the native driver
• Create data models that scale well and maximize availability
• Understand common anti-patterns so you can avoid them
• Keep your system working well even during failure scenarios

In Detail

Apache Cassandra is a massively scalable, peer-to-peer database designed for 100 percent uptime, with deployments in the tens of thousands of nodes, all supporting petabytes of data. This book offers a practical insight into building highly available, real-world applications using Apache Cassandra.

The book starts with the fundamentals, helping you to understand how Apache Cassandra's architecture allows it to achieve 100 percent uptime when other systems struggle to do so. You'll get an excellent understanding of data distribution, replication, and Cassandra's highly tunable consistency model. Then we take an in-depth look at Cassandra's robust support for multiple data centers, and you'll see how to scale out a cluster. Next, the book explores the domain of application design, with chapters discussing the native driver and data modeling. Lastly, you'll find out how to steer clear of common anti-patterns and take advantage of Cassandra's ability to fail gracefully.

Style and approach

This practical guide takes you from design through to building highly available systems with Cassandra. Through a systematic, step-by-step approach, you will learn different aspects of building highly available Cassandra applications, all with the help of easy-to-follow examples, tips, and tricks.

Information

Publisher: Packt Publishing
Year: 2016
eBook ISBN: 9781786460578
Edition: Second
Topic: Computer Science
Subtopic: Databases

Cassandra 3.x High Availability - Second Edition

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

First published: December 2014

Second edition: August 2016

Production reference: 1250816

Published by Packt Publishing Ltd.

Livery Place

35 Livery Street

Birmingham

B3 2PB, UK.

ISBN 978-1-78646-210-7

www.packtpub.com

Credits

Author: Robbie Strickland
Reviewer: Jimmy Mårdell
Commissioning Editor: Veena Pagare
Acquisition Editor: Divya Poojari
Content Development Editor: Mayur Pawanikar
Technical Editor: Suwarna Patil
Copy Editors: Safis Editing, Vikrant Phadke
Project Coordinator: Nidhi Joshi
Proofreader: Safis Editing
Indexer: Aishwarya Gangawane
Graphics: Disha Haria
Production Coordinator: Arvindkumar Gupta

About the Author

Robbie Strickland has been involved in the Apache Cassandra project since 2010, and he initially went to production with the 0.5 release. He has made numerous contributions over the years, including work on drivers for C# and Scala and multiple contributions to the core Cassandra codebase. In 2013 he became the very first certified Cassandra developer, and in 2014 DataStax selected him as an Apache Cassandra MVP.

Robbie has been an active speaker and writer in the Cassandra community and is the founder of the Atlanta Cassandra Users Group. Other examples of his writing can be found on the DataStax blog, and he has presented numerous webinars and conference talks over the years.

About the Reviewer

Jimmy Mårdell is a senior software engineer and Cassandra contributor who has worked with Cassandra for more than 5 years. He has been leading the database infrastructure team at Spotify, focusing on improving the Cassandra ecosystem at Spotify and empowering other teams to operate large-scale Cassandra clusters. He has been a speaker at many Cassandra events and in 2015 he was elected by DataStax as an Apache Cassandra MVP. Besides Cassandra, Jimmy likes algorithms and competitive programming and won the programming competition Google Code Jam in 2003.

Preface

Cassandra is a fantastic data store and certainly well suited as the foundation of a highly available system. In fact, it was built for just such a purpose: to power inbox search for Facebook’s messaging service. But it hasn’t always been so easy to use. Its early Thrift interface and unfamiliar data model gave many potential users pause, in many cases with good reason.

I used to advise people to use Cassandra only if nothing else would do the job, because the learning curve was quite steep. Fortunately, Cassandra has matured substantially over the last few years, and version 3.x continues this trend with the introduction of features such as materialized views and SASI indexes. These additions reduce developer workload and significantly increase the overall utility of the system.

The flip side is that each new feature further obscures the underlying data structure, making complex operations seem straightforward. The familiarity of a SQL-like interface can lure an unsuspecting new user into dangerous traps. The moral of this story is that it’s still not a relational database, and you still need to know what it’s doing under the hood.

And imparting that knowledge is the core objective of this book. Each chapter attempts to demystify the inner workings of Cassandra so that you’re no longer working blindly against a black box data store. You will learn to configure, design, and build your system based on a fundamentally solid foundation.

The good news is that Cassandra makes the task of building massively scalable and incredibly reliable systems relatively straightforward, presuming you understand how to partner with it to achieve these goals.

Since you are reading this book, I presume you are either already using Cassandra or planning to do so, and that you’re interested in building a highly available system on top of it. If so, I am confident that you will meet with success if you follow the principles and guidelines offered in the chapters that follow.

What this book covers

Chapter 1, Cassandra’s Approach to High Availability, is an introduction to concepts related to system availability and the problems that have been encountered historically when trying to make data stores highly available. The chapter outlines Cassandra’s solutions to these problems.

Chapter 2, Data Distribution, outlines the core mechanisms that underlie Cassandra’s distributed hash table model, including consistent hashing and partitioner implementations.

Chapter 3, Replication, offers an in-depth look at the data replication architecture used in Cassandra, with a focus on the relationship between consistency levels and replication factor.

Chapter 4, Data Centers, provides you with a thorough understanding of Cassandra’s robust data center replication capabilities, including deployment on EC2 and building separate clusters for analysis using Hadoop or Spark.

Chapter 5, Scaling Out, is a discussion of the tools, processes, and general guidance needed to properly increase the size of your cluster.

Chapter 6, High Availability Features in the Native Java Client, covers the new native Java driver and its availability-related features. We’ll discuss node discovery, cluster-aware load balancing, automatic failover, and other important concepts.

Chapter 7, Modeling for Availability, discusses the important concepts readers need to understand when modeling highly available data in Cassandra. CQL, keys, wide rows, and denormalization are among the topics that will be covered.

Chapter 8, Anti-Patterns, complements the data modeling chapter by presenting a set of common anti-patterns that proliferate among inexperienced Cassandra developers. These include queues, joins, high delete volumes, and high-cardinality secondary indexes, among others.

Chapter 9, Failing Gracefully, helps you understand how to deal with the various failure cases, as failure in a large distributed system is inevitable. We’ll examine a number of possible failure scenarios, how to detect them, and how to resolve them.

What you need for this book

This bo...
