Cassandra 3.x High Availability - Second Edition
eBook - ePub

Cassandra 3.x High Availability - Second Edition

Robbie Strickland

Share book
  1. 196 pages
  2. English
  3. ePUB (mobile friendly)
  4. Available on iOS & Android
eBook - ePub

Cassandra 3.x High Availability - Second Edition

Robbie Strickland

Book details
Book preview
Table of contents
Citations

About This Book

Achieve scalability and high availability without compromising on performanceAbout This Book• See how to get 100 percent uptime with your Cassandra applications using this easy-follow guide• Learn how to avoid common and not-so-common mistakes while working with Cassandra using this highly practical guide• Get familiar with the intricacies of working with Cassandra for high availability in your work environment with this go-to-guideWho This Book Is ForIf you are a developer or DevOps engineer who has basic familiarity with Cassandra and you want to become an expert at creating highly available, fault tolerant systems using Cassandra, this book is for you.What You Will Learn • Understand how the core architecture of Cassandra enables highly available applications• Use replication and tunable consistency levels to balance consistency, availability, and performance• Set up multiple data centers to enable failover, load balancing, and geographic distribution• Add capacity to your cluster with zero downtime• Take advantage of high availability features in the native driver• Create data models that scale well and maximize availability• Understand common anti-patterns so you can avoid them• Keep your system working well even during failure scenariosIn DetailApache Cassandra is a massively scalable, peer-to-peer database designed for 100 percent uptime, with deployments in the tens of thousands of nodes, all supporting petabytes of data. This book offers a practical insight into building highly available, real-world applications using Apache Cassandra.The book starts with the fundamentals, helping you to understand how Apache Cassandra's architecture allows it to achieve 100 percent uptime when other systems struggle to do so. You'll get an excellent understanding of data distribution, replication, and Cassandra's highly tunable consistency model. Then we take an in-depth look at Cassandra's robust support for multiple data centers, and you'll see how to scale out a cluster. Next, the book explores the domain of application design, with chapters discussing the native driver and data modeling. Lastly, you'll find out how to steer clear of common anti-patterns and take advantage of Cassandra's ability to fail gracefully.Style and approach This practical guide will get you implementing Cassandra right from the design to creating highly available systems. Through a systematic, step-by-step approach, you will learn different aspects of building highly available Cassandra applications and all this with the help of easy-to-follow examples, tips, and tricks.

Frequently asked questions

How do I cancel my subscription?
Simply head over to the account section in settings and click on “Cancel Subscription” - it’s as simple as that. After you cancel, your membership will stay active for the remainder of the time you’ve paid for. Learn more here.
Can/how do I download books?
At the moment all of our mobile-responsive ePub books are available to download via the app. Most of our PDFs are also available to download and we're working on making the final remaining ones downloadable now. Learn more here.
What is the difference between the pricing plans?
Both plans give you full access to the library and all of Perlego’s features. The only differences are the price and subscription period: With the annual plan you’ll save around 30% compared to 12 months on the monthly plan.
What is Perlego?
We are an online textbook subscription service, where you can get access to an entire online library for less than the price of a single book per month. With over 1 million books across 1000+ topics, we’ve got you covered! Learn more here.
Do you support text-to-speech?
Look out for the read-aloud symbol on your next book to see if you can listen to it. The read-aloud tool reads text aloud for you, highlighting the text as it is being read. You can pause it, speed it up and slow it down. Learn more here.
Is Cassandra 3.x High Availability - Second Edition an online PDF/ePUB?
Yes, you can access Cassandra 3.x High Availability - Second Edition by Robbie Strickland in PDF and/or ePUB format, as well as other popular books in Ciencia de la computación & Bases de datos. We have over one million books available in our catalogue for you to explore.

Information

Year
2016
ISBN
9781786460578

Cassandra 3.x High Availability


Cassandra 3.x High Availability - Second Edition

Copyright © 2016 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
First published: December 2014
Second edition: August 2016
Production reference: 1250816
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham
B3 2PB, UK.
ISBN 978-1-78646-210-7
www.packtpub.com

Credits

Author
Robbie Strickland
Copy Editor
Safis Editing
Vikrant Phadke
Reviewer
Jimmy Mårdell
Project Coordinator
Nidhi Joshi
Commissioning Editor
Veena Pagare
Proofreader
Safis Editing
Acquisition Editor
Divya Poojari
Indexer
Aishwarya Gangawane
Content Development Editor
Mayur Pawanikar
Graphics
Disha Haria
Technical Editor
Suwarna Patil
Production Coordinator
Arvindkumar Gupta

About the Author

Robbie Strickland has been involved in the Apache Cassandra project since 2010, and he initially went to production with the 0.5 release. He has made numerous contributions over the years, including work on drivers for C# and Scala and multiple contributions to the core Cassandra codebase. In 2013 he became the very first certified Cassandra developer, and in 2014 DataStax selected him as an Apache Cassandra MVP.
Robbie has been an active speaker and writer in the Cassandra community and is the founder of the Atlanta Cassandra Users Group. Other examples of his writing can be found on the DataStax blog, and he has presented numerous webinars and conference talks over the years.

About the Reviewer

Jimmy Mårdell is a senior software engineer and Cassandra contributor who has worked with Cassandra for more than 5 years. He has been leading the database infrastructure team at Spotify, focusing on improving the Cassandra ecosystem at Spotify and empowering other teams to operate large-scale Cassandra clusters. He has been a speaker at many Cassandra events and in 2015 he was elected by DataStax as an Apache Cassandra MVP. Besides Cassandra, Jimmy likes algorithms and competitive programming and won the programming competition Google Code Jam in 2003.

www.PacktPub.com

eBooks, discount offers, and more

Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.
At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.
eBooks, discount offers, and more
https://www2.packtpub.com/books/subscription/packtlib
Do you need instant solutions to your IT questions? PacktLib is Packt's online digital book library. Here, you can search, access, and read Packt's entire library of books.

Why subscribe?

  • Fully searchable across every book published by Packt
  • Copy and paste, print, and bookmark content
  • On demand and accessible via a web browser

Preface

Cassandra is a fantastic data store and certainly well suited as the foundation of a highly available system. In fact, it was built just for such a purpose: to handle Facebook’s messaging service. But it hasn’t always been so easy to use, with its early Thrift interface and unfamiliar data model causing many potential users to pause—and in many cases for a good reason.
Fortunately, Cassandra has matured substantially over the last few years. I used to advise people only to use Cassandra if nothing else would do the job because the learning curve was quite steep. Version 3.x continues this trend, with the introduction of features such as materialized views and SASI indexes. These additions reduce developer workload and significantly increase the overall utility of the system.
The flip side is that each new feature further obscures the underlying data structure, making complex operations seem straightforward. The familiarity of a SQL-like interface can lure an unsuspecting new user into dangerous traps. The moral of this story is that it’s still not a relational database, and you still need to know what it’s doing under the hood.
And imparting that knowledge is the core objective of this book. Each chapter attempts to demystify the inner workings of Cassandra so that you’re no longer working blindly against a black box data store. You will learn to configure, design, and build your system based on a fundamentally solid foundation.
The good news is that Cassandra makes the task of building massively scalable and incredibly reliable systems relatively straightforward, presuming you understand how to partner with it to achieve these goals.
Since you are reading this book, I presume you are either already using Cassandra or planning to do so, and that you’re interested in building a highly available system on top of it. If so, I am confident that you will meet with success if you follow the principles and guidelines offered in the chapters that follow.

What this book covers

Chapter 1, Cassandra’s Approach to High Availability, is an introduction to concepts related to system availability and the problems that have been encountered historically when trying to make data stores highly available. The chapter outlines Cassandra’s solutions to these problems.
Chapter 2, Data Distribution, outlines the core mechanisms that underlie Cassandra’s distributed hash table model, including consistent hashing and partitioner implementations.
Chapter 3, Replication, offers an in-depth look at the data replication architecture used in Cassandra, with a focus on the relationship between consistency levels and replication factor.
Chapter 4, Data Centers, provides you with a thorough understanding of Cassandra’s robust data center replication capabilities, including deployment on EC2 and building separate clusters for analysis using Hadoop or Spark.
Chapter 5, Scaling Out, is a discussion of the tools, processes, and general guidance needed to properly increase the size of your cluster.
Chapter 6, High Availability Features in the Native Java Client, covers the new native Java driver and its availability-related features. We’ll discuss node discovery, cluster-aware load balancing, automatic failover, and other important concepts.
Chapter 7, Modeling for Availability, discusses the important concepts readers need to understand when modeling highly available data in Cassandra. CQL, keys, wide rows, and denormalization are among the topics that will be covered.
Chapter 8, Anti-Patterns, complements the data modeling chapter by presenting a set of common anti-patterns that proliferate among inexperienced Cassandra developers. Some patterns include queues, joins, high delete volumes, and high-cardinality secondary indexes, among others.
Chapter 9, Failing Gracefully, helps you understand how to deal with the various failure cases, as failure in a large distributed system is inevitable. We’ll examine a number of possible failure scenarios, how to detect them, and how to resolve them.

What you need for this book

This bo...

Table of contents