Mastering Apache Cassandra - Second Edition
eBook - ePub

Nishant Neeraj

  1. 350 pages
  2. English
  3. ePUB (mobile friendly)
  4. Available on iOS & Android
Frequently asked questions

Is Mastering Apache Cassandra - Second Edition an online PDF/ePUB?
Yes, you can access Mastering Apache Cassandra - Second Edition by Nishant Neeraj in PDF and/or ePUB format, as well as other popular books in Computer Science & Data Modelling & Design. We have over one million books available in our catalogue for you to explore.

Information

Year
2015
ISBN
9781784392611


Table of Contents

Mastering Apache Cassandra Second Edition
Credits
About the Author
About the Reviewers
www.PacktPub.com
Support files, eBooks, discount offers, and more
Why subscribe?
Free access for Packt account holders
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example code
Errata
Piracy
Questions
1. Quick Start
Introduction to Cassandra
A distributed database
High availability
Replication
Multiple data centers
A brief introduction to a data model
Installing Cassandra locally
Cassandra in action
Modeling data
Writing code
Setting up
Inserting records
Retrieving data
Writing your application
Getting the connection
Executing queries
Object mapping
Summary
2. Cassandra Architecture
Problems in the RDBMS world
Enter NoSQL
The CAP theorem
Consistency
Availability
Partition-tolerance
The significance of the CAP theorem
Cassandra
Understanding the architecture of Cassandra
Ring representation
Virtual nodes
How Cassandra works
Write in action
Read in action
The components of Cassandra
The messaging service
Gossip
Failure detection
Gossip and failure detection
Partitioner
Replication
The notorious R + W > N inequality
LSM tree
Commit log
MemTable
SSTable
The bloom filter
Index files
Data files
Compaction
Tombstones
Hinted handoff
Read repair and anti-entropy
Merkle tree
Summary
3. Effective CQL
The Cassandra data model
The counter column (cell)
The expiring cell
The column family
Keyspaces
Data types
The primary index
CQL3
Creating a keyspace
SimpleStrategy
NetworkTopologyStrategy
Altering a keyspace
Creating a table
Table properties
Altering a table
Adding a column
Renaming a column
Changing the data type
Dropping a column
Updating the table properties
Dropping a table
Creating an index
Dropping an index
Creating a data type
Altering a custom type
Dropping a custom type
Creating triggers
Dropping a trigger
Creating a user
Altering a user
Dropping a user
Granting permission
Revoking permission using REVOKE
Inserting data
Collections in CQL
Lists
Sets
Maps
Lightweight transactions
Updating a row
Deleting a row
Executing the BATCH statement
Other CQL commands
USE
TRUNCATE
LIST USERS
LIST PERMISSIONS
CQL shell commands
DESCRIBE
TRACING
CONSISTENCY
COPY
CAPTURE
ASSUME
SOURCE
SHOW
EXIT
Summary
4. Deploying a Cluster
Evaluating requirements
Hard disk capacity
RAM
CPU
Is a node a server?
Network
System configurations
Optimizing user limits
Swapping memory
Clock synchronization
Disk readahead
The required software
Installing Oracle Java 7
RHEL and CentOS systems
Debian and Ubuntu systems
Installing the Java Native Access library
Installing Cassandra
Installing from a tarball
Installing from the ASF repository for Debian or Ubuntu
Anatomy of the installation
Cassandra binaries
Configuration files
Setting up data and commitlog directories
Configuring a Cassandra cluster
The cluster name
The seed node
Listen, broadcast, and RPC addresses
num_tokens versus initial_token
num_tokens
initial_token
Partitioners
The Random partitioner
The Byte-ordered partitioner
The Murmur3 partitioner
Snitches
SimpleSnitch
PropertyFileSnitch
GossipingPropertyFileSnitch
RackInferringSnitch
EC2Snitch
EC2MultiRegionSnitch
Replica placement strategies
SimpleStrategy
NetworkTopologyStrategy
Multiple data center setups
Launching a cluster with a script
Creating a keyspace
Authorization and authentication
Summary
5. Performance Tuning
Stress testing
Database schema
Data distribution
Write pattern
Read queries
Performance tuning
Write performance
Read performance
Choosing the right compaction strategy
Size-tiered compaction strategy
Leveled compaction
Row cache
Key cache
Cache settings
Enabling compression
Tuning the bloom filter
More tuning via cassandra.yaml
commitlog_sync
column_index_size_in_kb
commitlog_total_space_in_mb
Tweaking JVM
Java heap
Garbage collection
Other JVM options
Scaling horizontally and vertically
Network
Summary
6. Managing a Cluster – Scaling, Node Repair, and Backup
Scaling
Adding nodes to a cluster
Adding new nodes in vnode-enabled clusters
Adding a new node to a cluster without vnodes
Removing nodes from a cluster
Removing a live node
Removing a dead node
Replacing a node
Backup and restoration
Using the Cassandra bulk loader to restore the data
Load balancing
DataStax OpsCenter – managing large clusters
Summary
7. Monitoring
Cassandra's JMX interface
Accessing MBeans using JConsole
Cassandra's nodetool utility
Monitoring with nodetool
cfstats
netstats
status
ring and describering
tpstats
compactionstats
info
Managing administration with nodetool
drain
decommission
removenode
move
repair
upgradesstables
snapshot
DataStax OpsCenter
The OpsCenter features
Installing OpsCenter and an agent
Prerequisites
Running a Cassandra cluster
Installing OpsCenter from tarball
Setting up an OpsCenter agent
Monitoring and administrating with OpsCenter
Other features of OpsCenter
Nagios – monitoring and notification
Installing Nagios
Prerequisites
Preparation
Installation
Installing Nagios
Configuring Apache httpd
Installing Nagios plugins
Setting up Nagios as a service
Nagios plugins
Nagios plugins for Cassandra
Executing remote plugins via the NRPE plugin
Installing NRPE on host machines
Installing the NRPE plugin on a Nagios machine
Setting up things to monitor
Monitoring and notification using Nagios
Cassandra log
Enabling Java options for GC logging
Troubleshooting
High CPU usage
High memory usage
Hotspots
OpenJDK's erratic behavior
Disk performance
Slow snapshots
Getting help from the mailing list
Summary
8. Integration with Hadoop
Using Hadoop
Hadoop and Cassandra
Introduction to Hadoop
HDFS
Data management
NameNode
DataNodes
Hadoop MapReduce
JobTracker
TaskTracker
Reliability of data and processes in Hadoop
Setting up local Hadoop
Testing the installation
Cassandra with Hadoop MapReduce
Preparing Cassandra for Hadoop
ColumnFamilyInputFormat
ColumnFamilyOutputFormat
CqlOutputFormat and CqlInputFormat
ConfigHelper
Wide row support
Bulk loading
Secondary index support
Cassandra and Hadoop in action
Executing, debugging, monitoring, and looking at results
Hadoop in a Cassandra cluster
Cassandra filesystem
Integration with Pig
Installing Pig
Integrating Pig and Cassandra
Integration with other analytical tools
Summary
Index
