Learning Apache Cassandra
eBook - ePub

Learning Apache Cassandra

Mat Brown

Share book
  1. 246 pages
  2. English
  3. ePUB (mobile friendly)
  4. Available on iOS & Android
eBook - ePub

Learning Apache Cassandra

Mat Brown

Book details
Book preview
Table of contents
Citations

Frequently asked questions

How do I cancel my subscription?
Simply head over to the account section in settings and click on “Cancel Subscription” - it’s as simple as that. After you cancel, your membership will stay active for the remainder of the time you’ve paid for. Learn more here.
Can/how do I download books?
At the moment all of our mobile-responsive ePub books are available to download via the app. Most of our PDFs are also available to download and we're working on making the final remaining ones downloadable now. Learn more here.
What is the difference between the pricing plans?
Both plans give you full access to the library and all of Perlego’s features. The only differences are the price and subscription period: With the annual plan you’ll save around 30% compared to 12 months on the monthly plan.
What is Perlego?
We are an online textbook subscription service, where you can get access to an entire online library for less than the price of a single book per month. With over 1 million books across 1000+ topics, we’ve got you covered! Learn more here.
Do you support text-to-speech?
Look out for the read-aloud symbol on your next book to see if you can listen to it. The read-aloud tool reads text aloud for you, highlighting the text as it is being read. You can pause it, speed it up and slow it down. Learn more here.
Is Learning Apache Cassandra an online PDF/ePUB?
Yes, you can access Learning Apache Cassandra by Mat Brown in PDF and/or ePUB format, as well as other popular books in Commerce & Business Intelligence. We have over one million books available in our catalogue for you to explore.

Information

Year
2015
ISBN
9781783989201

Learning Apache Cassandra


Table of Contents

Learning Apache Cassandra
Credits
About the Author
About the Reviewers
www.PacktPub.com
Support files, eBooks, discount offers, and more
Why subscribe?
Free access for Packt account holders
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example code
Errata
Piracy
Questions
1. Getting Up and Running with Cassandra
What Cassandra offers, and what it doesn't
Horizontal scalability
High availability
Write optimization
Structured records
Secondary indexes
Efficient result ordering
Immediate consistency
Discretely writable collections
Relational joins
MapReduce
Comparing Cassandra to the alternatives
Installing Cassandra
Installing on Mac OS X
Installing on Ubuntu
Installing on Windows
Bootstrapping the project
CQL – the Cassandra Query Language
Interacting with Cassandra
Creating a keyspace
Selecting a keyspace
Summary
2. The First Table
Creating the users table
Structuring of tables
Table and column options
The type system
Strings
Integers
Floating point and decimal numbers
Dates and times
UUIDs
Booleans
Blobs
The purpose of types
Inserting data
Writing data does not yield feedback
Partial inserts
Selecting data
Missing rows
Selecting more than one row
Retrieving all the rows
Paginating through results
Developing a mental model for Cassandra
Summary
3. Organizing Related Data
A table for status updates
Creating a table with a compound primary key
The structure of the status updates table
UUIDs and timestamps
Working with status updates
Extracting timestamps
Looking up a specific status update
Automatically generating UUIDs
Anatomy of a compound primary key
Anatomy of a single-column primary key
Beyond two columns
Compound keys represent parent-child relationships
Coupling parents and children using static columns
Defining static columns
Working with static columns
Interacting only with the static columns
Static-only inserts
Static columns act like predefined joins
When to use static columns
Refining our mental model
Summary
4. Beyond Key-Value Lookup
Looking up rows by partition
The limits of the WHERE keyword
Restricting by clustering column
Restricting by part of a partition key
Retrieving status updates for a specific time range
Creating time UUID ranges
Selecting a slice of a partition
Paginating over rows in a partition
Counting rows
Reversing the order of rows
Reversing clustering order at query time
Reversing clustering order in the schema
Paginating over multiple partitions
Building an autocomplete function
Summary
5. Establishing Relationships
Modeling follow relationships
Outbound follows
Inbound follows
Storing follow relationships
Designing around queries
Denormalization
Looking up follow relationships
Unfollowing users
Using secondary indexes to avoid denormalization
The form of the single table
Adding a secondary index
Other uses of secondary indexes
Limitations of secondary indexes
Secondary indexes can only have one column
Secondary indexes can only be tested for equality
Secondary index lookup is not as efficient as primary key lookup
Summary
6. Denormalizing Data for Maximum Performance
A normalized approach
Generating the timeline
Ordering and pagination
Multiple partitions and read efficiency
Partial denormalization
Displaying the home timeline
Read performance and write complexity
Fully denormalizing the home timeline
Creating a status update
Displaying the home timeline
Write complexity and data integrity
Summary
7. Expanding Your Data Model
Viewing a table schema in cqlsh
Adding columns to tables
Deleting columns
Updating the existing rows
Updating multiple columns
Updating multiple rows
Removing a value from a column
Missing columns in Cassandra
Deleting specific columns
Syntactic sugar for deletion
Inserts, updates, and upserts
Inserts can overwrite existing data
Checking before inserting isn't enough
Another advantage of UUIDs
Conditional inserts and lightweight transactions
Updates can create new rows
Optimistic locking with conditional updates
Optimistic locking in action
Optimistic locking and accidental updates
Lightweight transactions have a cost
When lightweight transactions aren't necessary
Summary
8. Collections, Tuples, and User-defined Types
The problem with concurrent updates
Serializing the collection
Introducing concurrency
Collection columns and concurrent updates
Defining collection columns
Reading and writing sets
Advanced set manipulation
Removing values from a set
Sets and uniqueness
Collections and upserts
Using lists for ordered, nonunique values
Defining a list column
Writing a list
Discrete list manipulation
Writing data at a specific index
Removing elements from the list
Using maps to store key-value pairs
Writing a map
Updating discrete values in a map
Removing values from maps
Collections in inserts
Collections and secondary indexes
Secondary indexes on map columns
The limitations of collections
Reading discrete values from collections
Collection size limit
Reading a collection column from multiple rows
Performance of collection operations
Working with tuples
Creating a tuple column
Writing to tuples
Indexing tuples
User-defined types
Creating a user-defined type
Assigning a user-defined type to a column
Adding data to a user-defined column
Indexing and querying user-defined types
Partial selection of user-defined types
Choosing between tuples and user-defined types
Comparing data structures
Summary
9. Aggregating Time-Series Data
Recording discrete analytics observations
Using discrete analytics observations
Slicing and dicing our data
Recording aggregate analytics observations
Answering the right question
Precomputation versus read-time aggregation
The many possibilities for aggregation
The role of discrete observations
Recording analytics observations
Updating a counter column...

Table of contents