Spark Cookbook
eBook - ePub

Rishi Yadav

  1. 226 pages
  2. English
  3. ePUB (mobile friendly)

About This Book

If you are a data engineer, an application developer, or a data scientist who would like to leverage the power of Apache Spark to get better insights from big data, then this is the book for you.

Information

Year
2015
ISBN
9781783987061
Edition
1

Table of Contents

Spark Cookbook
Credits
About the Author
About the Reviewers
www.PacktPub.com
Support files, eBooks, discount offers, and more
Why Subscribe?
Free Access for Packt account holders
Preface
What this book covers
What you need for this book
Who this book is for
Sections
Getting ready
How to do it…
How it works…
There's more…
See also
Conventions
Reader feedback
Customer support
Downloading the color images of this book
Errata
Piracy
Questions
1. Getting Started with Apache Spark
Introduction
Installing Spark from binaries
Getting ready
How to do it...
Building the Spark source code with Maven
Getting ready
How to do it...
Launching Spark on Amazon EC2
Getting ready
How to do it...
See also
Deploying on a cluster in standalone mode
Getting ready
How to do it...
How it works...
See also
Deploying on a cluster with Mesos
How to do it...
Deploying on a cluster with YARN
Getting ready
How to do it...
How it works…
Using Tachyon as an off-heap storage layer
How to do it...
See also
2. Developing Applications with Spark
Introduction
Exploring the Spark shell
How to do it...
Developing Spark applications in Eclipse with Maven
Getting ready
How to do it...
Developing Spark applications in Eclipse with SBT
How to do it...
Developing a Spark application in IntelliJ IDEA with Maven
How to do it...
Developing a Spark application in IntelliJ IDEA with SBT
How to do it...
3. External Data Sources
Introduction
Loading data from the local filesystem
How to do it...
Loading data from HDFS
How to do it...
There's more…
Loading data from HDFS using a custom InputFormat
How to do it...
Loading data from Amazon S3
How to do it...
Loading data from Apache Cassandra
How to do it...
There's more...
Merge strategies in sbt-assembly
Loading data from relational databases
Getting ready
How to do it...
How it works…
4. Spark SQL
Introduction
Understanding the Catalyst optimizer
How it works…
Analysis
Logical plan optimization
Physical planning
Code generation
Creating HiveContext
Getting ready
How to do it...
Inferring schema using case classes
How to do it...
Programmatically specifying the schema
How to do it...
How it works…
Loading and saving data using the Parquet format
How to do it...
How it works…
There's more…
Loading and saving data using the JSON format
How to do it...
How it works…
There's more…
Loading and saving data from relational databases
Getting ready
How to do it...
Loading and saving data from an arbitrary source
How to do it...
There's more…
5. Spark Streaming
Introduction
Word count using Streaming
How to do it...
Streaming Twitter data
How to do it...
Streaming using Kafka
Getting ready
How to do it...
There's more…
6. Getting Started with Machine Learning Using MLlib
Introduction
Creating vectors
How to do it…
How it works...
Creating a labeled point
How to do it…
Creating matrices
How to do it…
Calculating summary statistics
How to do it…
Calculating correlation
Getting ready
How to do it…
Doing hypothesis testing
How to do it…
Creating machine learning pipelines using ML
Getting ready
How to do it…
7. Supervised Learning with MLlib – Regression
Introduct...