Learning Spark 2nd Edition by Jules Damji, Brooke Wenig, Tathagata Das, Denny Lee ISBN 1492050040 9781492050049

Name: Learning Spark 2nd Edition by Jules Damji, Brooke Wenig, Tathagata Das, Denny Lee ISBN 1492050040 9781492050049
SKU: EB_99116
Price: 24.99 USD
Availability: In stock

Original price was: $70.00. Current price is: $24.99.

Instant download of Learning Spark, Second Edition, by Jules S. Damji, Brooke Wenig, Tathagata Das, and Denny Lee after payment.

Category: Ebooks

Learning Spark 2nd Edition by Jules Damji, Brooke Wenig, Tathagata Das, Denny Lee – Ebook PDF Instant Download/Delivery: 1492050040, 978-1492050049
Full download of Learning Spark 2nd Edition after payment

Product details:

ISBN 10: 1492050040
ISBN 13: 978-1492050049
Authors: Jules Damji, Brooke Wenig, Tathagata Das, Denny Lee

Data is getting bigger, arriving faster, and coming in varied formats, and it all needs to be processed at scale for analytics or machine learning. How can you process such varied data workloads efficiently? Enter Apache Spark.

Updated to emphasize new features in Spark 2.4, this second edition shows data engineers and scientists why structure and unification in Spark matter. Specifically, this book explains how to perform simple and complex data analytics and employ machine-learning algorithms. Through discourse, code snippets, and notebooks, you’ll be able to:

Learn Python, SQL, Scala, or Java high-level APIs: DataFrames and Datasets
Peek under the hood of the Spark SQL engine to understand Spark transformations and performance
Inspect, tune, and debug your Spark operations with Spark configurations and Spark UI
Connect to data sources: JSON, Parquet, CSV, Avro, ORC, Hive, S3, or Kafka
Perform analytics on batch and streaming data using Structured Streaming
Build reliable data pipelines with open source Delta Lake and Spark
Develop machine learning pipelines with MLlib and productionize models using MLflow
Use open source Pandas framework Koalas and Spark for data transformation and feature engineering

Learning Spark 2nd Edition table of contents:

1. Introduction to Apache Spark: A Unified Analytics Engine

The Genesis of Spark
Big Data and Distributed Computing at Google
Hadoop at Yahoo!
Spark’s Early Years at AMPLab
What Is Apache Spark?
Apache Spark Components as a Unified Stack
Apache Spark’s Distributed Execution
The Developer’s Experience
Who Uses Spark, and for What?
Community Adoption and Expansion

2. Downloading Apache Spark and Getting Started

Step 1: Downloading Apache Spark
Spark’s Directories and Files
Step 2: Using the Scala or PySpark Shell
Step 3: Understanding Spark Application Concepts

3. Apache Spark’s Structured APIs

Spark: What’s Underneath an RDD?
Structuring Spark
Key Merits and Benefits
The DataFrame API

4. Spark SQL and DataFrames: Introduction to Built-in Data Sources

Using Spark SQL in Spark Applications
Basic Query Examples
SQL Tables and Views

5. Spark SQL and DataFrames: Interacting with External Data Sources

Spark SQL and Apache Hive
User-Defined Functions
Querying with the Spark SQL Shell, Beeline, and Tableau
External Data Sources
Other External Sources
Higher-Order Functions in DataFrames and Spark SQL
Built-in Functions for Complex Data Types

6. Spark SQL and Datasets

Single API for Java and Scala
Scala Case Classes and JavaBeans for Datasets
Working with Datasets
Memory Management for Datasets and DataFrames
Serialization and Deserialization (SerDe)

7. Optimizing and Tuning Spark Applications

Optimizing and Tuning Spark for Efficiency
Viewing and Setting Apache Spark Configurations
Scaling Spark for Large Workloads
Caching and Persistence of Data

8. Structured Streaming

Evolution of the Apache Spark Stream Processing Engine
The Advent of Micro-Batch Stream Processing
Lessons Learned from Spark Streaming (DStreams)
The Philosophy of Structured Streaming
The Programming Model of Structured Streaming

9. Building Reliable Data Lakes with Apache Spark

The Importance of an Optimal Storage Solution
Building Lakehouses with Apache Spark and Delta Lake
Configuring Apache Spark with Delta Lake
Loading Data into a Delta Lake Table

10. Machine Learning with MLlib

What Is Machine Learning?
Supervised Learning
Unsupervised Learning
Why Spark for Machine Learning?
Designing Machine Learning Pipelines

11. Managing, Deploying, and Scaling Machine Learning Pipelines with Apache Spark

Model Management
Model Deployment Options with MLlib
Model Export Patterns for Real-Time Inference
Leveraging Spark for Non-MLlib Models
Summary

12. Epilogue: Apache Spark 3.0

Spark Core and Spark SQL
Dynamic Partition Pruning
Adaptive Query Execution
Accelerator-Aware Scheduler
Structured Streaming
