
The Data Science Handbook 1st edition by Field Nicholas Cady ISBN 1119092922 9781119092926

Name: The Data Science Handbook 1st edition by Field Nicholas Cady ISBN 1119092922 9781119092926
SKU: EB_4200
Price: 35.00 USD
Availability: InStock

Original price was: $70.00. Current price is: $35.00.

Instant download of The Data Science Handbook by Field Nicholas Cady after payment

Category: Ebooks
Tags: Data Science, Data Science Handbook, Field Cady, The Field Nicholas, The Field Nicholas Cady


The Data Science Handbook 1st edition by Field Nicholas Cady – Ebook PDF Instant Download/Delivery: 1119092922, 9781119092926
Full download of The Data Science Handbook, 1st edition, after payment

Product details:

ISBN 10: 1119092922
ISBN 13: 9781119092926
Author: Field Nicholas Cady

The Data Science Handbook is a resource on data analysis methodology and big data software tools. It is written for software and data science professionals who need to better understand the analytics and mathematics of the discipline, and for researchers who need to learn real-world coding and expand their skill set. That audience includes data analysts, statisticians, software developers, software engineers, BI analysts and developers, junior data scientists, managers of data science, and technical executives interested in understanding more of the nuances of the field. The book also serves as a reference for new graduates seeking entry-level data science positions, as a classroom supplement for advanced undergraduates and entry-level graduate students, and as a title for academic and corporate libraries.

The Data Science Handbook, 1st edition: Table of contents

Chapter 1: Introduction: Becoming a Unicorn

1.1 Aren’t Data Scientists Just Overpaid Statisticians?

1.2 How Is This Book Organized?

1.3 How to Use This Book?

1.4 Why Is It All in Python™, Anyway?

1.5 Example Code and Datasets

1.6 Parting Words

Part I: The Stuff You’ll Always Use

Chapter 2: The Data Science Road Map

2.1 Frame the Problem

2.2 Understand the Data: Basic Questions

2.3 Understand the Data: Data Wrangling

2.4 Understand the Data: Exploratory Analysis

2.5 Extract Features

2.6 Model

2.7 Present Results

2.8 Deploy Code

2.9 Iterating

2.10 Glossary

Chapter 3: Programming Languages

3.1 Why Use a Programming Language? What Are the Other Options?

3.2 A Survey of Programming Languages for Data Science

3.3 Python Crash Course

3.4 Strings

3.5 Defining Functions

3.6 Python’s Technical Libraries

3.7 Other Python Resources

3.8 Further Reading

3.9 Glossary

Interlude: My Personal Toolkit

Chapter 4: Data Munging: String Manipulation, Regular Expressions, and Data Cleaning

4.1 The Worst Dataset in the World

4.2 How to Identify Pathologies

4.3 Problems with Data Content

4.4 Formatting Issues

4.5 Example Formatting Script

4.6 Regular Expressions

4.7 Life in the Trenches

4.8 Glossary

Chapter 5: Visualizations and Simple Metrics

5.1 A Note on Python’s Visualization Tools

5.2 Example Code

5.3 Pie Charts

5.4 Bar Charts

5.5 Histograms

5.6 Means, Standard Deviations, Medians, and Quantiles

5.7 Boxplots

5.8 Scatterplots

5.9 Scatterplots with Logarithmic Axes

5.10 Scatter Matrices

5.11 Heatmaps

5.12 Correlations

5.13 Anscombe’s Quartet and the Limits of Numbers

5.14 Time Series

5.15 Further Reading

5.16 Glossary

Chapter 6: Machine Learning Overview

6.1 Historical Context

6.2 Supervised versus Unsupervised

6.3 Training Data, Testing Data, and the Great Boogeyman of Overfitting

6.4 Further Reading

6.5 Glossary

Chapter 7: Interlude: Feature Extraction Ideas

7.1 Standard Features

7.2 Features That Involve Grouping

7.3 Preview of More Sophisticated Features

7.4 Defining the Feature You Want to Predict

Chapter 8: Machine Learning Classification

8.1 What Is a Classifier, and What Can You Do with It?

8.2 A Few Practical Concerns

8.3 Binary versus Multiclass

8.4 Example Script

8.5 Specific Classifiers

8.6 Evaluating Classifiers

8.7 Selecting Classification Cutoffs

8.8 Further Reading

8.9 Glossary

Chapter 9: Technical Communication and Documentation

9.1 Several Guiding Principles

9.2 Slide Decks

9.3 Written Reports

9.4 Speaking: What Has Worked for Me

9.5 Code Documentation

9.6 Further Reading

9.7 Glossary

Part II: Stuff You Still Need to Know

Chapter 10: Unsupervised Learning: Clustering and Dimensionality Reduction

10.1 The Curse of Dimensionality

10.2 Example: Eigenfaces for Dimensionality Reduction

10.3 Principal Component Analysis and Factor Analysis

10.4 Skree Plots and Understanding Dimensionality

10.5 Factor Analysis

10.6 Limitations of PCA

10.7 Clustering

10.8 Further Reading

10.9 Glossary

Chapter 11: Regression

11.1 Example: Predicting Diabetes Progression

11.2 Least Squares

11.3 Fitting Nonlinear Curves

11.4 Goodness of Fit: R² and Correlation

11.5 Correlation of Residuals

11.6 Linear Regression

11.7 LASSO Regression and Feature Selection

11.8 Further Reading

11.9 Glossary

Chapter 12: Data Encodings and File Formats

12.1 Typical File Format Categories

12.2 CSV Files

12.3 JSON Files

12.4 XML Files

12.5 HTML Files

12.6 Tar Files

12.7 GZip Files

12.8 Zip Files

12.9 Image Files: Rasterized, Vectorized, and/or Compressed

12.10 It’s All Bytes at the End of the Day

12.11 Integers

12.12 Floats

12.13 Text Data

12.14 Further Reading

12.15 Glossary

Chapter 13: Big Data

13.1 What Is Big Data?

13.2 Hadoop: The File System and the Processor

13.3 Using HDFS

13.4 Example PySpark Script

13.5 Spark Overview

13.6 Spark Operations

13.7 Two Ways to Run PySpark

13.8 Configuring Spark

13.9 Under the Hood

13.10 Spark Tips and Gotchas

13.11 The MapReduce Paradigm

13.12 Performance Considerations

13.13 Further Reading

13.14 Glossary

Chapter 14: Databases

14.1 Relational Databases and MySQL®

14.2 Key-Value Stores

14.3 Wide Column Stores

14.4 Document Stores

14.5 Further Reading

14.6 Glossary

Chapter 15: Software Engineering Best Practices

15.1 Coding Style

15.2 Version Control and Git for Data Scientists

15.3 Testing Code

15.4 Test-Driven Development

15.5 AGILE Methodology

15.6 Further Reading

15.7 Glossary

Chapter 16: Natural Language Processing

16.1 Do I Even Need NLP?

16.2 The Great Divide: Language versus Statistics

16.3 Example: Sentiment Analysis on Stock Market Articles

16.4 Software and Datasets

16.5 Tokenization

16.6 Central Concept: Bag-of-Words

16.7 Word Weighting: TF-IDF

16.8 n-Grams

16.9 Stop Words

16.10 Lemmatization and Stemming

16.11 Synonyms

16.12 Part of Speech Tagging

16.13 Common Problems

16.14 Advanced NLP: Syntax Trees, Knowledge, and Understanding

16.15 Further Reading

16.16 Glossary

Chapter 17: Time Series Analysis

17.1 Example: Predicting Wikipedia Page Views

17.2 A Typical Workflow

17.3 Time Series versus Time-Stamped Events

17.4 Resampling and Interpolation

17.5 Smoothing Signals

17.6 Logarithms and Other Transformations

17.7 Trends and Periodicity

17.8 Windowing

17.9 Brainstorming Simple Features

17.10 Better Features: Time Series as Vectors

17.11 Fourier Analysis: Sometimes a Magic Bullet

17.12 Time Series in Context: The Whole Suite of Features

17.13 Further Reading

17.14 Glossary

Chapter 18: Probability

18.1 Flipping Coins: Bernoulli Random Variables

18.2 Throwing Darts: Uniform Random Variables

18.3 The Uniform Distribution and Pseudorandom Numbers

18.4 Nondiscrete, Noncontinuous Random Variables

18.5 Notation, Expectations, and Standard Deviation

18.6 Dependence, Marginal and Conditional Probability

18.7 Understanding the Tails

18.8 Binomial Distribution

18.9 Poisson Distribution

18.10 Normal Distribution

18.11 Multivariate Gaussian

18.12 Exponential Distribution

18.13 Log-Normal Distribution

18.14 Entropy

18.15 Further Reading

18.16 Glossary

Chapter 19: Statistics

19.1 Statistics in Perspective

19.2 Bayesian versus Frequentist: Practical Tradeoffs and Differing Philosophies

19.3 Hypothesis Testing: Key Idea and Example

19.4 Multiple Hypothesis Testing

19.5 Parameter Estimation

19.6 Hypothesis Testing: t-Test

19.7 Confidence Intervals

19.8 Bayesian Statistics

19.9 Naive Bayesian Statistics

19.10 Bayesian Networks

19.11 Choosing Priors: Maximum Entropy or Domain Knowledge

19.12 Further Reading

19.13 Glossary

Chapter 20: Programming Language Concepts

20.1 Programming Paradigms

20.2 Compilation and Interpretation

20.3 Type Systems

20.4 Further Reading

20.5 Glossary

Chapter 21: Performance and Computer Memory

21.1 Example Script

21.2 Algorithm Performance and Big-O Notation

21.3 Some Classic Problems: Sorting a List and Binary Search

21.4 Amortized Performance and Average Performance

21.5 Two Principles: Reducing Overhead and Managing Memory

21.6 Performance Tip: Use Numerical Libraries When Applicable

21.7 Performance Tip: Delete Large Structures You Don’t Need

21.8 Performance Tip: Use Built-In Functions When Possible

21.9 Performance Tip: Avoid Superfluous Function Calls

21.10 Performance Tip: Avoid Creating Large New Objects

21.11 Further Reading

21.12 Glossary

Part III: Specialized or Advanced Topics

Chapter 22: Computer Memory and Data Structures

22.1 Virtual Memory, the Stack, and the Heap

22.2 Example C Program

22.3 Data Types and Arrays in Memory

22.4 Structs

22.5 Pointers, the Stack, and the Heap

22.6 Key Data Structures

22.7 Further Reading

22.8 Glossary

Chapter 23: Maximum Likelihood Estimation and Optimization

23.1 Maximum Likelihood Estimation

23.2 A Simple Example: Fitting a Line

23.3 Another Example: Logistic Regression

23.4 Optimization

23.5 Gradient Descent and Convex Optimization

23.6 Convex Optimization

23.7 Stochastic Gradient Descent

23.8 Further Reading

23.9 Glossary

Chapter 24: Advanced Classifiers

24.1 A Note on Libraries

24.2 Basic Deep Learning

24.3 Convolutional Neural Networks

24.4 Different Types of Layers. What the Heck Is a Tensor?

24.5 Example: The MNIST Handwriting Dataset

24.6 Recurrent Neural Networks

24.7 Bayesian Networks

24.8 Training and Prediction

24.9 Markov Chain Monte Carlo

24.10 PyMC Example

24.11 Further Reading

24.12 Glossary

Chapter 25: Stochastic Modeling

25.1 Markov Chains

25.2 Two Kinds of Markov Chain, Two Kinds of Questions

25.3 Markov Chain Monte Carlo

25.4 Hidden Markov Models and the Viterbi Algorithm

25.5 The Viterbi Algorithm

25.6 Random Walks

25.7 Brownian Motion

25.8 ARIMA Models

25.9 Continuous-Time Markov Processes

25.10 Poisson Processes

25.11 Further Reading

25.12 Glossary

Parting Words: Your Future as a Data Scientist

Index


