Data Science in R: A Case Studies Approach to Computational Reasoning and Problem Solving, 1st Edition, by Deborah Nolan

Product details:
ISBN 10: 1498759874
ISBN 13: 9781498759878
Author: Deborah Nolan
Effectively Access, Transform, Manipulate, Visualize, and Reason about Data and Computation
Data Science in R: A Case Studies Approach to Computational Reasoning and Problem Solving illustrates the details involved in solving real computational problems encountered in data analysis. It reveals the dynamic and iterative process by which data analysts
Table of contents:
Part I Data Manipulation and Modeling
Chapter 1 Predicting Location via Indoor Positioning Systems
1.1 Introduction
1.1.1 Computational Topics
1.2 The Raw Data
1.2.1 Processing the Raw Data
1.3 Cleaning the Data and Building a Representation for Analysis
1.3.1 Exploring Orientation
1.3.2 Exploring MAC Addresses
1.3.3 Exploring the Position of the Hand-Held Device
1.3.4 Creating a Function to Prepare the Data
1.4 Signal Strength Analysis
1.4.1 Distribution of Signal Strength
1.4.2 The Relationship between Signal and Distance
1.5 Nearest Neighbor Methods to Predict Location
1.5.1 Preparing the Test Data
1.5.2 Choice of Orientation
1.5.3 Finding the Nearest Neighbors
1.5.4 Cross-Validation and Choice of k
1.6 Exercises
Bibliography
Chapter 2 Modeling Runners’ Times in the Cherry Blossom Race
2.1 Introduction
2.1.1 Computational Topics
2.2 Reading Tables of Race Results into R
2.3 Data Cleaning and Reformatting Variables
2.4 Exploring the Run Time for All Male Runners
2.4.1 Making Plots with Many Observations
2.4.2 Fitting Models to Average Performance
2.4.3 Cross-Sectional Data and Covariates
2.5 Constructing a Record for an Individual Runner across Years
2.6 Modeling the Change in Running Time for Individuals
2.7 Scraping Race Results from the Web
2.8 Exercises
Bibliography
Chapter 3 Using Statistics to Identify Spam
3.1 Introduction
3.1.1 Computational Topics
3.2 Anatomy of an email Message
3.3 Reading the email Messages
3.4 Text Mining and Naïve Bayes Classification
3.5 Finding the Words in a Message
3.5.1 Splitting the Message into Its Header and Body
3.5.2 Removing Attachments from the Message Body
3.5.3 Extracting Words from the Message Body
3.5.4 Completing the Data Preparation Process
3.6 Implementing the Naïve Bayes Classifier
3.6.1 Test and Training Data
3.6.2 Probability Estimates from Training Data
3.6.3 Classifying New Messages
3.6.4 Computational Considerations
3.7 Recursive Partitioning and Classification Trees
3.8 Organizing an email Message into an R Data Structure
3.8.1 Processing the Header
3.8.2 Processing Attachments
3.8.3 Testing Our Code on More email Data
3.8.4 Completing the Process
3.9 Deriving Variables from the email Message
3.9.1 Checking Our Code for Errors
3.10 Exploring the email Feature Set
3.11 Fitting the rpart() Model to the email Data
3.12 Exercises
Bibliography
Chapter 4 Processing Robot and Sensor Log Files: Seeking a Circular Target
4.1 Description
4.1.1 Computational Topics
4.2 The Data
4.2.1 Reading an Entire Log File
4.2.2 Exploring Log Files
4.2.3 Visualizing the Path
4.2.4 Exploring a “Look”
4.2.5 The Error Distribution for Range Values
4.3 Detecting a Circular Target
4.3.1 Connecting Segments Behind the Robot
4.3.2 Determining If a Segment Corresponds to a Circle
4.4 Detecting the Target with Streaming Data in Real Time
Bibliography
Chapter 5 Strategies for Analyzing a 12-Gigabyte Data Set: Airline Flight Delays
5.1 Introduction
5.1.1 Computational Topics
5.2 Acquiring the Airline Data Set
5.3 Computing with Massive Data: Getting Flight Delay Counts
5.3.1 The R Programming Environment
5.3.2 The UNIX Shell
5.3.3 An SQL Database with R
5.3.4 The bigmemory Package with R
5.4 Explorations Using Parallel Computing: The Distribution of Flight Delays
5.4.1 Writing a Parallelizable Loop with foreach
5.4.2 Using the Split-Apply-Combine Approach for Better Performance
5.4.3 Using Split-Apply-Combine to Find the Best Time to Fly
5.5 From Exploration to Model: Do Older Planes Suffer Greater Delays?
Bibliography
Part II Simulation Studies
Chapter 6 Pairs Trading
6.1 The Problem
6.1.1 Computational Topics
6.2 The Data Format
6.3 Reading the Financial Data
6.4 Visualizing the Time Series
6.5 Finding Opening and Closing Positions
6.5.1 Identifying a Position
6.5.2 Displaying Positions
6.5.3 Finding All Positions
6.5.4 Computing the Profit for a Position
6.5.5 Finding the Optimal Value for k
6.6 Simulation Study
6.6.1 Simulating the Stock Price Series
6.6.2 Making stockSim() Faster
Bibliography
Chapter 7 Simulation Study of a Branching Process
7.1 Introduction
7.1.1 The Monte Carlo Method
7.1.2 Computational Topics
7.2 Exploring the Random Process
7.3 Generating Offspring
7.3.1 Checking the Results
7.3.2 Considering Alternative Implementations
7.4 Profiling and Improving Our Code
7.5 From One Job’s Offspring to an Entire Generation
7.6 Unit Testing
7.7 A Structure for the Function’s Return Value
7.8 The Family Tree: Simulating the Branching Process
7.9 Replicating the Simulation
7.9.1 Analyzing the Simulation Results
7.10 Exercises
Bibliography
Chapter 8 A Self-Organizing Dynamic System with a Phase Transition
8.1 Introduction and Motivation
8.1.1 Computational Topics
8.2 The Model
8.2.1 The Order Cars Move
8.3 Implementing the BML Model
8.3.1 Creating the Initial Grid Configuration
8.3.2 Testing the Grid Creation Function
8.3.3 Displaying the Grid
8.3.4 Visualizing the Grid
8.3.5 Simple and Convenient Object-Oriented Programming
8.3.6 Moving the Cars
8.4 Evaluating the Performance of the Code
8.5 Implementing the BML Model in C
8.5.1 The Algorithm in C
8.5.2 Compiling, Loading, and Calling the C Code
8.6 Running the Simulations
8.6.1 Exploring Car Velocity
8.7 Experimental Compilation
Bibliography
Chapter 9 Simulating Blackjack
9.1 Introduction
9.1.1 Computational Topics
9.2 Blackjack Basics
9.2.1 Testing Functions
9.3 Playing a Hand of Blackjack
9.3.1 Creating Functions for the Player’s Actions
9.4 Strategies for Playing
9.4.1 Developing the Optimal Strategy
9.5 Playing Many Games
9.6 A More Accurate Card Dealer Shoe
9.7 Counting Cards
9.8 Putting It All Together
9.9 Exercises
Bibliography
Part III Data and Web Technologies
Chapter 10 Baseball: Exploring Data in a Relational Database
10.1 Introduction
10.1.1 Computational Topics
10.2 Sean Lahman’s Database
10.2.1 Connecting to the Baseball Database from within R
10.3 Aggregating Salaries into Payroll
10.4 Merging Payroll Data with Information in Other Tables
10.4.1 Adding Team Names to the Payroll Data
10.4.2 Adding World Series Records to the Payroll Data
10.5 Exploring the Extreme Salaries
10.6 Exercises
Bibliography
Chapter 11 CIA Factbook Mashup
11.1 Introduction
11.1.1 Computational Topics
11.2 Acquiring the Data
11.2.1 Extracting Latitude and Longitude from a CSV File
11.3 Integrating Data from Different Sources
11.4 Preparing the Data for Plotting
11.4.1 Redoing the Merge of the Factbook and Location Data
11.5 Plotting with Google Earth™
11.6 Extracting Demographic Information from the CIA XML File
11.7 Generating KML Directly
11.8 Additional Computational Tasks
11.8.1 Creating Plotting Symbols
11.8.2 Efficiency in Generating KML from Strings
11.8.3 Extracting Latitude and Longitude from an HTML File
11.9 Exercises
Bibliography
Chapter 12 Exploring Data Science Jobs with Web Scraping and Text Mining
12.1 Introduction and Motivation
12.1.1 Computational Topics
12.2 Exploring Different Web Sites
12.3 Preliminary/Exploratory Scraping: The Kaggle Job List
12.3.1 Processing the Text
12.3.2 Generalizing to Other Posts
12.3.3 Scraping the Kaggle Post List
12.4 Scraping CyberCoders.com
12.4.1 Getting the Skill List from a Job Post
12.4.2 Finding the Links to Job Postings in the Search Results
12.4.3 Finding the Next Page of Job Post Search Results
12.4.4 Putting It All Together
12.5 A Reusable Generic Framework for Arbitrary Sites
12.6 Scraping Career Builder
12.7 Scraping Monster.com
12.8 Analyzing the Results: The Important Skills
12.9 Note on Web Scraping
12.10 Exercises
Bibliography
Colophon
Tags: Deborah Nolan, Data science, problem solving, computational reasoning


