Contents
Chapter 1. Introduction to data science and data analytics
1.1 About Data Science
1.2 About the EDISON Project and Data Science Framework
1.2.1 The EDISON project
1.2.2 The EDISON Data Science Framework
1.3 About Data Analytics
1.3.1 Data Analytics Competences
1.3.2 Data Analytics Body of Knowledge
1.3.3 Data Analytics Model Curriculum Approach
1.3.4 Data Analytics Professional Profiles
1.4 About this Book
Chapter 2. Data
A. Theory
2.1 Introduction
2.2 Characteristic
2.2.1 Definition of characteristic
2.2.2 Types of characteristics
2.3 Data
2.3.1 Definition of Data
2.3.2 Types of data from their nature
2.3.3 Types of data from their storage
2.4 Available Data
2.4.1 Experiment
2.4.2 Data population
2.4.3 Data Sample
2.4.4 Data Quality
2.5 Frequency
2.5.1 Definition of frequency
2.5.2 Types of frequency
2.5.3 Frequency of grouped Data
2.5.4 Mode
2.6 Mean
2.6.1 Definition of Mean
2.6.2 Arithmetic Mean
2.6.3 Variance and Standard Deviation
2.7 Median
2.7.1 Range
2.7.2 Median
2.7.3 Quantiles
2.7.4 Quantiles range
B. Computer Based Solving
2.8 R project
2.9 R graphical user interface
2.10 Data exercises solved with R
C. Data exercises solved
2.11 Handmade exercises
2.12 Exercises solved in R
Annex. Data Extended Concepts
2.A.1 Frequency
2.A.2 Mean
Chapter 3. Probability
A. Theory
3.1 Introduction
3.2 Event
3.3 Set theory actions and operations
3.4 Laplace or classical probability
3.5 Bayesian Probability
3.6 Probability distribution of random variables
3.6.1 Random Variable
3.6.2 Probability distribution
3.6.3 Discrete probability distributions
3.6.3.1 Bernoulli Probability distribution
3.6.3.2 Binomial Probability distribution
3.6.3.3 Geometric Probability distribution
3.6.3.4 Poisson Probability distribution
3.6.4 Continuous probability distribution
3.6.4.1 Normal Distribution
3.6.4.2 Pearson chi-square distribution
3.6.4.3 Student's t distribution
3.6.4.4 Fisher's F distribution
B. Computer Based Solving
C. Probability exercises solved
3.7 Handmade exercises
3.8 Exercises solved in R
Annex. Probability extended concepts
Chapter 4. Anomaly Detection
Juan J. Cuadrado-Gallego, Yuri Demchenko, Josefa Gómez, Abdelhamid Tayebi
A. Theory
4.1 Introduction
4.2 Anomaly detection based on statistics
4.2.1 Anomaly detection based on the mean and the standard deviation
4.2.2 Anomaly detection based on the quartiles
4.2.3 Anomaly detection based on the errors of the residuals
4.3 Anomaly detection based on proximity: K-nearest neighbor algorithm
4.4 Anomaly detection based on density: simplified local outlier factor algorithm
B. Computer based solving
4.5 R packages
4.6 Anomaly detection exercises solved with R
C. Anomaly detection exercises solved
4.7 Handmade exercises
4.8 Exercises solved in R
Chapter 5. Unsupervised Classification
Juan J. Cuadrado-Gallego, Yuri Demchenko, Abdelhamid Tayebi
A. Theory
5.1 Introduction
5.2 Unsupervised classification based on distances: K-means algorithm
5.3 Agglomerative hierarchical clustering
B. Computer Based Solving
5.4 RStudio
5.5 Unsupervised classification exercises solved with R
C. Unsupervised classification exercises solved
5.6 Handmade exercises
5.7 Exercises solved in R
Chapter 6. Supervised Classification
Juan J. Cuadrado-Gallego, Yuri Demchenko, Josefa Gómez
A. Theory
6.1 Introduction
6.2 Decision tree
6.2.1 Optimizing the construction of a decision tree: ID3 Algorithm
6.2.2 Optimizing the construction of a decision tree: CART Algorithm
6.2.3 Optimizing the construction of a decision tree: Error Algorithm
6.3 Neural Network
6.4 Naïve Bayes
6.5 Regression functions
6.5.1 Linear regression of polynomial events
6.5.2 Linear regression of polynomial for three events
6.5.3 Linear regression of polynomial for K events
6.5.4 Nonlinear regression of polynomial for two events
6.5.5 Nonlinear regression of non-polynomial for two events
6.5.6 Linear regression validity analysis
B. Computer based solving
C. Supervised classification analysis exercises solved
6.6 Handmade Exercises
6.7 Exercises solved in R
Chapter 7. Association
A. Theory
7.1 Introduction
7.2 Analysis of association of events composed by a single elementary event
7.2.1 Support
7.2.2 Confidence
7.2.3 Contingency
7.2.4 Correlation
7.3 Analysis of association of events composed by more than one elementary event: Apriori algorithm
B. Computer based solving
C. Association analysis exercises solved
7.4 Handmade Exercises
7.5 Exercises solved in R