Introduction to R for Data Science :: Session 6 [Linear Regression Model in R + EDA, and Normality Tests]


Welcome to Introduction to R for Data Science, Session 6: Linear Regression + EDA, and Normality Tests.

The course is co-organized by Data Science Serbia and Startit. You will find all course material (R scripts, data sets, SlideShare presentations, readings) on these pages. Check out the Course Overview to access the learning material presented thus far.

Data Science Serbia Course Pages [in Serbian]
Startit Course Pages [in Serbian]

Lecturers

dipl. ing Branko Kovač, Data Analyst at CUBE, Data Science Mentor at Springboard, Data Science Serbia
Goran S. Milovanović, PhD, DataScientist@DiploFoundation, Data Science Serbia

Summary of Session 6, 02. June 2016 :: Linear Regression + EDA, and Normality tests

Linear Regression in R: Exploratory Data Analysis, assumptions of the simple linear model, correlation, and visualization. Predictions from the linear model. Confidence Intervals and Residuals. Inspecting the basic linear model. Influential cases and the Influence Plot.

Session 6 SlideShare
Session 6 R Script
Readings for Session 7

Intro to R for Data Science SlideShare :: Session 6

Introduction to R for Data Science :: Session 6 [Linear Regression in R] from Goran S. Milovanovic

R script :: Session 6

########################################################
# Introduction to R for Data Science
# SESSION 6 :: 2 June, 2016
# Simple Linear Regression in R
# Data Science Community Serbia + Startit
# :: Goran S. Milovanović and Branko Kovač ::
########################################################

# clear
rm(list=ls())

#### read data
library(datasets)
data(iris)

### iris data set description:
# https://stat.ethz.ch/R-manual/R-devel/library/datasets/html/iris.html

### Exploratory Data Analysis (EDA)
str(iris)
summary(iris)

### EDA plots
# plot layout: 2 x 2
par(mfcol = c(2,2))
# boxplot: iris$Sepal.Length
boxplot(iris$Sepal.Length, horizontal = TRUE, xlab = "Sepal Length")
# histogram: iris$Sepal.Length
hist(iris$Sepal.Length, main = "", xlab = "Sepal Length", prob = TRUE)
# overlay iris$Sepal.Length density function over the empirical distribution
lines(density(iris$Sepal.Length), lty = "dashed", lwd = 2.5, col = "red")
# boxplot: iris$Petal.Length
boxplot(iris$Petal.Length, horizontal = TRUE, xlab = "Petal Length")
# histogram: iris$Petal.Length
hist(iris$Petal.Length, main = "", xlab = "Petal Length", prob = TRUE)
# overlay iris$Petal.Length density function over the empirical distribution
lines(density(iris$Petal.Length), lty = "dashed", lwd = 2.5, col = "red")

# NOTE: Boxplot "fences" and outlier detection
# With the default range = 1.5, boxplot() in R draws as outliers those data points that are found beyond the INNER fences
# Source: http://www.itl.nist.gov/div898/handbook/prc/section1/prc16.htm
# Q3 = 75th percentile, Q1 = 25th percentile
# IQ = Q3 - Q1; interquartile range
# lower inner fence: Q1 - 1.5*IQ
# upper inner fence: Q3 + 1.5*IQ
# lower outer fence: Q1 - 3*IQ
# upper outer fence: Q3 + 3*IQ
# A point beyond an inner fence on either side is considered a mild outlier
# A point beyond an outer fence is considered an extreme outlier

# plot variable density in general: Sepal Width
# plot layout
par(mfcol = c(1,2))
# NOTE: this is kernel density estimation in R. You are not testing any distribution yet.
PLengthDensity
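The fence rules in the boxplot note can be checked numerically. What follows is a minimal sketch and not part of the original session script. It assumes quartiles computed with quantile(), whereas boxplot() relies on the hinges from boxplot.stats(), which can differ slightly on some data. It uses iris$Sepal.Width, chosen here only because it has a few points far from the box. All variable names are illustrative.

```r
# Sketch (not from the original script): compute boxplot fences by hand
library(datasets)
data(iris)
x <- iris$Sepal.Width

q1 <- unname(quantile(x, 0.25))   # Q1 = 25th percentile
q3 <- unname(quantile(x, 0.75))   # Q3 = 75th percentile
iq <- q3 - q1                     # interquartile range

# inner fences: 1.5 * IQ beyond the quartiles
innerFences <- c(lower = q1 - 1.5 * iq, upper = q3 + 1.5 * iq)
# outer fences: 3 * IQ beyond the quartiles
outerFences <- c(lower = q1 - 3 * iq, upper = q3 + 3 * iq)

# extreme outliers: beyond an outer fence
isExtreme <- x < outerFences["lower"] | x > outerFences["upper"]
# mild outliers: beyond an inner fence, but not beyond an outer fence
isMild <- (x < innerFences["lower"] | x > innerFences["upper"]) & !isExtreme

mildOutliers <- x[isMild]
extremeOutliers <- x[isExtreme]

# points that boxplot() draws individually under its default range = 1.5
boxplotOutliers <- boxplot.stats(x)$out

print(innerFences)
print(outerFences)
print(sort(mildOutliers))
print(extremeOutliers)
print(sort(boxplotOutliers))
```

Comparing sort(mildOutliers) with sort(boxplotOutliers) shows whether the hand-computed inner fences agree with what boxplot() plots for this variable. Passing range = 3 to boxplot() moves the whiskers out to the outer-fence rule instead.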