2) First we fit a linear regression with circumference as the response and age as the predictor. Then we compute the residuals from the fitted model and find the largest absolute residual, together with the corresponding response value and fitted value. The R code is given below:
d=Orange
y=d$circumference #response
x=d$age #predictor
l=lm(y~x) #fitted model
summary(l)
y_hat=predict(l) #predicted values of y
res=abs(y-y_hat) # absolute residuals
maxres=max(res) #maximum residual
y_maxres=y[which(res==maxres)] #corresponding response
y_hat_maxres=y_hat[which(res==maxres)] #corresponding fitted response
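3) Here we have to find a 90% CI, so the level is alpha = 0.1. Let the linear regression model be denoted by
$y_i = \beta_0 + \beta_1 x_i + \epsilon_i, \quad i = 1, \dots, n,$
where $\epsilon_i \sim N(0, \sigma^2)$ independently. Let the true mean circumference at age $x_0$ be denoted by $\mu_0 = \beta_0 + \beta_1 x_0$. The fitted mean circumference is then given by
$\hat{y}_0 = \hat{\beta}_0 + \hat{\beta}_1 x_0.$
So $E(\hat{y}_0) = \mu_0$. Now,
$V(\hat{\beta}_0) = \sigma^2 \left( \frac{1}{n} + \frac{\bar{x}^2}{S_{xx}} \right)$,
$V(\hat{\beta}_1) = \frac{\sigma^2}{S_{xx}}$,
$Cov(\hat{\beta}_0, \hat{\beta}_1) = -\frac{\sigma^2 \bar{x}}{S_{xx}}$.
Putting these values into $V(\hat{y}_0) = V(\hat{\beta}_0) + x_0^2 V(\hat{\beta}_1) + 2 x_0 Cov(\hat{\beta}_0, \hat{\beta}_1)$ we get
$V(\hat{y}_0) = \sigma^2 \left( \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}} \right).$
Therefore
$\frac{\hat{y}_0 - \mu_0}{\sqrt{\sigma^2 \left( \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}} \right)}} \sim N(0, 1)$
and, independently of $\hat{y}_0$,
$\frac{SS_{res}}{\sigma^2} \sim \chi^2_{n-2}.$
So, with $MS_{res} = SS_{res}/(n-2)$,
$T = \frac{\hat{y}_0 - \mu_0}{\sqrt{MS_{res} \left( \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}} \right)}} \sim t_{n-2}.$
The 90% CI can be obtained from the following equation
$P\left( -t_{0.05;n-2} \le T \le t_{0.05;n-2} \right) = 0.90,$
which will give the 90% CI as
$\hat{y}_0 \pm t_{0.05;n-2} \sqrt{MS_{res} \left( \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}} \right)}.$
Here
$S_{xx} = \sum_i (x_i - \bar{x})^2$ = 8225644, n = 35, x0 = 600,
t0.05;33 = 1.69236, xbar = 922.1429,
$MS_{res}$ = 18594.74 / 33 = 563.477,
$\hat{y}_0$ = 17.3997 + 0.1068*600 = 81.4797.
Putting these values into the formula, we get the 90% CI for the true mean circumference at age 600, which is (73.3267, 89.6327).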
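As a cross-check on the hand calculation above, the same interval can be requested from R's built-in `predict()` method for `lm` objects. This is only a sketch: the names `fit` and `ci` are illustrative and not part of the original answer. Because the text uses coefficients rounded to four decimals, the output should agree with (73.3267, 89.6327) to about the first decimal place rather than exactly.

```r
# Fit circumference on age using the built-in Orange data set
fit <- lm(circumference ~ age, data = Orange)

# 90% confidence interval for the mean circumference at age = 600
ci <- predict(fit, newdata = data.frame(age = 600),
              interval = "confidence", level = 0.90)

# Columns: fit (point estimate), lwr and upr (interval limits)
print(ci)
```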
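4) Here we need to find a 90% PI for a new observation corresponding to age = 800.
Here the new observation is
$y_0 = \beta_0 + \beta_1 x_0 + \epsilon_0$
and its predicted value is
$\hat{y}_0 = \hat{\beta}_0 + \hat{\beta}_1 x_0$. Take
$z = y_0 - \hat{y}_0.$
So
E(z)=0 and
$V(z) = V(y_0) + V(\hat{y}_0) = \sigma^2 \left( 1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}} \right)$. The variance of z is derived as before; the only extra term
here is V(y0), which is
$\sigma^2$, and it simply adds because the new observation is independent of the data used to fit the model. Similarly
we get that
$\frac{y_0 - \hat{y}_0}{\sqrt{MS_{res} \left( 1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}} \right)}} \sim t_{n-2}.$
The 90% PI can be obtained from the following equation
$P\left( -t_{0.05;n-2} \le \frac{y_0 - \hat{y}_0}{\sqrt{MS_{res} \left( 1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}} \right)}} \le t_{0.05;n-2} \right) = 0.90,$
which will give the 90% PI as
$\hat{y}_0 \pm t_{0.05;n-2} \sqrt{MS_{res} \left( 1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}} \right)},$
where the values of all quantities remain the same except
x0, which is equal to 800 here.
$\hat{y}_0$ = 17.3997 + 0.1068*800 = 102.8397
So, putting in all the other values, we get the desired 90% PI for a new observation, which is (62.0597, 143.6197).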
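The prediction interval can be cross-checked the same way, by asking `predict()` for `interval = "prediction"`. Again this is a sketch with illustrative names (`fit`, `pred_int`) that do not appear in the original answer. Small differences from (62.0597, 143.6197) in the second decimal are expected because the text rounds the coefficients.

```r
# Fit circumference on age using the built-in Orange data set
fit <- lm(circumference ~ age, data = Orange)

# 90% prediction interval for a single new tree at age = 800
pred_int <- predict(fit, newdata = data.frame(age = 800),
                    interval = "prediction", level = 0.90)

# The prediction interval is wider than a confidence interval at the
# same age because it also accounts for the new observation's own error
print(pred_int)
```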
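The calculations done in R are given below:
x_bar=mean(x) #mean age
sxx=sum((x-x_bar)^2) #Sxx, sum of squared deviations of age
n=length(y) #number of observations
tab_t=qt(0.95,n-2) #tabulated t value, t0.05;33
res1=(y-y_hat) #raw residuals
ssres=sum(res1^2) #residual sum of squares
msres=ssres/(n-2) #residual mean square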
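The interval formulas from parts 3 and 4 can also be evaluated directly from these quantities. The following is a minimal, self-contained sketch; the helper `interval_at()` and its `new_obs` argument are hypothetical names introduced here for illustration only. It uses the unrounded coefficients, so the limits may differ from the hand-computed ones in the second decimal place.

```r
# Refit the model so that this block runs on its own
x <- Orange$age
y <- Orange$circumference
fit <- lm(y ~ x)

n      <- length(y)
x_bar  <- mean(x)
sxx    <- sum((x - x_bar)^2)
ms_res <- sum(residuals(fit)^2) / (n - 2)  # residual mean square
t_crit <- qt(0.95, df = n - 2)             # two-sided 90% critical value

# Hypothetical helper: 90% interval at age x0.
# new_obs = FALSE gives the CI for the mean response,
# new_obs = TRUE adds the extra variance term for a new observation (PI).
interval_at <- function(x0, new_obs = FALSE) {
  y0_hat <- unname(coef(fit)[1] + coef(fit)[2] * x0)
  se <- sqrt(ms_res * (as.numeric(new_obs) + 1 / n + (x0 - x_bar)^2 / sxx))
  c(fit = y0_hat, lwr = y0_hat - t_crit * se, upr = y0_hat + t_crit * se)
}

ci_600 <- interval_at(600)                  # part 3
pi_800 <- interval_at(800, new_obs = TRUE)  # part 4
print(ci_600)
print(pi_800)
```

Using one helper for both cases makes it explicit that the two intervals differ only by the additional 1 inside the square root.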