Data Set A- 7,7,7,9,9,9,10

Data Set B- 4,6,6,6,8,9,9,9,10,10,10

1. Find the sample mean, median, and mode (if it exists) for each set of data.
2. Next, find the sample standard deviation for each set of data.
3. Eliminate any outliers from the samples. Redo part 1 for each variable (if necessary; if not, explain why). Compare your two sets of results. What happened to your averages? Standard deviations? Etc.
4. Write a brief explanation of what you found in the data.

Data Set A

 x   1   2   3   4   5   6   7
 y   7   7   7   9   9   9  10

Data Set B

 x   1   2   3   4   5   6   7   8   9  10  11
 y   4   6   6   6   8   9   9   9  10  10  10

1. Using the two variables from each data set, make a scatter plot.
2. Find the equation for the least-squares line, and graph the line on the scatter plot.
3. For each set of data, find the sample correlation coefficient r and the coefficient of determination r^2. Is r significant? What criteria did you use to determine whether it was significant?
4. What predictions can you make using these data sets? Write a brief explanation of what you found in the data sets.
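
As a minimal sketch of parts 2 and 3 above, the least-squares line and the correlation coefficient can be computed from the usual sums of squares. The function name `least_squares` is hypothetical, and the t statistic t = r*sqrt(n-2)/sqrt(1-r^2) is only one common criterion for judging whether r is significant (it is compared with a t table at n-2 degrees of freedom); the exercise may intend a different criterion. Plotting is left out so that the sketch needs only the standard library.

```python
import math

def least_squares(x, y):
    """Return (slope, intercept, r, r_squared, t) for paired data x, y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products about the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    # t statistic for testing r against zero (assumed criterion, n - 2 df).
    t = r * math.sqrt(n - 2) / math.sqrt(1 - r * r)
    return slope, intercept, r, r * r, t

# Data copied from the tables above.
x_a = list(range(1, 8))
y_a = [7, 7, 7, 9, 9, 9, 10]
x_b = list(range(1, 12))
y_b = [4, 6, 6, 6, 8, 9, 9, 9, 10, 10, 10]

for name, x, y in (("A", x_a, y_a), ("B", x_b, y_b)):
    slope, intercept, r, r2, t = least_squares(x, y)
    print(f"{name}: y = {intercept:.4f} + {slope:.4f} x, "
          f"r = {r:.4f}, r^2 = {r2:.4f}, t = {t:.4f}")
```

A fitted line of this kind can be used to predict y only for x values inside the observed range (1 to 7 for A, 1 to 11 for B).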

A= c(7,7,7,9,9,9,10)
B= c(4,6,6,6,8,9,9,9,10,10,10)
mean(A) = (sum of the observations)/7 = 58/7 = 8.285714

standard deviation(A) = 1.253566 (sample standard deviation, dividing by n-1 = 6)
median(A) = 9 (the middle observation, i.e. the 4th of the 7 sorted values)
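
The figures for the mean, median and sample standard deviation can be cross-checked with a short sketch using Python's standard `statistics` module. The helper name `summarize` is hypothetical; `statistics.stdev` uses the sample formula (division by n-1), which is what the exercise asks for.

```python
import statistics

# Data copied from the exercise.
A = [7, 7, 7, 9, 9, 9, 10]
B = [4, 6, 6, 6, 8, 9, 9, 9, 10, 10, 10]

def summarize(data):
    """Return (mean, median, sample standard deviation) of a list of numbers."""
    return (
        statistics.mean(data),
        statistics.median(data),
        statistics.stdev(data),  # sample formula: divides by n - 1
    )

for name, data in (("A", A), ("B", B)):
    mean, median, sd = summarize(data)
    print(f"{name}: mean = {mean:.6f}, median = {median}, sd = {sd:.6f}")
```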

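mode(A) = 7 and 9 (each occurs three times, so data set A is bimodal). The empirical relation mode = 3 median - 2 mean gives 10.42857, which is larger than every observation; that relation is only a rough approximation for moderately skewed data with a single mode, so it is not used here.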
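
Because both samples are small and discrete, the mode can be read directly from the frequency counts rather than estimated from the mean and median. A minimal sketch, assuming Python 3.8 or later for `statistics.multimode`:

```python
from collections import Counter
from statistics import multimode

# Data copied from the exercise.
A = [7, 7, 7, 9, 9, 9, 10]
B = [4, 6, 6, 6, 8, 9, 9, 9, 10, 10, 10]

for name, data in (("A", A), ("B", B)):
    counts = Counter(data)    # value -> number of occurrences
    modes = multimode(data)   # every value tied for the highest count
    print(name, dict(counts), "modes:", modes)
```

For these data the most frequent values are 7 and 9 in A, and 6, 9 and 10 in B, each occurring three times.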

Q1(A)=7 (the 25th percentile, i.e. the 2nd observation)

Q3(A)=9 (the 75th percentile, i.e. the 6th observation)

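QD(A)={Q3(A)-Q1(A)}/2=(9-7)/2=1

inner fences(A), taking 1.5 x QD(A) = 1.5 on either side of the quartiles, are {Q3(A)+1.5, Q1(A)-1.5} i.e. {10.5, 5.5}

i.e. the outlier fences for data set A are (5.5, 10.5). (The more common convention of 1.5 x IQR = 3 gives the wider fences (4, 12), which lead to the same conclusion.)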

Now data set A= c(7,7,7,9,9,9,10)

Clearly, no observation in data set A lies outside these fences.
Similarly, for data set B:

mean(B)=7.909091
standard deviation(B)= 2.071451
median(B)= 9

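mode(B) = 6, 9 and 10 (each occurs three times, so data set B has no single mode). The empirical relation mode = 3 median - 2 mean gives 11.18182, which is larger than every observation, so it is not a usable estimate here.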

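Q1(B)=6 (the 25th percentile, i.e. the 3rd of the 11 observations)

Q3(B)=10 (the 75th percentile, i.e. the 9th observation)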

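QD(B)={Q3(B)-Q1(B)}/2=(10-6)/2 =2

inner fences(B), taking 1.5 x QD(B) = 3 on either side of the quartiles, are {Q3(B)+3, Q1(B)-3} i.e. {13, 3}

i.e. the outlier fences for data set B are (3, 13). (The more common convention of 1.5 x IQR = 6 gives the wider fences (0, 16), which lead to the same conclusion.)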

B= c(4,6,6,6,8,9,9,9,10,10,10)

Clearly, no observation in data set B lies outside the fences (3, 13).
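
The quartile and fence calculations can be sketched as follows. The quartile positions used above (2nd and 6th of 7 observations, 3rd and 9th of 11) correspond to the (n+1)p rule, which is what `statistics.quantiles` with `method="exclusive"` computes. The function names `fences` and `outliers` and the `use_qd` switch are hypothetical: `use_qd=True` takes 1.5 times the quartile deviation on either side of the quartiles, and `use_qd=False` takes 1.5 times the full interquartile range.

```python
from statistics import quantiles

# Data copied from the exercise.
A = [7, 7, 7, 9, 9, 9, 10]
B = [4, 6, 6, 6, 8, 9, 9, 9, 10, 10, 10]

def fences(data, k=1.5, use_qd=True):
    """Return (lower, upper) outlier fences around the quartiles."""
    q1, _, q3 = quantiles(data, n=4, method="exclusive")
    iqr = q3 - q1
    spread = iqr / 2 if use_qd else iqr  # quartile deviation or full IQR
    return q1 - k * spread, q3 + k * spread

def outliers(data, **kwargs):
    """Return the observations lying outside the fences."""
    lower, upper = fences(data, **kwargs)
    return [x for x in data if x < lower or x > upper]

for name, data in (("A", A), ("B", B)):
    print(name, "QD fences:", fences(data), "outliers:", outliers(data))
    print(name, "IQR fences:", fences(data, use_qd=False),
          "outliers:", outliers(data, use_qd=False))
```

Under either rule both lists of outliers come back empty, so no observation is removed from either sample.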

So no outlier is found in either data set A or data set B. Nothing is removed, so the mean, standard deviation, median and quartiles all remain the same as calculated above.

