Question

[USING RSTUDIO] I am already having trouble with step 3 of number 1. I'm not sure why what I typed isn't working. I'd appreciate help from anybody who can help me out. Thanks.

1. Recall the `iris` data set from last week's exercise. The `iris` data set is already pre-loaded in R - look at the help file using `?iris` for more information on this data set.   
i) Check the structure of the data using the function `str(iris)`.   
ii) Find the average (or mean) measurement of the variable `Sepal.Length`. Do this in two ways as described in the lesson.   
iii) Find the average `Sepal.Length` for the different flower `Species`. Give a brief comment on the averages.
iv) Repeat (ii) and (iii) but use the summary standard deviation `sd()` which describes the spread of the variable.   
v) Describe the shape of the variable `Sepal.Length` by creating a histogram using `histogram()`. Write your description outside the code chunk.
vi) Compare the `Sepal.Length` of the three species of flowers by creating a side-by-side boxplot using `bwplot()`. Write your description outside the code chunk.

### Code chunk
```{r}
# Insert your code for this question after this line
#1.
#i)
str(iris)
#ii)
mean(iris$Sepal.Length)
#iii)
mean(Sepal.Length ~ Species, data = iris)
# last R code line
```

2. The data set `MLB-TeamBatting-S16.csv` contains MLB Team Batting Data for selected variables. Load the data set from the given url using the code below. This data set was obtained from [Baseball Reference](https://www.baseball-reference.com/leagues/MLB/2016-standard-batting.shtml).
* Tm - Team   
* Lg - League: American League (AL), National League (NL)
* BatAge - Batters' average age
* RPG - Runs Scored Per Game
* G - Games Played or Pitched
* AB - At Bats
* R - Runs Scored/Allowed
* H - Hits/Hits Allowed
* HR - Home Runs Hit/Allowed
* RBI - Runs Batted In
* SO - Strikeouts
* BA - Hits/At Bats
* SH - Sacrifice Hits (Sacrifice Bunts)
* SF - Sacrifice Flies

i) Find the average measurement for the following variables `BatAge`, `RPG`, `R`, `H` and `BA`.   
ii) Create dotplots or histograms for each variable in (i).   
iii) Using your own words, describe the distribution of each variable in (i). Write your answer outside the code chunk.   
iv) Find the average and the standard deviation of the variables `RPG`, `H` and `BA` for each league.   
v) Describe any differences or similarities between the leagues. Write your comment outside the code chunk.

### Code chunk
```{r}
# load the data set
mlb16.data <- read.csv("https://raw.githubusercontent.com/jpailden/rstatlab/master/data/MLB-TeamBatting-S16.csv")
str(mlb16.data) # check structure
head(mlb16.data) # show first six rows

# last R code line
```
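
For part (iv), one possible approach in base R is `aggregate()` with a formula. The sketch below wraps that in a small helper, `summarise_by_group()`, which is a hypothetical name introduced here and is not part of the assignment. It is demonstrated on the built-in `iris` data so that it runs without downloading anything.

```r
# Hypothetical helper (not from the assignment): mean and sd of several
# numeric columns within each level of a grouping column, base R only.
summarise_by_group <- function(data, vars, group) {
  # Build a formula such as cbind(RPG, H, BA) ~ Lg from the column names.
  f <- as.formula(paste0("cbind(", paste(vars, collapse = ", "), ") ~ ", group))
  means <- aggregate(f, data = data, FUN = mean)
  sds   <- aggregate(f, data = data, FUN = sd)
  # Label the columns so means and standard deviations stay distinguishable.
  names(means)[-1] <- paste0(vars, ".mean")
  names(sds)[-1]   <- paste0(vars, ".sd")
  # One row per group, with the means followed by the standard deviations.
  merge(means, sds, by = group)
}

# Demonstration on built-in data, so no network access is needed.
iris_summary <- summarise_by_group(iris, c("Sepal.Length", "Sepal.Width"), "Species")
print(iris_summary)
```

With the data loaded as in the chunk above, the corresponding call would be `summarise_by_group(mlb16.data, c("RPG", "H", "BA"), "Lg")`, assuming the column names in the CSV match the variable list given in the question.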

Answer #1

Question 1 (iris data set) solved completely. The R code is given below.
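
On the step (iii) problem itself: the formula form `mean(Sepal.Length ~ Species, data = iris)` is not base R. It is provided by the `mosaic` package, so the likely fix is to run `library(mosaic)` before that line. That the course uses mosaic is an assumption here, based on the tasks also naming `histogram()` and `bwplot()`. Without the package, base `mean()` does not understand the formula. The sketch below uses only base R to show what goes wrong, along with two base equivalents that give the per-species means.

```r
# Base R only: no packages are loaded, which mirrors the situation in the question.

# Without mosaic, mean() receives a formula instead of numbers.
# suppressWarnings() hides the "argument is not numeric or logical" warning.
base_result <- suppressWarnings(base::mean(Sepal.Length ~ Species, data = iris))
print(base_result)  # not a per-species table

# Equivalent 1: tapply() splits Sepal.Length by Species and applies mean().
means_tapply <- tapply(iris$Sepal.Length, iris$Species, mean)
print(means_tapply)

# Equivalent 2: aggregate() accepts a formula and returns a data frame.
means_aggregate <- aggregate(Sepal.Length ~ Species, data = iris, FUN = mean)
print(means_aggregate)
```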

i) Check the structure of the data using the function `str(iris)`.

    data_iris <- iris
    str(data_iris)
    ## 'data.frame':    150 obs. of  5 variables:
    ##  $ Sepal.Length: num  5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
    ##  $ Sepal.Width : num  3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
    ##  $ Petal.Length: num  1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
    ##  $ Petal.Width : num  0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
    ##  $ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...

ii) Find the average (or mean) measurement of the variable `Sepal.Length`. Do this in two ways as described in the lesson.

    summary(data_iris)
    ##   Sepal.Length    Sepal.Width     Petal.Length    Petal.Width          Species
    ##  Min.   :4.300   Min.   :2.000   Min.   :1.000   Min.   :0.100   setosa    :50
    ##  1st Qu.:5.100   1st Qu.:2.800   1st Qu.:1.600   1st Qu.:0.300   versicolor:50
    ##  Median :5.800   Median :3.000   Median :4.350   Median :1.300   virginica :50
    ##  Mean   :5.843   Mean   :3.057   Mean   :3.758   Mean   :1.199
    ##  3rd Qu.:6.400   3rd Qu.:3.300   3rd Qu.:5.100   3rd Qu.:1.800
    ##  Max.   :7.900   Max.   :4.400   Max.   :6.900   Max.   :2.500

    mean(data_iris$Sepal.Length)
    ## [1] 5.843333

(I am not sure which two ways were shown in class; there are many ways to obtain the mean of a variable.)

iii) Find the average `Sepal.Length` for the different flower `Species`. Give a brief comment on the averages.

    aggregate(Sepal.Length ~ Species, data_iris, mean)
    ##      Species Sepal.Length
    ## 1     setosa        5.006
    ## 2 versicolor        5.936
    ## 3  virginica        6.588

The mean `Sepal.Length` increases from setosa to versicolor to virginica.

iv) Repeat (ii) and (iii) but use the summary standard deviation `sd()`, which describes the spread of the variable.

    aggregate(Sepal.Length ~ Species, data_iris, sd)
    ##      Species Sepal.Length
    ## 1     setosa    0.3524897
    ## 2 versicolor    0.5161711
    ## 3  virginica    0.6358796

The standard deviations follow the same trend as the means: the greater the mean sepal length of a species, the greater its spread.

v) Describe the shape of the variable `Sepal.Length` by creating a histogram using `histogram()`. Write your description outside the code chunk.

    ?hist  # open the help page for hist()
    hist(data_iris$Sepal.Length, xlab = "Sepal.Length", ylab = "Frequency",
         main = "Distribution of the Sepal Length")

[Figure: histogram titled "Distribution of the Sepal Length"; x-axis Sepal.Length, y-axis Frequency.]

The `Sepal.Length` values are not perfectly normally distributed. The distribution is slightly skewed to the right: the mean (5.843) is a little larger than the median (5.800).

vi) Compare the `Sepal.Length` of the three species of flowers by creating a side-by-side boxplot using `bwplot()`. Write your description outside the code chunk.

    boxplot(Sepal.Length ~ Species, data_iris, xlab = "Species",
            ylab = "Sepal Length", main = "Boxplot of the Sepal length by Species")

[Figure: boxplot titled "Boxplot of the Sepal length by Species"; x-axis Species (setosa, versicolor, virginica), y-axis Sepal Length.]
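
The tasks ask for `histogram()` and `bwplot()`, whereas the answer above uses base `hist()` and `boxplot()`. Both requested functions come from the `lattice` package, which mosaic normally attaches. A minimal sketch follows, assuming `lattice` is installed (it ships with standard R distributions) and that the course expects the lattice versions. The object names `sepal_hist` and `sepal_box` are chosen here for illustration.

```r
library(lattice)  # provides histogram() and bwplot()

# Histogram of Sepal.Length; the one-sided formula names the variable to bin.
sepal_hist <- histogram(~ Sepal.Length, data = iris,
                        xlab = "Sepal length",
                        main = "Distribution of Sepal.Length")

# Side-by-side boxplots: response on the left of ~, grouping factor on the right.
sepal_box <- bwplot(Sepal.Length ~ Species, data = iris,
                    xlab = "Species", ylab = "Sepal length",
                    main = "Sepal.Length by Species")

# Lattice functions return "trellis" objects; print() draws them.
print(sepal_hist)
print(sepal_box)
```

Inside an R Markdown chunk, calling `histogram()` or `bwplot()` at the top level draws the plot without an explicit `print()`; the explicit calls are kept so the sketch also works when sourced as a script.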
