This question requires RStudio. Use the following commands to install the ISLR package and load the data into R:
> install.packages("ISLR")
> library(ISLR)
> data(Wage)
With the data installed and loaded, here is a description of the dataset:
This dataset contains economic and demographic data for 3000 individuals living in the mid-Atlantic region. For each of the 3000 individuals, the following 11 variables are recorded:
year: Year that wage information was recorded
age: Age of worker
maritl: A factor with levels 1. Never Married 2. Married 3. Widowed 4. Divorced and 5. Separated indicating marital status
race: A factor with levels 1. White 2. Black 3. Asian and 4. Other indicating race
education: A factor with levels 1. < HS Grad 2. HS Grad 3. Some College 4. College Grad and 5. Advanced Degree indicating education level
region: Region of the country (mid-atlantic only)
jobclass: A factor with levels 1. Industrial and 2. Information indicating type of job
health: A factor with levels 1. <=Good and 2. >=Very Good indicating health level of worker
health_ins: A factor with levels 1. Yes and 2. No indicating whether worker has health insurance
logwage: Log of worker's wage
wage: Worker's raw wage
This question continues with the Wage dataset.
You wish to fit a multiple regression model to predict wage using year, age, and jobclass.
However, you are interested in whether the change in wage as a worker ages differs between
industrial workers and information workers. Fit the appropriate model and test the
hypothesis of interest. Include your results and your conclusion.
{Note: I understand the initial model would be
"fit=lm(wage~year+age+jobclass,data=Wage)".
The stated interest ("whether the change in wage as a worker ages differs between industrial workers and information workers") implies to me that I need to fit a second model, perhaps one that includes an interaction term, because I do not think the initial model is enough. If so, what would that second model look like?
After that, I would run an ANOVA test comparing the two models and make a decision.
That is my thinking on this question. If it is wrong, please explain how it should be done.}
Please provide all necessary R code, run in RStudio.
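The approach described in the note can be sketched as follows. This is a minimal sketch, assuming the ISLR package is installed and Wage loads as described above; the object names fit_add, fit_int, and cmp are illustrative choices, not part of the question. The formula age * jobclass expands to age + jobclass + age:jobclass, so the second model allows the age slope to differ between the two job classes.

```r
# Sketch only: assumes ISLR is installed and Wage has the columns described above.
library(ISLR)
data(Wage)

# Additive model from the note: a single age slope shared by both job classes.
fit_add <- lm(wage ~ year + age + jobclass, data = Wage)

# Interaction model: the age:jobclass term lets the age slope differ by job class.
fit_int <- lm(wage ~ year + age * jobclass, data = Wage)

# H0: the age:jobclass coefficient is 0 (same age slope for both job classes).
# The t-test for that coefficient appears in the summary table.
print(summary(fit_int))

# Partial F-test comparing the two nested models.
cmp <- anova(fit_add, fit_int)
print(cmp)
```

Because the interaction adds only one coefficient here, the F-test from anova() and the t-test on the age:jobclass row of the summary give the same p-value. If that p-value is below the chosen significance level (0.05 is a common choice), the data suggest that the change in wage with age differs between industrial and information workers; otherwise there is no evidence of a difference and the additive model is adequate.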