Question

R project for building a k-NN model. In this question you are going to work with perhaps the best-known dataset in the machine learning community. It concerns the classification of iris flowers into species. The original dataset contains 4 attributes and 3 classes but, to simplify the problem, we are going to work with a subset of the data. That subset is in iris.csv and contains only two (continuous) features, the petal length in cm and the petal width in cm, and two classes: Iris versicolor and Iris virginica.

Use the repeated holdout technique to find a near-optimal k for the k-NN method on this data. To this end, use values of k from 1 to 49 in steps of 4, that is, test k = 1, 5, 9, ..., 49. For each k, run 20 experiments in which you choose 80 random items from the data as your training set and use the remaining items as your test set. For each run, build a k-NN model, test it on the test set, and find the number of misclassified versicolor items, the number of misclassified virginica items, and the total number of misclassified items. Take the average of each count over the 20 experiments and divide, respectively, by the number of versicolor items, the number of virginica items, and the total number of items in the test set. Collect these three error rates, along with the values of k, in vectors. When done, find the best k, that is, the one resulting in the lowest total error rate. Also, on the same graphics panel, plot against the values of k the versicolor error rate (proportion of versicolor points incorrectly classified as virginica among all versicolor points), the virginica error rate (proportion of virginica points incorrectly classified as versicolor among all virginica points), and the total error rate.

Answer #1

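#Installing packages (run once)

#gmodels provides CrossTable(); class, which provides knn(), normally ships with R.

install.packages('gmodels')

#Loading Data

#An Internet connection is required to download the data.

#The file has no header row, so column names are supplied here. The result is stored as iris, the name used in the rest of the script.

iris <- read.csv(url("http://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data"), header = FALSE, col.names = c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width", "Species"), stringsAsFactors = TRUE)

head(iris)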

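#Randomly assigning each row to the training set (1) or the test set (2) with probabilities 0.80 and 0.20

#The split is only approximately 80-20, because each row is drawn independently.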

ind <- sample(2, nrow(iris), replace=TRUE, prob=c(0.80, 0.20))
table(ind)

#Splitting the features (columns 1-4) and the species labels (column 5) into training and test sets

iris_train <- iris[ind==1, 1:4]
iris_test <- iris[ind==2, 1:4]
iris_train_labels <- iris[ind==1, 5]
iris_test_labels <- iris[ind==2, 5]

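#Loading the class package, which provides knn()
library(class)

#Classifying the test rows with k-NN using k = 3
iris_test_pred <- knn(train = iris_train, test = iris_test, cl = iris_train_labels, k = 3)

#Cross-tabulating actual against predicted species
library(gmodels)
CrossTable(x = iris_test_labels, y = iris_test_pred, prop.chisq = FALSE)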
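
The script above fits a single model with k = 3 on one random split, so it does not yet cover the repeated holdout procedure the question asks for. The following is a minimal sketch of that procedure, not a definitive solution. It rests on several assumptions: iris.csv is not available here, so the two-feature, two-class subset is rebuilt from R's built-in iris data; the seed and all variable names (dat, ks, n_runs, err_total, and so on) are hypothetical; and each error rate is taken as the average misclassified count divided by the average class count in the test sets, which is one reading of the question's wording.

```r
library(class)  # provides knn()

# Assumption: rebuild the subset from the built-in iris data
# (petal length and width only, versicolor vs. virginica).
dat <- droplevels(subset(iris, Species != "setosa",
                         select = c(Petal.Length, Petal.Width, Species)))

set.seed(1)                    # hypothetical seed, for reproducibility only
ks      <- seq(1, 49, by = 4)  # k = 1, 5, 9, ..., 49
n_runs  <- 20                  # experiments per value of k
n_train <- 80                  # training items per experiment

err_versicolor <- numeric(length(ks))
err_virginica  <- numeric(length(ks))
err_total      <- numeric(length(ks))

for (i in seq_along(ks)) {
  miss <- matrix(0, nrow = n_runs, ncol = 3)  # misclassified counts per run
  size <- matrix(0, nrow = n_runs, ncol = 3)  # test-set class sizes per run
  for (r in seq_len(n_runs)) {
    idx   <- sample(nrow(dat), n_train)       # exactly 80 training rows
    pred  <- knn(train = dat[idx, 1:2], test = dat[-idx, 1:2],
                 cl = dat$Species[idx], k = ks[i])
    truth <- as.character(dat$Species[-idx])
    wrong <- as.character(pred) != truth
    miss[r, ] <- c(sum(wrong & truth == "versicolor"),
                   sum(wrong & truth == "virginica"),
                   sum(wrong))
    size[r, ] <- c(sum(truth == "versicolor"),
                   sum(truth == "virginica"),
                   length(truth))
  }
  # Average count over the runs divided by average class size
  rates <- colMeans(miss) / colMeans(size)
  err_versicolor[i] <- rates[1]
  err_virginica[i]  <- rates[2]
  err_total[i]      <- rates[3]
}

# Best k: the one with the lowest total error rate (first one on ties)
best_k <- ks[which.min(err_total)]
print(best_k)

# All three error rates against k on the same graphics panel
matplot(ks, cbind(err_versicolor, err_virginica, err_total),
        type = "b", pch = 1:3, lty = 1, col = c("orange", "blue", "black"),
        xlab = "k", ylab = "error rate")
legend("topleft", legend = c("versicolor", "virginica", "total"),
       pch = 1:3, lty = 1, col = c("orange", "blue", "black"))
```

Because the splits are random, the selected k can change with the seed; with only 20 test items per run, several values of k may give nearly identical error rates.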
