a) Consider the dataset in Table 1. Grade, Bumpiness and Speed-limit are the features, and Speed is the label.
| SN | Grade | Bumpiness | Speed-limit | Speed |
|---|---|---|---|---|
| 1 | steep | bumpiness | yes | slow |
| 2 | steep | smooth | yes | slow |
| 3 | flat | bumpiness | no | fast |
| 4 | steep | smooth | no | fast |
Answer the following:
i) Determine the entropy of Speed.
ii) Which attribute should be selected as a root of the decision
tree?
iii) Construct the decision tree for this dataset based on
information gain.
b) What do you mean by clustering? Consider the following sample points: A (1, 1), B (2, -2), C (2, 3), D (3, 3). Perform k-means clustering, and show the calculation of the distance matrix and the group assignment matrix for two epochs only. [Assume k=2]
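The entropy of Speed is 1.
The equation for calculating the entropy of a two-class label is:

$$\text{Entropy} = -p_{+}\log_2 p_{+} - p_{-}\log_2 p_{-}$$

where $p_{+}$ is the probability of the positive class and $p_{-}$ is the probability of the negative class.
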
Let's treat Speed = fast as the positive class
and Speed = slow as the negative class.
The probability of an outcome is: number of favourable outcomes / total number of outcomes.
Therefore, the probability of fast = 2/4 = 1/2
and the probability of slow = 2/4 = 1/2.
Substituting these values into the entropy equation gives Entropy = -(1/2) log2(1/2) - (1/2) log2(1/2) = 1/2 + 1/2 = 1.
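
The calculation above can be checked with a short Python sketch. The names `rows`, `entropy` and `information_gain` are illustrative and not part of the original question; the table values are copied as given. The information-gain part goes one step beyond the answer above and applies the standard definition (entropy of the label minus the weighted entropy after splitting) to each feature, which is what parts ii) and iii) ask for.

```python
from collections import Counter
from math import log2

# Table 1, with values copied as they appear in the question.
rows = [
    {"Grade": "steep", "Bumpiness": "bumpiness", "Speed-limit": "yes", "Speed": "slow"},
    {"Grade": "steep", "Bumpiness": "smooth",    "Speed-limit": "yes", "Speed": "slow"},
    {"Grade": "flat",  "Bumpiness": "bumpiness", "Speed-limit": "no",  "Speed": "fast"},
    {"Grade": "steep", "Bumpiness": "smooth",    "Speed-limit": "no",  "Speed": "fast"},
]


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    return sum(-(c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(rows, attribute, label="Speed"):
    """Entropy of the label minus the weighted entropy after splitting on attribute."""
    base = entropy([r[label] for r in rows])
    remainder = 0.0
    for value in {r[attribute] for r in rows}:
        subset = [r[label] for r in rows if r[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


print("Entropy(Speed) =", entropy([r["Speed"] for r in rows]))
for attribute in ("Grade", "Bumpiness", "Speed-limit"):
    print(attribute, "gain =", round(information_gain(rows, attribute), 3))
```

Under this sketch, Speed-limit has a gain of 1 (both of its branches are pure), Grade has a gain of about 0.311, and Bumpiness has a gain of 0. That suggests Speed-limit as the root, with a tree of depth one: Speed-limit = yes gives slow, Speed-limit = no gives fast.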
