You are given the following information: You need to apply k-means clustering, your dataset has 1,000 observations, ...

Question

You are given the following information: you need to apply k-means clustering; your dataset has 1,000 observations and 57 features; K = 2.
Answer the following questions:
1. How are the initial centroids selected?
2. How many clusters will be produced?
3. What measure is used to evaluate the quality of the clusters?
4. For the evaluation measure, do higher or lower values indicate better clusters? Why?

Answer 1

The question gives the following information:

N = 1000 observations

d = 57 features

K = 2 (number of clusters)

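Solution:

i. The initial centroids are typically selected at random, for example by picking K = 2 of the 1,000 observations as the starting centroids.

ii. Two clusters will be produced, because K = 2.

iii. The quality of the clusters is measured with a distance-based criterion: the within-cluster sum of squared errors (SSE), which uses the Euclidean (L2) distance between each observation and the centroid of its cluster.

iv. For this evaluation measure, lower values indicate better clusters. The Euclidean distance between an observation x and a centroid \mu over the d = 57 features can be defined as follows:

$$\lVert x - \mu \rVert_2 = \sqrt{\sum_{j=1}^{d} (x_j - \mu_j)^2}$$

The SSE adds up the squared distances over all observations in all K clusters, where C_k is the set of observations assigned to cluster k and \mu_k is its centroid:

$$\mathrm{SSE} = \sum_{k=1}^{K} \sum_{x_i \in C_k} \lVert x_i - \mu_k \rVert_2^2$$

The objective is to choose the cluster centers so that the within-cluster distance from each observation to the centroid of its cluster is minimized. A lower value therefore means the observations lie closer to their centroids, that is, the clusters are more compact.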
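The four points above can be illustrated with a minimal pure-Python sketch. It is not a reference implementation: the synthetic data (two Gaussian blobs standing in for the 1,000 x 57 dataset), the function names `squared_l2`, `assign`, `sse`, and `kmeans`, the seeds, and the iteration count are all assumptions chosen for illustration.

```python
import random


def squared_l2(a, b):
    """Squared Euclidean (L2) distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(data, centroids):
    """Return, for every observation, the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda c: squared_l2(p, centroids[c]))
        for p in data
    ]


def sse(data, centroids, labels):
    """Within-cluster sum of squared distances; lower means tighter clusters."""
    return sum(squared_l2(p, centroids[l]) for p, l in zip(data, labels))


def kmeans(data, k, iters=10, seed=0):
    rng = random.Random(seed)
    # Random initialization: pick k distinct observations as starting centroids.
    centroids = [list(p) for p in rng.sample(data, k)]
    history = []  # SSE recorded after each assignment step
    for _ in range(iters):
        labels = assign(data, centroids)
        history.append(sse(data, centroids, labels))
        for c in range(k):
            members = [p for p, l in zip(data, labels) if l == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    labels = assign(data, centroids)
    history.append(sse(data, centroids, labels))
    return centroids, labels, history


# Hypothetical stand-in for the dataset: 1,000 observations, 57 features.
data_rng = random.Random(42)
data = [
    [data_rng.gauss(0.0 if i < 500 else 3.0, 1.0) for _ in range(57)]
    for i in range(1000)
]

centroids, labels, history = kmeans(data, k=2)
print(f"clusters: {len(centroids)}")
print(f"initial SSE: {history[0]:.1f}, final SSE: {history[-1]:.1f}")
```

Because each assignment step and each centroid update can only keep the SSE the same or reduce it, the values in `history` should not increase from one iteration to the next, which is the sense in which lower values indicate better clusters for a fixed K.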
