Question

You have performed an unsupervised k-means clustering on a data set with two attributes, and the results indicate a k of 2. Later, you determine the class values for each data instance (there are four class values), and a supervised clustering results in a k of 4. Provide a possible explanation for why the two clustering methods disagree on a k value, and draw a sketch of the two clusterings to go along with your explanation.

Answer #1

The scenario shown below is one example of how such a disagreement can arise.

Supervised clustering is the problem of training a clustering algorithm to produce desirable clusterings: given sets of items and complete clusterings over those sets, we learn how to cluster future sets of items. A clustering algorithm accepts a set of items and produces a partitioning of that set.

An unsupervised algorithm, by contrast, has no such guidance, so a different result is possible. Its only goal is to find similarities among the data points and group similar points together.

Now suppose we have a set of items that belong to four categories. In supervised clustering, we compare each unclassified element for similarity against all four labeled groups, and so we end up with 4 clusters.

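Without any guidance, as you can see in the picture, the same set of elements can be grouped differently. The key point is that, using the two attributes alone, there is no particular way to separate the blue cluster from the red one, or the light blue cluster from the black one. Each pair is therefore merged into a single cluster, which gives a k of 2.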

So different algorithms can produce different clusterings. In fact, even two unsupervised algorithms can produce different numbers of clusters, because the result depends entirely on the similarity (or distance) measure that is used.
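
The explanation above can be illustrated with a minimal sketch in Python. Everything in it is an assumption made for illustration: the toy data (two tight groups in the two-attribute space, each mixing two of the four classes), the plain k-means with a deterministic farthest-first start, and the use of the silhouette score to choose k. The original question does not say how k was selected or what the data look like.

```python
from itertools import product
from math import dist

def make_data():
    """Hypothetical toy data: two tight groups, each mixing two classes."""
    points, labels = [], []
    for i, j in product(range(4), repeat=2):
        # Left group: red and blue alternate, so they overlap in attribute space.
        points.append((float(i), float(j)))
        labels.append("red" if (i + j) % 2 == 0 else "blue")
        # Right group: black and light blue alternate in the same way.
        points.append((20.0 + i, float(j)))
        labels.append("black" if (i + j) % 2 == 0 else "light blue")
    return points, labels

def kmeans(points, k, iters=50):
    """Plain Lloyd iterations with a deterministic farthest-first start (an assumption)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(dist(p, c) for c in centroids)))
    assign = []
    for _ in range(iters):
        new_assign = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        if new_assign == assign:
            break  # assignments stopped changing
        assign = new_assign
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = (sum(x for x, _ in members) / len(members),
                                sum(y for _, y in members) / len(members))
    return assign

def silhouette(points, assign):
    """Mean silhouette score; a point in a singleton cluster contributes 0."""
    clusters = {}
    for p, c in zip(points, assign):
        clusters.setdefault(c, []).append(p)
    total = 0.0
    for p, c in zip(points, assign):
        own = clusters[c]
        if len(own) == 1:
            continue
        a_i = sum(dist(p, q) for q in own) / (len(own) - 1)
        b_i = min(sum(dist(p, q) for q in other) / len(other)
                  for name, other in clusters.items() if name != c)
        total += (b_i - a_i) / max(a_i, b_i)
    return total / len(points)

points, labels = make_data()

# Unsupervised view: choose k from the two attributes alone.
scores = {k: silhouette(points, kmeans(points, k)) for k in range(2, 6)}
best_k = max(scores, key=scores.get)

# Supervised view: the number of groups is fixed by the class values.
n_classes = len(set(labels))

# Which classes end up inside each of the two k-means clusters?
mix = {}
for c, label in zip(kmeans(points, 2), labels):
    mix.setdefault(c, set()).add(label)

print("k chosen without labels:", best_k)
print("k implied by class values:", n_classes)
print("classes inside each k-means cluster:", mix)
```

On this toy data the unsupervised criterion settles on two clusters, each containing two classes, while the class values define four groups. This is the same disagreement the answer describes: the two attributes separate the data into two groups, and only the class values distinguish the classes within each group.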

If there is any doubt, you can ask in comments.
