explain the 5 methods used to measure distance between clusters

Question

explain the 5 methods used to measure distance between clusters

Answer 1

Euclidean distance

This is the most common and intuitive way of computing a distance between two samples: the straight-line distance, i.e. the square root of the sum of squared differences across all dimensions. It depends directly on the magnitude of the differences in the sample levels, so it is usually used for data sets that are suitably normalized and have no special distribution problems.
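As a rough illustration of the description above, the following is a minimal Python sketch. The function name `euclidean_distance` and the representation of samples as plain lists of numbers are assumptions made for this example, not part of the original answer.

```python
import math


def euclidean_distance(x, y):
    """Straight-line distance between two samples of equal length."""
    if len(x) != len(y):
        raise ValueError("samples must have the same number of dimensions")
    # Square the per-dimension differences, sum them, take the square root.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


# A 3-4-5 right triangle: the straight-line distance is 5.
print(euclidean_distance([0.0, 0.0], [3.0, 4.0]))  # 5.0
```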

Manhattan distance

Also known as city-block distance, this distance measurement is especially relevant for discrete data sets. While the Euclidean distance corresponds to the length of the shortest path between two samples (i.e. “as the crow flies”), the Manhattan distance is the sum of the absolute differences along each dimension (i.e. “walking around the block”).
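A minimal Python sketch of the city-block idea, for comparison with the Euclidean case. The function name `manhattan_distance` and the list-based sample representation are assumptions chosen for this example.

```python
def manhattan_distance(x, y):
    """Sum of absolute per-dimension differences between two samples."""
    if len(x) != len(y):
        raise ValueError("samples must have the same number of dimensions")
    return sum(abs(a - b) for a, b in zip(x, y))


# Walking 3 blocks in one direction and 4 in the other gives 7,
# whereas the straight-line (Euclidean) distance for the same points is 5.
print(manhattan_distance([0, 0], [3, 4]))  # 7
```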

Pearson Correlation distance

This distance is based on the Pearson correlation coefficient, which is calculated from the sample values and their standard deviations. The correlation coefficient r takes values from -1 (perfect negative correlation) to +1 (perfect positive correlation). The Pearson distance dp is computed as dp = 1 - r and lies between 0 (when the correlation coefficient is +1, i.e. the two samples are most similar) and 2 (when the correlation coefficient is -1).
Note that the data are centered by subtracting the mean, and scaled by dividing by the standard deviation.
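The relation dp = 1 - r can be sketched in Python as follows. The helper names `pearson_r` and `pearson_distance` are hypothetical, and the sketch assumes neither sample is constant (otherwise the correlation is undefined).

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length samples."""
    if len(x) != len(y):
        raise ValueError("samples must have the same length")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Center each sample by subtracting its mean.
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    numerator = sum(a * b for a, b in zip(dx, dy))
    # Dividing by the norms of the centered samples is the scaling step.
    denominator = math.sqrt(sum(a * a for a in dx)) * math.sqrt(sum(b * b for b in dy))
    if denominator == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return numerator / denominator


def pearson_distance(x, y):
    """dp = 1 - r, ranging from 0 (r = +1) to 2 (r = -1)."""
    return 1.0 - pearson_r(x, y)


print(pearson_distance([1, 2, 3], [2, 4, 6]))  # close to 0: perfectly correlated
print(pearson_distance([1, 2, 3], [6, 4, 2]))  # close to 2: perfectly anti-correlated
```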

Absolute Pearson Correlation distance

In this distance, the absolute value of the Pearson correlation coefficient is used; hence the corresponding distance lies between 0 and 1, just like the absolute value of the correlation coefficient. The equation for the Absolute Pearson distance da is:
da = 1 - |r|

Taking the absolute value gives equal meaning to positive and negative correlations, so strongly anti-correlated samples end up as close to each other as strongly correlated ones and will get clustered together.
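To make the effect on anti-correlated samples concrete, here is a minimal Python sketch. The names `pearson_r` and `absolute_pearson_distance` are assumptions for this example, and the sketch assumes non-constant samples.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient (assumes non-constant samples)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = math.sqrt(sum(a * a for a in dx)) * math.sqrt(sum(b * b for b in dy))
    return numerator / denominator


def absolute_pearson_distance(x, y):
    """Distance based on the absolute value of r; ranges from 0 to 1."""
    return 1.0 - abs(pearson_r(x, y))


rising = [1, 2, 3, 4]
falling = [8, 6, 4, 2]  # perfectly anti-correlated with `rising`

# The plain Pearson distance for this pair would be close to 2,
# but the absolute variant treats the pair as maximally similar.
print(absolute_pearson_distance(rising, falling))  # close to 0
```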

Un-centered Correlation distance

This is the same as the Pearson correlation distance, except that the sample means are set to zero in the expression for the un-centered correlation, i.e. the samples are not centered before being compared. The un-centered correlation coefficient lies between -1 and +1; hence the distance lies between 0 and 2.
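A minimal Python sketch of the un-centered variant follows. With the means set to zero, the coefficient reduces to the sum of products divided by the product of the vector norms (the cosine of the angle between the two samples). The function names are hypothetical, and the sketch assumes neither sample is all zeros.

```python
import math


def uncentered_r(x, y):
    """Un-centered correlation: Pearson's formula with both means taken as zero."""
    if len(x) != len(y):
        raise ValueError("samples must have the same length")
    numerator = sum(a * b for a, b in zip(x, y))
    denominator = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    if denominator == 0:
        raise ValueError("un-centered correlation is undefined for an all-zero sample")
    return numerator / denominator


def uncentered_distance(x, y):
    """Distance 1 - r_u, ranging from 0 to 2."""
    return 1.0 - uncentered_r(x, y)


# A constant offset leaves the Pearson correlation at +1, but the
# un-centered version notices it, so the distance is greater than 0.
print(uncentered_distance([1, 2, 3], [11, 12, 13]))

# A pure rescaling keeps the un-centered distance close to 0.
print(uncentered_distance([1, 2, 3], [2, 4, 6]))
```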
