Question

What's simple random sampling? Is it possible to sample data instances using a distribution different from...

  1. What's simple random sampling? Is it possible to sample data instances using a distribution different from the uniform distribution? If so, give an example of a probability distribution of the data instances that is different from uniform (i.e., equal probability).

  2. What's stratified sampling?

  3. What's "the curse of dimensionality"?

  4. Provide a brief description of what Principal Components Analysis (PCA) does. [Hint: See Appendix A and your lecture notes.] State what the input and the output of PCA are.
Answer #1

1. Simple random sampling is a sampling technique in which every item in the population has an equal probability of being selected for the sample. The selection depends entirely on chance, which is why this technique is sometimes called a method of chances.

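Yes, it is possible to sample data instances using a distribution other than the uniform distribution. For example, each instance can be assigned a weight and drawn with probability proportional to that weight, so that some instances are more likely to be selected than others.

Values can also be drawn directly from a non-uniform distribution. In Excel, for example, to create a 10-element sample from the standard normal distribution, place the formula =NORM.S.INV(RAND()) in cell A1, highlight the range A1:A10, and press Ctrl-D to fill the formula down.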
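To make the contrast between uniform and non-uniform sampling of data instances concrete, here is a minimal Python sketch. The instance labels, the weights, and the helper names `simple_random_sample` and `weighted_sample` are hypothetical and chosen only for illustration; the sketch uses the standard library's `random` module.

```python
import random
from collections import Counter

def simple_random_sample(instances, k, seed=None):
    """Uniform case: every instance is equally likely; drawn without replacement."""
    rng = random.Random(seed)
    return rng.sample(instances, k)

def weighted_sample(instances, weights, k, seed=None):
    """Non-uniform case: draw k instances with replacement,
    each with probability proportional to its weight."""
    rng = random.Random(seed)
    return rng.choices(instances, weights=weights, k=k)

# Hypothetical data instances and a non-uniform distribution over them.
instances = ["a", "b", "c", "d"]
weights = [0.1, 0.2, 0.3, 0.4]  # assumed probabilities, not from the question

uniform_pick = simple_random_sample(instances, k=2, seed=0)
sample = weighted_sample(instances, weights, k=10_000, seed=0)
counts = Counter(sample)
print(uniform_pick)
print(counts)
```

With these weights, "d" should appear roughly four times as often as "a" in the weighted sample, whereas simple random sampling treats all four instances alike.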

2.

Stratified sampling is a sampling method in which the total population is first divided into smaller groups, or strata. The strata are formed based on characteristics that members of the population share. After dividing the population into strata, the researcher randomly selects items from each stratum, typically in proportion to the stratum's size.

Stratified sampling is commonly used by researchers who want to draw conclusions about different sub-groups or strata. The strata should be distinct and must not overlap, so that each item belongs to exactly one stratum. Within each stratum, the researcher should use simple random sampling. The population can be divided into subgroups by characteristics such as age, gender, nationality, job profile, educational level, etc. Stratified sampling is useful when the researcher wants to understand the relationship between groups.
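The procedure described above (split into non-overlapping strata, then sample randomly within each stratum in proportion to its size) could be sketched in Python as follows. The function name `stratified_sample`, the toy population, and the 10% sampling fraction are assumptions made for illustration only.

```python
import random
from collections import defaultdict

def stratified_sample(records, key, fraction, seed=None):
    """Proportional stratified sampling (illustrative sketch).

    records:  list of data instances
    key:      function mapping an instance to its stratum label
    fraction: share of each stratum to draw (assumed identical for all strata)
    """
    rng = random.Random(seed)

    # Step 1: divide the population into non-overlapping strata.
    strata = defaultdict(list)
    for record in records:
        strata[key(record)].append(record)

    # Step 2: simple random sampling without replacement inside each stratum.
    sample = []
    for label in sorted(strata):
        members = strata[label]
        n = max(1, round(len(members) * fraction))  # at least one per stratum
        sample.extend(rng.sample(members, n))
    return sample

# Hypothetical population: 80 members of group "A" and 20 of group "B".
population = [{"id": i, "group": "A" if i < 80 else "B"} for i in range(100)]
picked = stratified_sample(population, key=lambda r: r["group"], fraction=0.1, seed=42)
print(len(picked), [r["group"] for r in picked])
```

Because the same fraction is applied to every stratum, the sample keeps the population's 80/20 split (8 items from "A" and 2 from "B"), which a small simple random sample would not guarantee.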

3.

The curse of dimensionality refers to how certain learning algorithms may perform poorly in high-dimensional data.

Say you're doing rejection sampling, and the sample space has n dimensions. Furthermore, say the upper bound we chose for rejection sampling is pretty mediocre, and about 0.9 of the proposed samples fall within the target in each dimension.

Unfortunately, we then accept only about 0.9^n of our samples overall, since an accepted sample must be within the target in every dimension (treating the dimensions as independent). The acceptance rate shrinks exponentially with the number of dimensions, so the number of proposals needed per accepted sample grows exponentially! That means we could be very inefficient, since we would reject most of our samples.
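A small Python sketch of this arithmetic shows how quickly the acceptance rate collapses. It assumes, as in the example, a per-dimension acceptance rate of 0.9 and independent dimensions; the function names are hypothetical.

```python
def acceptance_rate(per_dim_rate, n_dims):
    """Overall acceptance rate when a sample must be accepted in every
    dimension and dimensions are assumed independent: per_dim_rate ** n_dims."""
    return per_dim_rate ** n_dims

def expected_proposals_per_accept(per_dim_rate, n_dims):
    """Expected number of proposals needed to obtain one accepted sample
    (mean of a geometric distribution: 1 / acceptance rate)."""
    return 1.0 / acceptance_rate(per_dim_rate, n_dims)

for n in (1, 10, 50, 100):
    rate = acceptance_rate(0.9, n)
    cost = expected_proposals_per_accept(0.9, n)
    print(f"n={n:>3}  acceptance rate={rate:.6g}  proposals per accept={cost:.6g}")
```

At n = 10 the acceptance rate is already about 0.35, and at n = 100 it falls below 1 in 10,000, so tens of thousands of proposals are needed for each accepted sample.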

This is one example of the curse of dimensionality (the one I found easiest to explain concisely). Many other AI algorithms perform poorly in high dimensions as well. Metropolis–Hastings, for instance, suffers too, since it's hard to come up with a jumping distribution that works well for all the dimensions. K-means clustering also suffers in high dimensions, especially if many of the dimensions are irrelevant to the ideal clustering boundaries and just add noise to the clustering.

NOTE: As per Chegg policy, I am allowed to answer only 3 questions (including sub-parts) on a single post. Kindly post the remaining questions separately and I will try to answer them. Sorry for the inconvenience caused.
