Question

Please show that answer step by step and explain clearly, thx!!!!

1. Give an example of a low dimensional (approx. 20 dimensions), medium dimensional (approx. 1000 dimensions) and high dimensional (approx. 100000 dimensions) problem that you care about.

2. What does the decision boundary of 1 nearest neighbor classifier for 2 points (one positive, one negative) look like?

Answer #1

First of all, whether data counts as high or low dimensional is not only about the number of columns (features) a dataset has; it also depends on how many samples are available.

For example, a dataset with 3 data points and 5 features each can be considered high dimensional, because it has more features than samples. On the other hand, a dataset with 500k features but 1M samples can still be considered low dimensional relative to its sample size.

But to answer your question, here are some datasets chosen by their number of dimensions.

1. i. Low-dimensional dataset:

Parkinsons Data Set

Parkinson's disease is a nervous system disorder that affects movement. The dataset contains 195 records with 23 attributes of biomedical voice measurements. The data is used to separate healthy people from people with Parkinson's disease.

Link:

https://archive.ics.uci.edu/ml/datasets/parkinsons

ii. Medium-dimensional dataset:

Internet Advertisements Data Set

This dataset represents a set of possible advertisements on Internet pages (about 1,558 features per example). The features encode the geometry of the image (if available) as well as phrases occurring in the URL, the image's URL and alt text, the anchor text, and words occurring near the anchor text. The task is to predict whether an image is an advertisement ("ad") or not ("nonad").

Link:

https://archive.ics.uci.edu/ml/datasets/internet+advertisements

iii. High-dimensional dataset:

A dataset with more than 100,000 (1 lakh) features is uncommon in practice. When performing feature extraction with pretrained models, however, we can get that many features to work with. Apart from that, the largest dataset I can suggest is ImageNet. It is a large image database organized according to the WordNet hierarchy. It has over 100,000 phrases (synsets) and an average of 1000 images per phrase. The size exceeds 150 GB. It is suitable for image recognition, face recognition, object detection, etc.

Link:

http://www.image-net.org/
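
To see how an image dataset can land in the ~100,000-dimension range, here is a small arithmetic sketch. It assumes each image is flattened into a vector of raw pixel values; the function name and the 224 x 224 and 256 x 256 RGB sizes are illustrative assumptions, not something the dataset itself fixes.

```python
def flattened_dimension(height: int, width: int, channels: int = 3) -> int:
    """Number of raw pixel values when an image is flattened into one vector."""
    return height * width * channels


# Assumed example sizes (hypothetical choices for illustration only).
print(flattened_dimension(224, 224))  # 150528
print(flattened_dimension(256, 256))  # 196608
```

Under these assumptions a single RGB image already has more than 100,000 raw features, before any feature extraction.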

2. A 1-nearest-neighbor classifier simply finds the training point nearest to the given test sample. Say that nearest point is x and it belongs to class y. Then the test sample is also assigned to class y.
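
A minimal sketch of that rule in Python, assuming Euclidean distance. The function name, the two training points and the +1 / -1 labels are made up for illustration.

```python
import math
from typing import List, Sequence, Tuple

Point = Sequence[float]


def nearest_neighbor_predict(train: List[Tuple[Point, int]], query: Point) -> int:
    """Return the label of the training point closest to `query`.

    Uses Euclidean distance; on an exact tie, the earlier training point wins.
    """
    _, nearest_label = min(train, key=lambda pair: math.dist(pair[0], query))
    return nearest_label


# Hypothetical two-point training set: one positive (+1), one negative (-1).
train = [((0.0, 0.0), +1), ((4.0, 0.0), -1)]

print(nearest_neighbor_predict(train, (1.0, 3.0)))   # closer to (0, 0), prints 1
print(nearest_neighbor_predict(train, (3.5, -2.0)))  # closer to (4, 0), prints -1
```

Note that there is no training step here: the classifier just stores the points and compares distances at prediction time.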

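With only two points in the dataset (one positive, one negative), the decision boundary is linear, i.e., a straight line. Assuming Euclidean distance, it is the perpendicular bisector of the segment joining the two points: every test point on the positive point's side of the line is classified as positive, and every test point on the other side as negative. In more than two dimensions the boundary is the corresponding hyperplane.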
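
Why the boundary is a straight line can be worked out in a few steps. This is a sketch assuming Euclidean distance; the symbols p (the positive point), n (the negative point) and x (a test point) are notation introduced here, not taken from the question. The boundary is the set of test points that are equally far from both training points:

```latex
\[
\begin{aligned}
\lVert x - p \rVert^2 &= \lVert x - n \rVert^2 \\
\lVert x \rVert^2 - 2\,p \cdot x + \lVert p \rVert^2
  &= \lVert x \rVert^2 - 2\,n \cdot x + \lVert n \rVert^2 \\
2\,(n - p) \cdot x &= \lVert n \rVert^2 - \lVert p \rVert^2 \\
(n - p) \cdot \left( x - \frac{p + n}{2} \right) &= 0
\end{aligned}
\]
```

The last line is linear in x: it describes the line (or hyperplane) that passes through the midpoint (p + n)/2 and is perpendicular to the direction n - p.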
