I have a binary classification problem with ~30 features and 200 observations. I've used L1 (Lasso) and L2 (Ridge) regularization to run logistic regression and computed accuracy, precision and the confusion matrix on both training and test data. From there, I found the best hyperparameter from a selection and have the same output (precision, accuracy and confusion matrix) for it. How do I comment on signs of overfitting or underfitting? Is there more analysis or graphing that is required to answer this?
Comment on signs of overfitting or underfitting:
Overfitting: Overfitting happens when a model learns the details and noise in the training data to such an extent that it hurts performance on new data. The usual sign is that accuracy and precision on the training data are clearly higher than on the test data.
Overfitting is more likely for models that have more flexibility when learning a target function. Such models usually come with parameters and techniques that limit and constrain how much detail they learn; for logistic regression, the L1 or L2 penalty plays this role.
For example, decision trees are very flexible and are prone to overfitting the training data. This problem can be addressed by pruning the tree.
Underfitting: A model that can neither fit the training data nor generalise to new data is underfitting. The sign is poor performance on both the training and the test data, which makes the model unsuitable for any purpose. However, it does provide a good contrast to overfitting.
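The comparison described above can be sketched as a small helper. This is only an illustration: the function name `diagnose_fit` and both thresholds (`gap_tol`, `low_score`) are assumptions chosen for the example, not standard values, and the scores passed in below are made-up numbers rather than results from the question.

```python
def diagnose_fit(train_score, test_score, gap_tol=0.05, low_score=0.70):
    """Label a (train, test) score pair.

    gap_tol and low_score are illustrative assumptions; pick values that
    make sense for your own problem and metric.
    """
    gap = train_score - test_score
    if gap > gap_tol:
        # Training score clearly above test score: the model has likely
        # memorised noise in the training data.
        return "overfitting"
    if train_score < low_score and test_score < low_score:
        # Poor on both sets: the model is too constrained to fit the data.
        return "underfitting"
    return "reasonable fit"


# Made-up example scores (accuracy or precision would both work here).
print(diagnose_fit(0.98, 0.74))
print(diagnose_fit(0.61, 0.59))
print(diagnose_fit(0.86, 0.84))
```

With only 200 observations the test score itself is noisy, so a small gap should not be over-interpreted.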
We can limit overfitting with the following techniques:
1. Use a resampling technique, such as k-fold cross-validation, to estimate accuracy.
2. Hold back a validation dataset.
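As a rough sketch of the resampling idea, the code below builds k-fold train/validation index splits using only the standard library. The names `k_fold_indices` and `cross_validate` are hypothetical, and `score_fn` is an assumed callback that you would write yourself: it should fit the model on the training indices and return a metric (for example accuracy) on the validation indices.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, val_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of (nearly) equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        val = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, val


def cross_validate(score_fn, n_samples, k=5, seed=0):
    """Return (mean score, per-fold scores) for a user-supplied score_fn."""
    scores = [score_fn(train, val)
              for train, val in k_fold_indices(n_samples, k, seed)]
    return sum(scores) / len(scores), scores


# With 200 observations and k=5, each fold validates on 40 and trains on 160.
for train, val in k_fold_indices(200, k=5):
    print(len(train), len(val))
```

A large spread between the per-fold scores suggests that a single train/test split may give a misleading picture.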
Yes, a few more analyses can help answer this:
1. Regression analysis.
2. Validation set: tune the hyperparameter on data that is separate from both the training and the test data.
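One possible way to set up a validation set is a three-way split of the row indices, sketched below with the standard library. The function name `train_val_test_split` and the 60/20/20 proportions are assumptions for illustration only.

```python
import random


def train_val_test_split(n_samples, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle row indices and split them into train, validation and test."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = int(n_samples * test_frac)
    n_val = int(n_samples * val_frac)
    test = indices[:n_test]
    val = indices[n_test:n_test + n_val]
    train = indices[n_test + n_val:]
    return train, val, test


train, val, test = train_val_test_split(200)
print(len(train), len(val), len(test))
```

The hyperparameter is then chosen on the validation rows, and the test rows are used once at the end to report accuracy, precision and the confusion matrix.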