Question

How does a decision tree in machine learning work? explain and show some code example please

Answer #1

Hey,

Note: If you have any questions about this answer, please leave a comment. I would be very happy to help.

A decision tree is a flowchart-like tree structure in which each internal node represents a test on a feature (or attribute), each branch represents a decision rule, and each leaf node represents an outcome. The topmost node is known as the root node. The tree learns to partition the data on the basis of attribute values, and it does so recursively, a process called recursive partitioning. Because the result can be visualized as a flowchart that mirrors step-by-step human reasoning, decision trees are easy to understand and interpret.
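To make the node/branch/leaf vocabulary concrete, here is a minimal sketch of a hand-built tree and the walk from root to leaf. The class names `Node` and `Leaf`, the `predict` helper, and the toy weather features are all hypothetical and chosen only for illustration; a real tree would be learned from data rather than written by hand.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    outcome: str  # the prediction stored at a leaf node


@dataclass
class Node:
    feature: str      # attribute tested at this internal node
    threshold: float  # decision rule: take the left branch if value <= threshold
    left: Union["Node", Leaf]
    right: Union["Node", Leaf]


def predict(tree, sample):
    """Follow one branch per internal node until a leaf is reached."""
    while isinstance(tree, Node):
        if sample[tree.feature] <= tree.threshold:
            tree = tree.left
        else:
            tree = tree.right
    return tree.outcome


# Hypothetical tree: the root tests temperature, its right child tests humidity.
toy_tree = Node(
    feature="temperature", threshold=20.0,
    left=Leaf("stay in"),
    right=Node(
        feature="humidity", threshold=70.0,
        left=Leaf("go out"),
        right=Leaf("stay in"),
    ),
)

warm_dry_day = {"temperature": 25.0, "humidity": 60.0}
print(predict(toy_tree, warm_dry_day))
```

Each call to `predict` visits at most one node per level, so the path it takes can be read off as a chain of simple rules.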

A decision tree is a white-box type of ML algorithm: it exposes its internal decision-making logic, which is not readily available in black-box algorithms such as neural networks. Its training time is typically shorter than that of a neural network. The time complexity of building a decision tree is a function of the number of records and the number of attributes in the data. The decision tree is a distribution-free, or non-parametric, method: it does not depend on assumptions about the probability distribution of the data. Decision trees can also handle high-dimensional data, often with good accuracy.
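One way to see the white-box property in practice is to print the rules a fitted tree has learned. The sketch below assumes a scikit-learn version that provides `sklearn.tree.export_text`; the four-row dataset and the feature names `left_weight` and `right_weight` are made up for illustration.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Tiny made-up dataset: the class depends only on the first feature.
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 0, 1, 1]

clf = DecisionTreeClassifier(criterion="gini", random_state=100)
clf.fit(X, y)

# export_text renders the learned decision rules as indented plain text.
rules = export_text(clf, feature_names=["left_weight", "right_weight"])
print(rules)
```

The printed rules show which attribute each node tests and which class each leaf predicts, which is the kind of inspection a neural network does not offer directly.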

The basic idea behind any decision tree algorithm is as follows:

  1. Select the best attribute to split the records, using an Attribute Selection Measure (ASM) such as the Gini index or entropy.
  2. Make that attribute a decision node and break the dataset into smaller subsets.
  3. Build the tree by repeating this process recursively for each child until one of the following conditions is met:
    • All the tuples belong to the same class.
    • There are no more remaining attributes.
    • There are no more instances.
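The steps above can be sketched in plain Python. This is a simplified illustration, not how scikit-learn implements it: the function names `gini`, `entropy`, `best_split`, and `build_tree` and the toy data are assumptions, splits are binary thresholds on numeric features, and the stopping rules are reduced to "all labels agree" and "no split sends records to both children".

```python
from collections import Counter
from math import log2


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def entropy(labels):
    """Shannon entropy in bits over the class proportions."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def best_split(rows, labels, measure):
    """Step 1: find the (feature, threshold) with the lowest weighted impurity."""
    best = None  # (weighted_impurity, feature_index, threshold)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue  # a useful split sends records to both children
            score = (len(left) * measure(left)
                     + len(right) * measure(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best


def build_tree(rows, labels, measure=gini):
    """Steps 2 and 3: split recursively until a stopping condition holds."""
    if len(set(labels)) == 1:
        return labels[0]  # stop: every record has the same class
    split = best_split(rows, labels, measure)
    if split is None:
        # stop: no attribute can separate the records, use the majority class
        return Counter(labels).most_common(1)[0][0]
    _, f, t = split
    left = [i for i, r in enumerate(rows) if r[f] <= t]
    right = [i for i, r in enumerate(rows) if r[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": build_tree([rows[i] for i in left],
                           [labels[i] for i in left], measure),
        "right": build_tree([rows[i] for i in right],
                            [labels[i] for i in right], measure),
    }


# Toy data (made up): two numeric features, two classes.
rows = [[1, 5], [2, 4], [3, 1], [4, 2]]
labels = ["L", "L", "R", "R"]
tree = build_tree(rows, labels)
print(tree)
```

Passing `measure=entropy` instead of the default switches the attribute selection measure without changing the rest of the procedure.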

Below is an implementation in Python using scikit-learn:

import pandas as pd

from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
# train_test_split lives in sklearn.model_selection
# (the old sklearn.cross_validation module was removed in scikit-learn 0.20)
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


# Function importing the dataset
def importdata():
    balance_data = pd.read_csv(
        'https://archive.ics.uci.edu/ml/machine-learning-' +
        'databases/balance-scale/balance-scale.data',
        sep=',', header=None)

    # Printing the dataset shape
    print("Dataset Length: ", len(balance_data))
    print("Dataset Shape: ", balance_data.shape)

    # Printing the first dataset observations
    print("Dataset: ", balance_data.head())
    return balance_data


# Function to split the dataset
def splitdataset(balance_data):
    # Separating the target variable:
    # column 0 holds the class label, columns 1-4 hold the features
    X = balance_data.iloc[:, 1:5].values
    Y = balance_data.iloc[:, 0].values

    # Splitting the dataset into train (70%) and test (30%)
    X_train, X_test, y_train, y_test = train_test_split(
        X, Y, test_size=0.3, random_state=100)

    return X, Y, X_train, X_test, y_train, y_test


# Function to perform training with the Gini index
def train_using_gini(X_train, y_train):
    # Creating the classifier object
    clf_gini = DecisionTreeClassifier(
        criterion="gini", random_state=100,
        max_depth=3, min_samples_leaf=5)

    # Performing training
    clf_gini.fit(X_train, y_train)
    return clf_gini


# Function to perform training with entropy
def train_using_entropy(X_train, y_train):
    # Decision tree with entropy
    clf_entropy = DecisionTreeClassifier(
        criterion="entropy", random_state=100,
        max_depth=3, min_samples_leaf=5)

    # Performing training
    clf_entropy.fit(X_train, y_train)
    return clf_entropy


# Function to make predictions
def prediction(X_test, clf_object):
    # Prediction on the test set with the given classifier
    y_pred = clf_object.predict(X_test)
    print("Predicted values:")
    print(y_pred)
    return y_pred


# Function to calculate accuracy
def cal_accuracy(y_test, y_pred):
    print("Confusion Matrix:")
    print(confusion_matrix(y_test, y_pred))

    # Accuracy as a percentage
    print("Accuracy : ", accuracy_score(y_test, y_pred) * 100)

    print("Report :")
    print(classification_report(y_test, y_pred))


# Driver code
def main():
    # Building phase
    data = importdata()
    X, Y, X_train, X_test, y_train, y_test = splitdataset(data)
    clf_gini = train_using_gini(X_train, y_train)
    clf_entropy = train_using_entropy(X_train, y_train)

    # Operational phase
    print("Results Using Gini Index:")

    # Prediction using gini
    y_pred_gini = prediction(X_test, clf_gini)
    cal_accuracy(y_test, y_pred_gini)

    print("Results Using Entropy:")

    # Prediction using entropy
    y_pred_entropy = prediction(X_test, clf_entropy)
    cal_accuracy(y_test, y_pred_entropy)


# Calling main function
if __name__ == "__main__":
    main()

Please reply if you have any questions.

Thanks.
