Decision Trees vs. Logistic Regression
Choosing an appropriate model for a machine learning task can be difficult because so many algorithms are available in the literature. Comparing their relative merits is also hard, because a method that performs best on one class of problems may fall behind on another. Decision trees and logistic regression are both widely used for classification problems, and each is better suited to different scenarios.
Logistic regression and decision trees differ in how they generate decision boundaries, the surfaces that separate one class from another. Logistic regression fits a single linear boundary that divides the feature space into two regions (not necessarily of equal size), while decision trees recursively partition the space into multiple smaller regions. In two dimensions these boundaries are lines; with higher-dimensional data they generalize to planes and hyperplanes (Rudd & Priestley, 2017). The single linear boundary of logistic regression is a limitation in some cases, and trees can be an appropriate alternative. For instance, when a non-linear boundary separates two classes, a tree can approximate that boundary more closely, which typically results in higher classification performance. Trees are not perfect either, however, because they are prone to overfitting the training data. In such cases logistic regression can be the more appropriate technique, because its simpler linear boundary tends to generalize better.
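The contrast between one linear boundary and several splits can be illustrated with a minimal sketch in plain Python. The toy one-dimensional dataset, the helper names (fit_logistic, fit_tree, predict_tree, accuracy), the learning rate, and the depth limit are all assumptions made for illustration; none of them come from the cited studies.

```python
import math

# Toy 1-D data (an assumption for illustration): class 1 occupies the middle
# interval and class 0 lies on both sides, so no single threshold separates them.
xs = [i / 4 for i in range(-12, 13)]           # -3.0, -2.75, ..., 3.0
ys = [1 if abs(x) < 1 else 0 for x in xs]

def fit_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit p(y=1|x) = sigmoid(w*x + b) with batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            grad_w += (p - y) * x
            grad_b += (p - y)
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def fit_tree(xs, ys, depth):
    """Greedy tree on one feature. Leaves are labels; nodes are (threshold, left, right)."""
    majority = int(2 * sum(ys) >= len(ys))
    if depth == 0 or len(set(ys)) == 1:
        return majority
    points = sorted(set(xs))
    best = None
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(points, points[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if best is None or score < best[0]:
            best = (score, t)
    if best is None:
        return majority
    t = best[1]
    left = [(x, y) for x, y in zip(xs, ys) if x <= t]
    right = [(x, y) for x, y in zip(xs, ys) if x > t]
    return (
        t,
        fit_tree([x for x, _ in left], [y for _, y in left], depth - 1),
        fit_tree([x for x, _ in right], [y for _, y in right], depth - 1),
    )

def predict_tree(node, x):
    """Follow the splits from the root until a leaf label is reached."""
    while isinstance(node, tuple):
        t, left, right = node
        node = left if x <= t else right
    return node

def accuracy(preds, ys):
    return sum(p == y for p, y in zip(preds, ys)) / len(ys)

w, b = fit_logistic(xs, ys)
logit_preds = [int(w * x + b > 0) for x in xs]  # one threshold on x
tree = fit_tree(xs, ys, depth=2)                # up to two thresholds per path
tree_preds = [predict_tree(tree, x) for x in xs]

logit_acc = accuracy(logit_preds, ys)
tree_acc = accuracy(tree_preds, ys)
print("logistic regression training accuracy:", logit_acc)
print("depth-2 tree training accuracy:", tree_acc)
```

On this toy data the two splits isolate the middle interval, while the logistic model, limited to a single threshold, cannot do better than predicting the majority class. The figures are training accuracies only; they say nothing about overfitting, which would need held-out data to assess.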
In terms of interpretability, logistic regression has the advantage. A tree that consists of a large number of nodes requires considerable mental effort to trace the splits behind a given prediction (Rudolfer, Paliouras, & Peers, 1999). In contrast, a logistic regression model is simple to read because it is a list of coefficients.
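The difference in reading effort can be sketched as follows. The feature names, coefficient values, and tree structure below are hypothetical and chosen only for illustration; they are not taken from either cited study. The sketch converts each coefficient into an odds ratio, and traces the chain of splits a tree needs to explain one prediction.

```python
import math

# Hypothetical logistic regression: one coefficient per feature.
coefficients = {"age": 0.04, "income": -0.5}

def odds_ratios(coefs):
    """exp(coef) is the multiplicative change in odds per unit increase in a feature."""
    return {name: math.exp(c) for name, c in coefs.items()}

# Hypothetical tree: internal nodes are (feature, threshold, left, right),
# leaves are class labels.
tree = (
    "age", 40,
    ("income", 2.0, 1, 0),
    ("income", 5.0, ("age", 60, 1, 0), 0),
)

def trace_path(node, sample):
    """Return the list of split conditions followed for a sample, plus the leaf label."""
    path = []
    while isinstance(node, tuple):
        feature, threshold, left, right = node
        go_left = sample[feature] <= threshold
        path.append(f"{feature} {'<=' if go_left else '>'} {threshold}")
        node = left if go_left else right
    return path, node

ratios = odds_ratios(coefficients)
path, label = trace_path(tree, {"age": 55, "income": 3.0})

print("odds ratios:", ratios)
print("tree path:", " -> ".join(path), "=> class", label)
```

The logistic model is summarized once, for all predictions, by one number per feature. The tree's explanation is specific to each sample and grows with the depth of the path, which is the effort the paragraph above describes.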
References
Rudd, J. & Priestley, J. (2017). A Comparison of Decision Tree with Logistic Regression Model
for Prediction of Worst Non-Financial Payment Status in Commercial Credit.
Rudolfer, M. S., Paliouras, G., & Peers, S. I. (1999). A Comparison of Logistic Regression to
Decision Tree Induction in the Diagnosis of Carpal Tunnel Syndrome. Computers and
Biomedical Research, 32(5), 391-414.