Supervised Learning
Contents
10. Supervised Learning¶
Supervised learning uses labeled datasets to train algorithms that to classify data or predict outcomes accurately. As input data is fed into the model, it adjusts its weights until the model has been fitted appropriately, which occurs as part of the cross validation process.
In contrast, unsupervised learning uses unlabeled data to discover patterns that help solve for clustering or association problems. This is particularly useful when subject matter experts are unsure of common properties within a data set.
10.1. The scikit-learn Package¶
One of the best known package for machine learning is scikit-learn.
It provides efficient versions of a large number of common algorithms with a
clean, iuniform, and streamlined API.
To install with pip:
pip install scikit-learn
Note that sklearn and scikit-learn both refer to the same package.
For better-looking visualizations, we need the graphviz package. It can be
installed with conda easily:
conda install python-graphviz.
To install with pip, the system library graphviz needs to be
installed first. On a Mac, for example, one could do so with
brew install graphviz.
Then the graphviz Python package can be installed with
pip install graphviz.