# Machine Learning

## Origins of machine learning

• Machine learning has its origins in statistics and mathematical modeling of data

Fundamental idea of machine learning

• The fundamental idea of machine learning is to use data from past observations to predict unknown outcomes or values

Examples

• An ice cream store owner using historical sales and weather records to predict daily ice cream sales

• A doctor using clinical data to predict a patient's risk of diabetes

• A researcher using past observations to automate the identification of penguin species

## Machine learning model

• A machine learning model is a software application that encapsulates a function to calculate an output value from one or more input values

• The process of defining the model's function is known as training

• After training, the model can be used to predict new values in a process called inferencing.

Training data

• The training data consists of past observations

• Observations include the observed features and the known label

• Features are often referred to as x, and the label as y

Examples

• In the ice cream sales scenario, features (x) are weather measurements and the label (y) is the number of ice creams sold

• In the medical scenario, features (x) are patient measurements and the label (y) is the likelihood of diabetes

• In the Antarctic research scenario, features (x) are penguin attributes and the label (y) is the species

Algorithm and model

• An algorithm is applied to determine a relationship between features and label

• The result is a model: a function, denoted f, that maps feature values to a label, so that y = f(x)

• The model is used for inferencing by inputting feature values and receiving a prediction of the label

• The output from the model is often denoted as ŷ or "y-hat"
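
As a minimal sketch of inferencing, assume a model has already been trained and that its function f happens to be a straight line. The slope and intercept below are hypothetical values chosen for illustration, not results learned from real data.

```python
def f(x: float) -> float:
    """Hypothetical trained model: maps a temperature (x) to predicted sales."""
    # Made-up parameters; a real training run would learn these from data.
    slope, intercept = 1.0, -50.0
    return slope * x + intercept


# Inferencing: pass in a feature value, receive a predicted label (y-hat).
y_hat = f(77.0)
print(y_hat)  # 27.0
```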

# Types of machine learning

## Supervised machine learning

• Training data includes both feature values and known label values

• Used to train models by determining a relationship between features and labels

• The trained model predicts unknown labels from the feature values of future cases

## Regression

• Form of supervised machine learning with numeric label predictions

• Predicts values like number of ice creams sold or selling price of a property

## Classification

• Form of supervised machine learning with categorical label predictions

• Two common scenarios: binary classification and multiclass classification

## Binary classification

• Predicts one of two outcomes, true/false or positive/negative

• Examples: risk for diabetes, loan default, response to marketing offer

## Multiclass classification

• Predicts one of multiple possible classes

• Examples: species of a penguin, genre of a movie

## Unsupervised machine learning

• Training data consists only of feature values without known labels

## Clustering

• Most common form of unsupervised machine learning

• Identifies similarities between observations based on features and groups them into clusters

• Examples: grouping flowers, identifying similar customers

Segmenting Customers:

• Use clustering to segment customers into groups with similar feature values

Analyzing Customer Groups:

• Identify and categorize different classes of customers

• Examples of customer classes could include high value-low volume customers, frequent small purchasers, etc.

Labeling Clustering Results:

• Use categorizations to label observations in clustering results

Training a Classification Model:

• Utilize the labeled data to train a classification model

• The model will predict which customer category a new customer might belong to.
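
The clustering-then-classification workflow above can be sketched in plain Python. This is only an illustration under stated assumptions: the single feature (average order value), the numbers, the starting centroids, and the category names are all hypothetical, and a nearest-centroid rule stands in for whatever classification algorithm would be used in practice.

```python
def k_means_1d(values, centroids, iterations=10):
    """Tiny k-means for one numeric feature; returns centroids and assignments."""
    centroids = list(centroids)
    assignments = []
    for _ in range(iterations):
        # Assignment step: each observation joins its nearest centroid.
        assignments = [
            min(range(len(centroids)), key=lambda k: abs(v - centroids[k]))
            for v in values
        ]
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [v for v, a in zip(values, assignments) if a == k]
            if members:
                centroids[k] = sum(members) / len(members)
    return centroids, assignments


# Hypothetical feature values: average order value per customer.
order_values = [12.0, 15.0, 11.0, 14.0, 210.0, 190.0, 205.0]
centroids, clusters = k_means_1d(order_values, centroids=[10.0, 200.0])

# An analyst reviews each cluster and gives it a category name (hypothetical).
cluster_names = {0: "frequent small purchaser", 1: "high value-low volume"}
labeled = [(v, cluster_names[c]) for v, c in zip(order_values, clusters)]


def classify(order_value):
    """Nearest-centroid classifier built from the labeled clusters."""
    nearest = min(range(len(centroids)), key=lambda k: abs(order_value - centroids[k]))
    return cluster_names[nearest]


print(classify(180.0))  # high value-low volume
```

The labeled list plays the role of the training data for the classifier; with real data, the clusters would usually be formed from several features rather than one.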

# Regression

Training a Regression Model

• Regression models are trained to predict numeric label values based on training data

• The training data includes both features and known labels

• The training process involves multiple iterations

• An appropriate algorithm is used to train the model

• The model's predictive performance is evaluated

• The model is refined by repeating the training process with different algorithms and parameters

• The goal is to achieve an acceptable level of predictive accuracy

Key Elements of the Training Process

• Splitting the data (typically at random) into one subset for training the model and a held-back subset for validation

• Using an algorithm (e.g., linear regression) to fit the training data to a model

• Using the validation data to test the model by predicting labels for the features

• Comparing the predicted labels with the actual labels in the validation dataset

• Calculating a metric to indicate the accuracy of the model's predictions
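
The steps above can be sketched end to end with a single-feature linear regression fitted by ordinary least squares. The observations are made-up numbers, and the fixed every-third-row split is an assumption that keeps the sketch reproducible; a random split is more common in practice.

```python
# Hypothetical observations: (temperature, ice creams sold).
observations = [(52, 2), (55, 6), (60, 9), (63, 14), (67, 16),
                (70, 21), (72, 22), (75, 24), (78, 29), (81, 30)]

# 1. Split: hold back every third observation for validation.
train = [obs for i, obs in enumerate(observations) if i % 3 != 2]
validation = [obs for i, obs in enumerate(observations) if i % 3 == 2]


# 2. Fit: ordinary least squares for one feature.
def fit_linear(data):
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in data)
             / sum((x - mean_x) ** 2 for x, _ in data))
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_linear(train)

# 3. Predict labels for the validation features.
predictions = [slope * x + intercept for x, _ in validation]

# 4. Compare predicted labels with the actual labels.
errors = [abs(p - y) for p, (_, y) in zip(predictions, validation)]

# 5. Summarize the errors in a single metric (here, mean absolute error).
mae = sum(errors) / len(errors)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, MAE={mae:.2f}")
```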

Example of Regression

• Training a model to predict ice cream sales based on temperature as the feature

• Historic data includes records of daily temperatures and ice cream sales.

### Mean Absolute Error (MAE)

• The mean absolute error (MAE) measures the average absolute difference between predicted and actual values.

• In the ice cream example, the MAE is calculated by finding the mean of the absolute errors (2, 3, 3, 1, 2, and 3), resulting in a value of 2.33.
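
The arithmetic can be checked in a few lines, using the six absolute errors listed above:

```python
# Absolute errors from the ice cream example.
absolute_errors = [2, 3, 3, 1, 2, 3]

# MAE: the mean of the absolute errors.
mae = sum(absolute_errors) / len(absolute_errors)
print(round(mae, 2))  # 2.33
```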

### Mean Squared Error (MSE)

• The mean squared error (MSE) measures the average squared difference between predicted and actual values.

• It amplifies larger errors by squaring individual errors and calculating the mean of the squared values.

• In the ice cream example, the MSE is calculated by finding the mean of the squared errors (4, 9, 9, 1, 4, and 9), resulting in a value of 6.

### Root Mean Squared Error (RMSE)

• The root mean squared error (RMSE) is calculated by taking the square root of the MSE.

• In the ice cream example, the RMSE is calculated as the square root of 6, resulting in a value of 2.45. Unlike the MSE, it is expressed in the same unit as the label (ice creams).
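
As a quick check of the MSE and RMSE figures, the same six errors from the ice cream example can be squared, averaged, and square-rooted:

```python
import math

# Absolute errors from the ice cream example.
absolute_errors = [2, 3, 3, 1, 2, 3]

# Squaring amplifies the larger errors: [4, 9, 9, 1, 4, 9].
squared_errors = [e ** 2 for e in absolute_errors]

mse = sum(squared_errors) / len(squared_errors)
rmse = math.sqrt(mse)
print(mse, round(rmse, 2))  # 6.0 2.45
```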

### Coefficient of determination (R²)

• The coefficient of determination (R²) measures the proportion of variance in the validation results explained by the model.

• R² values typically range between 0 and 1, with values closer to 1 indicating a better fit. A model that fits worse than always predicting the mean label produces a negative value.

• In the ice cream example, the R² calculated from the validation data is 0.95, indicating that the model explains 95% of the variance in the data.
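
The coefficient of determination is commonly computed as one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below applies that formula to made-up actual and predicted values; they are not the ice cream data, so the result differs from the 0.95 quoted above.

```python
def r_squared(actual, predicted):
    """R-squared = 1 - (sum of squared residuals / total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    ss_tot = sum((y - mean_actual) ** 2 for y in actual)
    return 1 - ss_res / ss_tot


# Hypothetical validation labels and model predictions.
actual = [10, 20, 30, 40]
predicted = [12, 18, 33, 39]
print(round(r_squared(actual, predicted), 3))  # 0.964

# Predictions that are worse than the mean give a negative score.
print(r_squared(actual, [40, 30, 20, 10]))  # -3.0
```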

### Iterative training

• In real-world scenarios, data scientists use an iterative process to train and evaluate models.

• This process involves varying feature selection, algorithm selection, and algorithm parameters to improve model performance.

Selection of the best model

• The model that results in the best evaluation metric is selected

• The selected model should have an acceptable evaluation metric for the specific scenario.
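
One way to picture the selection step is to score several candidate models on the same validation data and keep the one with the lowest error. Everything in this sketch is hypothetical: the candidate functions stand in for models produced by different training runs, the validation rows are made up, and the acceptance limit of 2.0 is an arbitrary example.

```python
import math

# Hypothetical validation data: (temperature, ice creams sold).
validation = [(60, 9), (70, 21), (78, 29)]

# Hypothetical candidates, as if produced by different algorithms or parameters.
candidates = {
    "linear_a": lambda x: 1.0 * x - 50.0,
    "linear_b": lambda x: 0.8 * x - 35.0,
    "constant": lambda x: 20.0,
}


def rmse(model, data):
    """Root mean squared error of a model over (feature, label) pairs."""
    return math.sqrt(sum((model(x) - y) ** 2 for x, y in data) / len(data))


scores = {name: rmse(model, validation) for name, model in candidates.items()}

# Lower RMSE is better, so the best model is the one with the minimum score.
best_name = min(scores, key=scores.get)

# The best candidate still has to meet the scenario's own requirement.
max_acceptable_rmse = 2.0  # arbitrary limit for illustration
is_acceptable = scores[best_name] <= max_acceptable_rmse
print(best_name, is_acceptable)  # linear_a True
```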

# Binary classification

Classification in machine learning

• Classification is a supervised machine learning technique

• It follows an iterative process of training, validating, and evaluating models

Binary classification

• Binary classification predicts one of two possible labels for a single class

• It often uses multiple features (x) and a y value of 1 or 0

Example - binary classification

• In a simplified example, blood glucose level is used to predict diabetes

• The model predicts whether the label (y) is 1 (diabetes) or 0 (no diabetes)

Training a binary classification model

• To train the model, we use an algorithm to fit the training data to a function that calculates the probability of the class label being true (diabetes)

• The probability is a value between 0.0 and 1.0; the closer it is to 1.0, the more likely the class label is true (diabetes)

• The function describes the probability of the class label being true for a given value of x

• In the example, three observations in the training data have a known class label of true (1.0), and three have a known class label of false (0.0)

• The function plots as an S-shaped (sigmoid) curve; probabilities above the threshold yield a prediction of true (1), and probabilities below it yield a prediction of false (0)

• In this example, the threshold is set at a probability of 0.5

• By applying the function to new data, we can predict the class label (diabetes) based on the probability output
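
A common function with this S shape is the logistic (sigmoid) function. The sketch below assumes that function and uses hypothetical weight and bias values rather than trained ones, so the probabilities are illustrative only. Treating a probability exactly at the threshold as true is also an assumption.

```python
import math


def predict_probability(glucose, weight=0.1, bias=-10.0):
    """Logistic function; weight and bias are made-up, not learned from data."""
    return 1.0 / (1.0 + math.exp(-(weight * glucose + bias)))


def predict_label(glucose, threshold=0.5):
    """Return 1 (diabetes) if the probability reaches the threshold, else 0."""
    return 1 if predict_probability(glucose) >= threshold else 0


for glucose in (80, 120):
    probability = predict_probability(glucose)
    print(glucose, round(probability, 2), predict_label(glucose))
```

Raising the threshold makes the model predict true less often, and lowering it has the opposite effect.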