# Machine Learning

## Origins of machine learning

- Machine learning has its origins in statistics and mathematical modeling of data

Fundamental idea of machine learning

- The fundamental idea of machine learning is to use data from past observations to predict unknown outcomes or values

Examples

An ice cream store owner using historical sales and weather records to predict daily ice cream sales

A doctor using clinical data to predict a patient's risk of diabetes

A researcher using past observations to automate the identification of penguin species

## Machine learning model

A machine learning model is a software application that calculates an output value based on input values

The process of defining the model's function is known as training

After training, the model can be used to predict new values in a process called inferencing.

Training data

The training data consists of past observations

Observations include the observed features and the known label

Features are often referred to as x, and the label as y

Examples

In the ice cream sales scenario, features (x) are weather measurements and the label (y) is the number of ice creams sold

In the medical scenario, features (x) are patient measurements and the label (y) is the likelihood of diabetes

In the Antarctic research scenario, features (x) are penguin attributes and the label (y) is the species

Algorithm and model

An algorithm is applied to determine a relationship between features and label

The result is a model that is a function denoted as f

The model is used for inferencing by inputting feature values and receiving a prediction of the label

The output from the model is often denoted as ŷ or "y-hat"

**Types of machine learning**

## Supervised machine learning

Training data includes both feature values and known label values

Used to train models by determining a relationship between features and labels

Predicts unknown labels for features in future cases

## Regression

Form of supervised machine learning with numeric label predictions

Predicts values like number of ice creams sold or selling price of a property

## Classification

Form of supervised machine learning with categorical label predictions

Two common scenarios: binary classification and multiclass classification

## Binary classification

Predicts one of two outcomes, true/false or positive/negative

Examples: risk for diabetes, loan default, response to marketing offer

## Multiclass classification

Predicts one of multiple possible classes

Examples: species of a penguin, genre of a movie

## Unsupervised machine learning

- Training data consists only of feature values without known labels

## Clustering

Most common form of unsupervised machine learning

Identifies similarities between observations based on features and groups them into clusters

Examples: grouping flowers, identifying similar customers

Segmenting Customers:

- Segment customers into groups

Analyzing Customer Groups:

Identify and categorize different classes of customers

Examples of customer classes could include high value-low volume customers, frequent small purchasers, etc.

Labeling Clustering Results:

- Use categorizations to label observations in clustering results

Training a Classification Model:

Utilize the labeled data to train a classification model

The model will predict which customer category a new customer might belong to.

**Regression**

Training a Regression Model

Regression models are trained to predict numeric label values based on training data

The training data includes both features and known labels

The training process involves multiple iterations

An appropriate algorithm is used to train the model

The model's predictive performance is evaluated

The model is refined by repeating the training process with different algorithms and parameters

The goal is to achieve an acceptable level of predictive accuracy

Key Elements of the Training Process

Splitting the training data to create a dataset for training the model and another subset for validation

Using an algorithm (e.g., linear regression) to fit the training data to a model

Using the validation data to test the model by predicting labels for the features

Comparing the predicted labels with the actual labels in the validation dataset

Calculating a metric to indicate the accuracy of the model's predictions

Example of Regression

Training a model to predict ice cream sales based on temperature as the feature

Historic data includes records of daily temperatures and ice cream sales.

### Mean Absolute Error (MAE)

The mean absolute error (MAE) measures the average absolute difference between predicted and actual values.

In the ice cream example, the MAE is calculated by finding the mean of the absolute errors (2, 3, 3, 1, 2, and 3), resulting in a value of 2.33.

### Mean Squared Error (MSE)

The mean squared error (MSE) measures the average squared difference between predicted and actual values.

It amplifies larger errors by squaring individual errors and calculating the mean of the squared values.

In the ice cream example, the MSE is calculated by finding the mean of the squared absolute values (4, 9, 9, 1, 4, and 9), resulting in a value of 6.

### Root Mean Squared Error (RMSE)

The root mean squared error (RMSE) is calculated by taking the square root of the MSE.

In the ice cream example, the RMSE is calculated as the square root of 6, resulting in a value of 2.45 (ice creams).

### Coefficient of determination (R2)

The coefficient of determination (R2) measures the proportion of variance in the validation results explained by the model.

R2 values range between 0 and 1, with higher values indicating a better fit.

In the ice cream example, the R2 calculated from the validation data is 0.95, indicating that the model explains 95% of the variance in the data.

### Iterative training

In real-world scenarios, data scientists use an iterative process to train and evaluate models.

This process involves varying feature selection, algorithm selection, and algorithm parameters to improve model performance.

Selection of the best model

The model that results in the best evaluation metric is selected

The selected model should have an acceptable evaluation metric for the specific scenario.

**Binary classification**

Classification in machine learning

Classification is a supervised machine learning technique

It follows an iterative process of training, validating, and evaluating models

Binary classification

Binary classification predicts one of two possible labels for a single class

It often uses multiple features (x) and a y value of 1 or 0

Example - binary classification

In a simplified example, blood glucose level is used to predict diabetes

The model predicts whether the label (y) is 1 (diabetes) or 0 (no diabetes)

Training a binary classification model

To train the model, we use an algorithm to fit the training data to a function that calculates the probability of the class label being true (diabetes)

The probability is measured between 0.0 and 1.0, where 1.0 represents a high probability of having diabetes

The function describes the probability of the class label being true for a given value of x

Three observations in the training data have a known class label of true (1.0), and three observations have a known class label of false (0.0)

An S-shaped curve represents the probability distribution, where values above the threshold predict true (1) and values below predict false (0)

The threshold is defined at a probability of 0.5

By applying the function to new data, we can predict the class label (diabetes) based on the probability output