
Math Concepts for Supervised and Unsupervised Machine Learning

  • Writer: Joy Tech
  • Mar 18, 2023
  • 2 min read

Updated: Mar 20, 2023


I. Introduction

  • Mathematical foundations of machine learning: Probability theory, linear algebra, calculus

  • Types of machine learning: Supervised, unsupervised, reinforcement

  • Applications of machine learning: Natural language processing, computer vision, fraud detection, recommendation systems

II. Supervised Learning

A. Basics of Supervised Learning

  • Types of problems:

    • Classification: Bayes' theorem, decision boundaries, softmax function, cross-entropy loss

    • Regression: Linear algebra (matrices, vectors), calculus (derivatives, partial derivatives), loss functions (mean squared error, mean absolute error)

  • Learning process:

    • Training: Optimization algorithms (gradient descent, stochastic gradient descent, Adam), backpropagation (chain rule of derivatives)

    • Testing: Prediction, inference

  • Evaluation metrics:

    • Classification: Confusion matrix, accuracy, precision, recall, F1 score, ROC curve, AUC

    • Regression: Mean squared error, mean absolute error, R-squared, explained variance

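To make the classification metrics concrete, here is a minimal numpy sketch (the labels are made up for illustration) that derives accuracy, precision, recall, and F1 from confusion-matrix counts:

```python
import numpy as np

# Illustrative true and predicted binary labels
y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0])

# Confusion-matrix counts for the positive class
tp = np.sum((y_pred == 1) & (y_true == 1))
fp = np.sum((y_pred == 1) & (y_true == 0))
fn = np.sum((y_pred == 0) & (y_true == 1))
tn = np.sum((y_pred == 0) & (y_true == 0))

accuracy  = (tp + tn) / len(y_true)
precision = tp / (tp + fp)   # of predicted positives, how many are right
recall    = tp / (tp + fn)   # of actual positives, how many were found
f1        = 2 * precision * recall / (precision + recall)

print(accuracy, precision, recall, f1)
```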

B. Linear Models

  • Linear Regression:

    • Optimization algorithms: Normal equation, gradient descent

    • Linear algebra: Matrix multiplication, inverse, transpose

    • Calculus: Partial derivatives, gradients

  • Logistic Regression:

    • Optimization algorithms: Gradient descent, Newton's method

    • Bayes' theorem: Probability theory, conditional probabilities

    • Linear algebra: Matrix multiplication

  • Naive Bayes Classifier:

    • Bayes' theorem: Probability theory, conditional probabilities

    • Probability theory: Joint probabilities, marginal probabilities

  • Support Vector Machines:

    • Optimization algorithms: Quadratic programming, dual problem, kernel methods

    • Calculus: Lagrange multipliers, partial derivatives

    • Linear algebra: Inner product, norms, Gram matrix

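As a small illustration of the linear algebra and calculus above, this sketch fits linear regression on synthetic data two ways, via the normal equation and via batch gradient descent on the mean squared error; both should land on nearly the same weights:

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.c_[np.ones(100), rng.normal(size=(100, 2))]  # design matrix with bias column
true_w = np.array([1.0, 2.0, -3.0])
y = X @ true_w + 0.1 * rng.normal(size=100)

# Normal equation: w = (X^T X)^(-1) X^T y
w_closed = np.linalg.solve(X.T @ X, X.T @ y)

# Batch gradient descent on the mean squared error
w = np.zeros(3)
lr = 0.1
for _ in range(1000):
    grad = 2 / len(y) * X.T @ (X @ w - y)  # gradient of the MSE
    w -= lr * grad

print(w_closed, w)  # both should be close to true_w
```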

C. Tree-Based Models

  • Decision Trees:

    • Entropy: Information theory, probability theory

    • Information gain: Entropy, conditional probabilities

    • Tree traversal: Depth-first search, breadth-first search

  • Random Forests:

    • Ensemble learning: Bagging, bootstrap sampling

    • Decision trees: Gini impurity, information gain

  • Gradient Boosted Trees:

    • Gradient descent: Optimization algorithm, partial derivatives

    • Decision trees: Regression trees, loss functions (mean squared error, mean absolute error)

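Entropy and information gain, the quantities a decision tree uses to pick splits, fit in a few lines of numpy. The labels and the binary feature below are made up for illustration:

```python
import numpy as np

def entropy(labels):
    """Shannon entropy of a label array, in bits."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

# Illustrative split of 10 labels on a binary feature
y = np.array([0, 0, 0, 1, 1, 1, 1, 1, 0, 1])
feature = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1, 1])

parent = entropy(y)
children = sum(
    (feature == v).mean() * entropy(y[feature == v])
    for v in np.unique(feature)
)
info_gain = parent - children  # reduction in entropy from the split
print(parent, info_gain)
```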

D. Instance-Based Models

  • k-Nearest Neighbors:

    • Distance metrics: Euclidean distance, Manhattan distance, cosine similarity

    • Voronoi diagrams: Geometry, Delaunay triangulation

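Here is a minimal k-nearest-neighbors sketch using Euclidean distance and majority voting; the toy training points are chosen so the two clusters are obvious:

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    """Classify x by majority vote among its k nearest training points."""
    dists = np.linalg.norm(X_train - x, axis=1)  # Euclidean distance
    nearest = np.argsort(dists)[:k]
    values, counts = np.unique(y_train[nearest], return_counts=True)
    return values[np.argmax(counts)]

X_train = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]])
y_train = np.array([0, 0, 0, 1, 1, 1])
print(knn_predict(X_train, y_train, np.array([0.5, 0.5])))  # -> 0
print(knn_predict(X_train, y_train, np.array([5.5, 5.5])))  # -> 1
```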

E. Deep Learning Models

  • Artificial Neural Networks:

    • Perceptron learning rule: Activation functions, linear combinations

    • Backpropagation: Chain rule of derivatives, gradients

    • Activation functions: Sigmoid, ReLU, softmax

  • Convolutional Neural Networks:

    • Convolutional layers: Convolution operation, feature maps, stride, padding

    • Pooling layers: Max pooling, average pooling

    • ReLU activation: Nonlinearity, rectification

  • Recurrent Neural Networks:

    • Backpropagation through time: Unfolding, gradients

    • LSTM units: Memory cells, input/output/forget gates, activation functions (sigmoid, tanh)

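To show backpropagation's chain rule end to end, here is a minimal one-hidden-layer network with sigmoid activations trained on XOR by hand-written gradients (a toy setup; real networks use a framework's automatic differentiation):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)  # XOR targets

W1 = rng.normal(size=(2, 4)); b1 = np.zeros(4)
W2 = rng.normal(size=(4, 1)); b2 = np.zeros(1)
lr = 1.0

for _ in range(5000):
    # Forward pass
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # Backward pass: chain rule through the squared-error loss
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= lr * h.T @ d_out; b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h;  b1 -= lr * d_h.sum(axis=0)

print(out.round(2))  # should approach [[0], [1], [1], [0]]
```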

III. Unsupervised Learning

A. Basics of Unsupervised Learning

  • Probability distributions and density estimation

  • Clustering methods: k-means, hierarchical clustering, density-based clustering

  • Dimensionality reduction techniques: principal component analysis (PCA), independent component analysis (ICA), non-negative matrix factorization (NMF), t-distributed stochastic neighbor embedding (t-SNE)

  • Information theory: entropy, mutual information, KL divergence

  • Optimization: gradient descent, stochastic gradient descent

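The information-theory quantities above take only a few lines of numpy; p and q below are illustrative discrete distributions:

```python
import numpy as np

p = np.array([0.5, 0.3, 0.2])  # illustrative discrete distributions
q = np.array([0.4, 0.4, 0.2])

entropy_p = -np.sum(p * np.log2(p))  # H(p)
kl_pq = np.sum(p * np.log2(p / q))   # D_KL(p || q); note it is asymmetric
print(entropy_p, kl_pq)
```
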
B. Clustering Algorithms

  • Distance measures: Euclidean distance, Manhattan distance, cosine similarity

    • Objective functions and cluster-quality measures: inertia (within-cluster sum of squares), silhouette score

  • Optimization: Lloyd's algorithm for k-means, agglomerative hierarchical clustering

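A minimal sketch of Lloyd's algorithm for k-means, alternating nearest-center assignment with centroid updates on two synthetic, well-separated blobs:

```python
import numpy as np

def kmeans(X, k, iters=100, seed=0):
    """Lloyd's algorithm: alternate assignment and centroid update."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        # Assign each point to its nearest center (Euclidean distance)
        dists = np.linalg.norm(X[:, None] - centers[None], axis=2)
        labels = dists.argmin(axis=1)
        # Move each center to the mean of its assigned points
        centers = np.array([X[labels == j].mean(axis=0) for j in range(k)])
    inertia = ((X - centers[labels]) ** 2).sum()  # within-cluster sum of squares
    return labels, centers, inertia

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(6, 1, (50, 2))])
labels, centers, inertia = kmeans(X, k=2)
print(centers, inertia)
```
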
C. Dimensionality Reduction Algorithms

  • Linear algebra: eigenvectors, eigenvalues, singular value decomposition (SVD)

  • Optimization: gradient descent, stochastic gradient descent, Adam optimization

  • Information theory: entropy, mutual information, KL divergence

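PCA reduces to the SVD of the centered data matrix; this sketch (on synthetic data with one deliberately correlated column) recovers explained variances and projects onto the top two components:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
X[:, 1] = 2 * X[:, 0] + 0.1 * rng.normal(size=200)  # add a correlated column

Xc = X - X.mean(axis=0)                # center the data
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

explained_var = S**2 / (len(X) - 1)    # eigenvalues of the covariance matrix
components = Vt                        # eigenvectors (principal directions)
Z = Xc @ Vt[:2].T                      # project onto the top-2 components
print(explained_var, Z.shape)
```
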
IV. Reinforcement Learning

A. Basics of Reinforcement Learning

  • Probability theory: Markov decision processes, stochastic policies, state transition probabilities

  • Bellman equation: state-value function, action-value function, optimal policy

  • Exploration vs. exploitation trade-off

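A minimal value-iteration sketch on a made-up two-state, two-action MDP, applying the Bellman optimality backup until the state values settle:

```python
import numpy as np

# Illustrative MDP: P[a, s, s'] transition probabilities and R[s, a] rewards
P = np.array([[[0.9, 0.1], [0.2, 0.8]],   # action 0
              [[0.1, 0.9], [0.7, 0.3]]])  # action 1
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])
gamma = 0.9

V = np.zeros(2)
for _ in range(200):
    Q = R + gamma * (P @ V).T  # Q[s, a] = R(s, a) + gamma * E[V(s')]
    V = Q.max(axis=1)          # Bellman optimality backup

policy = Q.argmax(axis=1)      # greedy (optimal) policy
print(V, policy)
```
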
B. Algorithms

  • Q-Learning: Bellman equation, off-policy learning, epsilon-greedy policy

  • Deep Q-Networks: experience replay, target network, neural network approximators

  • Policy gradient methods: policy gradient theorem, REINFORCE algorithm

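Here is a tabular Q-learning sketch on a toy chain environment (states and rewards invented for illustration), with an epsilon-greedy behaviour policy and the Bellman-equation update:

```python
import numpy as np

# Toy 1-D chain: states 0..4, reward 1 for reaching state 4
n_states, n_actions = 5, 2  # actions: 0 = left, 1 = right
Q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.1, 0.95, 0.1
rng = np.random.default_rng(0)

def step(s, a):
    s2 = max(0, s - 1) if a == 0 else min(n_states - 1, s + 1)
    r = 1.0 if s2 == n_states - 1 else 0.0
    return s2, r

for _ in range(2000):
    s = 0
    while s != n_states - 1:
        # epsilon-greedy action selection (off-policy behaviour)
        a = rng.integers(n_actions) if rng.random() < eps else Q[s].argmax()
        s2, r = step(s, a)
        # Q-learning update derived from the Bellman equation
        Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])
        s = s2

print(Q.argmax(axis=1))  # learned greedy policy: should move right
```
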
V. Advanced Topics

A. Hyperparameter Tuning

  • Optimization: grid search, random search, Bayesian optimization

  • Cross-validation, overfitting

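A minimal grid-search sketch: try each hyperparameter value (here, polynomial degree on synthetic data), score on a held-out split, and keep the best. Real workflows would use k-fold cross-validation rather than a single split:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 60)
y = np.sin(3 * x) + 0.1 * rng.normal(size=60)
x_tr, y_tr, x_val, y_val = x[:40], y[:40], x[40:], y[40:]

def fit_poly(x, y, degree):
    """Least-squares fit of a degree-d polynomial."""
    return np.linalg.lstsq(np.vander(x, degree + 1), y, rcond=None)[0]

best = None
for degree in range(1, 10):  # the hyperparameter grid
    w = fit_poly(x_tr, y_tr, degree)
    val_mse = np.mean((np.vander(x_val, degree + 1) @ w - y_val) ** 2)
    if best is None or val_mse < best[1]:
        best = (degree, val_mse)
print(best)  # degree with the lowest validation error
```
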
B. Regularization

  • L1 regularization, L2 regularization, Elastic Net regularization

  • Ridge regression, Lasso regression

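Ridge regression makes the effect of L2 regularization easy to see in closed form; the sketch below (synthetic data) compares ordinary least squares with the ridge solution, whose weight norm shrinks. Lasso's L1 penalty has no closed form and is typically solved iteratively:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
y = X @ np.array([3.0, 0.0, 0.0, -2.0, 0.0]) + rng.normal(size=50)

lam = 5.0
d = X.shape[1]
w_ols = np.linalg.solve(X.T @ X, X.T @ y)
# The L2 (ridge) penalty adds lam * I to the normal equations, shrinking weights
w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

print(np.linalg.norm(w_ols), np.linalg.norm(w_ridge))  # ridge norm is smaller
```
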
C. Gradient Descent

  • Batch gradient descent, stochastic gradient descent, mini-batch gradient descent

  • Momentum, learning rate scheduling

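The gradient-descent variants differ only in how much data feeds each update; this sketch runs mini-batch SGD with momentum on a synthetic least-squares problem:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(256, 10))
y = X @ rng.normal(size=10)

def grad(w, Xb, yb):
    """MSE gradient on a mini-batch."""
    return 2 / len(yb) * Xb.T @ (Xb @ w - yb)

w = np.zeros(10); v = np.zeros(10)
lr, beta, batch = 0.05, 0.9, 32
for epoch in range(100):
    idx = rng.permutation(len(X))         # reshuffle each epoch
    for start in range(0, len(X), batch): # mini-batch SGD
        b = idx[start:start + batch]
        v = beta * v + grad(w, X[b], y[b])  # momentum accumulates gradients
        w -= lr * v

print(np.mean((X @ w - y) ** 2))  # should be near zero
```
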
D. Ensemble Methods

  • Bagging: bootstrap aggregating, random forest

  • Boosting: adaptive boosting, gradient boosting, XGBoost

  • Stacking: meta-learner, ensemble of models

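Bagging's mechanics fit in a short sketch: fit each base model on a bootstrap resample and average their predictions. Linear regressors stand in here for simplicity; a random forest uses decision trees as the base learner:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.3 * rng.normal(size=100)

def fit_ls(X, y):
    """Least-squares base learner."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Bagging: fit each model on a bootstrap resample, then average predictions
models = []
for _ in range(25):
    idx = rng.integers(0, len(X), len(X))  # sample with replacement
    models.append(fit_ls(X[idx], y[idx]))

x_new = np.array([[0.5, 0.1, -0.2]])
pred = np.mean([x_new @ w for w in models])  # ensemble average
print(pred)
```
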
E. Transfer Learning

  • Pretrained models, fine-tuning

  • Domain adaptation, multi-task learning

F. Explainable AI

  • Local Interpretable Model-agnostic Explanations (LIME)

  • SHapley Additive exPlanations (SHAP)

