02/03/2022 - Sara (ESR #8)


Machine learning is a form of artificial intelligence in which the system learns from data rather than through explicit programming. Almost every machine learning algorithm has an optimization algorithm at its core. In this post, I will illustrate and explain a simple optimization algorithm, gradient descent, which is frequently used to find values of parameters that cannot be calculated analytically and therefore have to be estimated iteratively.

Technically, this algorithm aims to minimize a given mathematical function (say, a cost function) through a repetitive stepwise approach, involving the following iterative steps (see the sketch after this list):

  1. Compute the first order derivative of the function at the current point (gradient)
  2. Take a step in the direction opposite to the gradient, i.e. away from the direction of increasing slope, moving from the current point by gamma (the learning rate) times the gradient at that point, and
  3. Repeat until convergence.
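
To make these steps concrete, here is a minimal sketch in Python of the update rule x_new = x - gamma * gradient. The name gradient_descent and its parameters are illustrative choices of mine, not part of any particular library:

```python
import numpy as np

def gradient_descent(grad_f, x0, gamma=0.1, tol=1e-6, max_iter=1000):
    """Minimize a cost function given its gradient, following the three steps above.

    grad_f   : callable returning the gradient of the cost function at a point
    x0       : starting point for the iteration
    gamma    : learning rate (step size)
    tol      : stop once the step becomes smaller than this
    max_iter : safety cap on the number of iterations
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad_f(x)                        # step 1: gradient at the current point
        x_new = x - gamma * g                # step 2: step against the gradient
        if np.linalg.norm(x_new - x) < tol:  # step 3: repeat until convergence
            return x_new
        x = x_new
    return x
```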

Admittedly, this is a rather technical description, so let's think about a bowl of cereal. The bowl is the plot of the cost function, and its bottom is the minimum of the cost function. A random position on the surface of the bowl corresponds to the current values of the coefficients. The goal is to compute the value of the cost function for different values of the coefficients and select the coefficients that give a smaller cost. We then repeat this process until we reach the coefficients that give the minimum cost, i.e. the bottom of the bowl.
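
Continuing the analogy, a perfect "bowl" is the paraboloid f(w) = w1^2 + w2^2, whose bottom sits at the origin. Running the sketch above on this hypothetical test function (not the cost function from my own work), the iterates roll down to the bottom:

```python
# The "bowl": f(w) = w[0]**2 + w[1]**2, whose gradient is 2*w.
bottom = gradient_descent(grad_f=lambda w: 2 * w, x0=[3.0, -4.0], gamma=0.1)
print(bottom)  # values very close to [0, 0], the bottom of the bowl
```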

As part of my PhD, I am using the gradient descent algorithm for parameter estimation in a fluid-structure interaction model of the kind often used to build haemodynamic models.