This page explains how to estimate parameters in an ODE model using MATLAB's lsqcurvefit function. The script and data files are available for download.

lsqcurvefit uses [least squares](https://mathworld.wolfram.com/LeastSquaresFitting.html#:~:text=A mathematical procedure for finding,the points from the curve.), which minimizes the total sum of the squared errors (SSE) between the model and the data.

For this example, an SIR model will be fit to data from a flu outbreak in an English boarding school in 1978. A total of 763 boys were at risk and 512 became infected. An article on the outbreak can be found in the British Medical Journal, while the exact data used in this example is taken from Dr. Brian Reich's fitting tutorial.

**Note: This tutorial goes in the order I follow when writing my own scripts, although the order that MATLAB requires overall is different. In particular, MATLAB requires all function definitions go at the end of the script. This is clear when looking at the SIR_lsqcurvefit_example.mlx file.



Files

flu_data.mat

SIR_lsqcurvefit_example.mlx

The SIR Model

The Susceptible-Infected-Recovered (SIR) model is described by the following equations (where a dot denotes differentiation with respect to time t):

$$ \begin{align*}\dot{S} &= -\frac{\beta SI}{N},\\ \dot{I} &= \frac{\beta SI}{N} - \gamma I,\\ \dot{R} &= \gamma I.\end{align*} $$

The total population is denoted $N(t) = S(t) + I(t) + R(t)$. The contact rate per day is denoted by $\beta$, and the recovery rate per day is denoted by $\gamma$. Note that $1/\gamma$ gives the average time it takes (in days) to recover.

Writing the model in MATLAB

First, we type out the model following MATLAB's ode45 example:

function dydt = SIR(t,y,params)
    % parameters to be fit are listed in the "params" vector.
    beta = params(1);
    gamma = params(2);

    S = y(1); I = y(2); R = y(3);
    N = S + I + R;           
    
    dydt = [-1*beta*S*I/N;              % S'(t)
            beta*S*I/N - gamma*I;       % I'(t)
            gamma*I];                   % R'(t)
end

We want to estimate values for betaand gamma, so we put them in the vector params. You can choose what order the estimated parameters will be listed in, but it must be consistent everywhere in the script.

The vector y is a list of all the state variables in the model in our desired order, while dydt represents the equations of the model in the desired order. For example, since we assign S=y(1), the first equation in dydt is $\dot{S}=-\beta SI/N$. Typically, this follows the order of the equations of the model.

"Solve" the model