Let u be a real-valued n-times differentiable function of time.

You are given evaluations of this function u(t0), …, u(tn) at distinct points in time and asked to approximate u(m)(t), the m-th derivative of the function evaluated at some given point t (m≤n).

If you have studied numerical methods, you are probably already familiar with how to tackle this problem with what is sometimes referred to as the “method of undetermined coefficients” (or, in equivalent language, by using a Lagrange interpolating polynomial). In this post, after reviewing the method, we implement it in a few lines of code.

Consider approximating the derivative u(m)(t) by a linear combination of the observations:

\[\boldsymbol{w}^\intercal \boldsymbol{u} \equiv (w_0, \ldots, w_n) (u(t_0), \ldots, u(t_n))^\intercal = w_0 u(t_0) + \cdots + w_n u(t_n).\]

Taylor expanding each term around t,

\[\boldsymbol{w}^\intercal \boldsymbol{u} = \sum_{k = 0}^n w_k \left( u(t) + u^\prime(t) \left(t_k - t\right) + \cdots + u^{(n)}(t) \frac{\left(t_k - t\right)^{n}}{n!} + O(\left(t_k - t\right)^{n+1}) \right).\]

Rearranging the resulting expression,

\[\boldsymbol{w}^\intercal \boldsymbol{u} = O(\max_k \left|t_k - t\right|^{n+1} \Vert \boldsymbol{w} \Vert_\infty) + \sum_{k = 0}^{n} u^{(k)}(t) \left( w_0 \frac{\left(t_0 - t\right)^k}{k!} + \cdots + w_n \frac{\left(t_n - t\right)^k}{k!} \right).\]

The form above makes it clear that the original linear combination is nothing more than an error term (represented in Big O notation) along with a weighted sum of the derivatives of u evaluated at t. Since we are interested only in the m-th derivative, we would like to pick the weights w such that the coefficient of u(k)(t) is 1 whenever k=m and 0 otherwise. This suggests solving the linear system

The matrix on the left hand side is a Vandermonde matrix, and hence this system has a unique solution. Denoting by v the solution of this system, we have

\[u^{(m)}(t) = \boldsymbol{v}^\intercal \boldsymbol{u} + O(\max_k \left|t_k - t\right|^{n+1} \Vert \boldsymbol{v} \Vert_\infty).\]

Example application: backward differentiation formula (BDF)

As an application, consider the case in which we want to compute the first derivative of the function (m=1) and the observations are made at the points tk=t-kh where h is some positive constant. In this case, the linear system simplifies significantly:

For each value of n, we can solve the above linear system to obtain the coefficients:

  n=1 n=2 n=3 n=4 n=5
hw0 1 3/2 11/6 25/12 137/60
hw1 -1 -2 -3 -4 -5
hw2   1/2 3/2 3 5
hw3     -1/3 -4/3 -10/3
hw4       1/4 5/4
hw5         -1/5

As an example of how to read the above table, the third column (n=3) tells us

\[h u^\prime(t) = 11/6 \cdot u(t) - 3 \cdot u(t - h) + 3/2 \cdot u(t - 2h) - 1/3 \cdot u(t - 3h) + O(h^4).\]

The table was generated by the following piece of code:

# bdf.py

import fractions
import numpy as np

def bdf(n):
    """Creates the coefficient vector for a BDF formula of order `n`.

        n: A positive integer.

        A (one-dimensional) numpy array of coefficients `hw`.
        Denoting by `h` a positive step size, the derivative is approximated by
        `(hw[0] * u(t) + hw[1] * u(t-h) + ... + hw[n] * u(t-nh)) / h`
        where u is some real-valued, real-input callable.
    A = np.vander(range(n+1), increasing=True).transpose()
    b = [(1 - 2 * (k % 2)) * int(k == 1) for k in range(n+1)]
    return np.linalg.solve(A, b)

if __name__ == "__main__":
        'all': lambda x: str(fractions.Fraction(x).limit_denominator())
    for n in range(1, 6): print('BDF {}: {}'.format(n, bdf(n)))

As a crude sanity check, we can verify that the n-th BDF formula applied to ex becomes a better approximation of ex as n increases (recall that the exponential function is its own derivative):

The code to generate the plot is given below:

import matplotlib.pyplot as plt
import numpy as np

from bdf import bdf

a = 0.
b = 10.
N = 10
h = (b - a) / N
x = np.linspace(a, b, N+1)
y = np.exp(x)

plt.semilogy(x, y, label="Exponential function")
for n in range(1, 4):
    approx_exp = (np.convolve(bdf(n), y) / h)[n:-n]
        x[n:], approx_exp,
        '-x', label="BDF {} approximation".format(n)