Abstract

Today we’ll talk about how we can use compressive sensing and optimization to learn the dynamics of an underlying physical system. This blog is based on the paper from Prof. Kutz’s lab — Discovering governing equations from data by sparse identification of nonlinear dynamical systems (Brunton, Proctor, & Kutz, 2016). The paper develops a dictionary-learning framework with an overcomplete dictionary, which yields interpretable nonlinear models. We’ll discuss the details below.


History: Modeling Dynamics from Data

Science began with astronomy, which started by observing the movements of stars, planets, and other heavenly bodies. Copernicus presented the heliocentric theory, which placed the Sun, rather than the Earth, at the center of the universe. Later, Kepler gathered a huge amount of planetary-motion data, which enabled him to formulate his three laws of planetary motion.

Finally, Newton developed a unified theory of motion across the universe. While Kepler’s contribution was primarily in generating detailed data and the empirical laws that summarize it, Newton’s contribution was a unified theory that explains the data.

Background: Symbolic Regression and Compressive Sensing

Symbolic regression, a generalization of regression, aims to learn the underlying interactions in data. It searches the space of all possible mathematical formulas for the one that best predicts the output variable, starting from a set of base functions such as addition, trigonometric functions, and exponentials. However, it is computationally expensive, does not clearly scale to the large dynamical systems of interest, and may be prone to overfitting.

This paper utilizes ideas from sparse regression, which has been used to find solutions of underdetermined linear systems. Sparsity lets us beat the Nyquist–Shannon sampling limit. In compressive sensing, a signal x is K-sparse in some basis \Psi,
x = \Psi s
where the coefficient vector s has only K nonzero entries. An l1-norm penalty is typically imposed as a regularizer to promote sparsity. Extensive research in this area has been done under the name compressive sensing; Terence Tao, a professor at UCLA, is among its active contributors. These methods bypass the combinatorially intractable brute-force search for the sparse solution: with high probability, it is found by convex methods that scale to large problems.
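As a minimal sketch of this idea (my own illustration, not from the paper), we can recover a synthetic K-sparse vector from underdetermined measurements using scikit-learn's Lasso; the matrix sizes, sparsity pattern, and `alpha` here are arbitrary assumptions:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, m = 200, 50                       # signal length, number of measurements (m << n)
Psi = rng.standard_normal((m, n))    # random sensing matrix

s_true = np.zeros(n)
s_true[[5, 17, 40]] = [1.5, -2.0, 1.0]   # K = 3 sparse coefficient vector
x = Psi @ s_true                         # noiseless underdetermined measurements

# l1-regularized least squares (Lasso) in place of the intractable l0 search
lasso = Lasso(alpha=0.01, max_iter=50000)
lasso.fit(Psi, x)
support = np.flatnonzero(np.abs(lasso.coef_) > 0.1)
print(support)   # indices of the recovered nonzero entries
```

With only 50 measurements of a length-200 signal, the l1 penalty still pinpoints the three active coefficients — the convex relaxation at the heart of compressive sensing.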

Problem Setup: Dynamical Systems

\frac{d}{dt}\mathbf{x}(t) = f(\mathbf{x}(t))
Here \mathbf{x}(t) denotes the state at time t and f(\mathbf{x}(t)) the dynamics. In most physical systems, only a few nonlinear terms appear in the dynamics, so f is sparse in a high-dimensional space of nonlinear functions. If we apply a suitable set of nonlinear transformations to the data, we can therefore cast identification of the dynamical system as a sparse regression problem.

Dynamical Systems as Sparse Regression

Let \mathbf{X} and \dot{\mathbf{X}} denote the state and derivative matrices; the k-th row of \mathbf{X} contains the observation vector at the k-th time instant. Now apply many transformations to \mathbf{X}, such as linear, polynomial, cosine, and exponential functions — guided by applications where the plausible nonlinearities are known — and append the resulting matrices column-wise into a big library matrix \Theta(\mathbf{X}). Then we solve
\dot{\mathbf{X}} = \Theta(\mathbf{X}) W
where the weight matrix W has sparse coefficients. Each state variable (each column of \dot{\mathbf{X}}) yields a separate regression, so the problems decouple and complexity grows only with the amount of data, not combinatorially with the number of nonlinear transformations. Basis functions matter a great deal here, so we should test several different function bases and use the sparsity and accuracy of the resulting model as a diagnostic for whether the basis can represent the dynamics.
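Building \Theta can be sketched in a few lines; this toy builder is my own illustration (not the paper's code) and includes only constant, linear, and quadratic terms:

```python
import numpy as np

def build_library(X):
    """Candidate dictionary Theta(X): constant, linear, and quadratic monomials.
    A real application would add whatever nonlinearities are plausible
    (trigonometric terms, exponentials, higher-order polynomials, ...)."""
    n = X.shape[1]
    cols = [np.ones((X.shape[0], 1)), X]
    for i in range(n):                 # all quadratic monomials x_i * x_j, i <= j
        for j in range(i, n):
            cols.append((X[:, i] * X[:, j])[:, None])
    return np.hstack(cols)

# toy data: a 2-state system observed at 100 time instants
X = np.random.default_rng(1).standard_normal((100, 2))
Theta = build_library(X)
print(Theta.shape)   # 100 rows, 1 + 2 + 3 = 6 candidate functions
```

Each column of \Theta is one candidate term; sparse regression then decides which few columns actually participate in the dynamics.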

Approximating Derivatives

In reality we only observe discrete samples of \mathbf{X}, and \dot{\mathbf{X}} is not measured directly, so it must be approximated numerically. Total-variation regularization is used to denoise the computed derivative. To account for the noise introduced by this approximation, we should instead solve
\dot{\mathbf{X}} = \Theta(\mathbf{X}) W + \sigma Z
where the entries of Z are i.i.d. and normally distributed. Traditionally, LASSO is used for sparse regression, but it is computationally expensive at this scale. The paper instead uses a sequentially thresholded least-squares algorithm. Depending on the noise level, it may also be necessary to filter \mathbf{X} and \dot{\mathbf{X}}.
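The sequentially thresholded least-squares loop is easy to sketch — repeatedly solve least squares and zero out small coefficients. This is a minimal version of my own (the threshold, sizes, and synthetic data below are arbitrary assumptions):

```python
import numpy as np

def stls(Theta, X_dot, threshold=0.1, n_iters=10):
    """Sequentially thresholded least squares: alternate between a
    least-squares solve and hard-thresholding of small coefficients."""
    W, *_ = np.linalg.lstsq(Theta, X_dot, rcond=None)
    for _ in range(n_iters):
        small = np.abs(W) < threshold
        W[small] = 0.0
        for k in range(X_dot.shape[1]):     # refit each state variable separately
            big = ~small[:, k]
            if big.any():
                W[big, k], *_ = np.linalg.lstsq(Theta[:, big], X_dot[:, k],
                                                rcond=None)
    return W

# synthetic check: 5 candidate functions, only 2 truly active, mild noise
rng = np.random.default_rng(0)
Theta = rng.standard_normal((200, 5))
W_true = np.array([[1.0], [0.0], [0.0], [-2.0], [0.0]])
X_dot = Theta @ W_true + 0.01 * rng.standard_normal((200, 1))
W = stls(Theta, X_dot, threshold=0.1)
print(np.flatnonzero(W[:, 0]))   # recovered active terms
```

Each thresholding pass discards terms with small coefficients, and the subsequent refit redistributes their weight over the surviving terms — far cheaper than LASSO's iterative shrinkage for problems of this size.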

Dynamical Systems with PDE

Most physical systems are governed by PDEs rather than ODEs. This poses a serious problem for the method, since numerical discretization on a spatial grid produces an enormous state: in fluid dynamics, for example, even simple 2D and 3D flows may require tens of thousands up to billions of variables to represent the discretized system. The formulation above is ill-suited to this regime, since each state variable requires its own regression.

However, the good news is that many high-dimensional systems of interest evolve on a low-dimensional manifold or attractor that is well approximated by a low-rank basis. We can therefore first use the SVD to decompose \mathbf{X}. Such decompositions have attracted decades of research in fluid dynamics: proper orthogonal decomposition (POD) is a standard method for extracting low-rank structure from flows governed by the Navier–Stokes equations.
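The reduction step can be sketched in plain NumPy; the toy rank-3 snapshot matrix below is my own construction (a real flow field would supply the snapshots):

```python
import numpy as np

# toy snapshot matrix: 1000 spatial grid points, 50 snapshots in time,
# built from 3 underlying modes so the data are exactly rank 3
rng = np.random.default_rng(2)
modes = rng.standard_normal((1000, 3))
amplitudes = rng.standard_normal((3, 50))
X = modes @ amplitudes

# POD = SVD of the snapshot matrix; keep the r leading left singular vectors
U, S, Vt = np.linalg.svd(X, full_matrices=False)
r = 3
X_reduced = U[:, :r].T @ X   # r x 50: low-dimensional coordinates
print(X_reduced.shape)       # sparse regression now runs on r variables,
                             # not on the full 1000-dimensional grid state
```

Sparse regression is then performed on the r reduced coordinates instead of the full grid, which is what makes the PDE setting tractable.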

Results: Lorenz System

The paper shows that the proposed method can identify the Lorenz system. It captures the rich, chaotic dynamics that evolve on the attractor, and only a few terms are required on the right-hand side of each equation.
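As a sanity check, here is my own re-implementation sketch of that experiment. For simplicity it uses exact derivatives rather than the paper's total-variation-regularized numerical derivatives, and a threshold chosen by hand:

```python
import numpy as np
from scipy.integrate import solve_ivp

def lorenz(t, s, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    x, y, z = s
    return [sigma * (y - x), x * (rho - z) - y, x * y - beta * z]

# simulate a trajectory on the attractor
t = np.linspace(0, 20, 4000)
sol = solve_ivp(lorenz, (0, 20), [-8.0, 7.0, 27.0], t_eval=t,
                rtol=1e-9, atol=1e-9)
X = sol.y.T                                      # 4000 x 3 state matrix
X_dot = np.array([lorenz(0, s) for s in X])      # exact derivatives for simplicity

# candidate library: constant, linear, and quadratic monomials (10 columns)
cols = [np.ones((len(X), 1)), X]
for i in range(3):
    for j in range(i, 3):
        cols.append((X[:, i] * X[:, j])[:, None])
Theta = np.hstack(cols)

# sequentially thresholded least squares
W, *_ = np.linalg.lstsq(Theta, X_dot, rcond=None)
for _ in range(10):
    small = np.abs(W) < 0.5
    W[small] = 0.0
    for k in range(3):
        big = ~small[:, k]
        W[big, k], *_ = np.linalg.lstsq(Theta[:, big], X_dot[:, k], rcond=None)

print(np.count_nonzero(W))   # 7 active terms: the Lorenz right-hand sides
```

The regression recovers exactly the seven Lorenz terms (\sigma(y-x); \rho x - y - xz; xy - \beta z) with their correct coefficients, out of thirty candidate coefficients.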

Results: Fluid Dynamics

The paper also simulates fluid-flow data: the flow past a cylinder, computed with direct numerical simulation of the 2D Navier–Stokes equations. The data are first projected onto a low-dimensional basis, and sparse regression is performed in the reduced coordinates. The method is able to identify the dominant dynamics.

Extensions

There can be scenarios where the dynamics also depend on other parameters. In these cases, we solve
\dot{\mathbf{x}} = f(\mathbf{x}, \mu)
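The paper handles this by appending the parameter to the state with trivial dynamics, so the same sparse-regression machinery applies to the augmented system:

\frac{d}{dt}\begin{bmatrix}\mathbf{x} \\ \mu\end{bmatrix} = \begin{bmatrix} f(\mathbf{x}, \mu) \\ 0 \end{bmatrix}

The library \Theta is then built from both \mathbf{x} and \mu, allowing mixed terms such as \mu x to be discovered — which is how parameterized and bifurcating dynamics are captured.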

References

Brunton, S. L., Proctor, J. L., & Kutz, J. N. (2016). Discovering governing equations from data by sparse identification of nonlinear dynamical systems. Proceedings of the National Academy of Sciences, 113(15), 3932–3937.