Data-driven modeling and computational predictions based on the maximum entropy
principle (MaxEnt principle) aim to find models that are as simple as possible,
but not simpler than necessary, and thereby avoid overfitting the data. We
derive a multivariate nonparametric and nonstationary formulation of the
MaxEnt principle and show that its solution can be approximated by numerically
solving a sparse constrained maximization problem with regularization. Applying
the resulting algorithm to popular financial benchmarks yields memoryless models
that allow for simple and qualitative
descriptions of the data from major stock market indices. We compare the obtained
MaxEnt models to heteroscedastic models from computational econometrics
(GARCH, GARCH-GJR, MS-GARCH, and GARCH-PML4) in terms of model fit,
complexity, and prediction quality, using the model log-likelihoods, the values
of the Bayesian information criterion, posterior model probabilities, the quality
of the fits to the data autocorrelation function, and the value-at-risk
prediction quality. We show that all seven of the considered
major financial benchmark time series (DJI, SPX, FTSE, STOXX, SMI,
HSI, and N225) are better described by conditionally memoryless MaxEnt
models with nonstationary regime-switching than by the common econometric
models with finite memory. This analysis also reveals a sparse network of
statistically significant temporal relations for the positive and negative latent
variance changes among different markets. The code is openly available.
Keywords
machine learning, financial time series, maximum entropy,
heteroscedasticity, sparsity
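
The model comparison summarized in the abstract uses the Bayesian information
criterion (BIC) and posterior model probabilities. As a minimal sketch, and not
the authors' implementation, the Python below computes the standard BIC,
k ln(n) - 2 ln(L), and turns a set of BIC values into approximate posterior
model probabilities under the assumption of equal prior probabilities for all
candidate models. The function names are hypothetical and chosen for
illustration only.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Standard Bayesian information criterion: k * ln(n) - 2 * ln(L)."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


def posterior_model_probabilities(bic_values):
    """Approximate posterior probabilities from BIC values.

    Assumption: all models have equal prior probability, so each weight is
    proportional to exp(-BIC / 2). The smallest BIC is subtracted first to
    keep the exponentials numerically stable.
    """
    best = min(bic_values)
    weights = [math.exp(-0.5 * (b - best)) for b in bic_values]
    total = sum(weights)
    return [w / total for w in weights]
```

Given the log-likelihood, parameter count, and sample size of each fitted model,
one would call `bic` once per model and pass the resulting list to
`posterior_model_probabilities`; the model with the lowest BIC receives the
highest probability.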