Liquidity Forecasting in Mosaic: Part II

Jesper Kristensen and 0xbrainjar

Composable Finance has developed two new baseline forecasting models to help ensure sufficient liquidity is maintained in Mosaic, our cross-layer transferral system. These models, along with our previous work, will help DeFi users have a seamless experience utilizing assets across scaling solutions.

To orient our new research: in Part I of our exploration of liquidity forecasting within Mosaic, we covered how Composable Labs’ Liquidity Simulation Environment (LSE) could be used to develop a forecasting methodology based on an autoregressive integrated moving average (ARIMA) model. The ARIMA model was our first non-AI-based model, also referred to as a baseline model.

In this post, we expand upon our previous forecasting research by adding two more baseline models from the same family. We also compare the models introduced so far and quantify their predictive power. Through this work, we found that these models can serve as early warning systems for liquidity replenishment events, with a precision of within 10% of the seed liquidity.

How our Models Work with Mosaic

To understand the value of this work, consider that our forecasting models can be positioned to run alongside vaults in Mosaic. As Mosaic is used, tokens move from vault A to vault B and vice versa. If more tokens move from A to B than from B to A, the overall liquidity (amount of tokens) in vault A drops. If this drop becomes too severe, we might have to stop token transfers altogether, which is undesirable for users.

To prevent this from happening, we can use an early detection system, based on our forecasting model, that automatically decides whether a vault needs more liquidity. The next step is then to use these forecasts to automate the movement of liquidity, which will be the topic of future posts.
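To make the idea concrete, here is a minimal sketch of such a check. It assumes a hypothetical safety floor expressed as a fraction of the seed liquidity; the function name, the 20% floor, and the numbers are illustrative assumptions, not Mosaic's actual decision logic.

```python
from typing import Optional, Sequence


def first_breach_step(forecast: Sequence[float],
                      seed_liquidity: float,
                      min_fraction: float = 0.2) -> Optional[int]:
    """Return the first forecast step whose liquidity falls below the floor.

    min_fraction is a hypothetical safety floor, given as a fraction of the
    seed liquidity. Returns None if the forecast never drops below it.
    """
    floor = min_fraction * seed_liquidity
    for step, value in enumerate(forecast):
        if value < floor:
            return step
    return None


# Illustrative numbers only: a vault seeded with 2,000,000 tokens.
forecast = [900_000, 650_000, 420_000, 380_000, 510_000]
breach = first_breach_step(forecast, seed_liquidity=2_000_000)
if breach is not None:
    print(f"Replenish before step {breach}")
```

With these numbers the floor is 400,000 tokens, so the first breach is at step 3.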

Extending our Data Generation and Introducing New Baseline Forecasting Models

In order to enrich our LSE with advanced data generation capabilities, we have implemented a Geometric Brownian Motion (GBM) simulator that generates cross-layer and cross-chain moves following this structure.
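For readers unfamiliar with GBM, the sketch below generates a single path using the standard exact discretization, S(t+dt) = S(t) * exp((mu - sigma^2/2) dt + sigma sqrt(dt) Z), with Z a standard normal draw. How the LSE maps such a path onto individual cross-layer moves is not covered here, so the function name, parameters, and values are assumptions for illustration only.

```python
import math
import random
from typing import List


def simulate_gbm(s0: float, mu: float, sigma: float,
                 n_steps: int, dt: float = 1.0,
                 seed: int = 0) -> List[float]:
    """Simulate one Geometric Brownian Motion path of n_steps + 1 points.

    s0 is the starting value, mu the drift and sigma the volatility per unit
    time, and dt the step size. The seed makes the path reproducible.
    """
    rng = random.Random(seed)
    drift = (mu - 0.5 * sigma ** 2) * dt
    vol = sigma * math.sqrt(dt)
    path = [s0]
    for _ in range(n_steps):
        z = rng.gauss(0.0, 1.0)  # standard normal shock
        path.append(path[-1] * math.exp(drift + vol * z))
    return path


# Hypothetical parameters: 1,000 hourly steps starting from 2,000,000.
path = simulate_gbm(s0=2_000_000, mu=0.0, sigma=0.01, n_steps=1000)
```

Because each step multiplies the previous value by a positive factor, the simulated series never goes negative.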

To bolster our forecasting, we introduced two new models. We started with a linear trend model and then proceeded to add a seasonal component.

Holt’s linear trend (HLT) model, also known as double exponential smoothing, identifies a linear trend in the time series and makes a prediction using the smoothed value and linear trend term at a certain time.
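As a rough illustration, the textbook form of Holt's method keeps a level l and a trend b, updates them as l_t = alpha * y_t + (1 - alpha) * (l_{t-1} + b_{t-1}) and b_t = beta * (l_t - l_{t-1}) + (1 - beta) * b_{t-1}, and forecasts h steps ahead as l_t + h * b_t. The sketch below implements this with fixed smoothing parameters and a simple initialization; both are assumptions, and in practice the parameters would be fitted to the data.

```python
from typing import List, Sequence


def holt_linear_forecast(series: Sequence[float], alpha: float,
                         beta: float, horizon: int) -> List[float]:
    """Forecast `horizon` steps ahead with Holt's linear trend method."""
    if len(series) < 2:
        raise ValueError("need at least two observations")
    # Simple initialization: first value as level, first difference as trend.
    level = series[0]
    trend = series[1] - series[0]
    for y in series[1:]:
        prev_level = level
        level = alpha * y + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    # The forecast extends the last level along the last trend.
    return [level + h * trend for h in range(1, horizon + 1)]
```

On a perfectly linear series such as 0, 2, 4, 6, 8 the method recovers the slope and continues the line.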

Holt-Winters’ seasonal method (HWS), known also as triple exponential smoothing, is a generalization of HLT so that it accounts for seasonal trends alongside the linear trend already included in the HLT model.
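The sketch below shows one common variant, the additive form, in which a seasonal term with period m is added to the level and trend. Whether an additive or multiplicative seasonal component suits a given dataset is a modeling choice; the choice of the additive form, the heuristic initialization from the first two seasons, and the fixed smoothing parameters are all assumptions made for illustration.

```python
from typing import List, Sequence


def holt_winters_additive(series: Sequence[float], season_length: int,
                          alpha: float, beta: float, gamma: float,
                          horizon: int) -> List[float]:
    """Forecast with additive Holt-Winters (triple exponential smoothing)."""
    m = season_length
    if len(series) < 2 * m:
        raise ValueError("need at least two full seasons to initialize")
    first, second = series[:m], series[m:2 * m]
    # Heuristic initialization from the first two seasons.
    level = sum(first) / m
    trend = (sum(second) - sum(first)) / (m * m)
    seasonals = [y - level for y in first]
    for t in range(m, len(series)):
        y = series[t]
        s_old = seasonals[t % m]
        prev_level, prev_trend = level, trend
        level = alpha * (y - s_old) + (1 - alpha) * (prev_level + prev_trend)
        trend = beta * (level - prev_level) + (1 - beta) * prev_trend
        seasonals[t % m] = (gamma * (y - prev_level - prev_trend)
                            + (1 - gamma) * s_old)
    n = len(series)
    # Each forecast reuses the latest seasonal estimate for its phase.
    return [level + h * trend + seasonals[(n + h - 1) % m]
            for h in range(1, horizon + 1)]
```

For hourly data with a daily pattern, one would call this with season_length=24. If the data has no real seasonality, the seasonal terms end up fitting noise.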

Comparing the Baseline Forecasting Models

We tested the performance of the two models introduced above and compared them with the ARIMA model explained in the previous post. In addition to the dataset used before, we use a second, similar dataset that consists of another 1,000 points at an hourly frequency. Both datasets are shown below:

In this figure, dataset 1 (left) and dataset 2 (right) are from the LSE; however, dataset 2 was generated using GBM.

The experimental settings remained the same; the models were trained each time using 200 data points corresponding to a time period of roughly eight days. Their forecasting capabilities are compared on a time period of 1 week (168 time steps).

Starting with the first 200 data points, we shifted the time window by 10 time steps at a time and repeated the training and forecasting procedure, for a total of 80 times. The figure below shows instances where the HLT model gives the best forecast.
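The sliding-window procedure can be sketched as a generator of (training, evaluation) splits. The defaults mirror the numbers quoted in the text (200 training points, a 168-step horizon, a shift of 10, up to 80 windows). How windows near the end of the data are handled is not specified here, so this sketch simply stops once a full evaluation window no longer fits, which is an assumption.

```python
from typing import Iterator, List, Sequence, Tuple


def rolling_windows(series: Sequence[float], train_size: int = 200,
                    horizon: int = 168, step: int = 10,
                    n_windows: int = 80
                    ) -> Iterator[Tuple[List[float], List[float]]]:
    """Yield (train, test) splits from a window that slides by `step`."""
    for k in range(n_windows):
        start = k * step
        train_end = start + train_size
        test_end = train_end + horizon
        if test_end > len(series):
            break  # assumption: skip windows without a full horizon
        yield list(series[start:train_end]), list(series[train_end:test_end])
```

Each split is then used to fit a model on the training part and score its forecast against the evaluation part.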

Overall, the HWS model seems to take longer to identify the correct trend. Instead, it seems to attribute fluctuations in the data to a seasonal effect, which is unnecessary and misleading for this particular dataset. As a result, its prediction underestimates the observed values.

On the other hand, the ARIMA model seems to overestimate the upward move during the training time period and predicts values much larger than what eventually was observed. This is an extreme case where it can be seen that, sometimes, the more “sophisticated” models that account for nonlinear trends lead to inaccurate predictions.

For the overall comparison of the three models, we computed the root mean squared error (RMSE) between the predictions and the observed data. The figure below shows the RMSE plots for the three models at all 80 training data sets obtained by shifting the time window 10 steps at a time for both datasets.
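For reference, the RMSE is the square root of the mean squared difference between predicted and observed values. A minimal sketch (the function name is ours, chosen for illustration):

```python
import math
from typing import Sequence


def rmse(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """Root mean squared error between two equally long sequences."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))
```

Because the errors are squared before averaging, a few large misses weigh more heavily than many small ones.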

The top image depicts forecasted liquidity values between the three models for dataset 1. The bottom image depicts a comparison between the three models for dataset 2.

The values are scaled to represent a percent of the initial seed money, $2 million in this case. Overall, it appears that both the ARIMA and HLT models outperform the HWS model as they seem to almost always achieve lower prediction error values.
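The scaling itself is simple arithmetic: divide the error by the seed liquidity and multiply by 100. In the sketch below, the helper name and the example error of 200,000 are hypothetical; only the $2 million seed comes from the text.

```python
def as_percent_of_seed(error: float, seed_liquidity: float) -> float:
    """Express an absolute error as a percentage of the seed liquidity."""
    return 100.0 * error / seed_liquidity


SEED = 2_000_000  # seed liquidity quoted in the text
print(as_percent_of_seed(200_000, SEED))  # prints 10.0
```

On this scale, an error of $200,000 corresponds to 10% of the seed.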

At certain time instances, the ARIMA model provides a prediction that is far from the observed curve. This is typically a consequence of small, short-term fluctuations in the data that result in identifying the wrong trend and in extremely wide confidence intervals.

In the remaining cases, both ARIMA and HLT seem to attain very similar RMSE values. However, in certain cases, such as the Polygon and Arbitrum liquidity forecasting models shown previously, the HLT model is more accurate.

This shortfall of the ARIMA model can be explained by its tendency to provide more conservative values. This tendency stems from its ability to account for nonlinear trends, as opposed to HLT, which can only estimate linear trends in the data.

We note here that predicting each time point correctly in a 1-week look-ahead is technically not the objective. What really matters is the accuracy of the last time point, 1 week out. We will return with more comparisons in this direction, but for now, we have considered how well each hour of the entire week is predicted.
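The two ways of scoring can rank forecasts differently, as the toy example below illustrates. The series are made up and the helper names are ours; nothing here reflects actual results.

```python
import math
from typing import Sequence


def rmse(pred: Sequence[float], obs: Sequence[float]) -> float:
    """Error averaged over every step of the horizon."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


def final_point_error(pred: Sequence[float], obs: Sequence[float]) -> float:
    """Absolute error at the last step of the horizon only."""
    return abs(pred[-1] - obs[-1])


observed = [10.0, 10.0, 10.0, 10.0]
wandering = [10.0, 14.0, 6.0, 10.0]  # off mid-horizon, exact at the end
drifting = [10.0, 10.5, 11.0, 11.5]  # close early, off at the end

for name, forecast in [("wandering", wandering), ("drifting", drifting)]:
    print(name, rmse(forecast, observed), final_point_error(forecast, observed))
```

Here the wandering forecast has the larger RMSE but a final-point error of zero, while the drifting forecast has the smaller RMSE but misses the final point by 1.5.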

Next Steps in our Forecasting Work

Mosaic’s Proof of Concept (PoC) has been running for several weeks. We are currently retrieving its data with tools built in-house and developing new ways to analyze it. We plan to share the results of the PoC in an upcoming post as well.

After this, we will develop and share more advanced models, exploring whether and how machine learning (ML) and the more general field of artificial intelligence (AI) can further improve our forecasting framework. The eventual vision is to build this capability into the Mosaic product.

Finally, we are also imagining how these forecasting models and capabilities can come to play a larger role within Composable (for example: in the context of fee forecasting). As we progress, we will identify further use-cases and share them with our community.

If you are a developer with a project you think fits our ecosystem and goals, and you would like to participate in our interactive testing landscape at Composable Labs, reach out on Telegram at @brainjar.