Salesforce AI Research recently launched Moirai-MoE, described as the first mixture-of-experts time series foundation model. The model brings a sparse mixture of experts (MoE) to time series foundation models, achieving token-level specialisation in a data-driven manner. The research team announced the release on X.
"Introducing Moirai-MoE — the first mixture-of-experts time series foundation model, a breakthrough in universal forecasting! Moirai-MoE achieves token-level model specialization autonomously, delivering an impressive 17% performance boost over its predecessor Moirai at the…"
— Salesforce AI Research (@SFResearch), November 8, 2024
This model is a significant upgrade from the previous Moirai model. While Moirai used multiple input/output layers to handle time series data with different frequencies, the updated model simplifies this by using just one input/output layer. It relies on sparse mixture-of-experts transformers to effectively capture a variety of time series patterns.
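To make the architectural change concrete, here is a minimal PyTorch-style sketch contrasting frequency-specific input projections with a single shared projection. The module names, patch length, and model width are illustrative assumptions, not Salesforce's implementation.

```python
import torch
import torch.nn as nn

PATCH_LEN, D_MODEL = 32, 384  # assumed patch length and model width

class MultiPatchInput(nn.Module):
    """One projection per sampling frequency (Moirai-style, illustrative)."""
    def __init__(self, freqs=("hourly", "daily", "weekly")):
        super().__init__()
        self.proj = nn.ModuleDict({f: nn.Linear(PATCH_LEN, D_MODEL) for f in freqs})

    def forward(self, patches, freq):
        # patches: (batch, num_patches, PATCH_LEN)
        return self.proj[freq](patches)

class SharedInput(nn.Module):
    """A single projection for all frequencies (Moirai-MoE-style, illustrative);
    specialisation is deferred to the sparse experts inside the transformer."""
    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(PATCH_LEN, D_MODEL)

    def forward(self, patches):
        return self.proj(patches)
```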
Testing Metrics and Results
Salesforce researchers Juncheng Liu and Xu Liu revealed on the company’s official blog that they tested Moirai-MoE on 29 datasets from the Monash benchmark and found it performed better than all its competitors.
The researchers described the model in detail in a research paper released earlier this year.
Moirai-MoE-Small stood out by being 17% more effective than its dense version, Moirai-Small. It even surpassed larger models like Moirai-Base and Moirai-Large by 8% and 7%, respectively.
For zero-shot forecasting, they evaluated Moirai-MoE on 10 different datasets and compared its results using continuous ranked probability score (CRPS) and mean absolute scaled error (MASE) metrics, which measure accuracy. Compared to all versions of Moirai, Moirai-MoE-Small showed a 3%–14% improvement in CRPS and an 8%–16% improvement in MASE.
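For readers unfamiliar with these metrics, the snippet below sketches MASE and a sample-based CRPS estimate for a single series. It is a simplified illustration with made-up data; the exact evaluation protocol used in the paper may differ.

```python
import numpy as np

def mase(y_true, y_pred, y_train, m=1):
    """Mean absolute scaled error: forecast MAE divided by the in-sample
    MAE of a seasonal naive forecast with period m."""
    naive_mae = np.mean(np.abs(y_train[m:] - y_train[:-m]))
    return np.mean(np.abs(y_true - y_pred)) / naive_mae

def crps_samples(y_true, samples):
    """Sample-based CRPS estimate, averaged over the forecast horizon.
    samples: array of shape (num_samples, horizon) drawn from the forecast."""
    term1 = np.mean(np.abs(samples - y_true), axis=0)
    term2 = 0.5 * np.mean(
        np.abs(samples[:, None, :] - samples[None, :, :]), axis=(0, 1)
    )
    return np.mean(term1 - term2)

# Toy usage with synthetic data (purely illustrative numbers)
y_train = np.sin(np.arange(100) / 5.0)
y_true = np.sin(np.arange(100, 112) / 5.0)
samples = y_true + 0.1 * np.random.randn(200, 12)
print(mase(y_true, samples.mean(axis=0), y_train), crps_samples(y_true, samples))
```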
Moirai-MoE-Base delivered the best zero-shot performance, surpassing TimesFM and Chronos, which use some of the evaluation data in their pretraining.
What’s even more remarkable is that Moirai-MoE-Small has just 11 million active parameters, making it 28 times smaller than Moirai-Large while still delivering outstanding results.
How Can We Improve Time Series Predictions?
Time series forecasting is moving towards universal models that work across many tasks without extra training. This new approach uses a pre-trained model that handles different data types, domains, and prediction lengths, making the process simpler.
Forecasting time series data is tricky because the data comes in many different forms, which makes unified training challenging. Earlier models such as Moirai handle this with multiple input/output layers tailored to specific data frequencies, adding complexity.
Instead of relying on several predefined input/output layers, this new mixture-of-experts (MoE) transformer approach uses one general layer while letting specialised experts within the Transformer capture different time series patterns. This data-driven method works directly at the token level, making the model more adaptable.
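A minimal sketch of token-level sparse routing is given below. The gating scheme, number of experts, and top-k value are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoEFFN(nn.Module):
    """Token-level sparse mixture-of-experts feed-forward block (illustrative).
    A learned gate routes each token to its top-k experts."""
    def __init__(self, d_model=384, d_ff=1536, num_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(d_model, num_experts)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        ])

    def forward(self, x):
        # x: (batch, seq, d_model) -> flatten to a stream of tokens
        tokens = x.reshape(-1, x.shape[-1])
        scores = self.gate(tokens)                      # (tokens, num_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)  # top-k experts per token
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(tokens)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e
                if mask.any():
                    out[mask] += weights[mask, k].unsqueeze(-1) * expert(tokens[mask])
        return out.reshape(x.shape)

x = torch.randn(2, 16, 384)
y = SparseMoEFFN()(x)  # (2, 16, 384)
```

Because only the top-k experts run for any given token, most parameters stay inactive on each input, which is how a model like Moirai-MoE-Small can keep its active parameter count low.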
Additionally, Moirai-MoE employs a decoder-only training method. This strategy speeds up training by allowing different context lengths to be processed simultaneously, boosting efficiency.
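The sketch below illustrates the idea in a toy setting: with a causal mask, position t predicts the next step from only the first t tokens, so a single sequence provides training signal for every context length in one pass. The tiny stand-in model and MSE target are assumptions, not Moirai-MoE's actual architecture or objective.

```python
import torch
import torch.nn as nn

d_model, seq_len = 64, 32
layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
head = nn.Linear(d_model, 1)

tokens = torch.randn(8, seq_len, d_model)                        # embedded series patches
causal = nn.Transformer.generate_square_subsequent_mask(seq_len)  # block future positions

hidden = layer(tokens, src_mask=causal)   # causal self-attention over the sequence
pred = head(hidden[:, :-1])               # a prediction at every prefix length
target = tokens[:, 1:, :1]                # next-step values (toy target)
loss = nn.functional.mse_loss(pred, target)
```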
In short, Moirai-MoE offers an automated, streamlined approach to handling the diverse nature of time series data, simplifying training and enhancing performance.