
Adapting language model architectures for time series forecasting


Time series forecasting is essential for decision making across industries such as retail, energy, finance, and health care. However, developing accurate machine-learning-based forecasting models has traditionally required substantial dataset-specific tuning and model customization.


In a paper we have just posted to arXiv, we present Chronos, a family of pretrained time series models based on language model architectures. Like large language models or vision-language models, Chronos is a foundation model, which learns from large datasets how to produce general representations useful for a wide range of tasks.

The key insight behind Chronos is treating time series data as a language to be modeled by off-the-shelf transformer architectures. To tokenize real-valued time series observations into a fixed vocabulary, we scale the time series by its mean absolute value and then quantize the scaled series into a fixed number of uniformly spaced bins.
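To make this concrete, here is a minimal sketch of the scaling-and-binning step in NumPy; the bin count and bin range are illustrative placeholders, not the exact settings from the paper:

```python
import numpy as np

def tokenize(context, n_bins=100, lo=-15.0, hi=15.0):
    """Scale a 1-D series by its mean absolute value, then map each
    observation to one of n_bins uniformly spaced bins."""
    scale = np.mean(np.abs(context)) or 1.0  # guard against an all-zero context
    scaled = np.asarray(context) / scale
    edges = np.linspace(lo, hi, n_bins + 1)
    tokens = np.clip(np.digitize(scaled, edges) - 1, 0, n_bins - 1)
    return tokens, scale  # in a full vocabulary, IDs sit after the special tokens

def detokenize(tokens, scale, n_bins=100, lo=-15.0, hi=15.0):
    """Invert the mapping via bin centers, undoing the scaling (used at
    inference to turn sampled tokens back into numerical forecasts)."""
    edges = np.linspace(lo, hi, n_bins + 1)
    centers = (edges[:-1] + edges[1:]) / 2
    return centers[tokens] * scale
```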

In addition to these bin tokens, we add two special tokens, PAD and EOS, to denote padding/missing values and end-of-sequence, respectively. We can then train standard language models like T5 on such a “language of time series” using the conventional cross-entropy loss function, with no changes to the model architecture itself.
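Because the vocabulary consists of just these bin tokens plus PAD and EOS, training can reuse entirely standard machinery. Below is a hedged sketch using the Hugging Face transformers library; the tiny model dimensions and the convention of reserving IDs 0 and 1 for PAD and EOS are our illustrative assumptions, not the paper's exact configuration:

```python
import torch
from transformers import T5Config, T5ForConditionalGeneration

n_bins = 100  # illustrative; bin tokens occupy IDs 2 .. n_bins + 1 here
config = T5Config(
    vocab_size=n_bins + 2,       # bin tokens + PAD + EOS
    pad_token_id=0,
    eos_token_id=1,
    decoder_start_token_id=0,
    d_model=256, d_ff=512, num_layers=4, num_heads=4,  # deliberately tiny
)
model = T5ForConditionalGeneration(config)

# Tokenized history (inputs) and tokenized future values (labels).
context_ids = torch.randint(2, n_bins + 2, (8, 64))  # batch of 8, context 64
target_ids = torch.randint(2, n_bins + 2, (8, 16))   # 16-step horizon

# Passing labels makes the model compute the usual cross-entropy loss.
loss = model(input_ids=context_ids, labels=target_ids).loss
loss.backward()
```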

High-level depiction of Chronos. Left: Input time series is scaled and quantized to obtain a sequence of tokens. Center: The tokens are fed into a language model, which is trained using the cross-entropy loss. Right: During inference, tokens are sampled autoregressively from the model and mapped back to numerical values.

Despite its simplicity, Chronos is remarkably accurate. In a comprehensive evaluation involving 42 datasets, Chronos significantly outperformed classical statistical methods, as well as specialized deep-learning models, on data held out from its training sets. More important, on entirely new datasets, Chronos’s zero-shot performance was comparable to, and occasionally superior to, that of models trained directly on those datasets.

A core strength of Chronos is its ability to leverage diverse time series data from different domains to improve generalization. To enhance the model’s robustness, we augmented the public data sources used for pretraining with new series created by randomly mixing real samples (TSMix) and with synthetic series generated from Gaussian processes (KernelSynth).
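As a rough illustration of the two augmentations, the sketch below draws a synthetic series from a Gaussian process whose kernel is a randomly parameterized sum of an RBF and a periodic component (a simplified stand-in for KernelSynth) and forms a TSMix-style convex combination of existing series; the kernel choices and mixing scheme are our simplifications, not the paper's exact procedure:

```python
import numpy as np

rng = np.random.default_rng(0)

def kernel_synth(length=256):
    """Sample one series from a GP with a randomly parameterized kernel
    (simplified KernelSynth: RBF + periodic, random length scales/periods)."""
    t = np.arange(length, dtype=float)[:, None]
    d = t - t.T
    rbf = np.exp(-0.5 * (d / rng.uniform(5, 50)) ** 2)
    periodic = np.exp(-2 * np.sin(np.pi * d / rng.uniform(10, 100)) ** 2)
    cov = rbf + periodic + 1e-6 * np.eye(length)  # jitter for stability
    return rng.multivariate_normal(np.zeros(length), cov)

def ts_mix(series_bank, k=3):
    """TSMix-style augmentation: a random convex combination of k series."""
    idx = rng.choice(len(series_bank), size=k, replace=False)
    weights = rng.dirichlet(np.ones(k))  # nonnegative, sums to one
    return np.tensordot(weights, series_bank[idx], axes=1)

bank = np.stack([kernel_synth() for _ in range(10)])  # stand-in for real data
augmented = ts_mix(bank)
```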


The impressive zero-shot capabilities of Chronos position it as a viable “general-purpose” forecasting solution that simplifies deployment pipelines. Rather than training separate models for each bespoke application, practitioners can use an off-the-shelf Chronos model to make accurate forecasts immediately, reducing computation costs and making it easier to adopt advanced forecasting.
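For instance, zero-shot inference with the open-source release looks roughly like the snippet below; it assumes the chronos-forecasting Python package and the amazon/chronos-t5-small checkpoint on the Hugging Face Hub, and the exact interface may evolve:

```python
import torch
from chronos import ChronosPipeline

pipeline = ChronosPipeline.from_pretrained(
    "amazon/chronos-t5-small",   # one of the released checkpoints
    device_map="cpu",
    torch_dtype=torch.float32,
)

# Any 1-D history works; no fitting or tuning step is required.
context = torch.tensor([112., 118., 132., 129., 121., 135., 148., 148., 136., 119.])
forecast = pipeline.predict(context, prediction_length=12)

# forecast has shape [num_series, num_samples, prediction_length]; summarize
# the sampled trajectories into quantile forecasts.
low, median, high = torch.quantile(
    forecast[0].float(), torch.tensor([0.1, 0.5, 0.9]), dim=0
)
```

Because the model samples forecast trajectories rather than emitting a single point estimate, probabilistic quantities such as prediction intervals fall out of the same call.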

Despite Chronos’s strong empirical results, our exploration only scratches the surface of what we can achieve by aligning language modeling with time series forecasting. As the paper discusses, future research can explore more-sophisticated time series tokenization schemes, architectures tailored to serial data, and explicit incorporation of auxiliary features or domain knowledge.

The use of pretrained models for time series forecasting is an exciting frontier. By reformulating the forecasting task as a kind of language modeling, Chronos demonstrates a simpler path to general and accurate prediction. Moreover, Chronos will be able to seamlessly integrate future advances in the design of LLMs. We invite researchers and practitioners to engage with Chronos, now available open-source, and join us in developing the next generation of time series models.


