Abstract
Prithvi WxC, a 2.3 billion parameter foundation model using an encoder-decoder architecture with transformer concepts, addresses weather forecasting, downscaling, and extreme events estimation with a mixed masked reconstruction and forecasting objective.
Triggered by the realization that AI emulators can rival the performance of traditional numerical weather prediction models running on HPC systems, there is now an increasing number of large AI models that address use cases such as forecasting, downscaling, or nowcasting. While the parallel developments in the AI literature focus on foundation models (models that can be effectively tuned to address multiple, different use cases), the developments on the weather and climate side largely focus on single use cases, with particular emphasis on mid-range forecasting. We close this gap by introducing Prithvi WxC, a 2.3 billion parameter foundation model developed using 160 variables from the Modern-Era Retrospective Analysis for Research and Applications, Version 2 (MERRA-2). Prithvi WxC employs an encoder-decoder-based architecture, incorporating concepts from various recent transformer models to effectively capture both regional and global dependencies in the input data. The model has been designed to accommodate large token counts to model weather phenomena in different topologies at fine resolutions. Furthermore, it is trained with a mixed objective that combines the paradigms of masked reconstruction with forecasting. We test the model on a set of challenging downstream tasks, namely autoregressive rollout forecasting, downscaling, gravity wave flux parameterization, and extreme events estimation. The pretrained model with 2.3 billion parameters, along with the associated fine-tuning workflows, has been publicly released as an open-source contribution via Hugging Face.
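The mixed objective mentioned in the abstract can be illustrated with a toy sketch. It assumes, as a simplification, that the model sees a partially masked state at time `t` and is scored against the state at `t + lead`: a lead of zero gives pure masked reconstruction, and a positive lead gives forecasting from a masked input. The names `mask_tokens`, `mixed_objective_loss`, and `persistence_model` are hypothetical and are not taken from the released code.

```python
import random


def mask_tokens(tokens, mask_ratio, rng, fill=0.0):
    """Hide a random subset of tokens; return the masked copy and hidden indices."""
    n_hidden = int(round(mask_ratio * len(tokens)))
    hidden = set(rng.sample(range(len(tokens)), n_hidden))
    masked = [fill if i in hidden else value for i, value in enumerate(tokens)]
    return masked, hidden


def mse(prediction, target):
    """Mean squared error between two equally long lists."""
    return sum((p - t) ** 2 for p, t in zip(prediction, target)) / len(target)


def mixed_objective_loss(model, states, t, lead, mask_ratio, rng):
    """Score the model on states[t + lead] given a masked view of states[t].

    Assumption: lead == 0 is masked reconstruction, lead > 0 is forecasting.
    """
    masked_input, _ = mask_tokens(states[t], mask_ratio, rng)
    prediction = model(masked_input, lead)
    return mse(prediction, states[t + lead])


def persistence_model(tokens, lead):
    """Stand-in for a trained network: returns its input unchanged."""
    return list(tokens)


# Toy "atmosphere": two time steps with four tokens each.
states = [[1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.0]]
rng = random.Random(0)

reconstruction_loss = mixed_objective_loss(
    persistence_model, states, t=0, lead=0, mask_ratio=0.5, rng=rng
)
forecast_loss = mixed_objective_loss(
    persistence_model, states, t=0, lead=1, mask_ratio=0.0, rng=rng
)
```

In a real training loop the lead time and masking ratio would be sampled per batch, so one set of weights is exposed to both tasks.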
Community
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- PuYun: Medium-Range Global Weather Forecasting Using Large Kernel Attention Convolutional Networks (2024)
- Efficient Localized Adaptation of Neural Weather Forecasting: A Case Study in the MENA Region (2024)
- FuXi Weather: An end-to-end machine learning weather data assimilation and forecasting system (2024)
- MetMamba: Regional Weather Forecasting with Spatial-Temporal Mamba Model (2024)
- Regional data-driven weather modeling with a global stretched-grid (2024)
🌎 𝐓𝐡𝐞 𝐟𝐢𝐫𝐬𝐭 𝐞𝐯𝐞𝐫 𝐅𝐨𝐮𝐧𝐝𝐚𝐭𝐢𝐨𝐧 𝐰𝐞𝐚𝐭𝐡𝐞𝐫 𝐦𝐨𝐝𝐞𝐥: 𝐏𝐫𝐢𝐭𝐡𝐯𝐢 𝐖𝐱𝐂 𝐞𝐧𝐚𝐛𝐥𝐞𝐬 𝐥𝐢𝐟𝐞-𝐬𝐚𝐯𝐢𝐧𝐠 𝐰𝐞𝐚𝐭𝐡𝐞𝐫 𝐟𝐨𝐫𝐞𝐜𝐚𝐬𝐭𝐬
Hurricane Katrina killed more than a thousand people after making landfall near New Orleans in 2005. Many of these deaths might have been avoided if alerts had been given one day earlier. Accurate weather forecasts save lives.
🔥 Now, NASA and IBM just dropped a game-changing new model: the first ever foundation model for weather! This means it's the first time we have a generalist model that is not restricted to one task and is trained on 160 weather variables!
Prithvi WxC (Prithvi, “पृथ्वी”, is the Sanskrit name for Earth) is a 2.3 billion parameter model with an architecture close to previous vision transformers like Hiera.
But it comes with some important tweaks: under the hood, Prithvi WxC uses a clever transformer-based architecture with 25 encoder and 5 decoder blocks. It alternates between "local" and "global" attention to capture both regional and global weather patterns.
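To make the "local" versus "global" idea concrete, here is a minimal sketch of which tokens would attend to each other in each kind of block. It assumes tokens are grouped into fixed-size windows, that local attention mixes tokens inside one window, and that global attention mixes tokens sharing the same position across windows. The function names and the choice to start with a local block are hypothetical, not taken from the released code.

```python
def attention_groups(num_windows, tokens_per_window, mode):
    """Return lists of token ids that attend to each other in one block.

    Token id = window * tokens_per_window + position (an assumed layout).
    """
    if mode == "local":
        # Each group is one window: regional dependencies.
        return [
            [w * tokens_per_window + p for p in range(tokens_per_window)]
            for w in range(num_windows)
        ]
    if mode == "global":
        # Each group takes the same position from every window: long-range dependencies.
        return [
            [w * tokens_per_window + p for w in range(num_windows)]
            for p in range(tokens_per_window)
        ]
    raise ValueError(f"unknown mode: {mode!r}")


def block_schedule(num_blocks):
    """Alternate local and global blocks, starting with local (an assumption)."""
    return ["local" if i % 2 == 0 else "global" for i in range(num_blocks)]


local_groups = attention_groups(num_windows=2, tokens_per_window=3, mode="local")
global_groups = attention_groups(num_windows=2, tokens_per_window=3, mode="global")
encoder_schedule = block_schedule(25)  # 25 encoder blocks, as stated above
```

With two windows of three tokens, the local groups are `[[0, 1, 2], [3, 4, 5]]` and the global groups are `[[0, 3], [1, 4], [2, 5]]`, so stacking both kinds of block lets information travel between any pair of tokens.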
And boy, does it deliver.
𝗞𝗲𝘆 𝗶𝗻𝘀𝗶𝗴𝗵𝘁𝘀:
🔮 𝗡𝗮𝗶𝗹𝘀 𝘀𝗵𝗼𝗿𝘁-𝘁𝗲𝗿𝗺 𝗳𝗼𝗿𝗲𝗰𝗮𝘀𝘁𝘀 - Prithvi WxC crushed it on 6-12 hour predictions, even outperforming some traditional numerical weather models
🌀 𝗧𝗿𝗮𝗰𝗸𝘀 𝗵𝘂𝗿𝗿𝗶𝗰𝗮𝗻𝗲𝘀 𝗹𝗶𝗸𝗲 𝗮 𝗰𝗵𝗮𝗺𝗽 - For Hurricane Ida, it predicted the landfall location within 5 km (vs 20+ km errors from other AI models), which is huge progress!
🔍 𝟲𝘅 𝗱𝗼𝘄𝗻𝘀𝗰𝗮𝗹𝗶𝗻𝗴 𝗽𝗼𝘄𝗲𝗿 - Can zoom in on weather data to 6x higher resolution with 4x lower error than basic methods
🌊 𝗠𝗼𝗱𝗲𝗹𝘀 𝗲𝗹𝘂𝘀𝗶𝘃𝗲 𝗴𝗿𝗮𝘃𝗶𝘁𝘆 𝘄𝗮𝘃𝗲𝘀 - Accurately simulates these crucial but hard-to-capture atmospheric oscillations
😎 The coolest part? 𝗣𝗿𝗶𝘁𝗵𝘃𝗶 𝗪𝘅𝗖 𝗶𝘀𝗻'𝘁 𝗮 𝗼𝗻𝗲-𝘁𝗿𝗶𝗰𝗸 𝗽𝗼𝗻𝘆. Its flexible design lets researchers fine-tune it for all kinds of specialized tasks. They've already adapted it for things like detailed regional climate projections, modeling tiny atmospheric gravity waves, and hurricane tracking.
This opens up tons of possibilities for improving climate models, severe weather prediction, and more. As climate change intensifies, tools like Prithvi WxC will become more and more crucial to avoid disasters!