Prithvi WxC: Foundation Model for Weather and Climate
September 20, 2024
Authors: Johannes Schmude, Sujit Roy, Will Trojak, Johannes Jakubik, Daniel Salles Civitarese, Shraddha Singh, Julian Kuehnert, Kumar Ankur, Aman Gupta, Christopher E Phillips, Romeo Kienzler, Daniela Szwarcman, Vishal Gaur, Rajat Shinde, Rohit Lal, Arlindo Da Silva, Jorge Luis Guevara Diaz, Anne Jones, Simon Pfreundschuh, Amy Lin, Aditi Sheshadri, Udaysankar Nair, Valentine Anantharaj, Hendrik Hamann, Campbell Watson, Manil Maskey, Tsengdar J Lee, Juan Bernabe Moreno, Rahul Ramachandran
cs.AI
Abstract
Triggered by the realization that AI emulators can rival the performance of
traditional numerical weather prediction models running on HPC systems, there
is now an increasing number of large AI models that address use cases such as
forecasting, downscaling, or nowcasting. While the parallel developments in the
AI literature focus on foundation models -- models that can be effectively
tuned to address multiple, different use cases -- the developments on the
weather and climate side largely focus on single use cases, with particular
emphasis on mid-range forecasting. We close this gap by introducing Prithvi
WxC, a 2.3 billion parameter foundation model developed using 160 variables
from the Modern-Era Retrospective Analysis for Research and Applications,
Version 2 (MERRA-2). Prithvi WxC employs an encoder-decoder-based architecture,
incorporating concepts from various recent transformer models to effectively
capture both regional and global dependencies in the input data. The model has
been designed to accommodate large token counts to model weather phenomena in
different topologies at fine resolutions. Furthermore, it is trained with a
mixed objective that combines the paradigms of masked reconstruction with
forecasting. We test the model on a set of challenging downstream tasks namely:
Autoregressive rollout forecasting, Downscaling, Gravity wave flux
parameterization, and Extreme events estimation. The pretrained model with 2.3
billion parameters, along with the associated fine-tuning workflows, has been
publicly released as an open-source contribution via Hugging Face.
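The mixed training objective described in the abstract can be pictured as masking part of the tokenized input state and regressing the model's output against a target shifted forward in time. The PyTorch sketch below is illustrative only: the toy model, the 50% masking ratio, the MSE loss, and the tensor shapes are assumptions, not the released Prithvi WxC code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyEncoderDecoder(nn.Module):
    """Stand-in for the 2.3B encoder-decoder transformer; sizes are toy."""
    def __init__(self, dim=64, heads=4, depth=2):
        super().__init__()
        layer = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        # nn.TransformerEncoder deep-copies the layer template, so the
        # encoder and decoder stacks get independent weights.
        self.encoder = nn.TransformerEncoder(layer, depth)
        self.decoder = nn.TransformerEncoder(layer, depth)
        self.mask_token = nn.Parameter(torch.zeros(1, 1, dim))

    def forward(self, tokens, mask):
        # Swap masked positions for a learned mask token, then encode/decode.
        x = torch.where(mask.unsqueeze(-1),
                        self.mask_token.expand_as(tokens), tokens)
        return self.decoder(self.encoder(x))

def mixed_objective_loss(model, state_t, state_future, mask_ratio=0.5):
    # Hide a random subset of input tokens (masked reconstruction) and
    # regress the output against the future state (forecasting), blending
    # the two paradigms in a single target.
    batch, tokens, _ = state_t.shape
    mask = torch.rand(batch, tokens) < mask_ratio
    return F.mse_loss(model(state_t, mask), state_future)

# Toy usage with random "atmospheric state" tokens.
model = ToyEncoderDecoder()
x_t = torch.randn(2, 16, 64)     # tokenized state at time t
x_next = torch.randn(2, 16, 64)  # target state at t + delta_t
loss = mixed_objective_loss(model, x_t, x_next)
loss.backward()
```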
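Autoregressive rollout forecasting, the first downstream task, then amounts to repeatedly feeding the model's prediction back in as the next input. A minimal sketch, reusing the toy model above and assuming the model maps the state at time t to the state at t + delta_t with nothing masked at inference:

```python
import torch

@torch.no_grad()
def autoregressive_rollout(model, state, steps):
    # An all-False mask means no tokens are hidden at inference time
    # (an assumption of this sketch).
    trajectory = [state]
    for _ in range(steps):
        no_mask = torch.zeros(state.shape[:2], dtype=torch.bool)
        state = model(state, no_mask)  # prediction becomes the next input
        trajectory.append(state)
    return torch.stack(trajectory)  # (steps + 1, batch, tokens, dim)

forecast = autoregressive_rollout(model, x_t, steps=4)  # model, x_t from above
```

Rollout error compounds step by step, which is why the abstract singles this out as a challenging test of the pretrained representation.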
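Since the checkpoint is distributed via Hugging Face, fetching it can be done with the huggingface_hub client. The repo id and filename below are assumptions made for illustration; consult the project's Hugging Face page for the actual identifiers.

```python
from huggingface_hub import hf_hub_download
import torch

checkpoint_path = hf_hub_download(
    repo_id="ibm-nasa-geospatial/Prithvi-WxC-1.0-2300M",  # assumed repo id
    filename="prithvi.wxc.2300m.v1.pt",                   # assumed filename
)
state_dict = torch.load(checkpoint_path, map_location="cpu")
```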