Paper on a new way to assess the skill of model forecasts, showing improvements since 2015, released online in Weather and Forecasting

Summary: Forecasters use computer models to help predict the weather. One simple and important way to judge the computer forecasts is to check how well they predict the locations of the high and low pressure systems at a level about 5.5 km (3.5 miles) above the surface of the earth. But this doesn’t tell the whole story, so we have come up with a new method, called “Summary Assessment Metrics,” or SAMs. The first step in creating a SAM is to collect the forecasts for different times (24 h, 48 h, etc. into the future), different levels (near the surface, jet stream level, etc.), different regions (Northern Hemisphere, the Tropics, etc.) and different variables (wind, temperature, etc.). Each forecast is given a grade, or “score,” depending on how accurate it is. Then the scores are ranked (what we call “normalized”) so that the best forecast gets a one and the worst forecast gets a zero. Finally, all the normalized scores are averaged to give the SAM.
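The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used in the paper: it assumes each score is an error (lower is better), that forecasts are ranked against each other within each combination of time, level, region and variable, and that ties do not occur. The function name, the model labels "A", "B" and "C", and all the numbers are invented for the example.

```python
from collections import defaultdict


def summary_assessment_metric(records):
    """Average rank-normalized scores per model.

    records: iterable of (model, group, error) tuples. `group` identifies one
    (lead time, level, region, variable) combination. Assumption: `error` is a
    score where lower means a more accurate forecast.
    """
    by_group = defaultdict(list)
    for model, group, error in records:
        by_group[group].append((model, error))

    normalized = defaultdict(list)
    for entries in by_group.values():
        n = len(entries)
        if n < 2:
            continue  # a single forecast cannot be ranked against others
        # Sort worst (largest error) first, so rank 0 is worst and n - 1 is best.
        ordered = sorted(entries, key=lambda e: e[1], reverse=True)
        for rank, (model, _) in enumerate(ordered):
            normalized[model].append(rank / (n - 1))  # worst -> 0, best -> 1

    # The SAM for each model is the mean of its normalized scores.
    return {model: sum(v) / len(v) for model, v in normalized.items()}


# Invented illustration values, not real verification results.
g1 = ("24 h", "500 hPa", "Northern Hemisphere", "height")
g2 = ("48 h", "250 hPa", "Tropics", "wind")
records = [
    ("A", g1, 10.0), ("B", g1, 12.0), ("C", g1, 15.0),
    ("A", g2, 5.0), ("B", g2, 4.0), ("C", g2, 6.0),
]
sams = summary_assessment_metric(records)
print(sams)
```

In this made-up case, models A and B each come first once and second once, so both average 0.75, while model C is last in both groups and gets 0.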

We looked at SAMs for the three best computer models in the world: the Global Forecast System (GFS) run by the National Oceanic and Atmospheric Administration in the U.S., the Unified Model (UM) run by the United Kingdom Met Office, and the Integrated Forecasting System (IFS) run by the European Centre for Medium-Range Weather Forecasts.



  • From 2015 to 2017, the IFS was the best of the three models, followed by the GFS and UM.
  • The three models are slowly getting better at predicting the weather, and the GFS forecasts for the first few days have improved a lot. The GFS gain is due to a major improvement, made in May 2016, to the way weather data gets into the model.
  • SAMs can help in describing and understanding how accurate forecasts are.


