franksingleton
Well-known member
In situ data at fixed times from weather observing stations, radiosondes, ships and tethered buoys account for about 15% of the value of a forecast. In situ data at random times, from aircraft and drifting buoys, provide 8 or 9%. The majority of the input comes from satellites. Atmospheric Motion Vector data at fixed times from geostationary satellites (GEOS) account for about 11 or 12% of the value. These are derived by tracking high-level cloud in visible and infrared imagery, as well as tracking areas of water vapour high up.

Frank understates the difficulty of ingesting data into the system. Observations are often at point locations and, although these are carefully chosen, may not always sample the overall field effectively. The times of observations don't always correspond to the time-steps of the model. Interpolation is fraught with its own problems; choosing the best interpolation scheme for a particular dataset is not intuitive. Satellite data, while requiring less interpolation, may not measure the desired parameters directly, and may have a non-linear relationship with them.
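To make the interpolation point concrete, here is a minimal sketch (station positions and temperatures are invented for illustration) of estimating a value at a model grid point that sits between sparse observing stations. Even this simple 1-D linear scheme involves choices - nearest-neighbour, linear or spline interpolation would each give a different gridded field from the same observations.

```python
# Hypothetical illustration: interpolating sparse point observations
# onto a model grid point with simple 1-D linear interpolation.
# Station positions (km) and temperatures (deg C) are made up.

def interp_linear(xs, ys, x):
    """Linearly interpolate (xs, ys) at position x; xs must be sorted."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    for i in range(len(xs) - 1):
        if xs[i] <= x <= xs[i + 1]:
            t = (x - xs[i]) / (xs[i + 1] - xs[i])
            return ys[i] + t * (ys[i + 1] - ys[i])

# Three stations along a line
stations = [0.0, 50.0, 120.0]
temps = [12.0, 15.0, 9.0]

# A model grid point at 80 km falls between the 2nd and 3rd stations
print(interp_linear(stations, temps, 80.0))  # about 12.43 deg C
```

A real analysis system does this in three spatial dimensions plus time, for many variables at once, which is where the choice of scheme becomes far from intuitive.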
All these are handled routinely by the Met Office and other similar agencies, but that step is vital to modelling. I understand that the whole process is dynamic; the previous model is adjusted to match the latest data values.
I also understand that hind-casting is used to validate the modelling process, using data that were not available in time for forecasting.
As Frank says, the real issue is the ACTUAL resolution of the models used. Of course they can be resampled to provide a nicer looking display - but the real resolution is unchanged.
By far the greater data volumes are from low Earth orbit satellites (LEOS). These provide about 50% of the forecast value. A major problem is that neither infrared nor microwave soundings actually provide temperature or humidity; they measure the effects of temperature and humidity. The radiative transfer equations cannot be directly inverted.
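A toy example may help show why the inversion fails. In this sketch (the weighting function and profiles are invented, not from any real instrument) a single satellite channel sees a weighted average of the temperature profile, so two quite different atmospheres can produce the same measurement - the measurement alone cannot recover the profile.

```python
# Toy forward model (invented numbers): a satellite channel's
# brightness temperature is a weighted average of the temperature
# profile over several atmospheric levels.

weights = [0.1, 0.2, 0.4, 0.2, 0.1]  # channel weighting function, sums to 1

def brightness_temp(profile):
    """Forward model: weighted mean of the temperature profile (K)."""
    return sum(w * t for w, t in zip(weights, profile))

profile_a = [220.0, 230.0, 250.0, 265.0, 280.0]
profile_b = [225.0, 235.0, 250.0, 257.5, 280.0]  # a different atmosphere...

print(brightness_temp(profile_a))  # both print essentially the same
print(brightness_temp(profile_b))  # radiance, so the inversion is
                                   # not unique
```

This is why the data are used the other way round: the model predicts the radiances, and the fit is done in measurement space rather than by inverting the equations.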
The most recent satellite data source is radio occultation using GPS (GNSS) signals. These provide about 8% of the forecast value. They measure the refraction of radio signals at high resolution vertically but low resolution horizontally. Thus, they give profiles of air density. In the stratosphere that means they, effectively, measure temperature. In the troposphere the effects of temperature and humidity cannot be separated. The importance of GPSRO is that there is no instrumental bias and it gives very good data in the stratosphere.
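The stratospheric case works because, with water vapour negligible, the ideal gas law T = p / (rho * R_d) ties density directly to temperature. A minimal sketch, with illustrative numbers roughly appropriate to 20 km altitude:

```python
# Why an RO density profile fixes stratospheric temperature:
# with negligible water vapour, the ideal gas law gives
# T = p / (rho * R_d). Sample values below are illustrative.

R_D = 287.05  # specific gas constant for dry air, J/(kg K)

def temperature_from_density(pressure_pa, density_kg_m3):
    """Ideal-gas temperature (K) from pressure and dry-air density."""
    return pressure_pa / (density_kg_m3 * R_D)

# Roughly 20 km altitude: about 5500 Pa and 0.088 kg/m^3
print(temperature_from_density(5500.0, 0.088))  # about 218 K
```

In the troposphere the water vapour content alters the effective gas constant, so density alone no longer pins down temperature - which is exactly the separation problem described above.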
Data analysis is by optimum fitting over the period since the last forecast. For each time step over the past 6 hours (12 for ECMWF) the models predict all measured data - including radiances, microwave emissions and RO. The fitting is a multi-dimensional version of fitting two-dimensional data to a straight line or a curve.
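The two-dimensional analogue can be sketched directly: an ordinary least-squares fit of a straight line to noisy "observations" (the data points here are invented). Variational assimilation performs the same kind of minimisation, but over millions of model variables and many observation types at once.

```python
# 2-D analogue of the multi-dimensional fit: ordinary least squares
# for a straight line y = a*x + b through noisy observations.

def fit_line(xs, ys):
    """Least-squares slope and intercept for y = a*x + b."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    b = my - a * mx
    return a, b

# Invented observations scattered around the line y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]

a, b = fit_line(xs, ys)
print(a, b)  # close to 2 and 1
```

The real system weights each observation by its expected error and fits model state at every time step in the window, but the underlying idea - minimise the misfit between predictions and observations - is the same.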
On top of all the above is the problem of vastly different data volumes: small amounts of accurate in situ data versus vast quantities of indirect data at variable resolutions.
I hope all that makes sense and gives an indication of just how difficult data analysis is. Models have certainly improved greatly in recent years, but a major contribution has been the improvements both in satellite sensing and in the way the data are used.