VALIDATION METHODS AND LINKS TO A COASTAL-GIS IN THE DEVELOPMENT OF A HIGH RESOLUTION LIMITED AREA MODEL (HIRLAM) FOR PRODUCING A 40-YEAR WAVE ATLAS FOR THE IRISH AND CELTIC SEAS
(1), (3) Nandamudi Vijaykumar; (1), (2) Robert Devoy; (1) Jeremy Gault, Declan Dunne, Cathal O'Mahony
(1) Coastal and Marine Resources Centre (CMRC), Environmental Research Institute (ERI)
(2) Department of Geography, University College Cork (NUIC), Cork (IE)
(3) Laboratory of Computing and Applied Mathematics (LAC), National Institute for Space Research (INPE), S J Campos, SP (BR)
A 40-year hindcast (1958 to 1997) Wave Atlas for the Irish and Celtic Seas is being developed. Similar atlases are being produced for other European shelf seas and coastal waters. Linked project work in Ireland also involves the use of forward modelling techniques for wind - wave prediction in the Irish regions for the 21st Century. The methods used involve the downscaling of large-scale atmospheric General Circulation Models (GCMs) to regional - coastal levels. These are integrated with a Limited Area Model (HIRLAM) to obtain high-quality wind information. A second phase consists of integrating the downscaled wind results with a wave model (WAM Cycle 4) in order to obtain the required local - regional scale wave climate. This information can then be used to assess wave impacts on coastal processes after incorporating the wind - wave results into a Geographical Information System (GIS). In this complex process of modelling and data analysis it is essential to verify the reliability of the models and the results before data archiving and linkage to a GIS. This paper discusses the validation issues and procedures needed in the wind - wave modelling process. The methodology used and the sources of data against which the model results are compared are described.
1. Introduction
According to the Intergovernmental Panel on Climate Change (IPCC), climate warming
for the 21st Century will be associated directly with coastal and marine environmental
risks (Gates et al., 1992; Houghton et al., 1996). These risks will have important
impacts on human populations and wildlife. In response, people living in coastal
areas have demanded more detailed information about the impacts of climate
change in such areas. Recent reports have shown a
predicted increase in wind speeds, wave heights and the heights of storm surges
(Gunther et al., 1998). For the European North Atlantic margin, and particularly
the Irish - Scottish regions, storminess impacts will be significant
for coastal erosion and process changes (Devoy, 1994; Lozano et al., 2000,
in press).
In assessing these predicted risks of climate change it is valuable to identify potentially high risk and environmentally sensitive coastal areas (Devoy, 1992, 2000, in press). One methodology used in this approach is to quantify the continuous regional - local wind and wave conditions. These data are then used to identify those coasts commonly experiencing high wind-generated waves, as these will lead to strong surges, floods and loss of land.
In order to disseminate these results in a clear and structured manner, and to enable coastal managers to take appropriate protection measures, the data must be incorporated into a Geographic Information System (GIS) (Star and Estes, 1990). GIS can be an efficient tool for marine and coastal data analysis and, therefore, will play an important role in Coastal Zone Management (Bartlett and Wright, 2000). Any such GIS needs the identification of the geo-information required to assess coastal processes, the organisation of a database to store information, the evaluation of computer system-interoperability issues and, finally, the development of tools for displaying and analysing the information. To disseminate the information properly, rigorous validation procedures must be applied to guarantee the quality of the information that is to be stored and visualised. The work here concentrates on the validation procedures needed for the primary production of wind and wave data.
2. Winds and Wave Modelling
An EU Framework 5 research project, Hindcast of Dynamic Processes of the Ocean
and Coastal Areas of Europe (HIPOCAS), is being undertaken to produce a Wave
Atlas for European coastal waters for the years 1958 to present (e.g., Soares
et al., 2002). The possibility for developing such numerical modelling of
wind-wave dynamic processes has been limited until recently (Gates et al.,
1992; WASA Group, 1998). The task of this research group (University College Cork)
in the HIPOCAS project is to generate the wave hindcast atlas for the Irish
and Celtic Seas and for adjacent North Atlantic waters.
In order to produce the wave statistics (hourly resolution), wind parameters at +10m above the sea surface need to be generated from numerical models (i.e., using downscaled GCM sources) and then integrated with a separate wave model. This process allows the production of wave height, direction and frequency information.
The basis for the HIPOCAS project has been the 40-year global atmospheric reanalysis (Kalnay et al., 1996) carried out by the National Center for Environmental Prediction (NCEP), Washington, USA and the National Center for Atmospheric Research (NCAR), Boulder, Colorado, USA. The resolution of the GCM outputs is 200 km, with results produced at 6-hour intervals. However, the coarse-scale nature of these GCM and linked regional models is not adequate for conducting wave studies in coastal areas (Weisse and Gayer, 2000). Consequently, this GCM modelling work forms the source (reanalysis data) for model downscaling to a regional model level for the North Atlantic (REMO) (Jacob and Podzun, 1997), with a model resolution of 50 km and wind data generated for every hour. For some areas, at this regional scale, a spectral nudging technique (von Storch et al., 2000) has been applied to enhance lateral boundary conditions by imposing time-variable large-scale atmospheric states on the regional atmospheric model. In the case of Irish and Celtic waters, these model results (REMO output) have been used further as boundary forcing for a limited area model, in order to increase both temporal and spatial resolutions to the needed local scales (10 km: latitude and longitude grid spacing of 0.12°). The limited area model used for generating high-resolution wind fields for the Irish region was HIRLAM (High-Resolution Limited Area Model) (Sass et al., 2000). In this project the results of the limited area model are wind fields. These data are then used as input into a wave model (WAMDI Group, 1988; Gunther et al., 1992).
These downscaled atmospheric and wave models must undergo a
validation process in order to verify the reliability of the models and the
subsequent wind and wave results. Validation is based ideally upon known (real)
observations against which the model outputs can be compared. Regrettably,
there is only very limited continuous meteorological and wave observational
information available, mainly from the post 1980s, about the past wind and
wave conditions/events from coastal - offshore areas. This information also
is generally spatially disjunct. Improvements have been made progressively
though to the present in these 'continuous' observations, particularly through
the deployment of more weather buoys and in satellite imagery (http://www.ecmwf.int).
In spite of these improvements, model validation using real data remains difficult
and a range of procedural options for validation has to be considered. The
importance of this for subsequent databases and the production of coastal
GIS is summed up, of course, by the maxim "garbage in, garbage out".
3. Validation Procedure and Results
3.1 Data option 1
Data validation plays a major role in guaranteeing the reliability and accuracy
of models. Climate and ocean models are complex and good validation procedures
are particularly necessary, especially if forward modelling, the prediction
of events or of continuous environmental changes (e.g. storms, wind, pressure,
rainfall), is an objective of the modelling study. In such modelling it is
usual to run the models initially for periods over which a good quality time
series of observations exists (Lozano et al., in press). Further, in this
Wave Atlas project several countries are participating and producing wind
- wave results from different regions in Europe. Consequently, it is important
to establish rigorous and common validation procedures to guarantee a similar
quality for the data generated for use in the integrated regional atlases.
Given the inevitable unevenness of data quality and availability in the different
regions, this is no easy task.
Observed data needed for validation purposes in this Wave Atlas must be based on wind fields over the European offshore waters. This means that in order to obtain measured data, one has to depend on the availability of ship and/or weather buoy information distributed strategically over the region. There is only very limited continuous information, mainly from the post 1980s, about the past wind and wave conditions/events from these coastal - offshore waters (http://www.ecmwf.int and http://www.noaa.gov). The information available is also generally spatially disjunct and subject to time breaks and other non-homogeneities. Similar homogeneity problems are common in most environmental records, even from land areas (e.g., Hickey, 2003). Improvements in these 'continuous' offshore observations have, though, been made progressively up to the present, particularly through the deployment of more weather buoys and in the technology of satellite imagery (http://www.ecmwf.int and http://www.noaa.gov).
In the 'Irish region', the initial validation procedure adopted has been to compare the wind fields from the limited area atmospheric model (HIRLAM) with wind fields from these direct observational records. (A similar initial approach was adopted by the other atlas project partners [Soares et al., 2002].) The data comparisons have then been tested using correlation coefficient, root mean square (rms) and other standard inferential statistical tests. The data sources used have included time series observations from ships, weather buoys and satellite imagery, as linked to weather charts (for an initial wind - wave pattern checking).
One traditional type of 'real' data source for the Irish Sea
region has been from ship-based observations. The data available have rarely
been found to be ideal for validation purposes and an example serves to illustrate
the practical difficulties of using such observations. In this case the ship-based
data provided were of wind speeds for the year 1992, taken close to the sea
surface (i.e. comparable with the model +10m sea surface wind output data).
However, the ship's recording positions and observation times were not regular.
A 'fixed' ship position was not available. Consequently, the known ship positions
were clustered and a 'mean' coordinate point established. The wind observations
made, even though taken from different coordinates, are still very close to
one another and can be represented by the averaged ship's position. Further,
a regular time series of observations (i.e., at 1, 3, 6 hourly intervals etc.)
was also unavailable: a wind speed observation made at a given time on one
day (e.g., 11.00 hours) would not necessarily be available for the same time
on the following day. In spite of these inadequacies these were the only data
available.
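The reduction of clustered ship positions to a single 'mean' coordinate can be sketched as follows. This is a minimal illustration only, not the project's code: the function names and the ship fixes are hypothetical, and a simple arithmetic mean of latitude and longitude is assumed, which is reasonable only for a tight cluster away from the poles and the date line. The spread check is an added assumption that gives one way of confirming that the fixes really are "very close to one another" relative to the model grid spacing.

```python
import math

def mean_position(positions):
    """Average a cluster of (lat, lon) pairs, in degrees, into one point.

    Assumes a tight cluster, so a plain arithmetic mean is adequate.
    """
    lats = [p[0] for p in positions]
    lons = [p[1] for p in positions]
    return sum(lats) / len(lats), sum(lons) / len(lons)

def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))

def max_spread_km(positions):
    """Largest distance from the mean position to any individual fix."""
    lat0, lon0 = mean_position(positions)
    return max(haversine_km(lat0, lon0, lat, lon) for lat, lon in positions)

# Hypothetical ship fixes (illustrative values only, not project data).
fixes = [(53.50, -5.20), (53.52, -5.18), (53.48, -5.25), (53.51, -5.22)]
centre = mean_position(fixes)
spread = max_spread_km(fixes)
print("mean position:", centre, "max spread (km):", round(spread, 2))
```

If the maximum spread is small compared with the 10 km model grid spacing, representing the cluster by its mean position should introduce little additional error into the comparison.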
A single month (January 1992) offered a reasonably consistent ship position
and observational time series. For this month the wind fields (East-West component
[u] and North-South component [v]) were extracted from the HIRLAM results
for the ship coordinates and the model's time intervals were selected as close
to those of the ship's as possible. Wind speeds from the two sources
are given in Figure 1 and show a good visual match. The correlation coefficient
obtained for the wind speed was 0.68 and the rms obtained was 2.99.
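The comparison described above (pairing each ship observation with the closest model time, converting the u and v components to a wind speed, and computing the correlation coefficient and rms) can be sketched as follows. This is a sketch under stated assumptions rather than the procedure actually used: 'rms' is taken here to mean the root-mean-square difference between the two series, times are expressed as hours from an arbitrary start, the 1.5-hour matching tolerance is a hypothetical choice, and all data values are invented for illustration.

```python
import math

def wind_speed(u, v):
    """Wind speed from East-West (u) and North-South (v) components."""
    return math.hypot(u, v)

def nearest_in_time(obs_times, model_times, max_gap_hours=1.5):
    """Pair each observation with the closest model time step.

    Returns (obs_index, model_index) pairs; observations with no model
    step within max_gap_hours (a hypothetical tolerance) are dropped.
    """
    pairs = []
    for i, t in enumerate(obs_times):
        j = min(range(len(model_times)), key=lambda k: abs(model_times[k] - t))
        if abs(model_times[j] - t) <= max_gap_hours:
            pairs.append((i, j))
    return pairs

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def rms_difference(x, y):
    """Root-mean-square difference between two equal-length series."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))

# Invented example: irregular ship observations against 3-hourly model output.
obs_times = [0.0, 5.5, 11.0, 19.0, 30.0]           # hours
obs_speed = [8.0, 10.5, 12.0, 9.0, 7.0]            # m/s
model_times = [0, 3, 6, 9, 12, 15, 18, 21, 24]     # hours
model_u = [6.0, 7.0, 8.5, 9.0, 10.0, 9.0, 7.5, 6.0, 5.0]
model_v = [5.0, 5.5, 6.0, 6.5, 7.0, 6.0, 5.0, 4.5, 4.0]

pairs = nearest_in_time(obs_times, model_times)
obs = [obs_speed[i] for i, _ in pairs]
mod = [wind_speed(model_u[j], model_v[j]) for _, j in pairs]
print("matched pairs:", len(pairs))
print("correlation:", round(pearson_r(obs, mod), 3))
print("rms difference:", round(rms_difference(obs, mod), 3))
```

Dropping observations that have no nearby model step, rather than interpolating the model output, keeps the comparison tied to values the model actually produced, at the cost of a shorter series.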

Figure 1. HIRLAM results and the ship observed data, Irish Sea.
A further source of observational data was weather buoys located at stations K2 (Latitude: 51.00°; Longitude: -13.30°) and K4 (Latitude: 54.54°; Longitude: -12.36°). These stations have observations of wind fields from 1993 (data available for every 3 hours) to present (available with hourly time slices). Wind speeds comparing HIRLAM results with data from one of the stations (K4) are plotted in Figure 2. Data were selected for the period September 1st to September 7th 1996.

Figure 2. HIRLAM results and K4 Station Weather Buoy
3.2 Data option 2
Even though such observational records may often suffer from a range of these
types of data inequalities they are still likely to be the best option for
use in model validation. However, it is not always the case that these data
are on open access or their existence well publicised. Commercial (private),
government and linked public service institutions (e.g., meteorological services,
defence, energy organisations) often hold such data. But the information may
be treated as proprietary and, depending on the type of data, available only
at commercial costs and in limited formats, or they may be considered sensitive
(e.g., for security and commercial competition reasons). Pricing policies
may also vary greatly between the different providers. In spite of these now
inherent problems in data provision and holding, meteorological and linked
state services are often very helpful in releasing such data at nominal costs
if they are known to be for research and non-commercial purposes only.
Due to the difficulties in obtaining observed/real data for long time series
(>months) for offshore areas, particularly prior to the 1990s, other
'validation' options have to be considered. One such option in the case of
meteorological information is the use of data derived from the modelling reanalysis
process (i.e. model prediction/ output data), produced from models which also
cover land areas, or from models initially intended for use in different research
contexts (e.g. for ocean - offshore area meteorological data then the use
of NCEP, ECHAM4, REMO, ERA-15, ERA-40 model outputs is possible).
An advantage of using such data is that they are often easily available and are free (or almost so). These types of data are commonly stored in public domain databases, on websites, or in linked Geographic Information Systems. The major negative aspect of these data sets is that they do not have the same potential quality as observational records, since they are the outputs/reanalyses of model runs and the models may also be producing the data at much coarser resolutions than are needed. Further, whilst these types of reanalysis data should themselves of course be the outcome of rigorous validation of their own source models, this cannot be assumed, and the danger of 'methodological circularity' producing erroneous results is real.
For this Wave Atlas work (HIPOCAS - Irish Sea region) one such source of modelled (reanalysis) data has come from the ERA-40 project (e.g., Gibson et al., 1997). This project's objective has been to define (parameterise) and analyse the state of the global atmosphere, and has used satellite derived data sources. Data sets consist of 6-hour time slices, with monthly mean data also available. The ERA-40 project was developed through the ECMWF (European Centre for Medium-Range Weather Forecasts) (http://www.ecmwf.int) and covers the period 1957 to 2001. The project was based on earlier work from ERA-15, for the years 1979 to 1993. Although developed for large-scale work, the wind data sets are applicable to the Wave Atlas project and have been used to compare with the HIRLAM outputs.
Figure 3 gives an example of the wind results (for +10m sea level) from the two model outputs. A coordinate point of -5.00 (Longitude) and 55.00 (Latitude) was taken for the time series (of 28 points shown on X-Axis) within the period September 1st 1996 to September 7th. Time intervals used were of 6 hours. The correlation for the wind speed obtained was 78.6% (u component: 95.4%; v component: 64.4%) whereas the rms obtained was 1.10 (u component: 1.23; v component: 1.22).

Figure 3. Wind velocity results from HIRLAM and the ERA-40 reanalysis data.
3.3 Data option 3
In addition to the use of reanalysis/ model results, the modelling methodology
itself provides a further validation option (more an internal process check)
in helping the verification of the accuracy of the model downscaling. For
the Wave Atlas the HIRLAM runs (1958-1997) for the wind field for the Irish
regions used the regional scale model (REMO) to provide the forcing atmospheric
boundary conditions. A basic check to test whether the HIRLAM model outputs
worked correctly is, therefore, to compare the HIRLAM results with the initial
REMO boundary parameters; to verify whether the HIRLAM model was properly
nested into the REMO simulations. This test was undertaken and examples of
some results for wind field data are given (Figure 4), together with the correlation
coefficients and rms values (Tables 1 and 2).
The trends and variations in the wind plots for the 2 models are very close (Figure 4). The difference seen in the wind velocity values though is to be expected, due to the different heights used to generate the wind fields: +10 metre heights are given for HIRLAM, whilst for REMO wind values are given at heights between +20 and +40 metres above the sea surface.

Figure 4. Example of wind velocities for HIRLAM and REMO data.
Correlation and root mean square tests were applied to various random coordinate points for several time periods. Tables 1 and 2 show the statistical results for one of the HIRLAM model's boundary coordinate points (Longitude of -7.30 and Latitude of 55.30). However, the last row (1-7 September 1996) shows the correlation and rms for the coordinate point with Longitude -5.00 and Latitude 55.00. The number of days used in each time series was 4, with the exception of the last row, for which 7 days were considered. At both coordinate positions, 3-hourly time slices were used. The results show a very strong positive correlation, supporting a correct nesting and operation of the HIRLAM model.

Table 1. Correlation Coefficients: HIRLAM x REMO

Table 2. RMS: HIRLAM x REMO
3.4 Further statistical testing
In this presentation the statistical measures shown as assessment of the quality
of HIRLAM results have been standard correlation coefficient and root mean
square tests. It will be necessary in continuing work to use other types of
test in order to make the verification methodology more robust. This is particularly
important as only the wind and wave output statistics would normally be presented
in a database and linked Geographic Information System.
One such statistical operation suggested (Weisse and Gayer, 2000) and used in assessing the quality of hindcast wind fields is that of complex correlation. This operation is very practical, as it determines the correlation between two vector time series, as is the case for winds. In short, complex correlation uses $u^{obs} + i v^{obs}$ and $u^{mod} + i v^{mod}$ to calculate the coefficient. The superscript obs denotes observation values whereas mod denotes the results from a forecast model, and $i$ represents $\sqrt{-1}$. Detailed information on this complex correlation operator can be found on the website http://storms.rsmas.miami.edu/~cook/thesis/node19.html and in Kundu (1976).
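A complex correlation of this kind can be sketched as follows. The sketch assumes the commonly quoted form of the coefficient, in which the time-averaged product of the complex conjugate of the observed series with the modelled series is normalised by the root of the product of their mean squared magnitudes; whether the series means should be removed first is a choice that should be checked against Kundu (1976). The function name and the data are hypothetical. The magnitude of the coefficient measures the overall agreement of the two vector series, and its phase gives the average angle by which the modelled vectors are rotated relative to the observed ones.

```python
import cmath
import math

def complex_correlation(u_obs, v_obs, u_mod, v_mod):
    """Complex correlation between observed and modelled wind vectors.

    Each wind vector is written as w = u + i*v. No mean is removed here
    (an assumption). Returns (magnitude, phase in degrees), with a
    positive phase meaning the model vectors are rotated
    counterclockwise relative to the observations.
    """
    w_obs = [complex(u, v) for u, v in zip(u_obs, v_obs)]
    w_mod = [complex(u, v) for u, v in zip(u_mod, v_mod)]
    cross = sum(a.conjugate() * b for a, b in zip(w_obs, w_mod))
    norm = math.sqrt(sum(abs(a) ** 2 for a in w_obs) * sum(abs(b) ** 2 for b in w_mod))
    rho = cross / norm
    return abs(rho), math.degrees(cmath.phase(rho))

# Invented example: model winds are the observed winds turned by 90 degrees
# counterclockwise, so (u, v) becomes (-v, u).
u_obs = [1.0, 2.0, 3.0]
v_obs = [0.0, 1.0, -1.0]
u_mod = [0.0, -1.0, 1.0]
v_mod = [1.0, 2.0, 3.0]

magnitude, angle = complex_correlation(u_obs, v_obs, u_mod, v_mod)
print("magnitude:", magnitude, "rotation (deg):", angle)
```

Unlike separate correlations for the u and v components, this single coefficient is unchanged in magnitude by a constant directional offset between the two series, and reports that offset separately as the phase.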
Another potentially useful technique, though it has not been used commonly in evaluating a time series of weather related parameters such as wind fields, is Gradient Pattern Analysis (GPA). This innovative technique has been developed for application to non-linear spatio-temporal structures, to estimate the gradient moment based on the asymmetries among the vectors of the gradient field of the scalar fluctuations. It is used to characterise the formation and evolution of extended patterns based on the spatial-temporal correlations between large and small amplitude fluctuations of the pattern-structure represented as a gradient field (Rosa et al., 1998; Assireu et al., 2002; Rosa et al., 2003). Unlike most statistical tools, GPA does not rely on the statistical properties of the series but depends solely on the local symmetry properties of the signal's gradient pattern. Moreover, when considering complex variability, it takes into account the directional information contained in the vector field. The technique is based on two operators: Asymmetric Amplitude Fragmentation (AAF) (Rosa et al., 1999) and Complex Entropic Form (CEF) (Ramos et al., 2000). AAF performs a global gradient asymmetry measurement by determining the symmetry breaking of a given dynamical pattern. The CEF operator is a measure of regularity and permits the quantification of the degree of phase disorder associated with a given gradient field. The idea here is to conduct a more rigorous study of how the application of such operators would help establish a valid verification methodology for weather related information before it is stored in a GIS for future use. Further useful tests would concentrate on the accuracy of the model in predicting the occurrence of extreme events (timing and scale parameters), together with the testing of data homogeneity over time (Soares et al., 2002; Soares, 2003).
4. Conclusions
The evaluation of climate change and linked environmental processes has drawn
a lot of attention in the media since the 1990s, and has been a topic of
fundamental research amongst both academic and commercial institutes around
the world. Within the scope of coastal science and process work, many studies
are being conducted as to how global climate changes will affect coastal regions,
in terms of coastal flooding and linked river catchment functioning, erosion
and many other coastal management concerns (Smith, 2000). These evaluations
for the future depend necessarily upon the use of numerical modelling techniques
and upon their accuracy in predicting the operation of environmental systems.
In this work good model validation procedures are vital.
The work of the Wave Atlas project (HIPOCAS) is linked to the evaluation of coastal processes and to assessing the wider uses of ocean environments (e.g., shipping and port management), particularly through the use of data incorporated into coastal GIS. The work requires the development of validation procedures before wind field data obtained from a high resolution atmospheric model (HIRLAM) can be incorporated into a wave model, together with the further validation of the wave data outputs. This validation though is inherently difficult because of the limited sources of good quality and readily available observational data from the offshore zone.
Initial verifications of the wind fields generated by HIRLAM for the Irish Atlantic region have been based upon standard statistical testing (e.g., using correlation coefficient and rms measures). The data targeted for comparisons with the models have been derived from three main sources: 1. observational data (weather charts, observations collected by ships and wave buoys), 2. model reanalysis results (e.g., ERA-40 reanalysis) and 3. initial model boundary forcing parameters (REMO). In all cases the HIRLAM results have been shown to be reliable, giving consistently high correlation values from the three approaches used. A perfect statistical fit from HIRLAM is unlikely, given the nature of the modelling process, and this is also true for all available weather forecast models. Some might deliver better results than others depending upon the spatial region of interest (i.e. including what real observations are available for model testing) and how the model boundary conditions are set up. The valuable aspect of the HIRLAM work for wind and later integration into wave predictions (WAM Cycle 4 model) is that the model produces high-resolution output, giving data which will be of direct use to environmental managers. The communication of the quality (validity) of model results, and their subsequent incorporation into management tools (e.g., databases, GIS), remains a problem. Data storage tools rarely provide all the supporting model information, dealing necessarily with only the most relevant and usable results. Further, the modelling source work may be prevented from providing complete data sets, due to confidentiality and non-commercial use restrictions. One approach to helping this quality control problem would be to provide a quality/validation rating structure, based upon the model's performance under an internationally agreed range of validation measures.
The authors are not aware that any such functioning quality control system exists: this approach is worth an investigation.
In the Wave Atlas project, the next steps are to conduct validation measures based on complex correlation tests, which are useful in the case of vector fields such as wind parameters. Tests will also be conducted based on the AAF operator used in Gradient Pattern Analysis. Once data are validated, procedures will be developed to store these data in a Geographical Information System, as well as to develop interfaces to integrate these wind fields into a wave model so that wave field data are produced. These will form the basis for the evaluation of coastal process operations.