Atmos. Meas. Tech., 11, 1009–1017, 2018
https://doi.org/10.5194/amt-11-1009-2018

Research article 20 Feb 2018

# Importance of interpolation and coincidence errors in data fusion

Simone Ceccherini¹, Bruno Carli¹, Cecilia Tirelli¹, Nicola Zoppetti¹, Samuele Del Bianco¹, Ugo Cortesi¹, Jukka Kujanpää², and Rossana Dragani³
• ¹Istituto di Fisica Applicata “Nello Carrara” del Consiglio Nazionale delle Ricerche, Via Madonna del Piano 10, 50019 Sesto Fiorentino, Italy
• ²Finnish Meteorological Institute, Earth Observation Unit, P.O. Box 503, 00101 Helsinki, Finland
• ³European Centre for Medium-Range Weather Forecasts, Shinfield Park, Reading, RG2 9AX, UK

Correspondence: Simone Ceccherini (s.ceccherini@ifac.cnr.it)

Abstract

The complete data fusion (CDF) method is applied to ozone profiles obtained from simulated measurements in the ultraviolet and in the thermal infrared in the framework of the Sentinel 4 mission of the Copernicus programme. We observe that the quality of the fused products is degraded when the fusing profiles are either retrieved on different vertical grids or referred to different true profiles. To address this shortcoming, a generalization of the complete data fusion method, which takes into account interpolation and coincidence errors, is presented. This upgrade overcomes the encountered problems and provides products of good quality when the fusing profiles are both retrieved on different vertical grids and referred to different true profiles. The impact of the interpolation and coincidence errors on number of degrees of freedom and errors of the fused profile is also analysed. The approach developed here to account for the interpolation and coincidence errors can also be followed to include other error components, such as forward model errors.

1 Introduction

Many remote sensing observations of vertical profiles of atmospheric variables are obtained with instruments operating on space-borne and airborne platforms, as well as from ground-based stations. Recently, the complete data fusion (CDF) method (Ceccherini et al., 2015) was proposed for combining independent measurements of the same profile in order to exploit all the available information and obtain a comprehensive and concise description of the atmospheric state. This is an a posteriori method that uses standard retrieval products. Although simple to implement, the CDF method yields products equivalent to those of a simultaneous retrieval, which is considered the most comprehensive way of exploiting different observations of the same quantity (Aires et al., 2012) but has a greater computational complexity. However, so far the data fusion method has mainly been applied to measurements performed by the same instrument while sounding the same air sample.

Limited tests were conducted on measurements performed by different instruments, for which inconsistencies due to differences in the observed true profiles (caused by the imperfect coincidence of the space–time locations of the measurements) could degrade the optimal performance of the simultaneous retrieval. Regarding the fusion of data provided by different instruments, it has been proved (Ceccherini, 2016) that the CDF method is completely equivalent to the measurement space solution (MSS) data fusion method (Ceccherini et al., 2009). The latter was successfully applied to the data fusion of MIPAS-ENVISAT and IASI-METOP measurements (Ceccherini et al., 2010a, b) and of MIPAS-STR and MARSCHALS measurements (Cortesi et al., 2016). However, since in these cases the measurements to be fused (referred to as fusing profiles hereafter) carried information about basically complementary altitude ranges, their possible inconsistency did not result in unrealistic fused profiles.

The first applications of data fusion were made with profiles retrieved on the same vertical grid. A first analysis of the effect of different grids on the quality of the fused products was performed and presented by Ceccherini et al. (2016). In this case, the individual profiles were first obtained on grids optimally defined according to the information content of the individual observations. Then, the CDF method was performed using averaging kernel matrices (AKMs) interpolated to a common grid optimized for the data fusion product. Compared to the case in which the individual retrievals are obtained directly on the grid optimized for the data fusion, the number of degrees of freedom (DOF) is reduced by about a quarter with this approach. Thus, in data fusion applications the choice of the retrieval grid can lead to an information content loss that cannot be restored with interpolation.

Here, we consider the general problem posed by the application of the CDF method to measurements performed by different instruments that are retrieved on different vertical grids and refer to different true profiles (which correspond to the case of fusing profiles measured in different geolocations). The analysis of this problem suggests a modification of the CDF method, taking into account interpolation and coincidence errors. We determine the expressions of these errors and show how they enter in the CDF formula. The study is performed using simulated measurements of ozone profiles obtained in the ultraviolet and in the thermal infrared in the framework of the Sentinel 4 (S4) mission (ESA, 2017) of the Copernicus programme (http://www.copernicus.eu/main/sentinels). The advantages in using a multispectral approach for observing ozone profiles from space have been studied, using simulated measurements, by Landgraf and Hasekamp (2007), Worden et al. (2007), Natraj et al. (2011), Hache et al. (2014) and Costantino et al. (2017), and, using real measurements, by Fu et al. (2013) and Cuesta et al. (2013). Two review papers on this subject are Lahoz et al. (2012) and Timmermans et al. (2015).

The paper is organized as follows: Section 2 presents an account of the problems that occur when the CDF method is applied to vertical profiles retrieved on different vertical grids and referring to different true profiles. In Sect. 3, we theoretically analyse the problems discussed in Sect. 2 and show how the CDF method can be modified to overcome them. In Sect. 4, we show how the solution proposed in Sect. 3 solves the problems discussed in Sect. 2. In Sect. 5, we describe how to deal with forward model errors. Conclusions are drawn in Sect. 6.

2 Application of the CDF method to profiles retrieved on different vertical grids and related to different true profiles

The future atmospheric Sentinel missions of the Copernicus programme (http://www.copernicus.eu/main/sentinels) will provide great scope and a real test bed for data fusion applications. The wealth of data that will become available from these missions will likely present technical challenges to many applications. With the use of data fusion, the number of products can be reduced while maintaining the information content of the original datasets. For this reason, we test the CDF method on simulated S4 data. We simulate two S4 ozone vertical profile measurements as they could be obtained from the Infrared Sounder (IRS) in the thermal infrared and from the Ultraviolet, Visible and Near-Infrared Sounding (UVN) spectrometer in the ultraviolet (http://www.eumetsat.int/website/home/Satellites/FutureSatellites/MeteosatThirdGeneration/MTGDesign/index.html) on board the MTG (Meteosat Third Generation) satellite. We refer to these two simulated measurements as the TIR measurement and the UV measurement, respectively.

In order to evaluate the effect of the variability of vertical grids and of true profiles, three cases are considered:

1. The simulated measurements refer to the same true profile and are retrieved on the same vertical grid.

2. The simulated measurements refer to the same true profile but are retrieved on different vertical grids.

3. The simulated measurements refer to different true profiles and are retrieved on the same vertical grid.

In all three cases, the true profile and the vertical grid of the UV measurement are kept fixed and, when pertinent, are changed for the TIR measurement. For simplicity, we define the vertical grid of the data fusion product to coincide with the fixed grid of the UV measurement. In the following, the vertical grid of the fusion product is referred to as the fusion grid.

For a meaningful comparison of the quality of fusing and fused profiles, it is necessary to have common a priori profiles and common a priori covariance matrices (CMs). Therefore, the a priori of the fusing profiles, which are produced with individual a priori assumptions, have been modified using the method described in Ceccherini et al. (2014). In the comparisons, the same a priori profiles provided by the McPeters and Labow climatology (McPeters and Labow, 2012) are used for all fusing and fused profiles. The a priori CMs are obtained using the standard deviation of the McPeters and Labow climatology when its value is larger than 20 % of the a priori profile and a value of 20 % of the a priori profile in the other cases. The off-diagonal elements are calculated considering a correlation length of 6 km. The correlation length is used to reduce oscillations in the retrieved profile and the value of 6 km is typically used for nadir ozone profile retrieval (Liu et al., 2010; Kroon et al., 2011; Miles et al., 2015).
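The construction of the a priori CM described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the function name `apriori_covariance`, the toy numbers, and in particular the exponential form of the off-diagonal decay are assumptions, since the text only states a correlation length of 6 km.

```python
import numpy as np

def apriori_covariance(z_km, x_a, clim_std, floor_frac=0.20, corr_len_km=6.0):
    """Build an a priori covariance matrix on the vertical grid z_km.

    The standard deviation is the climatological one, but never smaller
    than floor_frac * x_a (20 % in the text). The off-diagonal elements
    decay exponentially with |z_j - z_k| / corr_len_km; the exponential
    form is an assumption of this sketch.
    """
    z = np.asarray(z_km, dtype=float)
    sigma = np.maximum(np.asarray(clim_std, dtype=float),
                       floor_frac * np.asarray(x_a, dtype=float))
    corr = np.exp(-np.abs(z[:, None] - z[None, :]) / corr_len_km)
    return np.outer(sigma, sigma) * corr

# Illustrative numbers only (not taken from the McPeters and Labow climatology).
z = np.array([0.0, 6.0, 12.0])        # altitude levels in km
x_a = np.array([10.0, 20.0, 40.0])    # a priori profile
clim_std = np.array([1.0, 5.0, 8.0])  # climatological standard deviation
S_a = apriori_covariance(z, x_a, clim_std)
```

With these numbers the 20 % floor is active on the first level only, where the climatological standard deviation (1.0) is below 20 % of the a priori value (2.0).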

The results obtained in the three test cases are reported in Figs. 1–3. These figures show the true profiles in panel (a), the mean value of the true profiles and the profiles obtained from the measurements (TIR, UV and data fusion) in panel (b) and the residuals in panel (c), i.e. the differences between the three estimated profiles and the mean value of the true profiles.

We observe that, while in case 1 the differences between the profile obtained from the fusion and the mean of the true profiles are smaller than, or comparable to, those of the profiles obtained from the TIR and UV measurements, in cases 2 and 3 these differences are significantly larger. Therefore, in cases 2 and 3 the fusion provides a product of poorer quality than that of the single products.

These tests show that the CDF algorithm and the equivalent simultaneous retrieval work well in case 1, while they have problems in cases 2 and 3, where the profiles are retrieved on different vertical grids and are referred to different true profiles, respectively.

Figure 1. (a) True ozone profiles related to TIR (red line) and UV (blue line) measurements. (b) Ozone profiles obtained from TIR measurement (red line), from UV measurement (blue line), from the data fusion (black line) compared with the mean value of the true profiles (green line). (c) Residual errors obtained as differences of the ozone profiles obtained from TIR measurement (red line), from UV measurement (blue line) and from data fusion (black line) from the mean value of true profiles. All the reported quantities are related to case 1.

Figure 2. As Fig. 1 but for case 2.

Figure 3. As Fig. 1 but for case 3.

The problem encountered in case 2 is due to the fact that the data fusion is made using estimates of the AKMs on the fusion grid (see Sect. 3.1) obtained by interpolation of the original AKMs (Ceccherini et al., 2016), which are only an approximation of the real AKMs on the fusion grid. We refer to this effect as interpolation error. The problem encountered in case 3 is related to different true profiles and we refer to this effect as coincidence error because it occurs when fusing profiles that do not correspond to the same space–time location.

3 Method

In this section, a theoretical analysis is performed to overcome the problems highlighted in the previous section. In Sect. 3.1, we recall the formulas of the CDF method in order to establish the formalism subsequently used in Sect. 3.2, where an upgrade of the method is proposed.

## 3.1 CDF

Let us assume that we have N independent and simultaneous measurements of the vertical profile of an atmospheric target referred to the same space–time location. Performing the retrieval of the N measurements with the optimal estimation method (Rodgers, 2000), we obtain N vectors $\hat{\mathbit{x}}_{i}$ (i = 1, 2, ..., N), here assumed to be estimates of the profiles made on a common vertical grid. The use of a priori information makes it possible to have a common retrieval grid even for observations with different vertical coverage.

The vectors $\hat{\mathbit{x}}_{i}$ are characterized by the CMs ${\mathbf{S}}_{i}$ and the AKMs ${\mathbf{A}}_{i}$ (Ceccherini et al., 2003; Ceccherini and Ridolfi, 2010; Rodgers, 2000):

$\begin{array}{}\text{(1)}& {\mathbf{S}}_{i}\equiv \langle {\mathbit{\sigma }}_{i}{\mathbit{\sigma }}_{i}^{T}\rangle ={\left({\mathbf{K}}_{i}^{T}{\mathbf{S}}_{yi}^{-1}{\mathbf{K}}_{i}+{\mathbf{S}}_{ai}^{-1}\right)}^{-1}{\mathbf{K}}_{i}^{T}{\mathbf{S}}_{yi}^{-1}{\mathbf{K}}_{i}{\left({\mathbf{K}}_{i}^{T}{\mathbf{S}}_{yi}^{-1}{\mathbf{K}}_{i}+{\mathbf{S}}_{ai}^{-1}\right)}^{-1},\end{array}$

$\begin{array}{}\text{(2)}& {\mathbf{A}}_{i}\equiv \frac{\partial \hat{\mathbit{x}}_{i}}{\partial \mathbit{x}}={\left({\mathbf{K}}_{i}^{T}{\mathbf{S}}_{yi}^{-1}{\mathbf{K}}_{i}+{\mathbf{S}}_{ai}^{-1}\right)}^{-1}{\mathbf{K}}_{i}^{T}{\mathbf{S}}_{yi}^{-1}{\mathbf{K}}_{i},\end{array}$

where ${\mathbit{\sigma }}_{i}$ are the errors on $\hat{\mathbit{x}}_{i}$ obtained by propagating the errors of the observations through the retrieval processes (noise errors), ${\mathbf{K}}_{i}$ are the Jacobians of the forward models, ${\mathbf{S}}_{yi}$ are the CMs of the observations, ${\mathbf{S}}_{ai}$ are the CMs of the a priori profiles and $\mathbit{x}$ is the true profile.

The CDF solution for the considered profiles is given by (see Ceccherini et al., 2015)

$\begin{array}{}\text{(3)}& {\mathbit{x}}_{\mathrm{f}}={\left(\sum _{i=1}^{N}{\mathbf{A}}_{i}^{T}{\mathbf{S}}_{i}^{-1}{\mathbf{A}}_{i}+{\mathbf{S}}_{a}^{-1}\right)}^{-1}\left(\sum _{i=1}^{N}{\mathbf{A}}_{i}^{T}{\mathbf{S}}_{i}^{-1}{\mathbit{\alpha }}_{i}+{\mathbf{S}}_{a}^{-1}{\mathbit{x}}_{a}\right),\end{array}$

where

$\begin{array}{}\text{(4)}& {\mathbit{\alpha }}_{i}\equiv {\stackrel{\mathrm{^}}{\mathbit{x}}}_{i}-\left(\mathbf{I}-{\mathbf{A}}_{i}\right){\mathbit{x}}_{ai}={\mathbf{A}}_{i}\mathbit{x}+{\mathbit{\sigma }}_{i},\end{array}$

${\mathbit{x}}_{ai}$ is the a priori profile used in the ith retrieval and ${\mathbit{x}}_{a}$ and ${\mathbf{S}}_{a}$ are the a priori profile and its CM used to constrain the data fusion.

We note that the vector ${\mathbit{\alpha }}_{i}$, which can be calculated from the available retrieval products, is a measurement of the vector $\mathbit{x}$, made using the rows of the AKM ${\mathbf{A}}_{i}$, and no longer depends on the a priori profile ${\mathbit{x}}_{ai}$. Furthermore, it has the same errors ${\mathbit{\sigma }}_{i}$ as the retrieved profile $\hat{\mathbit{x}}_{i}$; therefore, it is characterized by the CM ${\mathbf{S}}_{i}$.
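The second equality in Eq. (4) can be made explicit with the standard linear characterization of an optimal estimation retrieval (Rodgers, 2000), in which the retrieved profile is the AKM-weighted combination of the true profile and the a priori profile plus the noise error. Assuming that the retrieval is linear around the true state:

$\begin{array}{}& \hat{\mathbit{x}}_{i}={\mathbf{A}}_{i}\mathbit{x}+\left(\mathbf{I}-{\mathbf{A}}_{i}\right){\mathbit{x}}_{ai}+{\mathbit{\sigma }}_{i}\end{array}$

$\begin{array}{}& \Rightarrow \phantom{\rule{1em}{0ex}}{\mathbit{\alpha }}_{i}=\hat{\mathbit{x}}_{i}-\left(\mathbf{I}-{\mathbf{A}}_{i}\right){\mathbit{x}}_{ai}={\mathbf{A}}_{i}\mathbit{x}+{\mathbit{\sigma }}_{i}.\end{array}$

Subtracting the a priori contribution removes the only term that contains ${\mathbit{x}}_{ai}$, which is why the a priori profile no longer appears in the result.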

The fused profile has a CM, obtained by propagating the errors of ${\mathbit{\alpha }}_{i}$ into ${\mathbit{x}}_{\mathrm{f}}$, equal to

$\begin{array}{}\text{(5)}& {\mathbf{S}}_{\mathrm{f}}={\left(\sum _{i=1}^{N}{\mathbf{A}}_{i}^{T}{\mathbf{S}}_{i}^{-1}{\mathbf{A}}_{i}+{\mathbf{S}}_{a}^{-1}\right)}^{-1}\sum _{i=1}^{N}{\mathbf{A}}_{i}^{T}{\mathbf{S}}_{i}^{-1}{\mathbf{A}}_{i}{\left(\sum _{i=1}^{N}{\mathbf{A}}_{i}^{T}{\mathbf{S}}_{i}^{-1}{\mathbf{A}}_{i}+{\mathbf{S}}_{a}^{-1}\right)}^{-1}\end{array}$

and an AKM, obtained by taking the derivative of ${\mathbit{x}}_{\mathrm{f}}$ with respect to the true profile, equal to

$\begin{array}{}\text{(6)}& {\mathbf{A}}_{\mathrm{f}}={\left(\sum _{i=1}^{N}{\mathbf{A}}_{i}^{T}{\mathbf{S}}_{i}^{-1}{\mathbf{A}}_{i}+{\mathbf{S}}_{a}^{-1}\right)}^{-1}\sum _{i=1}^{N}{\mathbf{A}}_{i}^{T}{\mathbf{S}}_{i}^{-1}{\mathbf{A}}_{i}.\end{array}$
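Equations (1)–(6) translate almost directly into NumPy. The sketch below is only an illustration under stated assumptions: the function names, the toy Jacobians and the noise-free synthetic retrievals are hypothetical, and each ${\mathbf{S}}_{i}$ is assumed to be invertible, which holds here because the toy Jacobians have full column rank.

```python
import numpy as np

def retrieval_products(K, S_y, S_a):
    """AKM (Eq. 2) and noise CM (Eq. 1) of an optimal estimation retrieval."""
    M = K.T @ np.linalg.inv(S_y) @ K
    G = np.linalg.inv(M + np.linalg.inv(S_a))
    return G @ M, G @ M @ G

def complete_data_fusion(x_hats, x_ais, AKMs, CMs, x_a, S_a):
    """Fuse N retrievals given on a common grid (Eqs. 3-6)."""
    n = len(x_a)
    S_a_inv = np.linalg.inv(S_a)
    F = np.zeros((n, n))  # sum_i A_i^T S_i^-1 A_i
    g = np.zeros(n)       # sum_i A_i^T S_i^-1 alpha_i
    for x_hat, x_ai, A, S in zip(x_hats, x_ais, AKMs, CMs):
        alpha = x_hat - (np.eye(n) - A) @ x_ai  # Eq. (4)
        At_Sinv = A.T @ np.linalg.inv(S)        # assumes S_i is invertible
        F += At_Sinv @ A
        g += At_Sinv @ alpha
    G = np.linalg.inv(F + S_a_inv)
    x_f = G @ (g + S_a_inv @ x_a)               # Eq. (3)
    S_f = G @ F @ G                             # Eq. (5)
    A_f = G @ F                                 # Eq. (6)
    return x_f, S_f, A_f

# Toy three-level example with two hypothetical instruments.
n = 3
x_true = np.array([1.0, 2.0, 3.0])
x_a = np.array([1.5, 1.5, 2.5])
S_a = np.eye(n)
S_y = 0.1 * np.eye(n)
K1 = np.array([[1.0, 0.5, 0.0], [0.0, 1.0, 0.5], [0.0, 0.0, 1.0]])
K2 = np.array([[0.5, 0.5, 0.5], [1.0, 0.0, -1.0], [0.2, 1.0, 0.2]])
A1, S1 = retrieval_products(K1, S_y, S_a)
A2, S2 = retrieval_products(K2, S_y, S_a)
x_a1 = np.array([1.0, 1.0, 1.0])
x_a2 = np.array([2.0, 2.0, 2.0])
# Noise-free synthetic retrievals, so that the result can be checked exactly.
x_hat1 = A1 @ x_true + (np.eye(n) - A1) @ x_a1
x_hat2 = A2 @ x_true + (np.eye(n) - A2) @ x_a2
x_f, S_f, A_f = complete_data_fusion(
    [x_hat1, x_hat2], [x_a1, x_a2], [A1, A2], [S1, S2], x_a, S_a)
```

In this noise-free setting the fused profile equals `A_f @ x_true + (I - A_f) @ x_a`, independently of the individual a priori profiles `x_a1` and `x_a2`, and the trace of `A_f` exceeds that of either individual AKM.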

The CDF formula (Eq. 3) involves a summation of AKMs made possible by the common grid. When the fusing profiles ${\stackrel{\mathrm{^}}{\mathbit{x}}}_{i}$ are represented on different vertical grids, the available AKMs are also defined on different vertical grids; thus in this case, it is necessary to perform a resampling of the AKMs (Calisesi et al., 2005), which makes their second index equal to that of the common fusion grid. Following Ceccherini et al. (2016), we define such a transformation as follows:

$\begin{array}{}\text{(7)}& {\mathbf{A}}_{i}^{\prime }={\mathbf{A}}_{i}{\mathbf{R}}_{i},\end{array}$

where ${\mathbf{R}}_{i}$ are the generalized inverse matrices of the linear interpolation matrices ${\mathbf{H}}_{i}$, which interpolate the profiles on the fusing grids to the fusion grid. In this case, using Eq. (7), Eq. (3) becomes

$\begin{array}{}\text{(8)}& {\mathbit{x}}_{\mathrm{f}}={\left(\sum _{i=1}^{N}{\mathbf{R}}_{i}^{T}{\mathbf{A}}_{i}^{T}{\mathbf{S}}_{i}^{-1}{\mathbf{A}}_{i}{\mathbf{R}}_{i}+{\mathbf{S}}_{a}^{-1}\right)}^{-1}\left(\sum _{i=1}^{N}{\mathbf{R}}_{i}^{T}{\mathbf{A}}_{i}^{T}{\mathbf{S}}_{i}^{-1}{\mathbit{\alpha }}_{i}+{\mathbf{S}}_{a}^{-1}{\mathbit{x}}_{a}\right).\end{array}$

We notice that, in the case of different vertical grids, only the AKMs must be interpolated; neither the CMs nor the ${\mathbit{\alpha }}_{i}$ vectors need to be interpolated.
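The resampling of Eq. (7) can be illustrated as follows. This is a sketch under assumptions: the helper `interpolation_matrix`, the two grids and the AKM values are hypothetical, and the Moore-Penrose pseudoinverse (`numpy.linalg.pinv`) is used here as one possible choice of generalized inverse.

```python
import numpy as np

def interpolation_matrix(z_from, z_to):
    """Matrix H such that H @ profile(z_from) is its linear interpolation on z_to."""
    # Interpolating each unit vector gives one column of H.
    return np.column_stack(
        [np.interp(z_to, z_from, unit) for unit in np.eye(len(z_from))])

z_i = np.array([0.0, 10.0, 20.0, 30.0])                   # fusing grid (km)
z_f = np.array([0.0, 5.0, 10.0, 15.0, 20.0, 25.0, 30.0])  # fusion grid (km)

H_i = interpolation_matrix(z_i, z_f)  # fusing grid -> fusion grid, shape (7, 4)
R_i = np.linalg.pinv(H_i)             # generalized inverse, shape (4, 7)

A_i = 0.5 * np.eye(4) + 0.05          # illustrative AKM on the fusing grid
A_i_resampled = A_i @ R_i             # Eq. (7): second index now on the fusion grid
```

Only the second index of the AKM changes: `A_i_resampled` has shape (4, 7), so the CM and the ${\mathbit{\alpha }}_{i}$ vector, which both live on the first index, can be used unchanged in Eq. (8). In this example the fusion grid is finer than the fusing grid, so `H_i` has full column rank and `R_i @ H_i` is the identity; this does not hold when the fusion grid is the coarser one.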

## 3.2 Interpolation and coincidence errors

Let us first consider the interpolation error. The vectors ${\mathbit{\alpha }}_{i}$, defined by Eq. (4), are measurements of the true profile, each made with the averaging kernels ${\mathbf{A}}_{i}$. Let us assume that each measurement is defined on a different retrieval grid, identified by the same index that identifies the measurements; thus, Eq. (4) becomes

$\begin{array}{}\text{(9)}& {\mathbit{\alpha }}_{i}={\mathbf{A}}_{i}{\mathbit{x}}_{i}^{\left(i\right)}+{\mathbit{\sigma }}_{i},\end{array}$

where ${\mathbit{x}}_{i}^{\left(i\right)}$ is the true profile related to the ith measurement that, by definition, is sampled with the ith grid, as highlighted by the superscript in parentheses.

Equation (8) shows that in the presence of different vertical grids the CDF method combines measurements with sensitivity to the true profile expressed by ${\mathbf{A}}_{i}{\mathbf{R}}_{i}$. This operation assumes that the measurements are combined on the common fusion grid, i.e. that they are measurements of ${\mathbf{A}}_{i}{\mathbf{R}}_{i}{\mathbit{x}}_{i}^{\left(f\right)}$, with ${\mathbit{x}}_{i}^{\left(f\right)}$ being the true profile related to the ith measurement represented on the fusion grid. If ${\mathbit{\alpha }}_{i}$ (Eq. 9), which is the measurement of ${\mathbf{A}}_{i}{\mathbit{x}}_{i}^{\left(i\right)}$, is used instead, the estimate of the required measurement ${\mathbf{A}}_{i}{\mathbf{R}}_{i}{\mathbit{x}}_{i}^{\left(f\right)}$ is made with an error equal to ${\mathbf{A}}_{i}{\mathbit{x}}_{i}^{\left(i\right)}-{\mathbf{A}}_{i}{\mathbf{R}}_{i}{\mathbit{x}}_{i}^{\left(f\right)}$.

We can explicitly introduce this error in the expression of αi by rearranging Eq. (9) in the following way:

$\begin{array}{}\text{(10)}& {\mathbit{\alpha }}_{i}={\mathbf{A}}_{i}{\mathbf{R}}_{i}{\mathbit{x}}_{i}^{\left(f\right)}+{\mathbf{A}}_{i}\left({\mathbit{x}}_{i}^{\left(i\right)}-{\mathbf{R}}_{i}{\mathbit{x}}_{i}^{\left(f\right)}\right)+{\mathbit{\sigma }}_{i}.\end{array}$

It is useful to introduce the following notations for ${\mathbit{x}}_{i}^{\left(i\right)}$ and ${\mathbit{x}}_{i}^{\left(f\right)}$:

$\begin{array}{}\text{(11)}& {\mathbit{x}}_{i}^{\left(i\right)}={\mathbf{C}}^{\left(i\right)}{\mathbit{x}}_{i},\end{array}$

$\begin{array}{}\text{(12)}& {\mathbit{x}}_{i}^{\left(f\right)}={\mathbf{C}}^{\left(f\right)}{\mathbit{x}}_{i},\end{array}$

where ${\mathbit{x}}_{i}$ is the true profile related to the ith measurement represented on a very fine grid that includes all the levels of the fusion grid (f) and of the N grids (i). ${\mathbf{C}}^{\left(i\right)}$ and ${\mathbf{C}}^{\left(f\right)}$ are the sampling matrices from the fine grid to the grids (i) and to the grid (f), respectively.

Substituting Eqs. (11) and (12) in Eq. (10), one obtains

$\begin{array}{}\text{(13)}& {\mathbit{\alpha }}_{i}={\mathbf{A}}_{i}{\mathbf{R}}_{i}{\mathbf{C}}^{\left(f\right)}{\mathbit{x}}_{i}+{\mathbf{A}}_{i}\left({\mathbf{C}}^{\left(i\right)}-{\mathbf{R}}_{i}{\mathbf{C}}^{\left(f\right)}\right){\mathbit{x}}_{i}+{\mathbit{\sigma }}_{i}.\end{array}$

Let us now also consider the coincidence error. In general, measurements made in different space–time locations are only fused when they lie within a given coincidence criterion. These measurements correspond to different true profiles and the purpose of the data fusion can be the determination of either the mean value of these true profiles or the true profile in a given space–time location identified as the central point of the coincidence intervals. We indicate with $\overline{\mathbit{x}}$ the unknown profile estimated by the data fusion. If we introduce the quantity ${\mathbit{\sigma }}_{i,\mathrm{coin}}$, which gives the deviation of ${\mathbit{x}}_{i}$ from the unknown profile $\overline{\mathbit{x}}$,

$\begin{array}{}\text{(14)}& {\mathbit{x}}_{i}=\stackrel{\mathrm{‾}}{\mathbit{x}}+{\mathbit{\sigma }}_{i,\mathrm{coin}},\end{array}$

then Eq. (13) becomes

$\begin{array}{lrl}\text{(15)}& {\mathbit{\alpha }}_{i}& ={\mathbf{A}}_{i}{\mathbf{R}}_{i}{\mathbf{C}}^{\left(f\right)}\overline{\mathbit{x}}+{\mathbf{A}}_{i}\left({\mathbf{C}}^{\left(i\right)}-{\mathbf{R}}_{i}{\mathbf{C}}^{\left(f\right)}\right)\overline{\mathbit{x}}+{\mathbf{A}}_{i}{\mathbf{C}}^{\left(i\right)}{\mathbit{\sigma }}_{i,\mathrm{coin}}+{\mathbit{\sigma }}_{i}\\ & & ={\mathbf{A}}_{i}{\mathbf{R}}_{i}{\overline{\mathbit{x}}}^{\left(f\right)}+{\mathbf{A}}_{i}\left({\mathbf{C}}^{\left(i\right)}-{\mathbf{R}}_{i}{\mathbf{C}}^{\left(f\right)}\right)\overline{\mathbit{x}}+{\mathbf{A}}_{i}{\mathbf{C}}^{\left(i\right)}{\mathbit{\sigma }}_{i,\mathrm{coin}}+{\mathbit{\sigma }}_{i}\end{array}$

after using Eq. (12) for $\stackrel{\mathrm{‾}}{\mathbit{x}}$.

An estimate of the quantity ${\mathbf{A}}_{i}\left({\mathbf{C}}^{\left(i\right)}-{\mathbf{R}}_{i}{\mathbf{C}}^{\left(f\right)}\right)\stackrel{\mathrm{‾}}{\mathbit{x}}$ can be obtained writing $\stackrel{\mathrm{‾}}{\mathbit{x}}$ as the a priori profile plus the deviation σa from it:

$\begin{array}{}\text{(16)}& \stackrel{\mathrm{‾}}{\mathbit{x}}={\mathbit{x}}_{a}+{\mathbit{\sigma }}_{a}.\end{array}$

Substituting Eq. (16) in Eq. (15) and rearranging the terms of the equation, we can define a new quantity, ${\stackrel{\mathrm{̃}}{\mathbit{\alpha }}}_{i}$, equal to

$\begin{array}{lrl}\text{(17)}& \tilde{\mathbit{\alpha }}_{i}& \equiv {\mathbit{\alpha }}_{i}-{\mathbf{A}}_{i}\left({\mathbf{C}}^{\left(i\right)}-{\mathbf{R}}_{i}{\mathbf{C}}^{\left(f\right)}\right){\mathbit{x}}_{a}\\ & & ={\mathbf{A}}_{i}{\mathbf{R}}_{i}{\overline{\mathbit{x}}}^{\left(f\right)}+{\mathbf{A}}_{i}\left({\mathbf{C}}^{\left(i\right)}-{\mathbf{R}}_{i}{\mathbf{C}}^{\left(f\right)}\right){\mathbit{\sigma }}_{a}+{\mathbf{A}}_{i}{\mathbf{C}}^{\left(i\right)}{\mathbit{\sigma }}_{i,\mathrm{coin}}+{\mathbit{\sigma }}_{i}.\end{array}$

Each $\tilde{\mathbit{\alpha }}_{i}$ is a measurement of ${\overline{\mathbit{x}}}^{\left(f\right)}$ made using the rows of the matrix ${\mathbf{A}}_{i}{\mathbf{R}}_{i}$, with a total error given by the sum of the noise error ${\mathbit{\sigma }}_{i}$ and the terms ${\mathbf{A}}_{i}\left({\mathbf{C}}^{\left(i\right)}-{\mathbf{R}}_{i}{\mathbf{C}}^{\left(f\right)}\right){\mathbit{\sigma }}_{a}$ and ${\mathbf{A}}_{i}{\mathbf{C}}^{\left(i\right)}{\mathbit{\sigma }}_{i,\mathrm{coin}}$, which can be interpreted as the interpolation error and the coincidence error, respectively.

For the estimate of the interpolation error, we use the a priori CM ${\mathbf{S}}_{a}$ of ${\mathbit{\sigma }}_{a}$ and, therefore, the interpolation error is characterized by the CM

$\begin{array}{}\text{(18)}& {\mathbf{S}}_{i,\mathrm{int}}={\mathbf{A}}_{i}\left({\mathbf{C}}^{\left(i\right)}-{\mathbf{R}}_{i}{\mathbf{C}}^{\left(f\right)}\right){\mathbf{S}}_{a}{\left({\mathbf{C}}^{\left(i\right)}-{\mathbf{R}}_{i}{\mathbf{C}}^{\left(f\right)}\right)}^{T}{\mathbf{A}}_{i}^{T}.\end{array}$

To characterize the coincidence error, we introduce the CM ${\mathbf{S}}_{\mathrm{coin}}$ of ${\mathbit{\sigma }}_{i,\mathrm{coin}}$. If $\overline{\mathbit{x}}$ represents the mean value of the true profiles, ${\mathbf{S}}_{\mathrm{coin}}$ accounts for the dispersion of the true profiles; thus, it depends on the coincidence criteria and is the same for all the measurements to be fused together. If $\overline{\mathbit{x}}$ represents the true profile in a specific space–time location, ${\mathbf{S}}_{\mathrm{coin}}$ is zero if the measurement is exactly in that location and increases with the distance from that location. The values of ${\mathbf{S}}_{\mathrm{coin}}$ as a function of space–time location should reflect the variability of the true profile with the location. Then, the coincidence error is characterized by the CM

$\begin{array}{}\text{(19)}& {\mathbf{S}}_{i,\mathrm{coin}}={\mathbf{A}}_{i}{\mathbf{C}}^{\left(i\right)}{\mathbf{S}}_{\mathrm{coin}}{\mathbf{C}}^{\left(i\right)T}{\mathbf{A}}_{i}^{T}.\end{array}$

In conclusion, the CDF formula, given by Eq. (3), can be modified to account for the interpolation and coincidence errors by replacing ${\mathbit{\alpha }}_{i}$ with

$\begin{array}{}\text{(20)}& \tilde{\mathbit{\alpha }}_{i}={\mathbit{\alpha }}_{i}-{\mathbf{A}}_{i}\left({\mathbf{C}}^{\left(i\right)}-{\mathbf{R}}_{i}{\mathbf{C}}^{\left(f\right)}\right){\mathbit{x}}_{a}\end{array}$

and ${\mathbf{S}}_{i}$ with

$\begin{array}{}\text{(21)}& \tilde{\mathbf{S}}_{i}={\mathbf{S}}_{i}+{\mathbf{S}}_{i,\mathrm{int}}+{\mathbf{S}}_{i,\mathrm{coin}}.\end{array}$

The CM given by Eq. (21) is also used in place of ${\mathbf{S}}_{i}$ in Eqs. (5) and (6) for the calculation of the CM and AKM of the fused profile.
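The quantities of Eqs. (18)–(21) can be computed as in the following sketch. It is not the authors' implementation: the helper names, the three grids and all numerical values are hypothetical. It also assumes that the a priori profile and the CMs ${\mathbf{S}}_{a}$ and ${\mathbf{S}}_{\mathrm{coin}}$ entering Eqs. (18)–(20) are given on the fine grid, and that an exponential correlation with a 6 km length is used.

```python
import numpy as np

def sampling_matrix(z_fine, z_sub):
    """Rows of the identity that select the levels z_sub from z_fine."""
    idx = [int(np.flatnonzero(np.isclose(z_fine, z))[0]) for z in z_sub]
    return np.eye(len(z_fine))[idx]

def interpolation_matrix(z_from, z_to):
    """Linear interpolation matrix from the grid z_from to the grid z_to."""
    return np.column_stack(
        [np.interp(z_to, z_from, unit) for unit in np.eye(len(z_from))])

def generalized_terms(alpha, A, S, R, C_i, C_f, x_a_fine, S_a_fine, S_coin_fine):
    """Return alpha-tilde (Eq. 20), S-tilde (Eq. 21), S_int (Eq. 18), S_coin (Eq. 19)."""
    D = A @ (C_i - R @ C_f)         # operator of the interpolation error
    alpha_t = alpha - D @ x_a_fine  # Eq. (20)
    S_int = D @ S_a_fine @ D.T      # Eq. (18)
    B = A @ C_i
    S_coin = B @ S_coin_fine @ B.T  # Eq. (19)
    return alpha_t, S + S_int + S_coin, S_int, S_coin

# Hypothetical grids (km): the fine grid contains every level of the other two.
z_fine = np.arange(0.0, 31.0, 5.0)
z_f = np.array([0.0, 10.0, 20.0, 30.0])  # fusion grid
z_i = np.array([0.0, 15.0, 30.0])        # grid of the i-th retrieval
C_f = sampling_matrix(z_fine, z_f)
C_i = sampling_matrix(z_fine, z_i)
R_i = np.linalg.pinv(interpolation_matrix(z_i, z_f))

# Illustrative a priori, covariances and retrieval products.
x_a_fine = 1.0 + (z_fine / 10.0) ** 2
dz = np.abs(z_fine[:, None] - z_fine[None, :])
S_a_fine = np.outer(0.20 * x_a_fine, 0.20 * x_a_fine) * np.exp(-dz / 6.0)
S_coin_fine = np.outer(0.05 * x_a_fine, 0.05 * x_a_fine) * np.exp(-dz / 6.0)
A_i = 0.6 * np.eye(3) + 0.1
S_i = 0.05 * np.eye(3)
alpha_i = A_i @ (C_i @ (1.1 * x_a_fine))  # placeholder noise-free measurement

alpha_t, S_tilde, S_int, S_coin = generalized_terms(
    alpha_i, A_i, S_i, R_i, C_i, C_f, x_a_fine, S_a_fine, S_coin_fine)
```

When the retrieval grid coincides with the fusion grid, `R` is the identity and `C_i` equals `C_f`, so the operator `D` vanishes: the interpolation error is zero and `alpha_t` equals `alpha`. The resulting `alpha_t` and `S_tilde` are then used in Eq. (8) in place of ${\mathbit{\alpha }}_{i}$ and ${\mathbf{S}}_{i}$.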

4 Tests with the upgraded algorithm: results and discussion

## 4.1 The effect on fused profiles

The test cases of fusion 2 and 3 shown in Sect. 2 are here repeated with the modified method described in Sect. 3.2.

In Figs. 4 and 5, we report the noise errors, the interpolation errors and the coincidence errors related, respectively, to case 2 and case 3, for both TIR and UV measurements. These errors are calculated as the square root of the diagonal elements of ${\mathbf{S}}_{i}$, ${\mathbf{S}}_{i,\mathrm{int}}$ and ${\mathbf{S}}_{i,\mathrm{coin}}$, respectively. In case 2, the vertical grids are different for the two measurements, and since the fusion grid coincides with the vertical grid of the UV measurement, the interpolation errors are different from zero for the TIR measurement and equal to zero for the UV measurement. The coincidence errors are equal to zero in both TIR and UV measurements because the true profiles are the same. In case 3, the interpolation errors are equal to zero for both TIR and UV measurements because the fusion grid coincides with that of the fusing profiles. The coincidence errors are instead different from zero because the true profiles are different and their CMs, chosen equal for both TIR and UV measurements, are obtained considering an error of 5 % of the a priori profile (consistent with the difference between the true profiles) and a correlation length of 6 km.

Figures 6 and 7 show the fused profiles and the residuals obtained with the modified algorithm compared with the same quantities reported in panels (b) and (c) of Figs. 2 and 3, respectively. In both tests, the modified method provides residuals that are significantly smaller than those obtained with the original CDF method.

Figure 4. Noise errors (red lines), interpolation errors (green lines) and coincidence errors (blue lines) in case 2 for TIR and UV measurements.

Figure 5. As Fig. 4 but for case 3.

These tests show that the upgrade of the CDF method proposed in Sect. 3.2 solves the problems observed in Sect. 2 that occur when either the fusing profiles are retrieved on different vertical grids or they refer to different true profiles. The modified method is a generalization of the CDF that allows its application to a wide range of cases.

Figure 6. The fused profile and the residual error obtained with the modified algorithm (magenta lines) compared with the same quantities of Fig. 2b and c.

Figure 7. The fused profile and the residual error obtained with the modified algorithm (magenta lines) compared with the same quantities of Fig. 3b and c.

## 4.2 The effect on errors and number of DOF

We now look at the effect of the generalized method on the errors and on the number of DOF. Figures 8 and 9 show the errors of the fused profile when we use either the original or the modified method for cases 2 and 3, respectively. These errors are calculated as the square root of the diagonal elements of ${\mathbf{S}}_{\mathrm{f}}$ given in Eq. (5), where, in the modified method, ${\mathbf{S}}_{i}$ is replaced by $\tilde{\mathbf{S}}_{i}$. For the three cases described in Sect. 2, Table 1 gives the number of DOF of the profiles obtained from the individual TIR and UV measurements, and from the CDF method using both the original and the generalized formulation. The numbers of DOF are calculated as the trace of the AKMs. For the fused products the AKM is ${\mathbf{A}}_{\mathrm{f}}$ given by Eq. (6), where, in the generalized formulation, ${\mathbf{S}}_{i}$ is replaced by $\tilde{\mathbf{S}}_{i}$.
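The two diagnostics used in this section, the number of DOF (trace of the AKM) and the profile errors (square root of the diagonal of the CM), can be computed as in the sketch below. The diagonal Jacobians and the extra covariance `S_extra`, which stands in for the sum of the interpolation and coincidence CMs, are purely illustrative assumptions and are unrelated to the values in Table 1.

```python
import numpy as np

def retrieval_products(K, S_y, S_a):
    """AKM (Eq. 2) and noise CM (Eq. 1) of an optimal estimation retrieval."""
    M = K.T @ np.linalg.inv(S_y) @ K
    G = np.linalg.inv(M + np.linalg.inv(S_a))
    return G @ M, G @ M @ G

def fused_akm_and_cm(AKMs, CMs, S_a):
    """AKM (Eq. 6) and CM (Eq. 5) of the fused profile."""
    F = sum(A.T @ np.linalg.inv(S) @ A for A, S in zip(AKMs, CMs))
    G = np.linalg.inv(F + np.linalg.inv(S_a))
    return G @ F, G @ F @ G

def n_dof(A):
    """Number of degrees of freedom: trace of the AKM."""
    return float(np.trace(A))

def profile_errors(S):
    """Errors of the profile: square root of the diagonal of the CM."""
    return np.sqrt(np.diag(S))

S_a = np.eye(3)
S_y = np.eye(3)
A1, S1 = retrieval_products(np.diag([3.0, 1.0, 0.3]), S_y, S_a)
A2, S2 = retrieval_products(np.diag([0.3, 1.0, 3.0]), S_y, S_a)
S_extra = 0.05 * np.eye(3)  # stand-in for S_int + S_coin (illustrative)

# Original method: noise CMs only. Modified method: inflated CMs.
A_orig, S_orig = fused_akm_and_cm([A1, A2], [S1, S2], S_a)
A_mod, S_mod = fused_akm_and_cm([A1, A2], [S1 + S_extra, S2 + S_extra], S_a)

dof_orig, dof_mod = n_dof(A_orig), n_dof(A_mod)
err_orig, err_mod = profile_errors(S_orig), profile_errors(S_mod)
```

In this toy setting the additional error terms lower the number of DOF of the fused profile, while it stays above that of each individual retrieval.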

The introduction of the interpolation error (case 2) does not significantly modify the errors and causes a decrease of about 1 in the number of DOF of the fused profile. The introduction of the coincidence error (case 3) causes a significant increase in the errors and a small decrease of about 0.5 in the number of DOF of the fused profile. However, in both cases the number of DOF of the fused profile obtained with the modified method is larger than the number of DOF of the individual fusing profiles, which demonstrates the information gain provided by the fusion.

Figure 8. Errors of the fused profile when we use the original (black line) and the generalized (magenta line) CDF for case 2.

Figure 9. Errors of the fused profile when we use the original (black line) and the generalized (magenta line) CDF for case 3.

From the analysis of errors and number of DOF we deduce that the interpolation error has the largest impact on the vertical resolution, while the coincidence error has the largest impact on the errors. However, these numerical results depend on the values that interpolation and coincidence errors have in the single cases.

Table 1. Number of DOF of the profiles obtained with the TIR measurement, the UV measurement, the original fusion method and the modified fusion method for each of the three cases described in Sect. 2.

5 Other error sources

In this paper, we considered simulated measurements, which generally do not include all the error components that are present in real measurements. When real measurements are considered, there are other important error sources that can cause inconsistency among the fusing profiles, such as forward model errors, due, for example, to approximations in the model and uncertainties in atmospheric and instrumental parameters. When performing data fusion, these errors can also lead to quality loss and show problems similar to those described in Sect. 2. These problems can be avoided by accounting for them in the CDF formulation. In particular, Eq. (21) can be modified to account for an extra CM term, ${\mathbf{S}}_{i,\mathrm{other}}$, as follows:

$\begin{array}{}\text{(22)}& \tilde{\mathbf{S}}_{i}={\mathbf{S}}_{i}+{\mathbf{S}}_{i,\mathrm{other}}+{\mathbf{S}}_{i,\mathrm{int}}+{\mathbf{S}}_{i,\mathrm{coin}}.\end{array}$

6 Conclusions

We analysed the problem posed by the application of the CDF method to vertical profiles obtained with different instruments, which use different retrieval grids and observe different true profiles. To this purpose, we studied simulated ozone profile measurements expected from the MTG payload for the S4 mission of the Copernicus programme: namely, those provided by the IRS in the thermal infrared and by the UVN spectrometer in the ultraviolet. The study showed that the CDF algorithm works well when the fusing profiles are represented on the same vertical grid and refer to the same true profile; otherwise the algorithm provides unsatisfactory results because the fused profile differs from the mean of the true profiles significantly more than the fusing profiles. In the latter case, the CDF method, which uses all the existing information for the determination of the best fused profile, is exploiting the differences due to the inconsistency of the measurements as useful information and provides unrealistic fused profiles.

In order to overcome this new problem, we performed a theoretical analysis that led to a generalization of the CDF method to the cases in which interpolation and coincidence errors occur. The interpolation error is present when the vertical grids of the fusing profiles differ from the fusion grid, meaning that an interpolation of the AKMs is necessary. In this case, the interpolated AKMs are only an approximation of the real AKMs on the fusion grid. The coincidence error is a consequence of the fact that the fusing profiles are not generally co-located in space and time, thus referring to different true profiles.

The generalized algorithm allows for these inconsistencies and provides fused profiles that are in better agreement with the true profiles than those obtained with the original CDF algorithm.

With the new algorithm, the fusion generally provides fused profiles that are also better than the fusing profiles in terms of total error and number of DOF. However, the error budget now considered is more comprehensive, and it may even cause the fused profile to have larger errors than the fusing profiles, because coincidence and interpolation errors do not have to be considered for the individual fusing profiles. If neither of the two qualifiers (total error and number of DOF) is improved, the fusion process is not justified.

An approach similar to that used to account for interpolation and coincidence errors can also be useful to include other error components, such as forward model errors, in the fusion process.

Data availability

The data of the simulations presented in the paper are available upon request to the authors.

Author contributions

SC deduced the expression of the interpolation and coincidence errors and wrote the draft version of the paper. BC suggested the idea to introduce the interpolation and coincidence errors and contributed to the interpretation of the results. NZ wrote the Python code of the complete data fusion. CT and SDB performed the simulation of the infrared measurements. JK performed the simulation of the ultraviolet measurements. UC put together the team of authors and coordinated its activity. RD performed a detailed revision of the manuscript.

Competing interests

The authors declare that they have no conflict of interest.

Acknowledgements

The results presented in this paper arise from research activities conducted in the framework of the AURORA project (http://www.aurora-copernicus.eu/) supported by the Horizon 2020 research and innovation programme of the European Union (call: H2020-EO-2015; topic: EO-2-2015) under grant agreement no. 687428.

Edited by: Brian Kahn
Reviewed by: two anonymous referees

References

Aires, F., Aznay, O., Prigent, C., Paul, M., and Bernardo, F.: Synergistic multi-wavelength remote sensing versus a posteriori combination of retrieved products: Application for the retrieval of atmospheric profiles using MetOp-A, J. Geophys. Res., 117, D18304, https://doi.org/10.1029/2011JD017188, 2012.

Calisesi, Y., Soebijanta, V. T., and Oss, R. V.: Regridding of remote soundings: formulation and application to ozone profile comparison, J. Geophys. Res., 110, D23306, https://doi.org/10.1029/2005JD006122, 2005.

Ceccherini, S.: Equivalence of measurement space solution data fusion and complete fusion, J. Quant. Spectrosc. Ra., 182, 71–74, 2016.

Ceccherini, S. and Ridolfi, M.: Technical Note: Variance–covariance matrix and averaging kernels for the Levenberg–Marquardt solution of the retrieval of atmospheric vertical profiles, Atmos. Chem. Phys., 10, 3131–3139, https://doi.org/10.5194/acp-10-3131-2010, 2010.

Ceccherini, S., Carli, B., Pascale, E., Prosperi, M., Raspollini, P., and Dinelli, B. M.: Comparison of measurements made with two different instruments of the same atmospheric vertical profile, Appl. Optics, 42, 6465–6473, 2003.

Ceccherini, S., Raspollini, P., and Carli, B.: Optimal use of the information provided by indirect measurements of atmospheric vertical profiles, Opt. Express, 17, 4944–4958, 2009.

Ceccherini, S., Carli, B., Cortesi, U., Del Bianco, S., and Raspollini, P.: Retrieval of the vertical column of an atmospheric constituent from data fusion of remote sensing measurements, J. Quant. Spectrosc. Ra., 111, 507–514, 2010a.

Ceccherini, S., Cortesi, U., Del Bianco, S., Raspollini, P., and Carli, B.: IASI-METOP and MIPAS-ENVISAT data fusion, Atmos. Chem. Phys., 10, 4689–4698, https://doi.org/10.5194/acp-10-4689-2010, 2010b.

Ceccherini, S., Carli, B., and Raspollini, P.: The average of atmospheric vertical profiles, Opt. Express, 22, 24808–24816, 2014.

Ceccherini, S., Carli, B., and Raspollini, P.: Equivalence of data fusion and simultaneous retrieval, Opt. Express, 23, 8476–8488, 2015.

Ceccherini, S., Carli, B., and Raspollini, P.: Vertical grid of retrieved atmospheric profiles, J. Quant. Spectrosc. Ra., 174, 7–13, 2016.

Cortesi, U., Del Bianco, S., Ceccherini, S., Gai, M., Dinelli, B. M., Castelli, E., Oelhaf, H., Woiwode, W., Höpfner, M., and Gerber, D.: Synergy between middle infrared and millimeter-wave limb sounding of atmospheric temperature and minor constituents, Atmos. Meas. Tech., 9, 2267–2289, https://doi.org/10.5194/amt-9-2267-2016, 2016.

Costantino, L., Cuesta, J., Emili, E., Coman, A., Foret, G., Dufour, G., Eremenko, M., Chailleux, Y., Beekmann, M., and Flaud, J.-M.: Potential of multispectral synergism for observing ozone pollution by combining IASI-NG and UVNS measurements from the EPS-SG satellite, Atmos. Meas. Tech., 10, 1281–1298, https://doi.org/10.5194/amt-10-1281-2017, 2017.

Cuesta, J., Eremenko, M., Liu, X., Dufour, G., Cai, Z., Höpfner, M., von Clarmann, T., Sellitto, P., Foret, G., Gaubert, B., Beekmann, M., Orphal, J., Chance, K., Spurr, R., and Flaud, J.-M.: Satellite observation of lowermost tropospheric ozone by multispectral synergism of IASI thermal infrared and GOME-2 ultraviolet measurements over Europe, Atmos. Chem. Phys., 13, 9675–9693, https://doi.org/10.5194/acp-13-9675-2013, 2013.

ESA: Sentinel-4: ESA's Geostationary Atmospheric Mission for Copernicus Operational Services, SP-1334, April 2017, available at: http://esamultimedia.esa.int/multimedia/publications/SP-1334/SP-1334.pdf (last access: 16 February 2018), 2017.

Fu, D., Worden, J. R., Liu, X., Kulawik, S. S., Bowman, K. W., and Natraj, V.: Characterization of ozone profiles derived from Aura TES and OMI radiances, Atmos. Chem. Phys., 13, 3445–3462, https://doi.org/10.5194/acp-13-3445-2013, 2013.

Hache, E., Attié, J.-L., Tourneur, C., Ricaud, P., Coret, L., Lahoz, W. A., El Amraoui, L., Josse, B., Hamer, P., Warner, J., Liu, X., Chance, K., Höpfner, M., Spurr, R., Natraj, V., Kulawik, S., Eldering, A., and Orphal, J.: The added value of a visible channel to a geostationary thermal infrared instrument to monitor ozone for air quality, Atmos. Meas. Tech., 7, 2185–2201, https://doi.org/10.5194/amt-7-2185-2014, 2014.

Kroon, M., de Haan, J. F., Veefkind, J. P., Froidevaux, L., Wang, R., Kivi, R., and Hakkarainen, J. J.: Validation of operational ozone profiles from the Ozone Monitoring Instrument, J. Geophys. Res., 116, D18305, https://doi.org/10.1029/2010JD015100, 2011.

Lahoz, W. A., Peuch, V.-H., Orphal, J., Attié, J.-L., Chance, K., Liu, X., Edwards, D., Elbern, H., Flaud, J.-M., Claeyman, M., and El Amraoui, L.: Monitoring air quality from space: The case for the geostationary platform, B. Am. Meteorol. Soc., 93, 221–233, https://doi.org/10.1175/BAMS-D-11-00045.1, 2012.

Landgraf, J. and Hasekamp, O. P.: Retrieval of tropospheric ozone: The synergistic use of thermal infrared emission and ultraviolet reflectivity measurements from space, J. Geophys. Res., 112, D08310, https://doi.org/10.1029/2006JD008097, 2007.

Liu, X., Bhartia, P. K., Chance, K., Spurr, R. J. D., and Kurosu, T. P.: Ozone profile retrievals from the Ozone Monitoring Instrument, Atmos. Chem. Phys., 10, 2521–2537, https://doi.org/10.5194/acp-10-2521-2010, 2010.

McPeters, R. D. and Labow, G. J.: Climatology 2011: An MLS and sonde derived ozone climatology for satellite retrieval algorithms, J. Geophys. Res., 117, D10303, https://doi.org/10.1029/2011JD017006, 2012.

Miles, G. M., Siddans, R., Kerridge, B. J., Latter, B. G., and Richards, N. A. D.: Tropospheric ozone and ozone profiles retrieved from GOME-2 and their validation, Atmos. Meas. Tech., 8, 385–398, https://doi.org/10.5194/amt-8-385-2015, 2015.

Natraj, V., Liu, X., Kulawik, S., Chance, K., Chatfield, R., Edwards, D. P., Eldering, A., Francis, G., Kurosu, T., Pickering, K., Spurr, R., and Worden, H.: Multispectral sensitivity studies for the retrieval of tropospheric and lowermost tropospheric ozone from simulated clear sky GEO-CAPE measurements, Atmos. Environ., 45, 7151–7165, https://doi.org/10.1016/j.atmosenv.2011.09.014, 2011.

Rodgers, C. D.: Inverse Methods for Atmospheric Sounding: Theory and Practice, Series on Atmospheric, Oceanic and Planetary Physics, vol. 2, World Scientific, Singapore, 2000.

Timmermans, R. M. A., Lahoz, W. A., Attié, J.-L., Peuch, V.-H., Curier, R. L., Edwards, D. P., Eskes, H. J., and Builtjes, P. J. H.: Observing System Simulation Experiments for air quality, Atmos. Environ., 115, 199–213, https://doi.org/10.1016/j.atmosenv.2015.05.032, 2015.

Worden, J., Liu, X., Bowman, K., Chance, K., Beer, R., Eldering, A., Gunson, M., and Worden, H.: Improved tropospheric ozone profile retrievals using OMI and TES radiances, Geophys. Res. Lett., 34, L01809, https://doi.org/10.1029/2006GL027806, 2007.