Atmospheric composition and thermodynamic retrievals from the ARIES airborne TIR-FTS system – Part 2 : Validation and results from aircraft campaigns

This study validates trace gas and thermodynamic retrievals from nadir infrared spectroscopic measurements recorded by the UK Met Office Airborne Research Interferometer Evaluation System (ARIES) – a thermal infrared, Fourier transform spectrometer (TIR-FTS) on the UK Facility for Airborne Atmospheric Measurements (FAAM) BAe146 aircraft. Trace-gas-concentration and thermodynamic profiles have been retrieved and validated for this study throughout the troposphere and planetary boundary layer (PBL) over a range of environmental variability using data from aircraft campaigns over and around London, the US Gulf Coast, and the Arctic Circle during the Clear air for London (ClearfLo), Joint Airborne IASI (Infrared Atmospheric Sounding Interferometer) Validation Experiment (JAIVEx), and Measurements, process studies, and Modelling (MAMM) aircraft campaigns, respectively. Vertically resolved retrievals of temperature and water vapour (H2O), and partial-column retrievals of methane (CH4), carbon monoxide (CO), and ozone (O3) (over both land and sea) were compared to corresponding measurements from high-precision in situ analysers and dropsondes operated on the FAAM aircraft. Average degrees of freedom for signal (DOFS) over a 0–9 km column range were found to be 4.97, 3.11, 0.91, 1.10, and 1.62 for temperature, H2O, CH4, CO, and O3, respectively, when retrieved on 10 vertical levels. Partial-column mean biases (and bias standard error) between the surface and ∼ 9 km, when averaged across all flight campaigns, were found to be −0.7(±0.3) K, −479(±56) ppm, −11(±2) ppb, −3.3(±1.0) ppb, and +3.5(±1.0) ppb, respectively, whilst the typical a posteriori (total) uncertainties for individually retrieved profiles were 0.4, 9.5, 5.0, 21.2, and 15.0 %, respectively. Averaging kernels (AKs) derived for progressively lower altitudes show improving sensitivity to lower atmospheric layers when flying at lower altitudes. 
Temperature and H2O display significant vertically resolved sensitivity throughout the column, whilst trace gases are usefully retrieved only as partial-column quantities, with maximal sensitivity for trace gases other than H2O within a layer between 1 and 2 km below the aircraft. This study demonstrates the valuable atmospheric composition information content that can be obtained by ARIES nadir TIR remote sensing for atmospheric process studies.

compositional and thermodynamic data for monitoring and modelling applications, and how such data sets can complement satellite retrievals (typically at lower spatial resolution) and high accuracy (but point-specific) in situ measurements to aid regional process studies. In summary, airborne remote sensing can help to bridge the gap between spatial extremes locally and regionally through its ability to observe wide (and selectable) fields of view and to perform targeted sampling, for example, through manoeuvring in the vertical. Illingworth et al. (2014) described and characterised the Manchester Airborne Retrieval Scheme (MARS), a configurable system tailored for the optimally estimated retrieval of atmospheric composition from infrared spectra recorded by the ARIES open-path-FTS (Fourier transform spectrometer) instrument (described in detail by Wilson et al., 1999), flown on the UK Facility for Airborne Atmospheric Measurements (FAAM) BAe-146 aircraft. The ARIES is an analogue of the Infrared Atmospheric Sounding Interferometer (IASI) flown on the MetOp-A and B satellites, both having an apodized spectral resolution of ∼ 0.5 cm−1 between 4 and 16 µm. No further description of the ARIES and retrieval formalism will be given here, and readers are referred to Illingworth et al. (2014) and references therein for details.
We focus here on the validation of operationally retrieved profiles of temperature, water vapour (H2O), methane (CH4), carbon monoxide (CO), and ozone (O3), which will be referred to collectively as the retrieval products hereafter. In this paper, validation refers to the statistical and profile-by-profile comparison of retrieved data with their in situ counterparts, both directly and after convolution with retrieval-specific ARIES averaging kernels (AKs). For the trace gases, partial columns will be compared due to their constrained vertical resolvability (see Sect. 3). We will report the performance of operational retrievals from ARIES spectra across a range of environments, using airborne in situ measurements for the purpose of validation for each location. For context and later comparison, we now briefly discuss example validation studies of the retrieval products of concern to this study for three example infrared remote-sensing instruments on satellite, airborne, and ground-based platforms.
The Total Column Carbon Observing Network (TCCON) is a network of ground-based, sun-viewing, near-IR Fourier transform spectrometers that has been established to measure greenhouse gases as total column dry molar fractions (DMFs). Since its inception in 2004, the TCCON network has grown to include 18 sites globally, and currently produces DMFs of H2O, CO2, CO, CH4, and other trace gases (Wunch et al., 2011). Due to cited systematic biases in the spectroscopy, the absolute accuracy of the column measurements is quoted as ∼ 1 %; however, this can be improved by calibrating them to the World Meteorological Organization (WMO) in situ trace-gas measurement scales, using profiles obtained with in situ instrumentation flown on aircraft over the TCCON sites (Wunch et al., 2010). After this calibration, the precision of the DMFs retrieved from single spectra improves significantly, and is about 0.15 % for CO2, 0.2 % for CH4, and up to 0.5 % for CO (Toon et al., 2009).
The Methane Airborne MAPper (MAMAP) is an airborne spectrometer system, also measuring in the near-IR, designed to make measurements of dry-air partial columns of CO2 and CH4 on small spatial scales with a precision of better than 2 %. MAMAP operates with a ground pixel resolution of approximately 29 m × 33 m for a typical aircraft altitude of 1250 m and a velocity of 200 km h−1. The main uncertainties in the retrieval were noted to arise from potential inaccuracies in the calculation of the solar zenith angle and the surface elevation of the scene. Such uncertainties (important in the visible and near-IR) are not expected in the thermal infrared. It was reported that, by using a CH4 proxy method (in which the retrieved CH4 is used to account for the light path modification by simultaneously retrieving alongside CO2), the total uncertainty estimate was reduced to 0.24 % in a standard individual column retrieval of CO2. Furthermore, given that both MAMAP and TCCON measure in the near-IR (as opposed to the thermal infrared for ARIES) and thus observe (mainly) solar radiation, they have very different vertical sensitivity to ARIES.
The IASI has an instantaneous field of view (IFOV) that is approximately 12 km in diameter at nadir (Blumstein et al., 2004). Depending on the trace gas and the retrieval scheme employed, IASI can provide weakly resolved vertical profiles, with the number of independent pieces of information for each gas depending mostly on the thermal state of the atmosphere (e.g. 1-2 for CO in the troposphere, and 3-5 for O3 up to 0.1 hPa; Hilton et al., 2012). Using an optimal estimation method (OEM) developed by Coheur et al. (2005), Boynard et al. (2009) showed that on average IASI O3 retrievals exhibit a consistent positive bias of about 3 % compared to ground-based measurements. Similarly, Illingworth et al. (2011) showed that on average total tropospheric column CO retrievals from IASI exhibit a positive bias of approximately 3 % when compared to modelled data. Despite small biases in comparison to other data sets, IASI retrieved products also have large associated uncertainties for individually retrieved profiles, where the dominant term is typically caused by the smoothing of the continuous atmosphere by the retrieval schemes, which necessarily assume a discretised atmosphere. Illingworth et al. (2011) noted that typical smoothing uncertainty for IASI total tropospheric columns ranges from 18 to 34 %.
The brief discussion above demonstrates the relative limitations and benefits of remote-sensing measurements within the troposphere from viewpoints below, within, and far above it. Each has specific weighting in terms of sensitivity to different layers within the tropospheric column and each has different uncertainties. We highlight here how aircraft remote sensing can help to bridge spatial sampling scales between ground-based and satellite platforms, whilst high-precision in situ data can be simultaneously provided (where equipped) to routinely validate and calibrate retrievals.
G. Allen et al.: Part 2: Validation and results from aircraft campaigns
The remainder of this manuscript is structured as follows: in Sect. 2 we describe the validating measurements used for this study, Sect. 3 describes the validation flight campaigns where ARIES was operated, and Sect. 4 compares operational retrievals with in situ measurements.

Data sources
Measured data discussed in this paper were recorded using instrumentation on board the BAe-146-301 atmospheric research aircraft (ARA). In this section, we describe the aircraft platform and in situ instrumentation used here for validation. Only relevant FAAM in situ instrumentation that records measurements corresponding to the retrieval products is introduced here.

The BAe-146 platform
The BAe-146-301 ARA is operated by Directflight Ltd and managed by FAAM, which is a joint entity of the Natural Environment Research Council (NERC) and the UK Met Office. This four-engine jet plane is modified for research use and capable of up to a 5 h flight duration with a scientific payload of up to 4000 kg. It has an operational ceiling altitude of ∼ 33 000 feet (∼ 10 km). In situ instrumentation described in Sect. 2.1 sampled ambient air inside the converted passenger cabin. This air was fed by purpose-built rearward facing window-mounted inlets (O'Shea et al., 2013). Typical air speed and aircraft pitch angle on science runs were around 115 m s−1 and +4.5°, respectively. The GPS position, aircraft orientation, and velocity were all sampled at 50 Hz, and recorded at 32 Hz by an Applanix POS AV 510 GPS-aided Inertial Navigation (GIN) unit.

Trace gas and thermodynamic measurements
Thermodynamic and trace-gas instruments on the BAe-146 used for this study are listed in Table 1. A five-hole turbulence probe mounted on the aircraft nose was used in conjunction with the GIN system to provide 3-D wind fields and high-frequency (32 Hz) turbulence measurements. Thermodynamic instruments include a General Eastern GE 1011B Chilled Mirror Hygrometer, which measures dew-point temperature, and a Rosemount/Goodrich type-102 True Air Temperature sensor, which recorded data at 32 Hz using a non-de-iced Rosemount 102AL platinum resistance immersion thermometer, mounted outside of the boundary layer of the aircraft near the nose. The turbulence probe also used measurements from the GIN and measurements of the ambient air temperature to correct for kinetic effects.
Carbon monoxide was measured at 1 Hz by an AL5002 Fast CO Monitor using a UV fluorescence methodology, as described by Gerbig et al. (1999); the instrument was regularly calibrated (once every 30 min) in-flight against certified standards. Ozone was recorded at 1 Hz by a TECO 49C UV photometer, and the transmission time from the exterior to the instrument via the sampling line can be assumed to be negligible (less than the 1 s integration time for these in situ sensors). These instruments are core to the aircraft fit, and are used regularly in a variety of FAAM campaigns. Therefore, the accuracy of the reported O3 and CO concentrations has been regularly assessed by intercomparisons with ground-based instruments and equivalent instrumentation on other aircraft. In those comparisons, both CO and O3 have been found to be consistently accurate to within 5 ppb across a range of typical atmospheric concentrations (e.g. as compared with instrumentation on the NSF C-130 aircraft reported in Allen et al., 2011). This compares favourably with the reported instrument precision of 1 % above the instrument limits of detection, which are ∼ 20 and 5 ppb for CO and O3, respectively.
The CH4 observations on board the FAAM BAe-146 were made using a cavity-enhanced absorption spectrometer. This system is based on a commercially available analyser (Fast Greenhouse Gas Analyser, Model RMT-200) from Los Gatos Research Inc., USA, which has been modified for airborne operation (O'Shea et al., 2013). Calibration curves are determined in-flight using three WMO traceable standards, with accuracy/bias estimated at no more than 1.28 ppb for CH4 (with 1σ precision of 2.48 ppb at 1 Hz). Measurements are reported as dry-air mole fractions.
In addition to the in situ instrumentation, for some of the flights in this study Vaisala RD93 dropsondes were released from the aircraft, from high altitude and when over the sea. The RD93 is a general-purpose dropsonde for high-altitude deployment from a variety of aircraft. Slowed in its descent through the atmosphere by a special parachute, the RD93 measures the atmospheric profiles of pressure, temperature, relative humidity, and wind from the point of launch to the ground. The RD93 transmits meteorological data via a 400 MHz meteorological band telemetry link to the receiving system on board the aircraft, with an on-board GPS receiver tracking the dropsonde horizontal movement as it is borne by the wind. The manufacturer-specified accuracies of the RD93 are 0.2 K, 0.4 hPa, and 2 % for temperature, pressure, and relative humidity, respectively.

Cloud and aerosol lidar
A mini-lidar cloud system on the FAAM aircraft has also been used here to test for successful cloud screening of the ARIES data (see Sect. 4). The mini-lidar is a Leosphere (Model ALS450) elastic backscattering system with daytime capability, suitable for aerosol and cloud observations, and features a depolarisation channel. Its operational wavelength is 355 nm and it is mounted in a nadir-viewing geometry. For more details about the mini-lidar instrument, see Marenco et al. (2011).

A priori data sets
A full description of the choice and source of a priori data used to initialise MARS can be found in Part 1. In summary, temperature and water vapour profile priors were extracted from co-located European Centre for Medium-range Weather Forecasts (ECMWF) operational analysis fields produced by the ECMWF Integrated Forecasting System (IFS Cycle 29r2) on a 2.5° × 2.5° geospatial grid on 91 hybrid model levels. Trace-gas profile priors were extracted from the MACC-II (Monitoring Atmospheric Composition and Climate) reanalysis data set. For further details on surface properties and other auxiliary data sets (temperature and emissivity), please refer to Part 1.

FAAM campaigns used for validation
For validation purposes, we have chosen to use well-characterised data sets from several FAAM aircraft campaigns, conducted in diverse locations, to capture the typical natural variability of composition and thermodynamic backgrounds across the range of environments in which the FAAM aircraft typically samples. The campaigns chosen for this study were the Joint Airborne IASI Validation Experiment (JAIVEx), the Clear air for London (ClearfLo) study, and the Methane and other greenhouse gases in the Arctic – Measurements, process studies, and Modelling (MAMM) project. These campaigns were based around the US Gulf Coast, London, and the Arctic Circle, respectively, and are described in more detail below.
Sections 3.1 to 3.3 describe the flights and campaigns in further detail with a focus on the validation manoeuvres and sampling principles relevant in comparison to ARIES-retrieved data. More generally, it is important to note that there will always be a spatio-temporal mismatch between in situ measurements and remote sensing from aircraft (as with any static or moving platform). As with any remote-sensing validation, it is never possible to co-locate in situ measurement and retrieval exactly, by the nature of the sampling, and this introduces a potentially unquantifiable uncertainty that must be minimised and rationalised. Moreover, there is no single spatial or temporal mismatch that can be calculated, as the aircraft is always moving and the in situ measurements (at various heights) all map to the retrieved profile differently at any single point in time through the column. The important point here is that it is necessary to be able to assume that the spatial scales of transport and reactive chemistry (relative to the tracers in question here) are sufficiently small (or slow) such that the atmospheric composition does not change significantly in the time between in situ sampling and nadir retrieval. This mismatch will be quantified for each campaign in turn in the following sections.
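As an illustration of how such a worst-case mismatch might be estimated, the following minimal sketch pairs each in situ sample with the retrieval nearest to it in time and reports the largest time and great-circle separations. The function names, the (time, latitude, longitude) tuple layout, the spherical-Earth radius, and the choice of "nearest in time" are all assumptions made for this example; they are not taken from the MARS processing chain.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    r_earth_km = 6371.0  # assumed mean radius of a spherical Earth
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2.0) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2.0) ** 2)
    return 2.0 * r_earth_km * math.asin(math.sqrt(a))

def max_mismatch(in_situ, retrievals):
    """Worst-case (minutes, km) offset between in situ samples and retrievals.

    Both inputs are lists of (time_s, lat_deg, lon_deg) tuples (hypothetical
    layout). Each in situ sample is paired with the retrieval closest in time.
    """
    worst_dt_min, worst_dx_km = 0.0, 0.0
    for t, lat, lon in in_situ:
        t_r, lat_r, lon_r = min(retrievals, key=lambda r: abs(r[0] - t))
        worst_dt_min = max(worst_dt_min, abs(t_r - t) / 60.0)
        worst_dx_km = max(worst_dx_km, haversine_km(lat, lon, lat_r, lon_r))
    return worst_dt_min, worst_dx_km
```

Pairing by time rather than by distance is a design choice for the sketch; pairing by distance would give a different (and equally defensible) pair of maxima.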

JAIVEx
The JAIVEx campaign was a calibration-validation campaign which used ARIES radiance data to radiometrically validate the IASI instrument. It was conducted over the Gulf of Mexico and operated out of Houston, USA, during April-May 2007. See Larar et al. (2010) for an overview of the JAIVEx mission, and see Newman et al. (2012) for a full discussion of the performance of ARIES during the JAIVEx project. In addition to measuring temperature, water vapour, and trace-gas concentrations (see Sect. 2.2), the FAAM aircraft released dropsondes, which sampled the atmospheric thermodynamic structure below the aircraft at high spatial resolution (∼ 6 m), which will also be used here for validation. We present data collected during flight B290 during JAIVEx, which took place on the morning of 30 April 2007 over the Gulf of Mexico. The B290 flight track and profile are shown in Fig. 1. Take-off time from Houston Airport was 12:45 UTC (07:45 LT) and landing time at New Orleans was 17:20 UTC (12:20 LT). The Gulf of Mexico area and the operational area of the aircraft were mostly cloud free on 30 April 2007, as observed in-flight and from GOES satellite cloud imagery (not shown). This makes this flight an ideal case study for nadir remote-sensing validation, where cloudy scenes would otherwise prevent retrieval by MARS. Indeed, this area at this time of year was chosen for its climatologically low cloud fraction to facilitate this IASI calibration-validation mission.
Two extended periods (between 30 min and 1 h in duration) at cruising altitudes of 7.3 and 9 km were conducted. These are the northwest-southeast and northeast-southwest transects seen in Fig. 1. At these altitudes, the instantaneous ground footprint of ARIES due to the instrument's 44 mrad circular field of view (full angle) has a radius of ∼ 161 m and 198 m, representing an instantaneous footprint area of ∼ 0.08 and 0.12 km2, respectively. The exact footprint of the ARIES retrievals is then a product of both this instantaneous footprint and the ground-track of the aircraft integrated over the ARIES sampling/integration time (5 s in this case). The maximum time and horizontal distance between in situ sampling and the nearest retrieval were ∼ 65 min and ∼ 360 km, respectively. Air mass composition for all parameters other than water vapour and temperature was not observed to change significantly (less than the a priori variance) over these scales in the in situ data examined. The water vapour and temperature changes along the flight track were captured by radio dropsondes, and this variability and how it is captured in the retrieved data set will be discussed in Sect. 4.2.
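The footprint figures quoted above follow from simple nadir geometry, as the short sketch below illustrates. It assumes a flat surface directly beneath the aircraft and, for the along-track smear, a ground speed equal to the typical air speed of 115 m s−1 quoted in Sect. 2 (i.e. no wind); the function names are hypothetical.

```python
import math

def nadir_footprint(altitude_m, fov_full_angle_rad=0.044):
    """Radius (m) and area (km2) of a circular nadir footprint.

    Assumes a flat surface and a circular field of view whose full angle
    is fov_full_angle_rad (44 mrad for ARIES, as stated in the text).
    """
    radius_m = altitude_m * math.tan(fov_full_angle_rad / 2.0)
    area_km2 = math.pi * (radius_m / 1000.0) ** 2
    return radius_m, area_km2

def along_track_smear_m(ground_speed_m_s, integration_s=5.0):
    """Distance travelled over the ground during one co-added spectrum."""
    return ground_speed_m_s * integration_s

for altitude_m in (7300.0, 9000.0):
    radius_m, area_km2 = nadir_footprint(altitude_m)
    print(f"{altitude_m / 1000:.1f} km: radius {radius_m:.0f} m, area {area_km2:.2f} km2")

# Assumed ground speed of 115 m/s over the 5 s integration time
print(f"along-track smear: {along_track_smear_m(115.0):.0f} m")
```

Under these assumptions the aircraft travels several hundred metres during one 5 s integration, which is larger than the instantaneous footprint diameter, so the effective footprint is elongated along the flight track.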

ClearfLo
The ClearfLo project was conceived to provide long-term integrated measurements of the meteorology and composition of London's urban atmosphere, recorded at street level and at elevated sites, and complemented by modelling to improve and characterise predictive capability for air quality. A separate but synergistic FAAM airborne project took place during July and August 2012, consisting of five 5 h flights during which the ARIES and in situ trace-gas instrumentation was operated to record measurements in a wide area around and centred on London (see Fig. 2). Repeated sampling was targeted on the downwind London plume and upwind background inflow; a detailed description of the ClearfLo campaign is given by Bohnenstengel et al. (2014).
For validation, we have used data from flights B724 and B725, conducted between 10:00 and 16:30 UTC on 30 July and 9 August 2012, representing relatively clean and polluted cases, respectively, and characterised by well-mixed Atlantic westerly maritime inflow in the former and stagnant air (high pressure) in the latter. This contrast is useful for validation to characterise the ability to retrieve information in clean and polluted environments. Flight tracks for these two flights can be seen as the thick (B724) and thin (B725) traces in Fig. 2. In both flights, air upwind of London was seen to be less polluted than air downwind in the in situ measurements (see Sect. 4). The maximum time and horizontal distance between in situ sampling and the nearest retrieval on flight B724 were 20 min and 50 km, respectively, for altitudes below 5 km, and 110 min and 215 km for higher altitudes, owing to the height restrictions imposed on the aircraft over most of the flight track due to air traffic. For B725, this was reduced to ∼ 42 min and ∼ 112 km, due to the higher-altitude patterns over the English Channel. Variability in the air mass (especially for methane and carbon monoxide) was observed over the flight, and this will be discussed in the context of how it is captured in the retrieved data set in Sects. 4.4 and 4.5.

MAMM
The MAMM project aims to improve quantitative knowledge of Arctic CH4 and other gases from various sources, whilst also determining their magnitudes and spatial distributions. The FAAM component of this mission involved three separate flying campaigns within the Arctic Circle: July 2012, August 2013, and September 2013. In this study we have used data from the July 2012 period during two flights: B719 and B720, on 17 July 2012 and 18 July 2012, respectively, conducted between 09:00 UTC and 16:00 UTC. The former was conducted over the wetlands of western Finland and the latter predominantly over the Norwegian Sea between the coasts of Norway and Svalbard (see Fig. 3). These two flights provide contrast between sea and land retrievals in an otherwise similar natural environment, thereby allowing us to examine potential sources of systematic bias associated with surface type. The spiral ascent pattern seen in Fig. 3 (flown during B720 near 27° E, 68° N) was centred on the Sodankylä TCCON site; however, cloudy conditions on this day prevented a direct comparison with TCCON CH4 and CO2 measurements. The in situ measurements recorded during this spiral provide the vertical profiles we have used for retrieval validation with in situ data for this flight, and ARIES spectra were successfully screened for cloud prior to retrieval as described earlier. The maximum time and horizontal distance between in situ sampling and the nearest retrieval in-flight for B720 were 35 min and ∼ 60 km, respectively. For B719, the maximum offsets were ∼ 52 min and 100 km, respectively (because of the reciprocal nature of the flight pattern between the coast of Norway and Svalbard). In both flights, the air mass composition was not observed to change significantly (less than the a priori variance) over these scales in the in situ data examined for each flight for this remote location (relative to both JAIVEx and ClearfLo).

Results and discussion
The results of the validation using the FAAM data set outlined in Sect. 2.3 are now presented and discussed. To illustrate typical examples for individual retrievals, we show retrieval metrics of spectral fit and residual, AKs, and sources of total-and-component a posteriori retrieval uncertainty for profiles chosen from one flight for each of the retrieved parameters where comparable in situ data exist. We then present a statistical interpretation of the whole validation data set across selected flights in terms of mean bias and uncertainty for the entire data set (i.e. across all campaigns). The spectral window and co-retrieved state vectors for each nominal parameter (described further in Part 1 of this study) are given in Table 2. The ARIES spectra were co-added over 5 s of sampling time (10 scans) in all retrievals considered here, and retrievals were all performed on 10 vertical levels unless otherwise stated.

Cloud detection and screening performance
We have tested a cloud-detection scheme based on the brightness temperature difference between an atmospheric window and a non-window spectral region (described further by Illingworth et al., 2014). This method screens ARIES data for cloudy spectra that would otherwise lead to false or poor retrievals. Clouds were detected by lidar using the non-depolarised, range-corrected signal P of the UK Met Office mini-lidar system on the FAAM aircraft (described by Marenco and Hogan, 2011), where P(Rc) < 1.5 × P(R − 200 m). The algorithm works by detecting large gradients in the lidar signal, with peaks below 500 m a.s.l. automatically discarded as surface return; see also Osborne et al. (2014). The lidar cloud detections were compared to co-located detections found using the ARIES cloud filter over a range of flights during the Microwave Emission Validation over sub-Arctic Lake Ice (MEVALI) campaign, which took place in March 2012. In total, cloud masks for over 2500 different scenes over a range of clear-land and open-sea, frozen and unfrozen surface types were compared, and an average Spearman's rank correlation coefficient was calculated to be 0.91, indicating that the cloud filter performs well. In addition, 100 % of the clouds detected by the lidar were also detected by the ARIES scheme, which additionally flagged some false positives. We accept this small loss of some data, as the alternative would be to permit cloudy spectra into the retrieval scheme that would otherwise affect the quality of the retrieved data set.
Figure 4 shows convergence parameters for a single water vapour retrieval example from a flight altitude of 7.4 km during flight B290 from JAIVEx. Figure 4a shows the measured (black) and fitted (green) radiance spectra; the fact that the measured spectrum cannot be readily observed on this figure demonstrates the excellent spectral fit.
Figure 4b shows the residual (difference) spectrum between the fitted and measured spectra and the total instrumental spectral radiance uncertainty (black dashed lines), demonstrating that this residual is comparable with the expected measurement uncertainty (which is represented here as the sum of radiance uncertainties resulting from detector noise, radiometric calibration, and ILS (instrument line shape) measurement as described earlier and in Part 1). Note that the spikes in the black spectrum in Fig. 4b between 1360 cm−1 and 1400 cm−1 represent uncertainty due to spectral artefacts of residual water vapour in the calibration cell used for this flight. The absence of significant residual spectral structure or absorption lines gives confidence that no potentially important absorbing trace-gas species have been excluded from the simulated atmosphere. Figure 4c shows the water vapour averaging kernel (AK) for the partial column below the aircraft. This AK and the associated degrees of freedom for signal (DOFS) value of 3.34 demonstrate that there is significant vertical resolution of the retrieved H2O profile from this high altitude when using 10 vertical levels. There are partially independent peaks in the AK at the uppermost (6 and 7 km) layers of the retrieval and a relatively smoothed free-tropospheric region between the surface and 4.5 km. This is consistent with the DOFS and vertical sensitivity simulated at comparable altitudes for Part 1 of this study (∼ 3.0 DOFS).
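In the optimal estimation framework of Rodgers (2000), the DOFS is the trace of the averaging kernel matrix. The following minimal sketch shows that calculation for a small, purely illustrative 3-level matrix; the numbers are invented for the example and are not ARIES averaging kernels, and the function names are hypothetical.

```python
def dofs(ak):
    """Degrees of freedom for signal: the trace of a square AK matrix."""
    return sum(ak[i][i] for i in range(len(ak)))

def dofs_below(ak, altitudes_km, top_km):
    """Contribution to the DOFS from retrieval levels at or below top_km."""
    return sum(ak[i][i] for i, z in enumerate(altitudes_km) if z <= top_km)

# Illustrative 3-level averaging kernel (rows: retrieved level,
# columns: true-state level); values are made up for this sketch.
example_ak = [
    [0.5, 0.1, 0.0],
    [0.1, 0.4, 0.1],
    [0.0, 0.1, 0.3],
]
example_altitudes_km = [1.0, 4.0, 7.0]

total_dofs = dofs(example_ak)
lower_dofs = dofs_below(example_ak, example_altitudes_km, top_km=4.0)
```

A diagonal element close to 1 indicates a level constrained almost entirely by the measurement, whereas a value close to 0 indicates a level that largely reverts to the a priori.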

Water vapour
Retrieval uncertainty components are shown in Fig. 4d. Here (and in all analogous figures for other parameters in the remainder of this section) the forward model parameter error is calculated along with a smoothing, measurement, and systematic error, following the methodology outlined for a linear approach by Rodgers (2000) and described further in Part 1. The smoothing error represents the loss of fine structure in the retrieved state, the measurement error is derived from the total radiance error of the ARIES instrument, and the parameter error is associated with the non-retrieval of parameters other than the target parameter. The systematic error is derived from the level-1b processing of uncalibrated ARIES spectra. The total a posteriori retrieval uncertainty for a single retrieval (orange line in Fig. 4d) in this example ranges between 1500 ppm (∼ 10 %) at the surface and 120 ppm (∼ 22 %) at 7 km. It should be noted that the choice of prior can potentially have a large impact on the calculated DOFS (as discussed in Illingworth et al., 2013). In this study, the calculated DOFS above are representative of the MARS scheme and the method used to select prior information from ECMWF meteorological reanalysis data.
Figure 5. Retrievals and comparison to in situ data for (a) water vapour retrieval concentration profiles for flight B725 colour coded for flight altitude (light blue corresponds to 3 km, orange to 6.1 km). Retrieval uncertainty is shown as the dotted red bars for each profile. In situ dropsonde (black) and a priori profiles (blue) are also shown. (b) Weighted-mean profiles from flight B725 for retrieved (red), in situ measured (black), and in situ average convolved with ARIES averaging kernels (green). The standard deviation on the mean-retrieved profiles and the corresponding in situ 1σ are shown as correspondingly coloured bars at each vertical level.
Figure 5 shows retrievals for the whole of flight B725, compared to dropsonde data over both land and sea surfaces near to the flight track shown in Fig. 2. Figure 5a shows individual retrievals (coloured for flight altitude) and dropsonde data (black). The a posteriori uncertainties for each retrieved profile are shown as coloured dotted horizontal bars. In Fig. 5a, we note that the ECMWF water vapour a priori profile has a positive bias (up to 1000 ppm in places) relative to the dropsonde data and that it does not contain fine structure present in the real atmosphere, for example the dry layer at 4.5 km. In contrast, the retrieved profiles derived from ∼ 6 km altitude (yellow colours) do capture this dry layer due to the good vertical sensitivity and vertical resolution of layers ∼ 2 km below the aircraft. Conversely, fine structure in profiles retrieved from a higher altitude does not appear well resolved for lowermost layers because of poor sensitivity there (note the 3000 ppm negative bias in the yellow coloured profiles between 0 and 2 km). However, when flying at lower altitudes, there is good sensitivity to the near surface; this is reflected in the much smaller bias (less than 500 ppm) seen in the light-blue profiles in Fig. 5a retrieved from ∼ 3 km flight altitude. In all retrievals shown in Fig. 5a, the retrieval represents an improvement on the a priori profile. Figure 5b shows flight-averaged profiles, binned (and averaged) into 10 equidistant altitude layers, for the retrieved (red) and in situ (black) data along with the in situ profile convolved with the ARIES AK (green) for the flight. The convolved profile is defined as xa + A(x − xa), where A is the flight-mean AK, x is the in situ profile, and xa is the a priori profile.
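The convolution defined above can be sketched in a few lines, taking x as the in situ profile to be smoothed so that it can be compared like-for-like with the retrieval. The sketch assumes that the in situ profile, the a priori profile, and the AK are all defined on the same vertical grid, with the AK stored as a square list of rows; the function name is hypothetical.

```python
def smooth_with_ak(x_in_situ, x_a, ak):
    """Return xa + A(x - xa) for profiles on a common vertical grid.

    x_in_situ : in situ profile interpolated to the retrieval levels
    x_a       : a priori profile on the same levels
    ak        : square averaging kernel matrix (list of rows)
    """
    n = len(x_a)
    if len(x_in_situ) != n or len(ak) != n:
        raise ValueError("profiles and AK must share the same levels")
    diff = [x_in_situ[j] - x_a[j] for j in range(n)]
    return [x_a[i] + sum(ak[i][j] * diff[j] for j in range(n))
            for i in range(n)]
```

Two limiting cases give a useful check: an identity AK returns the in situ profile unchanged, whereas an all-zero AK returns the a priori profile, reflecting a retrieval with no sensitivity to the true state.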
By comparing the mean of the retrieved profile with the mean of the in situ data across an entire flight, we can compare a more consistent data set than we would by comparing individual retrieved profiles. Due to the varying flight altitude, the mean-retrieved and convolved profiles represent a weighted mean reflecting the different sampling frequency within each altitude bin. The red bars seen at each level in Fig. 5b (and all analogous figures for other retrieved parameters in the following sections) represent 1 SD of the mean-retrieved water vapour concentration in each vertical level. For the retrieval, this variability would be expected to be a convolution of both natural (sensed) air mass variability and the root mean square of a posteriori retrieval uncertainty (which is Gaussian in nature and reduces with increasing sample size); on the other hand, for the in situ data, measurement error is sufficiently small (of order 100 ppm in the troposphere) that the black bars can be assumed to represent sampled natural variability only. Therefore, we can compare the mean difference between these two profiles for bias (and bias significance), and we can compare the width of the 1σ bars to establish whether the retrieval is able to capture the expected variability in the assumed absence of changes in air mass composition between in situ sampling and remote sensing (and when sample size is sufficiently large to reduce a posteriori uncertainty to levels much less than the expected variability).
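The per-layer statistics described above can be sketched as follows. This is an assumed, simplified binning: the paper does not give the exact procedure, and the function `bin_statistics` and the sample values are hypothetical. Pooling all samples that fall in a layer means that each layer mean is automatically weighted by how often that layer was sampled.

```python
from statistics import mean, stdev


def bin_statistics(samples, z_top_km, n_bins):
    """Group (altitude_km, value) samples into equal-depth layers.

    Returns {bin_index: (count, mean, sample standard deviation)};
    layers with no samples are omitted.
    """
    depth = z_top_km / n_bins
    layers = {}
    for z, value in samples:
        if 0.0 <= z < z_top_km:
            layers.setdefault(int(z // depth), []).append(value)
    stats = {}
    for index, values in sorted(layers.items()):
        # A single sample has no spread; report 0.0 rather than failing.
        spread = stdev(values) if len(values) > 1 else 0.0
        stats[index] = (len(values), mean(values), spread)
    return stats


# Hypothetical water vapour samples (altitude in km, concentration in ppm).
samples = [(0.4, 9000.0), (0.6, 9400.0), (3.0, 5000.0), (3.2, 5600.0), (3.4, 5300.0)]
stats = bin_statistics(samples, z_top_km=9.0, n_bins=10)
for index, (count, layer_mean, layer_sd) in stats.items():
    print(index, count, round(layer_mean, 1), round(layer_sd, 1))
```

The width of the resulting 1 SD bars for retrieved and in situ data can then be compared layer by layer, as in Fig. 5b.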
We see that the AK-convolved in situ profile (green) compares well with the mean-retrieved profile (red), with the latter falling well within the corresponding 1σ of the dropsonde measurements. This shows that the retrieval agrees well with an idealised retrieval scheme, giving confidence that MARS is performing optimally. The mean bias in Fig. 5b ranges between 110 ppm (1 %) at 500 m and ∼ 1140 ppm (14 %) at 3 km. The increased bias at 3 km is due to the poorer performance of the retrievals from a higher altitude, which dominate the contribution to the mean profile at this altitude (yellow profiles in Fig. 5a), whereas the profiles recorded from an altitude just below 3 km (light blue in Fig. 5a) do capture the locally drier layer between 2.5 and 3 km. In summary, there is information content in vertically resolved water vapour nadir retrievals from ARIES and fine vertical structure can be resolved in the layers nearest to the observer (within ∼ 2 km).
Table 3 lists the performance across all flights where dropsonde data exist for validation, and reports the weighted-mean bias and standard error of this mean bias across the validation data set. The DOFS remain similar across all campaigns (average of 3.11) and the flight-mean a posteriori uncertainty ranges from 5 to 13 % (average across all flights of 9.5 %), with the highest uncertainty noted for flight B720, which may be expected as this flight was conducted in a cold Arctic environment with consequently reduced thermal contrast. Furthermore, the average retrieval a posteriori uncertainty (∼ 9.5 %) is much reduced relative to the a priori uncertainty constraint (20 %). The partial-column mean bias is −479 ppm (−4.8 %) with a standard error of the bias of 56 ppm (0.6 %). This compares to a standard deviation of the in situ data of 775 ppm (7.5 %), suggesting that the mean bias is significant when compared with the observed natural variability. A direct comparison between in situ data and remote-sensing data is never possible in practice because air masses can shift below the aircraft in the time between in situ measurement and retrieval from above. However, the statistical agreement seen here across several flights and 389 retrieved profiles confirms that MARS water vapour profiles can be retrieved with a typical individual partial-column mean profile uncertainty of 1144 ppm (∼ 10 %), with a statistically meaningful bias of −4.8 % over a large sample of profiles. This uncertainty is also consistent with the limit of the theoretical performance found for water vapour in Part 1 of this study. The individual profile uncertainty compares similarly with that reported for IASI (10 % between 800 and 300 mb; Pougatchev et al., 2009). The ARIES mean bias (−4.8 %; see Table 3) also compares well with that reported for IASI in validation studies using radiosondes (10 % between 800 and 300 mb; Pougatchev et al., 2009).
Table 3. Summary of retrieval metrics and validation results across all flights. Columns, numbered from left to right, show (1) target parameter, (2) FAAM flight number, (3) number of ARIES retrievals, (4) mean degrees of freedom for signal, (5) flight-mean column-averaged a posteriori retrieval uncertainty, (6) mean bias of retrieved partial columns relative to in situ data, (7) the standard error of the mean bias, and (8) the standard deviation of the in situ data.

Temperature
Figure 6 shows convergence parameters (analogous to those presented for H2O in Fig. 4) for an example temperature retrieval recorded over the UK mainland at 8.9 km flight altitude during flight B725 (ClearfLo) on 8 August 2012. Figure 6a illustrates a generally good simulated spectral fit (green) to the ARIES-measured spectrum (black). Figure 6b shows the residual and we note some small residual structure, especially at the centre of a strong Q branch of CO2 at ∼ 720 cm−1. The intensity of this Q branch and its strong sensitivity to temperature make it very sensitive to the effects of the vertical discretisation necessary for the radiative transfer modelling, and as such some error may be expected. However, the P and R branches of this band, which are likewise sensitive to temperature, but which do not saturate over path lengths similar to the thickness of the layers used here (∼ 600 m), provide the bulk of the measurement information in this spectral window. This is precisely why CO2 and temperature are simultaneously retrieved. There are also two weak, unidentified potential absorption lines in the measured spectrum at 740 and 746 cm−1. However, the overall residual is commensurate with the ARIES measurement (radiance) uncertainty (black dotted lines in Fig. 6b). The effect of this is also implicit in the a posteriori uncertainty calculation, which is consistently ∼ 0.6 K across the tropospheric profile and dominated by the smoothing and measurement uncertainty terms (Fig. 6d). This ARIES accuracy compares favourably with the typical single-retrieval uncertainty reported for IASI in the troposphere but also represents a significant improvement over IASI at the surface (IASI accuracy is reported as 0.6 K between 800 and 300 mb, worsening to 2 K at the surface; Pougatchev et al., 2009). The temperature AKs (Fig. 6c) for this example demonstrate excellent vertical resolution with a DOFS value of 4.73, which compares with the simulated (idealised) DOFS of ∼ 4 in Part 1 of this study. The AK peak at each altitude is only slightly dependent on information content from other levels and is typically smoothed over a ±1 km length (when using 10 vertical levels at 9 km flight altitude). This result confirms that vertically resolved tropospheric profiles of temperature can be usefully reported using MARS for ARIES-measured spectra. This capability is especially useful for atmospheric process studies such as boundary layer transport and outflow, where knowledge of the thermodynamic structure of the lower atmosphere is important.
Figure 7. (a) A total of 103 individual temperature retrieval profiles across flight B290 colour coded for observer (flight) altitude. In situ dropsonde (black) and a priori profiles (blue) are also shown with 1σ variability bars. (b) Mean-difference profiles (binned and evaluated at eight equidistant vertical levels) for retrieved (red), and in situ (black), differenced relative to the mean in situ profile after convolution with the ARIES averaging kernel. The root mean square retrieval uncertainty (total error) is shown by the red bars and the in situ 1σ variability is shown as black bars at each vertical level.
Figure 7 shows temperature retrievals for 103 individual profiles across flight B290 (Fig. 7a) and the weighted-mean flight profiles together with their in situ counterparts (Fig. 7b) in the same manner as that presented for water vapour in Fig. 5. This flight was chosen as there were four dropsondes released over various locations along the flight track and we were interested in how MARS might respond to the presence of temperature inversions in the real atmosphere. Figure 7a shows that the retrieved temperature profiles (blue) were consistently negatively biased relative to dropsonde data between 2.5 and 4 km by up to 5 K at peak. This compares with a negative bias in the ECMWF a priori profile of 3 K over the same altitude range. Also, the a priori does not show a weak temperature inversion seen in the dropsonde data between 1.5 and 2.25 km. In the individual retrievals above 4.5 km, we see a clear tendency away from the a priori toward the dropsonde data and the mean bias reduces to less than 0.5 K (see Fig. 7b). However, just below the temperature inversion at ∼ 1.5 km, we see a positive bias in the retrieval of ∼ 2 K. The retrieval of such a sharp temperature inversion is not expected to be possible from ARIES spectra recorded from high altitudes, but we might expect (as we do observe here) that the retrieval will manifest such inversions as a positive and negative bias, due to smoothing across the inversion prescribed by the AK.
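The way in which AK smoothing turns a sharp inversion into paired positive and negative biases can be illustrated with a synthetic example. The sketch below is not the MARS retrieval: it assumes an idealised averaging kernel whose rows are normalised Gaussians of 1 km width, a hypothetical 0 to 5 km grid, and a made-up temperature profile with a 4 K step between 1.5 and 2.0 km.

```python
import math


def gaussian_ak(levels_km, width_km):
    """Build an illustrative averaging kernel whose rows are normalised Gaussians."""
    ak = []
    for zi in levels_km:
        row = [math.exp(-0.5 * ((zj - zi) / width_km) ** 2) for zj in levels_km]
        total = sum(row)
        ak.append([w / total for w in row])
    return ak


levels_km = [0.5 * i for i in range(11)]  # hypothetical grid, 0 to 5 km
# "True" profile: constant lapse rate with a 4 K step (inversion) between 1.5 and 2.0 km.
t_true = [288.0 - 6.5 * z + (4.0 if z >= 2.0 else 0.0) for z in levels_km]
# A priori: same lapse rate, no inversion, lying midway between the two branches.
t_prior = [290.0 - 6.5 * z for z in levels_km]

ak = gaussian_ak(levels_km, width_km=1.0)
# Smoothed profile x_a + A (x - x_a), i.e. what an idealised retrieval would return.
t_smoothed = [
    tp_i + sum(a * (t - tp) for a, t, tp in zip(row, t_true, t_prior))
    for tp_i, row in zip(t_prior, ak)
]
bias = [ts - t for ts, t in zip(t_smoothed, t_true)]
for z, b in zip(levels_km, bias):
    print(f"{z:3.1f} km  {b:+.2f} K")
```

Under these assumptions the smoothed profile is too warm immediately below the step and too cold immediately above it, which is qualitatively the pattern described for Fig. 7.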
The mean-profile bias averaged across this flight was −0.4 K with a standard error of 0.1 K, which compares with a 2.5 K standard deviation for the in situ validation data set. The mean-retrieved and AK-convolved profiles in Fig. 7b fall well within 1σ of the dropsonde data at all altitudes. Therefore, we can conclude that the −0.4 K bias is statistically significant but small compared with the range of natural variability observed across this flight. From Table 3, we see that the mean bias averaged across all flights is −0.7 K (with a standard error of 0.3 K), compared to a 2.1 K standard deviation in the in situ data set. This bias is similar to reported temperature biases for IASI in the troposphere (±0.5 K between 900 and 100 mb; Pougatchev et al., 2009). As the ARIES bias is consistently and significantly less than the natural variability, we report the mean a posteriori uncertainty (0.9 K) as an appropriate conservative uncertainty for individual temperature retrievals from ARIES using MARS; furthermore, we quote −0.7 K as a representative bias.

Methane
Figure 8 shows convergence parameters for a typical CH4 retrieval, derived from ARIES spectra recorded over the UK mainland around midday at 9.0 km flight altitude during flight B724 from ClearfLo on 30 July 2012. Again we see an excellent simulated spectral fit to the ARIES-measured spectrum (Fig. 8a) and a featureless residual (Fig. 8b). The AKs for methane (Fig. 8c) demonstrate significantly less vertical resolution than for H2O or temperature, with a DOFS value of 0.86, which compares well with the typical simulated DOFS for CH4 of ∼ 1.0 predicted in Part 1 of this study at similar altitudes. There is clearly more sensitivity to the upper layers of the column (between 5 and 8 km); however, information in these layers is noted to be strongly influenced by the layers below. On inspection of the spectrally resolved weighting function for CH4 (not shown), it can be seen that this arises because of saturation of strong CH4 absorption lines, with the remainder of the lower layer information coming from much weaker lines and a commensurately reduced signal to noise. This is also typical of IASI retrievals of methane in the troposphere, which likewise show limited penetration and sensitivity into the tropospheric column, and confirms that only partial columns can be usefully reported for ARIES retrievals. It is also important to note that this partial-column information is mainly weighted to a 2 km layer below the aircraft. This is due to the optical depth associated with the strong versus the weaker absorbing methane lines across the wide spectral band used here for CH4 retrieval. Whilst the strong lines saturate on length scales less than 1 km, the weaker lines saturate at various lengths between 1 km and the ground. The convolution of this information content from all spectral lines for CH4 results in an AK peak at 2 km below the aircraft, mostly independent of aircraft height (in the troposphere).
The total a posteriori uncertainty (Fig. 8d) for individual retrievals is significant at ∼ 100 ppb (∼ 5 %) of in situ concentration across the profile, and is again dominated by the smoothing and measurement uncertainty components. However, when comparing the convolved profile with the retrieved profile (where smoothing error should not be considered), we should note that the total uncertainty (dominated by the measurement term; see Fig. 8d) is smaller at ∼ 20 ppb (∼ 1 %) throughout the column. Figure 9a shows 389 methane concentration retrievals (coloured profiles) from flight B725, compared to vertically binned (averaged into 10 equidistant layers across the profile) in situ concentration profiles measured by the FGGA (Fast Greenhouse Gas Analyser) (black). First, we note that the a priori (operationally derived from the MACC-II reanalysis data set; see Inness et al., 2013, and Part 1 of this study for details) in blue shows a significant negative bias relative to in situ data of around 3 % (∼ 60 ppb at all altitudes). Despite this, the retrieved profiles tend well toward the in situ data in all cases and the a posteriori error bars (dotted lines in Fig. 9a) always overlap the in situ profile. When averaged across a flight, Fig. 9b shows good agreement between retrieval and in situ data in the flight-averaged profiles between 2.5 and 9 km, but shows a clear negative bias (up to ∼ 2.5 %) in the lowest layers (below 2 km). This is due to the lack of near-surface sensitivity noted from the AK in Fig. 8c, meaning that the retrieval in those layers tends toward a negatively biased a priori. However, the agreement in the upper layers demonstrates that the retrieval can allow for large departures in ambient CH4 from expected climatology. It also highlights a need for a better choice of a priori and a potential bias in the MACC-II data set.
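The difference between the total uncertainty with and without the smoothing term can be illustrated by combining error components in quadrature. This sketch assumes that the components are independent so that their variances add; the component values are hypothetical, chosen only to resemble the magnitudes quoted above (∼ 100 ppb in total, ∼ 20 ppb without smoothing), and are not read from Fig. 8d.

```python
import math


def total_uncertainty(components, exclude=()):
    """Combine independent 1-sigma error components in quadrature.

    components maps a component name to its 1-sigma value; names listed in
    exclude are left out of the sum.
    """
    return math.sqrt(sum(value ** 2
                         for name, value in components.items()
                         if name not in exclude))


# Hypothetical CH4 error budget at one level (ppb); illustrative values only.
components_ppb = {
    "smoothing": 98.0,
    "measurement": 19.0,
    "parameter": 5.0,
    "systematic": 4.0,
}
with_smoothing = total_uncertainty(components_ppb)
without_smoothing = total_uncertainty(components_ppb, exclude=("smoothing",))
print(round(with_smoothing, 1), round(without_smoothing, 1))
```

Excluding the smoothing term is appropriate only when the comparison profile has itself been convolved with the AK, as for the green profiles in the mean-profile figures.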
Figure 9. (b) Mean profiles from flight B725 for retrieved (red), in situ measured (black), and in situ average convolved with ARIES averaging kernels (green). The in situ 1σ measurement variability is shown as black bars at each binned vertical level and the root mean square retrieval uncertainty (total error) is shown by the red bars.
To test the sensitivity of MARS to this poor a priori, we also performed retrievals which used the measured in situ profile as the a priori constraint (not shown). This yielded marginally better mean profiles compared to those shown in Fig. 9b (< 0.5 % bias and a 1σ of 2 %). However, we will report on the use of the MACC-II prior for validation with the operational MARS scheme, which we would use in the absence of prior knowledge from the FGGA measurements. As such, we can characterise performance across the entire ARIES data set where we have no choice but to rely on the available climatology.
Averaged across all flights (see Table 3), the mean bias in the retrieved CH4 columns is −11 ppb (−0.6 %) with a standard error of 2 ppb. However, the bias for individual flights ranges from −2.7 % (flight B719) to +1.1 % (flight B720). This global mean bias (and its standard error) is comparable with the measured natural variability (14.8 ppb). Therefore, we characterise uncertainty for individual methane retrievals using a conservative upper limit corresponding to the (larger) total a posteriori uncertainty, which is consistently ∼ 5 % (∼ 100 ppb) for partial columns up to 9 km altitude. We also quote a mean bias here of −0.6 %. We note that this does not reach the very high accuracy of TCCON of 0.2 % for methane. However, TCCON has the great advantage of directly observing the sun and reaches this high accuracy only after calibration against WMO gas standards during aircraft profiling.

Carbon monoxide
Figure 10 shows convergence parameters for a carbon monoxide retrieval for ARIES spectra recorded over the UK mainland at 7.7 km flight altitude during flight B725 from ClearfLo at 11:55 UT, 8 August 2012. Again we see a largely featureless residual broadly comparable with the measurement uncertainty. However, several of the CO lines are not fitted well. This is a persistent feature of the operational CO retrievals and cannot be improved further in MARS. There are many potential sources for this error. Several principal sources have been investigated, including wave number shift and ARIES instrument line shape. Other errors may be associated with the HITRAN (high-resolution transmission molecular absorption database) 2012 reference spectroscopy used for CO (HITRAN is described by Rothman et al., 2013), but this seems unlikely as such residuals have not been noted in IASI retrievals, for example. We note this error here and it is included in the measurement error component seen in Fig. 10d (red line). This equates to ∼ 10 ppb (∼ 8 % in concentration terms), making it the second-most dominant term after the smoothing component in the a posteriori uncertainty, which is highly significant at between ∼ 60 ppb (30 %) at the surface and ∼ 50 ppb (35 %) in the uppermost layers. This is similar to the uncertainty of 34 % reported for tropospheric IASI CO retrievals (Illingworth et al., 2011).
Figure 11. (a) A total of 203 individual carbon monoxide concentration retrieval profiles across flight B290 colour coded for observer (flight) altitude. Retrieval uncertainty (total error) is shown as the dotted bars in each case. In situ and a priori profiles are also shown as per legend. (b) Mean profiles from flight B290 for retrieved (red), in situ measured (black), and in situ average convolved with ARIES averaging kernels (green). The in situ 1σ measurement variability is shown as black bars at each binned vertical level and the root mean square retrieval uncertainty (total error) is shown by the red bars.
The AK for CO (Fig. 10c) also demonstrates weak vertical sensitivity to the lowest layers (below 2 km) of the atmosphere with a DOFS value of 0.92. This compares well with the typical DOFS for CO of ∼ 1.0 simulated in Part 1. There is a broad (yet smoothed) sensitivity to much of the partial column below the aircraft, with sensitivity down to ∼ 2 km. As for CH4, IASI likewise shows limited penetration and sensitivity into the tropospheric column, confirming that, as with IASI, only partial columns can be usefully reported for ARIES retrievals of CO. Figure 11a shows 203 CO retrievals from flight B290 (JAIVEx) compared to vertically binned (10 equidistant levels) in situ concentration profiles measured by the Aerolaser Inc. instrument (black line). The a priori (operationally derived from the MACC database) shows a negative bias relative to the in situ profile of around 30 % (∼ 20–45 ppb across the profile). Due to the expected high relative variability of CO in the real atmosphere (evident here from the ±25 ppb 1σ bars for in situ data in Fig. 11a), we use a 20 % a priori uncertainty constraint (as described further in Part 1), which allows the retrieval algorithm to diverge from a potentially inaccurate climatology. Comparing the flight-mean and in situ partial columns (Fig. 11b and Table 3), we see a mean bias of −2.2 % with a standard error of ±1.3 % (2 ppb).
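The bias metrics quoted here and in Table 3 can be illustrated with a short sketch. It assumes an unweighted mean of matched retrieved and in situ partial columns and the usual standard error (sample standard deviation divided by the square root of the sample size); the weighting actually used for Table 3 is not reproduced, and the function name and the four data pairs are hypothetical.

```python
import math
from statistics import mean, stdev


def bias_statistics(retrieved, insitu):
    """Return (mean bias, standard error of the mean bias, in situ standard deviation)."""
    if len(retrieved) != len(insitu) or len(retrieved) < 2:
        raise ValueError("need at least two matched retrieved/in situ pairs")
    differences = [r - s for r, s in zip(retrieved, insitu)]
    mean_bias = mean(differences)
    # Standard error of the mean: sample standard deviation over sqrt(n).
    standard_error = stdev(differences) / math.sqrt(len(differences))
    # Spread of the in situ data, used here as a proxy for natural variability.
    natural_variability = stdev(insitu)
    return mean_bias, standard_error, natural_variability


# Hypothetical matched CO partial columns (ppb); not campaign data.
retrieved_ppb = [98.0, 101.0, 97.0, 100.0]
insitu_ppb = [100.0, 102.0, 101.0, 103.0]
mean_bias, standard_error, natural_variability = bias_statistics(retrieved_ppb, insitu_ppb)
print(round(mean_bias, 2), round(standard_error, 2), round(natural_variability, 2))
```

A mean bias that is several standard errors from zero is statistically significant, yet it may still be small in practical terms if it lies well inside the natural variability of the in situ data.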
This high variability in the retrieved bias is smaller than the natural variability in CO measured in situ (41 ppb; see Table 3), which represents a special case (other flights did not see such variation). This could indicate that the a priori is over-constrained for this flight. To examine this, we have also tested a more relaxed a priori covariance constraint in MARS. Figure 12a and b show CO mean-flight retrievals for two other flights, B720 and B724, from the MAMM and ClearfLo campaigns, respectively. For those flights, we tested the performance of MARS with a 25 % a priori covariance for each retrieval level. When using this relaxed constraint, and despite the positively biased a priori, we observe much better retrieval performance in the mean for altitudes above 2 km when comparing the in situ profile and that convolved with the ARIES AK (black and green lines, respectively). For the B720 and B724 flights we see a mean partial-column bias of −3.3 and −2.2 %, respectively, with a corresponding standard error of 2.6 and 7.3 %, respectively. This compares well to the natural sampled variability of ∼ 8 and ∼ 12 %, respectively (see Table 3). Given the small overall mean bias (−3 %, −3.3 ppb) in Table 3 compared to the overall natural variability of CO measured in the atmosphere (17 %, 17.6 ppb), and a small standard error of this mean bias (1.3 %, 1.0 ppb), we can be confident that the a posteriori uncertainty from individual profiles is a conservative uncertainty for individual retrievals here; this is ∼ 21 % of the partial column (see Table 3). We also quote a bias of −3.3 ppb (−3 %, with a standard error of 1.3 %). This is an improvement on the upper IASI uncertainties of 34 % reported by Illingworth et al. (2011) for CO.

Ozone
Figure 13 shows convergence parameters for an example O3 retrieval over northern Sweden at 8.3 km flight altitude during flight B719 of the MAMM campaign on 21 July 2012. We see a largely featureless residual within the instrumental radiometric uncertainty (Fig. 13b). The AK shows little vertical resolution and a sensitivity weighted to a layer ∼ 3 km below the aircraft (Fig. 13c). Total a posteriori uncertainty is ∼ 17 ppb (∼ 25 % in this example) across the profile and dominated by the smoothing term (80 % of total error), with the measurement error term contributing ∼ 20 % to the total error (Fig. 13d). We also show results for flight B724 as a special contrasting case. Figure 14a shows 42 O3 retrievals from flight B724 (ClearfLo) compared to the vertically binned in situ concentration profile measured by the 2B Technologies instrument (black). The a priori (operationally derived from the MACC-II data set in this case) has little bias below 7 km but appears to misrepresent the presence of stratospheric air enriched in ozone above 7 km (confirmed also by the aircraft-measured potential temperature profile, not shown). Meteorological charts for this day show a tropopause fold over the area (not shown), a mesoscale feature not commonly captured by coarse-scale global circulation models such as those employed for MACC-II. This makes this case study particularly interesting in assessing the response of MARS to unexpected events. Encouragingly, the retrieved profiles in Fig. 14 do capture some of the (vertically smoothed) structure of this stratospheric intrusion despite the a priori constraint above 7 km. The AK at 7.1 km (Fig. 13c) contains dominant peaks from both that layer and the two adjacent layers (8.95 and 5.61 km), and this smoothing is manifest in the retrieved profile as a positive and negative bias in the layers around a rapidly increasing gradient in ozone at 7 km. This is analogous to the retrieval response to the presence of a strong temperature inversion discussed in Sect. 4.2 and shows that MARS can capture important (and unexpected) vertical gradients in ozone within 2 km of the aircraft altitude.
Figure 14. (a) A total of 42 ozone retrieval profiles from flight B724 colour coded for flight altitude. Retrieval uncertainty (total error) is shown as the dotted bars in each case. In situ and a priori profiles are also shown as per legend. (b) Mean profiles from flight B724 for retrieved (red), in situ measured (black), and in situ data average convolved with ARIES averaging kernels (green). The in situ 1σ measurement variability is shown as black bars at each binned vertical level and the root mean square retrieval uncertainty (total error) is shown by the red bars.
Comparing the B724 flight-mean and in situ partial columns (Fig. 14b and Table 3), we see a mean bias of +3.7 % but with a large standard error of ±5.4 % (4.8 ppb), suggesting that this bias may not be statistically significant. This also compares with a larger natural variability of 10.9 ppb (8 % at 1σ). Figure 14b shows the standard deviation of the mean-retrieved profile (red bars) and we see that this falls well within the corresponding 1σ in situ bars (black).
Due to the potential of the FAAM aircraft to routinely sample stratospheric air, two further flight examples are shown (Fig. 15) for the presence and absence of a stratospheric intrusion during flights B720 (MAMM; Fig. 15a) and B290 (JAIVEx; Fig. 15b), respectively. In both examples, the retrieval performs well and captures smoothed vertical structure in the layers within 3 km below the aircraft. In these flights, mean partial-column biases were found to be +4.4 and +8.1 % with standard errors of 2.6 and 1.0 %, respectively (Table 3). This compares to a natural variability (at 1σ) of 17 and 21 %, respectively. The global data set bias (+3.5 ppb) is therefore small compared to natural variability. In summary, we quote uncertainty on individual ARIES ozone partial-column retrievals as being conservatively characterised by the a posteriori uncertainty (15 %, or 11 ppb in the weighted-mean-column concentration), with a bias of +3.5(±1) ppb (4.7 %).

Conclusions
Atmospheric trace-gas-concentration and thermodynamic profiles have been retrieved and validated for the ARIES instrument using the MARS scheme throughout the troposphere and planetary boundary layer (PBL) for aircraft campaigns around London, the US Gulf Coast, and the Arctic Circle during the ClearfLo, JAIVEx, and MAMM aircraft projects, respectively.
Typically high DOFS for temperature (4.71) and water vapour (3.11) confirm that vertically resolved information can be obtained for these parameters, whilst only partial-column retrievals of CO, CH4, and O3 can be usefully reported. In the case of temperature and water vapour, PBL inversion layers and dry/moist layers could be qualitatively discerned. Retrieved data were compared to corresponding measurements from high-precision in situ analysers and dropsondes operated on the FAAM aircraft. Partial-column mean biases (and mean bias standard error) averaged across all flight campaigns were −0.7(±0.3) K, −479(±56) ppm, −11(±2) ppb, −3.3(±1.0) ppb, and +3.5(±1.0) ppb for T, H2O, CH4, CO, and O3, respectively. Average a posteriori uncertainties for single retrievals were 0.4 %, 9.5 %, 5.0 %, 21.2 %, and 15.0 %, respectively, representing a typical uncertainty for single ARIES FOV retrievals. The ARIES mean bias for methane (11 ppb, ∼ 0.6 %) compares similarly with the previously reported near-IR remote-sensing statistical accuracy of CH4 from the TCCON network (0.2 % after calibration to gas standards) and the MAMAP aircraft instrument (2 %); ARIES performs significantly better for all tropospheric state parameters studied here when compared to IASI.
Averaging kernels derived for progressively lower altitudes showed improving sensitivity to lower atmospheric layers when flying at lower altitudes, typically peaking between 1 and 2 km below the aircraft. In particular, vertical structure in this layer was accurately detected and resolved in the case of ozone (e.g. during two stratospheric intrusions not expected in reanalysis thermodynamic and ozone data used as a priori). This demonstrates that valuable additional information content can be obtained by nadir infrared remote sensing using ARIES by optimising the vertical sampling of the FAAM aircraft for future atmospheric process studies using the MARS scheme.