MIPAS IMK/IAA CFC-11 (CCl₃F) and CFC-12 (CCl₂F₂) measurements: accuracy, precision and long-term stability

E. Eckert, A. Laeng, S. Lossow, S. Kellmann, G. Stiller, T. von Clarmann, N. Glatthor, M. Höpfner, M. Kiefer, H. Oelhaf, J. Orphal, B. Funke, U. Grabowski, F. Haenel, A. Linden, G. Wetzel, W. Woiwode, P. F. Bernath, C. Boone, G. S. Dutton, J. W. Elkins, A. Engel, J. C. Gille, F. Kolonjari, T. Sugita, G. C. Toon, and K. A. Walker

Karlsruhe Institute of Technology, Institute of Meteorology and Climate Research, Karlsruhe, Germany
Instituto de Astrofísica de Andalucía, CSIC, Granada, Spain
Department of Chemistry and Biochemistry, Old Dominion University, Norfolk, VA 23529-0126, USA
Department of Chemistry, University of Waterloo, Waterloo, Ontario, Canada
NOAA Earth System Research Laboratory, Boulder, CO 80305, USA
Cooperative Institute for Research in Environmental Sciences, University of Colorado, Boulder, CO 80309, USA
Institut für Atmosphäre und Umwelt, J. W. Goethe Universität, Frankfurt, Germany

12) for the RR and the FR period. Between ∼ 15 and 30 km, most comparisons agree within 10-20 pptv (10-20 %), apart from ILAS-II, which shows large differences above ∼ 17 km. Overall, relative differences are usually smaller for CFC-12 than for CFC-11. For both species, CFC-11 and CFC-12, we find that differences at the lower end of the profile tend to be larger at higher latitudes than in tropical and subtropical regions. In addition, MIPAS profiles have a maximum in their mixing ratio around the tropopause, which is most obvious in tropical mean profiles. Comparisons of the standard deviation in a quiescent atmosphere (polar summer) show that only the CFC-12 FR error budget can fully explain the observed variability, while for the other products only two-thirds to three-quarters can be explained. Investigations regarding the temporal stability show very small negative drifts in MIPAS CFC-11 measurements. These instrument drifts vary between ∼ 1 and 3 % decade⁻¹. For CFC-12, the drifts are also negative and close to zero up to ∼ 30 km. Above that altitude, larger drifts of up to ∼ 50 % decade⁻¹ appear, which are negative up to ∼ 35 km and positive, but of a similar magnitude, above.

Introduction
Chlorofluorocarbons (CFCs) have been monitored for several decades because of their potential to release catalytically active species that destroy stratospheric ozone, a potential first identified by Molina and Rowland (1974). Even though there are also natural sources of halogens, observations focus on man-made CFCs, such as CFC-11 and CFC-12, because increased release of active chlorine species due to elevated amounts of these substances can significantly alter the equilibrium of stratospheric ozone formation and destruction. Under certain conditions (sufficiently cold temperatures for chlorine activation; polar stratospheric clouds, PSCs), this can lead to severe ozone depletion. Since CFCs have very long lifetimes in the atmosphere (52 years with an error range of 43-67 years for CFC-11; 102 years with an error range of 88-122 years for CFC-12; SPARC, 2013) and are insoluble in water, they can easily reach the stratosphere because they are neither destroyed nor washed out before they arrive at these altitudes. In the stratosphere, halogen source gases, such as CFC-11 or CFC-12, are photolyzed or otherwise broken up and finally converted to so-called reservoir gases, particularly hydrogen chloride (HCl) or chlorine nitrate (ClONO₂), by chemical reactions and under the influence of solar ultraviolet radiation. Stratospheric abundances of hydrogen chloride and chlorine nitrate increased significantly during the later decades of the past century (World Meteorological Organization, 2011), as a consequence of intensified anthropogenic emissions of CFCs and other ozone-depleting substances (ODSs), which were used for refrigeration, foam blowing and several other purposes. While direct reactions of ozone with the reservoir species HCl and ClONO₂ are not relevant for ozone depletion, these reservoir species are transformed into active chlorine species (ClOₓ; mainly ClO, Cl and Cl₂O₂) under sufficiently cold temperatures. The active chlorine species catalytically destroy ozone via the so-called ClO-dimer cycle (Molina and Molina, 1987) and the synergistic interaction of ClO and BrO (McElroy et al., 1986). Here, heterogeneous reactions on the surface of cold aerosols of PSCs occur and, in combination with sunlight, result in the reactivation of chlorine, which can then destroy ozone catalytically, ultimately leading to ozone depletion and the formation of the ozone hole.
Once it was observed (Farman et al., 1985) that these processes could lead to severe ozone depletion in reality, the Montreal Protocol was adopted in 1987 to control the emission of CFCs and other ozone-depleting substances. Afterwards, the emission of CFCs decreased and ceased completely in 2010 (World Meteorological Organization, 2011), which led to decreasing amounts of these species in the atmosphere. However, since several CFCs have lifetimes of up to 100 years and more, which makes them excellent tracers for the Brewer-Dobson circulation (Schoeberl et al., 2005; SPARC, 2013), significant amounts of these species are still present in the atmosphere. Hence, their monitoring and the closer examination of their evolution in the atmosphere are important tasks, as Kellmann et al. (2012) have shown by illustrating that there are trends in CFC-11 and CFC-12 which can so far only be explained by changes in circulation. In addition to their ozone-depleting potential, CFC-11 and CFC-12 have a pronounced global warming potential (World Meteorological Organization, WMO, e.g., Fig. 1-6-4), which is another reason for monitoring these species.

In the following, we describe the data products and the different characteristics of the instruments used in the comparisons (Sect. 2), followed by an explanation of the validation method (Sect. 3). Since the Michelson Interferometer for Passive Atmospheric Sounding (MIPAS) malfunctioned in 2004 and the retrieval setup had to be changed afterwards to address the altered situation, two data sets exist for each species: one (FR, full spectral resolution) referring to the period of July 2002 to March 2004 and one (RR, reduced spectral resolution) referring to the period of January 2005 to April 2012. The spectral resolution degraded from the FR to the RR period, but more scans in the vertical were performed per profile during the RR period, which leads to better altitude resolution (Kellmann et al., 2012, Table 1). Thus, in Sect. 4 we show the extensive results of the validation of versions V5R_220 and V5R_221 (corresponding to the RR period) of the MIPAS CFC-11 and CFC-12 products and also a few comparisons for version V5H_20 (corresponding to the FR period) of the same species. The paper concludes with a summary.

Instruments
All the instruments used in this study and their main characteristics are summarized in this section. Information on vertical coverage, vertical resolution and utilized spectral region is collected in Table 1. The table also gives an overview of the observation period and spectroscopic data used for the retrievals. The spectral regions used for each remote sensor under consideration are illustrated in Fig. 1, along with the contributions of all interfering species. Besides the Institute of Meteorology and Climate Research/Instituto de Astrofísica de Andalucía (IMK/IAA) data product, MIPAS CFC data by ESA (Raspollini et al., 2013, validated by Engel et al., 2016), Oxford University (The University of Oxford Physics Department, 2008), Forschungszentrum Jülich (Hoffmann et al., 2008) and MIPAS Bologna Facility (Dinelli et al., 2010) also exist. Since these retrievals rely on the same measurements as our data, they are not independent and thus have not been used for comparison.

MIPAS data and retrieval
The Michelson Interferometer for Passive Atmospheric Sounding (MIPAS) was one of 10 instruments aboard Envisat (Environmental Satellite). The satellite was launched into a polar, sun-synchronous orbit on 1 March 2002. The last contact with the satellite was made on 8 April 2012. This adds up to an observation period of 10 years. Envisat orbited the Earth 14 times a day at an altitude of approximately 800 km. The equator crossing times were 10:00 and 22:00 local time for the descending and ascending node, respectively.
The MIPAS instrument was a high-resolution Fourier transform spectrometer. It measured thermal emissions from the atmospheric limb in the mid-infrared range between 685 and 2410 cm⁻¹ (14.6 and 4.1 µm) (Fischer et al., 2008). The MIPAS measurement period is split into two parts based on the spectral resolution of the measurements. Until March 2004, the measurements were performed with a spectral resolution of 0.025 cm⁻¹ (unapodized), which was the nominal setting. Due to an instrumental failure, later measurements, commencing in January 2005, could only be performed with a reduced spectral resolution and a spectral sampling of 0.0625 cm⁻¹. Accordingly, we denote the two periods as full (FR) and reduced (RR) spectral resolution periods, respectively. In the present validation study we focus on measurements that were performed in the "nominal observation mode". In this mode, spectra at 17 tangent heights between 6 and 68 km were obtained in the FR period. The horizontal sampling was about one scan per 510 km and, overall, more than 1000 scans were performed per day. During the RR period the sampling improved in the horizontal domain to one scan per 410 km and in the vertical domain to 27 spectra between 7 and 72 km. More than 1300 scans were obtained on a single day, covering the entire latitude range.
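Spectral ranges in this paper are quoted both as wavenumbers and as wavelengths. As a small aid for cross-checking such pairs, the conversion can be sketched as follows; the function name is hypothetical and only the 685 and 2410 cm⁻¹ limits are taken from the text.

```python
def wavenumber_to_wavelength_um(wavenumber_cm: float) -> float:
    """Convert a wavenumber in cm^-1 to a wavelength in micrometres."""
    if wavenumber_cm <= 0:
        raise ValueError("wavenumber must be positive")
    # 1 cm = 10^4 um, hence lambda[um] = 10^4 / nu[cm^-1]
    return 1.0e4 / wavenumber_cm


# MIPAS spectral coverage quoted in the text: 685 to 2410 cm^-1
for nu in (685.0, 2410.0):
    print(f"{nu:7.1f} cm^-1 -> {wavenumber_to_wavelength_um(nu):5.2f} um")
```

To one decimal place, the two limits correspond to the 14.6 and 4.1 µm given above.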
Table 1. Overview of important characteristics of the instruments used for the validation of the MIPAS CFC-11 and CFC-12 products. ILAS-II is only used for the validation of the MIPAS high spectral resolution (FR) period. ACE-FTS and MkIV are used to validate the MIPAS data from both periods, while the rest of the instruments only cover the reduced spectral resolution period.

The CFC-11 and CFC-12 data sets that are used in this study have been retrieved with the IMK/IAA processor, which was set up jointly by the Institute of Meteorology and Climate Research (IMK) in Karlsruhe (Germany) and the Instituto de Astrofísica de Andalucía (IAA) in Granada (Spain). The retrieval employs a nonlinear least-squares approach with a first-order Tikhonov-type regularization (von Clarmann et al., 2003, 2009). The simulation of the radiative transfer through the atmosphere is performed by the KOPRA (Karlsruhe Optimized and Precise Radiative transfer Algorithm) model (Stiller, 2000). In the comparisons, we consider data that were retrieved with the retrieval versions V5H_CFC-11_20 and V5H_CFC-12_20 for the FR period as well as V5R_CFC-11_220/221 and V5R_CFC-12_220/221 for the RR period (Kellmann et al., 2012). Version 220 covers the time period from January 2005 to April 2011 and version 221 is attributed to the time afterwards. The only change between these two versions is the source of the temperature a priori data. Initially, the a priori data were based on NILU's (Norwegian Institute of Air Research) post-processing of ECMWF (European Centre for Medium-Range Weather Forecasts) data. Later, they were taken from ECMWF directly as NILU's processing had ceased. Overall, the CFC data sets comprise more than 480 000 individual profiles for the FR period and more than 1.8 million profiles for the RR period. For reasons of legibility, MIPAS Envisat is referred to as MIPAS throughout this document, although other versions of the MIPAS instruments are also considered in this paper.
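For orientation, a least-squares retrieval with first-order Tikhonov-type regularization is commonly written as the minimization of a cost function of the following generic form. The notation (measurement vector y, forward model F, measurement covariance S_y, a priori profile x_a, regularization strength γ, first-order difference operator L_1) is assumed here for illustration and is not taken from the IMK/IAA processor documentation.

```latex
\[
J(\mathbf{x}) =
\bigl(\mathbf{y}-\mathbf{F}(\mathbf{x})\bigr)^{T}\mathbf{S}_{y}^{-1}\bigl(\mathbf{y}-\mathbf{F}(\mathbf{x})\bigr)
+ \gamma\,(\mathbf{x}-\mathbf{x}_{a})^{T}\mathbf{L}_{1}^{T}\mathbf{L}_{1}(\mathbf{x}-\mathbf{x}_{a}),
\qquad
\mathbf{L}_{1} =
\begin{pmatrix}
-1 & 1 & & \\
 & \ddots & \ddots & \\
 & & -1 & 1
\end{pmatrix}
\]
```

Because L_1 acts on differences between adjacent altitude levels, a constraint of this kind penalizes deviations of the profile shape from the a priori shape rather than deviations of the absolute values.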

Cryosampler data
The cryosampler instrument is a balloon-borne cryogenic whole air sampler originally developed at Forschungszentrum Jülich (Germany) in the early 1980s (Schmidt et al., 1987). The cryosampler used in this comparison is the BONBON instrument. The first observations date back to 1982. The instrument consists of a Dewar with 15 stainless steel sampling containers, which is filled with liquid neon to cool the sampling containers down to 27 K. Since the sampled air freezes out immediately, this allows the sampling of a sufficient mass of air even at low pressures. The sampler inlets face downward; hence the BONBON measurements are optimized for the descending leg of the flight in order to avoid contamination from balloon outgassing. After the flight, the collected samples are analyzed for the abundance of a long list of trace gases by means of gas chromatography. In this comparison we consider five balloon flights that were performed by the University of Frankfurt (Germany) (e.g., Laube et al., 2008).

MkIV data
The Mark IV interferometer is a balloon-borne high-resolution Fourier transform spectrometer which was developed at the Jet Propulsion Laboratory in Pasadena (USA) in the 1980s. The instrument employs the solar occultation technique, measuring absorption spectra over a wide wavelength range from 650 to 5650 cm⁻¹ (15.39-1.77 µm) with a very high spectral resolution of up to 0.006 cm⁻¹. Since 1989, more than 20 flights have been conducted (Toon, 1991; Velazco et al., 2011). The flight duration varies between a few hours and up to 30 h, allowing one or two occultations to be taken during one flight. The occultations cover the altitude range between the tropospheric cloud tops and the floating altitude, which is typically within the 35-40 km range. The vertical sampling is about 2 to 4 km. The profile retrieval is based on an iterative nonlinear least-squares fitting algorithm with a derivative constraint. MkIV CFC-11 retrievals were performed using an empirical pseudo-linelist derived from the laboratory measurements of Li and Varanasi (1994). MkIV CFC-12 retrievals used a pseudo-linelist derived from the laboratory measurements of Varanasi and Nemtchinov (1994). These linelists, and a description of their derivation, can be found at http://mark4sun.jpl.nasa.gov/pseudo.html. The vertical resolution of the retrieved data is close to the vertical sampling.

MIPAS-B data
MIPAS-B denotes a balloon-borne version of the MIPAS type of instruments and can be regarded as a precursor of the satellite instrument that flew on Envisat as described in Sect. 2.1. The instrument was developed in the late 1980s and early 1990s at the Institut für Meteorologie und Klimaforschung in Karlsruhe (Germany) and two models were built (Fischer and Oelhaf, 1996; Friedl-Vallon et al., 2004). MIPAS-B interferometers have been operated since 1989 (von Clarmann et al., 1993) and more than 20 flights have been carried out to date. MIPAS-B covers the wavenumber region from 750 to 2500 cm⁻¹ (13.3 to 4 µm). Balloon-borne observations require excellent pointing accuracy, which is realized by a sophisticated line of sight stabilization system. Also, multiple spectra taken at the same elevation angle are averaged to reduce the noise of the measurement data for the comparison with MIPAS. Typically, the MIPAS-B floating altitude lies between 30 and 40 km, and limb scans are performed with a vertical sampling of about 1.5 km up to this altitude. The retrieval algorithm for MIPAS-B observations is based on the same retrieval strategy and forward model as that employed by the MIPAS IMK/IAA processor; however, the microwindows from which the CFC information is derived are slightly different. In total, eight balloon flights were performed during the lifetime of MIPAS. Five of these flights were conducted during the RR period from 2005 to 2012, which is the key period of the present comparisons.

MIPAS-STR data
The cryogenic Fourier transform infrared limb sounder Michelson Interferometer for Passive Atmospheric Sounding - STRatospheric aircraft (MIPAS-STR; Piesch et al., 1996) aboard the high-altitude research aircraft M55 Geophysica is the airborne sister instrument of MIPAS. Here, we use MIPAS-STR observations during the Arctic RECONCILE campaign (Reconciliation of essential process parameters for an enhanced predictability of Arctic stratospheric ozone loss and its climate interactions; von Hobe et al., 2013) for the validation of MIPAS observations. The characterization, calibration, retrieval and validation of the MIPAS-STR observations during the considered flight on 2 March 2010 are discussed by Woiwode et al. (2012). Characteristics of MIPAS-STR, the data processing and uncertainties of the retrieval results are briefly summarized in the following. Further information on MIPAS-STR is found in Keim et al. (2008), Woiwode et al. (2015) and references therein. MIPAS-STR employs four liquid He-cooled detectors/channels in the spectral range between 725 and 2100 cm⁻¹ (13.8 and 4.8 µm). The spectral sampling is 0.036 cm⁻¹. An effective spectral resolution of 0.069 cm⁻¹ (full width at half maximum) is obtained after applying the Norton-Beer strong apodization (Norton and Beer, 1976). Depending on the sampling program, the dense MIPAS-STR limb observations cover the vertical range between ∼ 5 km and flight altitude (in Arctic winter typically at 17 to 19 km geometrical altitude) and are complemented by upward-viewing observations. A complete limb scan including calibration measurements is recorded typically within 2.4 to 3.8 min. This corresponds to an along-track sampling of about 25-45 km.
Similar to the MIPAS data processing, the forward model KOPRA (Karlsruhe Optimized and Precise Radiative Transfer Algorithm; Stiller, 2000) and the inversion module KOPRAFIT (Höpfner et al., 2001), involving a first-order Tikhonov-type regularization, were used. The retrieval was performed sequentially, i.e., species with low spectral interference with other gases were retrieved first. Then, their mixing ratios were kept constant in the subsequent retrievals of the following species. Additional retrieval parameters were spectral shift and wavenumber-independent background continuum for each microwindow. Typical vertical resolutions of 1-2 km were obtained between the lowest tangent altitude and flight altitude.

Aura/HIRDLS data
The High Resolution Dynamics Limb Sounder (HIRDLS) was an instrument that performed observations aboard NASA's (National Aeronautics and Space Administration) Aura satellite (Gille et al., 2008). The satellite was launched into a sun-synchronous orbit at an altitude of 705 km. During launch, large parts (∼ 85 %) of the instrument's aperture were blocked by a plastic film that had become dislodged. This impacted both the performance of the radiometer and the geographical coverage of the observations. Useful vertical scans could only be performed at a single azimuth angle of 47° backward to the orbital plane on the far side of the sun. Hence, the latitudinal coverage was limited to 65° S to 82° N and, in the longitudinal domain, the coverage degraded to the orbital separation. On 17 March 2008 the instrument's chopper failed, ending the measurement period that started in January 2005.
Like MIPAS, HIRDLS measured the thermal emission at the atmospheric limb in the altitude range between 8 and 80 km. The instrument had 21 channels in the wavelength range between 566.9 and 1632.9 cm⁻¹ (17.64 and 6.12 µm). Profile data are retrieved with a maximum a posteriori retrieval based on the optimal estimation theory (Rodgers, 2000). In the present comparison, data from the retrieval version 7 are used (Gille et al., 2014). The single profile precision for both species is best between 200 and 100 hPa, with values in the range between 10 and 20 %. Below, the precision is of the order of 50 %, while above it degrades with increasing altitude to values of more than 100 %. The mean HIRDLS errors shown in the comparisons are derived from the variability of the retrieved species (Gille et al., 2014, Sects. 5 and 5.4), using the average of 10 sets of 12 consecutive profiles of regions with little variability. In total the HIRDLS data set comprises more than 6.3 million individual profiles that can be used for comparison with the MIPAS reduced resolution observations. A one-time normalization of the HIRDLS radiances relative to the Whole Atmosphere Community Climate Model (WACCM) was completed and applied to all the HIRDLS CFC radiances. This affects the absolute CFC values, but the morphologies and relative values are unchanged.

SCISAT/ACE-FTS data
The Atmospheric Chemistry Experiment Fourier Transform Spectrometer (ACE-FTS) is an instrument aboard the Canadian SCISAT satellite (Bernath et al., 2005). SCISAT was launched into a high-inclination (74°) orbit at 650 km altitude on 12 August 2003. The ACE-FTS instrument utilizes the solar occultation technique, measuring the attenuation of sunlight by the atmosphere during 15 sunsets and 15 sunrises a day in two latitude bands. The viewing geometry and the satellite orbit allow a latitudinal coverage between 85° S and 85° N over a year with a clear focus on midlatitudes and high latitudes. The instrument scans the atmosphere between the middle troposphere and 150 km, obtaining spectra in the wavelength range between 750 and 4400 cm⁻¹ (13.3 and 2.3 µm) with a spectral resolution of 0.02 cm⁻¹. The vertical sampling varies as a function of altitude and is also dependent on the beta angle, which is the angle between the orbit track and the direction the instrument has to look to see the sun. In the middle troposphere the sampling is around 1 km, between 10 and 20 km altitude it is typically between 2 and 3.5 km, and in the upper stratosphere and mesosphere the sampling declines to 5 to 6 km. The instrument has a field of view of 1.25 mrad, which corresponds to 3-4 km depending on the exact observation geometry.
In the comparisons, ACE-FTS data from the retrieval version 3.5 are employed, which currently cover the time period from early 2004 into 2013. The ACE-FTS retrieval uses a weighted nonlinear least-squares fit method in which pressure and temperature profiles are derived in a first step, followed by the volume mixing ratios of a vast number of species (Boone et al., 2005). The retrieval of CFC-11 data utilizes spectral information from four microwindows. The main window is located between 830 and 858 cm⁻¹ (12.05 and 11.65 µm), similar to the MIPAS IMK/IAA retrieval. The other microwindows are much smaller and are centered at 2976.5 cm⁻¹ (3.35 µm), 1977.6 cm⁻¹ (5.06 µm) and 1970.1 cm⁻¹ (5.08 µm). However, the latter microwindows do not contain information on CFC-11 but are included to improve the retrieval for interfering species (Boone et al., 2013). Individual profiles exhibit precisions within 5 % up to almost 20 km, increasing to 40-50 % at the highest altitudes covered. The ACE-FTS CFC-12 profiles are usually cut off at higher altitudes than the CFC-11 profiles, but exhibit similar precision estimates. The cut-off criteria for CFC-11 and CFC-12 are empirical functions [the two equations, one for CFC-11 and one for CFC-12, are missing from this text], where z_top,CFC-11 and z_top,CFC-12 are the altitudes (in kilometers) at which the profile is cut off for CFC-11 and CFC-12, respectively, and ϕ is the latitude. Overall, there are about 27 000 CFC-11 and CFC-12 profiles available for comparison, of which 375 cover the MIPAS FR period.

ADEOS-II/ILAS-II
The second version of the Improved Limb Atmospheric Spectrometer (ILAS-II) was a Japanese solar occultation instrument aboard the Advanced Earth Observing Satellite-II (ADEOS-II), also known as Midori-II (Nakajima et al., 2006). After more than 10 months, on 24 October 2003, the satellite failed due to a malfunction of the solar panels. ADEOS-II used a sun-synchronous orbit at 800 km altitude and an inclination of 98.7°, performing typically 14 orbits per day. The corresponding 28 occultations covered exclusively higher latitudes, i.e., polewards of 64° in the Southern Hemisphere and between 54 and 71° in the northern counterpart. The instrument consisted of four grating spectrometers, obtaining spectral information in the infrared (spectrometer 1: 6.21-11.76 µm/850-1610 cm⁻¹; spectrometer 2: 3.00-5.70 µm/1754-3333 cm⁻¹; spectrometer 3: 12.78-12.85 µm/778-782 cm⁻¹) and very close to the visible wavelength range (spectrometer 4: 753-784 nm/12 755-13 280 cm⁻¹). The instantaneous field of view was 1 km in the vertical domain and between 2 and 21.7 km in the horizontal domain, depending on the spectrometer.
In the comparisons we employ results from the latest retrieval version 3. The retrieval is based on an onion peeling method (Yokota et al., 2002). Multiple parameters for gases and aerosols are derived simultaneously on a 1 km altitude grid using a least-squares fit (Oshchepkov et al., 2006). The CFC results are based on the spectrum obtained by spectrometer 1, which is fitted in its entirety. For the comparisons with MIPAS more than 5600 individual profiles are available, covering the time period from April to October 2003 (Nakajima et al., 2006), and thus provide comparison measurements for the MIPAS FR period. For sunrise measurements, only measurements below 34 km were considered, as suggested by the data provider.

HATS data
HATS denotes the Halocarbons and other Atmospheric Trace Species group at NOAA's (National Oceanic and Atmospheric Administration) Earth System Research Laboratory in Boulder (USA). Since 1977 this group has conducted observations of surface levels of N₂O and several CFCs, providing a long-term reference (e.g., Elkins et al., 1993; Montzka et al., 1996). These measurements are analyzed by gas chromatography, either with electron capture detection or with detection by mass spectrometry. The observations started at six locations; currently 15 locations are covered on all continents except Asia. Data from different measurement techniques and instruments are combined to provide the longest possible monthly mean time series for the individual locations. In the comparison we check whether the tropospheric MIPAS observations exceed the upper volume mixing ratio limit that is given by the HATS observations and whether their temporal development is consistent. HATS data are available for more than a dozen stations during the MIPAS measurement period. Their measurements were weighted with the cosine of latitude, and an average was calculated for each month.

Validation methods
In order to reduce the influence of natural variability and sampling artifacts, the majority of the comparisons were performed using collocated pairs of measurements. In this study, the coincidence criteria applied in most cases were a maximum distance of 1000 km and a maximum time difference of 24 h. In the case of HIRDLS the criteria were tightened to a distance of 250 km and a time difference of 6 h, due to the large number of measurements of the instrument. No measurement is taken into account twice, meaning that only the best coincidence is taken in cases where two measurements of one instrument collocate with the same measurement of the other instrument. For MIPAS-B comparisons, diabatic 2-day forward and backward trajectories were calculated by the Free University of Berlin (J. Abalichin, private communication, 2014). The trajectories are based on ECMWF 1.25° × 1.25° analyses and start at different altitudes at the geolocation of the balloon observation to search for a coincidence with the satellite measurement along the trajectory path within a matching radius of 1 h and 500 km. Data of the satellite match have been interpolated onto the trajectory match altitude, such that these values can be directly compared to the MIPAS-B data at the trajectory start point. The MIPAS averaging kernels were not applied in any of the comparisons, for two reasons. First, most of the instruments used for comparison have a vertical resolution similar to that of MIPAS. In addition, the vertical profiles of CFC-11 and CFC-12 are very smooth. They do not contain any obvious extrema (as, for example, ozone does) and thus smoothing with the MIPAS averaging kernel was shown to have only minor effects on the profiles.
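The collocation step described above can be sketched as follows. This is a minimal illustration, not the matching code used for the paper: the function names, the tuple layout, the spherical-Earth radius and the sample coordinates are hypothetical, and "best coincidence" is assumed to mean the smallest distance, with the time difference as tie-breaker.

```python
import math
from datetime import datetime, timedelta

EARTH_RADIUS_KM = 6371.0  # mean radius, spherical Earth assumed


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def find_coincidences(mipas, reference, max_km=1000.0, max_hours=24.0):
    """Return index pairs (i_mipas, j_ref); each measurement is used at most once.

    Each measurement is a tuple (time, lat_deg, lon_deg).
    """
    candidates = []
    for i, (t_m, lat_m, lon_m) in enumerate(mipas):
        for j, (t_r, lat_r, lon_r) in enumerate(reference):
            dt_h = abs((t_m - t_r).total_seconds()) / 3600.0
            dist = great_circle_km(lat_m, lon_m, lat_r, lon_r)
            if dist <= max_km and dt_h <= max_hours:
                candidates.append((dist, dt_h, i, j))
    # Assumption: "best" = smallest distance, ties broken by time difference.
    candidates.sort()
    used_m, used_r, pairs = set(), set(), []
    for _dist, _dt_h, i, j in candidates:
        if i not in used_m and j not in used_r:
            pairs.append((i, j))
            used_m.add(i)
            used_r.add(j)
    return pairs


# Made-up geolocations, for illustration only
t0 = datetime(2009, 3, 10, 12, 0)
mipas_obs = [(t0, 67.0, 20.0), (t0 + timedelta(hours=2), 69.0, 25.0)]
ref_obs = [(t0 + timedelta(hours=1), 68.0, 21.0)]
print(find_coincidences(mipas_obs, ref_obs))
```

For the HIRDLS comparison, the same sketch would be called with `max_km=250.0` and `max_hours=6.0`.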
The data of remote sensing instruments used for the comparison were interpolated onto the MIPAS retrieval grid, which is a fixed altitude grid with 1 km spacing in the altitude range relevant for comparison of CFC-11 and CFC-12. When provided on an altitude grid, the instruments' measurements were interpolated linearly onto the MIPAS grid, while in the case of a pressure grid, the MIPAS pressure-altitude relation was used after logarithmic interpolation.
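The two regridding cases can be illustrated with a short sketch. It rests on an interpretation of the text: for pressure grids, the reference profile is assumed to be interpolated linearly in log(pressure) onto the pressures that MIPAS assigns to its grid altitudes. All function names and profile values below are hypothetical.

```python
import math


def interp_linear(x_new, x, y):
    """Piecewise-linear interpolation of y(x) at x_new; x must be ascending.

    Returns None outside the range of x (no extrapolation).
    """
    if x_new < x[0] or x_new > x[-1]:
        return None
    for k in range(len(x) - 1):
        if x[k] <= x_new <= x[k + 1]:
            w = (x_new - x[k]) / (x[k + 1] - x[k])
            return (1.0 - w) * y[k] + w * y[k + 1]
    return None


def altitude_profile_to_grid(alt_km, vmr, grid_km):
    """Linear interpolation of a profile on an altitude grid onto a fixed grid."""
    return [interp_linear(z, alt_km, vmr) for z in grid_km]


def pressure_profile_to_grid(ref_p_hpa, ref_vmr, mipas_p_hpa):
    """Interpolation, linear in log(pressure), onto the MIPAS pressure levels.

    mipas_p_hpa[k] is the MIPAS pressure at the k-th grid altitude.
    """
    # Sort by ascending log(p) so that interp_linear can be applied.
    pairs = sorted((math.log(p), v) for p, v in zip(ref_p_hpa, ref_vmr))
    logp = [lp for lp, _ in pairs]
    vmr = [v for _, v in pairs]
    return [interp_linear(math.log(p), logp, vmr) for p in mipas_p_hpa]


# Made-up CFC-11 profile in pptv on a coarse altitude grid
ref_alt = [10.0, 13.0, 16.0, 19.0]
ref_vmr = [240.0, 230.0, 200.0, 140.0]
grid = [float(z) for z in range(10, 20)]  # 1 km spacing
print(altitude_profile_to_grid(ref_alt, ref_vmr, grid))
```

Returning `None` outside the covered range keeps the sketch from extrapolating a reference profile beyond its own vertical coverage.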
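For the comparison of MIPAS-STR, HIRDLS, ACE-FTS and ILAS-II, we use the following statistics, where $x_{\mathrm{MIPAS},i}$ and $x_{\mathrm{ref},i}$ denote the $i$-th of $n$ collocated pairs of measurements: the mean difference:

$$\overline{x} = \frac{1}{n}\sum_{i=1}^{n}\left(x_{\mathrm{MIPAS},i}-x_{\mathrm{ref},i}\right), \qquad (3)$$

the standard deviation (SD) of the differences:

$$\sigma_{x} = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_{\mathrm{MIPAS},i}-x_{\mathrm{ref},i}-\overline{x}\right)^{2}}, \qquad (4)$$

the standard error of the mean differences:

$$\sigma_{\overline{x}} = \frac{\sigma_{x}}{\sqrt{n}}, \qquad (5)$$

and the combined error of the measurements:

$$\sigma_{\mathrm{comb}} = \sqrt{\sigma_{\mathrm{mean,MIPAS}}^{2}+\sigma_{\mathrm{mean,ref}}^{2}}, \qquad (6)$$

where $\sigma_{\mathrm{mean}}$ are the averaged random errors of the respective instruments. $\sigma_{\overline{x}}$ is the adequate quantity for the assessment of the mean difference. Correspondingly, if the combined error is smaller than the standard deviation of the differences, this hints at error estimates being too small, e.g., if not all sources of errors are considered or the retrieval error is underestimated. Since the measurements are not taken exactly at the same location and time, natural variability also contributes to differences between the combined error and the standard deviation (von Clarmann, 2006).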
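The four statistics named above (mean difference, standard deviation of the differences, standard error of the mean difference and combined error) can be computed for one altitude level as in the following sketch. The sign convention (MIPAS minus reference), the use of the sample standard deviation, and the arithmetic averaging of the random errors are assumptions made for illustration; the input values are made up.

```python
import math
from statistics import mean, stdev


def comparison_statistics(mipas, ref, err_mipas, err_ref):
    """Statistics for collocated pairs at one altitude level.

    mipas, ref: collocated mixing ratios (same length, at least two pairs)
    err_mipas, err_ref: random error estimates of the individual measurements
    """
    diffs = [m - r for m, r in zip(mipas, ref)]  # assumed sign: MIPAS minus reference
    n = len(diffs)
    mean_diff = mean(diffs)
    sd_diff = stdev(diffs)            # sample SD, n - 1 in the denominator
    std_err = sd_diff / math.sqrt(n)  # standard error of the mean difference
    # Combined error from the averaged random errors of both instruments
    combined = math.sqrt(mean(err_mipas) ** 2 + mean(err_ref) ** 2)
    return {"mean_diff": mean_diff, "sd": sd_diff, "std_err": std_err, "combined": combined}


# Made-up CFC-11 values in pptv, for illustration only
stats = comparison_statistics(
    mipas=[230.0, 225.0, 240.0, 235.0],
    ref=[228.0, 229.0, 236.0, 231.0],
    err_mipas=[3.0, 3.0, 3.0, 3.0],
    err_ref=[4.0, 4.0, 4.0, 4.0],
)
print(stats)
```

In this made-up case the combined error (5 pptv) exceeds the standard deviation of the differences (about 3.8 pptv), so the error estimates would be sufficient to explain the observed scatter.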
For comparisons to the HATS network, MIPAS measurements at 3 km below the tropopause are used. The altitude where the tropopause is located was calculated from each MIPAS temperature profile as follows.
- Between 25° S and 25° N the altitude at 380 K potential temperature was used.
- At higher latitudes the WMO criterion was used, i.e., the altitude where the vertical temperature gradient drops below 2 K km⁻¹ and remains that small within a layer of 2 km.
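The two tropopause definitions listed above can be sketched as follows. This is a simplified illustration under stated assumptions: the lapse-rate test follows the wording of the text rather than the full WMO definition, potential temperature is computed with a reference pressure of 1000 hPa and κ = 0.286, and all function names and profile values are hypothetical.

```python
def wmo_tropopause_km(alt_km, temp_k, threshold=2.0, layer_km=2.0):
    """Lowest level where the lapse rate is at or below the threshold (K/km)
    and the mean lapse rate to every higher level within layer_km stays there."""
    n = len(alt_km)
    for k in range(n - 1):
        # Lapse rate = -dT/dz between level k and the next level
        lapse = -(temp_k[k + 1] - temp_k[k]) / (alt_km[k + 1] - alt_km[k])
        if lapse > threshold:
            continue
        ok = True
        for m in range(k + 1, n):
            dz = alt_km[m] - alt_km[k]
            if dz > layer_km:
                break
            if -(temp_k[m] - temp_k[k]) / dz > threshold:
                ok = False
                break
        if ok:
            return alt_km[k]
    return None


def theta_level_km(alt_km, temp_k, pres_hpa, theta_target=380.0):
    """Altitude where potential temperature first reaches theta_target (K)."""
    kappa = 0.286  # R/cp for dry air (assumed value)
    theta = [t * (1000.0 / p) ** kappa for t, p in zip(temp_k, pres_hpa)]
    for k in range(len(theta) - 1):
        if theta[k] < theta_target <= theta[k + 1]:
            w = (theta_target - theta[k]) / (theta[k + 1] - theta[k])
            return alt_km[k] + w * (alt_km[k + 1] - alt_km[k])
    return None


def tropopause_km(lat_deg, alt_km, temp_k, pres_hpa):
    """380 K level within 25 degrees of the equator, lapse-rate criterion elsewhere."""
    if abs(lat_deg) <= 25.0:
        return theta_level_km(alt_km, temp_k, pres_hpa)
    return wmo_tropopause_km(alt_km, temp_k)


# Made-up midlatitude profile: 6.5 K/km lapse rate up to 11 km, isothermal above
alt_mid = [float(z) for z in range(6, 15)]
temp_mid = [250.0, 243.5, 237.0, 230.5, 224.0, 217.5, 217.5, 217.5, 217.5]
print(tropopause_km(50.0, alt_mid, temp_mid, None))
```

The comparison altitude would then be the returned tropopause altitude minus 3 km.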
The MIPAS CFC mixing ratio at 3 km below the tropopause altitude is chosen for each MIPAS profile. Cases for which the estimation of the tropopause height went obviously wrong were rejected. All available MIPAS measurements are used. To increase comparability of the data sets, monthly zonal means were calculated from MIPAS measurements in 10° bins. In addition, these zonal means (and their standard deviation) were weighted with the cosine of the latitude to simulate the approach performed for the HATS data. Since some of the MIPAS detectors were shown to have time-dependent nonlinearity correction functions due to detector aging (Eckert et al., 2014), we estimated drifts caused by this feature from a small subset of data. The comparison with HATS exhibits differences in the trends of MIPAS and the HATS time series. We compared the differences in these trends with the drift estimated due to detector aging. For the latter we calculated the mean drift by interpolating the drifts to 3 km below the tropopause and weighting them with the cosine of the latitude.
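The binning and weighting described above can be illustrated with a short sketch for a single month. The function names and the sample values are hypothetical, and the weight is assumed to be the cosine of the bin-centre latitude.

```python
import math
from collections import defaultdict


def zonal_bin_means(lats_deg, values, bin_width=10.0):
    """Mean value per latitude bin; keys are bin-centre latitudes in degrees."""
    n_bins = int(180.0 / bin_width)
    bins = defaultdict(list)
    for lat, val in zip(lats_deg, values):
        # Clamp so that a latitude of exactly 90 degrees falls into the last bin
        idx = min(int((lat + 90.0) // bin_width), n_bins - 1)
        bins[-90.0 + (idx + 0.5) * bin_width].append(val)
    return {centre: sum(v) / len(v) for centre, v in sorted(bins.items())}


def cosine_weighted_mean(bin_means):
    """Average of the bin means, weighted with the cosine of the bin-centre latitude."""
    weights = {c: math.cos(math.radians(c)) for c in bin_means}
    total = sum(weights.values())
    return sum(weights[c] * bin_means[c] for c in bin_means) / total


# Made-up CFC-11 mixing ratios in pptv at 3 km below the tropopause
lats = [-3.0, 4.0, 62.0, 68.0]
vmr = [245.0, 243.0, 238.0, 236.0]
means = zonal_bin_means(lats, vmr)
print(means)
print(cosine_weighted_mean(means))
```

The cosine weight approximates the surface area represented by each latitude band, so the tropical bins dominate the resulting average.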

Validation results
The comparisons between MIPAS and the independent measurements were performed by applying the validation schemes described in Sect. 3. The results of these comparisons are discussed in the following, first for CFC-11 and, subsequently, for CFC-12. The mean distance and time for comparisons based on collocated measurements (e.g., with MIPAS-STR, HIRDLS, ACE-FTS and ILAS-II) are shown in Table 2.

Results CFC-11: cryosampler
Multiple MIPAS profiles are compared to those of the cryosampler (Fig. 2). For each cryosampler profile (black dots), several MIPAS profiles meet the coincidence criteria (blue-grayish lines). The latter cover a considerable range of variability. The closest MIPAS profile (blue solid line) matches the cryosampler profile remarkably well in all five cases, with maximum differences of 30 pptv (∼ 13 %), except for the 10-15 km region on 3 October 2009 (right column, bottom panel). In addition, the mean of all coincident MIPAS profiles (red line) agrees reasonably well with the cryosampler profile, suggesting that the air within the entire region meeting the coincidence criteria is well represented by the cryosampler. In contrast, the respective seasonal zonal mean of MIPAS measurements (light orange line) occasionally deviates considerably from the actual measurements, particularly on 1 April 2011. On the other hand, the same seasonal mean (March/April/May 2005-2011) can agree well with the collocated mean and the closest MIPAS profile as well as the cryosampler measurement, as in the comparison on 10 March 2009 (lower right panel). This confirms that both the cryosampler and MIPAS can reliably detect atmospheric conditions deviating largely from the climatological state. Similar patterns were found for the 1 April 2011 comparison of MIPAS and cryosampler for other trace gases for this specific cryosampler flight (Chirkov et al., 2016). In this particular case, strong stratospheric subsidence has led to extraordinarily low mixing ratios of CFCs. This uncommon atmospheric situation went along with substantial ozone destruction (Manney et al., 2011; Sinnhuber et al., 2011).

Results CFC-11: Mark IV
Only one measurement of the balloon-borne MkIV instrument (black line in Fig. 3) coincides with MIPAS measurements during the RR period. Three collocated profiles of MIPAS (blue-grayish lines) were found, of which the mean profile (red line) and the closest profile (blue line) are also shown. Up to approximately 25 km, the MkIV profile reports higher mixing ratios than all of the MIPAS profiles, especially compared with the closest MIPAS profile. However, the gradients of the MkIV profile and all MIPAS profiles are very much alike between ∼ 17 and 24 km. In contrast to the comparison with the cryosampler, the closest coincident MIPAS profile deviates most from the MkIV profile throughout the whole altitude range. While the three collocated measurements lie within the MkIV error bars from the lower end of the profiles up to ∼ 17 km, this is generally not the case from that altitude upwards, except around the crossing point of the MkIV profile with the MIPAS profiles at about 25-26 km. Up to that altitude, the MkIV profile exhibits higher mixing ratios of CFC-11 than MIPAS, while above, MkIV shows lower, mostly negative values. However, the differences with the MIPAS mean profile rarely exceed 20 pptv, except for around 20 km where we find deviations of up to 30 pptv. This corresponds to less than 10 % at the lower end of the profile and up to 15 % around 20 km. Velazco et al. (2011) found similar differences in their comparisons of ACE-FTS and MkIV, which are based on non-coincident validation using a potential vorticity/potential temperature (PV/Theta) coordinate system (Manney et al., 2007). They also find the largest deviations of the profiles around or slightly below 20 km, with maximum differences of up to ∼ 18 % and minimum differences of the order of ∼ 5 % around 17 km. Above 20 km, the mean profile of MIPAS and the MkIV profile agree well. Differences mainly stay within 10 %, except for above 26 km where MkIV mixing ratios become negative.
Considering the small number of coincident MIPAS profiles (3), the instruments agree reasonably well below 20 km and well between 20 and 26 km.

Results CFC-11: MIPAS-B
For the comparison with two independent measurements of MIPAS-B, trajectory-corrected profiles of the instrument were used (Fig. 4, Sect. 3). In the comparison for the MIPAS-B flight of 24 January 2010, the agreement with MIPAS is remarkably good above ∼ 18 km as it stays well within 10 pptv. Below this altitude the mean profile of all collocated MIPAS measurements (Fig. 4: upper left panel; solid red line) shows higher values than the MIPAS-B profile (solid black line). However, the values of all collocated MIPAS profiles (red squares) cover a wider range, such that the MIPAS-B profile lies within their spread at all altitudes. The profiles deviate by approximately 30 pptv (∼ 30 %) at the largest around 16-17 km (middle and right panel) and stay within 20 pptv (corresponding to ∼ 10 % at the lower end of the profile) for the rest of the covered altitude range. Throughout the whole vertical extent, MIPAS shows higher mixing ratios of CFC-11. However, the bias rarely exceeds the standard deviation of the differences. Large percentage errors above 19 km occur due to division by very small absolute amounts of CFC-11 at these altitudes. The combined error shows the total error estimate for both instruments. However, the error component of the spectroscopy was left out, since both instruments use the same spectroscopic data, meaning this effect should cancel out and thus cannot explain possible differences between the combined error and the standard deviation. Below 15 km, the combined error of the two instruments and the standard deviation of the differences are similar, except for the lowermost point of the profile. Above this altitude the combined error is considerably smaller than the standard deviation, hinting at an underestimation of the error by one or both of the instruments. Natural variability cannot be excluded as a source of the deviation either. The MIPAS profile is smoother, presumably due to several measurements being averaged to a mean profile. The flight on 31 March 2011 (Fig. 4, lower panels) supports the conclusions drawn from the first comparison. Maximum differences are slightly larger (close to 35 pptv at the largest). The largest deviations between the MIPAS mean profile and the measurements of MIPAS-B appear at altitudes around 13 km, and, here, exceed the standard deviation of the differences. Around 17 km a second peak occurs in the differences, which is at similar altitudes as for the first comparison. As in the first comparison, the combined error and the standard deviation of the differences are very close below ∼ 15 km, while the standard deviation of the differences is significantly larger than the combined error above this altitude. This hints at an underestimation of the error by one or both instruments above ∼ 15 km rather than at natural variability. In general, both comparisons support the impression of MIPAS showing slightly higher values of CFC-11 below ∼ 18 km, even though the MIPAS-B profile is still included within the spread of all MIPAS collocated profiles (left panel: red squares). The shape of the profiles, in terms of slope and reversal points, agrees well for both comparisons. Differences might be due to horizontal viewing direction and/or horizontal smoothing by the MIPAS-B measurement, since the observations are combined using trajectories which are associated with the localized coordinates. This is most important in the presence of pronounced atmospheric structures and strong gradients, e.g., the mixing barrier associated with the polar vortex.
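The diagnostic used throughout these comparisons, i.e., the mean difference (bias), the standard deviation of the differences and the combined error of both instruments, can be sketched for a single altitude level as follows. This is a minimal illustration, assuming that the two error estimates are independent and therefore add in quadrature; all numbers are made up and the function names are not taken from any retrieval software.

```python
import math
import statistics


def difference_stats(mipas, reference):
    """Bias and sample standard deviation of paired differences (MIPAS minus reference)."""
    diffs = [m - r for m, r in zip(mipas, reference)]
    return statistics.fmean(diffs), statistics.stdev(diffs)


def combined_error(err_a, err_b):
    """Combined error of two independent estimates, added in quadrature."""
    return math.hypot(err_a, err_b)


# Illustrative mixing ratios in pptv at one altitude level (not measured values).
mipas_vmr = [252.0, 248.0, 260.0, 255.0, 245.0]
reference_vmr = [240.0, 245.0, 250.0, 238.0, 247.0]

bias, sd_of_differences = difference_stats(mipas_vmr, reference_vmr)
sigma_combined = combined_error(3.0, 4.0)  # hypothetical per-instrument errors in pptv

# If the scatter of the differences exceeds the combined error, either the
# error budgets are underestimated or natural variability contributes.
scatter_unexplained = sd_of_differences > sigma_combined
```

An error component shared by both instruments, such as the spectroscopy error mentioned above, would be omitted from both inputs to `combined_error` because it cancels in the differences.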

Results CFC-11: MIPAS-STR
Seven profile pairs of collocated measurements were found for comparisons of MIPAS with MIPAS-STR (Fig. 5). The comparison is performed using mean profiles, rather than comparing each set of collocated pairs. Since MIPAS-STR profiles were originally retrieved on a finer altitude grid (left panel; blue line) than MIPAS profiles (red line), these profiles were interpolated onto the MIPAS grid (black line). The agreement of the profiles is good and the vertical structure is similar, showing minimum differences around 16-17 km for both instruments. Differences are largest at the bottom end of the profiles at 8 km (middle panel). However, they do not exceed 30 pptv (corresponding to up to ∼ 15 % below 12 km and up to ∼ 20 % around 14 km at the largest) throughout the rest of the profile and are not significant for the majority of the altitude levels. Above 14 km, the differences mainly stay within 10 pptv, corresponding to ∼ 3 to 15 %. The mean difference oscillates around zero, which is most pronounced at altitudes below ∼ 15 km. As for the comparison with MIPAS-B, the total error is shown for both instruments but the spectroscopy error is left out because MIPAS and MIPAS-STR use the same spectroscopic data. The standard deviation of the differences (right panel; brown line) exceeds the estimated combined error (purple line) for most of the covered altitude range. This difference is more likely due to natural variability than to an underestimation of the error budget of either instrument, since there is also a region (around 12 km) where the opposite is the case, even though the mean distance and time difference are only about 170 km and 1 h 45 min, respectively (see Table 2).
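Regridding a profile from a finer altitude grid onto the MIPAS grid, as described above, can be sketched with plain linear interpolation. This is an assumption made for illustration: the text does not state which interpolation scheme was used, and the grids and values below are invented.

```python
from bisect import bisect_right


def interpolate_to_grid(src_alt, src_vmr, target_alt):
    """Linearly interpolate a profile onto target altitudes.

    src_alt must be strictly increasing. Target levels outside the source
    range yield None instead of being extrapolated.
    """
    out = []
    for z in target_alt:
        if z < src_alt[0] or z > src_alt[-1]:
            out.append(None)
            continue
        i = bisect_right(src_alt, z) - 1
        if i == len(src_alt) - 1:  # z equals the topmost source level
            out.append(float(src_vmr[i]))
            continue
        z0, z1 = src_alt[i], src_alt[i + 1]
        w = (z - z0) / (z1 - z0)
        out.append(src_vmr[i] + w * (src_vmr[i + 1] - src_vmr[i]))
    return out


# Hypothetical fine-grid profile (km, pptv) and a coarser target grid.
fine_alt = [8.0, 8.5, 9.0, 9.5, 10.0]
fine_vmr = [250.0, 248.0, 244.0, 240.0, 230.0]
coarse_alt = [8.0, 9.0, 9.25, 10.0, 11.0]

regridded = interpolate_to_grid(fine_alt, fine_vmr, coarse_alt)
```

Returning `None` above the top of the source profile keeps missing levels explicit, which matters when mean profiles are later formed from profiles with different vertical coverage.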

Results CFC-11: HIRDLS
The results of the comparison of MIPAS CFC-11 with that of HIRDLS are displayed in Figs. 6-8. Figure 6 shows that the HIRDLS profiles scatter the most at the ends of the profiles, e.g., at rather high altitudes (around ∼ 30 km; blue-greenish points) and the lowermost altitudes (around ∼ 10 km; red-yellowish points). It is also apparent that the measurements of HIRDLS CFC-11 cover a large range of values at all altitudes, which is evident in the large scatter throughout the whole vertical extent, with the largest spread at the lower end of the profiles, i.e., at high CFC-11 mixing ratios. Negative CFC-11 values do not exist in the HIRDLS results because the retrieval for the volume mixing ratio is logarithmic. The histograms shown in Fig. 7 give a more detailed picture of the frequency distributions of the CFC-11 mixing ratios of MIPAS (top panels) and HIRDLS (bottom panels) measurements at 16 km (left panels) and 23 km (right panels). The mean and the median are close in all four cases. Both at 16 and at 23 km, MIPAS seems to see a bimodal distribution (which is much more pronounced at 23 km), while HIRDLS only exhibits one obvious peak at 16 km and a slight shoulder, which seems to be a smeared-out second mode, at 23 km. In both cases HIRDLS does not see the distinct second mode at higher values visible in MIPAS measurements around 250 pptv at 16 km and around 150 pptv at 23 km. The peak at lower mixing ratios appears around similar values for both instruments, slightly below 200 pptv at 16 km and between 0 and 50 pptv at 23 km. The maximum is shifted slightly towards lower mixing ratios in the case of HIRDLS. The comparison of the mean profiles (Fig. 8, left panel), which are calculated from more than 90 000 collocated profiles of HIRDLS (black) and MIPAS (red) over all latitudes, shows good agreement of the two instruments down to ∼ 16 km. Deviations stay within 10-15 pptv above this altitude. Below, MIPAS continuously shows higher mixing ratios of CFC-11 than HIRDLS (middle panel), with differences reaching as high as 60 pptv at the bottom end of the profile. This presumably reflects the more pronounced second mode in the MIPAS frequency distribution (Fig. 7). However, MIPAS CFC-11 mixing ratios are no more than 40 pptv (∼ 20 %) larger than those of HIRDLS at altitude ranges between 9 and 16 km. At the bottom end of the profiles, the largest deviations of the mean profiles of MIPAS and HIRDLS can be found. The error bars (left panel) shown for MIPAS depict the total error, while HIRDLS error bars represent an estimated error, derived from 10 sets of 12 consecutive profiles at regions of little variability (Gille et al., 2014). In the right panel, the combined error is compared to the standard deviation of the differences. The covered vertical range of the combined error is smaller, since HIRDLS error estimates were only given for these altitude levels. Presumably, a combination of an underestimation of either or both error budgets and natural variability results in the differences between the combined error and the standard deviation of the differences. Because the coincidence criteria allow for certain differences in time and geolocation, the mean distance between the collocated measurements is approximately 200 km and the time difference is nearly 3 h (Table 2). However, this effect is presumably less important than for the comparison with, e.g., ACE-FTS, for which the mean distance and time difference are about twice as large as for the comparison with HIRDLS. Overall, the agreement of MIPAS and HIRDLS CFC-11 measurements is excellent down to approximately 15 km as differences rarely exceed 10 pptv. Below that altitude, MIPAS exhibits a slight high bias. However, it is important to remember that HIRDLS CFC radiances have been normalized using WACCM, so slight biases might also occur because of this.

Results CFC-11: ACE-FTS
The correlation between MIPAS and ACE-FTS CFC-11 measurements (Fig. 9) is very close to linear, even though MIPAS measures slightly higher CFC-11 values in general. This is most obvious at higher CFC-11 mixing ratios, e.g., at lower altitudes (red-yellowish points) where the correlation is slightly off the 1:1 relation. The values do not scatter as much as for HIRDLS, presumably because the signal-to-noise ratio of ACE-FTS is better, since it measures in occultation. The distributions of the mixing ratios at 16 km (Fig. 10: left panels) and 23 km (right panels) agree reasonably well for the two instruments. The skewness is very similar for both instruments, but the multimodal scheme is more pronounced for ACE-FTS at 16 km. A frequency maximum of mixing ratios appears slightly below 200 pptv in the case of MIPAS and between 150 and 200 pptv in the case of ACE-FTS. There is a second peak around 250 pptv in the ACE-FTS measurements which is less pronounced in the MIPAS values. At 23 km, both instruments show a bimodal distribution of the mixing ratios, with values peaking between 0 and 50 pptv and close to 150 pptv. The ACE-FTS frequency distribution exhibits an additional peak at negative values, which are unphysical. The upper limit of the ACE-FTS CFC-11 retrieval for the polar region is 23 km. For these occultations, the spectrum presumably contains little CFC-11 signal near 23 km and the retrieval is possibly compensating for some effect (e.g., bad residual from one of the interferers, mild channeling in the interferometer, a contribution to the spectral region from the aerosol layer) by giving negative CFC-11 mixing ratios. Similar to HIRDLS, the main mode at 23 km is shifted to slightly lower values in the case of ACE-FTS compared to MIPAS.
The mean profile comparison (Fig. 11) supports the conclusion from Fig. 9 that MIPAS sees higher volume mixing ratios of CFC-11. This is most pronounced at lower altitudes, approximately below ∼ 17-18 km (left and middle panel), where MIPAS CFC-11 mixing ratios (red line) are about 20 pptv (less than 10 %) higher than those of ACE-FTS, both compared to ACE-FTS on its original grid (blue line) and interpolated onto the MIPAS grid (black line). Again, MIPAS error bars represent the total error. The ACE-FTS errors are the random errors from the least-squares fitting process, i.e., the square roots of the diagonal elements of the covariance matrix. In addition, the reported error of the version 3.5 ACE-FTS data contains a further term. This term is derived from the difference between a retrieved CO2 volume mixing ratio (VMR) profile and the assumed CO2 VMR profile employed in the pressure/temperature retrieval and is a measure of the ability of the retrieval system to reproduce the fixed input profile for the given occultation (Boone et al., 2013). The right-hand panel shows that the combined error (purple line) is often larger than the standard deviation of the differences (brown line) for almost the complete altitude range. This suggests that for one or both of the instruments, the error budget is overestimated, but rather large natural variability compensates the effect where the standard deviation of the differences exceeds the combined error. The latter plays a more important role than for, e.g., HIRDLS, since the coincidence criteria for ACE-FTS with MIPAS are considerably less strict than those of the HIRDLS comparison; the mean distance and mean time difference are about 350 km and more than 6 h, respectively (Table 2), and thus about twice as large as those of the HIRDLS comparison.
Around 25 km (left panel) one can see a feature not known from any previous CFC-11 profiles, represented as a bump of suddenly increasing values. This increase in CFC-11 around 25 km does not originate from an actual atmospheric state, but is simply a sampling issue. ACE-FTS profiles are cut off at the upper end, when the mixing ratios become too small to be retrieved satisfactorily. Since CFC-11 values are largest in the tropics, the profiles are cut off at higher altitudes than in polar regions; i.e., above 23 km only tropical (higher) values are shown. However, Fig. 11 shows the global mean of all collocated ACE-FTS and MIPAS profiles. Hence, around 25 km the mean is suddenly strongly dominated by tropical profiles, dragging it to higher values. Furthermore, it is admittedly not intuitive that regridding systematically adds a bias to the ACE-FTS profiles (interpolation from blue to black line). This shift towards mixing ratios valid at approximately 0.5 km below does not appear in the interpolated single profiles but only in the mean of the interpolated profiles. This is a pure sampling effect caused by the same mechanism as the artificial bump explained above: due to the resampling on the MIPAS grid, the ACE-FTS cut-off altitude, and thus the bump, is shifted 500 m downwards.
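The sampling effect described above can be reproduced with a toy calculation. In the sketch below, every individual profile decreases monotonically with altitude, yet the mean over all available profiles jumps upwards at the level where the lower-valued profiles are cut off. The profile values, the cut-off altitude and the ratio of tropical to polar profiles are invented for illustration only.

```python
def mean_of_available(profiles, n_levels):
    """Mean over all profiles that report a value at each level (None = cut off)."""
    means = []
    for i in range(n_levels):
        values = [p[i] for p in profiles if p[i] is not None]
        means.append(sum(values) / len(values) if values else None)
    return means


altitudes_km = [21, 22, 23, 24, 25]

# Hypothetical profiles in pptv: tropical profiles carry higher values and
# reach higher altitudes; polar profiles are cut off above 23 km.
tropical = [200.0, 180.0, 160.0, 140.0, 120.0]
polar = [60.0, 40.0, 20.0, None, None]

profiles = [tropical, polar, polar, polar]
global_mean = mean_of_available(profiles, len(altitudes_km))
# The mean drops to 55 pptv at 23 km and then jumps to 140 pptv at 24 km,
# although no single profile increases with altitude.
```

Averaging within latitude bands removes the jump, since each band then contains profiles with similar cut-off altitudes.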
Overall, the MIPAS and ACE-FTS CFC-11 measurements agree reasonably well as the differences stay within 20 pptv over almost the entire altitude range. This comparison supports the conclusion from other comparisons that MIPAS has a slight high bias at the lower end of the profile, even though the effect is far less pronounced. If the comparison is broken down into latitude bands (Fig. A2) the bump disappears. In addition, this breakdown into several latitude bands indicates that the tendency of MIPAS to detect higher amounts of CFC-11 at the lower end of the profile is more pronounced at higher latitudes. Similar results have been found by Tegtmeier et al. (2016), who also find a slight high bias in their comparison of MIPAS and a multi-instrument mean (MIM) CFC-11 that seems to be more pronounced at higher latitudes. This feature is also visible in the latitudinal breakdown of the comparison with HIRDLS (Fig. A1).
The behavior of the tropical profiles in these figures is also interesting. Compared to ACE-FTS, the MIPAS profile shows slightly increasing CFC-11 mixing ratios up to ∼ 15 km. An increase, from the bottom of the profile upwards, is also visible in ACE-FTS, but it is far less pronounced. The latitudinal breakdown for HIRDLS and MIPAS shows that this increase is most pronounced in HIRDLS. This behavior of the mean profile is suspicious, because CFC-11 is well mixed and its mixing ratios are therefore expected to be constant throughout the troposphere. It might hint at problems concerning the retrieval and/or spectroscopic data in this region.

Results CFC-11: HATS
The high bias of MIPAS CFC-11 below approximately 15-17 km detected so far is further quantified by comparison to ground-based measurements of the HATS network (Fig. 12). Similar mixing ratios of stable source gases are to be expected at the surface and in the upper troposphere. Instead, the mean of the MIPAS measurements (continuous red line with large red circles) is about 10 to 15 pptv (∼ 5 %) higher than the mean of the data collected by the HATS network (continuous black line). Since the troposphere is well mixed, these values should agree well, which indicates a slight high bias of the MIPAS measurements. Both MIPAS and the HATS data exhibit a descending slope during the RR period (2005-2012 in Fig. 12) in their time series, but the decrease in MIPAS measurements seems to be slightly steeper. This effect is slightly more pronounced than the estimated drift at this altitude (Sect. 7, Fig. 34, left panel). Absolute drifts due to detector aging at 3 km below the tropopause were estimated to be −3.58 pptv decade⁻¹. The drift estimated from the difference in the trend in Fig. 12 (see Sect. 3 for details on the method). Therefore, only part of the difference in the trends can be explained by the drift resulting from detector aging. However, the drift estimate due to detector aging is only based on drifts between 35° S and 35° N. Trends in the comparison with HATS result from measurements with almost pole-to-pole coverage. Thus, the comparison between the drift due to detector aging and the difference in the trends can only serve as an approximation. The amplitude of periodic variations is slightly more pronounced in MIPAS measurements, but qualitatively, both instruments agree well. The standard deviation of the MIPAS data (dashed red line with small red circles) shows that the spread is rather large, which is not surprising, considering that the mean includes all MIPAS measurements within this time period, which have a wider spread than the locally confined HATS measurements. Even though some HATS data lie within the standard deviation of the MIPAS measurements, the difference is obviously systematic, supporting the finding that MIPAS CFC-11 is too high in the upper troposphere.
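The idea of estimating a relative drift from the difference between two linear trends can be sketched as below. This is a deliberately simplified illustration using an ordinary least-squares slope on synthetic annual means; it is not the method of Sect. 3, which may account for additional terms such as periodic variations, and none of the numbers are measured values.

```python
def ols_slope(x, y):
    """Ordinary least-squares slope of y against x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


# Synthetic annual means in pptv (illustrative only).
years = [0.0, 1.0, 2.0, 3.0, 4.0]
satellite = [255.0, 252.5, 250.0, 247.5, 245.0]  # declines by 2.5 pptv per year
surface = [243.0, 241.0, 239.0, 237.0, 235.0]    # declines by 2.0 pptv per year

# A steeper decline in the satellite record shows up as a negative trend difference.
trend_difference_per_decade = 10.0 * (ols_slope(years, satellite) - ols_slope(years, surface))
```

With these synthetic series the trend difference is −5 pptv per decade, which would then be compared with an independent drift estimate such as the one attributed to detector aging.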

CFC-11: high spectral resolution time period (FR)
Due to data availability we only compare MIPAS CFC measurements during the FR period with those of MkIV, ACE-FTS, ILAS-II and HATS.

Results CFC-11 V5H: MkIV
During the high spectral resolution (FR) period, two MkIV measurements are coincident with several MIPAS measurements (Fig. 13). While 16 MIPAS profiles were found to coincide with the MkIV profile taken on 16 December 2002, as many as 25 matches were found for the MkIV measurement taken on 1 April 2003. The color coding is the same as in Fig. 3, showing collocated MIPAS measurements (blue-grayish lines), the mean of these profiles (red line) and the closest MIPAS profile (blue line) compared to the corresponding MkIV measurement (black line). The agreement is excellent up to 15-16 km with differences of less than 20 pptv (up to 10 %), while above that altitude MIPAS shows considerably higher values than MkIV for the 16 December 2002 measurement of MkIV. Above 21 km, MkIV even shows negative values at some altitude levels. The second comparison shows larger differences approximately around 15 km, but the agreement with the mean profile of the coincident MIPAS measurements is excellent below that altitude and up to about 20 km. Deviations of MkIV from the MIPAS mean profile range up to ∼ 30 pptv in both cases, while larger differences show up for comparisons to the closest MIPAS profile on 1 April 2003. These differences exceed 50 pptv around 15 km. However, the agreement between MIPAS and MkIV measurements of CFC-11 is similarly good for the FR and the RR period.

Results CFC-11 V5H: ACE-FTS
For the comparison of MIPAS CFC-11 with ACE-FTS, 171 profile pairs matching the coincidence criteria were found during the FR period (Fig. 14). As in the case of the MIPAS RR period, the ACE-FTS data were interpolated from their original grid (left panel: blue line) onto the MIPAS grid (black line) and were, after averaging, compared to MIPAS data (red line). Between 10 and 20 km the agreement between the two mean profiles is excellent, while below and above, MIPAS shows higher mixing ratios of CFC-11. From 10 up to 20 km, deviations of the mean profiles mostly stay within 10-20 pptv (middle panel), corresponding to ∼ 5 % around 10 km and ∼ 30 % around 20 km. Above and below, the differences are larger and sometimes exceed 30 pptv. Even though the standard error of the mean differences is considerably larger than for the RR period (due to far fewer pairs of collocated profiles), it does not include zero for most of the covered altitude range, indicating that the deviation of the profiles is still significant. The error budget of one of the instruments or both is overestimated below 15 km. This is more pronounced than for the comparison of MIPAS with ACE-FTS during the RR period.
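The significance argument above rests on the standard error of the mean difference, which shrinks with the square root of the number of profile pairs. A minimal sketch follows, assuming a simple criterion in which the mean difference is called significant when its magnitude exceeds `k` standard errors; the factor `k = 2` and all numbers are illustrative assumptions, not values from this comparison.

```python
import math
import statistics


def mean_with_standard_error(diffs):
    """Mean difference and its standard error (sample standard deviation / sqrt(n))."""
    mean = statistics.fmean(diffs)
    sem = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return mean, sem


def is_significant(diffs, k=2.0):
    """True if the interval mean +/- k * SEM does not include zero."""
    mean, sem = mean_with_standard_error(diffs)
    return abs(mean) > k * sem


# Illustrative differences in pptv at one altitude level.
few_pairs = [12.0, 3.0, 10.0, 17.0, -2.0]
many_pairs = few_pairs * 20  # same spread, 100 pairs
near_zero = [5.0, -4.0, 6.0, -3.0, 1.0]

_, sem_few = mean_with_standard_error(few_pairs)
_, sem_many = mean_with_standard_error(many_pairs)
```

Fewer collocated pairs therefore widen the interval around the mean difference, but a sufficiently large bias can still exclude zero, as in the `few_pairs` example.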
Even though certain similarities with the MIPAS RR period, like the known high bias at the lower end of the profile, occur in the comparison of the MIPAS FR data set with ACE-FTS, the agreement between the two instruments is better than for the RR version in the region between 10 and 20 km. This might be ascribed to the better spectral resolution of MIPAS during the FR period. However, the collocated measurements for the FR period only consist of profiles taken at higher northern latitudes. Thus the result may differ from that of the RR period independently of the altered MIPAS retrieval setup, because the mean for the RR period consists of measurements over all latitudes and several years, compared to only high-latitude profiles taken during February and March 2004 for the FR period.

Results CFC-11 V5H: ILAS-II
About 5000 matches were found for the comparison of MIPAS CFC-11 measurements with ILAS-II (Fig. 15) during the FR period. However, apart from the approximate altitudes where the vertical gradient changes most rapidly, the MIPAS (red line) and the ILAS-II mean profile (blue line: on its original grid; black line: on the MIPAS altitude grid) do not agree very well. Below 20 km, MIPAS shows higher mixing ratios of CFC-11 than ILAS-II and lower mixing ratios above that altitude. This is the first comparison with the newest version of the ILAS-II CFC-11 and CFC-12 data (version 3). However, a similar feature has already been seen in comparisons of ILAS-II CFC-11 version 1.4 and version 2 measurements with MIPAS-B (Wetzel et al., 2008). The differences of MIPAS and ILAS-II exceed those of other comparisons by far. At the lower end of the profile, deviations go beyond 100 pptv (middle panel), which corresponds to relative differences of ∼ 50 %. Another conspicuous feature of this comparison is the very large error bars estimated from the ILAS-II retrieval (left panel: horizontal black and blue lines). However, Wetzel et al. (2008) show similarly large error bars in their comparison of the former version of ILAS-II with MIPAS-B data. Since the right panel of Fig. 15 demonstrates that the combined error of the two instruments (purple line) is far larger than the standard deviation of the differences (brown line), we suspect that the ILAS-II errors are largely overestimated. Above 20 km, Wetzel et al. (2008) found higher mixing ratios. All in all, the agreement of MIPAS CFC-11 measurements taken during the FR period with those of ILAS-II is not good: it shows far larger differences at the bottom end of the profile than comparisons with, e.g., ACE-FTS or HATS, reaching 50 %, and also large deviations at the upper end of the profiles that exceed 100 % above 25 km. Thus, the results for the comparison with ILAS-II should be treated with care, since large differences with MIPAS-B and the former versions of ILAS-II have been found previously.

Results CFC-11 V5H: HATS
The comparison of MIPAS CFC-11 with HATS during the FR period covers less than 2 years (Fig. 12, July 2002-April 2004). This short time period, along with annual variations, is an obstacle to the interpretation of the results. While the MIPAS time series for this period (continuous red line with large red circles) oscillates around a relatively constant value, the HATS time series (black line) shows declining mixing ratios. Even though some values of the HATS measurements lie within the standard deviation of the MIPAS measurements, a systematic deviation is evident. The mixing ratios differ by less than 10 pptv (less than 4 %) at the beginning of the compared time series and by slightly more, about 12 pptv (∼ 4.5 %), at the end. While we consider the differences to be real, since the deviations are systematic and are consistent with the RR time period, we suggest being careful not to overinterpret possible short-term linear variations.

CFC-12
This section is dedicated to the results of the comparisons of MIPAS CFC-12 measurements with those of the cryosampler, MkIV, MIPAS-B, MIPAS-STR, HIRDLS, ACE-FTS and the HATS network.

Results CFC-12: cryosampler
For CFC-12, as well as for CFC-11, cryosampler measurements (Fig. 16: black dots) were compared to MIPAS measurements. MIPAS measurements fulfilling the coincidence criteria (blue-grayish lines) exhibit a widely spread set of profiles enclosing the cryosampler measurements. In most of the cases, deviations of the cryosampler and the mean collocated MIPAS profile stay within 50 pptv (corresponding to ∼ 10 % at the lower end of the profile and increasing above due to smaller volume mixing ratios of CFC-12). The closest of these collocated MIPAS profiles (blue line) agrees very well with the cryosampler measurements. Only the cryosampler measurement taken on 3 October 2009 exhibits some outliers, deviating considerably (by ∼ 150 pptv) from all coincident MIPAS profiles at about 20-25 km, while the rest of this profile still agrees well (within ∼ 50 pptv) with all collocated MIPAS measurements. It is possible that the cryosampler captured variations due to laminae of small vertical extent here, which cannot be detected by MIPAS as the spatial resolution is too coarse. While the mean of the collocated MIPAS profiles (red line) comes very close to the cryosampler measurements as well as the closest MIPAS profile, except for the outliers just mentioned, the seasonal latitudinal mean of MIPAS (light orange line) can differ considerably from the cryosampler and the closest MIPAS profile (particularly on 1 April 2011). This provides proof of large natural variability in this case. As already stated for CFC-11, this is presumably due to subsidence in the remarkably cold and stable Arctic polar vortex present during that winter. As for CFC-11, in the 10 March 2009 comparison the cryosampler, the closest MIPAS profile and the mean MIPAS profile agree well with the seasonal mean. Therefore, for CFC-12 as well, we can conclude that both instruments are capable of capturing deviations from the mean state of the atmosphere. Even though there are a few cryosampler outliers not matching the MIPAS data, the CFC-12 cryosampler measurements agree very well with those of the mean and the closest MIPAS profile, as deviations usually stay within 50 pptv.

Results CFC-12: Mark IV
Comparison of MIPAS CFC-12 with MkIV measurements exhibits a similar behavior as for CFC-11 (Fig. 17) up to slightly below 30 km. MkIV (black line) shows higher mixing ratios of CFC-12 than both the mean MIPAS profile (red line) and, even more pronounced, the closest MIPAS profile (blue line). The gradient of the profiles between ∼ 20 and 27 km is similar for all profiles. Above approximately 27 km, however, the MIPAS profiles oscillate considerably, which is most apparent in the closest profile. The MkIV profile exhibits small wiggles above that altitude as well, but not as pronounced as any of the MIPAS profiles. Differences of the profiles stay within ∼ 50 pptv throughout most of the altitude range between the lower end of the profile and approximately 27 km, except for levels around 20 km where differences sometimes come close to 100 pptv. These values correspond to 10-15 % for most of the profile below 27 km and slightly over 20 % around 20 km. Velazco et al. (2011) also find higher values of MkIV compared to ACE-FTS throughout their whole altitude comparison range, with an indication of the largest differences occurring around 20 km. However, they only find differences of up to 15 %. Above 35 km, deviations between the MkIV profiles and the MIPAS profiles are noticeably larger. Up to that altitude, however, the comparison of MIPAS with MkIV CFC-12 measurements shows reasonably good agreement, considering only three coincident MIPAS profiles were found.

Results CFC-12: MIPAS-B
Multiple MIPAS profiles coinciding with backward and forward trajectories of two MIPAS-B measurements taken over Kiruna (Sweden) in January 2010 and March 2011 (Fig. 18, upper and lower panels, respectively) were taken into account for the CFC-12 validation with MIPAS-B. For the January 2010 MIPAS-B profile (Fig. 18 upper panels: black line), the agreement with the mean MIPAS profile (red line) is very good. The MIPAS-B profile is embedded in the spread of MIPAS collocated profiles (red squares) throughout the whole vertical range. At altitudes above 17 km, the agreement between MIPAS-B and MIPAS is remarkably good, showing differences smaller than 25 pptv and approaching zero above 21 km (middle panel). Below 18 km, MIPAS shows slightly larger values of CFC-12, with differences of up to ∼ 40 pptv (corresponding to ∼ 10 %) at the largest. Below ∼ 17 km, these differences are similar to the standard deviation of the instruments (middle and right panel) and considerably smaller above.
Large percentage errors above ∼ 22 km occur due to small absolute values of CFC-12 from this altitude upwards. Except for the region below 14 km, the standard deviation of the differences exceeds the total mean combined error of the instruments, suggesting that the error budget for one or both instruments is underestimated. In the comparison with the MIPAS-B measurement taken in March 2011, MIPAS shows considerably higher mixing ratios of CFC-12 below 15 km and around 18 km (Fig. 18, lower panels). From 15 km upwards, the MIPAS-B profile and the MIPAS mean profile show good agreement in gradient and turning points of the profiles, and above 19 km they also agree very well quantitatively as the differences are smaller than 7 pptv, approaching zero at some points. Deviations between MIPAS and MIPAS-B range up to ∼ 75 pptv around 12-13 and 18 km (middle panel) in absolute values, corresponding to relative differences of approximately 15 and 30 %, respectively (right panel). However, except for these regions, the two instruments show differences smaller than the standard deviation of the differences. While the shape of the two profiles is far more similar in the first comparison (MIPAS-B flight in January 2010), correction of the less than perfect coincidence, by using trajectories to collect collocated MIPAS measurements, might not have worked that well in this particular case. This particular atmospheric situation (winter and spring of 2011) was characterized by extraordinarily low temperatures and a very stable vortex. Due to possibly sharp horizontal gradients, MIPAS-B might have captured an air parcel having different characteristics than the mean of all collocated MIPAS profiles, even though trajectory-corrected collocated profiles were used. Thus deviations due to natural variability might still occur.

Results CFC-12: MIPAS-STR
The comparison of MIPAS-STR and MIPAS mean profiles consists of seven pairs of collocated measurements (Fig. 19). The mean profiles of MIPAS-STR (blue line: on original grid; black: interpolated onto the MIPAS grid) and MIPAS (red line) agree very well. The minimum occurs around the same altitudes (approximately 17 km) and both profiles show a similar behavior of decreasing CFC-12 volume mixing ratios from the bottom of the profile up to the minimum, even though the MIPAS profile oscillates slightly at altitudes below ∼ 15 km. The difference oscillates around zero and is very similar in shape to the difference profile of CFC-11 (Fig. 5: middle panel). This is because the same observations were used as in the case of CFC-11. The overall vertical distribution of CFC-11 and CFC-12 indicated by the MIPAS-STR observations fits well with the distribution of these species derived from the MIPAS measurements. This is plausible, since the distribution of the CFCs in the lower stratosphere is predominantly altered by dynamic processes and the considered observations of both instruments cover horizontally extended regions (i.e., several degrees in latitude). Differences are largest around 11 km and exhibit deviations of more than 40 pptv, corresponding to approximately 10 % (middle panel). Except for this altitude, the differences are mostly insignificant and stay within 30-40 pptv at the largest and are often considerably smaller. This corresponds to less than 10 % at the lower end of the profile and less than 5 % at the upper end. Again, the total error without the spectroscopy error is shown for both instruments. The combined error of the instruments exceeds the standard deviation of the differences from 10 to 12 km. As previously mentioned, this suggests that either one or both of the instruments overestimate their error budget there. As mentioned in Sect. 4.1.4, natural variability might also play a role, even though the mean distance and time difference are rather small, as for CFC-11 (Table 2). This probably results in the standard deviation of the differences being larger than the combined error of the instruments outside the region from 10 to 12 km. Overall, the agreement of MIPAS-STR and MIPAS is excellent, since differences rarely exceed 30 pptv and are mostly insignificant.

Results CFC-12: HIRDLS
Comparisons of MIPAS and HIRDLS measurements of CFC-12 are summarized in Figs. 20-22. Figure 20 shows the correlation between MIPAS and HIRDLS measurements. HIRDLS measurements have several outliers in CFC-12, which tend to occur more frequently at smaller mixing ratios/higher altitudes. However, it is still visible that the measured mixing ratios of MIPAS and HIRDLS are correlated linearly in general. Obvious differences appear in Fig. 21, where the frequency of the measured amounts of CFC-12 at 16 (left panels) and 23 km (right panels) is shown. While the distributions look very similar at 16 km, clear differences are visible at 23 km. At 16 km, both measurements' frequencies only show one peak, which is approximately centered between 450 and 500 pptv in the case of HIRDLS and is slightly shifted to higher values in the case of MIPAS, where the peak is centered around 500 pptv and exhibits a steeper histogram at higher mixing ratios. At 23 km one can clearly make out three peaks in the MIPAS distribution, while for HIRDLS this feature is barely visible as it is smeared out quite severely, and thus the rightmost peak is hardly discernible in the HIRDLS distribution. This also leads to a flatter frequency distribution for HIRDLS. The middle maximum peaks at similar amounts of CFC-12 for both instruments and lies between 200 and 250 pptv.
The comparison of the mean profiles of MIPAS (Fig. 22: red line) and HIRDLS collocated measurements (black line) shows very good agreement between the two instruments. The shape of the mean profiles, as well as their maxima and turning points, are very similar, even though the MIPAS profile branches off at slightly lower altitudes and exhibits a sharper turn around 16 km. The higher volume mixing ratios of CFC-12, which MIPAS shows below 17 km, stay mostly within ∼ 20 pptv (∼ 4 %) difference, except for the lowest value (middle panel), which is slightly larger than 40 pptv (close to 10 %). Between 18 and 25 km, MIPAS measures smaller amounts of CFC-12 than HIRDLS, with differences of up to nearly 40 pptv (corresponding to ∼ 10 %). From 25 to 30 km, MIPAS CFC-12 volume mixing ratios agree excellently with those of HIRDLS and differences are generally smaller than 20 pptv, corresponding to ∼ 2.5 % around 25 km, about 10 % at 28 km and increasing above 30 km.
The combined error of the instruments is similar to the standard deviation of the differences up to ∼ 15 km. Above that, the standard deviation of the differences is always larger than the combined error, and the discrepancy increases with altitude. This suggests that the error estimate of the two instruments is appropriate in cases where natural variability is negligible. Since the mean spatial and temporal distances between the measurements are almost 200 km and close to 3 h, natural variability might be responsible for the differences between the combined error and the standard deviation of the differences above ∼ 15 km; alternatively, the error estimate of either or both of the instruments might be slightly too conservative.
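The quantities compared in the right panels (combined error and standard deviation of the differences) can be illustrated with a minimal sketch. It assumes that the per-pair errors of the two instruments are independent and are added in quadrature, and that the mean combined error is the root mean square over all pairs; the function name, array layout and synthetic numbers are hypothetical and are not taken from the processing used in this study.

```python
import numpy as np

def comparison_statistics(mipas, other, err_mipas, err_other):
    """Per-altitude comparison statistics for collocated profile pairs.

    All inputs are arrays of shape (n_pairs, n_levels) in pptv.
    Returns mean difference, standard error of the mean, standard
    deviation of the differences and the mean combined error.
    """
    diff = mipas - other                      # MIPAS minus comparison instrument
    n_pairs = diff.shape[0]
    mean_diff = diff.mean(axis=0)
    std_diff = diff.std(axis=0, ddof=1)       # sample standard deviation
    sem = std_diff / np.sqrt(n_pairs)         # standard error of the mean
    # Assumption: independent errors, added in quadrature, RMS over pairs.
    combined = np.sqrt(np.mean(err_mipas**2 + err_other**2, axis=0))
    return mean_diff, sem, std_diff, combined

# Synthetic example: identical true state, noise of 3 and 4 pptv (hypothetical).
rng = np.random.default_rng(0)
truth = np.full((20000, 3), 500.0)
a = truth + rng.normal(0.0, 3.0, truth.shape)
b = truth + rng.normal(0.0, 4.0, truth.shape)
mean_diff, sem, std_diff, combined = comparison_statistics(
    a, b, np.full_like(a, 3.0), np.full_like(b, 4.0)
)
```

In this synthetic case there is no natural variability between the paired measurements, so the standard deviation of the differences is expected to be close to the combined error. A standard deviation clearly above the combined error would point to underestimated errors or to natural variability, as discussed above.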
The latitudinally broken down comparisons (Fig. A3) exhibit similar features as for CFC-11. At higher latitudes, deviations of the profiles at the bottom end seem larger than in tropical or subtropical regions. Overall the agreement between the mean MIPAS and HIRDLS CFC-12 profiles is excellent, since the differences mainly stay within 20 pptv, while the scatter plot shows that this mean is derived from a sample with a rather wide spread. Since HIRDLS CFC radiances have been normalized using WACCM, slight biases might occur due to that normalization.

Results CFC-12: ACE-FTS
The comparison of ACE-FTS and MIPAS CFC-12 profiles is shown in Figs. 23-25. Figure 23 exhibits a correlation of the measurements that is very close to being linear. The agreement of the two instruments appears to be good, with very few outliers, even though MIPAS measures slightly higher volume mixing ratios at large values, e.g., at the lower end of the profile. This impression is supported by Fig. 24, which shows the frequency distributions of MIPAS (top panels) and ACE-FTS (bottom panels) at 16 (left panels) and 23 km (right panels). It exhibits considerable numbers of MIPAS CFC-12 measurements reporting volume mixing ratios of 500-600 pptv at 16 km, while ACE-FTS does not report appreciable numbers of CFC-12 values above 550 pptv. This leads to a far steeper histogram at higher mixing ratios in the ACE-FTS frequency distribution at 16 km, while the histogram at lower mixing ratios is more similar to that of MIPAS, even though it is still a bit steeper. The only obvious peak at this altitude occurs at similar volume mixing ratios for both instruments (around 450 pptv in the case of ACE-FTS and between 450 and 500 pptv in the case of MIPAS). At 23 km both instruments clearly show a trimodal distribution, peaking close to zero, around ∼ 250 pptv and around ∼ 450 pptv. While the leftmost peak appears to be more pronounced in the ACE-FTS distribution, the middle and right peaks are very similar.
The impression of MIPAS seeing higher values of CFC-12 at the lower end of the profile is confirmed in Fig. 25 as well. While the MIPAS (red line) and the ACE-FTS profiles (blue line: on original grid; black line: interpolated onto the MIPAS grid) are very close together at the bottom end (around ∼ 6 km), the MIPAS profile exhibits a steeper ascent than the ACE-FTS profiles, leading to deviating profiles of the instruments up to 18 km. Here, the MIPAS mean profile exhibits volume mixing ratios of CFC-12 that are up to 25-30 pptv (6-7 %) higher than those of ACE-FTS (middle panel). From 18 km up to ∼ 27-28 km, MIPAS and ACE-FTS agree remarkably well with deviations of approximately 10 pptv, corresponding to ∼ 3 % around 18 km and less than 10 % around 27 km. Above these altitudes, ACE-FTS reports higher volume mixing ratios of CFC-12 than MIPAS. Around 30 km, the comparison exhibits the largest deviations, appearing in differences of up to 50 pptv and more (which corresponds to ∼ 25 % and more at these altitudes).
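The step of bringing a comparison profile onto the MIPAS altitude grid (black line in Fig. 25) can be sketched as follows. This is only an illustration under the assumption of simple linear interpolation in altitude, which may differ from the regridding actually applied in this study; the function name and all sample values are hypothetical.

```python
import numpy as np

def to_target_grid(alt_native_km, vmr_native, alt_target_km):
    """Linearly interpolate a profile onto a target altitude grid.

    Target levels outside the native altitude range are set to NaN
    instead of being extrapolated.
    """
    alt_native_km = np.asarray(alt_native_km, dtype=float)
    vmr_native = np.asarray(vmr_native, dtype=float)
    alt_target_km = np.asarray(alt_target_km, dtype=float)
    order = np.argsort(alt_native_km)         # np.interp needs increasing x
    return np.interp(
        alt_target_km,
        alt_native_km[order],
        vmr_native[order],
        left=np.nan,
        right=np.nan,
    )

# Hypothetical native-grid profile (km, pptv) and hypothetical target grid (km).
native_alt = [6.5, 9.5, 12.5, 15.5, 18.5]
native_vmr = [530.0, 525.0, 510.0, 480.0, 420.0]
target_alt = [6.0, 8.0, 11.0, 14.0, 17.0]
on_target = to_target_grid(native_alt, native_vmr, target_alt)
```

Leaving levels outside the native range undefined avoids comparing extrapolated values, which matters where one instrument's profiles are cut off at a lower altitude than the other's.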
The comparison of the estimated precision and the standard deviation of the differences (right panel of Fig. 25) shows that they come close above 13 km, while below this altitude the combined error even exceeds the standard deviation of the differences. This suggests that one or both of the instruments' error budgets are overestimated, while this effect is canceled out or even reversed above 13 km by natural variability. Natural variability might play a more important role than for the comparison with HIRDLS, since the HIRDLS coincidence criteria were chosen far more strictly than for the comparison of MIPAS with ACE-FTS. This results in a mean distance and time difference that are similar to CFC-11 with about 375 km and 6 h, respectively (Table 2), and thus almost twice as large as for the comparison with HIRDLS. Both profiles show a bump, which is even more pronounced than for CFC-11. The explanation for this feature is the same as for CFC-11 and illustrates the sampling issue created by the combination of the cut-off of the ACE-FTS profiles at low CFC-12 values and the distribution of the gas, e.g., higher volume mixing ratios at lower latitudes. In contrast to CFC-11, the bump is not removed completely in the latitudinal breakdown (Fig. A4). An indication of the bump at the upper end of the mean profiles is still visible at midlatitudes, which is presumably attributable to high variability of CFC-12 within these bins. This originates from a similar sampling effect as for the whole set of measurements, just in smaller magnitude. At higher altitudes, the mean profile is again dominated by low-latitude profile contributions, since profiles from higher latitudes are cut off at a lower altitude. As for the comparison with HIRDLS, we observe that differences at the lower end of the profile are largest at higher latitudes for CFC-12.
Again, Tegtmeier et al. (2016) found a similar behavior with a slight high bias in MIPAS CFC-12 that seems to be more pronounced at higher latitudes, even though the relative differences between MIPAS and the MIM are smaller than for CFC-11. Their findings agree well with the results of this study. Despite some differences, the mean MIPAS and ACE-FTS CFC-12 profiles are in good agreement as they stay within 15 pptv between 17 and 28 km and within 30 pptv below, which is slightly larger than in the comparison with HIRDLS for most of the covered altitude range. The scatter plot shows a narrower point cloud than the one for HIRDLS.

Results CFC-12: HATS
As for CFC-11, a comparison of HATS data with MIPAS measurements at an altitude of 3 km below the estimated tropopause was performed for CFC-12 (Fig. 26). This comparison suggests that MIPAS (continuous red line with large circles) detects slightly higher values than the HATS stations (continuous black line) at tropospheric levels. However, this effect is less pronounced than for CFC-11. Deviations mainly stay within 10 pptv, which corresponds to ∼ 2 %, since CFC-12 amounts are larger than for CFC-11. MIPAS's CFC-12 volume mixing ratios cover a wide range of values, which is reflected in the large standard deviation (dashed red line with small circles) of approximately ±30 pptv. The values of the HATS time series are very close to the MIPAS measurements throughout the whole comparison period. Even though periodic variations in the MIPAS time series have larger amplitudes, the oscillations in both measurements agree with respect to their period and phase. Similar to CFC-11, there is an indication that the MIPAS CFC-12 time series for the RR period (2005-2012) declines faster than that of HATS. The difference in the trends between MIPAS and HATS is −6.85 pptv decade⁻¹ (see Sect. 3 for details on the method). A similarly large drift (−6.89 pptv decade⁻¹) is found for results due to detector aging at 3 km below the tropopause. Hence, for CFC-12 the drift due to detector aging can explain the differences in the trends between MIPAS and HATS to a large extent, even though only drifts between 35° S and 35° N have been analyzed. All in all, differences between the data sets are very small.
profile showing larger mixing ratios below 25 km in the 1 April 2003 comparison, while this is not visible in the 16 December 2002 comparison. However, the compared profiles show good agreement in general, with differences up to about twice as large as for the comparison with HIRDLS.

Results CFC-12 V5H: ACE-FTS
The comparison of MIPAS FR CFC-12 and ACE-FTS (Fig. 28) data is very similar to that of the reduced resolution period (RR: Fig. 25), but the agreement is even better around ∼ 10 to 15 km. Since the comparison does not reach up beyond 28 km, the bump seen in the mean profiles for the RR period does not appear in either of the mean profiles for the FR period (left panel). This is mainly because collocated measurements only exist at high latitudes for the short overlap of the ACE-FTS period and the MIPAS FR period (February and March 2004). For most of the vertical range the differences stay within ∼ 10 pptv (middle panel), corresponding to ∼ 1 % at the lower end of the profile and ∼ 20 % around 28 km. These values are only exceeded around ∼ 10 and 17-18 km, as well as at the lowest altitudes, where differences can reach up to 20-30 pptv (∼ 6 % below 13 km and less than 10 % around 17-18 km). MIPAS shows slightly higher mixing ratios than ACE-FTS up to ∼ 14 km and lower ones above this altitude. The comparison of the combined error and the standard deviation of the differences (right panel) looks similar to the one for the RR period, just slightly more pronounced, where the combined error exceeds the standard deviation of the differences up to ∼ 16 km. Above, the standard deviations are not explained by the combined errors. Explanations as for the RR period (Sect. 4.3.6) apply here as well. Overall, the agreement of MIPAS and ACE-FTS CFC-12 measurements is remarkably good for the MIPAS FR period as the differences stay within 15 pptv for most of the covered altitude range and are thus smaller than in all comparisons for the RR period.

Results CFC-12 V5H: ILAS-II
The comparison of MIPAS CFC-12 measurements from the FR period with ILAS-II measurements (Fig. 29) consists of about 5000 collocated profiles. Throughout the whole altitude range, with very few exceptions, ILAS-II (blue line: on its original grid; black line: on the MIPAS altitude grid) shows higher mixing ratios of CFC-12 than MIPAS (left panel: red line). However, while the mean profiles of MIPAS and ILAS-II agree rather well up to about 17 km, ILAS-II shows considerably larger mixing ratios of CFC-12 above that altitude, which is most pronounced around 25 km. Apart from the lowermost two altitudes, the differences of the mean profiles do not exceed 50 pptv up to ∼ 17 km (middle panel), which corresponds to relative differences of approximately 10 % at the largest. From 17 km upwards, however, deviations can be as large as close to 150 pptv around 25 km, resulting in relative differences of over 100 %. Wetzel et al. (2008)

Results CFC-12 V5H: HATS
The short time series of the MIPAS FR period (Fig. 26: July 2002 to April 2004) is compared to the measurements collected by the HATS network during the same time period for CFC-12. As for CFC-11, MIPAS (continuous red line with large red circles) exhibits larger annual and interannual variations than the HATS data (continuous black line) from mid-2002 to early 2004. While MIPAS oscillates around a constant mixing ratio of approximately 550 pptv at 3 km below the tropopause, the HATS ground-based measurements show mixing ratios well within the range of 540 to 545 pptv. Thus, the difference between MIPAS and HATS is very small, on the order of ∼ 10 pptv at the largest, which corresponds to relative differences of less than 2 %. Unlike for CFC-11, the mixing ratios of both time series stay rather constant during this period. Given the small differences of only ∼ 10 pptv, we consider the agreement of MIPAS with HATS CFC-12 measurements to be remarkably good during the FR period, which strongly supports the high accuracy of the MIPAS CFC-12 measurements during the FR period.
Figures 30 and 31 show an overview of the relative differences between MIPAS and the comparison instruments for CFC-11 and CFC-12, respectively. The differences show MIPAS minus the comparison instrument and use MIPAS as a reference.
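The sign and reference convention stated above (MIPAS minus the comparison instrument, relative to MIPAS) can be written down as a short sketch. The function name and the numerical values are hypothetical and serve only to illustrate how an absolute difference in pptv translates into a relative difference in percent.

```python
import numpy as np

def relative_difference_percent(mipas, other):
    """Relative difference in percent: (MIPAS - other) / MIPAS * 100."""
    mipas = np.asarray(mipas, dtype=float)
    other = np.asarray(other, dtype=float)
    return 100.0 * (mipas - other) / mipas

# Hypothetical mixing ratios in pptv: a 40 pptv difference at 400 pptv
# and a 10 pptv difference at 50 pptv.
rel = relative_difference_percent([400.0, 50.0], [360.0, 40.0])
```

The second value illustrates why small absolute differences can appear as large relative differences at altitudes where the mixing ratios themselves are small.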

Overview: CFC-11
A slight high bias in MIPAS CFC-11 is clearly visible in the comparison with ACE-FTS for the FR period (left) below ∼ 10 km and also indicated at the lower end of the differences profile in the comparison with MkIV. For the RR period (right panel), the comparison with HIRDLS shows a high bias in MIPAS CFC-11 below ∼ 15 km, where the difference is larger than 20 % at the lowest altitude. This high bias at the lower end of the profile does not appear in the RR period comparison with ACE-FTS and MIPAS-B and is far less pronounced in the comparison with MIPAS-STR, where only the value at the lowest altitude level exceeds 20 % while oscillating around zero above. While the relative differences between MIPAS and ACE-FTS are around or below 5 % between 10 and 18 km for the FR period, they are around 10 % between the lowest altitude level and almost 20 km for the RR period. Both ACE-FTS and HIRDLS show relative differences of about 20 % close to 25 km. However, one should keep in mind that the volume mixing ratios of CFC-11 are small at this altitude, so that this bias is not that obvious in the absolute comparisons (Figs. 8 and 11). There, the difference is only around 10-15 pptv in both comparisons. Even though the relative differences between MIPAS and MIPAS-STR and MIPAS and MIPAS-B oscillate strongly, the overall impression is that MIPAS shows slightly too high values for CFC-11 during the RR period. This impression is also supported by the comparisons with the cryosampler and HATS, even though HATS only shows a high bias of close to 5 % at 3 km below the tropopause. MkIV is the only instrument showing higher volume mixing ratios than MIPAS up to 25 km. The other instruments show a high bias mainly around 10 %, which is only exceeded at the lowest altitude levels and above 20 km. For the FR period this bias seems to be slightly smaller in general, while the difference compared to HATS is similar to the RR period. However, both measurement periods are consistent and agree qualitatively and quantitatively to a large extent.

Overview: CFC-12
Both for the FR period (left) and the RR period (right), relative differences between MIPAS and the comparison instruments stay within 10 % below 20 km. While ACE-FTS shows a similar profile for both MIPAS periods, with slightly higher MIPAS values up to 15-17 km and slightly lower MIPAS values above, MkIV measurements agree better with MIPAS during the FR period and show lower MIPAS mixing ratios from ∼ 18 to 30 km during the RR period. Relative differences with ILAS-II show that the instrument measures values that are about 20 % higher than those acquired by MIPAS at the bottom end of the profile. Above 25 km, all comparisons for both periods indicate a low bias in MIPAS CFC-12, except the one vs. HIRDLS. Overall, relative differences between MIPAS and the comparison instruments are small below 25 km, only occasionally exceeding 10 % for most of the comparisons, while above that altitude, there seems to be an indication of MIPAS CFC-12 values being slightly lower. Comparisons with HATS do not indicate a bias in MIPAS CFC-12 measurements, as the difference is only 0.5 % for both the FR and the RR period. The results of the comparisons for both MIPAS measurement periods indicate that the CFC-12 products are very consistent, as they show similar features and differences of similar magnitudes for both data products.

MIPAS random error
The assessment of whether MIPAS random error estimates are realistic suffers from possible natural variability and from the fact that, even though most of the comparison instruments provide the full random error budget, it is not clear whether the random error estimates of the comparison instruments are realistic. Thus, our random error assessment is complemented by the following study. We know that the total observed variability $\sigma_{\mathrm{total}}$ is composed of the natural variability $\sigma_{\mathrm{nat}}$ and the random measurement error estimate $\sigma_{\mathrm{ran}}$:
$$\sigma_{\mathrm{total}} = \sqrt{\sigma_{\mathrm{nat}}^{2} + \sigma_{\mathrm{ran}}^{2}}$$
Thus the observed variability can be considered an upper bound of the random measurement error. For calm atmospheric conditions where low natural variability is expected (polar summer), the observed variability should be dominated by the measurement error. The difference between the
In order to verify the temporal stability of MIPAS CFC-11 measurements, drifts resulting from changing assumptions regarding the nonlinearity correction (Fig. 34) were calculated. As shown by Eckert et al. (2014) and Kiefer et al. (2013), the assumption of the nonlinearity correction for the MIPAS detectors being time-independent can no longer be upheld. Time-dependent coefficients for the nonlinearity correction were found to be able to explain drifts between MIPAS and other instruments, e.g., Aura MLS for ozone. Thus, the same method was used to calculate drifts in MIPAS CFC-11 measurements. MIPAS results produced using the retrieval setup for bulk processing are compared to results derived using newly suggested time-dependent nonlinearity coefficients (see Eckert et al., 2014, Sect. 3.3). The difference between these results is calculated for a subset of measurements taken between June 2005 and October 2011. Subsequently, the temporal development of these differences is assessed by fitting a linear variation to them. The left panel in Fig. 34 shows an altitude-latitude cross section of the estimated drifts, where bluish tiles indicate that MIPAS is seeing more negative/less positive trends using the old, not time-dependent, nonlinearity coefficients. Red tiles indicate that MIPAS is seeing more positive/less negative trends using the old setup. The drifts are very small compared to absolute mixing ratios of CFC-11, and only occasionally exceed 2 % decade⁻¹. Larger drifts appear exclusively at high latitudes in the Northern Hemisphere, which is a region with large natural variability, and thus larger differences between the fit and the measurements lead to less reliable results. In order to prove that former results by Kellmann et al. (2012) are still valid, we compared the drift results with the trends for the whole MIPAS time series (Fig. 34, right panel). Reddish tiles indicate positive trends (only in the Southern Hemisphere between 25 and 30 km), while bluish tiles mean that the CFC-11 mixing ratios have decreased during the MIPAS measurement period. Hatching indicates non-significant trends at the 2-sigma level. While the trends are very small below ∼ 20 km (even ∼ 25 km in the tropics), negative trends of down to about −50 % were found above this altitude in the Northern Hemisphere. Positive trends range up to ∼ 20 %. These trends are by far larger than the estimated drifts, and thus the conclusions drawn from these trends by Kellmann et al. (2012) still hold, i.e., that decadal change in stratospheric circulation is needed to explain the results.
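The drift estimate described above (a linear variation fitted to the differences between the two retrieval setups) can be sketched for a single altitude-latitude bin as follows. This is a minimal illustration using an unweighted least-squares line; the method actually applied (Sect. 3) may include further terms or error weighting. The function name and the synthetic time series are hypothetical.

```python
import numpy as np

def drift_per_decade(time_years, differences):
    """Slope of an unweighted least-squares line, scaled to one decade.

    time_years: decimal years of the measurements
    differences: old-setup minus new-setup results (e.g., in percent)
    """
    t = np.asarray(time_years, dtype=float)
    d = np.asarray(differences, dtype=float)
    ok = np.isfinite(t) & np.isfinite(d)      # skip missing values
    t_centered = t[ok] - t[ok].mean()         # centering improves conditioning
    slope, _intercept = np.polyfit(t_centered, d[ok], 1)
    return 10.0 * slope                       # per year -> per decade

# Synthetic example: 77 monthly values starting in mid-2005 with an
# imposed drift of -0.2 % per year (hypothetical numbers).
t = 2005.5 + np.arange(77) / 12.0
diff = 1.0 - 0.2 * (t - t[0])
drift = drift_per_decade(t, diff)
```

Applying such a fit to every altitude-latitude bin would yield a cross section of drifts comparable in layout to the left panel of Fig. 34.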

CFC-12
The temporal stability over the whole MIPAS measurement period was examined for CFC-12, as for CFC-11. The results of the drift estimation (Fig. 35, left panel) exhibit small, even close to zero, negative drifts in CFC-12 below ∼ 30 km. Above that altitude, up to ∼ 35 km, larger negative drifts appear, which are largest at midlatitudes and high latitudes and range down to about −50 %. From 35 km upwards, large positive drifts were found which exceed 50 % at some points, with largest drifts shown at higher altitudes and latitudes. Compared to the trends (Fig. 35, right panel), the drifts are approximately of the same order of magnitude up to ∼ 20 km (∼ 25 km in the tropics). Between that altitude and ∼ 30 km the trends are considerably larger and also show positive values in the Southern Hemisphere. From ∼ 30 to 35 km negative trends are almost entirely canceled out by the drifts. This also applies to the positive trends above ∼ 35 km. Keeping this in mind, the most pronounced trends are those between ∼ 20 and 30 km, which have already been found and interpreted by Kellmann et al. (2012). Since drifts in this altitude range are very small, the conclusions drawn in their paper still hold, and decadal changes in stratospheric circulation are evident. Above ∼ 35 km, the apparent trend actually is a drift due to the time-dependent nonlinearity of the detector which has not been accounted for in the bulk processing of the MIPAS data to date. After fixing this for the next data version, by using the new nonlinearity correction coefficients, we assume the MIPAS CFC-12 data will be temporally stable throughout the whole vertical range.

Conclusions
The MIPAS CFC-11 product shows good overall agreement with the presented collocated observations. A slight high bias is found at low altitudes, below ∼ 10 km for the full spectral resolution (FR) period and ∼ 15 km for the reduced spectral resolution (RR) period. These differences stay mainly within 50 pptv, corresponding to 25 % at the largest. Larger differences appear in the comparison with ILAS-II, but we suggest treating these results with care since Wetzel et al. (2008) found similarly large differences when comparing MIPAS-B results to a former version of ILAS-II measurements. Differences in CFC-11 tend to be smaller than 30 pptv above 15 km in most cases for the RR period, which corresponds to approximately 20 % at the largest. For the FR period, ACE-FTS and MkIV agree with MIPAS within 20 pptv up to 20 km, but show increasing differences above which exceed 30 pptv at the uppermost level in the case of ACE-FTS and even more in the case of MkIV. Even though the comparisons of the standard deviation in a quiescent atmosphere and the MIPAS error budget suggest that the latter is slightly underestimated, this conclusion cannot be drawn from the comparisons with the other instruments. However, it cannot be falsified either, since it is unclear how reliable the error estimates of the other instruments are and how large the contribution of natural variability is.
Except for a few outliers in the comparison with the cryosampler measurement taken on 3 October 2009 and MkIV above ∼ 19 km, the CFC-12 product exhibits excellent agreement with all compared instruments. During the FR period both ACE-FTS and MkIV agree very well with MIPAS up to 20 km, with differences staying within 5 %. For the RR period, similarly good agreement is found with all instruments. Maximum differences are of the same order of magnitude as for CFC-11 in the absolute value of about 50 pptv, but since CFC-12 volume mixing ratios are larger than those of CFC-11 in general, the relative deviations of MIPAS from comparison instruments are far smaller and rarely larger than 10 %. This value of relative differences is not even reached in most of the comparisons as typical values stay within 5 % below 18 km. The comparisons of the standard deviation in a quiescent atmosphere and the MIPAS error budget show that both quantities are very similar for the FR period. This suggests that the error budget was estimated accurately. For the RR period, the results of the comparison of the standard deviation in a quiescent atmosphere and the MIPAS random error are similar to those of CFC-11, and thus suggest a slight underestimation of the error budget for this time period. However, as for CFC-11, this is difficult to deduce or falsify from the comparisons with other instruments. MkIV is the only instrument rather suggesting a low bias in MIPAS CFC-11 and CFC-12 RR measurements.
Estimated drifts are small for both species below ∼ 25 to 30 km. Above that altitude, CFC-11 is difficult to detect and the test data set for drift estimates from different nonlinearity correction coefficients was sparse, so that no results exist from ∼ 25 km upwards. CFC-12 drifts reach up to magnitudes of about 50 % above ∼ 30 km, showing large negative values up to ∼ 35 km and positive values above. This is reflected in the trend, which is mostly artificial above this altitude. At 3 km below the tropopause, the drift can partly explain the differences in the trends between MIPAS and ground-based HATS CFC-11 data. For CFC-12, the drift is very similar to the differences found in the trends of MIPAS at 3 km below the tropopause and the HATS measurements, and is thus a good candidate for explaining these differences. For future data versions, these results will be taken into account to produce a temporally stable CFC-12 data set, which will then also be suitable for trend analysis above 35 km.

Figure 1. Simulated midlatitude emission spectra of the main regions of CFC-11 (left) and CFC-12 (right) at 20 km in July. The spectral regions used for each instrument are shown. Only ILAS-II uses a far wider range to retrieve CFC-11 and CFC-12.

Figure 2. Comparison of a climatological mean of MIPAS CFC-11 measurements (light orange line), collocated measurements (blue-grayish lines) and their mean profile (red line) and the closest MIPAS profile (blue line) with different flights of the cryosampler (black dots).

Figure 3. Comparison of one MkIV CFC-11 profile (black line) with three coincident profiles of MIPAS (blue-grayish lines). The closest (blue line) and the mean (red line) of these profiles are shown in addition. The MkIV error estimates are inferred from the fit residuals.

Figure 4. Comparison of a mean profile of MIPAS CFC-11 collocated measurements (left panels: red line) with a profile of MIPAS-B (black line) obtained on 24 January 2010 (upper panels) and 31 March 2011 (lower panels) at Kiruna. The error bars (1σ; left panel) show the total error without the spectroscopy error for MIPAS and MIPAS-B. The difference is shown in absolute (middle panels) and relative (right panels) terms. The dotted red line is the standard deviation and the dashed blue line is the combined error, which consists of the root of the sum of the squared errors of MIPAS-B and the MIPAS mean.

Figure 5. Comparison of mean profiles of MIPAS CFC-11 (left panel, continuous red line) and MIPAS-STR (left panel, continuous black line) for seven collocated measurements taken during a flight on 2 March 2010. The error bars consist of the total error without the spectroscopy error for both MIPAS and MIPAS-STR. The dashed lines show the median of each data set. The middle panel shows the mean difference (blue) of these profiles and the standard error of the mean. The right panel shows the combined error (purple) of the instruments and the standard deviation of the differences (brown).

Figure 6. Correlation of collocated MIPAS CFC-11 measurements with HIRDLS measurements during the time period of 2005 to 2008.

Figure 7. Histogram of collocated MIPAS CFC-11 measurements (top panels) and HIRDLS measurements (bottom panels) for the years of 2005-2008 at 16 km (left panels) and 23 km (right panels). The black line indicates the location of the mean of the sample, while the red dashed line marks the median.

Figure 8. Comparison of mean profiles of MIPAS CFC-11 (left panel, continuous red line) and HIRDLS (left panel, continuous black line) for the years of 2005-2008. The error bars include the total error in the case of MIPAS and the estimated error, which is derived from the average of 10 sets of 12 consecutive profiles of regions with little variability (Gille et al., 2014), in the case of HIRDLS. The dashed lines show the median of each data set. The middle panel shows the mean difference (blue) of these profiles and the standard error of the mean. The right panel shows the combined error (purple) of the instruments and the standard deviation of the differences (brown).

Figure 9. Correlation of collocated MIPAS CFC-11 measurements with ACE-FTS measurements during the time period of 2005 to 2012.

Figure 10. Histogram of MIPAS CFC-11 measurements (top panels) and ACE-FTS measurements (bottom panels) for the years 2005-2012 at 16 km (left panels) and 23 km (right panels). The black line indicates the location of the mean of the sample, while the red dashed line marks the median.

Figure 11. Comparison of mean profiles of MIPAS CFC-11 (left panel, continuous red line) and ACE-FTS (left panel: blue line denotes ACE-FTS on native grid; continuous black line denotes ACE-FTS interpolated onto the MIPAS grid) for the years of 2005-2012. The error bars include the total error for MIPAS and the random errors from the least-squares fitting process for ACE-FTS. The dashed lines show the median of each data set. The middle panel shows the mean difference (blue) of these profiles and the standard error of the mean. The right panel shows the combined error (purple) of the instruments and the standard deviation of the differences (brown).

Figure 12. Comparison of MIPAS CFC-11 values at 3 km below the tropopause (red) and ground-based measurements of the HATS network (black). Dashed lines denote the standard deviation.

Figure 13. Two MkIV CFC-11 profiles are compared with collocated MIPAS profiles of the FR period. For the measurement on 16 December 2002, 16 collocated MIPAS measurements were found, while 25 MIPAS profiles coincided with the 1 April 2003 MkIV measurement. The setup is similar to Fig. 3.

Figure 14. Comparison of mean profiles of MIPAS CFC-11 (left panel, continuous red line) and ACE-FTS (left panel: blue line denotes ACE-FTS on native grid; continuous black line denotes ACE-FTS interpolated onto the MIPAS grid) for February and March 2004, corresponding to the FR period and the MIPAS V5H_CFC-11_20 data set. The dashed lines show the median of each data set. The error bars include the total error for MIPAS and the random errors from the least-squares fitting process for ACE-FTS. The middle panel shows the mean difference (blue) of these profiles and the standard error of the mean. The right panel shows the combined error (purple) of the instruments and the standard deviation of the differences (brown).

Figure 15. Comparison of mean profiles of MIPAS CFC-11 (left panel, continuous red line) and ILAS-II (left panel: blue line denotes ILAS-II on native grid; continuous black line denotes ILAS-II interpolated onto the MIPAS grid) for the FR period, corresponding to the MIPAS V5H_CFC-11_20 data set. The dashed lines show the median of each data set. The error bars include the total error for both instruments. The middle panel shows the mean difference (blue) of these profiles and the standard error of the mean. The right panel shows the combined error (purple) of the instruments and the standard deviation of the differences (brown).

Figure 16. Comparison of an ensemble of MIPAS CFC-12 measurements (light orange lines), collocated measurements (blue-grayish lines) and their mean profile (red line) and the closest MIPAS profile (blue line) with different flights of the cryosampler (black dots).

Figure 17. Comparison of one MkIV CFC-12 profile (black line) with three coincident profiles of MIPAS (blue-grayish lines). The closest (blue line) and the mean (red line) of these profiles are shown in addition. The MkIV error estimates are inferred from the fit residuals.

Figure 18. Comparison of a mean profile of MIPAS CFC-12 collocated measurements (left panels: red line) with a profile of MIPAS-B (black line) obtained on 24 January 2010 (upper panels) and 31 March 2011 (lower panels) at Kiruna. The error bars (1σ; left panel) show the total error without the spectroscopy error for MIPAS and MIPAS-B. The difference is shown in absolute (middle panels) and relative (right panels) terms. The dotted red line is the standard deviation and the dashed blue line is the combined error, which is the square root of the sum of the squared errors of MIPAS-B and the MIPAS mean.

Figure 19. Comparison of mean profiles of MIPAS CFC-12 (left panel, continuous red line) and MIPAS-STR (left panel, continuous black line) for seven collocated measurements taken during a flight on 2 March 2010. The error bars consist of the total error without the spectroscopy error for both MIPAS and MIPAS-STR. The dashed lines show the median of each data set. The middle panel shows the mean difference (blue) of these profiles and the standard error of the mean. The right panel shows the combined error (purple) of the instruments and the standard deviation of the differences (brown).

Figure 20. Correlation of collocated MIPAS CFC-12 measurements with HIRDLS measurements during the time period of 2005 to 2008.

Figure 21. Histogram of collocated MIPAS CFC-12 measurements (top panels) and HIRDLS measurements (bottom panels) for the years of 2005-2008 at 16 km (left panels) and 23 km (right panels). The black line indicates the location of the mean of the sample, while the red dashed line marks the median.

Figure 22. Comparison of mean profiles of MIPAS CFC-12 (left panel, continuous red line) and HIRDLS (left panel, continuous black line) for the years of 2005-2008. The error bars include the total error in the case of MIPAS and the estimated error, which is derived from the average of 10 sets of 12 consecutive profiles of regions with little variability (Gille et al., 2014), in the case of HIRDLS. The dashed lines show the median of each data set. The middle panel shows the mean difference (blue) of these profiles and the standard error of the mean. The right panel shows the combined error (purple) of the instruments and the standard deviation of the differences (brown).

Figure 23. Correlation of collocated MIPAS CFC-12 measurements with ACE-FTS measurements during the time period of 2005 to 2012.

Figure 24. Histogram of MIPAS CFC-12 measurements (top panels) and ACE-FTS measurements (bottom panels) for the years of 2005-2012 at 16 km (left panels) and 23 km (right panels). The black line indicates the location of the mean of the sample, while the red dashed line marks the median.

Figure 25. Comparison of mean profiles of MIPAS CFC-12 (left panel, continuous red line) and ACE-FTS (left panel: blue line denotes ACE-FTS on native grid; continuous black line denotes ACE-FTS interpolated onto the MIPAS grid) for the years of 2005-2012. The error bars include the total error for MIPAS and the random errors from the least-squares fitting process for ACE-FTS. The dashed lines show the median of each data set. The middle panel shows the mean difference (blue) of these profiles and the standard error of the mean. The right panel shows the combined error (purple) of the instruments and the standard deviation of the differences (brown).

Figure 26.Comparison of MIPAS CFC-12 value estimates at 3 km below the tropopause (red) and ground-based measurements collected by the HATS network (black).

4.4 CFC-12: high spectral resolution time period (FR)

4.4.1 Results CFC-12 V5H: MkIV

For the comparison of CFC-12 during the FR period (Fig. 27), 15 collocated MIPAS profiles were found for the MkIV measurement taken on 16 December 2002, and 25 MIPAS profiles coincide with the MkIV measurement taken on 1 April 2003. The mean MIPAS profile (red line) and the MkIV profile (black line) are close in both cases, showing deviations no larger than 50 pptv (corresponding to 10-20 % for most of the vertical range) and even considerably smaller at some altitude levels. Deviations from the closest MIPAS profile (blue line) are larger than for the mean profile, similar to the other comparisons for CFC-11, ranging up to ∼ 100 pptv. There is a slight indication of the MkIV

Figure 27. Two MkIV CFC-12 profiles are compared with collocated MIPAS profiles of the FR period. For the measurement on 16 December 2002, 16 collocated MIPAS measurements were found, while 25 MIPAS profiles coincided with the 1 April 2003 MkIV measurement. The setup is similar to Fig. 3.

Figure 28. Comparison of mean profiles of MIPAS CFC-12 (left panel, continuous red line) and ACE-FTS (left panel: blue line denotes ACE-FTS on native grid; continuous black line denotes ACE-FTS interpolated onto the MIPAS grid) for February and March 2004, corresponding to the FR period and the MIPAS V5H_CFC-12_20 data set. The dashed lines show the median of each data set. The error bars include the total error for MIPAS and the random errors from the least-squares fitting process for ACE-FTS. The middle panel shows the mean difference (blue) of these profiles and the standard error of the mean. The right panel shows the combined error (purple) of the instruments and the standard deviation of the differences (brown).

Figure 29. Comparison of mean profiles of MIPAS CFC-12 (left panel, continuous red line) and ILAS-II (left panel: blue line denotes ILAS-II on native grid; continuous black line denotes ILAS-II interpolated onto the MIPAS grid) for the FR period, corresponding to the MIPAS V5H_CFC-12_20 data set. The dashed lines show the median of each data set. The error bars include the total error for both instruments. The middle panel shows the mean difference (blue) of these profiles and the standard error of the mean. The right panel shows the combined error (purple) of the instruments and the standard deviation of the differences (brown).

Figure 30. Relative differences between MIPAS CFC-11 for the FR (left) and RR (right) period and each comparison instrument, calculated as (MIPAS − Other) / MIPAS × 100 %.

Figure 31. Relative differences between MIPAS CFC-12 for the FR (left) and RR (right) period and each comparison instrument, calculated as (MIPAS − Other) / MIPAS × 100 %.
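The relative difference used in Figs. 30 and 31, (MIPAS − Other) / MIPAS × 100 %, can be written out as a short sketch. The function name and the handling of a zero MIPAS value (returning NaN) are assumptions made for illustration only.

```python
import math


def relative_difference_percent(mipas, other):
    """Relative difference (MIPAS - Other) / MIPAS * 100 per altitude level.

    mipas, other: mixing ratio profiles (e.g. in pptv) on a common grid.
    Levels where the MIPAS value is zero yield NaN (assumed convention).
    """
    result = []
    for m, o in zip(mipas, other):
        if m == 0:
            result.append(math.nan)
        else:
            result.append((m - o) / m * 100.0)
    return result
```

Because MIPAS is the denominator, a positive value means that MIPAS is higher than the comparison instrument at that level.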
observed variability and the measurement error should be small and explainable by the natural variability. If the random measurement error estimate exceeds the observed variability, then the error estimates are too conservative. The results of this analysis for both species and measurement periods are shown in Figs. 32 and 33. For the CFC-12 FR period, the observed variability is fully explained by the estimated random errors. For the other products, about two-thirds to three-quarters of the observed variability is explained, except for CFC-11 RR below 20 km.
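The comparison described here can be illustrated with a small sketch that relates the estimated random error to the observed variability at one altitude level. Expressing the "explained" share as a ratio of standard deviations is an assumption, since the text does not state whether standard deviations or variances are compared; the function name and return layout are likewise hypothetical.

```python
def explained_fraction(estimated_random_error, observed_std):
    """Relate the estimated random error to the observed variability.

    Both arguments are 1-sigma values at one altitude level. Returns the
    ratio and a flag that is True when the error estimate exceeds the
    observed variability, i.e. when the estimate is too conservative.
    """
    if observed_std <= 0:
        raise ValueError("observed_std must be positive")
    ratio = estimated_random_error / observed_std
    return {"ratio": ratio, "too_conservative": ratio > 1.0}
```

Under this reading, a ratio of about 0.67 to 0.75 would correspond to the two-thirds to three-quarters quoted for most products, and a ratio close to one to the CFC-12 FR case.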

Figure 34. Left panel: altitude-latitude cross section of the instrument drift in MIPAS CFC-11. This drift is calculated by comparing the temporal evolution of CFC-11 from two different setups. One setup uses nonlinearity correction coefficients used for the bulk MIPAS retrieval to date. The other uses newly suggested time-dependent nonlinearity correction coefficients (Eckert et al., 2014, Sect. 3.3). The drift is shown in relative terms, referring to the mean CFC-11 mixing ratio in the middle of the time series. Blueish tiles indicate that the new coefficients result in higher CFC-11 mixing ratios, while reddish tiles indicate the opposite. White areas indicate that there were too few or no data points available to estimate a drift properly. Right panel: altitude-latitude cross section of relative MIPAS CFC-11 trends without drift correction, calculated from data covering January 2005 to April 2012. The trend is weighted with the CFC-11 mixing ratio of the middle of the time series for each tile. Blueish tiles indicate declining CFC-11 mixing ratios, while increasing mixing ratios are represented by reddish tiles. Hatching indicates that the trends are either not significant at the 2-sigma level or that χ² is more than 10 % different from one. Note the different color bars.

Figure 35. Left panel: altitude-latitude cross section of the instrument drift in MIPAS CFC-12. Blueish tiles indicate that the new coefficients result in higher CFC-12 mixing ratios, while reddish tiles indicate the opposite. White areas indicate that there were too few or no data points available to estimate a drift properly. Right panel: altitude-latitude cross section of relative MIPAS CFC-12 trends, calculated from data covering January 2005 to April 2012. Blueish tiles indicate declining CFC-12 mixing ratios, while increasing mixing ratios are represented by reddish tiles. Hatching indicates that the trends are either not significant at the 2-sigma level or that χ² is more than 10 % different from one.

Table 2. Mean matching distance and time for comparisons based on collocated measurements.