The potential of clear-sky carbon dioxide satellite retrievals

Since the launch of the Greenhouse Gases Observing Satellite (GOSAT) in 2009, retrieval algorithms designed to infer the column-averaged dry-air mole fraction of carbon dioxide (XCO2) from hyperspectral near-infrared observations of reflected sunlight have been greatly improved. They now generally include the scattering effects of clouds and aerosols, as early work found that absorption-only retrievals, which neglected these effects, often incurred unacceptably large errors, even for scenes with optically thin cloud or aerosol layers. However, these “full-physics” retrievals tend to be computationally expensive and may incur biases from trying to deduce the properties of clouds and aerosols when there are none present. Additionally, algorithms are now available that can quickly and effectively identify and remove most scenes in which cloud or aerosol scattering plays a significant role. In this work, we test the hypothesis that non-scattering, or “clear-sky”, retrievals may perform as well as full-physics retrievals for sufficiently clear scenes. Clear-sky retrievals could potentially avoid errors and biases brought about by trying to infer properties of clouds and aerosols when none are present. Clear-sky retrievals are also desirable because they are orders of magnitude faster than full-physics retrievals. Here we use a simplified version of the Atmospheric Carbon Observations from Space (ACOS) XCO2 retrieval algorithm that does not include the scattering and absorption effects of clouds or aerosols. It was found that for simulated Orbiting Carbon Observatory-2 (OCO-2) measurements, the clear-sky retrieval had errors comparable to those of the full-physics retrieval. For real GOSAT data, the clear-sky retrieval had errors 0–20 % larger than the full-physics retrieval over land and errors roughly 20–35 % larger over ocean, depending on filtration level.
In general, the clear-sky retrieval had XCO2 root-mean-square errors (RMSEs) of less than 2.0 ppm, relative to Total Carbon Column Observing Network (TCCON) measurements and a suite of CO2 models, when adequately filtered through the use of a custom genetic algorithm filtering system. These results imply that non-scattering XCO2 retrievals are potentially more useful than previous literature suggests, as the filtering methods we employ are able to remove measurements in which scattering can cause significant errors. Additionally, the computational benefits of non-scattering retrievals mean they may be useful for certain applications that require large amounts of data but have less stringent error requirements.


Introduction
Recently, space-based instruments such as the Greenhouse Gases Observing Satellite (GOSAT; Yokota et al., 2009) and the Orbiting Carbon Observatory-2 (OCO-2; Crisp et al., 2008) have been launched with the goal of providing accurate global measurements of greenhouse gas concentrations, including carbon dioxide (CO2). Only approximately half of the anthropogenically emitted CO2 stays in the atmosphere. The remaining molecules are absorbed by the land and ocean, but where this absorption takes place is still highly uncertain, especially over land surfaces (Le Quéré et al., 2009). CO2 inverse modeling systems, designed to answer important questions about Earth's carbon sources and sinks and their interaction with the atmosphere, are heavily dependent on the density and quality of CO2 measurements (Rayner and O'Brien, 2001; Baker et al., 2010; Chevallier et al., 2007, 2009). Global coverage of CO2 measurements will improve the accuracy and precision of their results, but only if the measurements themselves are sufficiently precise and accurate. Specifically, it has been shown that space-based measurements of the column-averaged dry-air mole fraction of carbon dioxide, or XCO2, need a precision of better than about 0.5 % (∼ 2 ppm) to gain more information about the carbon cycle compared to only having access to ground-based measurements. In terms of a bias between the measured CO2 and the true amount present in the atmosphere, even a regional bias of a few tenths of a ppm may be detrimental to CO2 inverse modeling systems (Chevallier et al., 2007; Basu et al., 2013). Thus, it is critically important to minimize random errors and biases in satellite measurements of CO2 in order to be able to correctly answer questions about the carbon cycle.
When retrieving XCO2 from space-based instruments, one of the primary issues is the presence of clouds and aerosols. These contaminants can introduce large errors into a retrieval because they tend to modify the light path in ways that are difficult to quantify. In order to accurately measure XCO2, the length of the light path must be known. If clouds and aerosols are present they can scatter the reflected sunlight in multiple directions, which can drastically alter the length of the light path seen by the sensor and result in significant errors when calculating XCO2. Thus, neglecting scattering when measuring scenes containing clouds and aerosols can lead to substantial errors in XCO2. These errors are often in excess of 1 % (∼ 4 ppm of XCO2) and can be tens of ppm for scenes with thick cloud or aerosol layers (O'Brien and Rayner, 2002; Aben et al., 2007; Butz et al., 2009).
A common method used to avoid these large XCO2 errors caused by light path modification is to parameterize clouds and aerosols explicitly within the XCO2 retrieval. This often includes adding one or more scattering particle types to the retrieval along with variables that describe their optical and/or physical properties (e.g., optical depth, height of the scattering layer, single scatter albedo) (Butz et al., 2009; Yokota et al., 2009; Crisp et al., 2010; Reuter et al., 2013; Parker et al., 2011). These particle types and their corresponding properties are intended to represent typical clouds and aerosols found in the atmosphere. However, adding cloud and aerosol parameters to the retrieval algorithm can result in new issues such as creating an underconstrained problem or inducing nonlinearity in the forward model (Nelson, 2015). Further, it has been shown that, for certain retrieval algorithm setups, these "full-physics" retrievals may incur biases from attempting to account for clouds and aerosols when none are present. For ideal, extremely clear scenes, this becomes an issue because the addition of a cloud and aerosol parameterization may be detrimental rather than beneficial. Additionally, Fig. 1 shows a comparison of retrieved full-physics optical depths from build 3.4 (B3.4) of the NASA Atmospheric CO2 Observations from Space (ACOS) XCO2 retrieval algorithm (Crisp et al., 2010) to optical depths measured from the highly accurate AErosol RObotic NETwork (AERONET; Holben et al., 1998), using coincidence criteria of ±30 min and 0.1°. The aerosol optical depths retrieved by the full-physics retrieval were not well correlated with the AERONET measurement for a particular scene, which suggests that the full-physics retrieval often has difficulty obtaining information about clouds and aerosols.
The aforementioned problems associated with the full-physics retrieval algorithm motivated a study of a simplified non-scattering, or "clear-sky", retrieval to test the hypothesis that it could provide comparably accurate XCO2 measurements, given appropriate filtering of scenes contaminated by clouds and aerosols. These clear-sky retrievals are simple and highly linear because they assume no scattering or absorption effects caused by clouds or aerosols. Thus, clear-sky retrievals may avoid introducing unwanted biases when clouds and aerosols are not present. Recent work by Butz et al. (2013) has demonstrated that, for simulated measurements over ocean, a clear-sky retrieval can theoretically be used when scenes containing significant light path perturbations are removed. Correspondingly, this approach is now used in the operational RemoTeC retrieval (Butz et al., 2009).
Clear-sky retrievals are also desirable because of their high computational efficiency relative to full-physics retrievals. This is primarily because of the computational expense associated with calculating scattering from clouds and aerosols. The current operational ACOS algorithm takes roughly 10 min per measurement per CPU core and OCO-2 collects about 10^6 measurements per day. This restricts the number of measurements able to be fully processed. The use of a clear-sky retrieval would thus, with current computational limits, allow for approximately 1-2 orders of magnitude more data to be processed.
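The processing burden quoted above can be made concrete with a short back-of-the-envelope calculation. The sketch below uses only the two figures given in the text (about 10 min per measurement per core and about 10^6 measurements per day); the 10x and 100x speed-ups are assumed stand-ins for "1-2 orders of magnitude", and the function name is hypothetical.

```python
# Back-of-the-envelope estimate of the CPU cores needed to keep pace with OCO-2.
MINUTES_PER_RETRIEVAL = 10.0   # full-physics cost per measurement per core (from the text)
MEASUREMENTS_PER_DAY = 1.0e6   # approximate OCO-2 data volume (from the text)
MINUTES_PER_DAY = 24 * 60


def cores_needed(minutes_per_retrieval, measurements_per_day=MEASUREMENTS_PER_DAY):
    """Cores required to process one day of measurements within one day."""
    return minutes_per_retrieval * measurements_per_day / MINUTES_PER_DAY


full_physics_cores = cores_needed(MINUTES_PER_RETRIEVAL)

# Assumed speed-ups of 10x and 100x for a clear-sky retrieval.
clear_sky_cores = [cores_needed(MINUTES_PER_RETRIEVAL / speedup) for speedup in (10, 100)]

print(round(full_physics_cores), [round(c) for c in clear_sky_cores])
```

Under these assumptions, processing every measurement with the full-physics retrieval would occupy several thousand cores continuously, whereas the clear-sky retrieval would need tens to hundreds.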
We begin by testing our hypothesis that clear-sky retrievals may perform as well as full-physics retrievals for sufficiently clear scenes by evaluating synthetic OCO-2 measurements. We then extend the analysis by testing our hypothesis on real GOSAT measurements. We use mature pre-filtering techniques to remove scenes obviously containing clouds and aerosols and employ the Data Ordering through Genetic Optimization (DOGO) system (Mandrake et al., 2013) to filter out additional contaminated measurements and improve the quality of the data. Global and regional statistics are calculated for retrievals over both land and ocean surfaces.
R. R. Nelson et al.: Clear-sky carbon dioxide satellite retrievals
Section two gives details on the full-physics and clear-sky XCO2 retrievals. The third section discusses the simulated OCO-2 and real GOSAT data sets used in this study. The fourth section describes the validation sources and the pre- and post-filtering techniques used to remove scenes containing clouds and aerosols. Section five contains a comparison of clear-sky XCO2 retrievals to full-physics XCO2 retrievals. The sixth section summarizes the study's results and draws conclusions about the utility of clear-sky XCO2 retrievals.

Full-physics vs. clear-sky XCO2 retrievals
Hyperspectral measurements of reflected sunlight in the near-infrared can be used to infer CO2 concentrations from space by analyzing molecular absorption. The geometry of the light path must be known in conjunction with the magnitude of the absorption in order for XCO2 to be accurately estimated. The instruments onboard GOSAT and OCO-2 make use of this method. Typically, a relatively weak CO2 absorption band located in the near-infrared at approximately 1.6 µm and a stronger CO2 absorption band at 2.0 µm are used in conjunction to estimate the average amount of CO2 in the light path seen by the instrument's sensors. Of note, the 2.0 µm band is used to gain information about CO2 but is also more sensitive to aerosols than the 1.6 µm band. Additionally, an oxygen absorption feature near 0.76 µm, known as the O2 A-band, is often employed to help filter out clouds and aerosols (Taylor et al., 2016) (see Sect. 4.2) and to retrieve surface pressure, which acts as a proxy for light path length.
Because current methods for passively measuring CO2 are unable to give much information about the vertical distribution of CO2 (Connor et al., 2008), a column-averaged value is typically the final product retrieved from the measurement. This value is specifically known as the column-averaged dry-air mole fraction of carbon dioxide, or XCO2:
$$X_{\mathrm{CO_2}} = \frac{\int_0^\infty N_{\mathrm{CO_2}}(z)\,\mathrm{d}z}{\int_0^\infty N_{\mathrm{d}}(z)\,\mathrm{d}z},$$
where N_CO2(z) is the molecular number density of CO2 at altitude z and N_d(z) is the molecular number density of dry air at altitude z.
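As a numerical illustration of this definition, the following sketch integrates two number-density profiles with the trapezoidal rule and takes their ratio. The exponential dry-air profile, its 8 km scale height, the surface number density, and the uniform 395 ppm CO2 mixing ratio are all made-up illustrative values, and `column_average` is a hypothetical helper rather than part of any retrieval code.

```python
import numpy as np


def column_average(n_co2, n_dry, z):
    """Ratio of the vertically integrated CO2 and dry-air number densities."""
    z = np.asarray(z, dtype=float)
    n_co2 = np.asarray(n_co2, dtype=float)
    n_dry = np.asarray(n_dry, dtype=float)
    dz = np.diff(z)
    # Trapezoidal rule applied to each profile separately.
    column_co2 = np.sum(0.5 * (n_co2[1:] + n_co2[:-1]) * dz)
    column_dry = np.sum(0.5 * (n_dry[1:] + n_dry[:-1]) * dz)
    return column_co2 / column_dry


# Hypothetical atmosphere: exponential dry-air profile with an 8 km scale height.
z = np.linspace(0.0, 60.0e3, 601)          # altitude grid in m
n_dry = 2.5e25 * np.exp(-z / 8.0e3)        # dry-air number density, molecules m^-3
n_co2 = 395.0e-6 * n_dry                   # vertically uniform 395 ppm of CO2

xco2_ppm = column_average(n_co2, n_dry, z) * 1.0e6
print(xco2_ppm)
```

For a vertically uniform mixing ratio the column average reduces to that mixing ratio, which makes this a convenient sanity check before using profiles that vary with altitude.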
Values of XCO2 are estimated by the ACOS retrieval algorithm using the measured radiances and a priori information to optimize a state vector (Rodgers, 2000). Complete  (Wunch et al., 2011a).
In this work, we performed full-physics and clear-sky XCO2 retrievals on both simulated OCO-2 measurements and real GOSAT measurements. Some details of both retrievals are given in Table 1.
The clear-sky retrieval utilizes the CO2 near-infrared bands at 1.6 and 2.0 µm but does not include cloud or aerosol parameters in the state vector. Instead of using the O2 A-band at 0.76 µm to retrieve information about surface pressure, which is used to estimate N_d, the clear-sky retrieval simply uses the a priori surface pressure value. We included Rayleigh scattering by air molecules for the two near-infrared CO2 bands, but these effects are likely negligible at such long wavelengths.
The full-physics retrieval uses the two near-infrared CO2 bands as well as the O2 A-band at 0.76 µm. The O2 A-band is more sensitive to small cloud and aerosol particles, and therefore its inclusion can improve the measurement of cloud and aerosol parameters in the full-physics retrieval. The ACOS B3.4 retrieval parameterizes scattering effects by including four unique cloud and aerosol types in its state vector, which are assumed to have a vertical Gaussian distribution and are assigned a magnitude, width, and height. Two of the four types are a generic water cloud and ice cloud. The remaining two types are the Kahn 2b and 3b aerosol types (Kahn et al., 2001). The Kahn 2b aerosol type is a mixture of coarse- and fine-mode dust, while the Kahn 3b aerosol type is a mixture of smaller carbonaceous aerosols. Both 2b and 3b also contain sulfate and sea salt components. Simulations have suggested that a combination of these four scattering types is sufficient to approximately represent any type of scene observed by GOSAT or OCO-2 (O'Dell et al., 2012).

Data
The simulated OCO-2 data set consists of retrievals performed on approximately 44 000 synthetic measurements spanning 17-18 June 2012 and 19-20 December 2012 (a total of 58 orbits), providing a full range of solar and satellite geometries. Scenes over land used nadir viewing geometry, while those over ocean used glint viewing geometry. Glint viewing geometry over land was not examined in this study because GOSAT is unable to make measurements using that viewing method. The simulated radiances were generated by the Colorado State University (CSU) Orbit Simulator, which uses realistic meteorology, cloud and aerosol distributions, and surface properties (O'Brien et al., 2009). Gaussian noise, consistent with the actual OCO-2 instrument noise (Frankenberg et al., 2015), was added to the synthetic measurements to make the retrievals as realistic as possible. For this study, National Centers for Environmental Prediction (NCEP) reanalysis data were used for the meteorological a priori, while ECMWF IFS model data were used to create the synthetic radiances. This intentional mismatch in meteorology mimics real-world inaccuracies when measuring a given scene. For both the full-physics and clear-sky retrievals performed on simulated measurements, the a priori surface pressure was taken from the NCEP reanalysis data. As discussed in Sect. 2, the full-physics retrieval then used the O2 A-band to fine-tune the surface pressure estimate, while the clear-sky retrieval assumed the a priori to be correct. The vertical profiles of clouds and aerosols used to create the synthetic measurements were derived from the Cloud-Aerosol Lidar with Orthogonal Polarization (CALIOP) instrument onboard the Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observations (CALIPSO; Winker et al., 2009), which currently flies in approximately the same polar orbit as OCO-2 as part of the Afternoon Constellation (L'Ecuyer and Jiang, 2010).
Land surfaces included albedos and bi-directional reflectance distribution functions (BRDFs) from the Moderate-resolution Imaging Spectroradiometer (MODIS), while ocean surfaces were modeled as specular reflectors (Cox and Munk, 1954) with a foam component based on wind speed taken from the ECMWF IFS.
The GOSAT data set contained retrievals performed on 25 000 real measurements made from April 2009 to December 2012. We included both ocean and land scenes and attempted to represent the majority of surface types across the globe without being regionally biased. Measurements were only included in the data set if they had a corresponding XCO2 validation source (see Sect. 4.1). Surface pressure a priori values were taken from the ECMWF IFS model, which has been shown to be accurate to within 1-2 hPa under most conditions (Salstein et al., 2008; Crowell et al., 2015). As the clear-sky retrieval did not retrieve surface pressure, it was necessary to have an accurate a priori surface pressure estimate. To ensure this methodology was valid, we tested a version of the clear-sky retrieval that was allowed to retrieve surface pressure from the O2 A-band and found the results to be consistent. Finally, OCO-2 and GOSAT retrievals that did not converge in the full-physics retrieval were not used for this study.

Methodology
In this section we discuss methods of characterizing the XCO2 errors in the retrievals, the pre-filtering used to initially remove heavily contaminated measurements, and the post-filtering with the DOGO system used to further improve the quality of the data set by removing additional contaminated scenes.

XCO2 validation sources
To evaluate the accuracy and precision of the XCO2 retrievals, a "true" XCO2 value was needed. For the OCO-2 retrievals, the truth was known because the measurements were synthetically created. For the GOSAT retrievals, we considered the truth to be either a TCCON measurement co-located in time and space using the technique described in Guerlet et al. (2013) or the average of seven CO2 models that assimilate ground-based and aircraft CO2 measurements and agreed to within 1.0 ppm for a given GOSAT measurement location and time. This ensured we had sufficient ocean validation because TCCON stations are mostly concentrated on large land masses. The CO2 models used include two from the University of Edinburgh (Feng et al., 2011), one from Le Laboratoire des Sciences du Climat et de l'Environnement (Chevallier et al., 2010), two from the National Institute for Environmental Studies (Maksyutov et al., 2013), the 2010 version of CarbonTracker (Peters et al., 2007), and the National Oceanic and Atmospheric Administration Parameterized Chemistry and Transport Model (NOAA PCTM; Kawa et al., 2004). Both methods of XCO2 validation have limitations, but we believe they are still useful in evaluating the performance of the retrieval algorithms. Additionally, the results of this study were found to be largely insensitive to the method of validation used.
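The model-consensus criterion can be sketched in a few lines. The text does not say how "agreed to within 1.0 ppm" was computed, so the example below assumes it means that the spread (maximum minus minimum) of the seven model values is at most 1.0 ppm; the function name and the sample values are hypothetical.

```python
import numpy as np


def model_consensus(model_xco2_ppm, max_spread_ppm=1.0):
    """Return the model-mean XCO2 if the models agree, otherwise None.

    Agreement is assumed to mean that the largest and smallest model values
    differ by no more than max_spread_ppm.
    """
    values = np.asarray(model_xco2_ppm, dtype=float)
    spread = values.max() - values.min()
    if spread > max_spread_ppm:
        return None
    return float(values.mean())


# Seven made-up model values (ppm) at one sounding location and time.
agree = model_consensus([390.1, 390.4, 390.2, 390.6, 390.3, 390.5, 390.0])
disagree = model_consensus([390.1, 391.4, 390.2, 390.6, 390.3, 390.5, 390.0])
print(agree, disagree)
```

A sounding for which the function returns None would simply have no model-based validation value and, absent a co-located TCCON measurement, would be left out of the data set.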

Pre-filtering
In order to remove measurements heavily contaminated by clouds or aerosols, the OCO-2 and GOSAT data sets were pre-filtered using the O2 A-band Preprocessor (ABP; O'Dell et al., 2012). The ABP works by estimating surface pressure and albedo from the O2 A-band assuming no clouds or aerosols are present. If the scene is contaminated by thick cloud or aerosol layers, the results may have identifiably large residuals between the measured and modeled radiances as well as unrealistic surface pressure or albedo values. The ABP was run on all the measurements because it is extremely fast and computationally inexpensive. The use of the ABP removes most, but not all, scenes contaminated by clouds and aerosols (Taylor et al., 2016). Clear-sky measurements at high latitudes and over ice and snow, which typically give reduced signal-to-noise ratios, are also removed by our pre-filters. Figure 2 shows the global distribution of the 44 000 pre-filtered synthetic OCO-2 measurements, while Fig. 3 shows the location of the 25 000 pre-filtered real GOSAT measurements used in this study, and whether the XCO2 validation source (see Sect. 4.1) was a model consensus or a TCCON measurement.

DOGO: Data Ordering through Genetic Optimization
While the pre-filtering techniques we employed are effective at removing many scenes containing clouds and aerosols, other tools are needed to identify very clear scenes that are nearly free of cloud and aerosol contamination, on which we believe XCO2 retrievals will be most accurate. One method used to filter ACOS B3.4 data was a suite of approximately 18 tests to identify scenes of the highest quality, which tend to be the clearest. In this study a genetic algorithm (DOGO; Mandrake et al., 2013), which is an optimization tool designed to mimic natural selection to explore high-dimensional parameter spaces, was employed to find optimal post-filters for retrievals performed on synthetic OCO-2 measurements and real GOSAT measurements.
Unpublished studies have shown that this approach yields similar results to the hand-tuned filters, but typically with far fewer filtering parameters necessary. Additionally, while the hand-tuned procedure requires trial and error to find the best possible filters, DOGO is automated and can quickly find an optimum filter set with minimal hand-tuning required. For this work, DOGO attempted to find variables that, when used to filter a data set, minimized the root mean square of the XCO2 error, where the XCO2 error is defined as the difference between the retrieved XCO2 and the true XCO2 (described in Sect. 4.1). We chose this parameter because it instructs DOGO to remove outliers as well as reduce the overall bias, as the formula for RMSE includes both variance and bias. The parameters allowed for selection by DOGO were all derived from the near-infrared measurements themselves, e.g., band signal levels, signal-to-noise ratios, retrieved surface pressure. The algorithm was not allowed to select certain variables, such as the validation XCO2 or CALIOP measurements (used to create the OCO-2 simulations). This was to ensure that DOGO did not "cheat" by having access to external information that would not generally be available for real instruments. Filtering was done for different "throughputs", which equal the percent of data retained after post-filtering with DOGO at a given filtration level. The DOGO system can also use more than one "rule", or filtering variable, when determining an optimal filtering strategy. That is, one rule selects the single most effective parameter at minimizing the XCO2 RMSE at a given throughput, while two rules select the best combination of two parameters in reducing the error. A larger number of rules results in a greater reduction in error for a given throughput, but typically only two to five rules are needed to maximize this reduction (Mandrake et al., 2013).
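The ideas of a throughput, a single filtering rule, and the RMSE as a combined measure of bias and scatter can be illustrated with a toy example. The sketch below is not DOGO and uses no genetic algorithm: it is a brute-force stand-in that, for one rule, keeps the requested fraction of soundings with the smallest values of each candidate variable and reports the variable giving the lowest RMSE. All names and the synthetic data are hypothetical.

```python
import numpy as np


def rmse(err):
    """Root-mean-square error; note that RMSE^2 = bias^2 + variance."""
    err = np.asarray(err, dtype=float)
    return float(np.sqrt(np.mean(err ** 2)))


def best_single_rule(features, err, throughput):
    """Pick the one variable whose lowest-valued soundings give the smallest RMSE.

    throughput is the fraction (0-1) of soundings retained after filtering.
    """
    err = np.asarray(err, dtype=float)
    n_keep = max(1, int(round(throughput * err.size)))
    best_name, best_rmse = None, np.inf
    for name, values in features.items():
        keep = np.argsort(np.asarray(values))[:n_keep]  # lowest values pass the filter
        score = rmse(err[keep])
        if score < best_rmse:
            best_name, best_rmse = name, score
    return best_name, best_rmse


# Synthetic illustration: one variable tracks contamination, the other is unrelated.
rng = np.random.default_rng(0)
n = 2000
scattering_proxy = rng.exponential(0.2, n)
unrelated = rng.normal(size=n)
xco2_error = rng.normal(0.0, 0.5, n) - 5.0 * scattering_proxy  # ppm, made up

features = {"scattering_proxy": scattering_proxy, "unrelated": unrelated}
rule_name, filtered_rmse = best_single_rule(features, xco2_error, throughput=0.5)
unfiltered_rmse = rmse(xco2_error)
print(rule_name, filtered_rmse, unfiltered_rmse)
```

Because the filter only ever sees measurement-derived variables and never the true contamination, the example mirrors the restriction placed on DOGO; a real multi-rule search would also optimize upper and lower thresholds jointly across several variables.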
In this study we used four rules and found that increasing the number of rules did not further reduce the errors. DOGO was run separately for both clear-sky and full-physics retrievals as well as for land and ocean surfaces because it was hypothesized that a different set of four filtering parameters might be selected for each retrieval and surface type combination.
Prior to post-filtering the data with DOGO, we applied a simple bias correction, unique to each retrieval and surface type, to the GOSAT XCO2 retrievals, similar to what is done for operational GOSAT retrievals (Wunch et al., 2011b; Guerlet et al., 2013). This was because the clear-sky GOSAT retrievals over ocean initially contained a large bias in XCO2, which initial testing showed was not entirely removed by DOGO. The XCO2 data were adjusted using both a slope and an offset:
$$X'_{\mathrm{CO_2}} = X_{\mathrm{CO_2}} - A - B \cdot x,$$
where A is an offset, x is a single variable chosen for bias correction, and B is the slope of a best-fit line through x with respect to the XCO2 error. We use a single correction parameter, rather than 2-4 as employed operationally, because the qualitative results of this study are unchanged when employing more bias-correction parameters. The difference between the ABP-estimated surface pressure and the a priori surface pressure (dP_ABP) was selected for the bias correction for the clear-sky retrieval over land, clear-sky retrieval over ocean, and full-physics retrieval over land. The signal ratio of the 2.0 µm CO2 band to the 1.6 µm CO2 band (SR32) was used for the full-physics retrieval over ocean. These parameters were chosen because they showed the highest correlation with the XCO2 error. Both dP_ABP and SR32 relate to light path modification caused by clouds and aerosols. dP_ABP is typically negative and the magnitude is correlated with the amount of light path shortening due to clouds and aerosols.
The signal ratio is influenced by the wavelength dependence of clouds and aerosols in the two near-infrared CO2 bands.
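A single-parameter correction of this kind can be sketched as a straight-line fit followed by a subtraction. The example assumes that the correction removes the fitted line A + B·x from the retrieved XCO2, with A and B obtained by ordinary least squares against the XCO2 error; the helper names and all numbers are hypothetical.

```python
import numpy as np


def fit_bias_correction(x, xco2_error):
    """Least-squares line through the XCO2 error versus x: error ~ A + B * x."""
    slope, offset = np.polyfit(x, xco2_error, 1)  # polyfit returns highest degree first
    return offset, slope


def apply_bias_correction(xco2, x, offset, slope):
    """Subtract the fitted offset and slope term from the retrieved XCO2."""
    return np.asarray(xco2, dtype=float) - offset - slope * np.asarray(x, dtype=float)


# Made-up soundings: the error depends linearly on a dP_ABP-like variable (hPa).
dp_abp = np.array([-4.0, -2.5, -1.0, 0.0, 0.5])
truth = np.full(dp_abp.shape, 392.0)                 # validation XCO2 in ppm
retrieved = truth + 0.8 + 0.4 * dp_abp               # retrieved XCO2 with a linear bias

A, B = fit_bias_correction(dp_abp, retrieved - truth)
corrected = apply_bias_correction(retrieved, dp_abp, A, B)
print(A, B, corrected)
```

In this idealized case the error is exactly linear in x, so the correction recovers the truth; with real data the fit removes only the part of the error that co-varies with the chosen variable, leaving the remainder for the filters to address.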

XCO2 retrieval comparison
In the previous section we described our use of pre-filtering with the ABP and post-filtering with DOGO to produce a data set with biases and outliers minimized to the greatest extent possible. Here we apply and evaluate our technique and examine the performance of the clear-sky XCO2 retrieval relative to the full-physics XCO2 retrieval for simulated OCO-2 measurements and real GOSAT measurements. We evaluate the effectiveness of DOGO at reducing RMSEs, investigate the impact of clouds and aerosols on the OCO-2 simulations, and examine regional biases, scatter, and RMSEs.

Summary of OCO-2 error statistics
We begin by evaluating the performance of the clear-sky retrieval on simulated OCO-2 measurements. Figure 4 demonstrates the effectiveness of DOGO at reducing XCO2 RMSEs as a function of throughput. The initial reduction in error at high throughputs is dramatic, as the algorithm is easily able to identify and remove highly contaminated scenes. All data sets begin to plateau at approximately 50-80 % throughput as DOGO has already removed the obviously contaminated scenes, as discussed below and demonstrated in Figs. 5 and 6, and is now selecting the best of what remains. At very low throughputs, there are not enough measurements for the algorithm to function properly. Additionally, at such low throughputs, the RMSE approaches the error due to simulated instrument noise over ocean (∼ 0.35 ppm). Interestingly, the RMSEs over land plateau around 0.80 ppm, while the simulated noise limit is ∼ 0.50 ppm. This suggests an underestimation of the posterior errors for OCO-2 simulations over land, a feature seen previously in both simulations and real GOSAT data (Kulawik et al., 2016).
Over ocean, the clear-sky retrieval (dashed blue line) performs nearly as well as or better than the full-physics retrieval (solid blue line) at all throughputs, never having an XCO2 RMSE more than 0.25 ppm worse than the full-physics data set.
Figure 4. DOGO filters applied to full-physics (solid) and clear-sky (dashed) retrievals performed on simulated OCO-2 measurements for ocean (blue) and land (orange) surfaces. Four variables were chosen and optimized by the DOGO system. The x axis is throughput, which represents the percentage of data that remains after applying the DOGO filters. The y axis in the top panel is the XCO2 RMSE. The y axis in the bottom panel is the percent difference between the clear-sky XCO2 RMSE (E1) and the full-physics XCO2 RMSE (E2): (E1 − E2)/(0.5 · (E1 + E2)) · 100. Positive values indicate smaller full-physics XCO2 RMSEs, while negative values indicate smaller clear-sky XCO2 RMSEs.
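The percent difference plotted in the bottom panel of the DOGO figures is a symmetric relative difference, (E1 − E2)/(0.5 · (E1 + E2)) · 100, with E1 the clear-sky RMSE and E2 the full-physics RMSE. A minimal sketch follows; the function name and the sample RMSE values are hypothetical.

```python
def percent_difference(e_clear_sky, e_full_physics):
    """Symmetric percent difference between two RMSEs.

    Positive values mean the full-physics RMSE is smaller;
    negative values mean the clear-sky RMSE is smaller.
    """
    mean_rmse = 0.5 * (e_clear_sky + e_full_physics)
    return (e_clear_sky - e_full_physics) / mean_rmse * 100.0


# Made-up RMSEs in ppm.
worse_clear_sky = percent_difference(1.1, 1.0)
better_clear_sky = percent_difference(1.0, 1.1)
print(worse_clear_sky, better_clear_sky)
```

Normalizing by the mean of the two errors, rather than by either one alone, makes the measure antisymmetric: swapping the two retrievals changes only the sign.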
The first 20 % of scenes filtered out by DOGO (from 100 to 80 % throughput) likely contain very thick clouds and aerosols, as discussed below, such that both the full-physics and clear-sky retrievals produce large RMSEs because neither is able to account for such severe light path modifications. It is interesting that the full-physics retrieval performs just as poorly as the clear-sky retrieval for these highly contaminated scenes, even though it attempts to account for the presence of clouds and aerosols. This may suggest that in our simulations over ocean, the full-physics retrieval struggles to properly quantify the light path modifications of clouds and aerosols. However, the relative magnitudes and trends in the RMSE reduction can be sensitive to the pre-filtering setup and bias corrections. From 80 to 30 % throughput, there are still many scenes containing a non-trivial amount of contamination due to clouds and aerosols. Thus, the full-physics retrieval, which has state vector elements designed to handle these scenarios, outperforms the clear-sky retrieval, which is unable to account for any scattering or absorption by even thin clouds and aerosols. Below 30 % throughput, however, the ocean scenes become pristine enough that the clear-sky retrieval has ∼ 10 % smaller XCO2 RMSEs than the full-physics retrieval. It is likely that the full-physics retrieval struggles slightly compared to the clear-sky retrieval because it is trying to parameterize nonexistent clouds and aerosols and thus has too many degrees of freedom. This result agrees with Butz et al. (2013), who found that simulated measurements containing light path perturbations caused by clouds and aerosols can be identified and removed, thus allowing a clear-sky retrieval to perform well.
Atmos. Meas. Tech., 9, 1671-1684
Figure 5. Clear-sky (left) and full-physics (right) retrieval XCO2 RMSEs vs. the true total optical depth for simulated OCO-2 measurements over ocean. The black lines are binned averages of the XCO2 RMSE for 100 % throughput (yellow markers), 80 % throughput (green markers), and 30 % throughput (blue markers). The histograms represent the relative amount of data for each throughput.
Over land, Fig. 4 demonstrates that the clear-sky retrievals (dashed orange line) consistently have higher RMSEs than the full-physics retrievals (solid orange line) at high throughputs, by as much as 1.5 ppm, or ∼ 40 %. At these higher throughputs, most scenes still contaminated by clouds and aerosols remain and the full-physics retrieval performs better, consistent with expectations. The clear-sky retrieval is unable to account for the complex multiple-scattering effects caused by cloud and aerosol layers as well as their interaction with the surface. However, the XCO2 RMSEs become more comparable at lower throughputs, with the clear-sky retrieval error coming within a tenth of a ppm of the full-physics retrieval error at a throughput of 30 %. This is consistent with our hypothesis that when scenes contaminated by clouds or aerosols are removed, the clear-sky retrieval can perform as well as the full-physics retrieval. However, to demonstrate this explicitly, we must show that DOGO is indeed filtering out contaminated scenes.
Because we know the true profiles of clouds and aerosols used to create these simulated OCO-2 measurements, this is straightforward. Figure 5 shows the binned XCO2 RMSE vs. the true total optical depth (the sum of the true aerosol, ice cloud, and water cloud optical depths from CALIOP, used to create the synthetic measurements) for clear-sky retrievals (left panel) and full-physics retrievals (right panel) over ocean. At 100 % throughput, there are a significant number of thick (τ > 1.0) cloud and aerosol scenes present, along with a secondary peak of thinner cloud and aerosol scenes. These thick scenes are primarily water clouds near the surface that the ABP was unable to identify and remove. In general, the 100, 80, and 30 % throughput XCO2 RMSEs for clear-sky and full-physics retrievals over ocean are nearly equivalent, which agrees with our analysis of the ocean retrievals in Fig. 4. For high optical depth scenes (τ > 1.0) at 100 % throughput, the RMSE of the data is large (over 6 ppm for both retrieval types). This indicates that, as hypothesized, both retrieval types have large errors for scenes containing thick cloud or aerosol layers. Going from 100 % throughput (yellow) to 80 % throughput (green) results in DOGO greatly reducing the number of these high optical depth scenes, which corresponds to the steep initial decline of the ocean retrieval RMSEs in Fig. 4. This is impressive because, as was explained in Sect. 4.3, DOGO is not allowed to use the true optical depth as a filter, indicating that it is using other information contained in the near-infrared measurements themselves to infer the amount of clouds and aerosols in a given scene. Specifically, the CO2 and H2O ratios were frequently selected by DOGO to filter the OCO-2 retrievals.
Figure 6. Clear-sky (left) and full-physics (right) retrieval XCO2 RMSEs vs. the true total optical depth for simulated OCO-2 measurements over land. The black lines are binned averages of the XCO2 RMSE for 100 % throughput (yellow markers), 80 % throughput (green markers), and 30 % throughput (blue markers). The histograms represent the relative amount of data for each throughput.
Figure 8. DOGO filters applied to full-physics (solid) and clear-sky (dashed) retrievals performed on GOSAT measurements for ocean (blue) and land (orange) surfaces. Four variables were chosen and optimized by the DOGO system. The x axis is throughput, which represents the percentage of data that remains after applying the DOGO filters. The y axis in the top panel is the XCO2 RMSE. The y axis in the bottom panel is the percent difference between the clear-sky XCO2 RMSE (E1) and the full-physics XCO2 RMSE (E2): (E1 − E2)/(0.5 · (E1 + E2)) · 100. Positive values indicate smaller full-physics XCO2 RMSEs, while negative values indicate smaller clear-sky XCO2 RMSEs.
The CO2 and H2O ratios were taken from the Iterative Maximum A-Posteriori Differential Optical Absorption Spectroscopy (IMAP-DOAS) algorithm (Frankenberg et al., 2005; Taylor et al., 2016) and are calculated from estimates of CO2 and H2O retrieved independently from the 1.6 and 2.0 µm CO2 bands with a fast, non-scattering algorithm. Deviations from unity in the ratio of the 1.6 to 2.0 µm band values, caused by the wavelength dependence of clouds and aerosols, allow for the identification of many contaminated scenes. dP ABP and parameters related to the 2.0 µm CO2 band signal were also often selected as filters, as they relate to scattering and absorption by clouds and aerosols. At 80 % throughput (green lines and histograms), the full-physics retrieval has slightly smaller RMSEs than the clear-sky retrieval for scenes containing a moderate amount of clouds or aerosols (0.1 < τ < 1.0).
This supports our hypothesis that the full-physics retrieval's parameterization of clouds and aerosols is helpful for these types of scenes and that the clear-sky retrieval struggles because it is unable to account for the light path modifications caused by these contaminants. When only 30 % of the ocean data remain (blue lines and histograms), primarily low optical depth scenes (τ < 0.3) remain and the clear-sky retrieval performs about as well as the full-physics retrieval in terms of XCO2 RMSE. The same is true for precision and bias (not shown). One might think that even slightly contaminated scenes (τ ∼ 0.1–0.3) should have been removed for the 30 % throughput case, but DOGO's goal is to minimize the XCO2 RMSE, not to remove scenes with high optical depths.
A similar analysis was performed for measurements over land, with the results displayed in Fig. 6. As was seen over ocean in Fig. 5, at larger throughputs, higher optical depths correspond to larger RMSEs in the XCO2 data. The clear-sky retrieval struggles with these high optical depth scenes, but also has relatively large RMSEs (∼ 3 ppm) for moderate to small optical depths. Interestingly, the initial number of high optical depth scenes over land is considerably smaller than over ocean. This indicates that the ABP may be more effective at removing high optical depth scenes over land, likely because scattering between cloud or aerosol layers and the surface makes these contaminants more identifiable. As the throughput decreases, the RMSE becomes smaller and more uniform over the entire range of optical depths for both retrieval types. Interestingly, at a throughput of 30 % some high optical depth scenes (τ > 1.0) still remain for the clear-sky retrieval over land. However, the RMSE is still optimally reduced by DOGO and is only ∼ 10 % worse than that of the full-physics retrieval. For the full-physics data set, DOGO chooses to remove nearly all of these thick cloud or aerosol scenes.
This suggests that the clear-sky retrieval may be slightly less sensitive to some high optical depth scenes over land, perhaps due to complex light path cancellation effects.
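The ratio-based screening discussed in this section can be illustrated with a minimal sketch. This is not the DOGO genetic algorithm, which selects and optimizes its own filter variables; the function names, the dictionary keys, the tolerance values, and the toy scenes below are all hypothetical and serve only to show how a deviation-from-unity test trades throughput for XCO2 RMSE.

```python
import math

def rmse(errors):
    """Root-mean-square of XCO2 errors (retrieved minus truth, in ppm)."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))

def ratio_filter(scenes, co2_tol, h2o_tol):
    """Keep scenes whose band ratios stay within a tolerance of unity.

    Each scene is a dict with hypothetical keys:
      'co2_ratio', 'h2o_ratio': ratio of the 1.6 um to 2.0 um band estimates
      'xco2_error': retrieved minus true XCO2 (ppm)
    """
    return [s for s in scenes
            if abs(s["co2_ratio"] - 1.0) <= co2_tol
            and abs(s["h2o_ratio"] - 1.0) <= h2o_tol]

def throughput(kept, total):
    """Percentage of scenes that survive the filter."""
    return 100.0 * len(kept) / len(total)

# Toy scenes (invented values): two are strongly perturbed by scattering.
scenes = [
    {"co2_ratio": 1.001, "h2o_ratio": 0.99, "xco2_error": 0.5},
    {"co2_ratio": 0.998, "h2o_ratio": 1.02, "xco2_error": -0.8},
    {"co2_ratio": 1.030, "h2o_ratio": 1.15, "xco2_error": 4.0},
    {"co2_ratio": 1.004, "h2o_ratio": 1.01, "xco2_error": 1.0},
    {"co2_ratio": 0.950, "h2o_ratio": 0.80, "xco2_error": -6.0},
]

# Hypothetical tolerances; a real system would tune these against RMSE.
kept = ratio_filter(scenes, co2_tol=0.01, h2o_tol=0.05)

rmse_all = rmse([s["xco2_error"] for s in scenes])
rmse_kept = rmse([s["xco2_error"] for s in kept])
print(f"throughput = {throughput(kept, scenes):.0f} %")
print(f"RMSE before = {rmse_all:.2f} ppm, after = {rmse_kept:.2f} ppm")
```

Tightening the tolerances lowers the throughput further; an optimizer such as DOGO searches over this kind of threshold to minimize the RMSE at a target throughput rather than to remove high optical depth scenes as such.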
In addition to the statistical analysis of the entire data set, spatial errors in the OCO-2 retrievals were analyzed to see if regional variability existed and if there were regions where the clear-sky retrieval had smaller or larger errors relative to the full-physics retrieval. The binned mean XCO2 errors, standard deviation of the XCO2 errors, and XCO2 RMSEs for the OCO-2 simulations are shown in Fig. 7. Here we use a throughput of 30 %, where the globally averaged clear-sky XCO2 RMSEs are approximately equivalent to the full-physics errors over ocean and slightly larger (0.1 ppm) over land (as shown in Fig. 4). The coverage after post-filtering to 30 %, shown in Fig. 7, is spatially dependent because of a preference to remove measurements over regions that persistently contain clouds or aerosols (e.g., the Sahara) or have low signal-to-noise ratios (e.g., high latitudes). The clear-sky and full-physics retrieval mean error spatial patterns are similar and both relatively small in magnitude, which indicates that they do not contain large regional biases and, more importantly, that the clear-sky retrieval does not have significantly larger biases than the full-physics retrieval. The scatter and the RMSEs both show limited regional variability over ocean and modest variability over land. The regional clear-sky retrieval errors are approximately the same magnitude as the full-physics retrieval errors over ocean and only marginally larger over land. Overall, these simulated results are promising because they demonstrate that the clear-sky XCO2 retrieval has global and regional error statistics similar to the full-physics XCO2 retrieval.

Summary of GOSAT error statistics
We have shown that clear-sky retrievals can be as accurate as full-physics retrievals for OCO-2 simulations over both land and ocean surfaces when filtering is employed to remove low-quality scenes, including those contaminated by clouds and aerosols. In this section, we explore whether this result is reproducible using real observations. The effectiveness of applying DOGO to the pre-filtered GOSAT data sets is shown in Fig. 8. Initially, as in the OCO-2 simulations, there is a large reduction in the XCO2 RMSE as the throughput is decreased. Based on our results from the OCO-2 simulations, this is likely because DOGO is identifying and filtering out highly contaminated scenes that have large XCO2 errors due to scattering effects. Interestingly, the initial RMSEs over ocean are much smaller than in the simulations, and the initial reduction of error is modest. This may indicate that our pre-filtering and bias correction techniques are especially effective for retrievals over ocean. Regarding the parameters chosen by DOGO, the primary filters selected were dP ABP and parameters related to the 2.0 µm CO2 band signal. These parameters, as discussed in Sect. 5.1, relate to the light path modification effects of clouds and aerosols.
Figure 9. GOSAT full-physics (left column) and clear-sky (right column) retrieval mean XCO2 errors (top row), standard deviation of the XCO2 errors (middle row), and XCO2 RMSEs (bottom row) for 8° × 4° (longitude × latitude) bins for a throughput of 30 %.
Over ocean surfaces, the clear-sky retrieval (dashed blue line) has larger XCO2 RMSEs than the full-physics retrieval (solid blue line), even at very high levels of filtration (low throughputs). The clear-sky retrieval XCO2 RMSEs over ocean range from ∼ 1.0 to 2.0 ppm, depending on throughput. This error is about 0.5–1.0 ppm larger than the corresponding full-physics retrieval errors. As the throughput is initially decreased, the difference in error between the clear-sky and full-physics retrieval over ocean stays approximately constant. Once the throughput drops below ∼ 40 %, however, the clear-sky errors begin to approach the full-physics errors. This qualitatively agrees with our simulated OCO-2 results in that the clear-sky retrieval performs better as contaminated scenes are preferentially removed by DOGO. However, even at low throughputs (less than 40 %), the clear-sky retrievals still have XCO2 RMSEs over ocean roughly 20–35 % larger than those of the full-physics retrieval. This is in contrast to our simulation-based OCO-2 results, suggesting that additional unknown real-world mechanisms not included in the simulations may limit the performance of the clear-sky retrieval on real measurements over ocean surfaces. It is also possible that, despite promising results from our simulation-based tests, our filtering technique is unable to sufficiently remove scenes contaminated by clouds and aerosols over ocean surfaces.
Over land, the clear-sky retrieval (dashed orange line) has errors closer to those of the full-physics retrieval (solid orange line) than it does over ocean. For throughputs greater than ∼ 40 %, the clear-sky retrieval typically has RMSEs ∼ 20 % larger than the full-physics retrieval. This difference is, as it was over ocean, roughly constant for the entire throughput range above ∼ 40 %, which may suggest that even a moderate amount of contamination by clouds or aerosols prevents the clear-sky retrieval from performing as well as the full-physics retrieval for real GOSAT measurements. However, as the throughput is decreased further and most of the contaminated scenes are removed, the clear-sky errors become comparable to the full-physics errors. Unlike over ocean, this result qualitatively agrees with our simulated results over land and suggests that the data can be filtered well enough that the clear-sky retrieval can perform as well as the full-physics retrieval.
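The percent difference plotted in the bottom panel of Fig. 8 is a symmetric measure, normalized by the mean of the two RMSEs rather than by either one alone. A short sketch of that formula follows; the function name and the sample RMSE values are hypothetical and are not taken from the GOSAT results.

```python
def percent_difference(e_clear, e_full):
    """Symmetric percent difference between two RMSEs (ppm).

    Implements (E1 - E2) / (0.5 * (E1 + E2)) * 100, with E1 the clear-sky
    RMSE and E2 the full-physics RMSE. Positive values mean the
    full-physics RMSE is smaller; negative values mean the clear-sky
    RMSE is smaller.
    """
    return (e_clear - e_full) / (0.5 * (e_clear + e_full)) * 100.0

# Invented example values, in ppm.
d_pos = percent_difference(1.2, 1.0)   # clear-sky worse than full-physics
d_neg = percent_difference(0.9, 1.0)   # clear-sky better than full-physics
print(f"{d_pos:.1f} %, {d_neg:.1f} %")
```

Because the denominator is the mean of the two errors, swapping the two retrievals only flips the sign of the result, which makes curves above and below zero directly comparable.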
The regional binned XCO2 mean errors, standard deviation of the XCO2 errors, and XCO2 RMSEs are shown in Fig. 9. The data sets shown were post-filtered using DOGO with a throughput of 30 %. The post-filtering, as was seen in the simulated OCO-2 data, has a preference to remove measurements in regions persistently contaminated by clouds or aerosols and measurements at high latitudes that had low signal-to-noise ratios. The Sahara, for example, is entirely devoid of measurements at a throughput of 30 %, likely because of contamination by large dust particles. Over ocean, the clear-sky retrievals have mean XCO2 errors similar to those of the full-physics retrieval. Regarding the standard deviation of the XCO2 errors, the clear-sky retrieval generally has modestly larger values. Because of this, the clear-sky retrieval consistently also has modestly larger regional XCO2 RMSEs than the full-physics retrievals. The results over land are more variable but are still in agreement with Fig. 8. The regional clear-sky biases may be slightly larger in a few places (e.g., North America), but this may partly be due to low-number statistics. In general, the clear-sky mean errors, standard deviation of the errors, and RMSEs are comparable to their full-physics counterparts. The observed regional variability, compared to relatively uniform ocean error patterns, could be due to heterogeneous surface characteristics or cloud and aerosol compositions. Thus, we cannot say with confidence that clear-sky retrievals perform better or worse for specific regions for real GOSAT data without further investigation, but in general the two retrievals are roughly equivalent over land when contaminated measurements are removed.
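The three regional statistics discussed above are linked: within any bin, the squared RMSE equals the squared mean error plus the (population) variance of the errors, which is why a larger scatter with a similar mean error leads directly to a larger RMSE. The sketch below bins soundings onto an 8° × 4° (longitude × latitude) grid and computes all three quantities. The function names, the tuple layout, and the toy soundings are hypothetical; this is an illustration of the bookkeeping, not the processing code used for Fig. 9.

```python
import math
from collections import defaultdict

def bin_index(lon, lat, dlon=8.0, dlat=4.0):
    """Index of the (longitude x latitude) bin containing a sounding.

    Longitude is assumed to lie in [-180, 180) and latitude in [-90, 90);
    points exactly on the upper edges fall into an extra edge bin.
    """
    return (int(math.floor((lon + 180.0) / dlon)),
            int(math.floor((lat + 90.0) / dlat)))

def regional_stats(soundings, dlon=8.0, dlat=4.0):
    """Per-bin count, mean error, standard deviation, and RMSE.

    soundings: iterable of (lon, lat, xco2_error) tuples, errors in ppm.
    """
    bins = defaultdict(list)
    for lon, lat, err in soundings:
        bins[bin_index(lon, lat, dlon, dlat)].append(err)

    stats = {}
    for key, errs in bins.items():
        n = len(errs)
        mean = sum(errs) / n
        var = sum((e - mean) ** 2 for e in errs) / n  # population variance
        rms = math.sqrt(sum(e * e for e in errs) / n)
        stats[key] = {"n": n, "mean": mean, "std": math.sqrt(var), "rmse": rms}
    return stats

# Toy soundings (invented): three fall in one bin, one in another.
soundings = [
    (-100.0, 40.0, 1.0),
    (-99.0, 41.0, -1.0),
    (-98.0, 41.5, 2.0),
    (10.0, -30.0, 0.5),
]
stats = regional_stats(soundings)
for key, s in sorted(stats.items()):
    print(key, s["n"], round(s["mean"], 3), round(s["std"], 3), round(s["rmse"], 3))
```

Keeping the per-bin count `n` alongside the statistics makes it easy to flag bins where apparent regional differences may simply reflect low-number statistics.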

Conclusions
In this study we evaluated the performance of non-scattering, or "clear-sky", XCO2 retrievals performed on hyperspectral near-infrared measurements of reflected sunlight by comparing them to "full-physics" XCO2 retrievals, which include scattering and absorption by clouds and aerosols. From our statistical analysis, we conclude that clear-sky XCO2 retrievals typically do not perform as well as full-physics XCO2 retrievals when no filtering is applied, consistent with previous findings. However, with the application of pre- and post-filters to remove low-quality measurements contaminated by clouds and aerosols using only information contained in the near-infrared measurements themselves, our OCO-2 simulation-based tests demonstrate that clear-sky retrievals are of similar or only slightly reduced quality compared to the full-physics retrieval, depending on filtration level.
For GOSAT measurements over land, the clear-sky retrieval has XCO2 RMSEs 0–20 % larger than the full-physics retrieval when the data sets are filtered to remove scenes contaminated by clouds and aerosols. Over ocean, the clear-sky retrieval has XCO2 RMSEs roughly 20–35 % larger than the full-physics retrieval. The source of this extra error in the clear-sky retrieval applied to real GOSAT measurements, especially over ocean, is unclear at this point and requires further study. Analysis of real OCO-2 measurements, which were unavailable at the time of this work, may help answer this question. An alternative way to view this result is that the full-physics cloud and aerosol parameterization benefits GOSAT measurements over both land and ocean, but the improvement is more substantial over ocean.
For synthetic OCO-2 measurements and real GOSAT measurements over both land and ocean surfaces, the clear-sky retrieval has XCO2 RMSEs less than 2.0 ppm when the data set is sufficiently filtered. In addition, clear-sky retrievals can be 1–2 orders of magnitude faster than full-physics retrievals, as scattering by clouds and aerosols can be completely ignored. For proposed sensors that collect an enormous volume of data, such as GeoCarb (Polonsky et al., 2014) and CarbonSat (Bovensmann et al., 2010), this could allow for significantly more data to be processed. Additionally, estimates of parameters from a clear-sky retrieval, such as surface albedo, could serve as a useful first guess for the full-physics retrieval.