The Orbiting Carbon Observatory-2: first 18 months of science data products

A. Eldering et al. Published by Copernicus Publications on behalf of the European Geosciences Union.

. The Orbiting Carbon Observatory-2 (OCO-2) is


Introduction
Human activities including fossil fuel combustion, cement production, and deforestation are now adding almost 40 billion tons of carbon dioxide (CO2) to the atmosphere each year (see Le Quéré et al., 2015). If all of this CO2 remained in the atmosphere, the atmospheric CO2 concentration would increase by more than 1 % per year. Interestingly, precise measurements collected by a growing global network of greenhouse gas monitoring stations over the past 60 years indicate that less than half of this CO2 remains airborne (Dlugokencky and Tans, 2015). The rest is being absorbed by the oceans and the land biosphere. Measurements of the partial pressure of CO2 in seawater collected over this period indicate that almost a quarter of the CO2 emitted by human activities is being absorbed by the ocean (see Takahashi et al., 2009), where it contributes to ocean acidification. For mass balance reasons, another 10 billion tons of CO2 must be absorbed by processes on land, the identity and location of which are less well understood. Some studies have attributed this absorption to tropical (Schimel et al., 2015) or Eurasian temperate forests, while others indicate that these areas are just as likely to be net sources as net sinks of CO2 (Chevallier et al., 2014). The efficiency of these natural land and ocean sinks also appears to vary dramatically from year to year (Le Quéré et al., 2015). In some years, they absorb CO2 equivalent to almost all of that emitted by human activities, while in other years they absorb very little. Because the identity, location, and processes controlling these natural sinks are not well constrained, it is not clear whether they will continue to reduce the rate of atmospheric CO2 buildup by half in the future (Schimel et al., 2015). This introduces a major source of uncertainty in predictions of the rate of future CO2 increases and their effect on the climate (Friedlingstein et al., 2006; Arora et al., 2013).
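The statement that retaining all emitted CO2 would raise the concentration by more than 1 % per year can be checked with a short calculation. The sketch below is illustrative only: the conversion of about 2.13 Gt of carbon per ppm of atmospheric CO2 is a commonly used approximation that is not taken from this paper, and the 400 ppm background is the round value quoted later in the Introduction.

```python
# Rough check of the "more than 1 % per year" statement.
# Assumed constants (not taken from the paper):
GTC_PER_PPM = 2.13            # Gt of carbon per 1 ppm of atmospheric CO2
CO2_PER_C = 44.01 / 12.011    # mass ratio of CO2 to C

EMISSIONS_GT_CO2 = 40.0       # "almost 40 billion tons" per year (from the text)
BACKGROUND_PPM = 400.0        # round background value quoted in the text


def ppm_from_gt_co2(gt_co2: float) -> float:
    """Convert a mass of CO2 (Gt) to an equivalent mole-fraction change (ppm)."""
    return gt_co2 / (GTC_PER_PPM * CO2_PER_C)


rise_ppm = ppm_from_gt_co2(EMISSIONS_GT_CO2)
rise_percent = 100.0 * rise_ppm / BACKGROUND_PPM
print(f"{rise_ppm:.1f} ppm per year, {rise_percent:.2f} % of background")
```

Under these assumptions the result is roughly 5 ppm per year, so an observed growth rate near half that value is consistent with the statement that less than half of the emitted CO2 remains airborne.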
Measurements from the network of ground-based greenhouse gas stations accurately track the global atmospheric CO2 budget and its trends. Remote sensing of the column-averaged CO2 dry air mole fraction (XCO2) from space is intended to provide finer spatial coverage, enabling smaller-scale sources emitting CO2 into the atmosphere and natural sinks absorbing this gas at the Earth's surface to be better quantified. Surface-weighted XCO2 estimates can be retrieved from high-resolution spectroscopic observations of reflected sunlight in near-infrared CO2 and O2 bands (see Rayner and O'Brien, 2001; Crisp et al., 2004; Buchwitz et al., 2006; O'Dell et al., 2012). This is a challenging space-based remote sensing observation because even the largest regional CO2 sources and sinks produce changes in the background XCO2 distribution no larger than 2 %, and most are smaller than 0.25 % (1 part per million (ppm) out of the background 400 ppm) (see Miller et al., 2007).
The European Space Agency (ESA) EnviSat SCanning Imaging Absorption SpectroMeter for Atmospheric CHartographY (SCIAMACHY) (Burrows et al., 1995) and Japanese Greenhouse Gases Observing Satellite (GOSAT) thermal and near-infrared sensor for carbon observation Fourier transform spectrometer (TANSO-FTS) (Nakajima et al., 2010) were the first satellite instruments designed to exploit this measurement approach. SCIAMACHY enabled retrieval of column-averaged CO2 and methane (XCH4) measurements over the sunlit hemisphere from 2002 to 2012. Spectra from TANSO-FTS have been used to produce XCO2 and XCH4 observations since April 2009. These data have provided an important proof of concept and are beginning to yield new insights into the carbon cycle (Feng et al., 2016; Guerlet et al., 2013; Wunch et al., 2013; Schneising et al., 2014), but improvements in sensitivity, resolution, and coverage are still needed.
The Orbiting Carbon Observatory-2 (OCO-2) is the first NASA satellite designed to measure atmospheric CO2 columns with the accuracy, resolution, and coverage needed to detect CO2 sources and sinks on regional scales over the globe. OCO-2 is a replacement for the Orbiting Carbon Observatory (Crisp et al., 2004), which was lost in 2009 when its launch vehicle malfunctioned and failed to reach orbit. OCO-2 was successfully launched from Vandenberg Air Force Base in California on 2 July 2014. Since 6 September 2014, this instrument has been routinely returning almost 1 million soundings each day over the sunlit hemisphere. Optically thick clouds and aerosols preclude observations of the full atmospheric column, but 7 to 12 % of these soundings are sufficiently cloud free to yield full-column estimates of XCO2 with single-sounding random errors between 0.5 and 1 ppm at solar zenith angles as large as 70°. Here we provide a brief introduction to the instrument and the mission operations to date, highlighting the global coverage, resolution, and precision of the dataset. We describe the overall flow of data in Sect. 4 and some key results in terms of data quantity, quality, and features, with discussions of XCO2 (Sect. 4.3.1), data quality indicators (Sect. 4.3.3 and 4.3.4), and overall data density (Sect. 4.3.5). The trends in XCO2 in space and time as seen from OCO-2 are discussed in Sect. 5. This paper is one of a number of papers describing the OCO-2 mission and its early results. On-orbit calibration and validation of the level 1 radiances are described in Crisp et al. (2017a, b). Details of the XCO2 retrieval algorithm, including filtering and bias correction, are given in O'Dell et al. (2017), while the validation of XCO2 via comparisons to the Total Carbon Column Observing Network (TCCON) is given in Wunch et al. (2016). Finally, analysis of the solar-induced fluorescence (SIF) product derived from OCO-2's oxygen A-band (ABO2) is described in Sun et al. (2017). Interested readers are advised to consult these references for details.

The instrument
The OCO-2 instrument is a three-band spectrometer that measures reflected sunlight. The oxygen A-band channel (ABO2) measures absorption by molecular oxygen near 0.76 µm, while two carbon dioxide bands, labeled here as the weak and strong CO2 bands (WCO2 and SCO2 hereafter), are located near 1.6 and 2.0 µm, respectively. The instrument has 1016 spectral elements in each band, and 160 pixels are averaged in groups of ∼ 20 along the slit, creating eight spatial footprints. The instrument field of view creates footprints that are nominally 1.25 km in width, and the spacecraft motion spans ∼ 2.4 km of the ground in the 0.33 s of integration time. The spacecraft rotates along the orbit, maintaining a constant angle between the plane defined by the instrument, the point observed on the ground, and the sun. As a result, the footprint shapes change during the orbit, from very narrow and long near the Equator to smaller and smaller aspect ratios (approaching rectangular footprints) with increasing latitude (see details in Crisp et al., 2017b). The rate of data collection results in approximately 1 million sets of three-band measurements per day.
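The daily sounding count follows from the footprint and timing numbers given above. The sketch below is a back-of-the-envelope check, not the project's accounting: it assumes continuous framing at the quoted integration time and that science data are collected over roughly half of each orbit (the sunlit side). The sunlit fraction is an assumption introduced here.

```python
SECONDS_PER_DAY = 86_400
PIXELS_ALONG_SLIT = 160    # from the text
FOOTPRINTS = 8             # spatial footprints per frame (from the text)
INTEGRATION_S = 0.33       # integration time per frame (from the text)
SUNLIT_FRACTION = 0.5      # assumption: about half of each orbit is sunlit


def soundings_per_day(sunlit_fraction: float = SUNLIT_FRACTION) -> float:
    """Estimate the number of three-band soundings collected per day."""
    frames_per_day = SECONDS_PER_DAY / INTEGRATION_S
    return FOOTPRINTS * frames_per_day * sunlit_fraction


pixels_per_footprint = PIXELS_ALONG_SLIT // FOOTPRINTS  # 20 pixels averaged per footprint
daily_total = soundings_per_day()
print(f"{pixels_per_footprint} pixels per footprint, ~{daily_total / 1e6:.2f} million soundings per day")
```

With these inputs the estimate lands close to 1 million soundings per day, in line with the figure quoted in the text.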
The OCO-2 instrument collects data over very narrow spectral ranges, with a resolving power (λ/Δλ) of roughly 19 000 : 1 in each band that reveals the trace gas spectral absorption lines. The spectral ranges for the ABO2, WCO2, and SCO2 are 0.7576 to 0.7726, 1.5906 to 1.6218, and 2.0431 to 2.0834 µm, respectively. Details of the spectral and radiometric calibration of the instrument are reported in Lee et al. (2017) and Rosenberg et al. (2017), respectively. On-orbit instrument performance is described in detail in Crisp et al. (2017a). Coincident measurements from the three channels are combined into "soundings" that are analyzed with a "full-physics" retrieval algorithm to yield estimates of XCO2 and other geophysical quantities (see Boesch et al., 2006, 2011; O'Dell et al., 2012, 2017; Crisp et al., 2012).
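To give a feel for what a resolving power of about 19 000 means in each band, the following sketch converts it to a spectral resolution at the band center and compares that with the average spacing of the 1016 spectral elements per band mentioned in the instrument description. It assumes that the elements are spread uniformly over the quoted range, which is a simplification made here for illustration.

```python
RESOLVING_POWER = 19_000   # lambda / delta-lambda (from the text)
N_ELEMENTS = 1016          # spectral elements per band (from the text)
BANDS_UM = {               # spectral ranges in micrometres (from the text)
    "ABO2": (0.7576, 0.7726),
    "WCO2": (1.5906, 1.6218),
    "SCO2": (2.0431, 2.0834),
}


def band_summary(lo_um: float, hi_um: float) -> tuple[float, float, float]:
    """Return (band center in um, resolution in nm, mean sampling in nm)."""
    center_um = 0.5 * (lo_um + hi_um)
    resolution_nm = 1000.0 * center_um / RESOLVING_POWER
    sampling_nm = 1000.0 * (hi_um - lo_um) / N_ELEMENTS  # assumes uniform spacing
    return center_um, resolution_nm, sampling_nm


summaries = {band: band_summary(*limits) for band, limits in BANDS_UM.items()}
for band, (center, resolution, sampling) in summaries.items():
    print(f"{band}: center {center:.4f} um, resolution {resolution:.4f} nm, "
          f"sampling {sampling:.4f} nm")
```

Under these assumptions each resolution element is covered by between two and three spectral samples in every band.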

The observatory in space
The OCO-2 observatory was launched successfully from Vandenberg Air Force Base in California on 2 July 2014 at 02:56 am Pacific daylight time. During the 10 days following launch, the spacecraft team completed a functional check of both the observatory and the instrument. The observatory was then maneuvered into its position in the 705 km Afternoon Constellation, also called the A-train, arriving on 3 August 2014. A number of atmospheric remote-sensing satellites fly in coordination in this constellation, carrying instruments such as the Moderate Resolution Imaging Spectroradiometer (MODIS) and Cloud-Aerosol Lidar with Orthogonal Polarization (CALIOP), which can be used for cross comparisons of clouds and radiances. After achieving the operational orbit, the instrument and focal planes were brought to and stabilized at their operational temperatures. During the more extensive in-orbit checkout (IOC) of the instrument, measurements were collected to refine the geometric, radiometric, and spectral calibration. On 6 August 2014, the first spectral data were collected with the instrument at operating temperatures and processed with calibration parameters from pre-launch calibration experiments. As reported in Basilio et al. (2014), these data showed high resolution with high signal-to-noise characteristics similar to the pre-launch measurements. Another critical activity during the IOC was the collection of lunar measurements that were used, in combination with data from coastal crossings, to determine the alignment of the spectrometers and derive the updated pointing coefficients. Calibration data collected during IOC were used to update the instrument gain coefficients, the dark correction, and the map of bad pixels on the focal plane. This was completed on 5 September 2014. Data after that date are considered scientifically usable, as the instrument temperatures were stable and the key radiometric parameters were up to date. The OCO-2 mission formally ended the IOC period on 12 October 2014.
As of the summer of 2016, the instrument and spacecraft are performing extremely well, and data collection continues. Crisp et al. (2017a, b) provide details of data interruptions, which have been primarily driven by instrument operations.

The observing strategy
The observing strategy of the OCO-2 mission evolved over the first year. Initially, the strategy was to collect 16 days of nadir data, measured directly below the spacecraft, followed by 16 days of glint measurements, in which the instrument is pointed towards the glint spot to collect higher-signal ocean data. This strategy was updated over time, as illustrated in Fig. 1. The key changes were (1) the geometry of glint measurements, (2) changes to the frequency of alternating glint and nadir mode orbits, (3) changes to the geometry of nadir orbits, and (4) the specification of some orbit paths as perpetual glint measurements.
During early instrument checkout (7 August 2014), the nominal 16-day nadir-glint pattern was disrupted after very high signals were observed during glint measurements. For the safety of the instrument, the observing mode was shifted to nadir measurements while the cause was investigated. We concluded that an incident of glint measurements over very still water, which may have had a layer of highly reflective material on its surface, was the cause of the high signal measurements (see Crisp et al., 2017b, for more discussion) and that they posed no risk to the instrument, so glint data collection was restarted on 8 September 2014. In mid-September 2014 it was recognized that the measurements were consistent with a polarization sensitivity that was rotated by 90° from our expectations (again, see Crisp et al., 2017b). To improve the signal-to-noise ratio (SNR) of the glint mode observations, particularly near Brewster's angle, the spacecraft was yawed 30° during glint measurements after 26 October 2014. To provide a more uniform temporal distribution of glint measurements over ocean, an additional change was made to the data collection beginning 3 July 2015. The nadir and glint data collection was changed to an orbit-by-orbit interleaving (one orbit nadir, one orbit glint, and so on). Over a 32-day period, nadir and glint data are collected over the same set of locations as in the original 16-day alternating scheme, but the new approach does not have large time gaps in ocean data collection. In late October 2015, to reduce the temperature changes of the instrument when changing from glint to nadir, the nadir geometry was updated to collect data at the same 30° yaw used for glint data. This allows for the collection of three to five glint orbits in a row between nadir orbits. With this change, orbits that are solely over water, such as those over the Pacific and Atlantic, can be measured in glint at all times. This type of data collection was started on 12 November 2015, and it is expected that this approach will be used for the remainder of the mission. Figure 1 provides a calendar view of the observing strategy and data outages.

Overall data flow
The overall flow of the data pipeline is illustrated in Fig. 2. All data products except the so-called "Lite files" contain one granule of data, which is restricted to one mode (such as nadir, glint, target, or transition). A granule corresponds to a complete orbit of measurements except in the cases where the orbit includes a switch to target measurements. In these cases there are separate data product files for the target and the transition before and after the target. The data that are processed as they are collected are referred to as v7, or the forward processing stream. They use calibration coefficients that are predicted based on recent measurements. This dataset is created in the Science Data Operations System (SDOS) at JPL. The v7r refers to the retrospective data, or data processed with calibration coefficients based on measurements before, during, and after the measurement time period. This dataset is typically processed on supercomputer resources (NASA's Pleiades and cloud computing resources).
The raw (L1a) measurements are geolocated, and the calibration coefficients are applied to generate geolocated, calibrated radiances (L1B) as discussed in Crisp et al. (2017a). These data are then passed to the preprocessors, which are used to identify the scenes that are most likely to be cloud free and successful in generating converged retrievals. One preprocessor routine also provides estimates of SIF. The XCO2 retrievals are performed on a subset of data selected by the preprocessor outcomes. The v7 and v7r standard (L2Std) and diagnostic (L2Dia) products report these data, which include the XCO2 estimates. In a final step, a bias correction and data quality flag (warn level) are integrated, and each day of quality data is packaged into a single so-called "Lite file" (further details in Sect. 4.3 and in Mandrake et al., 2015). All L1B, L2, and Lite products are delivered to the NASA Goddard Earth Science Data and Information Services Center (GES DISC) for distribution and archiving (http://disc.sci.gsfc.nasa.gov/OCO-2). The L1 and L2 products are described in greater detail in the OCO-2 Data Product User's Guide and the L1B and L2 Algorithm Theoretical Basis Documents (ATBDs) and other documents, which are posted along with the products at the GES DISC (http://disc.sci.gsfc.nasa.gov/OCO-2/documentation/oco-2-v7; Crisp et al., 2015; Eldering et al., 2015; Mandrake et al., 2015).

Calibrated radiances
The level 1B (L1B) product consists of full orbits or fractions of orbits of calibrated and geolocated spectral radiances from the ABO2, WCO2, and SCO2 channels. The details of the transformation of raw measurements into calibrated spectral radiances are discussed in the L1B Algorithm Theoretical Basis Document. The pre-flight spectral and radiometric calibration is discussed in Lee et al. (2017) and Rosenberg et al. (2017). The in-flight performance is discussed in detail in Crisp et al. (2017a). The L2 data products are not impacted by the calibration issues discussed in Crisp et al. (2017a), with the exception of time-dependent radiometric correction factors that are now understood to be in error for the v7/v7r data, with an error that increases in time. This radiometric error has a magnitude of about 4 % by 18 months into the mission and is an error in the absolute radiometry, not a growing uncertainty on the radiances. Analysis of a set of test retrievals where this error was removed showed that an absolute radiance error of 4 % will impart an XCO2 error of 0.22, 0.12, and 0.4 ppm in nadir land, glint land, and glint water measurements, respectively. This error is not addressed in the analysis presented here, where data are used as provided in the v7/v7r files.

Preprocessors
For the v7 and v7r OCO-2 dataset, the A-band (ABP) and IMAP-DOAS (IDP) preprocessors (Frankenberg et al., 2014) were used for the selection of data to be processed to L2. To limit the demands on the computing system, no more than 6 % of data collected each day are processed to L2 in the v7 forward processing stream. The v7r processing stream includes all data that meet pre-processing criteria, which is on average 17.9 % for glint data and 6.6 % for nadir. Taylor et al. (2016) describe the preprocessor outcomes in detail. In summary, the ABP compares the measured radiance spectra with spectra calculated with a non-scattering forward model to test for the presence of clouds. The IDP also uses a non-scattering forward model, but it is applied to the WCO2 and SCO2 independently. Ratios of the single-band column retrievals are then analyzed to identify scenes that are impacted by clouds and aerosols. As reported in Taylor et al. (2016), the combined ABP and IDP OCO-2 preprocessors screen approximately 85-90 % of the co-located data that MODIS reports to be cloudy, with overall global agreement of ∼ 85 % between the two sensors. The regions of significant disagreement were found to be tropical and subtropical oceans and desert land. Comparisons to CALIOP measurements of the vertical distribution of cloud optical thickness confirmed the conclusion derived from simulations that the combined ABP and IDP preprocessors successfully identify high, optically thin clouds and midlevel clouds and aerosols but fail to identify contamination in about 25 % of the cases of low, optically thick clouds and aerosols. Additional pre-filters remove all land data south of 65° S and further limit the surface albedo in the ABO2 to less than 0.55 as a rough proxy for the presence of snow and ice on the ground, which can cause significant problems for the retrievals.
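The additional pre-filters described at the end of this paragraph amount to two simple threshold tests applied after cloud screening. The sketch below illustrates that logic only. The `Sounding` record, its field names, and the single `cloud_free` flag standing in for the combined ABP and IDP verdict are hypothetical, and applying the albedo ceiling to all surfaces (not just land) is an assumption made here.

```python
from dataclasses import dataclass

MAX_O2A_ALBEDO = 0.55        # ABO2 surface albedo ceiling (from the text)
SOUTHERN_LIMIT_DEG = -65.0   # land data south of 65 deg S are removed (from the text)


@dataclass
class Sounding:
    """Hypothetical per-sounding record; field names are illustrative only."""
    latitude_deg: float
    is_land: bool
    o2a_albedo: float
    cloud_free: bool  # stand-in for the combined ABP/IDP preprocessor outcome


def passes_prefilters(s: Sounding) -> bool:
    """Return True if the sounding would be passed on to the L2 retrieval."""
    if not s.cloud_free:
        return False
    if s.is_land and s.latitude_deg < SOUTHERN_LIMIT_DEG:
        return False
    # Bright ABO2 albedo is used as a rough proxy for snow or ice.
    return s.o2a_albedo < MAX_O2A_ALBEDO


candidates = [
    Sounding(latitude_deg=34.0, is_land=True, o2a_albedo=0.25, cloud_free=True),
    Sounding(latitude_deg=-70.0, is_land=True, o2a_albedo=0.30, cloud_free=True),
    Sounding(latitude_deg=60.0, is_land=True, o2a_albedo=0.80, cloud_free=True),
    Sounding(latitude_deg=10.0, is_land=False, o2a_albedo=0.05, cloud_free=False),
]
selected = [passes_prefilters(s) for s in candidates]
```

In this toy set only the first sounding survives: the second is too far south over land, the third is too bright in the ABO2, and the fourth is flagged as cloudy.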

Level 2 algorithm products
The OCO-2 project reports two key products at L2 (derived geophysical data at the spatial resolution of the measurement), the dry air mole fraction of carbon dioxide (XCO2) and SIF. As described in the preprocessor section, only a subset of data are considered to be sufficiently cloud- (and aerosol-) free (optical depths less than ∼ 0.35 as determined in the preprocessors) for the next step of processing in the L2 Full Physics algorithm, which produces the XCO2 data product. The SIF product is generated by the IDP preprocessors (Frankenberg et al., 2014). As described in Frankenberg et al. (2014), most of the fluorescence signal is retained, even through moderate clouds (optical depths up to 5). As a consequence, SIF results are reported for a much larger fraction of the OCO-2 observations compared to the XCO2 product.
The OCO-2 retrievals for XCO2 are created using the full physics algorithm that has been described previously. The retrieval algorithm is based on an optimal estimation scheme and an efficient radiative transfer technique that accounts for multiple scattering and polarization effects. A standard cost function is minimized to find the state vector that produces the maximum a posteriori probability. While the focus is the retrieval of XCO2, other parameters such as surface albedo, aerosols, temperature, water vapor, and wind speed (for water surfaces only) are co-retrieved. Prior to the launch of OCO-2, this algorithm was adapted for application to the GOSAT measurements, with these results reported in O'Dell et al. (2012) and Crisp et al. (2012), and for OCO-2 it remains largely unchanged from what was reported in those papers.
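For orientation, the cost function minimized in an optimal estimation retrieval is usually written in the textbook form below. This is the generic expression, not a transcription of the OCO-2 implementation. Here y is the measured radiance spectrum, F(x) the forward model evaluated at the state vector x, and x_a and S_a the a priori state and its covariance; the symbol S_ε for the measurement error covariance is introduced here as an assumption.

```latex
\chi^2(\mathbf{x}) =
  \left[\mathbf{y} - \mathbf{F}(\mathbf{x})\right]^{T}
  \mathbf{S}_{\epsilon}^{-1}
  \left[\mathbf{y} - \mathbf{F}(\mathbf{x})\right]
  + \left(\mathbf{x} - \mathbf{x}_a\right)^{T}
  \mathbf{S}_a^{-1}
  \left(\mathbf{x} - \mathbf{x}_a\right)
```

The first term penalizes misfit to the measured spectra and the second penalizes departure from the prior. For Gaussian errors, the state that minimizes the sum is the maximum a posteriori solution.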
The XCO2 data are reported in the L2_Standard files and the L2_Diagnostic files, where the diagnostic files contain additional information that may be useful for detailed assessment of the algorithm and for the modeling community. Examples of the additional information are the averaging kernels and the a posteriori covariance matrix, Ŝ. In v7, the L2 Standard and Diagnostic files, containing about 60 000 soundings per file, do not contain warn level values, which indicate data quality (Mandrake et al., 2013), nor has a bias correction been applied. This information is calculated subsequently and included in the Lite files described below.
A summary daily data product, referred to as the Lite files, is created to simplify data volumes and data structures. Separate files for XCO2 (Mandrake et al., 2015) and for the SIF product (Frankenberg, 2015) each contain 1 day of data per file. For XCO2, a bias correction is applied and warn levels are assigned, with all converged soundings included in the file.

L2 XCO2 results
The XCO2 data record from OCO-2 now extends more than 18 months, and Figs. 3, 4, and 5 show maps of these XCO2 measurements. These maps illustrate averages over month-long periods, so there are nadir and glint data in each panel. The data included in these maps and all that follow have been screened and have had the bias correction applied (v7rB Lite file data with the 0/1 data quality flag applied; see Mandrake et al., 2015). These two processes will be discussed in more detail in Sect. 4.3.4 and 4.3.6. As expected, these maps show the large annual changes in XCO2. CO2 builds up over the Northern Hemisphere during winter and then is rapidly removed from the atmosphere as spring arrives and the terrestrial ecosystem activity increases rapidly. This is most apparent in the month of June, when the decrease of XCO2 over northern Asia is of order 10 ppm. The overall gradients of a few ppm from north to south are apparent in the data, as well as the secular increase in CO2 from October 2014 to March 2016. Other features are apparent in the data maps, such as the higher CO2 concentrations over the eastern US and China between October and December (see Figs. 3 and 5), when the overall global XCO2 gradient is small. Enhanced XCO2 coin-

The latitudinal coverage of the v7r dataset is also apparent from these maps. Data selection for processing through L2 relies on screening from the preprocessor results, as well as limitations on geographical extent. Analysis of the preprocessor data shows that a large fraction of these higher latitude data are marked as cloudy, which is in agreement with the MODIS cloud fields. The current data selection does not select data south of 65° S, as experience with ACOS data showed that retrievals over ice failed routinely. We intend to retrieve the small number of cloud-free scenes over bare ground at these latitudes in the next version of the retrieval. Due to clouds, solar illumination, and geometry, any given month has data that span about 100° in latitude, but the coverage band shifts north and south with the seasons.

Signal-to-noise ratios
The OCO-2 instrument was designed to provide adequate continuum SNR to achieve 0.3 % precision for XCO2 measurements. The SNR design requirements were 290, 270, and 190 at nominal radiance levels (5.8, 2.1, and 1.1 × 10^19 photons m^-2 sr^-1 µm^-1 s^-1) in the ABO2, WCO2, and SCO2, respectively. The in-flight performance has met or exceeded all expectations, with SNR values as provided in the data product (radiance mean value in the continuum divided by the radiance noise value in the continuum) typically between 250 and 450 for the ABO2, 400 and 800 for the WCO2, and 200 and 500 for the SCO2. Figure 6 illustrates just 1 month of SNR levels, as no large seasonal dependence is observed. There are spatial patterns, with high SNR values over the bright deserts and in cloudy regions. The lowest SNR values are over oceans, especially when observed at higher solar zenith angles, particularly for the ABO2.
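The continuum SNR quoted from the data product is defined as a ratio of two continuum quantities, which can be sketched in a few lines. The function names and the use of a simple arithmetic mean for both the radiance and the noise are assumptions made for illustration. The requirement values are those listed above.

```python
from statistics import fmean

# Design requirements at nominal radiance levels (from the text).
SNR_REQUIREMENT = {"ABO2": 290.0, "WCO2": 270.0, "SCO2": 190.0}


def continuum_snr(radiance: list[float], noise: list[float]) -> float:
    """Mean continuum radiance divided by mean continuum noise (same units)."""
    if not radiance or len(radiance) != len(noise):
        raise ValueError("radiance and noise must be non-empty and equal in length")
    return fmean(radiance) / fmean(noise)


def meets_requirement(band: str, snr: float) -> bool:
    """Compare a continuum SNR against the design requirement for that band."""
    return snr >= SNR_REQUIREMENT[band]


# Made-up continuum samples in photons m^-2 sr^-1 um^-1 s^-1, for illustration only.
example_snr = continuum_snr([5.8e19, 5.9e19, 5.7e19], [1.5e17, 1.5e17, 1.5e17])
example_ok = meets_requirement("ABO2", example_snr)
```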

χ² goodness of fit parameter
The reduced χ² goodness of fit parameter is a convenient measure of the magnitude of the spectral residuals relative to the measurement error. The equation for the per-band value (χ²_i) is given in Eq. (1), where i is the band index (ABO2, WCO2, SCO2), y is the measured radiance spectrum, ε is the error on the measured radiance spectrum, and F(x) is the forward model with the state vector x. The summation is over the n valid spectral points. As discussed in Crisp et al. (2015), the persistent spectral residuals caused by limitations in the spectroscopic input data and instrumental effects are removed by fitting to empirically derived spectral vectors. This approach systematically reduces χ² and also reduces the dependence of χ² on the SNR.
For OCO-2, we have seen that there is little seasonal dependence, but there are clear spatial patterns, as illustrated in Fig. 7. In the ABO2, prominent features occur in the region of the South Atlantic Anomaly (SAA) (Crisp et al., 2017a). The effects of this region of a high density of high-energy particles are seen as radiance spikes in the ABO2 measurements. We attempt to screen out the effects, but the fitting is still poor in this region. For the WCO2 and SCO2, the bright desert of the Sahara results in larger χ² values, and mountainous regions impact the strong CO2 band fits.
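Written out from the definitions above, the per-band statistic can be taken as the mean squared residual in units of the measurement error, chi2_i = (1/n) * sum_j ((y_j - F_j(x)) / eps_j)^2, with the sum running over the n valid spectral points of band i. The 1/n normalization and the sample index j are assumptions consistent with the description, not a quotation of Eq. (1). A minimal sketch of the calculation:

```python
def reduced_chi2(measured: list[float], modeled: list[float], sigma: list[float]) -> float:
    """Mean of squared, error-normalized residuals over the valid spectral points.

    measured: radiances y_j for one band
    modeled:  forward-model radiances F_j(x) at the retrieved state
    sigma:    radiance errors eps_j (same units as the radiances)
    """
    if not (len(measured) == len(modeled) == len(sigma)):
        raise ValueError("all inputs must have the same length")
    n = len(measured)
    if n == 0:
        raise ValueError("at least one valid spectral point is required")
    return sum(((y - f) / e) ** 2 for y, f, e in zip(measured, modeled, sigma)) / n


# Toy example: residuals of -1, +1, and -2 sigma give a value of 2.
example_chi2 = reduced_chi2([1.0, 2.0, 3.0], [1.1, 1.9, 3.2], [0.1, 0.1, 0.1])
```

A value near 1 indicates residuals comparable to the stated measurement error; larger values indicate that the fit does not reproduce the spectrum to within the noise.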

Warn levels
The data presented in this paper have data quality screening applied. For the OCO-2 dataset, we have developed warn levels (Mandrake et al., 2013). The concept behind the warn levels is that the data are ordered by quality as defined by a number of data variance metrics, allowing the user to make decisions concerning the trade-off between data volume and data quality. This is a more flexible approach than the traditional good or bad quality assignment, and it reflects the fact that data quality is a continuum, not a binary quantity, and should be indicated as such. The OCO-2 warn levels range from 0 to 19, with 0 indicating the highest quality and 19 the lowest quality. More details of the process used to develop warn levels are reported in Mandrake et al. (2013) as well as the OCO-2 Lite file documentation (Mandrake et al., 2015). Our recommendation is that users should not use data above a warn level of 15 for all land data or above 18 for water glint. This removes approximately 25 % of the land data and 10 % of the water glint data.
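The recommended thresholds translate directly into a simple filter. The sketch below is illustrative: the function name and the boolean land/water argument are hypothetical and do not correspond to variables in the Lite files.

```python
LAND_MAX_WARN = 15          # recommended ceiling for all land data (from the text)
WATER_GLINT_MAX_WARN = 18   # recommended ceiling for water glint data (from the text)


def keep_sounding(warn_level: int, is_land: bool) -> bool:
    """Apply the recommended warn-level ceilings (0 is best, 19 is worst)."""
    if not 0 <= warn_level <= 19:
        raise ValueError("warn levels range from 0 to 19")
    limit = LAND_MAX_WARN if is_land else WATER_GLINT_MAX_WARN
    return warn_level <= limit


# A warn level of 16 is rejected over land but accepted over water glint.
land_16 = keep_sounding(16, is_land=True)
water_16 = keep_sounding(16, is_land=False)
```

Users who need more data or tighter quality control can move the two ceilings, which is the flexibility the warn-level approach is meant to provide.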
For the v7r data, outliers were screened with a set of additional flags, related to the cloud preprocessors, aerosol optical depths, surface characteristics, etc. The detailed flagging parameters and thresholds are provided in the Lite file user's guide. The warn level thresholds and outlier screening are combined in the 0/1 flag that is included in the Lite file, to be compatible with the European Greenhouse Gas Climate Change Initiative (GHG-CCI) data product specifications (Buchwitz et al., 2015). We have used this screening for the maps shown in this paper, but we strongly encourage users to carefully evaluate the warn levels that are appropriate for their science analysis.

Data density after quality screening
The data density after quality screening for a few select months is illustrated in Fig. 8. The monthly total data density ranges from 1.3 million to 2.4 million soundings per month selected by the xco2_quality_flag in the Lite file for periods without decontamination cycles, influenced by the mixture of nadir and glint measurements, as well as clouds and season. For individual 2° by 2° regions, the number of soundings in a month ranges from a few to over a thousand. There is a roughly inverse relationship between the sounding count and the number of cells with that count: for example, on a monthly basis, about 100 of the 2° by 2° cells have 100 soundings, and 10 have 1000 soundings. The preprocessors, as described in Taylor et al. (2016), limit the data that are put through L2 processing, and then processing failures and data screening further trim the dataset. Nevertheless, there is a large volume of high-quality data available from OCO-2. The highest densities of data are over desert areas, although midlatitude data density is high during some seasons. As reported in Taylor et al. (2016), the prescreening and resulting data density are consistent with MODIS cloud statistics.
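Counting soundings per 2° by 2° cell, as done for the density maps, can be sketched as follows. The indexing convention (rows from 90° S, columns from 180° W) and the function names are choices made here for illustration, not the project's gridding code.

```python
from collections import Counter

CELL_DEG = 2.0
N_ROWS = int(180 / CELL_DEG)   # 90 latitude bands


def cell_index(lat_deg: float, lon_deg: float) -> tuple[int, int]:
    """Map a latitude/longitude pair to a (row, column) index on a 2-degree grid."""
    row = min(int((lat_deg + 90.0) // CELL_DEG), N_ROWS - 1)   # keep 90 N in the top row
    col = int(((lon_deg + 180.0) % 360.0) // CELL_DEG)         # wrap 180 E onto 180 W
    return row, col


def count_per_cell(soundings: list[tuple[float, float]]) -> Counter:
    """Count quality-screened soundings, given as (lat, lon) pairs, in each cell."""
    return Counter(cell_index(lat, lon) for lat, lon in soundings)


# Three made-up soundings: the first two share a cell, the third is one row north.
example_counts = count_per_cell([(0.5, 0.5), (1.5, 1.9), (2.1, 0.5)])
```

A histogram of the resulting counts is what reveals the roughly inverse relationship described above between the number of soundings and the number of cells holding that many.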
The cloudy region of the Intertropical Convergence Zone (ITCZ) has lower data density, as does northern South America. This region is impacted by clouds as well as the SAA, where cosmic ray events impact OCO-2 measurements. For the v7/v7r data, the preprocessors do not account for the SAA impacts, and thus a significant fraction of data are screened out. In the next version, the preprocessors will have SAA treatment integrated, and we expect that the data yield will increase in this region.

Bias correction
The bias correction described in O'Dell et al. (2017) and the OCO-2 documentation (Mandrake et al., 2015) was applied to the XCO2 data shown in Figs. 3, 4, and 5. The monthly mean bias corrections for 3 sample months are shown in Fig. 9. The bias correction seeks to remove systematic footprint-to-footprint differences, mode-to-mode differences (for example, systematic differences between land glint and land nadir measurements), and systematic differences that appear to be correlated to other retrieval variables. Two predictive variables are currently used in the bias correction for land retrievals, and three are used for ocean retrievals. In addition, the bias correction process puts the OCO-2 data on the same scale as the TCCON ground-based measurements, which are tied to the WMO scale for carbon dioxide (Wunch et al., 2010, 2011). The OCO-2 mission development included a validation plan which recognized the need for the TCCON and a special data collection mode to gather adequate validation data. A detailed discussion of the ground-based data and the OCO-2 data that are collected in target mode at these locations can be found in Wunch et al. (2016). Details of the derivation of the bias correction and its relationship to other variables can be found in O'Dell et al. (2017) and Mandrake et al. (2015). The monthly distributions of the bias correction values are well described by Gaussian distributions. Overall, for the water glint observations on monthly scales, the mean of the distribution is 0.0 to 0.4 ppm, with a standard deviation of about 0.55 ppm. For land glint observations, the mean is larger, 0.9 to 1.1 ppm, and the standard deviation is typically 1.2 ppm. The land nadir distribution has a similar standard deviation, about 1.2 ppm, with a mean of 1.3 to 1.8 ppm. The patterns strongly follow latitudinal gradients, likely driven by viewing geometry, with aerosol and cloud scattering becoming more important as the instrument views through longer paths of the atmosphere. The bias correction is described in more detail in O'Dell et al. (2017).

Figure 9. Maps of the bias correction applied to the XCO2 data. Statistics are provided for 2° by 2° bins for data selected with the data quality flag.

Uncertainty on XCO2 product
The OCO-2 data products include an estimate of the uncertainty on the XCO2 data. As discussed by Connor et al. (2008, 2016), this estimate is a lower bound, as it includes only error related to the noise on the radiance measurement, the smoothing error, and interference error. Propagation of systematic errors in input terms for the forward model to the XCO2 estimate is not considered in the error estimate reported in the v7/v7r L2 products. Figure 10 is a set of maps of the average XCO2 uncertainty from the data product for a 6-month period. This shows that the estimated uncertainty is generally smaller over water than over the land surface and that the uncertainty is larger at the extreme latitudes, where interference errors grow. Worden et al. (2017) have made a careful assessment of the OCO-2 uncertainty estimates by evaluating the standard deviation of the difference from the mean XCO2 for collections of soundings within 100 km in latitude. They compare this to the expected standard deviation due to noise. This research showed that, while linearly correlated, the calculated XCO2 measurement error in the data product appears to underestimate the empirically derived XCO2 measurement error by a factor of approximately 2, with a larger underestimate for land data and a smaller underestimate for water glint measurements.

Figure 10. Maps of the average XCO2 uncertainty in the OCO-2 data product. Statistics are provided for 2° by 2° bins for data selected with the data quality flag.
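The kind of comparison attributed to Worden et al. (2017) can be illustrated with a simplified sketch: take a small neighborhood of soundings, compute the scatter of XCO2 about the neighborhood mean, and divide by the scatter expected from the reported uncertainties. Using the sample standard deviation for the empirical scatter and the root mean square of the reported values for the expected scatter are assumptions made here. The published analysis may differ in detail, and the numbers below are made up.

```python
from math import sqrt
from statistics import fmean, stdev


def uncertainty_ratio(xco2_ppm: list[float], reported_sigma_ppm: list[float]) -> float:
    """Empirical scatter of a neighborhood divided by its reported uncertainty."""
    if len(xco2_ppm) < 2 or len(xco2_ppm) != len(reported_sigma_ppm):
        raise ValueError("need at least two soundings and matching uncertainties")
    empirical = stdev(xco2_ppm)                                   # scatter about the mean
    expected = sqrt(fmean(s * s for s in reported_sigma_ppm))     # RMS reported error
    return empirical / expected


# Four made-up soundings from one small neighborhood.
example_ratio = uncertainty_ratio([399.0, 401.0, 400.0, 400.0], [0.4, 0.4, 0.4, 0.4])
```

A ratio near 2, as in this toy case, corresponds to the reported uncertainty underestimating the empirical scatter by about a factor of 2. The ratio is only meaningful where real XCO2 variability within the neighborhood is small.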
In the optimal estimation retrieval, algorithm input choices such as the a priori mean state vector (x a ) and a priori covariance (S a ), or constraint, can impact the variability in the retrieval error in X CO 2 . The a posteriori covariance matrix (Ŝ) is also an important output of the L2 retrieval process, as it is critical for the data assimilation process used to determine CO 2 fluxes. The OCO-2 project is in the midst of an evaluation of this quantity and the accuracy of the algorithm's reported uncertainty as a measure of the error variability, through the use of large-scale simulations. By running simplified retrievals over large ensembles of input variables (priors, constraints, and other parameters), one can assess the characteristics of the retrieval bias and variance and evaluate what is reported in the data product (Hobbs et al., 2017). The choice of prior becomes particularly impactful for moderate to large aerosol optical depths (0.1 or more).
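For orientation, the textbook linear form of the optimal estimation posterior covariance is given below. It is stated as a general reference and not as the exact expression implemented in the OCO-2 L2 algorithm. The symbols K (Jacobian of the forward model with respect to the state vector), S_ε (measurement noise covariance), and h (pressure weighting function, zero for non-CO 2 state elements) are introduced here and do not appear in the text above.

```latex
\hat{S} = \left( K^{T} S_{\epsilon}^{-1} K + S_{a}^{-1} \right)^{-1},
\qquad
\sigma_{X_{\mathrm{CO_2}}}^{2} = h^{T} \hat{S} \, h
```

In this form the role of the prior constraint is explicit: a tighter S_a (larger S_a^{-1}) reduces the reported posterior variance regardless of whether the prior mean is accurate, which is one reason the reported uncertainty needs independent evaluation.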
There are many other variables that are co-retrieved with the X CO 2 , including surface pressure, aerosol optical depth, surface albedo, water profile scaling factor, and an offset of the temperature profile. The aerosol optical depths are being compared against independent measurements, such as AERONET optical depths, while an analysis of the retrieved water vapor profiles against SuomiNet and the Advanced Microwave Scanning Radiometer-2 (AMSR-2) is also being conducted (Nelson et al., 2016). As discussed in detail in O'Dell et al. (2017), many of these parameters will compensate for one another in the retrieval algorithm, so they must be considered "effective quantities" (e.g., "effective albedo" and "effective optical depth"): they are the values that minimize the fit residual in an optimal estimation scheme, but they are at times not directly related to the physical quantity (Kulawik et al., 2006; Eldering et al., 2008). The performance and relationships of these parameters are discussed at length in O'Dell et al. (2017).

Solar-induced fluorescence
Using GOSAT and GOME-2 spectra, Frankenberg et al. (2014), Frankenberg (2015), and Joiner et al. (2011) demonstrated that solar-induced fluorescence (SIF) of chlorophyll can be quantified from the observed fractional depths of Fraunhofer lines. Frankenberg et al. (2014) performed a preflight assessment of the fluorescence measurement performance of OCO-2. This measurement approach is being applied to the OCO-2 data, motivated in part because neglect of this phenomenon results in errors in surface pressure and aerosol optical depth, which propagate into a small bias in the X CO 2 retrieval.
The IDP preprocessor performs the SIF retrieval, along with single band retrievals of the water and CO 2 columns that are used for cloud screening purposes. As described in Frankenberg et al. (2014) the SIF retrieval is impacted less strongly by clouds than the X CO 2 retrieval, so useful data are collected over a much larger number of soundings. However, high single-measurement precision errors warrant aggregation in space and/or time for scientific use. The SIF product is derived at two wavelengths, 757 and 771 nm, and it is recommended that the user examine both fields independently, as this first dataset (v7r) may have different errors in each product. Figure 11 illustrates a year of SIF retrievals, where data have been averaged across seasons. These show expected features, such as the high SIF values in the regions of intense agriculture during early summer, and the low SIF in the Northern Hemisphere during its winter. The SIF signal in the tropics has some seasonality to it, but it is always larger than 0.5 W m −2 µm −1 sr −1 .
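The benefit of aggregating SIF soundings can be illustrated with a short sketch. It assumes independent, zero-mean random errors on each sounding, which is an idealization; the function name `aggregate_sif` and its arguments are hypothetical and not part of any OCO-2 product or tool.

```python
import math


def aggregate_sif(values, sigmas):
    """Unweighted mean of SIF soundings and its standard error.

    values: single-sounding SIF retrievals (W m-2 um-1 sr-1).
    sigmas: single-sounding 1-sigma precision estimates, same units.
    ASSUMPTION: errors are independent between soundings.
    """
    n = len(values)
    if n == 0 or n != len(sigmas):
        raise ValueError("need one or more soundings and matching lengths")
    mean_sif = sum(values) / n
    # Variance of an unweighted mean of independent terms: sum(sigma^2) / n^2
    std_err = math.sqrt(sum(s * s for s in sigmas)) / n
    return mean_sif, std_err
```

With equal single-sounding precision, the standard error falls as one over the square root of the number of soundings, so averaging 100 soundings reduces the random error tenfold under these assumptions. Systematic errors are not reduced by averaging.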
Campaigns are underway to compare OCO-2 measurements to data at flux towers and to underfly the OCO-2 measurements with an aircraft-mounted grating spectrometer. Details of these intercomparisons are in Sun et al. (2017). The objective of those studies is to quantify the relationship of OCO-2 derived SIF with independent measurements.

Gradients and trends in observed X CO 2

Growth rate of X CO 2
The dense, global dataset from OCO-2 can be used to assess the annual growth rate of X CO 2 . Figure 12 shows the annual zonal growth rates derived from OCO-2 for five different 12-month periods. The growth rate as determined from the NOAA ESRL station at Mauna Loa is shown for comparison. The growth rates are generally between 2.5 and 3 ppm per 12 months from 2014 to 2015, which includes the largest growth rate ever recorded at the Mauna Loa Observatory. More detailed analysis of the growth rate, such as that presented for GOSAT data in Kulawik et al. (2016) and Lindqvist et al. (2015), is required to quantitatively assess the growth rate from OCO-2, but this first look shows that OCO-2 yields a reasonable range of values. The figure also illustrates the longitudinal standard deviation of the OCO-2 data for each latitude band. Note that the Mauna Loa Observatory is a background site, whereas the OCO-2 measurements span both background sites and populated regions. This variability may drive the standard deviation, although OCO-2 glint retrievals over water tend to have lower variability than OCO-2 land retrievals, which could also explain the standard deviation. The relative sampling of regions of emissions and uptake differs in time with OCO-2, which will result in a different 12-month growth rate than that derived from the NOAA ESRL station.
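A 12-month growth rate of the kind discussed here can be sketched as a simple year-over-year difference. This is an assumed, minimal formulation for one latitude band, not the calculation behind Fig. 12; the function name `annual_growth_rate` and the dictionary layout are hypothetical.

```python
def annual_growth_rate(monthly_means):
    """Year-over-year change in monthly mean XCO2 for one latitude band.

    monthly_means: dict mapping (year, month) -> mean XCO2 in ppm.
    Returns dict mapping (year, month) -> ppm per 12 months, defined as the
    difference from the same calendar month one year earlier. Months with
    no counterpart one year earlier are omitted.
    """
    rates = {}
    for (year, month), value in monthly_means.items():
        previous = monthly_means.get((year - 1, month))
        if previous is not None:
            rates[(year, month)] = value - previous
    return rates
```

Differencing the same calendar month removes most of the seasonal cycle, but it does not account for the sampling differences between the two periods noted above.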

Seasonal cycle of X CO 2 near Hawaii
A time series of weekly average X CO 2 from OCO-2 for a region around Hawaii is shown in Fig. 13. For this analysis, we have selected glint data over water only, applied the quality flag, and calculated the mean and standard deviation over a region that spans from 175 to 130° W in longitude and from 15 to 25° N in latitude. The time series clearly shows weekly and monthly changes as observed by OCO-2. The standard deviation of the weekly averaged data ranges from 0.5 to 0.8 ppm, smaller than the seasonal changes and at times the monthly changes. There are 2000 to 20 000 measurements averaged per week for the OCO-2 data. The time series shows little growth between January and February 2015 and in early 2016. The minimum of the year occurs in August and September, similar to the timing of the minimum in surface measurements. Now that OCO-2 has a full 2-year record, seasonal cycle analysis such as that in Lindqvist et al. (2015) can be conducted with OCO-2 data.  provide time series at all of the TCCON locations, with all OCO-2 measurement mode data (nadir, glint, target).

Figure 13. Time series of weekly average OCO-2 X CO 2 measurements near Hawaii. Glint water measurements selected with the data quality flag from the Lite files.
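The selection and weekly averaging described for the Hawaii region can be sketched as follows. The region bounds come from the text; everything else is an assumption for illustration, including the tuple layout, the use of ISO calendar weeks, the population standard deviation, and the convention that a quality flag of 0 marks a good sounding.

```python
from collections import defaultdict
from datetime import date
from statistics import mean, pstdev

LON_MIN, LON_MAX = -175.0, -130.0  # 175 W to 130 W
LAT_MIN, LAT_MAX = 15.0, 25.0      # 15 N to 25 N


def weekly_series(soundings):
    """Weekly mean, standard deviation, and count of XCO2 near Hawaii.

    soundings: iterable of
        (day: date, lat, lon, xco2_ppm, quality_flag, is_water_glint)
    quality_flag == 0 is ASSUMED to mean "good" (hypothetical layout).
    Returns {(iso_year, iso_week): (mean_ppm, std_ppm, count)}.
    """
    weeks = defaultdict(list)
    for day, lat, lon, xco2, flag, is_water_glint in soundings:
        if flag != 0 or not is_water_glint:
            continue
        if not (LAT_MIN <= lat <= LAT_MAX and LON_MIN <= lon <= LON_MAX):
            continue
        iso = day.isocalendar()
        weeks[(iso[0], iso[1])].append(xco2)
    return {wk: (mean(v), pstdev(v), len(v)) for wk, v in sorted(weeks.items())}
```

Longitudes are taken as degrees east in the range -180 to 180, so the western-hemisphere box maps to negative values.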

Assessment of overall data quality
The OCO-2 mission has been successful in collecting over a million measurements of radiance spectra in the ABO2, WCO2, and SCO2 bands each day. After screening for clouds and applying post-retrieval quality flags, OCO-2 typically delivers 100 000 global measurements of CO 2 per day. Detailed comparisons have been made against the TCCON, and the OCO-2 measurements agree within 1 ppm in most cases (see Wunch et al., 2016).
There are regions of the world that have consistently high data yields, such as desert regions and the oceans to the north and south of the cloudy ITCZ. Regions of persistently low data yield include the region over South America that is impacted by the SAA, ocean regions of the ITCZ, and regions where the solar zenith angles are large (especially northern latitudes in NH winter and southern latitudes during SH winter).
The dataset is consistent in time, showing stability in diagnostic parameters such as the measurement SNR and retrieval χ 2 as well as the overall data density. Not surprisingly, there are some data features that are inconsistent with the validation dataset and different from model predictions. The largest feature is a high bias in X CO 2 over water for southern latitudes during the Southern Hemisphere winter. This issue is apparent in the TCCON comparisons for Wollongong shown in Wunch et al. (2016) and in the comparison to models presented in O'Dell et al. (2017). This bias has been extensively examined by the OCO-2 teams, who have considered viewing geometry, polarization effects, interferents such as aerosols, surface models, and instrument performance. The analysis has not yet yielded insights into the root cause, although in early testing there are indications that the lack of stratospheric aerosols in the current version of the retrieval algorithm can significantly increase bias.
The v7/v7r data version discussed here is the current operational data product. In the future, a v8/v8r data product will be produced that addresses calibration issues as described in Crisp et al. (2017a, b), as well as retrieval algorithm improvements described in O'Dell et al. (2017) such as the land surface treatment and others that are not yet fully tested. Future changes to the retrieval algorithm will focus on improving the parameterization of the patterns of bias for correction, if not direct reduction of the bias.

Conclusions
The OCO-2 mission has been successful in collecting a dense, global set of high-spectral-resolution measurements that are used to estimate the column-averaged atmospheric CO 2 dry air mole fraction, X CO 2 . The first 18 months of the mission have provided 1.3 to 2.4 million X CO 2 measurements per month after screening for data quality. As described in Wunch et al. (2016), the data have a median difference of less than 0.5 ppm with the primary ground-based validation network and root mean square differences typically below 1.5 ppm. This statistic from Wunch et al. (2016) is for data with a warn level below 11 and an "outcome_flag" of zero, which are slightly less strict selection criteria than the 0/1 quality flag. Large-scale features, such as the drawdown of CO 2 in the Northern Hemisphere spring and the increase of CO 2 over Northern Hemisphere winter, are obvious in the data. By meeting the mission goals for accuracy, resolution, and coverage, the OCO-2 mission has provided a dataset that can now be used to assess regional-scale sources (emitters) and sinks (absorbers) around the globe.

Data availability
All of the OCO-2 data products are publicly available through the NASA Goddard Earth Science Data and Information Services Center (GES DISC) for distribution and archiving (http://disc.sci.gsfc.nasa.gov/OCO-2; OCO-2 Science Team, 2015).