Validation of routine continuous airborne CO2 observations near the Bialystok Tall Tower

Since 2002, in situ airborne measurements of atmospheric CO2 mixing ratios have been performed regularly aboard a rental aircraft near Bialystok (53°08′ N, 23°09′ E), a city in northeastern Poland. Since August 2008, the in situ CO2 measurements have been made by a modified, commercially available, fully automated non-dispersive infrared (NDIR) analyzer system. The response of the analyzer has been characterized and the CO2 mixing ratio stability of the associated calibration system has been fully tested, which results in an optimal calibration strategy and allows for an accuracy of the CO2 measurements within 0.2 ppm. Besides the in situ measurements, air samples have been collected in glass flasks and analyzed in the laboratory for CO2 and other trace gases. To validate the in situ CO2 measurements against reliable discrete flask measurements, we developed weighting functions that mimic the temporal averaging of the flask sampling process. Comparisons between in situ and flask CO2 measurements demonstrate that these weighting functions can compensate for atmospheric variability and provide an effective method for validating airborne in situ CO2 measurements. In addition, we show the nine-year records of flask CO2 measurements. The new system, automated since August 2008, has eliminated the need for manual in-flight calibrations, and thus enables an additional vertical profile, 20 km away, to be sampled at no additional cost in terms of flight hours. This sampling strategy provides an opportunity to investigate both temporal and spatial variability on a regular basis.


Introduction
The increase of CO 2 mixing ratios in the atmosphere since pre-industrial times is the most important cause of climate change (IPCC, 2007), and this rise is due to human activities, mainly those involving fossil fuel burning and land use change (Le Quere et al., 2009). Since atmospheric CO 2 contains a signature of surface carbon sources and sinks, a global observational network has been established to monitor CO 2 mixing ratios in the atmosphere. A quantitative determination of the distribution of carbon sources and sinks is paramount if climate studies are to be able to analyze the response of terrestrial ecosystems to climate change and monitor fossil fuel emissions reductions in the near future. To achieve these objectives, long term accurate monitoring of atmospheric CO 2 is indispensable (Heimann, 2009).
Atmospheric transport models have been employed in inverse studies to infer the distribution of carbon sources and sinks from regular long-term CO2 observations (Rayner et al., 1999; Roedenbeck et al., 2003; Peters et al., 2007); however, these estimates are uncertain due to the sparseness of observational constraints as well as to transport and representation errors (Engelen et al., 2002; Gurney et al., 2002; Gerbig et al., 2008). Atmospheric transport models in particular do not accurately represent vertical CO2 gradients of aircraft profiles, which could potentially be responsible for biases in the flux estimates. Therefore, regular aircraft profiles are desirable in order to increase the coverage of atmospheric CO2 observations and to improve how the vertical mixing is represented in transport models. Moreover, measuring vertical profiles of CO2 is the only way to validate observations based on remote sensing techniques, such as Fourier Transform Spectrometers (FTS) (Washenfelder et al., 2006; Deutscher et al., 2010; Geibel et al., 2010; Wunch et al., 2011; Messerschmidt et al., 2011a) from the Total Carbon Column Observing Network (TCCON) and satellite observations, which are expected to become an important source of information in the future (Miller et al., 2005).
H. Chen et al.: Validation of routine continuous airborne CO2 observations near the Bialystok Tall Tower. Published by Copernicus Publications on behalf of the European Geosciences Union.
Regional scale CO2 fluxes have been investigated by aircraft campaigns throughout North America (Gerbig et al., 2003a, b) and south-western France (Sarrat et al., 2007). These campaign-based aircraft measurements are meant to provide intensive CO2 information about a specific region during short periods; they are not, however, able to represent long-term variations of CO2 fluxes. In contrast, existing regular CO2 profiles obtained from flask measurements allow the quantification of carbon fluxes over a longer period (Crevoisier et al., 2010; Ramonet et al., 2010). Therefore, efforts have been made to develop new methods for regular aircraft profiling; for example, both in situ and flask CO2 measurements have been carried out aboard commercial airliners (Machida et al., 2008; Schuck et al., 2009), and aircraft profiles can now be obtained by an innovative sampling system, AirCore (Karion et al., 2010).
Although flask sampling is a reliable way to obtain atmospheric measurements of CO 2 and other trace gases, and can be used to calculate column means of CO 2 from flask profiles without statistically significant bias given a sufficient number of flasks (Bakwin et al., 2003), in situ measurements are advantageous when studying high-frequency variability and quantifying boundary layer mixing processes (Tans et al., 1996;Lloyd et al., 2002). Nevertheless, flask measurements are still important for validating in situ observations that may suffer from severe changes of ambient temperature, pressure, and humidity, as well as vibrations aboard an aircraft.
In situ CO2 mixing ratios have been measured regularly by a modified LI-COR 6251 system on board a rental aircraft (PZL-104 Wilga) near Bialystok, Poland, since 2002. A detailed description of the analyzer system is given in Lloyd et al. (2002). Manual calibrations were performed at predefined altitude levels during a flight in order to remove potential biases due to changes of ambient pressure and temperature; however, significant disagreements between in situ and flask measurements were often found in routine operations. In order to improve the measurement accuracy and to obtain more scientifically useful observations within the same amount of available flight hours, a new airborne CO2 analyzer system was deployed and tested aboard the aircraft in April 2008, and has replaced the above-mentioned LI-COR system for routine measurements since August 2008. The main purpose of these aircraft measurements is to regularly obtain the vertical distribution of atmospheric CO2, which is essential to improve the representation of vertical mixing in transport models. These profiles extend up to 3 km above ground, and have been used in combination with model results for comparison with FTS CO2 retrievals (Messerschmidt et al., 2011b). The temporal coverage of these profiles makes them especially useful for studying the seasonal cycle of column averages.
In this paper, we describe and characterize the new automated continuous CO2 analyzer and its associated calibration system. We also present an accurate way of comparing in situ measurements with the analysis results of flask samples, which weights the in situ data according to their contribution to the flask sample rather than using constant weights for a given time window as done previously. The paper is organized as follows: Sect. 2 introduces the sampling site and the methods of CO2 observations. Section 3 presents the methods for validating airborne in situ CO2 measurements against flask measurements. The measurement data are shown in Sect. 4. Conclusions and discussion appear in Sect. 5.

Site description and flight protocol
In situ measurements of CO2 mixing ratios have been made regularly since 2002 in the vicinity of Bialystok, a city in northeastern Poland. The region is known as "The Green Lungs of Poland", because it is mainly covered by forests, agricultural land, and wetlands with relatively low fossil fuel emissions. Specifically, from 2002 to 2005, in situ ascending CO2 profiles were made over Biebrza National Park (53°31′ N, 22°40′ E, ∼60 km to the northwest of Bialystok); since 2006, the profiles have been sampled over a tall tower (53°18′ N, 23°05′ E, ∼20 km to the north of Bialystok), where quasi-continuous in situ measurements of CO2, CH4, CO, N2O, H2, and SF6 have been made since August 2005 (Popa et al., 2010); since August 2008, two profiles of CO2 have been collected using a new airborne CO2 analyzer system during each flight: an ascending profile over the tall tower and an additional descending profile located ∼10 km to the southwest of Bialystok (53°3′ N, 23°02′ E). During the ascending profiling for all periods, paired flasks were manually taken by an operator using a flask sampler. In most cases, flasks were taken at seven constant altitudes, i.e. 100 m, 300 m, 500 m, 1000 m, 1500 m, 2000 m and 2500 m a.g.l. Ambient humidity and temperature were recorded with a humidity and temperature probe (Vaisala, HMP35D). The aircraft climbed at a speed of ∼1.5 m s−1 and descended at a speed of ∼5.5 m s−1, corresponding to vertical resolutions of ∼14 m and ∼50 m, respectively (the 90 % response time of the CO2 analyzer system was ∼9 s, see Sect. 2.2).
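The vertical resolutions quoted above follow from multiplying the vertical speed by the 90 % response time of the analyzer. The short sketch below reproduces that arithmetic; the function name is illustrative and not part of the original work.

```python
def vertical_resolution(vertical_speed_m_s: float, response_time_s: float) -> float:
    """Height interval traversed during one 90 % response time of the analyzer."""
    return vertical_speed_m_s * response_time_s


RESPONSE_TIME_S = 9.0  # 90 % response time quoted in the text

ascent_res = vertical_resolution(1.5, RESPONSE_TIME_S)   # climb rate from the text
descent_res = vertical_resolution(5.5, RESPONSE_TIME_S)  # descent rate from the text

print(f"ascent: {ascent_res:.1f} m, descent: {descent_res:.1f} m")
```

The results, 13.5 m and 49.5 m, round to the ∼14 m and ∼50 m given in the text.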

Characterization of the analyzer system
The new airborne CO2 analyzer system is a modified version of a commercially available product (AOS Inc., Boulder, CO, USA). It consists of a non-dispersive infrared (NDIR) analyzer, a gas-handling system, and a calibration system. Figure 1 shows the schematic diagram of the analyzer system.
The analyzer employs two infrared light sources, two gas cells, and two solid-state detectors to perform differential absorption measurements. The pressure in a 2 l buffer downstream of the gas cells is stabilized at ∼1100 mbar, a pressure that is higher than the maximum atmospheric pressure. Three CO 2 standards are employed in the analyzer system as calibration gases, which are designated as ref, low, and high. The reference gas has a CO 2 mixing ratio of ∼380 ppm, a level that is close to the atmospheric mean CO 2 mixing ratio. The low and high gases have CO 2 mixing ratios of ∼360 ppm and ∼400 ppm, respectively. There are three operation modes: zero calibration, span calibration, and measurement. During zero calibration, the reference gas flows through the sample cell while no gas flows through the reference cell; thus both cells contain the reference gas, providing a background (zero) signal. Zero calibration is short enough to prevent diffusion of air from the pressure buffer back to the reference cell. During span calibration, low or high standard gas flows through the sample cell, while reference gas flows through the reference cell, resulting in a sensitivity measurement of the analyzer. During measurement mode, the sampling air flows through the sample cell, while the reference gas flows through the reference cell, providing a measurement signal based on the absorption differences in the two cells. The mixing ratio of CO 2 of the sampling air can then be derived using the zero and span measurements.
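The text states only that the mixing ratio is derived from the zero and span measurements; the exact algorithm is not given. As a minimal sketch, one could assume that the differential signal maps to CO2 through a quadratic passing through the three calibration points (zero calibration with the reference gas, low span, high span). The quadratic form, the function name, and the signal values below are all assumptions made for illustration.

```python
import numpy as np


def calibrate(signal, zero_sig, low_sig, high_sig,
              ref_ppm=380.0, low_ppm=360.0, high_ppm=400.0):
    """Convert a differential NDIR signal to CO2 (ppm).

    Assumed model: a quadratic through the three calibration points.
    The nominal mixing ratios are the approximate values from the text.
    """
    coeffs = np.polyfit([low_sig, zero_sig, high_sig],
                        [low_ppm, ref_ppm, high_ppm], deg=2)
    return np.polyval(coeffs, signal)


# Hypothetical detector signals (arbitrary units) for the three calibrations
zero_sig, low_sig, high_sig = 0.00, -0.21, 0.19

sample_ppm = calibrate(0.05, zero_sig, low_sig, high_sig)
print(f"sample CO2: {sample_ppm:.2f} ppm")
```

In practice the zero and span signals drift between calibrations, so they would be interpolated in time to each measurement before such a conversion is applied.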
The flows through the sample and reference cells are ∼180 sccm (standard cubic centimeters per minute, i.e. equivalent to the volume flow rate at 273.15 K and 1013.25 mbar) and ∼10 sccm, respectively. The sample flow is bypassed at the same rate of ∼180 sccm when a zero or span calibration takes place, so that the sample inlet is constantly flushed. Water vapor in the sample air is removed by a chemical dryer tube filled with anhydrous magnesium perchlorate (Mg(ClO 4 ) 2 ) in order to measure the dry mole fraction of CO 2 in air.
The cell volumes are approximately 5 cc. With a flow rate of 180 sccm, the 90 % response time (assuming perfect air mixing in the sample cell) is ∼4 s, which agrees well with the value derived from a laboratory test that switched between calibration and sample gases (see Fig. 2a). This response can be fitted with a single exponential curve. However, the 90 % response time required to switch from one sample gas to another sample gas with a different CO2 mixing ratio is ∼9 s; the increase of the response time is due to the mixing of sample air in the chemical dryer tube and depends on the size of the dryer tube. In this case the response can be fitted with a sum of two exponential curves (see Fig. 2b). The inlet is made of a ∼5 m long 1/4-inch O.D. Synflex tube (type 1300, formerly named Dekabon or Dekoron), and causes a time delay (from when air enters the inlet until it reaches the sample cell) of 47 s at ground level and 34 s at the top sampling height (∼2500 m above ground) due to changes of ambient pressure. The total time lag applied to the 1 Hz in situ CO2 data is the sum of the 90 % response time and the time delay due to the inlet tube, i.e. 56 s at ground level decreasing to 43 s at the top sampling height.
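The ∼4 s figure can be checked with the perfect-mixing model: a well-mixed volume V flushed at flow f responds exponentially with time constant τ = V/f, and reaches 90 % of a step change after τ ln 10. The sketch below assumes that the 180 sccm mass flow can be treated as a volume flow of 180 cc per minute inside the cell, which ignores the actual cell pressure and temperature; the function names are illustrative.

```python
import math


def t90_perfect_mixing(cell_volume_cc: float, flow_cc_per_min: float) -> float:
    """90 % response time of a perfectly mixed cell."""
    tau_s = cell_volume_cc / (flow_cc_per_min / 60.0)  # flushing time constant
    return tau_s * math.log(10.0)  # solves 1 - exp(-t / tau) = 0.9


def total_time_lag(inlet_delay_s: float, response_time_s: float = 9.0) -> float:
    """Lag applied to the 1 Hz data: inlet delay plus 90 % response time."""
    return inlet_delay_s + response_time_s


t90_cell = t90_perfect_mixing(5.0, 180.0)  # about 3.8 s
lag_ground = total_time_lag(47.0)          # 56 s at ground level
lag_top = total_time_lag(34.0)             # 43 s at about 2500 m

print(f"t90 = {t90_cell:.1f} s, lag = {lag_ground:.0f} s to {lag_top:.0f} s")
```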
Temperature variation around the housing of the detectors and the light sources affects the measurements despite the fact that the two detectors of the analyzer are thermally controlled at constant temperature. When each individual internal component of the analyzer (e.g. light sources, detectors) is locally heated, CO 2 mixing ratios change ∼8.3 ppm for every degree change of the housing of the light sources and ∼1.8 ppm for every degree change of the housing of the detectors. This result implies that frequent calibrations are required for this analyzer to remove the thermal impacts. During flights, zero calibrations are made every two minutes while low or high spans are carried out after every other zero calibration (i.e. zero-zero/low-zero-zero/high etc.).
A total calibration period of 12 s is used, based on two facts: (1) the time response of the analyzer is fast, ∼4 s for a 90 % exchange, and (2) the heat flow around the light sources and detectors due to valve switching affects the measurements. A short calibration period minimizes (1) the length of missing data due to calibrations, and (2) the influence of the thermal impact. Nevertheless, laboratory tests show that there are biases in the CO2 measurements of tank air immediately after a 12-s calibration. An experimentally determined exponential curve has been used to correct these biases, and the corrections range from 0.7 ppm to 0.1 ppm.
Fig. 3. Long-term stability of CO2 mixing ratios of one 0.7 l cylinder associated with a pressure regulator from Scott Specialty Gases. The dashed line indicates the mixing ratio of the gas in the filling tank; the solid line shows a long-term trend.

Characterization of the calibration system
The three CO2 standards used for in-flight calibrations are contained in one 3.5 l fiber-wrapped aluminum cylinder (for the reference gas) and two 1.2 l aluminum cylinders (for the low-span and high-span gases). The accuracy of CO2 measurements depends on the stability of the CO2 mixing ratios of the calibration gases delivered into the sample and reference cells, especially in the case of a long-term deployment in the field, e.g. one year or even a couple of years at the Bialystok site. To investigate the long-term CO2 stability of the calibration system, a series of laboratory tests was carried out. A detailed description of the experimental setup is given in Winderlich (2007). This experiment involved tests of the stability of CO2 mixing ratios for eight gas cylinders (volumes 0.75-3.5 l) associated with three different pressure regulators (Premier Industries, Belle Chasse, LA; Scott Specialty Gases, Plumsteadville, PA, 51-14D; TESCOM, Tescom Europe, Selmsdorf, Germany). During these tests, the cylinders were fitted with pressure regulators, followed by high-pressure stop valves that blocked the gas flow when no experimental measurement was being performed; the valves of the cylinders themselves, in contrast, were open all the time. One CO2 standard (392.491 ppm) in a 50 l aluminum tank was used to fill all eight gas cylinders for further tests. The gases from the cylinders were measured at variable intervals depending on the availability of a high-precision Loflo CO2 system (Da Costa and Steele, 1999). The experiment lasted ∼100 days. These tests characterized the influences of pressure regulators and storage in small cylinders on CO2 mixing ratios using two factors: a surface effect and a permeation effect (Fig. 3). These two effects are explained below in detail.
The surface effect can be explained by the tendency of CO2 molecules to adhere to the walls of aluminum cylinders, which is a pressure-dependent process (Langenfelds et al., 2005). The CO2 mixing ratios of the gases in the small cylinders immediately after filling are lower than that of the gas in the filling tank due to the adsorption of CO2 molecules on the walls of these small cylinders, whereas the CO2 mixing ratios of the gases in the small cylinders increase when the pressure drops below a relatively low level of ∼30 bar due to the desorption of CO2 molecules from the walls. The tests revealed that this effect scales with the surface area of the cylinders. For example, the Al2O3-covered aluminum surface can explain the adsorption of 8.3 × 10^16 molecules at the 420 cm^2 inner surface of the 0.7 l cylinder (sum of reversible and irreversible adsorption on Al2O3 from Mao and Vannice, 1994). Relative to the 9.4 × 10^20 molecules within the cylinder, a depletion of 0.04 ppm can be explained. This represents only 36 % of the observed difference and could indicate a 2.75 times larger surface roughness of the cylinders compared to the ideally prepared Al2O3 surfaces of Mao and Vannice (1994). The increase of CO2 mixing ratios when the cylinder pressure is below 30 bar is consistent with the experience of other groups, who use high-pressure calibration standard gases only until the pressure drops to 5 to 35 bar (Daube et al., 2002; Langenfelds et al., 2005; Keeling et al., 2007). The approach of mass conservation leads to an enrichment of +0.44 ppm below ∼30 bar (equivalent to −0.11 ppm at 120 bar), which has the same magnitude as the observations.
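The numbers in this paragraph can be retraced with a few lines of arithmetic. The sketch below rests on three assumptions that the text does not state explicitly: that 9.4 × 10^20 counts the CO2 molecules in the cylinder, that the gas has the 392.491 ppm mixing ratio of the filling standard, and that the effect of a fixed number of wall-bound molecules scales inversely with cylinder pressure.

```python
ADSORBED_MOLECULES = 8.3e16         # adsorbed on the 420 cm^2 wall (from the text)
CO2_MOLECULES_IN_CYLINDER = 9.4e20  # assumed to count CO2 molecules only
FILL_PPM = 392.491                  # filling standard (assumed to apply here)

# Depletion of the gas-phase mixing ratio caused by wall adsorption
depletion_ppm = ADSORBED_MOLECULES / CO2_MOLECULES_IN_CYLINDER * FILL_PPM

QUOTED_DEPLETION_PPM = 0.04  # rounded value given in the text
OBSERVED_DEFICIT_PPM = 0.11  # observed deficit at 120 bar (from the text)

explained_fraction = QUOTED_DEPLETION_PPM / OBSERVED_DEFICIT_PPM  # about 36 %
roughness_factor = OBSERVED_DEFICIT_PPM / QUOTED_DEPLETION_PPM    # 2.75

# Mass conservation: the same number of molecules in a quarter of the gas
enrichment_30bar_ppm = OBSERVED_DEFICIT_PPM * 120.0 / 30.0        # 0.44 ppm

print(f"depletion {depletion_ppm:.3f} ppm, explained {explained_fraction:.0%}, "
      f"roughness x{roughness_factor:.2f}, enrichment {enrichment_30bar_ppm:.2f} ppm")
```

Under these assumptions the computed depletion is about 0.035 ppm, close to the 0.04 ppm quoted in the text; the remaining figures follow directly from the rounded values.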
Because some air constituents preferentially permeate the polymer material used in pressure regulators, the CO2 mixing ratio of the gases on the high-pressure side of the pressure regulator (and eventually of the gases in the cylinders) can be modified. For example, the first stage of the Scott regulator is equipped with a Viton-sealed piston. CO2 molecules preferentially diffuse through this polymer (Sturm et al., 2004), causing the air on the high-pressure side to become depleted in CO2; on the low-pressure side, CO2 molecules accumulate and then diffuse away when the mixing ratio of CO2 is higher than the ambient. Therefore, for long-term operation, the CO2 mixing ratios of gases in the cylinders tend to decrease with time. In contrast, during each analysis of the tank air after more than 4 h of storage, the CO2 mixing ratio increases until the CO2-depleted air on the high-pressure side of the pressure regulator is flushed, as can be seen from the measured CO2 mixing ratios around 120, 40, and 25 bar in Fig. 3. This effect has been reported repeatedly (Da Costa et al., 1999; Daube et al., 2002; Keeling et al., 2007). Tests show that a TESCOM regulator has a smaller permeation effect; however, this regulator is too large to be employed in our airborne analyzer. The observed drift for 0.75 l cylinders is −0.15 ± 0.06 ppm/100 days during these tests, when the cylinder valves are open and the regulators are constantly attached.
Apart from the cylinder size, the variations of various parameters in different testing setups (temperature: laboratory conditions vs. 40 • C; fitting material: stainless steel vs. brass; pressure regulator type: Scott or Premier Industries) were investigated, and no influence on the trend of the CO 2 mixing ratios was observed.
These laboratory tests led to a strategy for the use of the calibration system of the NDIR analyzer during flight: (1) calibrating the CO 2 mixing ratio of air in the small cylinder after being filled instead of using the value of the filling tank; (2) using the cylinders only when the pressure is above 30 bar, a conservative level below which CO 2 mixing ratios may significantly increase due to desorption of CO 2 molecules from the walls of the cylinders; (3) flushing the dead volume in the pressure regulators before measurements are started during a flight; (4) calibrating the small cylinders before and after deployment in the field to characterize a potential long-term drift in CO 2 mixing ratios due to the diffusion effect. When these rules are followed, deviations ranging from −0.2 to +0.1 ppm have been observed in the laboratory tests. Therefore, our laboratory experiments suggest that such a calibration system can supply the measurement system with a stable CO 2 mixing ratio within 0.2 ppm.
In addition, we compute the CO 2 mixing ratios of the small calibration cylinders inside the NDIR analyzer system by measuring three calibrated working standards as sampling air on the same NDIR analyzer system. This mimics the atmospheric sampling, and can compensate for known biases, e.g. the thermal impact on measurements of calibration gases (similar impact on measurements of sample air immediately after calibrations has been discussed in Sect. 2.2).
Our flight interval is normally about one to three weeks; according to the CO2 stability test, the depletion of CO2 in the regulator over such an interval could be as large as 0.5 to 1.0 ppm. To overcome this, at least 1 l of gas should be flushed through the regulators before flight, which ensures that the mixing ratios of the calibration gases running through the analyzer during flight are within 0.1 ppm of the real stable values. The cylinders should be used only until the pressure of one of the cylinders drops below 30 bar. Calibrations of the gases in the three in-flight cylinders against five external working cylinders before and after an eight-month deployment in Bialystok showed that the drifts of CO2 mixing ratios were smaller than 0.2 ppm. Our working cylinders are calibrated relative to the MPI-BGC GasLab laboratory standards calibrated by NOAA-ESRL (Zhao and Tans, 2006). The traceability of these laboratory standards to NOAA-ESRL at a level of 0.03 ppm for CO2 has been confirmed by comparison programs.

Validation of in situ measurements with analysis results of discrete flasks
During flight, air samples were collected by an operator using a flask sampler, in which paired glass flasks were connected in series and filled to ∼1 bar above ambient pressure.
The sampling air was dried with magnesium perchlorate before being filled into the flasks. Valves with either Perfluoroalkoxy (PFA) or Polychlorotrifluoroethylene (PCTFE) O-rings were used to seal the flasks. A 0.003 ppm day−1 decrease in CO2 has been found for those with PFA O-rings during a storage test of ∼300 days, whereas no loss of CO2 has been found for those with PCTFE O-rings during a storage test of ∼400 days. The flasks were analyzed by an automated gas chromatographic (GC) system in the GasLab at MPI-BGC. To ensure the quality of the measurements, the sample air from the flasks was passed through an additional magnesium perchlorate dryer before being analyzed by the GC system. Based on the results of the flask storage tests, a 0.003 ppm day−1 correction has been applied for flasks with PFA O-rings, and no correction has been made for those with PCTFE O-rings. The adsorption effect has not been observed during laboratory tests. This analytical system is regularly checked by a flask comparison program ("sausage flask program") and its consistency has been verified. The typical analytical precision of the flask measurements at MPI-BGC is better than 0.06 ppm. Therefore, comparison of in situ CO2 measurements with the analysis results of flasks offers one way to assess the accuracy of the in situ measurements. Given that air does not flow into the flasks instantaneously, flask sample data cannot be compared directly with in situ measurements. Instead, the CO2 mixing ratio of the air in the flask is a weighted average of the mixing ratios of the air during the flask flushing and filling time. During flight, flask samples are collected in two steps: first, air is pumped through the flasks at ambient pressure for about 5 min to flush out the conditioning air in the flasks, and then the flasks are pressurized until the pressure reaches ∼1 bar above the ambient pressure.
Based on the flask filling procedure, weighting functions for in situ measurements have been developed for comparison with flask analysis results.

Method for comparison of in situ measurements with single flask measurements
Briefly, the weighting function is derived from the assumption that the air entering a flask mixes instantaneously with the existing air in the flask. This perfect mixing has been demonstrated in laboratory tests, in which a step change was made in the CO2 mixing ratio of the air flowing to the flask and the CO2 in the air leaving the flask was analyzed with an analyzer based on the cavity ring-down spectroscopy technique. Exponential responses to this step change have been observed at flow rates from 0.5 to 3.5 l min−1, indicating that the assumption of perfect mixing gives a good approximation of air mixing in the flask during the flask sampling process aboard aircraft. For one single flask, the flask sampling process consists of two steps: flushing and pressurizing (see Fig. 4). During the flushing process, air flows into the flask, is instantaneously mixed, and then flows out of the flask at the same flow rate, f_0; at the time when the pressurizing period starts, the fraction of the air (entering the flask at time t) remaining in the flask is c(t). During the pressurizing process, air flows into the flask at a decreasing flow rate of f(t), and the flask is pressurized until the flask sampling is completed. In the Appendix, analytical formulas for c(t) and f(t) are presented, from which the weighting function for integrating in situ measurements for comparison with the analysis result of a single flask is derived (see Appendix A1). In this weighting function, P_s and P_e are the flask pressures when the flask pressurizing process starts (t = t_s) and ends (t = t_e), and p(t) is the flask pressure at time t. The time scale is relative to a chosen time (100 s for one single flask, and 150 s for paired flasks) prior to the start of pressurizing, which is empirically determined so that the weighting at t = 0 is negligibly small. The weighting function for integrating in situ measurements to compare them with the analysis result of one single flask is shown in Fig. 5.
The weighting function is normalized to 1 and has its maximum value at the time when the pressurizing starts, t = t_s.
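The following is a minimal sketch of a single-flask weighting function built from the perfect-mixing picture described above; it is not the formula of Appendix A1. It assumes uniformly sampled data, an exponential memory with time constant tau during flushing, and, during pressurizing, a weight proportional to the pressure increase in each sampling interval. The function name, tau, and the synthetic pressure curve are hypothetical.

```python
import numpy as np


def single_flask_weights(t, p, t_s, tau):
    """Discrete weights (summing to 1) for in situ samples at times t.

    t   : uniformly spaced sample times (s)
    p   : flask pressure at those times
    t_s : start of pressurizing (s)
    tau : flushing time constant, flask volume / flushing flow (s), assumed known
    """
    t = np.asarray(t, dtype=float)
    p = np.asarray(p, dtype=float)
    dt = t[1] - t[0]
    p_s = np.interp(t_s, t, p)           # pressure when pressurizing starts
    flushing = t < t_s
    w = np.zeros_like(t)
    # Flushing: air that entered at time t has decayed exponentially by t_s
    w[flushing] = p_s * (dt / tau) * np.exp(-(t_s - t[flushing]) / tau)
    # Pressurizing: all air entering stays, so the weight is the pressure added
    dp = np.diff(p, append=p[-1])
    w[~flushing] = np.clip(dp[~flushing], 0.0, None)
    return w / w.sum()


# Synthetic example: 100 s of flushing at 1 bar, then pressurizing towards 2 bar
t = np.arange(0.0, 151.0, 1.0)
t_s, tau = 100.0, 15.0
p = np.where(t < t_s, 1.0, 2.0 - np.exp(-(t - t_s) / 15.0))

w = single_flask_weights(t, p, t_s, tau)
co2_in_situ = 385.0 + 2.0 * np.sin(t / 20.0)   # hypothetical 1 Hz record (ppm)
flask_equivalent = np.sum(w * co2_in_situ)     # value to compare with the flask
```

With these inputs the weights peak at t = t_s and decay on both sides, which is the qualitative shape described for Fig. 5.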

Method for comparison of in situ measurements with paired flask measurements
For the case of paired flasks, the flask sampling process consists of the same two processes: flushing and pressurizing (see Fig. 6). During flushing, air flows into and out of the upstream flask and then the downstream flask at a flow rate of f_0; when the pressurizing period starts, the fraction of the air (entering the upstream flask at time t) remaining in the upstream flask is c_1(t), while the fraction of the air remaining in the downstream flask is c_2(t). During the pressurizing period, air flows into the upstream flask at a decreasing flow rate of f(t), but out of the flask at a flow rate of f(t)/2; at the time when the pressurizing period ends, the fraction of the pressurizing air (entering the upstream flask at time t) remaining in the upstream flask is c'_1(t), while the fraction of the air that has passed into the downstream flask is c'_2(t). It is important to note that a fraction of the flushing air flows from the upstream flask into the downstream flask during the pressurizing period. The process-based mass balance equations with variables f(t), c_1(t), c_2(t), c'_1(t), and c'_2(t) are given and solved in Appendix A2 to derive the weighting function for integrating in situ measurements for comparison with the analysis result of the upstream flask of a pair. The weighting function for the downstream flask of a pair is derived in the same way (see Appendix A2). In both weighting functions, P_s and P_e are the flask pressures (both flasks have the same pressure) when the flask pressurizing process starts and ends, and p(t) is the flask pressure at time t. The weighting functions for integrating in situ measurements to compare with pair-flask analysis results are shown in Fig. 7.
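To illustrate why the two flasks of a pair sample slightly different air, the sketch below models only the flushing stage, treating the pair as two perfectly mixed volumes of equal size in series. It is not the full solution of Appendix A2: the pressurizing stage is omitted, and the time constant tau and the function name are hypothetical.

```python
import numpy as np


def paired_flush_weights(t, t_s, tau):
    """Flushing-stage weights for the upstream and downstream flask of a pair.

    For air that entered the upstream flask at time t, with x = (t_s - t) / tau:
      upstream fraction remaining at t_s    ~ exp(-x)
      downstream fraction remaining at t_s  ~ x * exp(-x)   (two tanks in series)
    Each set of weights is normalized to sum to 1 over the given window.
    """
    t = np.asarray(t, dtype=float)
    x = (t_s - t) / tau
    up = np.exp(-x)
    down = x * np.exp(-x)
    return up / up.sum(), down / down.sum()


t = np.arange(0.0, 151.0, 1.0)   # 150 s window before pressurizing starts
w_up, w_down = paired_flush_weights(t, t_s=150.0, tau=15.0)

# The downstream flask "remembers" air that entered about one tau earlier
lag_s = t[np.argmax(w_up)] - t[np.argmax(w_down)]
print(f"downstream weights peak {lag_s:.0f} s before the upstream weights")
```

Because the downstream weights peak earlier and have a longer tail, paired flasks can differ when CO2 changes rapidly along the flight track, and a longer time window is needed before the weight at t = 0 becomes negligible.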
As an example, the weighting functions are used here to integrate in situ measurements of CO2 mixing ratios, and the results are compared with flask measurement data. The CO2 mixing ratios measured by the NDIR analyzer and from analyses of flask samples during a flight on 20 August 2008 in Bialystok, Poland, are shown in Fig. 8. The flask CO2 data are shown as blue (upstream) and green (downstream) dots. At about 45 700 s, the CO2 values from the paired flasks differed by a few ppm, even though the flasks were taken simultaneously.
The differences between integrated in situ and flask CO2 mixing ratios using constant weights (1/120 over a 120 s window) and using the above-described weighting coefficients are shown in Fig. 8. The improved agreement between averaged in situ and flask CO2 mixing ratios when using the weighting functions shows that atmospheric CO2 variability can be accounted for when the proper weighting functions are used to integrate the in situ CO2 values.

Validation of in situ measurements with flask CO 2 measurements
A direct comparison of integrated in situ CO2 values with 216 flasks from 22 flights is shown in Fig. 9a. [...] mixing ratios are properly corrected using the flask values and water vapor measurements. The biases in the differences between in situ and flask CO2 during two periods in Fig. 9a are caused by residual water vapor in the air after the chemical dryer. This effect can be clearly seen when the differences are plotted per flight as a function of ambient water vapor mixing ratios (see Fig. 10). The hypothesis is that the water vapor mixing ratios after the chemical dryer are proportional to the ambient values, and that the drying efficiency of the chemical dryer decreases with time (from flight to flight). Linear regression models are fitted per flight using the least squares approach for the differences between in situ and flask CO2 as a function of water vapor mixing ratios. One slope value is obtained from each linear regression and is used to correct the in situ measurements of CO2 mixing ratios based on the measured ambient water vapor mixing ratios. The comparison of integrated in situ and flask CO2 measurements after correcting the water vapor effects for the 10 flights is shown in Fig. 9b, with the corrected values shown in blue. The mean difference between in situ and flask CO2 values reduces to 0.06 ppm with a standard deviation of 0.45 ppm.
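The per-flight correction described above can be sketched as follows. The sketch assumes that the difference is defined as in situ minus flask, that an intercept is fitted but only the slope is used for the correction, and that water vapor is given in arbitrary consistent units; none of these details are specified in the text, and the function names and data are hypothetical.

```python
import numpy as np


def water_vapor_slope(h2o, diff_ppm):
    """Least-squares slope of (in situ - flask) CO2 versus ambient water vapor."""
    slope, _intercept = np.polyfit(h2o, diff_ppm, deg=1)
    return slope


def correct_in_situ(co2_in_situ, h2o, slope):
    """Remove the water-vapor-dependent bias from in situ CO2."""
    return np.asarray(co2_in_situ) - slope * np.asarray(h2o)


# Synthetic single flight: the bias grows linearly with ambient water vapor
h2o = np.array([2.0, 4.0, 6.0, 8.0])   # hypothetical units
flask = np.full(4, 385.0)              # flask CO2 (ppm)
in_situ = flask - 0.1 * h2o            # biased in situ CO2 (ppm)

slope = water_vapor_slope(h2o, in_situ - flask)
corrected = correct_in_situ(in_situ, h2o, slope)
```

Fitting one slope per flight lets the correction follow the gradual loss of drying efficiency between flights.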

Flask CO 2
The time series of CO2 mixing ratios at 300 m and 2500 m from 2002 to 2010 are shown in Fig. 11, excluding flasks that have been flagged as contaminated. Flasks are flagged as contaminated when abnormally low values of δ13C (δ13C < −10 ‰ on the VPDB scale) and abnormally high values of CO (CO > 500 ppb) and H2 (H2 > 600 ppb) are observed. From 2002 to 2004, compressed air from Messer Griesheim Ltd was used to condition the flasks. This air contains ambient-level mixing ratios of CO2, CH4, N2O, and SF6, but during some periods it was heavily polluted with CO and H2. The pollution affected the analysis of air samples for CO and H2 mixing ratios when the conditioning air was not completely flushed before air samples were collected. Since 2005, dried ambient air, drawn from the roof of the Max Planck Institute for Biogeochemistry and compressed into high-pressure cylinders with a compressor system, has been used as conditioning air to eliminate this problem.
Fig. 10. Linear regression models are fitted per flight using the least squares approach for the differences between in situ and flask CO2 as a function of water vapor mixing ratios. The differences between in situ and flask CO2 are denoted by different colors for each flight in the plots. Panels (a) and (b) show two periods during which the in situ measurements of CO2 mixing ratios have been affected by residual water vapor in the sampling air after the chemical dryer.
Because the data prior to 2005 are sparse, a linear trend and a third order harmonic function have been fitted only to the CO2 data after 2005, separately at 300 m and at 2500 m (see Fig. 11). For comparison, the reference marine boundary layer CO2 (Masarie and Tans, 1995; GLOBALVIEW-CO2, 2011) is interpolated to the latitude of the flask sampling site and shown in Fig. 11. The calculated slope is 2.15 ± 0.42 ppm yr−1 for the data at 300 m, and 2.15 ± 0.14 ppm yr−1 for the data at 2500 m. For the marine reference, the slope is 1.81 ± 0.03 ppm yr−1. The uncertainties are given as standard errors of the estimated trends. The relatively large uncertainty in the trend determined from the 300 m data is due to large scatter. The few high biases in winter coincide with high CO values, suggesting influences from local pollution, likely from the nearby city. The trend difference indicates that for recent years, the increase rate of CO2 at the Bialystok site is larger than that of the marine reference, which could be explained by a transport pattern change or a change in the fluxes that contribute to the CO2 data at 2500 m relative to those contributing to the marine boundary layer CO2 (Ramonet et al., 2010). Using the same measurement period as in Popa et al. (2010), between July 2005 and December 2008, the CO2 growth rates estimated from CO2 data at 300 m and 2500 m are 2.11 ± 0.64 ppm yr−1 and 2.28 ± 0.18 ppm yr−1, respectively.
These values are consistent with the estimated value of 2.02 ± 0.46 ppm yr−1 using 300 m CO2 data from the Bialystok tall tower (Popa et al., 2010). In summer, both the CO2 level at 300 m and the marine boundary layer CO2 are significantly lower than the CO2 level at 2500 m due to the uptake of CO2 by plants, whereas in winter, regional fossil fuel emissions increase the level of CO2 at 300 m. To calculate the seasonal amplitude, CO2 data at 300 m and 2500 m for the period between July 2005 and December 2008 are de-trended using the linear trends derived from the above-described fits, and then fitted to third order harmonic functions. The results show that the seasonal amplitude of CO2 at 2500 m (10.5 ppm) is significantly smaller than that of CO2 at 300 m (20.4 ppm). The planetary boundary layer heights that are determined from the vertical profiles of temperature and water vapor are between 300 m and 2500 m. Both seasonal cycles have minimum values around August; however, the CO2 mixing ratio at 300 m decreases abruptly in spring, while the CO2 at 2500 m decreases smoothly from April to August. This reflects the larger influence that CO2 uptake by plants has on the 300 m level than on the 2500 m level in the free troposphere.
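A fit of a linear trend plus a third order harmonic function can be sketched as an ordinary least-squares problem. The sketch below is an assumption-laden illustration: it fits trend and harmonics jointly (the text de-trends first and then fits the harmonics), it expects time in decimal years, it defines the seasonal amplitude as the peak-to-trough range of the harmonic part, and all function names are hypothetical.

```python
import math


def design_row(t, n_harmonics=3):
    """Basis functions at time t (years): 1, t, sin/cos of 1..n cycles per year."""
    row = [1.0, t]
    for k in range(1, n_harmonics + 1):
        row.append(math.sin(2.0 * math.pi * k * t))
        row.append(math.cos(2.0 * math.pi * k * t))
    return row


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_trend_and_harmonics(times, values, n_harmonics=3):
    """Least-squares coefficients [offset, slope, s1, c1, s2, c2, ...]."""
    rows = [design_row(t, n_harmonics) for t in times]
    p = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    aty = [sum(r[i] * y for r, y in zip(rows, values)) for i in range(p)]
    return solve_linear(ata, aty)


def seasonal_amplitude(coeffs, n_harmonics=3, steps=3650):
    """Peak-to-trough range of the fitted harmonic part over one year."""
    def seasonal(t):
        row = design_row(t, n_harmonics)
        return sum(c * v for c, v in zip(coeffs[2:], row[2:]))
    vals = [seasonal(i / steps) for i in range(steps)]
    return max(vals) - min(vals)
```

Here `coeffs[1]` plays the role of the growth rate in ppm yr−1; standard errors such as those quoted in the text would require the residual variance and are not computed in this sketch.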
Furthermore, the seasonal cycle of CO2 gradients (differences of CO2 values at altitudes of 300 m and 2500 m) is calculated and shown in Fig. 12. These CO2 gradients contain useful information for estimating carbon fluxes between the surface and the free troposphere, and for improving the representation of vertical mixing in transport models (Lai et al., 2006; Stephens et al., 2007). As before, a smoothed curve has been fitted to these data using a third order harmonic function. It shows that from April to September the CO2 gradients are negative, with the minimum value in July, mainly due to uptake of CO2 by plants through photosynthesis; for the rest of the year the gradients are positive, indicating that CO2 surface sources dominate sinks.

In situ CO 2
As an example, in situ continuous CO 2 mixing ratio profiles from a flight on 20 August 2008 are shown in Fig. 13. The collection of two profiles from each flight provides an opportunity to assess the spatial variability of mixed-layer CO 2 averages based on observations. Flights were made every one to three weeks, around mid-day under fair weather conditions. Ascending profiles were usually made over a national park, while descending profiles were taken over a mixture of forest and cultivated land that is about 20 km away and is on the other side of the city of Bialystok. Descending profiles were always made after ascending profiles, roughly 50 min later (the average time difference between the time when the ascending and the descending profiles are carried out).
The planetary boundary layer (PBL) heights are determined from the virtual potential temperature profiles using the parcel method (Seibert et al., 2000). The mixed-layer average CO 2 mixing ratio for each profile, CO 2 , is calculated as the mass weighted average, excluding the bottom 10 % and the top 20 % of the mixed layer to avoid the influence of both the surface layer at the bottom and the entrainment zone at the top. The differences of mixed-layer CO 2 averages between the ascending and the descending profiles are shown in Fig. 14, separated as the part of the growing season with peak carbon uptake (June, July, and August) and the rest of the growing season (April, May, and September), hereafter referred to as the peak growing season and the non-peak growing season, respectively. The uncertainty of the mixed-layer averages for each profile is estimated based on the method employed in Gerbig et al. (2003a). The uncertainty ranges from 0.04 to 0.41 ppm for individual profiles. The uncertainty of the differences is the square root of the sum of variances of the ascending and the descending profiles.
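The mixed-layer averaging and the uncertainty of the profile difference can be illustrated as follows. This is a sketch under stated assumptions: the per-level air mass weights (`layer_mass`, for example density times layer thickness) are a hypothetical input, heights are taken relative to the ground, and the per-profile uncertainties are supplied by the caller rather than derived with the method of Gerbig et al. (2003a).

```python
import math


def mixed_layer_average(heights, co2, layer_mass, pbl_height,
                        bottom_frac=0.10, top_frac=0.20):
    """Mass-weighted CO2 average over the mixed layer.

    Levels in the bottom 10 % and the top 20 % of the mixed layer are
    excluded to avoid the surface layer and the entrainment zone.
    """
    lower = bottom_frac * pbl_height
    upper = (1.0 - top_frac) * pbl_height
    num = 0.0
    den = 0.0
    for z, c, m in zip(heights, co2, layer_mass):
        if lower <= z <= upper:
            num += m * c
            den += m
    if den == 0.0:
        raise ValueError("no samples inside the retained part of the mixed layer")
    return num / den


def difference_uncertainty(sigma_ascending, sigma_descending):
    """Square root of the sum of variances of the two profile averages."""
    return math.sqrt(sigma_ascending ** 2 + sigma_descending ** 2)
```

With the quoted per-profile range of 0.04 to 0.41 ppm, the uncertainty of a difference would lie between roughly 0.06 and 0.58 ppm.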
The differences of mixed-layer CO2 averages between the ascending and the descending profiles during the peak growing season are significantly larger than 0 ppm (t-test p-value 0.006), whereas for the non-peak growing season they are not significantly different from 0 ppm (t-test p-value 0.115). The differences of mixed-layer CO2 averages could have resulted from two main factors: spatial gradients, or changes in time associated with CO2 sources or sinks at the surface. During the peak growing season, CO2 is depleted in the mixed layer due to the uptake by vegetation, and as a result, the mixed-layer CO2 average during the ascent is higher than the mixed-layer CO2 average during the descent made roughly 50 min later. The average change in CO2 during the growing season (Jun-Sep) between 10:00 and 15:00 LT (local time) is estimated to be 0.24 ppm per 50 min based on tower observations (Popa et al., 2010), which is much smaller than the mean difference of 1.1 ppm found from the in situ aircraft profiles. Therefore, the differences must be mainly due to spatial variations. The variability of the differences of the mixed-layer average CO2 is 1.2 ppm during the peak growing season, which is larger than that during the non-peak growing season, 0.6 ppm. No differences between ascending and descending profiles are shown for winter months because we do not have enough in situ data to perform this analysis.

Discussion and conclusion
Accurate in situ measurements of CO2 mixing ratios have been achieved using a modified commercially available NDIR analyzer system. An optimized calibration strategy has been derived based on characterization of the analyzer and test results of the stability of CO2 mixing ratios in small cylinders. An in-flight calibration system is necessary for in situ analyzer systems to account for potential drift due to instability under severe conditions of vibrations, changing temperature and pressure aboard aircraft (Anderson et al., 1996; Daube et al., 2002; Machida et al., 2008). It is worth pointing out that CO2 measurements using state-of-the-art laser-based techniques (O'Keefe, 1998; Bowling et al., 2003; Crosson, 2008; McManus et al., 2008) do not require calibrations as frequently as the NDIR analyzer does. Specifically, the recently available cavity ring-down spectroscopy technique has been proven to be sufficiently stable aboard a research aircraft within a field campaign period. However, even with a stable analyzer system, an in-flight calibration system is still recommended when no other independent measurements are available or if the analyzer needs to be deployed over the long term. The automation of the new system after August 2008 eliminates the requirement of manual in-flight calibrations on certain constant height levels, and thus, by saving flight time, allows for more extensive spatial sampling of the atmosphere. Observed spatial gradients between two vertical profiles sampled at 20 km distance near the Bialystok tall tower indicate spatial differences in upstream source-sink distributions. In combination with high-resolution transport modeling, these observations provide important information on representation errors when utilizing tall tower data in inverse models to infer surface-atmosphere fluxes.
A method for comparing in situ with flask CO2 measurements using weighting functions has been developed that is applicable to both single and paired flask samples. Comparisons between in situ and flask CO2 measurements demonstrate that atmospheric variability can be well accounted for by using weighting functions. Therefore, one should compare all flasks with in situ data regardless of atmospheric variability. However, it is critical to have the exact time when the pressurizing process starts, and the flask pressure or the flow rate during the flask sampling process. When these parameters are not available, the comparison is certainly sensitive to the atmospheric variability. The comparison of in situ with flask CO2 measurements during flight has been successfully employed to identify water contamination issues during two periods. Since CO2 needs to be reported as dry mole fraction, water contamination is an issue for any technology that detects CO2 in dry air and relies on a drying system to remove water vapor from sample air to a sufficiently low level. The cavity ring-down spectroscopy (CRDS) technique has successfully used simultaneously measured water vapor to correct all water vapor effects for CO2. However, this has not been achieved or reported for other technologies. These weighting functions can be applied to compare various in situ continuous measurements with discrete measurements of other trace gases. In addition, when flask measurements from a mobile platform are used in a modeling framework, the effective location (latitude, longitude, and altitude) of the flask measurements can be derived from integrating corresponding in situ continuous measurements using these weighting functions.
In addition, we show the nine-year records of flask CO2 from which the CO2 increase rates after 2005 are computed for the 300 m level (2.15 ± 0.42 ppm yr−1) and for the 2500 m level (2.15 ± 0.14 ppm yr−1). The difference between a reference trend of marine boundary layer CO2 and that of our CO2 data at 2500 m is likely significant, and could be explained by a transport pattern change or a change in the fluxes that contribute to the CO2 data at 2500 m relative to those contributing to the marine boundary layer CO2. The regular sampling of two profiles that are 20 km apart provides an opportunity to investigate temporal and spatial variability.
The following presents a detailed description of how the weighting functions for integrating in situ measurements are derived, i.e. how single and paired flask measurements are compared based on two assumptions during the flask sampling process: (1) incoming air mixes instantaneously with existing air in the flasks; (2) the change of temperature in the flasks is negligible.

A1 Single flask model
The weighting function for integrating in situ measurements to compare them with a single flask measurement is divided into two parts based on the processes during flask sampling: flushing and pressurizing (see Fig. 5). When the flask sampling is completed, the influence of remaining conditioning air on the CO2 mixing ratio in the flask is negligible. The mixing ratio of CO2 in the flask is determined by the CO2 mixing ratios of the sampling air from the start of flushing until pressurizing is complete, weighted by a function. The CO2 mixing ratio within the flask can be written as:
$$\langle \mathrm{CO_2} \rangle = \int_0^{t_s} W_f(t)\,\mathrm{CO_2}(t)\,dt + \int_{t_s}^{t_e} W_p(t)\,\mathrm{CO_2}(t)\,dt \tag{A1}$$
where $\langle \mathrm{CO_2} \rangle$ is the CO2 mixing ratio of the air in the flask; $t_s$ and $t_e$ are the times when the pressurizing process starts and ends; $W(t)$ is the weighting function that consists of $W_f(t)$ and $W_p(t)$, for the flushing and the pressurizing periods, respectively. The weighting function is proportional to the amount of the air (entering the flask at time $t$) remaining in the flask when the flask sampling is completed, i.e. the volume of sampling air flowing into the flask at time $t$ multiplied by the fraction of the air that is preserved in the flask, given that the volume is reported at the same pressure. The integral of the overall weighting function is normalized to 1.
During the flushing period ($0 < t < t_s$), the incoming air mixes with the air in the flask and flows through the flask. When the pressurizing starts ($t = t_s$), the air already in the flask is preserved. Because the flushing period is short (around 2 min), the ambient air pressure and the volume flow rate can be regarded as constants, i.e. $f(t) = f_0$, $p(t) = P_s$ (throughout the text, we use lower case $p$ for the pressure at any time and capital $P$ for the pressure at particular times). The mass balance for air in the flask at any time $t$ can be written as:
$$V\,\frac{dc(t')}{dt'} = -f_0\,c(t') \tag{A2}$$
where $c(t')$ is, at any given time $t'$ ($t < t' < t_s$), the fraction of the air (in the flask at time $t$) remaining in the flask, given the boundary condition $c(t' = t) = 1$; $V$ is the volume of the flask, and $f_0$ is the volume flow rate at the ambient pressure $P_s$. The solution of the equation is
$$c(t') = e^{-(t'-t)/\tau}, \quad \tau = \frac{V}{f_0} \tag{A3}$$
At the end of the flushing period, i.e. $t' = t_s$, the fraction of the air (in the flask at time $t$) remaining in the flask is
$$c(t_s, t) = e^{-(t_s-t)/\tau} \tag{A4}$$
According to Eq. (A4), for the air entering the flask at any given time $t$ (with the volume $f_0 \cdot dt$), the remaining volume in the flask at time $t_s$ is $f_0 \cdot dt \cdot e^{-(t_s-t)/\tau}$. The weighting function $W_f(t)$ is then proportional to $f_0 \cdot dt \cdot e^{-(t_s-t)/\tau}$:
$$W_f(t) \sim e^{-(t_s-t)/\tau} \tag{A5}$$
During the pressurizing process, all incoming air is kept in the flask until the whole flask sampling process is completed (see Fig. 5). The weighting function $W_p(t)$ is thus proportional to the volume flow rate, for which mass balance can be depicted as follows:
$$W_p(t) \sim f(t) = \frac{V}{P_s}\,\frac{dp(t)}{dt} \tag{A6}$$
where $P_s$ is the ambient pressure before the pressurizing period starts, $f(t)$ is the volume flow rate at the pressure of $P_s$, and $p(t)$ is the air pressure in the flask. When the flask sampling is completed, the flask pressure is $P_e$, and the fraction of all flushing air in the flask is
$$F_f = \frac{P_s}{P_e} \tag{A7}$$
and the fraction of all pressurizing air in the flask is
$$F_p = 1 - \frac{P_s}{P_e} \tag{A8}$$
Based on Eqs. (A5)-(A8), the weighting coefficients for integrating in situ measurements to compare with one single flask are described as
$$\begin{cases} W_f(t) = \dfrac{P_s}{P_e} \cdot \dfrac{1}{\tau}\, \dfrac{e^{-(t_s-t)/\tau}}{1 - e^{-t_s/\tau}}, \quad \tau = \dfrac{P_s}{dp(t_s)/dt}, & 0 < t < t_s \\[2ex] W_p(t) = \dfrac{1}{P_e} \cdot \dfrac{dp(t)}{dt}, & t_s \le t < t_e \end{cases} \tag{A9}$$

A2 Paired flask model
The weighting coefficients for integrating in situ measurements to compare them with paired flask measurements are also divided into two parts during the flask sampling: flushing and pressurizing; however, the situations for the upstream and the downstream flasks are different and need to be considered separately. The CO2 mixing ratio within each flask can be written as:
$$\langle \mathrm{CO_2} \rangle_{1,2} = \int_0^{t_s} W_{1f,2f}(t)\,\mathrm{CO_2}(t)\,dt + \int_{t_s}^{t_e} W_{1p,2p}(t)\,\mathrm{CO_2}(t)\,dt \tag{A10}$$
where the subscripts 1 and 2 denote the upstream and the downstream flasks, respectively.
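For the single flask case of Sect. A1, the weighting can be illustrated with a small discrete sketch. It is not the authors' code, and it rests on assumptions that are mine: uniformly spaced samples, a flushing share of P_s/P_e distributed as exp(-(t_s - t)/tau) with tau = P_s / (dp/dt at t_s), and a pressurizing share of 1 - P_s/P_e distributed in proportion to the pressure increase in each time step. The names `single_flask_weights` and `flask_equivalent` are hypothetical.

```python
import math


def single_flask_weights(times, pressure, i_s):
    """Discrete weights for one flask (uniform time step assumed).

    times    : sample times, starting when flushing starts
    pressure : flask pressure at each time (equal to P_s while flushing)
    i_s      : index at which pressurizing starts (1 <= i_s <= len - 2)
    """
    p_s, p_e = pressure[i_s], pressure[-1]
    # Initial pressurizing rate sets the flushing time constant tau = V / f0.
    dpdt_s = (pressure[i_s + 1] - pressure[i_s]) / (times[i_s + 1] - times[i_s])
    tau = p_s / dpdt_s
    t_s = times[i_s]
    share_flush = p_s / p_e
    flush = [math.exp(-(t_s - t) / tau) for t in times[:i_s]]
    w_flush = [share_flush * w / sum(flush) for w in flush]
    # All pressurizing air is kept: weight equals the pressure gain per step.
    w_press = [(pressure[i + 1] - pressure[i]) / p_e
               for i in range(i_s, len(times) - 1)]
    w_press.append(0.0)  # no air is added after the last sample
    return w_flush + w_press


def flask_equivalent(co2_insitu, weights):
    """Weighted in situ CO2 to be compared with the flask value."""
    return sum(c * w for c, w in zip(co2_insitu, weights))
```

Because the weights sum to 1, a constant in situ signal returns the same value, while a varying signal is averaged with more emphasis on air sampled close to the end of flushing and during pressurizing.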

A2.1 Upstream flask
During the flushing period, the situation for the upstream flask is the same as in the single flask model and the weighting function $W_{1f}$ is proportional to
$$W_{1f}(t) \sim e^{-(t_s-t)/\tau} \tag{A11}$$
During the pressurizing period, the process for the upstream flask is a combination of a flushing process and a pressurizing process, because part of the air from the upstream flask flows into the downstream flask at half of the flow rate (see Fig. 7). For air in the flask at any given time $t$ ($t_s < t < t_e$), the mass balance equation can be depicted as follows:
$$V\,\frac{dc_1(t')}{dt'} = -\frac{f(t')}{2}\,\frac{P_s}{p(t')}\,c_1(t') \tag{A12}$$
where $c_1(t')$ is the fraction of the air (in the flask at time $t$) remaining in the flask at any given time $t'$ ($t < t' < t_e$), and $f(t')$ is the volume flow rate (at pressure $P_s$) of sampling air. Besides, $f(t')$ and $p(t')$ are constrained by the equation
$$f(t')\,P_s = 2\,V\,\frac{dp(t')}{dt'} \tag{A13}$$
Combining Eqs. (A12) and (A13) produces
$$\frac{dp(t')}{dt'}\,c_1(t') + p(t')\,\frac{dc_1(t')}{dt'} = 0 \tag{A14}$$
The solution of Eq. (A14) is:
$$c_1(t', t) = \frac{p(t)}{p(t')} \tag{A15}$$
When the flask sampling is completed, i.e. $t' = t_e$, the pressure reaches its final value, $P_e$, and the fraction of the air (in the flask at time $t$) remaining in the flask is
$$c_1(t_e, t) = \frac{p(t)}{P_e} \tag{A16}$$
According to Eq. (A16), for the air entering the flask at any given time $t$ (with the volume $f(t) \cdot dt$), the remaining volume in the flask at time $t_e$ is $f(t) \cdot dt \cdot p(t)/P_e$. The weighting function $W_{1p}(t)$ is then proportional to $f(t) \cdot p(t)/P_e$, i.e. to $\frac{dp(t)}{dt} \cdot \frac{p(t)}{P_e}$. The fractions of the flushing air remaining in the upstream flask at the time $t_s$ and the fractions of pressurizing air in the downstream flask at the time $t_e$ are shown in Fig. A1.
When $t' = t_e$, the fraction of the air (entering the upstream flask at time $t$, with the volume of $f(t) \cdot dt$) remaining in the upstream flask is $p(t)/P_e$, and the fraction flowing into the downstream flask is $1 - p(t)/P_e$. When the flask sampling is completed, the flask pressure is $P_e$, the fraction of all flushing air in the flask is
$$F_{1f} = \frac{P_s}{P_e} \cdot \frac{P_s}{P_e} = \left(\frac{P_s}{P_e}\right)^2 \tag{A17}$$
and the fraction of all pressurizing air in the flask is
$$F_{1p} = 1 - \left(\frac{P_s}{P_e}\right)^2 \tag{A18}$$
Based on Eqs. (A11) and (A16)-(A18), and the normalization, the weighting function for integrating in situ measurements to compare with the upstream flask is described as
$$\begin{cases} W_{1f}(t) = \left(\dfrac{P_s}{P_e}\right)^2 \dfrac{e^{-(t_s-t)/\tau}}{\int_0^{t_s} e^{-(t_s-t')/\tau}\,dt'}, & 0 < t < t_s \\[2ex] W_{1p}(t) = \left[1 - \left(\dfrac{P_s}{P_e}\right)^2\right] \dfrac{\dfrac{dp(t)}{dt}\,\dfrac{p(t)}{P_e}}{\int_{t_s}^{t_e} \dfrac{dp(t')}{dt'}\,\dfrac{p(t')}{P_e}\,dt'}, & t_s \le t < t_e \end{cases}$$
$$= \begin{cases} W_{1f}(t) = \left(\dfrac{P_s}{P_e}\right)^2 \cdot \dfrac{1}{\tau}\, \dfrac{e^{-(t_s-t)/\tau}}{1 - e^{-t_s/\tau}}, \quad \tau = \dfrac{P_s}{2 \cdot dp(t_s)/dt}, & 0 < t < t_s \\[2ex] W_{1p}(t) = \dfrac{2 \cdot p(t)}{P_e^2} \cdot \dfrac{dp(t)}{dt}, & t_s \le t < t_e \end{cases} \tag{A19}$$
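The upstream flask weighting can be sketched numerically in the same spirit. This is a hedged illustration with a hypothetical function name, assuming uniformly spaced samples, a flushing share of (P_s/P_e)^2 with tau = P_s / (2 dp/dt at t_s), and pressurizing weights proportional to p(t) dp/dt, which integrate per time step to the increase of p^2 divided by P_e^2.

```python
import math


def upstream_flask_weights(times, pressure, i_s):
    """Discrete weights for the upstream flask of a pair (uniform time step).

    times    : sample times, starting when flushing starts
    pressure : flask pressure at each time (equal to P_s while flushing)
    i_s      : index at which pressurizing starts (1 <= i_s <= len - 2)
    """
    p_s, p_e = pressure[i_s], pressure[-1]
    dpdt_s = (pressure[i_s + 1] - pressure[i_s]) / (times[i_s + 1] - times[i_s])
    # Two flasks are pressurized by the same flow, hence the factor 2.
    tau = p_s / (2.0 * dpdt_s)
    t_s = times[i_s]
    share_flush = (p_s / p_e) ** 2
    flush = [math.exp(-(t_s - t) / tau) for t in times[:i_s]]
    w_flush = [share_flush * w / sum(flush) for w in flush]
    # Integral of 2 p dp / P_e^2 over one step is the change of p^2 / P_e^2.
    w_press = [(pressure[i + 1] ** 2 - pressure[i] ** 2) / p_e ** 2
               for i in range(i_s, len(times) - 1)]
    w_press.append(0.0)  # no air is added after the last sample
    return w_flush + w_press
```

In this sketch, air entering late in the pressurizing period receives the largest weight, since less of it is pushed on into the downstream flask.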

A2.2 Downstream flask
During the flushing period ($0 < t < t_s$), the incoming air mixes with the air in the upstream flask and flows through the downstream flask. When the pressurizing starts ($t = t_s$), the air already in the downstream flask is preserved. The mass balance for the air in the upstream flask at any time $t$ can be written as:
$$V\,\frac{dc_1(t')}{dt'} = -f_0\,c_1(t'), \qquad V\,\frac{dc_2(t')}{dt'} = f_0\,c_1(t') - f_0\,c_2(t') \tag{A20}$$
where $c_1(t')$, $c_2(t')$ are, at any given time $t'$ ($t < t' < t_s$), the fractions of the air (in the upstream flask at time $t$) remaining in the upstream and downstream flasks, respectively, given the boundary condition $c_1(t) = 1$, $c_2(t) = 0$; $V$ is the volume of the flask, and $f_0$ is the volume flow rate at the ambient pressure $P_s$. The solution of the equation is
$$c_1(t') = e^{-(t'-t)/\tau}, \qquad c_2(t') = \frac{t'-t}{\tau}\,e^{-(t'-t)/\tau} \tag{A21}$$
At the end of the flushing period, i.e. $t' = t_s$, the fraction of the air (in the upstream flask at time $t$) remaining in the downstream flask is
$$c_2(t_s, t) = \frac{t_s - t}{\tau}\,e^{-(t_s-t)/\tau}, \quad \tau = \frac{V}{f_0} \tag{A22}$$
According to Eq. (A22), for the air entering the upstream flask at any given time $t$ (with the volume $f_0 \cdot dt$), the remaining volume in the downstream flask at time $t_s$ is $f_0 \cdot dt \cdot \frac{t_s - t}{\tau} \cdot e^{-(t_s-t)/\tau}$. In addition, a fraction of the air that has flowed into the upstream flask during flushing flows into the downstream flask during the pressurizing period, and according to Eq. (A14), at time $t_e$ the fraction of the air (in the upstream flask at time $t_s$) flowing into the downstream flask is $1 - P_s/P_e$. As a result, at time $t_e$, for the air entering the upstream flask at any given time $t$ (with the volume $f_0 \cdot dt$), the remaining volume in the downstream flask is $f_0 \cdot dt \cdot \left[\frac{t_s - t}{\tau} \cdot e^{-(t_s-t)/\tau} + \left(1 - \frac{P_s}{P_e}\right) \cdot e^{-(t_s-t)/\tau}\right]$, which is proportional to the weighting function $W_{2f}(t)$:
$$W_{2f}(t) \sim \frac{t_s - t}{\tau}\,e^{-(t_s-t)/\tau} + \left(1 - \frac{P_s}{P_e}\right) e^{-(t_s-t)/\tau} \tag{A23}$$
During the pressurizing period, the fraction of the air (in the upstream flask at time $t$) coming into the downstream flask can be derived from Eq. (A14):
$$c_2(t_e, t) = 1 - \frac{p(t)}{P_e} \tag{A24}$$
According to Eq. (A24), for the air entering the flask at any given time $t$ (with the volume $f(t) \cdot dt$), the weighting function $W_{2p}(t)$ is then proportional to $f(t) \cdot dt \cdot \left(1 - \frac{p(t)}{P_e}\right)$:
$$W_{2p}(t) \sim f(t)\left(1 - \frac{p(t)}{P_e}\right) \sim \frac{dp(t)}{dt} \cdot \left(1 - \frac{p(t)}{P_e}\right) \tag{A25}$$
When the flask sampling is completed, the flask pressure is $P_e$, and the fraction of flushing air in the downstream flask is
$$F_{2f} = 1 - \left(1 - \frac{P_s}{P_e}\right)^2 \tag{A26}$$
and the fraction of pressurizing air in the downstream flask is
$$F_{2p} = \left(1 - \frac{P_s}{P_e}\right)^2 \tag{A27}$$
Based on Eqs. (A23) and (A25)-(A27), the weighting function for the downstream flask is described as:
$$\begin{cases} W_{2f}(t) = \left[1 - \left(1 - \dfrac{P_s}{P_e}\right)^2\right] \dfrac{\left(\dfrac{t_s - t}{\tau} + 1 - \dfrac{P_s}{P_e}\right) e^{-(t_s-t)/\tau}}{\int_0^{t_s} \left(\dfrac{t_s - t'}{\tau} + 1 - \dfrac{P_s}{P_e}\right) e^{-(t_s-t')/\tau}\,dt'}, \quad \tau = \dfrac{P_s}{2 \cdot dp(t_s)/dt}, & 0 < t < t_s \\[2ex] W_{2p}(t) = \dfrac{2}{P_e} \cdot \dfrac{dp(t)}{dt} \left(1 - \dfrac{p(t)}{P_e}\right), & t_s \le t < t_e \end{cases} \tag{A28}$$
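A discrete sketch of the downstream flask weighting follows. It is an illustration only, with a hypothetical function name and these assumptions: uniformly spaced samples; flushing weights shaped as ((t_s - t)/tau + 1 - P_s/P_e) exp(-(t_s - t)/tau) with tau = P_s / (2 dp/dt at t_s); pressurizing weights proportional to (1 - p/P_e) dp/dt; and a pressurizing share of (1 - P_s/P_e)^2, which is obtained here by integrating that pressurizing weight over the pressure range.

```python
import math


def downstream_flask_weights(times, pressure, i_s):
    """Discrete weights for the downstream flask of a pair (uniform time step).

    times    : sample times, starting when flushing starts
    pressure : flask pressure at each time (equal to P_s while flushing)
    i_s      : index at which pressurizing starts (1 <= i_s <= len - 2)
    """
    p_s, p_e = pressure[i_s], pressure[-1]
    dpdt_s = (pressure[i_s + 1] - pressure[i_s]) / (times[i_s + 1] - times[i_s])
    tau = p_s / (2.0 * dpdt_s)
    t_s = times[i_s]
    rest = 1.0 - p_s / p_e
    share_press = rest ** 2          # assumed share of pressurizing air
    share_flush = 1.0 - share_press  # remainder is flushing air
    flush = [((t_s - t) / tau + rest) * math.exp(-(t_s - t) / tau)
             for t in times[:i_s]]
    w_flush = [share_flush * w / sum(flush) for w in flush]
    # Integral of (2 / P_e) (1 - p / P_e) dp over one step.
    w_press = [(1.0 - pressure[i] / p_e) ** 2 - (1.0 - pressure[i + 1] / p_e) ** 2
               for i in range(i_s, len(times) - 1)]
    w_press.append(0.0)  # no air is added after the last sample
    return w_flush + w_press
```

In contrast to the upstream flask, the pressurizing weights decrease with time here, because air entering late mostly stays in the upstream flask.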