Community Air Sensor Network (CAIRSENSE) project: evaluation of low-cost sensor performance in a suburban environment in the southeastern United States

Advances in air pollution sensor technology have enabled the development of small and low-cost systems to measure outdoor air pollution. The deployment of a large number of sensors across a small geographic area would have potential benefits to supplement traditional monitoring networks with additional geographic and temporal measurement resolution, if the data quality were sufficient. To understand the capability of emerging air sensor technology, the Community Air Sensor Network (CAIRSENSE) project deployed low-cost, continuous, and commercially available air pollution sensors at a regulatory air monitoring site and as a local sensor network over a surrounding ∼ 2 km area in the southeastern United States. Collocation of sensors measuring oxides of nitrogen, ozone, carbon monoxide, sulfur dioxide, and particles revealed highly variable performance, both in terms of comparison to a reference monitor as well as the degree to which multiple identical sensors produced the same signal. Multiple ozone, nitrogen dioxide, and carbon monoxide sensors revealed low to very high correlation with a reference monitor, with Pearson sample correlation coefficient (r) ranging from 0.39 to 0.97, −0.25 to 0.76, and −0.40 to 0.82, respectively. The only sulfur dioxide sensor tested revealed no correlation (r < 0.5) with a reference monitor and erroneously high concentration values. A wide variety of particulate matter (PM) sensors were tested with variable results – some sensors had very high agreement (e.g., r = 0.99) between identical sensors but moderate agreement with a reference PM2.5 monitor (e.g., r = 0.65). For select sensors that had moderate to strong correlation with reference monitors (r > 0.5), step-wise multiple linear regression was performed to determine if ambient temperature, relative humidity (RH), or age of the sensor in number of sampling days could be used in a correction algorithm to improve the agreement. 
Maximum improvement in agreement with a reference, incorporating all factors, was observed for an NO2 sensor (multiple correlation coefficient R2 adj-orig = 0.57, R2 adj-final = 0.81); however, other sensors showed no apparent improvement in agreement. A four-node sensor network was successfully able to capture ozone (two nodes) and PM (four nodes) data for an 8-month period of time and show expected diurnal concentration patterns, as well as potential ozone titration due to nearby traffic emissions. Overall, this study demonstrates the performance of emerging air quality sensor technologies in a real-world setting; the variable agreement between sensors and reference monitors indicates that in situ testing of sensors against benchmark monitors should be a critical aspect of all field studies.

Published by Copernicus Publications on behalf of the European Geosciences Union. W. Jiao et al.: Community Air Sensor Network (CAIRSENSE) project


Introduction
Air quality monitoring, including measurements of common gas-phase and particulate matter pollutants, has traditionally been conducted by regulatory organizations using specific instrumentation and protocols. For example, the United States Environmental Protection Agency (EPA) monitors criteria pollutants regulated under the National Ambient Air Quality Standards (NAAQS) via a network of ambient monitoring sites operating federal reference methods (FRMs) or federal equivalent methods (FEMs). FRM and FEM designation for instruments is established through a strict testing protocol (Hall et al., 2014), and the overall network produces very high-quality data that are, however, generally sparse in geographic coverage.
Meanwhile, numerous field studies have established that outdoor air pollution can vary considerably at a fine spatial scale due to localized impacts of source emissions (e.g., Karner et al., 2010). Recent and fast-paced technology development has brought to the market portable and low-cost air sensor devices that may have potential to provide hyper-local air quality data through individual use or application in a dense sensor network (Jovasevic-Stojanovic et al., 2015; Kumar et al., 2015; Snyder et al., 2013). Low-cost sensor devices, defined here as below USD 2000 per pollutant (i.e., under USD 4000 for a two-pollutant device), typically utilize electrochemical or metal oxide sensors for gas-phase pollutants such as carbon monoxide (CO), nitrogen dioxide (NO2), nitrogen oxide (NO), ozone (O3), and, to some extent, total volatile organic compounds (VOCs). Commercially available particle sensor devices currently use laser-based or light-emitting diode (LED)-based optical detection of particles. Currently, no direct mass measurement of particulate matter is commercially available, but research is in progress to develop a true mass measurement (Paprotny et al., 2013). The pollutant detection methods utilized in miniaturized sensors are potentially prone to measurement artifacts. For gas-phase sensors, these artifacts may include cross-sensitivity to other gases as well as impacts of varying humidity or temperature. The optical detection of particles is anticipated to be affected by humidity during high relative humidity (RH) conditions, as the uptake of water by hygroscopic particles can enhance the scattered light signal. Finally, both lower and upper detection limits are also an expected factor in sensor performance.
Research groups have built custom devices using available original equipment manufacturer (OEM) sensor components, such as the integration of the particulate PPD42NS sensor (Shinyei) into field-ready devices (Gao et al., 2015; Holstius et al., 2014). This generally involves adding an enclosure, microprocessor, battery or AC electricity connection, wireless communications and/or on-board data storage, and potentially other environmental sensors. Most research groups working with low-cost OEM sensors have tested their sensor performance in field settings, with varying results. For particulate sensors, PPD42NS sensor comparison at low to moderate ambient concentrations revealed good correlation (e.g., R2 = 0.72 for 24 h averages, PM2.5 ranging ∼ 3-20 µg m−3) with a reference monitor (Holstius et al., 2014), but the same particle sensor at very high concentrations (hourly average PM2.5 ranging ∼ 77-889 µg m−3) revealed a nonlinear response, and the authors used high-order model fits to correct their data (Gao et al., 2015). Additionally, a modified commercially available particle sensing device (Dylos) was shown to match diurnal ambient PM2.5 trends with a research-grade monitor (DustTrak) under ambient concentrations (hourly average PM2.5 ranging ∼ 5-50 µg m−3), after adjustment with 24 h averages derived by a beta-attenuation regulatory-grade monitor (Northcross et al., 2013).
Evaluations of gas-sensor performance in real-world environments have also yielded promising but variable results. Spinelle et al. (2015) used multiple statistical approaches to maximize the data quality from O3 and NO2 sensors, finding that a simple linear regression for an electrochemical ozone sensor was sufficient to achieve good correlation with a reference monitor, but even advanced supervised learning strategies were not able to achieve good correlation for NO2 sensors. Mead et al. (2013) noted a 100 % ozone interference issue for an electrochemical NO2 sensor, which could be corrected by sampling both parameters simultaneously.
Researchers are already employing low-cost sensors in exploratory research, to assess spatial variability of urban air quality (Gao et al., 2015; Heimann et al., 2015; Moltchanov et al., 2015), and the growing number of commercially available devices is anticipated to create an exponential increase in air quality data. The consumer product potential has motivated a number of new business ventures, some initiated through crowd-sourced funding (e.g., Kickstarter, Indiegogo). Sensor developers are also looking to engage directly with the public, with one innovative group providing particle sensors at a public library for citizens to borrow for their personal use (Page-Jacobs, 2015). While the public interest is quickly growing, the quality of the air sensor data remains uncertain, particularly for commercial devices that may be utilized by citizens and community groups without access to reference monitoring sites for collocation. In order to better understand the performance of commercially available air sensor devices, EPA established the Community Air Sensor Network (CAIRSENSE) project, which involves testing the feasibility of a wireless sensor network application as well as collocation of multiple identical sensor devices with reference monitors over an extended period of time. The CAIRSENSE project is a multi-year effort, involving field testing emerging air quality sensors in multiple locations in

Field study design
Two testing components, the sensor ad hoc field testing (SAFT) and the wireless sensor network (WSN), constituted the CAIRSENSE project (Fig. 1). The SAFT involved a minimum 30-day testing period of duplicate or triplicate sensors located at a state regulatory monitoring site. Meanwhile, the WSN involved long-term (> 7 months) deployment of several selected sensors in multiple locations over an approximately 2 km2 spatial range. With the overarching goal to test sensors with potential near-term wide use, candidate sensors were selected based upon several criteria and market research. Criteria pollutants, including particulate matter (PM), carbon monoxide (CO), nitrogen dioxide (NO2), sulfur dioxide (SO2), and ozone (O3), were given priority in sensor type selection. Other sensor selection criteria included a general upper cost limit of USD 2000 per pollutant (e.g., USD 2000 for a single-pollutant sensor device, USD 4000 for a two-pollutant sensor device), commercial availability, continuous measurement, and low maintenance. The cost break point was set by the estimated hardware price point at the time of the device selection and does not incorporate other possible costs that may vary by application (e.g., maintenance, data-hosting fees, modification of power input). The term "sensor" in this paper refers to the off-the-shelf hardware that was selected for testing, which generally includes one or more pollutant detection components (e.g., an electrochemical cell) combined with a form of on-board microprocessor to convert the signal into concentration units. The design of the sensor for long-term use in an outdoor environment (e.g., a weatherproof enclosure) was not a selection factor, as the research team was aware of a number of outdoor air quality field studies utilizing sensors designed for indoor application. The field testing setup was therefore designed to provide weather protection for all sensor types tested. The SAFT sensor set included five types of PM sensors (Shinyei, Dylos, AirBeam, MetOne, and Air Quality Egg), three types of ozone sensors, three types of NO2 sensors, two types of CO sensors, and one SO2 sensor (Table 1). Finally, it should be noted that the sensors utilized in this study represent a selection of sensors available on the market at the time of the study initiation and that the sensor development market is quickly changing with time.

Table footnote a: Note that the Dylos and the MetOne also include additional size channels in their data output. The Dylos DC1100-PRO-PC and DC1100-PC include a larger particle size channel representing particles ≥ 2.5 µm and ≥ 5 µm, respectively. The MetOne sensor includes size channels for PM1, PM4, PM10, and TSP.
The SAFT component included two or three identical sensor devices collocated and operated on 115 V AC power. The sensors were placed in a shelter providing full exposure to ambient air while also protecting from rainfall (Fig. 1a and d). To understand the basic sensor device functionality, each SAFT sensor was operated according to the manufacturer's recommendations and data were output in their default format. For example, PM sensors reported concentrations in a variety of units including µg m−3, pt 0.01 cf−1 (particles per 0.01 cubic feet or 283 mL), and hppcf (hundreds of particles per cubic foot). For one sensor, the Air Quality Egg, units were unclear for gas measurements and the data output appeared to be raw voltage signals. All SAFT sensor data were logged locally to the extent possible; for sensors which were designed to transmit data primarily to an internet server (AirBeam, Air Quality Egg), a microprocessor code variation was written to support local logging. One exception was the AQMesh, a commercial system that utilizes multiple electrochemical sensors to measure gases and wirelessly transmits the data to the manufacturer's server. In this case, the data were provided to the research team by the manufacturer on a weekly basis during the field study. The AQMesh data had already been post-processed by manufacturer proprietary algorithms prior to analysis.
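Because the default particle-count outputs use different volume bases, a comparison across devices needs a common unit. The following is a minimal sketch of such a harmonization, not a procedure described in this study: the function names are hypothetical, and the only fixed quantity is the standard conversion of 1 cubic foot to about 28.3168 L.

```python
# Illustrative harmonization of particle-count units (hypothetical helper names).
CUBIC_FOOT_IN_LITRES = 28.3168  # 1 ft^3 is approximately 28.3168 L


def per_hundredth_cf_to_per_litre(count):
    """Convert particles per 0.01 ft^3 (about 283 mL) to particles per litre."""
    return count * 100.0 / CUBIC_FOOT_IN_LITRES


def hppcf_to_per_litre(hppcf):
    """Convert hundreds of particles per cubic foot to particles per litre."""
    return hppcf * 100.0 / CUBIC_FOOT_IN_LITRES
```

Both units scale to particles per cubic foot by the same factor of 100, so a reading of N in either unit corresponds to the same number concentration; only the reporting convention differs.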
In addition, four WSN nodes plus one base communication station were deployed to test the feasibility of deploying a local wireless sensor network. Selected air quality sensors included the Shinyei PM sensor, the Cairclip NO2 / O3 sensor, and the Aeroqual SM50 O3 sensor, with the two gas sensors utilized in conjunction to provide data supporting the separation of NO2 and O3 signals. The CAIRSENSE network was designed based on a star topology with the NCore (National Core) location serving as the base station, to which every other node connects. The design goal was for all of the nodes to wirelessly report their data in near real time to the base station, from which the data were subsequently transmitted to a server through cellular communication. Digi's XBee-PRO 900 HP 900 MHz 10 Kbps radios were chosen as the backbone of the WSN based on their relatively low cost and extended line-of-sight range. An omnidirectional antenna was selected for the base station while directional Yagi antennas were chosen for the remote nodes. Prior to the field deployment, the communication protocol and wireless range were tested between a remote node and the base station. Range tests were conducted in a mixed suburban environment in North Carolina with conditions similar to those found surrounding the NCore station. While the manufacturer lists a line-of-sight range of up to 9 miles (14 km) for the selected XBee radios, actual tests indicated a maximum communication range of approximately 1 mile (1.6 km) with a mix of open areas, forest, and commercial buildings located between the radios.

The WSN nodes were designed to be small, weatherproof, and self-powered. The compact size was important to facilitate deployment and minimize the installed footprint. Each WSN node consisted of a weatherproof enclosure approximately 0.4 × 0.4 × 0.15 m in size, supporting several low-cost (< USD 1000) sensors (PM2.5, O3, NO2), an Arduino-based microcontroller, micro SD card, XBee wireless radio, XBee antenna, solar panel, solar-power controller, and a 12 V DC battery. A photo of a typical node is shown in Fig. 1 with components listed in Table 2. Like the remote nodes, the base station had an Arduino microcontroller and XBee radio to receive signals from the nodes and an SD card for on-board data logging. The base node included a Sierra Airlink® GX440 cellular gateway and associated antenna to connect the base node to the internet. Data were uploaded and stored on a remote server in a Microsoft SQL database and displayed on a private web page that updated every minute. The web page displayed the data in a tabular format and supported direct data downloading. The communication base station and sensor node 4, collocated at the NCore site, used 120 V (nominal) AC electricity, while the remaining satellite stations (nodes 1-3) operated on solar power with battery backup.

Table footnote a: A larger solar-power system was utilized for node 1, supporting the inclusion of the SM50 ozone sensor. The other location that included the sensor, node 4, was operated on land power.
Preliminary review after WSN deployment revealed brief spurious PM readings (e.g., 10 to 50 times higher than FEM) that occurred during midday, which appeared to be caused by side-scattered sunlight intrusion into the Shinyei sensor. As an experimental measure, aluminum foil was placed surrounding the radiation shielding that encompassed the sensor to reduce light penetration, while still allowing the sensor to have access to ambient air. After foil was applied, very high values were greatly reduced (Fig. S4 in the Supplement); therefore, the foil covers were left in place for the remainder of the WSN data collection.

Study location
The State of Georgia South Dekalb regulatory monitoring site is located in Decatur, in the suburban Atlanta area (AQS ID: 130890002; latitude/longitude: 33.68808/−84.29018). The South Dekalb station is operated year-round as an NCore multipollutant monitoring network site and includes an extensive suite of measurements including criteria pollutants and precursors, air toxics, and meteorology. The surrounding area has mature trees, single-family residential houses, sports fields, and schools (Fig. 2). No known major point source emissions were located nearby. A highway (I-285; 145 000 annual average daily traffic) is located approximately 400 m to the north of the site.
The SAFT component was located only at the NCore site. The WSN nodes were located in the surrounding area. Node 1 (WSN-N1) was positioned at a nearby medical center (∼ 1.9 km from the South Dekalb site) and about 30 m away from the major highway. Node 2 (WSN-N2) was near a sports field (∼ 0.8 km from the South Dekalb site). Node 3 (WSN-N3) was outside a school property (∼ 0.2 km from the South Dekalb site). Node 4 (WSN-N4) and the communication base station were collocated with the NCore site.

Analytical methods
Sensor data were checked and analyzed bi-weekly during the first 3 months to ensure all sensors were working properly. Subsequently, data were recovered on a monthly basis. The statistical software R (http://www.r-project.org/) version 3.2.1 with the "base" and "openair" packages was used for all data processing and analysis. Multiple sensors reporting the same pollutant of interest were compared against readings recorded by the NCore FEMs. For duplicate or triplicate sensors evaluated in SAFT, readings were compared between identical sensors to understand the reproducibility of sensor performance. Several statistical measures were used to compare the collocated sensor measurements with the FEM data, including (1) the Pearson sample correlation coefficient (r) between individual sensor and FEM, (2) the average values of sensor and FEM measurements in their original units, and (3) the slope, intercept, and coefficient of determination (r2) of ordinary least squares (OLS) regressions of individual sensor measurement on FEM. In addition, to enable basic comparison of PM values with a reference monitor, data from PM sensors that had at least moderate correlation (r > 0.5) were converted to µg m−3 units based upon an OLS regression equation.
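The comparison statistics listed above can be illustrated with a short sketch. The study used R; the Python below is only an illustration, the data values are synthetic, and the function names are hypothetical. The final conversion step assumes that raw output is mapped to µg m−3 by inverting the fitted sensor-on-FEM line, which is one possible reading of "based upon an OLS regression equation" rather than a documented detail.

```python
import math


def pearson_r(x, y):
    """Pearson sample correlation coefficient between two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ols_fit(x, y):
    """Ordinary least squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r2); for a single predictor r2 equals r squared.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept, pearson_r(x, y) ** 2


# Synthetic, illustrative values only (not data from this study).
fem_pm25 = [6.0, 8.5, 10.0, 12.5, 15.0, 9.0]             # reference, ug m-3
sensor_raw = [410.0, 560.0, 700.0, 820.0, 990.0, 640.0]  # raw output, arbitrary units

r = pearson_r(sensor_raw, fem_pm25)
# Statistic (3): regression of the sensor measurement on the FEM.
slope, intercept, r2 = ols_fit(fem_pm25, sensor_raw)
# Assumed conversion: invert the fitted line to express raw output in ug m-3.
sensor_ugm3 = [(s - intercept) / slope for s in sensor_raw]
```

The mean of the sensor and FEM series in their original units, statistic (2), is simply `sum(values) / len(values)` for each series.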
Local meteorology was anticipated to be a driver of spatial variability in local pollutant trends as well as potentially affecting sensor performance, as some sensors may have temperature- and/or humidity-based artifacts. The NCore wind, temperature, and humidity data were used in all analyses as representative of local meteorological conditions. In addition, sensor aging is another potential source of measurement artifact; for example, solid-state gas sensors may undergo a loss of sensitivity over time. Therefore, an analysis of sensor performance over the number of sampling days was conducted to determine if an aging effect existed. Similar to the analysis by Holstius et al. (2014), artifacts were assessed by comparing the adjusted coefficients of determination (R2 adj) among multiple linear regressions of all possible variable combinations.
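The all-combinations comparison described above can be sketched as follows. This is an illustrative Python outline rather than the R code used in the study. It assumes the reference (FEM) value is the response and that the sensor reading is always retained as a predictor; the function and variable names are hypothetical, and the normal equations are solved with a small hand-written Gaussian elimination only to keep the example free of dependencies.

```python
from itertools import combinations


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def adjusted_r2(columns, y):
    """Fit y on the predictor columns plus an intercept; return adjusted R^2."""
    n, p = len(y), len(columns)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = p + 1
    xtx = [[sum(row[a] * row[b] for row in design) for b in range(k)]
           for a in range(k)]
    xty = [sum(row[a] * yi for row, yi in zip(design, y)) for a in range(k)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(bj * xj for bj, xj in zip(beta, row)) for row in design]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


def compare_artifact_models(sensor, covariates, reference):
    """Adjusted R^2 of reference ~ sensor + each subset of the covariates."""
    results = {}
    names = sorted(covariates)
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            cols = [sensor] + [covariates[name] for name in subset]
            results[("sensor",) + subset] = adjusted_r2(cols, reference)
    return results
```

With three candidate covariates (temperature, RH, days of use), the function returns eight adjusted R2 values, from the sensor-only model to the model with all factors, which can then be compared as in the regression summary tables.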
For the WSN, the first step of the analysis was to conduct an experimental network calibration, where data were subset for a period presumed to be representative of similar atmospheric conditions at all sites: namely, the hours of 01:00-04:00 and periods when the sites were upwind of the highway (wind direction from 75 to 235°). For this study, all data representing those conditions were grouped and compared with the reference monitors. OLS regressions were conducted with FEM values as the dependent variable and sensor values as the independent variable, which yielded a regression equation that was used to convert individual sensor values to the corresponding FEM units. For sensors revealing at least marginal agreement with FEM data (r > 0.4), exploratory analyses are presented showing node-to-node comparison in trends.
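The network calibration step can be sketched as a filter followed by a regression. The Python below is an illustration under stated assumptions, not the study's R code: the record layout (dictionary keys `hour`, `wind_dir`, `sensor`, `fem`) is hypothetical, and the hour window is assumed to be inclusive at both ends, which the text does not specify.

```python
def in_calibration_window(hour, wind_dir_deg):
    """True for hours 01:00-04:00 (assumed inclusive) with wind from 75 to 235 degrees."""
    return 1 <= hour <= 4 and 75.0 <= wind_dir_deg <= 235.0


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def calibrate_node(records):
    """Convert every sensor value of one node to FEM units.

    The regression uses only records inside the calibration window, with the
    FEM value as the dependent variable and the sensor value as the
    independent variable; the fitted equation is then applied to all records.
    """
    subset = [r for r in records
              if in_calibration_window(r["hour"], r["wind_dir"])]
    slope, intercept = fit_line([r["sensor"] for r in subset],
                                [r["fem"] for r in subset])
    return [intercept + slope * r["sensor"] for r in records]
```

Records outside the window do not influence the fitted equation, but they are still converted with it, so the adjusted series covers the full deployment period.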
While the EPA has a clearly defined method for approving technologies for use in a regulatory application (e.g., Hall et al., 2014), there currently are neither clearly defined nor universally accepted criteria by which to provide a "pass" or "fail", or alternative grading scheme, judgement on a particular sensor model. Developing such criteria will be challenging, given the diversity of research applications and related data quality objectives. In addition, sensor performance may be affected by both the air pollutant mixture and concentration level, as well as the environmental conditions. Therefore, the results in this paper are communicated quantitatively by their correlation, or lack thereof, in comparison to regulatory-grade monitors, with common associated descriptors of the strength of agreement (e.g., "moderate").

Results and Discussion
Sensor field testing and the wireless sensor network were conducted over a wide range of atmospheric conditions. The South Dekalb NCore site ambient temperature ranged from −12 to 33 °C (average = 14 °C) during the CAIRSENSE deployment and RH ranged from 11 to 100 % (average = 68 %).

Particle sensor evaluation
All particle sensors evaluated in this study detected particles via a light-scattering method. No sensors directly measured particulate mass or had inertial-based size cuts preventing large particles from entering the optical cell. Based on the project goal of understanding whether these types of low-cost sensor data could be indicative of fine particulate matter (PM2.5) trends, the reference monitor utilized for comparison was the MetOne BAM 1020 FEM PM2.5 monitor. FEM PM2.5 monitors are designed for use in determining compliance with the US EPA NAAQS, which are set on a 24 h or annual time basis. The beta-attenuation approach utilized in the MetOne BAM requires sufficient particle mass deposited on the internal filter for an adequate signal-to-noise ratio. Given that research applications of PM sensors may desire to use the data at a sub-daily time interval, preliminary analysis was conducted to determine whether the raw MetOne BAM 1020 data could be used at a faster time resolution than 24 h, resulting in a 12 h averaging period being utilized for the FEM PM2.5 data comparisons.
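The 12 h averaging can be illustrated with a simple block mean over hourly values. This is a sketch only: the function name is hypothetical, and the completeness threshold `min_valid` is an assumed parameter, since the text does not state how missing hours were handled.

```python
def block_means(hourly, block_hours=12, min_valid=9):
    """Average consecutive, non-overlapping blocks of hourly values.

    None marks a missing hour. A block with fewer than min_valid valid hours
    yields None (assumed completeness rule, not taken from the study).
    """
    means = []
    for start in range(0, len(hourly) - block_hours + 1, block_hours):
        valid = [v for v in hourly[start:start + block_hours] if v is not None]
        if len(valid) >= min_valid:
            means.append(sum(valid) / len(valid))
        else:
            means.append(None)
    return means
```

Applying the same averaging to the sensor series before computing correlations keeps both data sets on a common time base.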
As summarized in Table 3, the various particle sensors had widely variable initial output quantities and correlation with the FEM monitor. The three collocated Air Quality Egg units, with internal Shinyei PPD42NS sensors, had poor correlation with the FEM (r = −0.06 to 0.40). The three MetOne 831 monitors also had weak correlation (r = 0.32 to 0.41). The three Shinyei PM sensors had moderate agreement (r = 0.45-0.60), followed by relatively higher correlation for the AirBeam (r = 0.65-0.66) and Dylos units (r = 0.63-0.67 for the DC1100 PRO-PC version, r = 0.58 for the DC1100 version). Comparison of identical sensors revealed generally higher agreement (Fig. S1); for example, while the three MetOne monitors had weak correlation with the FEM, they had nearly perfect correlation between identical units (r = 0.99). This finding suggests that some sensor sets may have high precision, supporting their use to evaluate relative concentration levels, but caution must be exercised in presuming the resulting measurements are representative of PM2.5 reference measurements. Some factors that likely contribute to the strong agreement among optical particle sensors, but weaker agreement with PM2.5 FEM monitors, include the following: differing physicochemical properties between calibration aerosol and real-world aerosol mixtures; light-scattering signal from particles larger than 2.5 µm; and, for some sensors, particle count as the reported value, which generally emphasizes the most numerous but smallest detected particles. It should be noted that one sensor type, the Dylos, does provide an additional larger particle size channel (≥ 2.5 µm for the DC1100 PRO-PC version, ≥ 5 µm for the DC1100 version), which one indoor application study utilized to remove the larger particle signal (Dacunto et al., 2015). However, in the suburban ambient environment in this study, the fraction of particle count in the larger size channels appeared to be a small component of the total particle number count, with the ratio of the large vs. small count channels averaging 0.03 and 0.04 for the DC1100 and DC1100 PRO-PC, respectively.
Several particle sensors with at least fair correlation (r > 0.5) were further investigated for measurement artifacts based upon temperature, humidity, or days of use. For three selected sensors that showed the highest correlation with FEM among identical sensors (the Shinyei SAFT-2, Dylos SAFT-2, and AirBeam SAFT-2), incorporation of factors such as temperature, RH, and number of measurement days made some minor improvements in agreement with the FEM, as indicated by R2 adj values from the multiple linear regression analysis (Table 5). No single factor provided much improvement to the Shinyei or AirBeam sensor agreement. However, accounting for days of use significantly increased the Dylos unit R2 adj by 0.11, whereas incorporation of RH revealed no improvement and temperature revealed only minor improvement (+0.03 in R2 adj).

Gas-phase sensor evaluation
Gas-phase sensor measurements of O3, NO2, NO, CO, and SO2 were compared with hourly average NCore reference monitors. Since Cairclip readings were not calibrated with FEM, any negative values that resulted from the subtraction were retained in the correlation analysis. In addition, it should be noted that two Cairclip sensors at the SAFT site showed apparent operation failure at the outset of testing. Replacement was conducted in mid-November for one sensor, for which the data were included in the analysis. The other failing sensor was deemed nonfunctional and the data were not incorporated into the collocation results.

Ozone
Of the ozone sensors tested, weak correlation was evident for two AQMesh units (r = 0.39-0.45), high for two Cairclip sensors (r = 0.82-0.94), and consistently very high for three Aeroqual SM50 sensors (r = 0.91-0.97) when compared to FRM/FEM measurements (Fig. S2). For the Aeroqual SM50 sensor, no apparent improvement in agreement was observed when temperature, RH, or sampling day length factors were incorporated (Table 5). However, incorporating RH appeared to provide some improvement (+0.07 in R2 adj) to the Cairclip sensor agreement with a reference monitor.

Nitrogen dioxide
The Cairclip, AQMesh, and Air Quality Egg measurements of NO2 were highly variable compared with a reference monitor, with r ranging from 0.42 to 0.76, 0.14 to 0.32, and −0.25 to −0.22, respectively (Fig. S3). Only one Cairclip NO2 sensor that had sufficient correlation was further explored for artifact correction. Significant improvement was evident when temperature and RH were incorporated as adjustment factors, with very slight additional improvement by incorporating days of use (Table 5).

Nitrogen oxide
One sensor device that reported NO measurements, the AQMesh, was tested. The two identical AQMesh units had high correlation with the reference monitor (r = 0.88-0.93). No apparent improvement in agreement was determined when incorporating environmental factors or days of use as adjustment factors (Table 5). In absolute terms, the NO original sensor output also agreed closely with mean FEM values (Table 4).

Carbon monoxide
The AQMesh and Air Quality Egg incorporated electrochemical and metal oxide CO sensors, respectively. The AQMesh reported CO in ppb units, whereas the Air Quality Egg had no clear indication of units. Good correlation (r = 0.79-0.82) was observed between the AQMesh and a reference monitor. Incorporating days of use provided significant improvement in the AQMesh CO data (Table 5), with a clear slope drift with time evident (Fig. 3). The Air Quality Egg CO sensors had poor agreement with the reference (r = −0.40 to −0.14).

Sulfur dioxide
Only one sensor device was available that measured SO2: the AQMesh. The SO2 values reported by the two AQMesh units were generally far higher than those of the reference monitor, on average by a factor of 172 and 163. While the two AQMesh units had high correlation with one another for SO2 (r = 0.94), they had weak correlation (r = 0.13-0.17) with the reference monitor.

Data communications
Based upon preliminary tests establishing an approximate 1.6 km maximum range utilizing XBee antennas for the direct point-to-point communication, the initial WSN consisted of four nodes over a 2 km2 area that transmitted data to the base node located at the South Dekalb site. However, the location of several buildings and mature forest canopy in the South Dekalb area limited the communication range of the network. Two of the WSN nodes communicated reliably with the base station (nodes 3 and 4), whereas data from the more distant nodes 1 and 2 were not received. An attempt to improve the network communication was conducted by adding a repeater node midway between the base station and the distant nodes, which had some limited success, but consistent wireless communication for the entire network was not achieved. Therefore, data retrieval was primarily conducted via manual SD card downloads for nodes 1 and 2.

Spatial and temporal trends
Comparison of the hourly average WSN with FEM data during periods of time with presumably similar pollution readings in all locations -hours of 01:00-04:00 and all sites upwind of the highway -revealed moderate to good correlation between the WSN O 3 and FEM O 3 (two nodes, r = 0.62 to 0.87) and WSN PM and FEM PM 2.5 (four nodes, r = 0.4 to 0.45).While the Cairclip total output compared well (two nodes, r = 0.79 to 0.9) with the summation of FEM O 3 and FEM NO 2 , the result was not replicated when isolating and comparing the WSN NO 2 component.A simple subtraction of either the on-board O 3 sensor data (SM50) or the FEM O 3 data from the Cairclip total output revealed effectively no correlation between WSN NO 2 and FEM NO 2 (r < 0.1).This finding indicates that the Cairclip NO 2 / O 3 sensor readings may not be entirely additive and field performance may not replicate the strong agreement observed in a laboratory evaluation (Williams et al., 2014).Further evaluation is needed to understand how to separate the NO 2 portion of the signal.Based on these results, analysis of spatial and temporal trends were constrained to O 3 and PM 2.5 sensor data sets.After data were adjusted based upon linear regression analysis of WSN and FEM data sets during the early morning and upwind time periods, wind-directional plots indicated lower O 3 concentrations at the roadside site when air is transported from the highway (wind direction from the N) with no directional trend observed at the site > 400 m from the highway (Fig. 4).Therefore, the O 3 sensors appear to indicate an ozone titration trend that has been observed in other near-road field settings (Beckerman et al., 2008).Meanwhile, the PM sensors had fairly uniform concentrations at all four sites and over the full range of wind conditions (Figs. S5-S6).This finding is similar to past near-road studies, which generally see a low signal change in particulate mass (Karner et al., 2010).
Diurnal signals of ozone revealed that the two sensor nodes replicated the typical afternoon peak in ozone, but the amplitude of the cycle was smallest for the roadside site (Fig. S5). PM sensors had repeatable trends at all sites of maximum early morning concentrations (06:00-08:00), which may be attributed to lower atmospheric mixing and commute traffic periods.
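A diurnal profile of the kind discussed here can be computed by averaging hourly values by hour of day. The sketch below is illustrative only; the input layout and function names are assumptions, and the amplitude is defined simply as the maximum minus the minimum hourly mean.

```python
from collections import defaultdict
from statistics import mean


def diurnal_profile(samples):
    """Average concentration by hour of day.

    samples: iterable of (hour_of_day, concentration) pairs.
    Returns a dict mapping hour -> mean concentration, ordered by hour.
    """
    by_hour = defaultdict(list)
    for hour, value in samples:
        by_hour[hour].append(value)
    return {hour: mean(values) for hour, values in sorted(by_hour.items())}


def peak_hour_and_amplitude(profile):
    """Return the hour with the highest mean and the max-minus-min amplitude."""
    peak = max(profile, key=profile.get)
    amplitude = max(profile.values()) - min(profile.values())
    return peak, amplitude
```

Comparing the amplitude between nodes gives a simple numerical counterpart to the observation that the ozone cycle was smallest at the roadside site.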

Conclusions and discussion
Emerging air sensor technology is of widespread interest to increase the spatial resolution of air quality data sets and empower communities to measure air quality in their local environments. The CAIRSENSE project is a multi-year, multi-city effort to assess emerging ambient air quality sensors with existing or near-term commercial availability. Long-term evaluation of duplicate or triplicate sensors in Decatur, Georgia, revealed widely variable sensor performance under real-world conditions. The selected testing location represents a generally low-concentration, suburban environment (e.g., mean PM2.5 ranging ∼ 9-12 µg m⁻³) with temperate winters and hot, humid summers. A variety of factors are anticipated to contribute to sensor performance in the measurement of outdoor air pollution trends. Key design aspects include the sensitivity and stability of the internal pollutant sensing component, the design of the device enclosure and the mechanism of introducing air to the sensing region, the addition of any ancillary sensors used for signal adjustment (e.g., RH sensor), and the on-board or cloud-based firmware processing raw signals into estimated concentrations. In addition, the pollution mixture, concentration regime, and environmental conditions are anticipated to impact sensor performance. Therefore, testing in multiple climates and air pollution mixtures is desirable to characterize emerging air sensor technology.
At the Decatur testing site, some sensors were observed to have very strong agreement with FEMs over an extended period of time (e.g., SM50 O3 sensor), and no artifact adjustment was required to improve the agreement. Other sensors had good agreement with FEMs (e.g., AQMesh CO sensor), which improved even further when days of use, temperature, and/or humidity were incorporated as parameters in a multilinear regression equation. Other sensors had poor or even negative agreement with FEM data sets and, in some cases, substantially weaker field performance than what had been shown in a laboratory setting. These results demonstrate the need for individual sensor performance testing prior to field use, and the correspondingly higher uncertainty in sensor data sets that do not incorporate field testing in their application.
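The effect of adding covariates such as temperature, humidity, or days of use to a regression against a reference monitor can be illustrated with a small multiple linear regression that reports the adjusted R². The sketch below is a plain least-squares fit solved through the normal equations, not the step-wise procedure used in the study; the function names and data layout are assumptions for illustration.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_mlr(predictors, y):
    """Fit y = b0 + b1*x1 + ... and return (coefficients, adjusted R^2).

    predictors: list of rows, e.g. [sensor, temperature, rh, days_of_use].
    coefficients[0] is the intercept.
    """
    rows = [[1.0] + list(p) for p in predictors]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    y_mean = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot
    n, p = len(y), k - 1  # observations, predictors excluding the intercept
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)
    return beta, adj_r2
```

Fitting once with the sensor signal alone and once with the extra covariates, then comparing the two adjusted R² values, shows whether a covariate improves agreement beyond what the added model flexibility would give by chance.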
Application of select sensors in a local wireless sensor network revealed usable ∼ 8-month data sets for both ozone and particulate matter. ZigBee-based network communications were feasible over short ranges (e.g., 0.5 km), with the data communication range reduced from the nominal ∼ 1.5 km by the surrounding mature trees and several structures in the area. Selecting early morning and upwind hours provided a means to adjust the data sets against the nearby FEM data and subsequently investigate diurnal and wind-directional trends. Ozone and PM trends were similar to those repeatedly observed in past near-road field studies.
Air quality sensor technology is quickly developing, with research efforts underway worldwide to apply sensors for multiple uses including long-term outdoor monitoring, short-term field studies, stationary and mobile applications, and personal monitoring. This field study demonstrates a very wide range of sensor performance in an outdoor, suburban setting. While the results of this study are likely transferable to environments that have similar pollution concentration ranges and environmental conditions, one complicating and uncontrollable factor is the potential variability in the sensor manufacturing process. To maximize the potential of this emerging technology, incorporating collocation with a reference monitor into future field study designs is highly encouraged.

Data availability
The CAIRSENSE project data sets will be available for retrieval at the EPA Environmental Dataset Gateway (https://edg.epa.gov/) (EPA, 2016), where the data set can be retrieved by searching for the keyword "CAIRSENSE" or an author's last name. The project data can also be requested from the corresponding author.
The Supplement related to this article is available online at doi:10.5194/amt-9-5281-2016-supplement.

Figure 2. CAIRSENSE project wireless sensor network (WSN) and sensor ad hoc field testing (SAFT) locations. WSN-N4, SAFT, and the WSN communication base station are collocated with the NCore site.

Figure 3. AQMesh vs. FEM carbon monoxide comparison, with markers colored by the number of days of sensor use.

Table 1. Sensors selected for collocation ad hoc field testing (SAFT).

Table 2. Wireless sensor network components.

Table 3. Comparison statistics for 12 h average PM measurements at South Dekalb NCore site.
a With aluminum foil added after 2014/09/18. b Short, discontinuous testing period from January to May 2015. c Reference PM2.5 instrument: MetOne, BAM 1020 (Grants Pass, OR, USA). NA: not available; n/a: not applicable.
Table 4). Of all the sensors discussed, the Cairclip NO2/O3 sensor is unique in having a single data value output that nominally represents the addition of NO2 plus O3. Therefore, Cairclip NO2 or O3 values discussed represent the initial summation minus a FEM reading (i.e., Cairclip NO2

Table 4. Comparison statistics for hourly gas measurements at South Dekalb NCore site.

Table 5. Comparison of adjusted regression coefficients (R² adj) of multiple linear regression models between reference concentrations against individual sensor a, ambient temperature, humidity, and/or number of measurement days.