Atmos. Meas. Tech., 12, 5347–5362, 2019
https://doi.org/10.5194/amt-12-5347-2019

Research article 08 Oct 2019

# Application of factor and cluster analyses to determine source–receptor relationships of industrial volatile organic odor species in a dual-optical sensing system

Jen-Chih Yang¹,², Pao-Erh Chang¹, Chi-Chang Ho², and Chang-Fu Wu²
• ¹ Green Energy and Environment Research Laboratories, Industrial Technology Research Institute, Room 220, 2F, Bldg.6, 321, Sec.2, Kuang Fu Rd., Hsinchu City 30011, Taiwan
• ² Institute of Environmental and Occupational Health Sciences, National Taiwan University, Room 717, No. 17, Xu-Zhou Rd., Taipei 10055, Taiwan

Correspondence: Chang-Fu Wu (changfu@ntu.edu.tw)

Abstract

Most odor nuisance investigations rely on either human olfactory examination or on-site sampling and analytical techniques, but these methods are often subject to spatial and temporal limitations and thus impractical for locating odor emission sources. This study developed an alternative approach with a dual-optical sensing system, a meteorological station, and the combination of factor and cluster analyses to identify and characterize emission sources of multiple air contaminants. Factor and cluster analyses were employed to establish the emission profile of multiple odorous substances from each emission source. Both receptor and source monitoring data were collected to characterize the emission sources of various odorous substances. Open-path Fourier transform infrared (OP-FTIR) as a receptor path detected concurrent trends of several organic solvents with concentrations higher than the reference odor threshold values, indicating that these compounds were potential causes of odor nuisance. Qualitative source apportionment by factor and cluster analyses suggested that these odorous substances were used as organic solvents in surface coating or painting processes. Closed-cell Fourier transform infrared (CC-FTIR) at two nearby surface-coating companies indicated that only one company's stack exhibited the same odorous substance profile found by the OP-FTIR receptor path. The major odor emission source was thus identified in this study. This study demonstrated the feasibility of using the alternative investigative framework to successfully identify emission sources from an industrial odor nuisance site. The major emission sources were identified, and future enforcement plans can be conducted to enhance odor investigation efficiency and improve overall air quality.

1 Introduction

The rapid growth of the economy and industrialization have led to environmental pollution problems, and consequently an increase in environmental nuisance complaints has been evidenced in recent years. With more than 93 265 complaints, representing 33.7 % of total reported environmental nuisances (Fig. 1), odor nuisances have been ranked as the leading cause of environmental nuisances in Taiwan (Taiwan Environmental Protection Agency, 2017). Volatile organic compounds (VOCs) are one of the factors contributing to odors and triggering various health problems, such as asthma, pneumonia, and bronchitis (Pride et al., 2015). VOCs are also a precursor to fine particulate matter in the atmosphere, aggravating photochemical smog conditions in urban areas (Hu et al., 2017; Jathar et al., 2014). With residential area gradually expanding to industrial districts, odor nuisance has become another critical problem related to industrial VOC emissions, with a great impact on quality of life.

Figure 1. Trend of total odor nuisance complaints by the TEPA from 2004 to 2017. An increase in odor nuisance complaints has been evidenced in recent years and the odor nuisances have been ranked as the leading cause of environmental nuisances in Taiwan.

Identifying the emission sources responsible for VOCs and odors remains a great challenge. Most odor nuisance investigations rely on either human olfactory examination or on-site sampling and analytical techniques (Merlen et al., 2017). However, these methods are hampered by spatial and temporal limitations. The “triangle odor bag method”, originally developed by the Tokyo metropolitan government in 1972, was adopted by the Taiwanese government as a regulatory enforcement method in odor nuisance investigation. This method quantifies odor nuisance by using the human olfactory sense of a group of trained personnel (Higuchi, 2009; Higuchi and Masuda, 2004; Ueno et al., 2009). However, this method can only help determine the degree of odor intensity in collected air samples; it cannot enable the identification of the responsible emission sources. Sampling tools such as the Summa canister, Tedlar bag, and charcoal tube can be equipped with conventional fixed-point sampling and analytical methods to measure various VOC odor species (van Harreveld, 2003; Rumsey et al., 2012). However, these methods are highly temporally and spatially dependent, rendering the sampling of periodic or occasional odor episodes problematic. The insufficiency of conventional fixed-point sampling and analytical methods poses a great challenge to regulatory inspectors when odor nuisance occurs intermittently or during nonworking hours or originates from multiple sources. Many repeated air pollution complaints remain unresolved because the root pollution sources have not yet been found.

Fourier transform infrared (FTIR) spectrometry is an optical sensing technology that can detect multi-gaseous pollutants on a continuous basis and is, therefore, suitable for use in VOC or odor emission source investigation (Russwurm et al., 1991; Sung et al., 2014). It can allow real-time monitoring and analysis of several compounds simultaneously. The IR “fingerprints” of over 300 compounds were established on the basis of information from the U.S. Environmental Protection Agency (U.S. EPA) and the FTIR software developers. FTIR systems are of two types, namely open-path and closed-cell systems (USEPA, 2011). The open-path system, also called open-path Fourier transform infrared (OP-FTIR) spectroscopy, is an optical remote sensing technique used for measuring VOCs and inorganic compounds such as ammonia and hydrogen chloride in the ambient environment (e.g., fence line monitoring). The closed-cell system, also called closed-cell Fourier transform infrared (CC-FTIR) spectroscopy, is equipped with the same basic FTIR module as the OP-FTIR system, but employs gas pumps and sampling tubes to extract waste gas (e.g., from stack outlets) to a multipath cell attached to the FTIR spectrometer. In this study, the OP-FTIR and CC-FTIR systems were combined to obtain a “dual-optical sensing system” for accomplishing the multiple functions of open-path long-range measurement, continuous monitoring, and multiple species measurement of stack exhaust, offering a powerful alternative method for investigating VOC or odor emission sources. Because OP-FTIR and CC-FTIR systems can generate a large speciation dataset in a short period, statistical methods play an essential role in data processing to extract the underlying meaning behind time series patterns. Multivariate statistical modeling is suitable for processing FTIR data because it primarily analyzes correlations between time series trends of different species at different locations. 
By identifying common contaminants and concurrent trends among the various species measured using both systems, data from both receptors (OP-FTIR) and sources (CC-FTIR) may be compared and analyzed.

The aim of this study was to develop an alternative investigative framework for detecting air pollution sources by using a dual-optical sensing system, a meteorological station, and factor and cluster analyses, so that emission reductions can subsequently be pursued on the basis of the investigation results.

2 Materials and methodology

## 2.1 Site description

The Taiwan Environmental Protection Agency (TEPA) frequently receives complaints of odor nuisance at an intersection near an industrial park in southern Taiwan. The odor, described as solvent- or chemical-like, is mostly reported by commuters traveling through or waiting at the traffic signal at this intersection. A sunglass factory (hereafter called CY) is located at the northwest corner, a light metal casing factory (hereafter called KS) at the southeast corner, and a solar cell manufacturer (hereafter called NS) slightly removed from the intersection to the east. Stacks (approximately 15–30 m high) on the rooftop of each factory continuously emit various processing gases during operating hours. The chemicals used at both CY and KS are mainly paint-related materials containing organic solvents, such as toluene, xylene, acetone, and ethyl acetate, for surface coating purposes (CRC, 2006). NS mainly uses inorganic materials such as ammonia, silane, and nitric acid for silicon glass processing, thus generating both primary and secondary air pollutants (e.g., nitrogen dioxide) from high-temperature glass sintering processes (USPatent:4883521A, 1989).

## 2.2 Sampling techniques

To investigate odor emission sources at this location, an OP-FTIR beam path was deployed at the intersection to mimic the olfactory sense of people traveling through it. The OP-FTIR spectrometer (AirSentry-FTIR, Cerex, USA) used in this study was a monostatic type equipped with Zn–Se beam splitters and liquid-nitrogen-cooled mercury cadmium telluride (HgCdTe) detectors, with a corner-cube retroreflector (PLX, Inc., USA) placed at the other end of the beam path. An infrared (IR) light beam was transmitted from a telescope to the retroreflector, which was positioned some distance away from the light emitter, and was reflected back to the detector inside the instrument, enabling measurement of pollutants transported through the beam path.

Monitoring was conducted from 9 to 19 March 2015, collecting a total of 2911 consecutive spectra. The OP-FTIR beam path was 143 m long in one direction and was equipped with a light emitter on the ground level on one side and a retroreflector at a height of 10 m on the other side. A meteorological station at a height of 12 m (fourth floor) was used together with the OP-FTIR beam path to monitor wind speed and direction (see Fig. 2a). Wind and OP-FTIR data were measured simultaneously and collected continuously in a synchronized system to enable identification of the incoming direction of gaseous contaminants and provide the spatiotemporal measurement of VOCs or odor pollutants.

Figure 2. Top view of OP- and CC-FTIR configuration. (a) Receptor path of OP-FTIR monitoring at the intersection. The OP-FTIR beam path was 143 m long in one direction and was equipped with a light emitter on the ground level on one side and a retroreflector at a height of 10 m on the other side. A meteorological station at a height of 12 m was used together with the OP-FTIR beam path to monitor wind speed and direction. Wind and OP-FTIR data were measured in a synchronized system to enable identification of the incoming direction of gaseous contaminants and provide a spatiotemporal measurement of VOCs or odor pollutants. (b) Source stack CC-FTIR measurement at three potential odor emission sources. A gas cell with a 10 m path length, an inner pressure of 95 992 Pa, and an estimated gas flow rate of 0.37 L s−1 was used for the CC-FTIR multi-reflection gas measurements.

Official documents were reviewed to ascertain the raw material usage of each of the nearby factories. Three potential sources (factories), namely CY, KS, and NS, were targeted for further stack monitoring using CC-FTIR. A gas cell with a 10 m path length, an inner pressure of 95 992 Pa, and an estimated gas flow rate of 0.37 L s−1 was used for the CC-FTIR multi-reflection gas measurements. Most of the water vapor was removed by using an impinger connected to the inlet of the gas cell to reduce interference from H2O absorption in the FTIR spectra. The stack exhaust of these three factories was measured for 24 to 242 h, generating data at 5 min intervals. This continuous monitoring generated sufficient time series data to enable factor and cluster analyses in the next phase. Two CC-FTIR systems were deployed at each selected emission source to measure chemical species of exhaust gases from each stack (see Fig. 2b). Sampling tubes were divided into several manifolds at the stack end and joined together before entering the CC-FTIR gas cell. This sampling method allowed waste gas flows from different stacks to be collected and transferred to the gas cell simultaneously, avoiding time lags when switching the sampling line from one stack to another. A total of 4378 spectra were collected from the stack outlets of the three potential odor emission sources, namely 288 spectra from CY, 2907 spectra from KS, and 1183 spectra from NS.

## 2.3 Chemical analysis methods

Any gaseous compound that absorbs in the IR region (approximately 2.5–25 µm) was a potential candidate for monitoring using FTIR technology. The resolution of the OP-FTIR and CC-FTIR interferograms was 1 cm−1, and a co-added infrared spectrum of 64 IR scans was recorded at each 5 min interval. Contaminants of interest were identified and quantified using spectral search software featuring compound-specific analysis and comparison to the system's internal reference spectra library. The unique fingerprint characteristics of each chemical compound made identification of gaseous pollutants possible by comparing the shape, position, and relative peak height of each measured spectrum with reference spectra. Multicomponent classical least-squares techniques were employed in the FTIR spectral quantitative analysis. Rolling backgrounds were used in OP-FTIR spectral analysis to eliminate baseline shifts resulting from changes in weather conditions (Hunt, 1995). In the rolling background method, the first spectrum serves as the background for creating an absorbance spectrum from the second spectrum, the second spectrum serves as the background for the third spectrum, and so on. The integral values of concentrations were calculated to obtain time series data for each compound. The advantage of the rolling background is that it provides the best correction for water vapor and for detector and instrument response, as well as the lowest residual error (Hunt, 1995). A “fixed” reference method was used in CC-FTIR spectral analysis. This method uses a single reference spectrum, taken from zero air or highly purified nitrogen, to generate all absorbance spectra. Its main advantage is that the reference is pure, without any contaminants, so the absolute concentrations of the contaminants can be calculated accordingly (Hunt, 1995).
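The rolling background idea can be illustrated with a minimal Python sketch. It is not the instrument software: the function names are hypothetical, the absorbance is assumed to follow the usual Beer–Lambert form A = −log10(I/I0), and the "integral values" are assumed to be a running sum of the incremental concentrations retrieved from consecutive spectra.

```python
import math


def rolling_absorbance(single_beam_spectra):
    """Absorbance of each spectrum relative to the spectrum recorded just before it.

    single_beam_spectra: list of spectra, each a list of intensities per wavenumber bin.
    Returns one absorbance spectrum fewer than the number of input spectra.
    """
    absorbance = []
    for background, sample in zip(single_beam_spectra, single_beam_spectra[1:]):
        # Assumed Beer-Lambert form: A = -log10(I / I0), with I0 the previous spectrum.
        absorbance.append([-math.log10(i / i0) for i, i0 in zip(sample, background)])
    return absorbance


def cumulative_trend(increments):
    """Running sum of incremental concentrations (assumed reading of 'integral values')."""
    total = 0.0
    trend = []
    for increment in increments:
        total += increment
        trend.append(total)
    return trend


# Hypothetical intensities for three consecutive spectra with two wavenumber bins each.
spectra = [[100.0, 100.0], [10.0, 100.0], [10.0, 50.0]]
absorbance_spectra = rolling_absorbance(spectra)

# Hypothetical incremental concentrations (ppbv) retrieved from consecutive spectra.
trend = cumulative_trend([1.0, -0.5, 2.0])
```

Because each spectrum is referenced to its predecessor, only changes between consecutive 5 min intervals are retrieved, which is consistent with the statement that absolute background levels cannot be quantified with this method.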

## 2.4 Qualitative receptor modeling

Factor analysis and cluster analysis using the SAS statistical software package (SAS Institute Inc., USA) were employed in qualitative receptor modeling in this study. Factor analysis uses eigenvector analysis with varimax orthogonal rotation to interpret large datasets (Johnson, 1998). The factor analysis model expresses each variable as a linear combination of underlying common factors $f_1, f_2, \dots, f_m$, with an accompanying error term that accounts for the part of each variable that is uncorrelated with any of the common factors. For $X_1, X_2, \dots, X_p$ in any observation vector $\mathbf{X}$, the m-factor model is given by Eqs. (1)–(4) (Rencher, 2002):

$\begin{array}{rcll} X_1 & = & a_{11} f_1 + a_{12} f_2 + \dots + a_{1m} f_m + e_1, & \text{(1)} \\ X_2 & = & a_{21} f_1 + a_{22} f_2 + \dots + a_{2m} f_m + e_2, & \text{(2)} \\ & \vdots & & \\ X_p & = & a_{p1} f_1 + a_{p2} f_2 + \dots + a_{pm} f_m + e_p, & \text{(3)} \\ \mathbf{X} & = & (X_1, \dots, X_p)', \quad \mathbf{f} = (f_1, \dots, f_m)', \quad \text{and} \quad \mathbf{e} = (e_1, \dots, e_p)', & \text{(4)} \end{array}$

where $X_i$ is the ith chemical species with mean 0 and unit variance, $i=1,\dots,p$; $a_{i1}$ to $a_{im}$ are the factor loadings for the ith chemical species; $f_1$ to $f_m$ are uncorrelated common factors, each with mean 0 and unit variance; and $e_i$ represents the error term indicating the residual part of $X_i$ that is not in common with the other variables.
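Two consequences of these assumptions are worth writing out, because they underpin how loadings are interpreted. Assuming, as stated for the model, that the common factors are mutually uncorrelated with unit variance and that each error term is uncorrelated with the factors, the unit variance of each species splits into a part explained by the factors and a residual part, and each loading equals the correlation between a species and a factor:

$\begin{array}{l} \operatorname{var}(X_i) = a_{i1}^2 + a_{i2}^2 + \dots + a_{im}^2 + \operatorname{var}(e_i) = 1, \\ \operatorname{corr}(X_i, f_j) = \operatorname{cov}(X_i, f_j) = a_{ij}. \end{array}$

The sum $a_{i1}^2 + \dots + a_{im}^2$ is commonly called the communality of species i.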

Because the FTIR data are multivariate and contain many intercorrelated variables, all variables had to be considered simultaneously to understand the underlying meaning of the measured data. Variables (VOC or odor substances) with concurrent patterns were grouped together as a factor to gain insight into the underlying emission source characteristics. Factors with an eigenvalue greater than 1 were retained for varimax rotations and factor loading calculations. Factor loadings with absolute values greater than 0.4 were considered influential variables (Rencher, 2002); the higher the factor loading (>0.4), the stronger the correlation between the variables (odor substances) and the factor (emission source). The combination of variables in each factor roughly represented the types or characteristics of each factor or source. This method is especially useful when the patterns of association between the receptor (measured by ambient OP-FTIR) and source (measured by stack CC-FTIR) are compared reciprocally, enabling emission sources that mutually correspond to be identified.
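The two selection rules described above (retain factors with an eigenvalue greater than 1, then flag species whose absolute loading exceeds 0.4) can be sketched as follows. This is only an illustration of the rules, not the SAS procedure used in the study; the function names and all numbers are hypothetical and are not taken from the paper's tables.

```python
def retained_factors(eigenvalues, cutoff=1.0):
    """Return the indices of factors whose eigenvalue exceeds the cutoff."""
    return [k for k, value in enumerate(eigenvalues) if value > cutoff]


def influential_species(loadings, species, factor, threshold=0.4):
    """Return the species whose absolute loading on `factor` exceeds the threshold.

    loadings[i][k] is the (rotated) loading of species i on factor k.
    """
    return [name for name, row in zip(species, loadings) if abs(row[factor]) > threshold]


# Hypothetical values, for illustration only.
eigenvalues = [3.1, 1.4, 0.8, 0.4]
species = ["toluene", "m-xylene", "acetylene", "ammonia"]
rotated_loadings = [
    [0.85, 0.10],
    [0.78, -0.05],
    [0.12, 0.91],
    [-0.45, 0.30],
]

kept = retained_factors(eigenvalues)
groups = {k: influential_species(rotated_loadings, species, k) for k in kept}
```

Note that the threshold is applied to the absolute value, so a species with a loading of −0.45 is still flagged as influential for that factor.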

Figure 3. Wind roses for (a) 9–19 March 2015 and (b–l) each day during the 9–19 March 2015 period. The prevailing wind was from the NNW–N–NNE directions on 9–14 March 2015 and from the SSW–S–SSE–SE–ESE directions on 15–18 March 2015.

Cluster analysis is used to find patterns in a dataset by grouping all variables into clusters. A single linkage method (also called the nearest-neighbor method), a type of hierarchical method, was used to calculate the distance between two clusters in this study. In the single linkage method, the distance between two clusters A and B is defined as the minimum distance between a point in A and a point in B, as described by Eq. (5) (Rencher, 2002):

$\begin{array}{ll} D(A,B) = \min \{ d(y_i, y_j),\ \text{for } y_i \in A \text{ and } y_j \in B \}, & \text{(5)} \end{array}$

where $d(y_i, y_j)$ is the Euclidean distance (Rencher, 2002, chap. 14).
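The single linkage rule described above can be sketched in plain Python. This is a minimal illustration rather than the SAS procedure used in the study; the function names and the example points are hypothetical. In the study's setting, each "point" would be the time series of one chemical species.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def single_linkage_distance(cluster_a, cluster_b):
    """Minimum distance between any point in cluster_a and any point in cluster_b."""
    return min(euclidean(a, b) for a in cluster_a for b in cluster_b)


def single_linkage(points):
    """Agglomerative clustering; returns the merge history as (members, members, distance)."""
    clusters = [[i] for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        best = None
        # Find the pair of clusters with the smallest single linkage distance.
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = single_linkage_distance(
                    [points[a] for a in clusters[i]],
                    [points[b] for b in clusters[j]],
                )
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((sorted(clusters[i]), sorted(clusters[j]), d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges


# Hypothetical one-dimensional example: points 0 and 1 are close, point 2 is far away.
merges = single_linkage([(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)])
```

The merge history is the information a dendrogram displays: the order in which groups join and the distance at which each join occurs.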

The concurrent trends between different species can be analyzed using both factor and cluster analysis. Odor contaminants with concurrent patterns were grouped as a factor to gain insight into the underlying emission source characteristics. Meteorological data were used to confirm the factor analysis in that the incoming wind direction of each factor (representing a group of chemicals) may be different according to the relative locations of each potential odor source. Cluster dendrograms provide linkage paths between groups of chemicals to offer more information about the characteristics of different emission sources.
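One simple way to relate factor scores to incoming wind direction is to bin each observation into a 16-point compass sector and average the scores per sector. The sketch below assumes wind direction is reported in degrees using the meteorological convention (the direction the wind blows from); the function names and sample values are hypothetical and the approach is an illustration, not necessarily the procedure used to produce the wind rose figures.

```python
SECTORS = ["N", "NNE", "NE", "ENE", "E", "ESE", "SE", "SSE",
           "S", "SSW", "SW", "WSW", "W", "WNW", "NW", "NNW"]


def wind_sector(degrees):
    """Map a wind direction in degrees to one of 16 compass sectors (22.5 degrees each)."""
    # Shift by half a sector so that, for example, 350 and 10 degrees both map to N.
    index = int(((degrees % 360.0) + 11.25) // 22.5) % 16
    return SECTORS[index]


def mean_score_by_sector(wind_degrees, factor_scores):
    """Average factor score for each wind sector that has at least one observation."""
    sums, counts = {}, {}
    for degrees, score in zip(wind_degrees, factor_scores):
        sector = wind_sector(degrees)
        sums[sector] = sums.get(sector, 0.0) + score
        counts[sector] = counts.get(sector, 0) + 1
    return {sector: sums[sector] / counts[sector] for sector in sums}


# Hypothetical observations: two high scores under WNW wind, one low score under N wind.
by_sector = mean_score_by_sector([290.0, 295.0, 10.0], [2.0, 1.0, -0.5])
```

A sector with a high mean factor score points toward the upwind location of the corresponding group of chemicals.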

3 Results and discussion

## 3.1 Meteorological data

The meteorological data from 9 to 19 March 2015 are shown in Fig. 3. The prevailing wind from 9 to 14 March was from the NNW–N–NNE directions, whereas the prevailing wind from 15 to 18 March was from the SSW–S–SSE–SE–ESE directions. A dramatic change in wind direction from 14 to 15 March, when the incoming wind direction changed from north to south, was observed. The integrated wind direction in Fig. 3a indicates that the overall wind direction was from the N–NNE direction during the 10 d of field monitoring.

Table 1. Descriptive statistics of VOC measurements at the receptor site and the correlation coefficients between the receptor site and the reported odor nuisance events.

Measurements performed from 9 March 2015 at 15:41 LT to 19 March 2015 at 11:25 LT; 2911 spectra were recorded at 5 min intervals.
a Background species. The exact concentration of the background species cannot be quantified using a rolling background because of unknown background levels.
b Italicized numbers represent concentrations exceeding corresponding odor thresholds.
c MDC (estimated minimum detectable concentration) is calculated for a two-way path length of 286 m and a 5 min average from the peak-to-peak (p–p) absorbance noise in the spectral region of the target absorption feature. The MDC corresponds to the absorbance signal of the target compound that equals the p–p noise level, using a reference spectrum acquired for a known concentration of the target compound: MDC $=\frac{(\mathrm{ppm}\cdot \mathrm{m})}{A_n(\nu)}\cdot \frac{\mathrm{NEA}_x}{\text{path length}\,(\mathrm{m})}$, where MDC is the estimated minimum detectable concentration (ppm or ppb), $A_n(\nu)$ is the normalized absorbance, $\mathrm{NEA}_x$ is the noise-equivalent absorbance calculated from the p–p noise, and path length is the two-way path length (m).
d Phi Correlation coefficient (Gallagher, 2011).
e Point biserial correlation (Demirtas et al., 2012).
f Statistically significant correlation coefficients are marked with ${}^{*}p<0.05$, ${}^{**}p<0.01$, and ${}^{***}p<0.001$.
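The MDC expression in footnote c can be evaluated with a few lines of Python. This sketch assumes that the (ppm·m) term is the concentration–path-length product of the reference spectrum and that the normalized absorbance is the reference absorbance at the target feature; the function name, parameter names, and all input numbers are hypothetical.

```python
def minimum_detectable_concentration(ref_conc_path_ppm_m, ref_absorbance,
                                     noise_equivalent_absorbance, path_length_m):
    """Estimate the MDC (ppm) from a reference spectrum and the p-p absorbance noise.

    MDC = (ppm*m / A_n) * (NEA / path length), following the footnote's expression.
    """
    return (ref_conc_path_ppm_m / ref_absorbance) * (noise_equivalent_absorbance / path_length_m)


# Hypothetical inputs: a 100 ppm*m reference giving 0.05 absorbance units,
# a p-p noise of 0.001 absorbance units, and the 286 m two-way path length.
mdc_ppm = minimum_detectable_concentration(100.0, 0.05, 0.001, 286.0)
mdc_ppb = mdc_ppm * 1000.0
```

With these illustrative inputs the estimate is roughly 7 ppb; a longer path or a lower noise level reduces the MDC proportionally.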

Figure 4. Comparison between measured spectra (at the receptor site) and reference spectra (from the spectra library).

## 3.2 Ambient data from receptor path

Table 1 shows the ambient concentration of air contaminants measured using the OP-FTIR system at the intersection. The first column represents the 16 species measured by the receptor path (OP-FTIR), namely acetone, ethyl acetate, ammonia, gasoline, m-xylene, nitrogen dioxide, o-xylene, n-butyl acetate, toluene, propylene glycol methyl ether acetate (PGMEA), p-xylene, acetylene, ethylene, butyl cellosolve, carbon monoxide, and nitrous oxide. Figure 4 displays a series of comparisons between the measured and reference spectra. The concentrations of most species were quantified, except for background species such as carbon monoxide and nitrous oxide. The exact concentration of background species cannot be quantified using a rolling background in the spectral analysis because of unknown background levels; however, the incremental concentration of these species can still be calculated to generate concentration trends suitable for factor analysis. A total of 2911 consecutive spectra were collected during the 10 d of field monitoring, with various detection limits intrinsic to each compound. The numbers shown in the second column indicated that the probability of detection of ammonia, ethyl acetate, acetone, butyl cellosolve, n-butyl acetate, o-xylene, PGMEA, and ethylene was higher than that of other species. The maximum value of each detected contaminant represented the highest concentration measured within a 5 min period. Concentrations detected using OP-FTIR were the path average. Among the 16 detected species, the major compounds were gasoline, m-xylene, and nitrogen dioxide, with mean concentrations of 33.21±5.00, 27.96±6.05, and 25.13±3.28 ppbv, respectively. Toluene, o-xylene, and acetone revealed mean concentrations ranging from 11.61 to 20.57 ppbv. 
The concentration levels of gasoline, m-xylene, nitrogen dioxide, n-butyl acetate, toluene, and PGMEA were higher than the odor threshold reference values, indicating that these compounds were potential causes of odor nuisance in the intersection zone. These odor substances are mainly used as organic solvents in surface coating or painting processes. The correlation between the substances (concentrations) detected at the receptor site and the reported odor nuisance events was assessed using phi coefficients and point biserial correlations (Gallagher, 2011; Demirtas et al., 2012). The phi coefficient ($r_{\mathrm{phi}}$) for “odor” vs. “compound” is the correlation between two dichotomous variables: the detection of a compound (detected vs. non-detected) and the perception of odor (odor vs. non-odor, as recorded by the local environmental protection agency). The point biserial correlation ($r_{\mathrm{pb}}$), a correlation between one continuous and one dichotomous variable, relates the concentration of a compound to the perception of odor (Capelli et al., 2013). A value of $r_{\mathrm{phi}}$ or $r_{\mathrm{pb}}$ close to 1 indicated a strong association between odor and compound. The $r_{\mathrm{phi}}$ and $r_{\mathrm{pb}}$ values between the odor and acetone, ethyl acetate, toluene, PGMEA, and butyl cellosolve were mostly at moderate levels ($r_{\mathrm{phi}}=0.50$ to 0.67; $r_{\mathrm{pb}}=0.30$ to 0.45), and the correlations were statistically significant (p<0.001). The correlations between the odor and m-xylene, p-xylene, and n-butyl acetate were relatively weak, although they were also statistically significant (p<0.001). These results suggest that acetone, ethyl acetate, toluene, PGMEA, and butyl cellosolve were the substances most likely correlated with the recorded odor nuisance events, which were defined as any solvent smell arising from the intersection zone. 
The complete time series of the chemical species found at the receptor site, which served as the basis for calculating $r_{\mathrm{phi}}$ and $r_{\mathrm{pb}}$, is shown in Fig. 5, in which the periods when odor was reported are highlighted.
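For readers unfamiliar with the two statistics, the following minimal sketch computes a phi coefficient from two 0/1 series and a point biserial correlation from one continuous and one 0/1 series, using their standard textbook formulas. The function names and sample data are hypothetical; the sketch does not handle degenerate inputs (for example, a series that is all zeros), and it is not the software used in the study.

```python
import math


def phi_coefficient(x, y):
    """Phi coefficient between two dichotomous (0/1) series of equal length."""
    n11 = sum(1 for a, b in zip(x, y) if a and b)
    n10 = sum(1 for a, b in zip(x, y) if a and not b)
    n01 = sum(1 for a, b in zip(x, y) if not a and b)
    n00 = sum(1 for a, b in zip(x, y) if not a and not b)
    denominator = math.sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    return (n11 * n00 - n10 * n01) / denominator


def point_biserial(values, flags):
    """Point biserial correlation between a continuous series and a 0/1 series."""
    group1 = [v for v, f in zip(values, flags) if f]
    group0 = [v for v, f in zip(values, flags) if not f]
    n1, n0 = len(group1), len(group0)
    n = n1 + n0
    mean_all = sum(values) / n
    # Population standard deviation (divide by n), matching the sqrt(n1*n0/n^2) form.
    sd_all = math.sqrt(sum((v - mean_all) ** 2 for v in values) / n)
    mean1 = sum(group1) / n1
    mean0 = sum(group0) / n0
    return (mean1 - mean0) / sd_all * math.sqrt(n1 * n0 / n ** 2)


# Hypothetical 5 min records: compound detected (1/0), odor reported (1/0), concentration (ppbv).
detected = [1, 1, 0, 0, 1, 0]
odor = [1, 1, 0, 0, 0, 0]
concentration = [12.0, 15.0, 0.0, 0.0, 3.0, 0.0]

r_phi = phi_coefficient(detected, odor)
r_pb = point_biserial(concentration, odor)
```

Both statistics are special cases of the Pearson correlation, which is why values near 1 indicate that detections or high concentrations coincide with reported odor periods.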

Figure 5. Time series pattern of chemical species detected at the receptor site by OP-FTIR; the yellow highlights indicate the periods when the odor was reported.

Table 2. The grouping of the data as a function of time and wind direction using factor analysis for chemical species measured by OP-FTIR at the receptor site.

a Extraction method: principal component analysis; rotation method: varimax with Kaiser normalization. b Kaiser's measure of sampling adequacy: overall MSA=0.849, indicating the dataset's appropriateness for use in factor analysis. c Italicized numbers represent factor loadings of >0.40, indicating the main species in each factor (source).

Figure 6. Diurnal time series pattern of factor scores for the four factors/sources categorized by OP-FTIR at the receptor path; (a) first group (F1_OP): surface coating; (b) second group (F2_OP): incomplete engine combustion; (c) third group (F3_OP): solar cell production; (d) fourth group (F4_OP): solvent use; (e) time series pattern of factor scores of the four factors, suggesting that the proportion of the factor scores in negative values were in a relatively small range.

Table 2 summarizes the results of factor analysis for the OP-FTIR receptor path. The pattern of the first factor (F1_OP) indicated several organic solvents, including m-xylene, p-xylene, o-xylene, ethyl acetate, PGMEA, toluene, and butyl cellosolve, all of which are commonly used as chemical solvents in surface coatings and paints (USEPA, 2009) and could be considered possible causes of odor nuisance because their concentrations were higher than the reference values. The daytime pattern of factor scores for the first group, as shown in Fig. 6a, revealed higher concentrations and frequencies of occurrence from 14:00 to 22:00 LT, particularly on weekdays. This could explain the higher incidence of odor nuisance complaints during the afternoon and evening hours on weekdays. Moreover, the incoming direction of these seven species (as represented by factor scores) revealed that the highest factor score occurred in the direction of the WNW, although a few came from the direction of ESE and the directions of NNW–ENE (Fig. 7a).

The compounds included in the second factor (F2_OP) were acetylene, ethylene, gasoline, and carbon monoxide. Figure 6b shows the daytime pattern of these four species, indicating higher concentrations during the peak traffic hours from 06:00 to 09:00 and 17:00 to 20:00 LT on weekdays. This unique pattern indicates that the second group of compounds may be derived from incomplete combustion in the engines of vehicles waiting or idling at the intersection, which generates chemical byproducts such as acetylene, ethylene, and carbon monoxide (USEPA, 2000; Liu et al., 2014). The incoming directions of Factor 2 were mostly from NNE to NE, although a few came from the directions of ENE–SE and the direction of NNW (Fig. 7b), indicating multiple source directions for the incomplete engine combustion.

Ammonia, nitrogen dioxide, and nitrous oxide were identified as the third-factor (F3_OP) compounds. These mainly inorganic compounds exhibited higher concentrations from 06:00 to 09:00 LT on weekends (Fig. 6c) and mostly came from the NNE–ESE directions, although a few came from the SSW direction (Fig. 7c), indicating that the major upwind location of the emission source(s) was located in the NNE–ESE direction. The solar cell production company located to the east and using inorganic materials such as ammonia, silane, and nitric acid to produce silicon glass could generate nitrogen dioxide and nitrous oxide from high-temperature glass sintering processes (USPatent:4883521A, 1989), and was therefore deemed the potential emission source.

Figure 7. Wind rose diagrams of factor scores for the four factors/sources categorized by OP-FTIR at the receptor path; (a) first group (F1_OP): surface coating; (b) second group (F2_OP): incomplete engine combustion; (c) third group (F3_OP): solar cell production; (d) fourth group (F4_OP): solvent use.

The fourth-factor (F4_OP) compounds, namely acetone and n-butyl acetate, also exhibited higher concentrations and greater frequency of occurrence from 06:00 to 10:00 and from 17:00 to 22:00 LT on weekdays (Fig. 6d). The incoming direction of these two compounds was mainly from the N–ENE (Fig. 7d), which is slightly different from that of the first-factor (F1_OP) compounds.

The four factors were identified and characterized through the combination of species, hours of emission, and incoming direction of each. Four groups of emission sources were identified and categorized using factor analysis, namely surface coating (paint), incomplete engine combustion, solar cell production, and solvent use.

## 3.3 Comparison of ambient data from the receptor path and source profiles from multiple stacks

The ambient data from the receptor path indicated several factors or source groups at the intersection, including organic solvents from the surface coating, traffic emissions from incomplete vehicle engine combustion, and inorganic emissions from solar cell production. Official documents showed that the chemicals used in both CY and KS were paint-related materials containing organic solvents, which were thus categorized as first-factor (F1_OP) compounds. However, wind rose diagrams for the first factor (Fig. 7a) displayed multiple source directions (including N–NE, NW, and ESE), indicating that the first factor (F1_OP) might not be limited to one source; further efforts are thus required to clarify the sources. To analyze observations at the receptor path, the emission profiles of potential sources were compared.

Table 3. Concentrations of the receptor path vs. source stacks and vehicle exhaust profile (ppbv).

Figure 8. Panel plots showing relationships between source profiles and ambient data: (a) CY source profile; (b) KS source profile; (c) NS source profile; (d) traffic source profile; (e) ambient data at the receptor.

Figure 8 and Table 3 present a comparison of the detected air pollutants and their concentrations at the receptor path and source stacks in the intersection zone. Vehicle exhaust profiles from the U.S. EPA's SPECIATE database are also provided in the last column of Table 3 to represent traffic emissions from incomplete vehicle engine combustion at the receptor path. Almost every compound detected in the receptor path corresponded with one or more chemicals from the source stacks, except for traffic-related chemicals (e.g., gasoline, ethylene, and acetylene). The panel plot shows the patterns of association between the receptor (ambient data from OP-FTIR) and the source (stack source profile from CC-FTIR). Concentration boxplots for chemical species (except carbon monoxide) measured using OP-FTIR (at the intersection) are shown in Fig. 8e, with eight species coinciding with those found in the CY stacks (Fig. 8a), seven coinciding with those found in the KS stacks (Fig. 8b), and three coinciding with those found in the NS stacks (Fig. 8c), as well as six from vehicle emissions (Fig. 8d). Furthermore, six of the species found in the CY and KS stacks were present at both factories, namely ethyl acetate, toluene, o-xylene, m-xylene, p-xylene, and acetone, indicating that these six compounds were common species emitted at both locations. By contrast, butyl cellosolve and PGMEA were found only in the CY stacks. Ammonia was found at both the KS and NS stacks.

## 3.4 Factor and cluster analyses of sources

Because the chemicals used at both CY and KS were mainly organic solvents that are similar to each other, factor analysis was performed for each source to distinguish the main contributor of odor nuisance in this location and examine relationships between the ambient data and the profiles of these two sources.

Two types of multivariate statistical methods, namely factor and cluster analyses, were used together to analyze concurrent trends in the CC-FTIR data measured at the CY, KS, and NS stacks (Table 4). The result of the factor analysis for CY (Table 4a) indicated two factors with eigenvalues greater than 1. The influential species (factor loading of >0.4) for the first factor (F1_CY) were o-xylene, m-xylene, p-xylene, toluene, PGMEA, ethyl acetate, and butyl cellosolve, whereas acetone was the only influential species for the second factor (F2_CY). The first factor (F1_CY) contained a combination of various types of solvents used as paint thinners (for plastic coating purposes), whereas the second-factor (F2_CY) species (acetone) was used as a chemical solvent to remove residual paint in sprinkle nozzles. Two factors were also identified from the CC-FTIR results for the KS stack (Table 4c). The first factor (F1_KS) comprised p-xylene, toluene, m-xylene, and o-xylene, and the second factor (F2_KS) comprised acetone and ethyl acetate. The first factor (F1_KS) thus contained various chemical solvents used as paint thinners (for metal coating purposes), whereas the second factor (F2_KS) contained substances used for cleaning or other purposes in manufacturing light metal casings. The chemicals from the NS stacks were mainly inorganic materials (nitrous oxide, silane, ammonia, nitrous acid, and nitrogen dioxide) that were either primary or secondary air pollutants derived from solar cell production (Table 4e), none of which corresponded with the organic odorous solvents identified at the receptor sites. The first factor (F1_NS) comprised raw materials used for growing antireflection films, including nitrous oxide, silane, and ammonia, of which silane and ammonia were often controlled at opposite flow rates to ensure no significant pressure fluctuations (ChinaPatent:CN102244109B, 2013). The second factor (F2_NS) contained nitrous acid and nitrogen dioxide, in which the formation of NO2 is enhanced by thermal decomposition of HNO3.
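The two selection rules used above (retain factors with an eigenvalue greater than 1; treat species with a loading above 0.4 as influential) can be illustrated with a short sketch. It is not the authors' implementation: the data are synthetic, principal-component extraction without rotation is assumed because this section does not state the extraction or rotation method, and the 0.4 threshold is applied to absolute loadings. The species names are reused only as column labels.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
species = ["o-xylene", "m-xylene", "p-xylene", "toluene",
           "acetone", "ethyl acetate"]

# Synthetic data (NOT study measurements): two independent latent sources,
# the first driving four species and the second driving two.
thinner = rng.standard_normal(n)
cleaner = rng.standard_normal(n)
noise = rng.standard_normal((n, len(species)))
data = 0.9 * np.column_stack([thinner] * 4 + [cleaner] * 2) + 0.3 * noise

# Eigen-decomposition of the correlation matrix, sorted by descending eigenvalue.
corr = np.corrcoef(data, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(corr)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Rule 1: retain factors whose eigenvalue exceeds 1.
n_factors = int(np.sum(eigvals > 1.0))

# Unrotated loadings: eigenvector scaled by the square root of its eigenvalue.
loadings = eigvecs[:, :n_factors] * np.sqrt(eigvals[:n_factors])

# Rule 2: species with |loading| > 0.4 are influential for that factor.
influential = [
    [name for name, value in zip(species, loadings[:, k]) if abs(value) > 0.4]
    for k in range(n_factors)
]
for k, names in enumerate(influential, start=1):
    print(f"Factor {k} (eigenvalue {eigvals[k - 1]:.2f}): {names}")
```

With real stack data, the rows of `data` would be successive CC-FTIR concentration readings and the columns the measured species; a rotation step (for example varimax) is often applied before interpreting the loadings.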

Using cluster dendrograms, different compounds can be linked to represent their relationships with each other and the interrelationships between groups, thus providing another means of displaying correlations between different variables. According to the cluster analysis results in Table 4b and d, acetone was separated from the other chemicals at the first branch, indicating that its source was different from the others. Similarly, the linkage path between groups of chemicals differed from one company to the other, indicating that different types of paint thinner may have been used at the two companies for different purposes. Factor analysis of the ambient data and the source profiles indicated that the grouping pattern of seven odorous compounds (o-xylene, m-xylene, p-xylene, toluene, PGMEA, ethyl acetate, and butyl cellosolve) was identical for the receptor path (OP-FTIR) and the CY stack (CC-FTIR). Thus, the CC-FTIR results from the CY stacks indicated the same odorous compounds as the receptor path (OP-FTIR), all of which came from the direction of CY. However, the grouping pattern for KS differed from that of the receptor path (OP-FTIR), with three key species in the first factor (PGMEA, ethyl acetate, and butyl cellosolve) missing in the KS stacks.
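How a dendrogram separates one compound from the rest can be sketched with a small agglomerative clustering routine. The correlations below are hypothetical values chosen for illustration, the distance is taken as 1 minus the correlation, and average linkage is assumed because the linkage method is not stated in this section.

```python
from itertools import combinations

def average_linkage(names, dist):
    """Agglomerative clustering; returns merges as (left, right, distance)."""
    clusters = [frozenset([name]) for name in names]
    merges = []

    def cluster_distance(a, b):
        # Mean pairwise distance between members of the two clusters.
        pairs = [(x, y) for x in a for y in b]
        return sum(dist[frozenset(p)] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: cluster_distance(*pair))
        merges.append((set(a), set(b), round(cluster_distance(a, b), 3)))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return merges

# Hypothetical pairwise correlations (NOT values from Table 4).
corr = {
    ("m-xylene", "p-xylene"): 0.95,
    ("toluene", "m-xylene"): 0.90,
    ("toluene", "p-xylene"): 0.85,
    ("acetone", "toluene"): 0.20,
    ("acetone", "m-xylene"): 0.15,
    ("acetone", "p-xylene"): 0.10,
}
dist = {frozenset(pair): 1.0 - r for pair, r in corr.items()}
names = ["toluene", "m-xylene", "p-xylene", "acetone"]

merges = average_linkage(names, dist)
for left, right, d in merges:
    print(sorted(left), "+", sorted(right), "at distance", d)
```

In this toy case the xylenes and toluene merge at small distances and acetone joins only at the final, most distant merge, which corresponds to being split off at the first branch when the dendrogram is read from the top.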

Table 4. Factor and cluster analyses of chemicals measured using CC-FTIR in the stacks of three companies. Italicized numbers represent factor loadings of >0.40, indicating the main species in each factor (source).

Figure 9 uses scatter plots to display concentration variations in selected contaminants over time, with the interrelationships between odorous compounds at the CY stack (CC-FTIR) and the receptor path (OP-FTIR) delineated and compared. The compounds in each pair were linearly correlated, with correlation coefficients mostly greater than 0.7. By contrast, the correlation coefficients for the KS stack were mostly below 0.1, indicating that the relationships between the ambient data and the KS source profiles were much weaker than those for CY.
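The pairwise comparison behind such scatter plots amounts to a Pearson correlation coefficient between two concentration time series. The sketch below assumes Pearson's coefficient (the section refers only to linear correlation coefficients) and uses hypothetical concentration values, not data from this study.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical paired readings (ppbv) of two compounds over six time steps.
toluene = [12.0, 18.0, 25.0, 31.0, 40.0, 22.0]
ethyl_acetate = [8.0, 11.0, 17.0, 20.0, 27.0, 15.0]

r = pearson_r(toluene, ethyl_acetate)
print(f"r = {r:.3f}", "(above 0.7)" if r > 0.7 else "(0.7 or below)")
```

Two compounds released by the same process tend to rise and fall together and give a coefficient close to 1, whereas compounds from unrelated sources give values near 0.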

Figure 9. Scatter plots of concentration variations over time between two detected contaminants from CY stacks (CC-FTIR) and receptor path (OP-FTIR). The correlation coefficients were mostly greater than 0.7.

Figure 10. Graphical abstract illustrating the concept of using a dual-optical sensing system to generate receptor and source continuous monitoring data for performing the qualitative source apportionment of factor and cluster analyses in this study.

4 Conclusion

This study developed an alternative investigative framework for detecting air pollution sources of odor nuisance by measuring 16 gas species simultaneously using FTIR spectroscopic measurements and applying factor analysis to identify and characterize emission sources of multiple air contaminants. Meteorological data and cluster analysis were employed to support the identification of the major odor emissions. Each industrial process was associated with a specific combination of pollutants, and this combination was obtained using the two statistical methods of factor and cluster analyses, which also improved the quality and completeness of the source profiles. A field study used FTIR spectroscopic measurements to determine the source of the emission of volatile organic odor species near an industrial park in southern Taiwan and demonstrated the feasibility of the proposed method. The major odor emission source was identified through qualitative source apportionment by factor and cluster analyses. With enhanced efficiency in odor investigation methodology, future emission reduction plans can be developed and overall air quality can be improved.

Data availability

The data that support the findings of this study are not publicly available because they contain information that could compromise the privacy of research participants.

Author contributions

JCY and PEC designed the experiments. JCY performed the spectra analysis of both OP-FTIR and CC-FTIR systems. JCY and CCH performed statistical modeling for factor and cluster analyses. JCY prepared the paper with contributions from all co-authors. CFW supervised the project.

Competing interests

The authors declare that they have no conflict of interest.

Acknowledgements

The authors wish to express their gratitude for the administrative support from both the Taiwan Environmental Protection Agency and Tainan City Environmental Protection Bureau.

Financial support

This open-access publication was funded by Green Energy and Environment Research Laboratories, Industrial Technology Research Institute.

Review statement

This paper was edited by Frank Hase and reviewed by two anonymous referees.

References

Capelli, L., Sironi, S., Del Rosso, R., and Guillot, J.-M.: Measuring odours in the environment vs. dispersion modelling: A review, Atmos. Environ., 79, 731–743, https://doi.org/10.1016/j.atmosenv.2013.07.029, 2013.

ChinaPatent:CN102244109B: Anti-reflection coating of crystalline silicon solar cell and preparation method, National Intellectual Property Administration, PRC, abbreviated as CNIPA, Beijing, PRC, 2013.

CRC: Coatings materials and surface coatings, CRC Press, Taylor and Francis Group, 2006.

Demirtas, H., Hedeker, D., and Mermelstein, R. J.: Simulation of massive public health data by power polynomials, Stat. Med., 31, 3337–3346, https://doi.org/10.1002/sim.5362, 2012.

Gallagher, E. D.: Statistical Definitions.pdf, EEOS 601, Prob & Stats, Handout 2, EDG Homepage, © Gallagher, E. D., 2011.

Higuchi, T.: Estimation of uncertainty in olfactometry, Water Sci. Technol., 59, 1409–1413, https://doi.org/10.2166/wst.2009.108, 2009.

Higuchi, T. and Masuda, J.: Interlaboratory comparison of olfactometry in Japan, Water Sci. Technol., 50, 147–152, 2004.

Hu, J., Jathar, S., Zhang, H., Ying, Q., Chen, S.-H., Cappa, C. D., and Kleeman, M. J.: Long-term particulate matter modeling for health effect studies in California – Part 2: Concentrations and sources of ultrafine organic aerosols, Atmos. Chem. Phys., 17, 5379–5391, https://doi.org/10.5194/acp-17-5379-2017, 2017.

Hunt, R. N. and Fuchs, P. A.: Applications in continuous monitoring of atmospheric pollutants by remote sensing, Proc. SPIE 2365, Optical Sensing for Environmental and Process Monitoring, https://doi.org/10.1117/12.210805, 1995.

Jathar, S. H., Gordon, T. D., Hennigan, C. J., Pye, H. O. T., Pouliot, G., Adams, P. J., Donahue, N. M., and Robinson, A. L.: Unspeciated organic emissions from combustion sources and their influence on the secondary organic aerosol budget in the United States, P. Natl. Acad. Sci. USA, 111, 10473–10478, https://doi.org/10.1073/pnas.1323740111, 2014.

Johnson, D.: Applied multivariate methods for data analysts, Duxbury Press, Pacific Grove, CA, USA, 1998.

Liu, W. T., Chen, S. P., Chang, C. C., Ou-Yang, C. F., Liao, W. C., Su, Y. C., Wu, Y. C., Wang, C. H., and Wang, J. L.: Assessment of carbon monoxide (CO) adjusted non-methane hydrocarbon (NMHC) emissions of a motor fleet – A long tunnel study, Atmos. Environ., 89, 403–414, https://doi.org/10.1016/j.atmosenv.2014.01.002, 2014.

Merlen, C., Verriele, M., Crunaire, S., Ricard, V., Kaluzny, P., and Locoge, N.: Quantitative or only qualitative measurements of sulfur compounds in ambient air at ppb level? Uncertainties assessment for active sampling with Tenax TA (R), Microchem. J., 132, 143–153, https://doi.org/10.1016/j.microc.2017.01.014, 2017.

Pride, K. R., Peel, J. L., Robinson, B. F., Busacker, A., Grandpre, J., Bisgard, K. M., Yip, F. Y., and Murphy, T. D.: Association of short-term exposure to ground-level ozone and respiratory outpatient clinic visits in a rural location – Sublette County, Wyoming, 2008–2011, Environ. Res., 137, 1–7, https://doi.org/10.1016/j.envres.2014.10.033, 2015.

Rencher, A. C.: Methods of Multivariate Analysis, 2nd edn., A John Wiley & Sons, INC. Publication, 380–407, 2002.

Rumsey, I. C., Aneja, V. P., and Lonneman, W. A.: Characterizing non-methane volatile organic compounds emissions from a swine concentrated animal feeding operation, Atmos. Environ., 47, 348–357, https://doi.org/10.1016/j.atmosenv.2011.10.055, 2012.

Russwurm, G. M., Kagann, R. H., Simpson, O. A., McClenny, W. A., and Herget, W. F.: Long-path FTIR Measurements of Volatile Organic Compounds in an Industrial Setting, J. Air Waste Manage., 41, 1062–1066, https://doi.org/10.1080/10473289.1991.10466900, 1991.

Sung, L. Y., Shie, R. H., and Lu, C. J.: Locating sources of hazardous gas emissions using dual pollution rose plots and open path Fourier transform infrared spectroscopy, J. Hazard. Mater., 265, 30–40, https://doi.org/10.1016/j.jhazmat.2013.11.006, 2014.

Taiwan Environmental Protection Agency: Report and statistics of public nuisance cases, 368, 74–75, 2017.

Ueno, H., Amano, S., Merecka, B., and Kosmider, J.: Difference in the odor concentrations measured by the triangle odor bag method and dynamic olfactometry, Water Sci. Technol., 59, 1339–1342, https://doi.org/10.2166/wst.2009.112, 2009.

USEPA: Quantifying the Contribution of Important Sources to Ambient VOC, PAMS Data Analysis Workbook: Source Apportionment, USEPA (U.S. Environmental Protection Agency), Research Triangle Park, NC, 2000.

USEPA: SPECIATE 4.2 – Speciation Database Development Documentation, USEPA, Research Triangle Park, NC, 2009.

USEPA: EPA Handbook: Optical Remote Sensing for Measurement and Monitoring of Emissions Flux, Research Triangle Park, NC, USEPA, 2011.

USPatent:4883521A: Method for the preparation of silica glass, United States Patent and Trademark Office, Alexandria, Virginia, 1989.

van Harreveld, A. P.: Odor concentration decay and stability in gas sampling bags, J. Air Waste Manage., 53, 51–60, 2003.