Atmospheric Measurement Techniques An interactive open-access journal of the European Geosciences Union
Volume 11, issue 8
Atmos. Meas. Tech., 11, 4929-4942, 2018
© Author(s) 2018. This work is distributed under
the Creative Commons Attribution 4.0 License.

Research article 30 Aug 2018

Evaluation of a hierarchical agglomerative clustering method applied to WIBS laboratory data for improved discrimination of biological particles by comparing data preparation techniques

Nicole J. Savage^1,a and J. Alex Huffman^1
  • ^1 University of Denver, Department of Chemistry and Biochemistry, Denver, USA
  • ^a now at: Aerosol Devices, Inc., Fort Collins, Colorado, USA

Abstract. Hierarchical agglomerative clustering (HAC) analysis has been successfully applied to several sets of ambient data (e.g., Crawford et al., 2015; Robinson et al., 2013) and to standardized particles in the laboratory environment (Ruske et al., 2017, 2018). Here we show for the first time a systematic application of HAC to a comprehensive set of laboratory data collected for many individual particle types using the wideband integrated bioaerosol sensor (WIBS-4A) (Savage et al., 2017). The impact of the ratio of particle concentrations on HAC results was investigated, showing that clustering quality can vary dramatically as a function of that ratio. Six strategies for particle preprocessing were also compared; using raw fluorescence intensity (without normalizing to particle size) and logarithmically transforming data values (scenario B) consistently produced the highest-quality results for the particle types analyzed. A total of 23 one-to-one matchups of individual particle types were investigated. Results showed a cluster misclassification of <15% for 12 of 17 numerical experiments using one biological and one nonbiological particle type each. Inputting fluorescence data using a baseline +3σ threshold produced a lower degree of misclassification than inputting either all particles (without a fluorescence threshold) or a baseline +9σ threshold. Lastly, six numerical simulations of mixtures of four to seven components were analyzed using HAC. These results show that 12%–24% of fungal clusters were consistently misclassified upon inclusion of a mixture of nonbiological materials, whereas bacteria and diesel soot could each be separated with nearly 100% efficiency.
The study gives significant support to the clustering analysis commonly applied to data from commercial ultraviolet laser/light-induced fluorescence (UV-LIF) instruments used for bioaerosol research across the globe, and it provides practical tools that will improve clustering results in scientific studies across diverse research disciplines.
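
The preprocessing and clustering steps named in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the linkage method, the scaling, or the exact definition of misclassification, so Ward linkage, z-scoring, the "mean plus n standard deviations of a background measurement" reading of the baseline threshold, and the majority-vote misclassification measure are all assumptions. The function names and the synthetic three-channel intensities are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage


def fluorescence_threshold(baseline, n_sigma=3.0):
    """Per-channel threshold: baseline mean + n_sigma * baseline std (assumed definition)."""
    baseline = np.asarray(baseline, dtype=float)
    return baseline.mean(axis=0) + n_sigma * baseline.std(axis=0)


def keep_fluorescent(intensity, threshold):
    """True for particles that exceed the threshold in at least one channel."""
    return (np.asarray(intensity, dtype=float) > threshold).any(axis=1)


def cluster_log_hac(intensity, n_clusters=2):
    """Log-transform raw intensities, z-score them, and cut a Ward tree."""
    # Clip at 1 so the logarithm is defined for zero or negative readings.
    x = np.log10(np.clip(np.asarray(intensity, dtype=float), 1.0, None))
    x = (x - x.mean(axis=0)) / x.std(axis=0)  # z-scoring is an assumption
    tree = linkage(x, method="ward")          # Ward linkage is an assumption
    return fcluster(tree, t=n_clusters, criterion="maxclust")


def misclassified_fraction(labels, truth):
    """Fraction of particles that differ from the majority type of their cluster."""
    labels, truth = np.asarray(labels), np.asarray(truth)
    wrong = 0
    for cluster_id in np.unique(labels):
        members = truth[labels == cluster_id]
        _, counts = np.unique(members, return_counts=True)
        wrong += members.size - counts.max()
    return wrong / truth.size


def demo_matchup(seed=0, n=200, n_sigma=3.0):
    """One-to-one matchup of two synthetic particle types (made-up intensities)."""
    rng = np.random.default_rng(seed)
    baseline = rng.normal(10.0, 2.0, size=(500, 3))          # synthetic background
    bio = rng.lognormal(np.log(1500.0), 0.3, size=(n, 3))    # strongly fluorescent
    nonbio = rng.lognormal(np.log(60.0), 0.3, size=(n, 3))   # weakly fluorescent
    intensity = np.vstack([bio, nonbio])
    truth = np.repeat([0, 1], n)
    keep = keep_fluorescent(intensity, fluorescence_threshold(baseline, n_sigma))
    labels = cluster_log_hac(intensity[keep], n_clusters=2)
    return misclassified_fraction(labels, truth[keep])
```

Calling `demo_matchup()` returns the misclassified fraction for the synthetic pair. Changing `n_sigma` from 3 to 9, or skipping `keep_fluorescent` altogether, mimics the three thresholding choices compared in the abstract, although the synthetic numbers say nothing about real WIBS-4A data.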

Short summary
We show the systematic application of hierarchical agglomerative clustering (HAC) to comprehensive bioaerosol and non-bioaerosol laboratory data collected with the wideband integrated bioaerosol sensor (WIBS-4A). This study investigated various input conditions and used individual matchups and computational mixtures of particles; it will help improve clustering results applied to data from the ultraviolet laser and light-induced fluorescence instruments commonly used for bioaerosol research.