Atmospheric Measurement Techniques An interactive open-access journal of the European Geosciences Union
Volume 11, issue 2
Atmos. Meas. Tech., 11, 1233-1250, 2018
https://doi.org/10.5194/amt-11-1233-2018
© Author(s) 2018. This work is distributed under
the Creative Commons Attribution 4.0 License.

Research article 02 Mar 2018

Evaluation of linear regression techniques for atmospheric applications: the importance of appropriate weighting

Cheng Wu1,2 and Jian Zhen Yu3,4,5
  • 1Institute of Mass Spectrometer and Atmospheric Environment, Jinan University, Guangzhou 510632, China
  • 2Guangdong Provincial Engineering Research Center for On-Line Source Apportionment System of Air Pollution, Guangzhou 510632, China
  • 3Division of Environment, Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong, China
  • 4Atmospheric Research Centre, Fok Ying Tung Graduate School, Hong Kong University of Science and Technology, Nansha, China
  • 5Department of Chemistry, Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong, China

Abstract. Linear regression techniques are widely used in atmospheric science, but they are often improperly applied because measurement uncertainty is either ignored or handled inappropriately. In this work, numerical experiments are performed to evaluate the performance of five linear regression techniques, significantly extending previous works by Chu and Saylor. The five techniques are ordinary least squares (OLS), Deming regression (DR), orthogonal distance regression (ODR), weighted ODR (WODR), and York regression (YR). We first introduce a new data generation scheme that employs the Mersenne twister (MT) pseudorandom number generator. The numerical simulations are also improved by (a) refining the parameterization of nonlinear measurement uncertainties, (b) including a linear measurement uncertainty, and (c) including WODR for comparison. Results show that DR, WODR and YR produce an accurate slope, but the intercept from WODR and YR is overestimated, and the bias is more pronounced for XY datasets with a low R². The importance of a proper weighting parameter λ in DR is investigated by sensitivity tests, and it is found that an improper λ in DR can bias both the slope and the intercept estimates. Because the λ calculation depends on the actual form of the measurement error, it is essential to determine the exact form of measurement error in the XY data during the measurement stage. If the a priori error in one of the variables is unknown, or the described measurement error cannot be trusted, DR, WODR and YR can provide the smallest biases in slope and intercept among all tested regression techniques. For these reasons, DR, WODR and YR are recommended for atmospheric studies when both X and Y data have measurement errors. An Igor Pro-based program (Scatter Plot) was developed to facilitate the implementation of error-in-variables regressions.
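The contrast between OLS and an error-in-variables fit described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' Igor Pro program or their data generation scheme: the function names (`make_data`, `ols_fit`, `deming_fit`), the constant Gaussian error model, and the uniform range of the true X values are all assumptions made for illustration. The only link to the paper is that Python's standard `random` module uses the Mersenne twister as its core generator. The weighting parameter is written here as `delta`, the ratio of the Y-error variance to the X-error variance, which is the convention of the common closed-form Deming solution; the paper's own definition of λ should be checked before mapping one onto the other.

```python
import math
import random


def _moments(x, y):
    """Return means and (population) second moments of paired data."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return mx, my, sxx, syy, sxy


def ols_fit(x, y):
    """Ordinary least squares: assumes X is error-free."""
    mx, my, sxx, _, sxy = _moments(x, y)
    slope = sxy / sxx
    return slope, my - slope * mx


def deming_fit(x, y, delta):
    """Closed-form Deming regression.

    delta = var(Y error) / var(X error); this convention is an assumption
    of this sketch and may differ from the paper's lambda.
    """
    mx, my, sxx, syy, sxy = _moments(x, y)
    d = syy - delta * sxx
    slope = (d + math.sqrt(d * d + 4.0 * delta * sxy * sxy)) / (2.0 * sxy)
    return slope, my - slope * mx


def make_data(n, slope, intercept, sd_x, sd_y, seed):
    """Hypothetical test data: true line plus constant Gaussian errors in X and Y."""
    rng = random.Random(seed)  # CPython's random module is Mersenne twister based
    x_true = [rng.uniform(0.0, 10.0) for _ in range(n)]
    x_obs = [v + rng.gauss(0.0, sd_x) for v in x_true]
    y_obs = [slope * v + intercept + rng.gauss(0.0, sd_y) for v in x_true]
    return x_obs, y_obs


if __name__ == "__main__":
    x, y = make_data(n=5000, slope=2.0, intercept=1.0, sd_x=2.0, sd_y=2.0, seed=1)
    print("OLS    slope, intercept:", ols_fit(x, y))
    print("Deming slope, intercept:", deming_fit(x, y, delta=1.0))
```

With equal error variances in X and Y, `delta=1.0` is the matching weight. Passing a deliberately wrong `delta` to `deming_fit` is a quick way to see the sensitivity to weighting that the abstract reports for an improper λ.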

Short summary
A new data generation scheme that employs the Mersenne twister (MT) pseudorandom number generator is proposed to conduct benchmark tests on a variety of linear regression techniques. With an appropriate weighting, Deming regression (DR), weighted ODR (WODR), and York regression (YR) are recommended for atmospheric studies when both x and y data have measurement errors. An Igor-based program (Scatter Plot) is developed to facilitate the regression implementation.