Evaluation of methods for gravity wave extraction from middle-atmospheric lidar temperature measurements

This study evaluates commonly used methods of extracting gravity-wave-induced temperature perturbations from lidar measurements. The spectral response of these methods is characterized with the help of a synthetic data set with known temperature perturbations added to a realistic background temperature profile. The simulations are carried out with the background temperature being either constant or varying in time to evaluate the sensitivity to temperature perturbations not caused by gravity waves. The different methods are applied to lidar measurements over New Zealand, and the performance of the algorithms is evaluated. We find that the Butterworth filter performs best if gravity waves over a wide range of periods are to be extracted from lidar temperature measurements. The running mean method gives good results if only gravity waves with short periods are to be analyzed.


Introduction
Atmospheric gravity waves are well known to have a strong impact on the middle-atmospheric circulation (e.g., Holton and Alexander, 2000;Fritts and Alexander, 2003).By transporting energy and momentum from the lower atmosphere into the middle atmosphere, they are responsible for the formation of the cold polar summer mesopause (e.g., Lindzen, 1981).Although some processes related to gravity waves are believed to be well understood, there are still open questions.For example, to what extent gravity wave excitation, propagation and forcing is affected by a changing climate remains an open question (cf.Fritts and Alexander, 2003;Plougonven and Zhang, 2014).
Lidar technology has been used to study gravity waves in the middle atmosphere for the last 3 decades (e.g., Chanin and Hauchecorne, 1981;Gardner et al., 1989;Wilson et al., 1991;Whiteway and Carswell, 1995;Duck et al., 2001;Rauthe et al., 2008;Yamashita et al., 2009;Alexander et al., 2011;Kaifler et al., 2015b).Hence, lidar studies can potentially be used to infer long-term trends in gravity wave activity.Furthermore, lidars have the advantage of providing measurements throughout the entire middle atmosphere with high temporal and vertical resolution of typically 1 h and 1 km.However, lidars generally provide one-dimensional profiles, and no information on the horizontal structure and the intrinsic properties of atmospheric waves can be retrieved.Exceptions are measurements from airborne lidars and multi-beam lidars.
Gravity waves are usually determined from lidar measurements by separating an estimated background temperature (density) profile from the measured profiles in order to derive temperature (density) perturbation profiles.Several methods have been developed and used over the last decades.For example Gardner et al. (1989), Rauthe et al. (2008) and Ehard et al. (2014) calculate a nightly mean profile and subtract it from the (time-resolved) individual profiles.Yamashita et al. (2009) remove a background profile determined by a temporal running mean (in addition to vertical filtering).Perturbation profiles obtained through a fit of polynomial functions to the measured profiles are examined by, e.g., Whiteway and Carswell (1995), Duck et al. (2001) and Alexander et al. (2011).Mzé et al. (2014) apply a variance method in order to determine perturbation profiles, while Chane-Ming et al. (2000) use spectral filtering.
Published by Copernicus Publications on behalf of the European Geosciences Union.

B. Ehard et al.: GW extraction from lidar measurements
All of these methods are most sensitive to different parts of the gravity wave spectrum.Thus, results from different lidar studies become hardly comparable because one cannot distinguish between variations that are caused by a different methodology and variations that are geophysically induced.Ehard et al. (2014) compared values of gravity wave potential energy density (GWPED) from different studies to their results.Due to potential methodological biases it remained unclear whether the differences were in fact of geophysical origin.Hence, they expressed the need for a standardized method to extract gravity wave amplitudes from lidar measurements.
To our knowledge, no literature is yet available which characterizes and evaluates the most commonly used methods to extract information on gravity waves from lidar profiles.Thus, we will evaluate and compare four methods in detail: subtraction of the nightly mean profile, subtraction of temporal running mean profiles, the sliding polynomial fit method proposed by Duck et al. (2001) and the application of a Butterworth filter.While the first two methods rely on filtering in time, the latter two methods apply a filter in space to determine wave-induced temperature perturbations.
This paper is structured as follows: the four methods are described in detail in Sect. 2. The performance is studied in terms of their spectral response to synthetic data in Sect.3. The results are then applied to measurement data in Sect. 4. Finally, the characteristics of the four methods as well as their suitability for extracting gravity-wave-induced temperature perturbations are discussed in Sect.5, and conclusions are drawn in Sect.6.

Methods
Lidar systems used for studies of the middle atmosphere measure the Rayleigh backscatter signal which is proportional to atmospheric density after range correction.The temperature is commonly retrieved by integration assuming hydrostatic equilibrium (Hauchecorne and Chanin, 1980).Recently Sica and Haefele (2015) proposed a temperature retrieval using optimal estimation methods.The derived temperature profiles typically range between 30 and 80-90 km altitude depending on signal-to-noise ratio.At the upper boundary, the temperature retrieval is commonly initialized with satellite data (e.g., Alexander et al., 2011) or resonance lidar measurements (e.g., Rauthe et al., 2008).
The combination with a resonance lidar system extends the altitude range of temperature measurements up to ≈ 105 km.Temperatures below 30 km altitude can be retrieved by using a stratospheric Raman channel (e.g., Alpers et al., 2004).The large altitude range allows for studies of gravity wave propagation from the troposphere to the mesosphere.Hence, we discuss the extraction of gravity waves from temperature data rather than atmospheric density, although most of the results can be applied to density measurements as well.For different methods of extracting gravity waves from density measurements see, e.g., Sica and Russell (1999), Thurairajah et al. (2010) and Mzé et al. (2014).
Lidar studies usually determine wave-induced temperature perturbations T (z, t) (which are a function of altitude z and time t) from the measured temperature profile T (z, t) by subtracting a background temperature profile T 0 (z, t): T (z, t) = T (z, t) − T 0 (z, t). (1) T 0 (z, t) ideally contains all contribution from radiative and chemical heating and other large-scale effects such as planetary waves and tides.Hence, the temperature perturbations T (z, t) should be solely caused by gravity waves.Estimation of T 0 (z, t) is challenging due to the specific shape of the temperature profile with its changes in vertical temperature gradient, e.g., at the stratopause or mesopause.
The frequency range of gravity waves which may be present in T (z, t) can be inferred from the gravity wave dispersion relation which states that the relation between the intrinsic frequency ω, the Brunt-Väisälä frequency N and the Coriolis parameter f must be fulfilled at all times.Using a typical stratospheric value of N = 0.02 s −1 and a Coriolis parameter for midlatitudes of f = 10 −4 s −1 , the intrinsic period τ = 2π ω ranges between 5 min and 17 h.It is important to note that the lidar only detects the observed period τ which can be Doppler-shifted to larger or smaller values, depending on local wind conditions.Typical vertical wavelengths of gravity waves measured by ground-based instruments vary between 1 and 17 km (see Chane-Ming et al., 2000, their Table 2).The spatial scales combined with the temporal scales define the spectral requirements on the methods of extracting gravity-wave-induced temperature perturbations.

Time-averaged background profiles
A widely applied method is the use of the nightly mean temperature profile as a background temperature profile (e.g., Gardner et al., 1989;Rauthe et al., 2008;Ehard et al., 2014).It is then assumed that the timescales of phenomena other than gravity waves affecting the temperature profile are considerably larger and the timescales of gravity waves are smaller than the measurement period, which is typically in the range of 3-12 h.
Another common method is to determine background temperature profiles by means of a running mean over a time window which is typically on the order of 3 h (e.g., Yamashita et al., 2009).Temperature variations with timescales larger than the window width are attributed to the background temperature profiles and are therefore not included in the extracted gravity wave spectrum.Duck et al. (2001) proposed a method of extracting temperature perturbations based on a sliding polynomial fit in the spatial domain.The method is sensitive to small vertical scales and ignores the temporal evolution of waves.The method is based on the assumption that temperature variations with large vertical scales can be attributed to the climatological thermal structure of the atmosphere (i.e., the different vertical temperature gradients in the troposphere, stratosphere and mesosphere), to the advection of colder or warmer air masses, or to tides and planetary waves.Only variations with a spatial scale smaller than a certain threshold are identified as gravity waves.

Sliding polynomial fit
The sliding polynomial fit method was designed to produce a background temperature profile which contains all perturbations with vertical scales larger than 15 km.For each measured temperature profile Duck et al. (2001) applied a series of overlapping cubic polynomial fits to each range gate.Each fit was applied to an altitude window with a width of L f = 25 km.A weighted average was computed to reconstruct the background temperature profile from the individual polynomial fits using the weighting function Here δ = 0.5 L f − L w , L w is the width of the weighting window, z c,i the center altitude of the individual fit and γ the e-folding width which defines how fast the weighting function decreases.Duck et al. (2001) used a weighting window length L w = L f 3 and γ = 3 km.Duck et al. (2001) smoothed the resulting background temperature profiles with a 1.5 km boxcar mean.These profiles were then subtracted from the corresponding measured temperature profiles according to Eq. (1), yielding the temperature perturbation profiles.
In this study the following set of parameters is used: a fit length L f = 20 km, a weighting window length L w = 3 km and an e-folding width γ = 9 km.These parameters are chosen because they yield the flattest spectral response for the altitude resolution used in this study (see Sect. 5.2 for further details).The boxcar smoothing proved to have a negligible effect.Hence, it is not applied in this study.

Spectral filter
Another method which can be applied to vertical profiles is spectral filtering (e.g., Chane-Ming et al., 2000).By applying a high-pass filter to individual temperature profiles, temperature perturbations can be retrieved.In order to yield perturbations caused by gravity waves, a filtering function has to be chosen which has an adequate spectral response.
In this study we use a 5th-order Butterworth high-pass filter with a cutoff wavelength λ c = 15 km and the transfer function where n is the order of the filter and λ z is the vertical wavelength.The Butterworth filter is chosen due to its flat frequency response in the passband.The filter itself is applied in Fourier space.As the Fourier transformation assumes a cyclic data set, the upper and lower end of the measured temperature profile are internally connected.This creates an artificial discontinuity which introduces a broad range of frequencies including frequency components that are in the passband of the filter.These frequency components contribute to temperature perturbations at the upper and lower end of the analyzed altitude window and thus artificially enhance gravity wave signatures.In order to mitigate this effect, the data set is mirrored at the lowest altitude bin and attached to the original data set before the filtering process.The data set can be cyclically extended without discontinuities at the lower end, where temperature perturbations are smallest and therefore artificial enhancements produce the largest relative errors.After the filtering only the original half of the resulting perturbation profile is retained.

Application to synthetic data
In order to characterize the different methods regarding their ability to extract temperature perturbations from middleatmospheric temperature profiles, we apply them to a synthetic data set with known temperature perturbations.These perturbations are added to a fixed, realistic background temperature profile T 0 (z).The latter is derived from the mean temperature profile above Lauder, New Zealand, (45.0 • S, 169.7 • E) measured with the Temperature Lidar for Middle Atmosphere research (TELMA) from July until the end of September (black line in Fig. 1a).The particular choice of the background temperature profile does not affect the results as long as the background temperature profile is realistic, is smooth and does not contain contributions from gravity waves.For example, with a climatological or a model temperature profile, similar results can be derived.Sinusoidal temperature perturbations with exponentially increasing amplitude were added to the background temperature profile according to with the amplitude A, the vertical wavelength λ z , the observed period τ , the scale height H and the lowest altitude of The temperature perturbations T added to T 0 .Temperature perturbations in both panels were constructed using Eq. ( 6) with the following set of parameters: the analyzed altitude range z 0 .An example of the perturbed background profile T s can be seen in Fig. 1a (red line), and the corresponding temperature perturbations T s in Fig. 1b.
For each method, the spectral response R m (z) was calculated from the ratio between the time averaged absolute values of the determined temperature perturbations |T m (z, t)| and the synthetic temperature perturbations A spectral response larger than 100 % indicates an overestimation of gravity wave amplitude, while a value below 100 % indicates an underestimation of gravity wave amplitude.
All simulations conducted for this study use the realistic set of parameters A = 1.2 K, H = 12 km and z 0 = 25 km.A height resolution of z = 0.1 km was used, while the altitude interval ranged from 25 to 90 km.A time interval of 8 h, corresponding to the length of an average nighttime measurement period, with a resolution of t = 0.5 h was used.For each simulation either λ z or τ was kept constant, while the other was varied.The vertical wavelength λ z was varied from 0.6 to 20 km in steps of 0.2 km, while τ was varied from 0.15 to 14.95 h in steps of 0.1 h.

Constant background temperature
As a first step, simulations were carried out with a constant background temperature profile T 0 (z).In order to reduce aliasing effects caused by even multiples of the analyzed time window (8 h), the period of simulated gravity waves was set to τ = 1.9 h while the vertical wavelength λ z was varied.The nightly mean method (Fig. 2a) and the 3 h running mean method (Fig. 2b) both exhibit an almost uniform spectral response at all altitudes and wavelengths.However, the running mean slightly overestimates the extracted temperature perturbations, which is due to the choice of a specific as well as the simulated temperature perturbations (blue line).The different methods are color-coded as follows: nightly mean -green; 3 h running mean -orange; sliding polynomial fit -red; Butterworth filter -black.Please note that the blue line in this case lies exactly underneath the green line.All simulations were carried out with τ = 1.9 h and a background temperature profile constant in time.
period of τ = 1.9 h (cf.Fig. 3e).The sliding polynomial fit method (Fig. 2c) shows a reduced spectral response for vertical wavelengths larger than ≈ 13 km.For shorter vertical wavelengths the spectral response is close to 100 % at most altitudes.Vertical wavelengths of ≈ 9 km show a slight reduction in spectral response over the entire altitude range.At the upper and lower 5 km of the analyzed altitude window, vertical wavelengths larger than 5 km are strongly damped.
The spectral response of the Butterworth filter (Fig. 2d) is very similar to the sliding polynomial fit.The main difference is that the Butterworth filter exhibits no underestimation of temperature perturbations at 9 km vertical wavelength.Figure 2e and f show mean extracted temperature perturbations.The blue line (here underneath the green line) depicts the original temperature perturbations added to the background temperature profile.As evident from Fig. 2e, the sliding polynomial fit method underestimates temperature perturbations at vertical wavelengths around 9 km.In agreement with the filter design both vertical filtering methods, the sliding polynomial fit and the Butterworth filter, show a decrease in extracted temperature perturbations for vertical wavelengths larger than 13 km.This decrease is almost linear with increasing vertical wavelength.As a consequence, amplitudes are effectively reduced by a factor of 3 at λ z = 20 km.
In the first simulation setup the vertical wavelength λ z was varied, while the period τ was kept constant.We now proceed by varying the period τ with a fixed λ z = 6 km (Fig. 3).The spectral response of the nightly mean method (Fig. 3a) is close to 100 % at all altitudes.Temperature perturbations with periods larger than 10 h are damped, and periods around 6 h are slightly underestimated.For τ = 15 h the reduction in amplitude is ≈ 20 % (green line in Fig. 3e and f).Like the nightly mean method, the 3 h running mean (Fig. 3b) exhibits a uniform spectral response at all altitudes.However, waves with periods longer than 3.5 h are strongly damped.At a period of 6 h temperature perturbations are underestimated by a factor of 2, and for τ = 2.5 h amplitudes are overestimated by ≈ 20 % (orange line in Fig. 3e and f).The spectral response of the filter for waves with shorter periods oscillates between over-and underestimation as τ approaches zero.In contrast, the sliding polynomial fit method (Fig. 3c) and the Butterworth filter (Fig. 3d) both exhibit an almost uniform spectral response for most periods.Only for very long periods do the spectral response oscillate between over-and underestimation with increasing altitude, indicating a slight phase delay between simulated and extracted temperature perturbations.This oscillation is not seen in Fig. 3e and f due to the vertical averaging over 10 km.

Varying background temperature
While in the previous section the simulated background temperature was kept constant, we now examine the influence of a time-dependent variation of the background temperature on the different methods.Slow variations of the form were added to Eq. ( 5), where α = 0.5 K h −1 is the heating/cooling rate and H 0 = 65 km is the scale height of the background temperature variation.This results in a warming of the stratosphere and a cooling of the mesosphere over time, representing a very simplified effect of a propagating planetary wave with a vertical wavelength of 60 km.All other parameters are the same as before.Filter characteristics are shown for a varying vertical wavelength in Fig. 4. Compared to the steady background simulations (e.g., Fig. 2), the nightly mean method exhibits an enhanced spectral response around 35 and 65 km altitude (Fig. 4a).From Fig. 4e it can be determined that the nightly mean method overestimates temperature perturbations by roughly 25 % between 30 and 40 km altitude.No change in spectral response is detected for the 3 h running mean method (Fig. 4b), the sliding polynomial fit method (Fig. 4c) and the Butterworth filter (Fig. 4d).
The filters exhibit similar characteristics if the gravity wave period is varied instead of the vertical wavelength.The nightly mean method (Fig. 5a) overestimates temperature perturbations in the same altitude bands as shown for the simulations with varying vertical wavelength (cf.Fig. 4a).The filter characteristics of the 3 h running mean method (Fig. 5b), the sliding polynomial fit method (Fig. 5c) and the Butterworth filter (Fig. 5d) are not affected by the varying background temperature.

Application to measurement data
Rayleigh lidar measurements at Lauder, New Zealand, (45.0 • S, 169.7 • E) were obtained with the TELMA instrument from mid-June to mid-November 2014 (Kaifler et al., 2015a).We use temperature data with a temporal resolution of 10 min and a vertical resolution of 100 m.The effective vertical resolution of the temperature data is 900 m due to smoothing of the raw data before processing.Measurement uncertainties are typically on the order of 2-3 K at 70 km altitude and generally lower than 1 K below 60 km altitude.

Case study: 23 July 2014
A detailed analysis with the four different methods of extracting temperature perturbations is shown for the data set obtained on 23 July 2014 in Fig. 6.This case was chosen because the gravity wave analysis depicts many previously noted characteristics of the four methods.
The main features of the mean temperature profile (Fig. 6b) are the stratopause between 45 and 55 km altitude with T ≈ 245 K and the temperature minimum of approximately 200 K at 73 km altitude below a mesospheric inversion layer.The time evolution of the temperature measurements (Fig. 6a) shows an increase of the temperature at the stratopause and a jump in stratopause height around 08:00 UTC.Afterwards, the stratopause descends slowly.The structure of the mesospheric inversion layer varies also over time, with the minimum temperature below the inversion layer reaching ≈ 175 K around 14:00 UTC.
The temperature perturbations as determined by the nightly mean method (Fig. 6c) exhibit a vertically broad maximum descending from about 80 km altitude down to 50 km altitude over the 12 h measurement period.Temperature perturbations within this descending maximum reach values of up to ±20 K. Below 50 km altitude temperature perturbations are generally on the order of ±5 K.
The 3 h running mean method on the other hand (Fig. 6d) shows strongly tilted patterns.Below 50 km altitude the phase lines tend to be steeper than above.The magnitude of the temperature perturbations generally increases with alti- tude from approximately ±5 K below 60 km altitude to approximately ±15 K above 60 km altitude.
The sliding polynomial fit method (Fig. 6e) and the Butterworth filter (Fig. 6f) extract almost identical patterns of temperature perturbations, with the Butterworth filter inferring slightly larger amplitudes.The phase lines in Fig. 6e and f decrease more slowly in altitude compared to the 3 h running mean method.Below 60 km altitude temperature perturbations are below ±10 K for both filters and increase to ±15 K above 60 km altitude.

Statistical performance
A quantity often used as a proxy for gravity wave activity is the GWPED per mass: where g denotes the acceleration due to gravity and c p the heat capacity of dry air under constant pressure, in addition to the previously defined variables.The mean GWPED is determined as the average over one measurement period -typically 5-12 h in our case -which is denoted by the overline in Eq. ( 9).Due to the decrease in density with altitude, GW-PED per mass increases exponentially with altitude in the case of conservative wave propagation.For a more detailed description and physical interpretation of the GWPED see, e.g., Rauthe et al. (2008) and Ehard et al. (2014).From TELMA observations above New Zealand over the period 1 July to 30 September 2014 we determined the mean GWPED per mass using the four methods of gravity wave extraction discussed in this study (Fig. 7).Relative uncertainties of the GWPED for all methods are on the order of 0.5 % in the stratosphere and increase to approximately 5 % at 80 km altitude, which is considerably smaller than the variations of the GWPED due to the geophysical variability.The absolute value of the GWPED varies by as much as 1 order of magnitude depending on which method is used.The largest relative deviations appear in the lower stratosphere between the 3 h running mean method and the Butterworth filter.Above 65 km altitude all methods produce similar results.A distinct feature of Fig. 7 is the larger growth of GWPED with altitude if the running mean method is used instead of the vertical filtering methods.Additionally, the 3 h running mean method yields the lowest GWPED values.If a 4 h running mean is used instead, the GWPED profile is shifted towards slightly larger values.Below 45 km altitude the nightly mean method produces values comparable to the sliding polynomial fit and the Butterworth filter.Above 45 km altitude the nightly mean method shows the largest values of all methods.The sliding polynomial fit and the Butterworth filter produce generally similar results, with the Butterworth filter yielding a slightly larger GWPED.Another striking feature in Fig. 7 is the enhanced GWPED below 35 km altitude which is detected by both vertical filtering methods.This enhancement is not detected by the running mean method.

Temporal filters
The nightly mean method has been applied in many studies (e.g., Gardner et al., 1989;Blum et al., 2004;Rauthe et al., 2008;Ehard et al., 2014).The major disadvantage is that a varying length of measurement periods results in a variation of the sensitivity to different timescales.This effect is clearly demonstrated in Fig. 3e, showing that gravity waves with periods larger than 10 h are significantly underestimated if an 8 h long time series is used.If the time series is shortened, the cutoff period is smaller as well (not shown) and the spectral response for long-period waves is reduced even further.Strictly speaking, this implies that gravity wave analyses of time series of different length cannot be compared.
In practice measurement periods vary typically in length between a few hours up to a whole night as weather conditions can change rapidly during an observational period.Moreover, there is a seasonal dependency because most middle-atmospheric lidars are capable of measuring in darkness only.This results in shorter measurement periods in summer and longer measurement periods in winter.Hence, the nightly mean method is sensitive to different parts of the gravity wave spectrum depending on weather conditions as well as season.For example, Rauthe et al. (2006) compared winter and summer measurements of gravity wave activity determined by the nightly mean method.They resolved gravity waves with periods of 1.5-12 h during winter and 1.5-3.5 h during summer.Hence, Rauthe et al. (2006) limited their analysis to 3-5 h long measurement periods in order to reduce the variation of the spectral response.
The use of the nightly mean method in gravity wave analysis is further complicated by the fact that there are processes besides gravity waves which occur on similar timescales.For example tides with periods of 8, 12 and 24 h are within the sensitivity range of this method.In the analysis of radar data, the removal of tidal signals is a standard procedure (e.g., Hoffmann et al., 2010).With lidar data, however, this is problematic due to generally shorter and often intermitted measurement periods.Figure 6c shows an example of a tidal signal extracted with the nightly mean method.The broad descending maximum in temperature perturbations is caused by the semidiurnal tide, which was confirmed by a composite analysis over several days (not shown).Note that the nightly mean method is not a suitable method for tidal analysis.Tidal signals are generally extracted from lidar measurements by means of the previously mentioned composite analysis (e.g., Lübken et al., 2011).
The running mean method (e.g., Yamashita et al., 2009) tries to compensate for some of the shortcomings of the nightly mean method.The spectral response is limited to timescales on the order of the window width of the running mean -which is typically 3 h -resulting in the suppression of tides and planetary waves.However, due to this limitation, only a very small part of the gravity wave spectrum is retained in the analysis (e.g., Fig. 3e).As stated previously, gravity wave periods can range from about 5 min to 17 h.Thus the limitation to short timescales excludes a major part of the gravity wave spectrum.Figure 7 shows that, as the length of the running mean window increases, the GWPED increases as well.Still, gravity waves with long periods are suppressed.Additionally, the running mean method overestimates periods slightly shorter than the chosen window width (Fig. 3e).The strongly oscillating spectral response of the running mean method for short periods (Fig. 3e) arises due to the coarse temporal resolution of 0.5 h used in the simulations, which is a typical temporal resolution of lidar measurements.If the temporal resolution of the simulations is increased, these sharp peaks for periods shorter than 1 h vanish (not shown).
The beginning and the end of the measurement period pose an additional problem for the application of the running mean method.At the beginning of the measurement period, a centered running mean of 3 h lacks the first 1.5 h of observations necessary for determining the background temperature.Thus, if in the beginning of the measurement only 1.5 h of data are available for averaging, the spectral response differs at the beginning of the measurement period compared to later times when 3 h of measurements are available.The same is true at the end of the measurement period as well as in the presence of measurement gaps.Thus, when requiring the same spectral response at all times, the "spin-up" time of the running mean method would have to be discarded.However, this would result in a significantly reduced data set be-cause one window width of data would have to be discarded from each measurement period, in addition to another window width for each measurement gap.
Note that the resolved high-frequency range of the gravity wave spectrum is limited by the sampling frequency of the lidar system which ranges typically between 10 min and 1 h, depending on lidar performance.This is a fundamental limitation to the extractable part of the gravity wave spectrum which affects all methods of extracting gravity-wave-induced temperature perturbations in the same way.The same holds true for the effective vertical resolution of the temperature profiles.

Spatial filters
Filtering in the spatial domain, by using either the sliding polynomial fit or the Butterworth filter, has the advantage that the spectral response in the time domain is independent of the length of the measurement period and the presence of measurement gaps.This makes it possible to derive temperature perturbations associated with gravity waves from observational periods which are too short to yield meaningful results if temporal filtering methods are applied.In addition, both spatial filtering methods are capable of detecting waves with periods larger than 12 h (Fig. 3c and d).One disadvantage of both spatial filtering methods is the dampening of vertical wavelengths larger than 5 km at the upper and lower edge of the analyzed altitude window due to edge effects.
The sliding polynomial fit has been applied in several studies (e.g., Duck et al., 2001;Alexander et al., 2011;Kaifler et al., 2015b).Different authors use temperature data with different altitude resolutions and slightly different parameter setups for L f , L w and γ .The fit length L f determines the cutoff wavelength of the spectral response.The weighting window length L w and the e-folding width γ must be adapted to the altitude resolution of the data used.For example, the parameter setup γ = 3 km and L w = L f /3 used by Duck et al. (2001) results in a flat spectral response for their altitude resolution of z = 2 km and fit length L f = 25 km.If a different altitude resolution is chosen, a different set of parameters is needed in order to achieve a flat spectral response in the passband.For the altitude resolution of z = 0.1 km used in this study, a flat spectral response was found for γ = 9 km and L w = 3 km.However, vertical wavelengths of ≈ 9 km are still slightly underestimated with this parameter set.The fit length of L f = 20 km was chosen following Kaifler et al. (2015b).Additional high-pass filtering, as applied by Alexander et al. (2011) or Kaifler et al. (2015b), was found to be unnecessary because the long vertical wavelengths are already strongly suppressed by the sliding polynomial fit itself.
The sliding polynomial fit method is sensitive to large changes of the temperature gradient and may falsely overestimate temperature perturbations for example in the presence of mesospheric inversion layers (not shown).The Butterworth filter tends to overestimate sudden changes in the temperature gradient of the measured temperature profile as well.However, the magnitude of the overestimation is generally lower than for the sliding polynomial fit method.Furthermore, the Butterworth filter has the advantage that it can be easily adjusted if a different cutoff wavelength is desired.

Application to measurement data
All the previously discussed characteristics influence the gravity wave spectrum which is extracted from lidar temperature measurements.This becomes visible if the mean GW-PED of a set of measurements is computed using different methods as shown in Fig. 7.The running mean method extracts only a small part of the gravity wave spectrum and thus shows the lowest GWPED values.The GWPED increases if the window width of the running mean is increased.The nightly mean method yields the largest GWPED values at higher altitudes.This can be attributed to the insufficient suppression of tides and other processes unrelated to gravity waves which happen on longer timescales.In the lower stratosphere the sliding polynomial fit method and the Butterworth filter yield the largest GWPED values.This is most likely caused by the inclusion of long-period waves such as quasi-stationary mountain waves.These waves have the largest impact on GWPED in the lower stratosphere above Lauder during winter (Kaifler et al., 2015a).Above 30 km altitude GWPED values are reduced.A possible mechanism is that mountain waves with very large amplitudes become unstable at these altitudes and break.This has for example been observed by Ehard et al. (2015), who detected a self-induced critical layer around 30 km altitude caused by a strong mountain wave event above northern Scandinavia.
The fact that the Butterworth filter exhibits a lower growth rate of GWPED compared to the running mean method (Fig. 7) may be evidence that short-period gravity waves can propagate more easily to higher altitudes than gravity waves with long periods.This complicates the comparison and interpretation of GWPED growth rates (generally expressed in terms of scale heights) of different studies.For example Rauthe et al. (2006) deduced a GWPED scale height of 9-11 km with the nightly mean method for a midlatitude site.On the other hand, Kaifler et al. (2015b) reported a GWPED scale height of approximately 7 km determined with the sliding polynomial fit method for measurements conducted at Antarctica.A large part of the difference in retrieved scale height can be attributed to different wave propagation conditions at the two sites.However, it remains an open question to what extent the results are affected by the use of different methods to extract gravity waves.
We evaluated four commonly used methods of extracting gravity-wave-induced temperature perturbations from lidar measurements.A widely used method -the nightly mean method -relies on filtering in time by subtraction of the nightly mean temperature.Thus, it is sensitive to all temperature changes occurring on the timescale of the measurement period, including temperature changes induced by planetary waves and tides.Because measurement periods can vary substantially in length and the spectral response of the nightly mean method depends on the length of the measurement period, the extracted gravity wave spectrum can vary from observation to observation.This makes the nightly mean method an improper choice for compiling gravity wave statistics if a data set with a varying length of observational periods is analyzed.
The second method which relies on filtering in time, the running mean method, provides a more stable spectral response with regard to a varying length of the measurement period.However, it extracts only a small fraction of the gravity wave spectrum, with long-period waves being strongly suppressed.Moreover, the running mean method exhibits a variation in the spectral response at the beginning and end of a measurement period as well as in the presence of measurement gaps.
The sliding polynomial fit method is not only capable of extracting waves over a broad range of temporal scales but also suppresses tides and planetary waves due to their large vertical wavelengths.In addition, it is unaffected by measurement gaps.However, the parameters used for the sliding polynomial fit need to be adjusted to the altitude resolution of the measured temperature profiles in order to provide a flat spectral response in the passband.
The Butterworth filter provides an alternative to the sliding polynomial fit method which is not only easy to implement but also easily adjustable to a desired cutoff wavelength.Also, the filter is largely independent of the altitude resolution while providing all the advantages of the sliding polynomial fit method.Furthermore, sudden changes in the background temperature gradient affect the Butterworth filter less than the sliding polynomial fit method.
Based on the results presented here, two methods are recommended for gravity wave extraction from lidar temperature measurements covering a large altitude range: the running mean method is the most suitable method if the analysis is focused on short-period gravity waves with large vertical wavelengths.On the other hand, if a broad passband is desired which covers a large part of the gravity wave spectrum, the Butterworth filter is the method of choice.Additional advantages are the insensitivity to measurement gaps, a varying length of observational periods and the altitude resolution of the measured temperature profile.
Figure 1.(a) Background temperature profile T 0 used for the simulations (black) and perturbed temperature profile T (red).(b)The temperature perturbations T added to T 0 .Temperature perturbations in both panels were constructed using Eq.(6) with the following set of parameters: t = 4 h, A = 1.2 K, λ z = 6 km, τ = 1.9 h, H = 12 km.
Figure 2 depicts the spectral response of the different methods as a function of vertical wavelength.

Figure 2 .
Figure 2. Spectral response of different methods of determining temperature perturbations as a function of vertical wavelength λ z : nightly mean (a), 3 h running mean (b), sliding polynomial fit (c) and Butterworth filter (d).(e) and (f) depict mean extracted temperature perturbations in the ranges 30-40 km (e) and 50-60 km (f) as well as the simulated temperature perturbations (blue line).The different methods are color-coded as follows: nightly mean -green; 3 h running mean -orange; sliding polynomial fit -red; Butterworth filter -black.Please note that the blue line in this case lies exactly underneath the green line.All simulations were carried out with τ = 1.9 h and a background temperature profile constant in time.

Figure 3 .
Figure 3. Same as Fig. 2 but as a function of period τ .All simulations were carried out with a fixed vertical wavelength of 6 km and a background temperature profile constant in time.Note that the blue and black lines in (e) and (f) are lying on top of each other.

Figure 4 .
Figure 4. Same as Fig. 2 but with a varying background temperature (see Sect. 3.2 for details).

Figure 5 .
Figure 5. Same as Fig. 3 but with a varying background temperature (see Sect. 3.2 for details).

Figure 6 .
Figure 6.Temperature (a), mean temperature profile (b) and derived temperature perturbations obtained by different methods (c-f) over Lauder, New Zealand, (45.0 • S, 169.7 • E) on 23 July 2014.The following methods were used for the different panels: nightly mean (c), 3 h running mean (d), sliding polynomial fit (e), and Butterworth filter (f).Time is given in UTC.

Figure 7 .
Figure 7. Mean gravity wave potential energy density (GWPED) per mass over Lauder, New Zealand, (45.0 • S, 169.7 • E) between 1 July and 30 September 2014.The methods used to determine the GWPED are color-coded.The profiles were smoothed by a vertical running mean with a window width of 3 km.