The height of the atmospheric boundary layer or mixing layer is an
important parameter for understanding the dynamics of the atmosphere and the
dispersion of trace gases and air pollution. The height of the mixing layer
(MLH) can be retrieved, among other methods, from lidar or ceilometer
backscatter data. These instruments use the vertical backscatter lidar signal
to infer MLH
The atmospheric boundary layer is the lowest part of the atmosphere where
most of the interactions between surface and atmosphere take place. Knowledge
of the processes and mechanisms in this layer is essential in meteorology and
climate science. The height of the boundary layer, or mixing layer height
(MLH), is an important parameter; it affects, for example, near-surface air quality,
since it limits the volume of air into which pollutants are emitted, mixed
and dispersed, and is therefore crucial in modelling pollution, smog and
dispersion of greenhouse gases. Since the height of the mixed layer is
determined by surface fluxes that drive turbulent processes
Several methods exist to measure MLH. Instruments used for this include
backscatter lidar
Backscatter lidars and ceilometers are based on the same principles and use aerosol concentration as a tracer for MLH, which is possible because the main sources of aerosols are situated at the surface. The turbulent motions in the mixing layer cause the aerosol concentrations to be relatively well mixed within the layer, while exchange between mixing layer (ML) and free atmosphere (FA) is limited. As a result, higher aerosol concentrations are found within the mixing layer close to the ground and lower aerosol concentrations are found in the free atmosphere above. This difference in aerosol concentrations can be used to detect MLH.
Although the detection of the boundaries between atmospheric layers based on aerosol gradients has been used for many years, most backscatter-lidar-based techniques have difficulty tracking the MLH coherently over time and may inadvertently jump between different atmospheric layers, such as residual layers. Most algorithms search for the strongest gradient in a certain time window, to which the MLH is then assigned, and treat the next time interval independently, with jumps between layers as a result. This limits the utility of these data records for deriving statistics and climatologies, as well as for process studies and model validation/verification.
To alleviate this problem, guidance can be sought from ancillary data (e.g.
Here, we describe the new algorithm, pathfinder, which is used to track the development of
the MLH during the day, based solely on single wavelength backscatter lidar
data. The pathfinder algorithm is based on graph theory and the algorithm
published by
The most common techniques used to detect MLH from lidar measurements are
the gradient method (e.g.
These techniques evaluate the measurements of each time step individually. Different features, like the residual layer and advected aerosol layers, can cause additional strong gradients in the lidar signal besides that at the MLH. These additional gradients can be of the same order of magnitude as the backscatter gradient at the MLH or even stronger, possibly leading to an MLH estimate that alternates between these different layers. However, MLH is a relatively slowly evolving quantity, so large differences between subsequent MLH estimates can be rejected. This can be accomplished by processing information from both the spatial and temporal domain simultaneously.
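The per-time-step behaviour described here can be illustrated with a minimal sketch of a simple gradient method. This is not the implementation of any of the cited techniques: the function name `gradient_mlh` and the synthetic profiles are hypothetical, and the numbers are made up to show how two consecutive profiles can yield estimates on different layers.

```python
def gradient_mlh(rcs_profile, altitudes):
    """Return the altitude of the strongest negative vertical RCS gradient.

    Hypothetical helper: each profile is treated on its own, so nothing
    ties the result to the estimate of the previous time step.
    """
    best_gradient = 0.0
    best_altitude = None
    for k in range(len(rcs_profile) - 1):
        # Finite-difference gradient between two adjacent range gates.
        gradient = (rcs_profile[k + 1] - rcs_profile[k]) / (altitudes[k + 1] - altitudes[k])
        if gradient < best_gradient:
            best_gradient = gradient
            # Assign the gradient to the midpoint of the two gates.
            best_altitude = 0.5 * (altitudes[k] + altitudes[k + 1])
    return best_altitude


# Synthetic profiles (made-up values): a mixing layer top near 500 m and a
# residual layer top near 1500 m whose gradient is the stronger one at t1.
altitudes = [250.0, 750.0, 1250.0, 1750.0]
profile_t0 = [10.0, 4.0, 4.0, 1.0]
profile_t1 = [10.0, 6.0, 6.0, 1.0]

mlh_t0 = gradient_mlh(profile_t0, altitudes)  # strongest gradient at 500 m
mlh_t1 = gradient_mlh(profile_t1, altitudes)  # strongest gradient at 1500 m
```

In this toy case the estimate moves by 1000 m between two consecutive profiles although the layers themselves did not move, which is the kind of jump that a combined space-time treatment is meant to reject.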
Alternatively, MLH can be determined manually. Even today, this remains a powerful way of determining MLH, as the human brain can use knowledge of processes affecting ML development (e.g. time of sunrise and sunset or presence and type of clouds) to distinguish the correct MLH from other gradients. Plots of lidar RCS and gradient fields are visually inspected to determine the exact MLH. Although this method does not give a completely independent validation of the correct MLH, it does provide an opportunity to assess the performance of the pathfinder method when applied to the same lidar measurements.
Flow diagram of the pathfinder algorithm. Internal operations and variables are grouped within the blue area. Input data are shown at the top and output variables at the bottom. Auxiliary data are displayed on the right. The use of collocated meteorological data is optional
The lidar data used are the range-corrected
lidar signals (RCS). This is the
signal that results after subtraction of the background sky light and
correction of the
The pathfinder method starts with a time series of RCS data as defined in
Eq.
Overview of the translation of data into a mathematical graph and subsequent steps to determine MLH.
Although the ML can exhibit many different features, clouds, residual layers and advected air masses are always situated above or coincident with the top of the ML. In the lidar measurements these can be detected by high RCS (clouds) and gradients in the RCS (residual and advected layers). Pathfinder uses this to construct altitude restrictions before estimating MLH. These restrictions are based on cloud top height and on negative and positive gradients. An additional restriction is the local MLH climatology.
The pathfinder algorithm restricts the maximum height of the MLH to the top of the first cloud layer. Cloud detection in pathfinder is simple yet effective and is based on the magnitude of RCS, which is possible due to the high backscatter from cloud droplets compared to aerosols. For each time step, the lowest altitude at which RCS exceeds a prescribed threshold is marked as cloud base height (CBH). After that, the lowest altitude above the CBH at which RCS drops below the same threshold is marked as cloud top height (CTH). Note that CTH is an apparent cloud top height. For optically thick clouds, the lidar signal might not penetrate the complete cloud and the signal might drop below the threshold value before reaching the actual cloud top. Therefore, it is not certain that CTH marks the correct cloud top. Nevertheless, we do not limit the search range to the cloud base because, in the case of shallow cumulus clouds, the air in the cloud is part of the boundary layer. The apparent cloud top marked as CTH is always closer to the correct cloud top and MLH than the cloud base.
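The threshold-based cloud detection described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `detect_cloud`, the argument layout and the example values are hypothetical, and the threshold value used in the study is not reproduced here.

```python
def detect_cloud(rcs_profile, altitudes, threshold):
    """Return (cbh, cth) for one profile; an entry is None if it is not found.

    cbh: lowest altitude at which RCS exceeds the threshold.
    cth: lowest altitude above cbh at which RCS drops below the same
         threshold (an apparent cloud top only).
    Altitudes are assumed to be sorted from low to high.
    """
    cbh = None
    for altitude, rcs in zip(altitudes, rcs_profile):
        if cbh is None:
            if rcs > threshold:
                cbh = altitude
        elif rcs < threshold:
            return cbh, altitude
    # Either no cloud was found, or the signal never dropped below the
    # threshold again within the profile.
    return cbh, None


# Made-up profile with a cloud between 300 m and 500 m.
cbh, cth = detect_cloud([1.0, 2.0, 9.0, 8.0, 1.0],
                        [100.0, 200.0, 300.0, 400.0, 500.0],
                        threshold=5.0)
```

The returned apparent cloud top would then serve as the upper limit of the MLH search range for that time step.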
Even though RCS
From literature, the scales associated with the ML in the midlatitudes are
well known. For example, the ML in the Netherlands rarely extends above about
750 m during night-time and about 3000 m during daytime. This is
implemented in the algorithm by restricting the search range during the
morning period to that of the night period. At the onset of the convective period, a
linear increase of 2.5 ms
Main characteristics of Leosphere ALS450.
Restricting the search range to the exact altitude of a cloud top or gradient might exclude the actual MLH from the search range if the feature triggering the restriction is coincident with MLH. To prevent this, the restrictions are relaxed in altitude by 75 m. Additionally, to prevent a single noisy measurement or inhomogeneity from disturbing the algorithm, the restrictions are also relaxed in time by 2 min.
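One possible reading of this relaxation is sketched below. It rests on assumptions that the text does not spell out: restrictions are represented as one upper altitude limit per time step, relaxing in time means taking the least restrictive limit within a neighbourhood of time steps, and 2 min corresponds to four time steps of 30 s. The helper name `relax_upper_limits` is hypothetical.

```python
def relax_upper_limits(limits, dz=75.0, steps=4):
    """Relax per-time-step upper altitude limits (hypothetical helper).

    limits: upper search limit in metres for each time step.
    dz:     altitude relaxation in metres (75 m in the text).
    steps:  time relaxation in time steps; 4 steps would correspond to
            2 min if profiles are 30 s apart (assumption).
    """
    relaxed = []
    for i in range(len(limits)):
        first = max(0, i - steps)
        last = min(len(limits), i + steps + 1)
        # Least restrictive limit within the time neighbourhood, raised by dz.
        relaxed.append(max(limits[first:last]) + dz)
    return relaxed


# A single low limit (e.g. triggered by one noisy profile) is overruled by
# its neighbours, and all limits are raised by 75 m.
relaxed = relax_upper_limits([2000.0, 2000.0, 800.0, 2000.0, 2000.0], steps=1)
```

With these made-up limits, the isolated 800 m restriction no longer narrows the search range, while a restriction that persists over more time steps than the neighbourhood would still take effect.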
Settings of the algorithm related to these points are included in
Table
Tracking the evolution of MLH is essentially selecting a series of points with corresponding time and altitude from a data set, based on certain criteria – much like planning an optimal route on a map. The mathematical representation of information in graphs is an excellent tool for this purpose. To apply graphs to MLH tracking from lidar measurements, we define the following four steps. First, a graph is set up with each point in the data set as a vertex. Second, connections between these vertices are made so that any collection of connected vertices (hereafter called “path”) in the graph represents a physically possible MLH evolution. Third, costs are assigned to the connections. Finally, the graph is searched by Dijkstra's shortest path algorithm to select the optimal path following the MLH.
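The four steps can be sketched compactly on a small time-altitude matrix. The code below is a minimal illustration, not the pathfinder implementation: the function name `shortest_mlh_path`, the `max_jump` parameter and the toy cost values are assumptions, and the cost matrix is simply given rather than derived from lidar gradients.

```python
import heapq


def shortest_mlh_path(cost, start_level, max_jump=1):
    """Dijkstra search through a time-altitude cost matrix.

    cost[t][k]:  non-negative cost of a connection pointing to altitude
                 index k in time step t.
    start_level: altitude index of the starting vertex in the first time step.
    max_jump:    largest allowed change in altitude index per time step.
    Returns the altitude index selected for every time step.
    """
    n_times, n_levels = len(cost), len(cost[0])
    start = (0, start_level)
    best = {start: 0.0}
    previous = {}
    queue = [(0.0, start)]
    while queue:
        total, (t, k) = heapq.heappop(queue)
        if total > best[(t, k)]:
            continue  # stale queue entry
        if t == n_times - 1:
            # First settled vertex in the last time step ends the cheapest path.
            path = [k]
            vertex = (t, k)
            while vertex in previous:
                vertex = previous[vertex]
                path.append(vertex[1])
            return path[::-1]
        # Connections only lead to nearby altitudes in the next time step.
        for k_next in range(max(0, k - max_jump), min(n_levels, k + max_jump + 1)):
            neighbour = (t + 1, k_next)
            candidate = total + cost[t + 1][k_next]
            if candidate < best.get(neighbour, float("inf")):
                best[neighbour] = candidate
                previous[neighbour] = (t, k)
                heapq.heappush(queue, (candidate, neighbour))
    return []


# Toy costs: a slowly rising low-cost feature (levels 0, 1, 2) and a
# persistent, even cheaper feature at level 3 that is far from the start.
toy_cost = [
    [1.0, 1.0, 1.0, 1.0],
    [0.2, 1.0, 1.0, 0.1],
    [1.0, 0.2, 1.0, 0.1],
    [1.0, 1.0, 0.2, 0.1],
]
path = shortest_mlh_path(toy_cost, start_level=0)
```

In this toy example the selected path follows the rising feature through levels 0, 0, 1 and 2. Reaching the cheaper feature at level 3 would require passing through several high-cost vertices, so the total cost of that route is higher.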
In the first step, a graph is created following the structure of the
corresponding data set. Every point in the data set is translated into a
vertex, regardless of the actual values in the data set. For a data set with
Overview of pathfinder parameters applied to Leosphere ALS450 lidar data.
To restrict possible paths in the graph to a physically sensible MLH
evolution, connections between vertices represent specific conditions. The
first condition is that vertices only connect to vertices of the next
time step. Connections to vertices of the same time step, previous time steps or
more than one time step away are not allowed. These excluded connections are
represented by the red arrows in Fig.
In the third step, costs are assigned to the connections in the graph to
determine which path represents the MLH from the collection created in the
previous steps. For pathfinder, costs
Connections pointing to a certain vertex are assigned a cost corresponding to
the point in RCS
In the final step, Dijkstra's shortest path algorithm is applied to select
the optimal MLH path. This method efficiently determines the path with the
lowest total cost originating from a specific vertex in the first time step to
one of the vertices in the last time step satisfying the above-mentioned
conditions (Fig.
Theoretically, any data set of arbitrary length can be translated into a graph
and analysed simultaneously. However, even with the relative efficiency of
Dijkstra's algorithm, this is impractical and would lead to long processing
times. Therefore, it was decided to split data into multiple time windows and
apply the method to these windows separately. This improves the processing
times substantially, making the method available for near-real-time MLH
tracking. Whereas computational cost determines the upper limit of the window
size, the lowest limit is determined by the typical timescale within the ML.
For the results shown in Sect.
In the final stage of the pathfinder algorithm, a quality flag is added to
each point of the MLH estimates. The quality criterion is based on the ratio
Since we expect RCS to be smaller above MLH than below it,
The instruments for this study were located at the Cabauw Experimental Site
for Atmospheric Research (CESAR;
For the development of pathfinder, a continuous data set was needed from a
backscatter lidar with sufficient signal to noise. This was provided by the
ALS450 described in Sect.
The main instrument used for the development and testing in this study was a
single wavelength backscatter lidar, the Leosphere ALS450, operating at
355 nm. This particular instrument also measures depolarisation
The atmospheric clear-air attenuation is not entirely negligible at 355 nm and adds a negative contribution to the gradient. However, we neglect this contribution, as it is very smooth compared to the gradients from the aerosol backscatter.
The ALS450 has been operational at Cabauw since 2007. Unfortunately, significant gaps exist in the instrument's data record due to frequent instrument failure. The continuity of data coverage was best in 2010, which is why this period was selected, providing a full annual cycle to be studied.
Also installed at Cabauw is an Impulsphysik LD40 ceilometer, which is part of
the Dutch ceilometer network, from which MLH is routinely derived
The wind profiler/RASS (Radio Acoustic Sounding System) is a clear-air radar
measuring Doppler shift to collect information on the vertical wind profile.
The profiler at Cabauw operates at 1290 MHz. To measure the wind profile it
executes a measuring cycle consisting of measurements in four different
oblique directions (15.5
Radiosonde data from Vaisala RS92-GDP measuring profiles of relative
humidity, temperature, pressure and wind are used from the IMPACT campaign
Several methods exist to calculate MLH from the radiosonde measurements
(MLH
The performance of pathfinder is demonstrated in a couple of case studies:
under clear-sky conditions in Sect.
Furthermore, the same data set is also processed by the STRAT2D method
The following data are all based on measurements of the Leosphere ALS450 at
Cabauw, for which the pathfinder tuning parameter values are listed in
Table
Highlights of the MLH evolution at Cabauw on 20 May 2010. See text in Sect.
The first case study is the evolution of MLH
From sunrise it takes several hours before the surface temperature is high
enough to produce convection visible in lidar observations. With the complete
ML below the lidar detection range
The greatest challenge in deriving MLH from lidar measurements is to separate
the gradient associated with the ML from the gradients in the residual layer.
The pathfinder method will ignore additional layers when these are relatively
far from the MLH. However, when an additional strong gradient exists close to
the MLH, it might be included in the MLH estimate. The algorithm's decision
depends on the proximity of the gradients and the ratio of their relative
strength. For the solution to shift to a different layer, it has to pass through
several points with weak gradients, for which it receives a penalty in the form
of a higher path cost. A relatively strong gradient on a residual layer can
outweigh this penalty and still cause a shift in the path with the lowest
total path cost. Figure
Although individual rising thermals are also visible during morning growth,
the differences between them are even more pronounced at midday when the height
of the ML is more or less stable. As shown in
Fig.
Even though a clear-sky day gives a good insight into different ML features,
completely cloud-free days are scarce in the Netherlands. The presence of
clouds influences the evolution of the ML (e.g. by blocking incoming solar
radiation) and causes additional gradients which can distract the algorithm
from the correct solution. As an example of this, the next case study treats
a day with abundant fair-weather cumulus clouds. As can be seen in
Fig.
Lidar RCS together with pathfinder MLH estimate on 11 April 2010. A day with cumulus cloud forming on top of the ML with an additional stratocumulus cloud layer above.
Around sunrise and sunset, the two cloud layers are well separated. Pathfinder correctly assigns MLH to the top of the lowest cloud layer. This is mainly due to the guiding restriction that excludes measurements above the first cloud layer. However, this is a broken cloud deck and the guiding restrictions cannot exclude the stratocumulus clouds for all time steps. It is the combination of the guiding restriction and the limited vertical search range that ensures correct tracking of the MLH even for broken cloud layers.
With the growth of the ML, the distance between the two cloud layers decreases up to a point where the two can no longer be distinguished when the signal extinction is too strong in the cumulus layer. During these periods, the solution tracks the top of the stratocumulus as MLH, leading to short peaks in MLH, e.g. around 10:00 UTC.
Highlights of the MLH estimate found by the pathfinder algorithm. See text in Sect.
The limited vertical search range causes MLH to be assigned to a cloud layer,
which is beneficial for tracking broken cloud decks. However, an example of the solution
tracking the wrong cloud layer can be seen in Fig.
The last two cases consider the ML during precipitation events. An example of
a day with (heavy) precipitation is 8 June 2010, seen in
Fig.
Lidar RCS together with pathfinder MLH estimate on 8 June 2010. A day with varying cloud types and changing precipitation rates, including some showers between 06:00 and 09:00, 15:00 and 16:00 and at 19:00 UTC.
Another day with frequent precipitation is 20 June 2010, seen in
Fig.
The period between 14:00 and 15:00 UTC is an example of precipitation reaching the overlap region. Because the raindrops evaporate on their way down, the droplet concentration and accompanying backscatter are highest near the cloud base. Consequently, the RCS increases with altitude and no negative gradient is found until the apparent cloud top is reached. This altitude is then indicated as MLH.
Lidar RCS together with pathfinder MLH estimate on 20 June 2010. A day with a completely overcast sky and precipitation falling from the cloud but (almost) completely evaporating before reaching the surface.
Precipitation evaporating well above the overlap region can be seen in
Fig.
To compare the lidar methods, pathfinder and STRAT2D were applied to the same
data and compared to radiosonde and wind profiler MLH retrievals. We used the
observations from a 12-day period in May 2008, obtained during the IMPACT
campaign at Cabauw. Radiosonde observations were taken around 05:00, 10:00 and
16:00 UTC each day. The Richardson bulk method was used to estimate MLH
Highlights of the MLH estimate found by the pathfinder algorithm on 20 June 2010. See text in
Sect.
The MLH estimates from the wind profiler and pathfinder algorithm are in excellent
agreement, with 90 % of the pathfinder MLH estimates falling within a
range of 250 m of the wind profiler values. Pathfinder mean bias compared to
wind profiler MLH estimates is as small as
STRAT2D and pathfinder differ in a number of ways, including the way in which the
layers are detected; pathfinder uses a simple gradient method, whereas
STRAT2D is based on a wavelet transform
For large parts of the 12-day period, the estimates by STRAT2D and pathfinder
agree well with each other and with the results of the wind profiler. However, for STRAT2D
large differences occur between subsequent time steps when irregularities are
found within the ML. This leads to unrealistic, erratic estimates of
MLH
Overview of MLH retrievals from pathfinder MLH
Agreement between radiosonde and pathfinder strongly depends on the time of
day. A good correlation (
To quantify the performance of pathfinder for a wide range of atmospheric
conditions, it has to be applied to longer time series and validated against
different methods, preferably based on multiple instruments. Continuous,
collocated measurements needed for this comparison are scarce though, so the
12:00 UTC radiosonde observations from De Bilt were considered. However, a
check of the manually derived MLH estimates for Cabauw against MLH
MLH
Scatter plots of
Pathfinder limits the search for MLH to altitudes near previously found MLH
estimates, which gives it its improved tracking behaviour. However, the
method may be sensitive to initialisation parameters. Two parameters in
particular that may have an effect on the retrieved MLH
A change in the position of the first time step will not change the size of the search range, but shifts it along the time axis together with all other time windows of that day. As a consequence, the first MLH candidate found may be different from the one found in the default run, and this may influence the subsequent MLH estimates since they are bound to the previous MLH estimates. The default time window is 15 min, i.e. 30 time steps for the ALS450. In the test, the time window was shifted forward and backward by 1 to 30 time steps, leading to a total of one base plus 60 sensitivity runs for each day considered. The analysis was applied to the observations of May 2010. Overall, very high agreement between base and sensitivity runs is found. When the results of a single sensitivity run are compared with the base case, exact agreement is found for at least 93.1 % of all individual time steps of the complete month. The accompanying mean bias is as low as 4.15 m and the maximum monthly mean RMSE is 17 m. As expected, agreement deteriorates when the start and end points of the time windows move further away from their original position in the base run. The lowest agreement is found for a forward shift of 17 time steps.
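The three statistics quoted here (fraction of exactly agreeing time steps, mean bias and RMSE) can be computed as in the sketch below. It is a generic illustration rather than the code used in the study: the function name `compare_runs` is hypothetical, the bias is assumed to be defined as sensitivity run minus base run, and missing estimates are assumed to be stored as `None`.

```python
import math


def compare_runs(base, sensitivity):
    """Compare two MLH time series given in metres.

    Returns (exact_fraction, mean_bias, rmse), using only time steps for
    which both runs provide an estimate. The bias is sensitivity minus
    base (assumed sign convention).
    """
    differences = [s - b for b, s in zip(base, sensitivity)
                   if b is not None and s is not None]
    n = len(differences)
    exact_fraction = sum(1 for d in differences if d == 0) / n
    mean_bias = sum(differences) / n
    rmse = math.sqrt(sum(d * d for d in differences) / n)
    return exact_fraction, mean_bias, rmse


# Made-up example: the runs differ by one 75 m range step in one profile.
exact, bias, rmse = compare_runs([500.0, 575.0, 650.0, 725.0],
                                 [500.0, 575.0, 650.0, 800.0])
```

For this made-up series, three of four time steps agree exactly, and the single 75 m difference gives a mean bias of 18.75 m and an RMSE of 37.5 m.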
Within a time window, the vertical search range increases by 75 m both upward and downward each time step with our current settings. If the window size is increased, the search range consequently expands and more measurements are included in the search.
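The growth of the search range within a window can be made explicit with a short sketch. It assumes that the 75 m per time step follows from a maximum growth rate of 2.5 m s⁻¹ and a profile spacing of 30 s (15 min divided by 30 time steps); both the function name `search_range` and the clipping at the surface are illustrative choices, not taken from the study.

```python
def search_range(start_mlh, step_index, growth_rate=2.5, dt=30.0):
    """Altitude interval (m) reachable `step_index` time steps into a window.

    start_mlh:   MLH in metres at the start of the window.
    growth_rate: assumed maximum MLH change in m/s.
    dt:          assumed time between profiles in s.
    The lower bound is clipped at the surface (illustrative choice).
    """
    reach = growth_rate * dt * step_index
    return max(0.0, start_mlh - reach), start_mlh + reach


one_step = search_range(1000.0, 1)       # 75 m up and down after one step
full_window = search_range(1000.0, 30)   # after a default 30-step window
```

Under these assumptions, a default window of 30 time steps lets the search range grow to 2250 m on either side of the starting altitude, which illustrates why larger windows include many more measurements in the search.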
Next to the default time window of 30 time steps, calculations are made for
window sizes between 10 and 70 time steps with increments of 10 time steps.
The ALS450 observations from May 2010 were used again. The algorithm again
showed stable behaviour: the exact same solution is found for between
95.3 and 96.6 % of the individual time steps in the month when using
time windows between 20 and 70 time steps. Corresponding mean bias ranges
between
All sensitivity runs with a larger time window showed their lowest single-day correlation on 27 May. Apart from some exceptions, this was due to the period between 14:00 and 16:00 UTC when a residual layer with clouds was in close proximity to the MLH. For window sizes larger than 30 time steps, the solution jumped from the MLH to a residual layer and followed this layer between 14:00 and 16:00 UTC. Runs with a smaller window size tracked MLH during the whole period. Pathfinder has to include several high-cost time steps to allow for a jump to another layer. For relatively small time windows, there are not enough time steps left to compensate for this extra cost and the jump is rejected as a solution. For larger time windows, this might not be the case and the solution can jump to another layer more easily. This divergence continues as long as the tracked feature (e.g. residual or cloud layer) exists or until the guiding restrictions pick up gradients that force the calculations back to the correct solution.
Therefore, as long as the time window is large enough to capture the rising thermals, pathfinder produces similar MLH estimates during a day irrespective of the time window settings. Although solutions can diverge for short periods, the guiding restrictions force the calculations back to the correct solution.
The clear-sky cases show that the guiding restrictions successfully exclude large parts of the additional gradients of the residual layer. This avoids the need to apply strong smoothing filters or averaging and allows the shortest path algorithm to determine the evolution of the MLH at the native resolution of the underlying data. If the guiding restrictions do not exclude all additional gradients, a jump to another layer of high gradients is most often prevented by the shortest path algorithm itself. For a transition, several measurement points with low gradients have to be included in the path, increasing the total cost of the jump. A transition cannot always be prevented if the additional gradients are strong enough to compensate for the extra cost. Here, further study would be needed on how to tune the algorithm settings. Also, atmospheric conditions that differ from the Dutch conditions may require different settings.
Clouds can be a good indicator of MLH, but the high backscatter from cloud droplets causes a negative gradient typically an order of magnitude larger than gradients associated with differences in aerosol concentration. Because of these strong gradients, the solution found by the algorithm is drawn to cloud tops. In the case of a cloud layer above the ML, guidance is needed to restrict the algorithm to the MLH, for instance by a more rigorous cloud screening prior to the pathfinder analysis.
During rain MLH
Because of the high temporal resolution of the lidar observations, changes in MLH by individual thermals are dominant. The typical spatial scale of the thermals is of the order of several hundreds of metres up to a kilometre. To compare results to other instruments with a similar time resolution, their collocation should be better than this typical scale.
Since the MLH at night in the Netherlands is often below the minimum overlap region of the ALS450 used in this study, no attempts were made to derive the height of the nocturnal boundary layer. Nocturnal boundary layers could be tracked if the lidar profiles started at appropriately low heights.
Whereas pathfinder was developed specifically for MLH retrieval from
stand-alone single wavelength backscatter lidar data, the method may be
generalised to be applied to data from other profiling instruments for MLH
retrieval. Moreover, pathfinder may be embedded in a chain of multiple
algorithms, such as cloud screening. One such example has recently been
described by
A common feature in the existing methods used to derive MLH from backscatter lidar
measurements based on gradient detection in aerosol loading is that MLH is
derived for each time step individually, which allows for large jumps in the
MLH between subsequent time steps. These jumps are not consistent with the
inherent gradual evolution of the layer, which usually makes it easy for a
somewhat trained individual to visually recognise the MLH development in
lidar RCS plots. To account for this, a new method called “pathfinder” was proposed. This method evaluates multiple time steps within a
configurable time window simultaneously. Graph theory is applied together
with Dijkstra's shortest path algorithm
The pathfinder algorithm stores a full day of lidar measurements arranged in
a time–altitude matrix and subsequently divides the matrix into time windows
of 15 min. These 15 min blocks are translated into graphs in which
each individual data point represents a vertex. To estimate MLH, exactly one
altitude has to be selected in each time step. For the selection, a certain
cost is assigned to each vertex, which is inversely proportional to
the gradient at the point in the graph. This way, the path with the lowest
total cost will contain the maximum sum of strong gradients and will be a
good estimate for the MLH. To mimic the gradual evolution of the MLH, the
distance between subsequent points is restricted. The threshold used for this
is a maximum growth rate of 2.5 ms
Pathfinder was applied to data from a Leosphere ALS450 deployed at the Cabauw Experimental Site (CESAR) in the Netherlands. The results were checked against MLH estimates obtained from independent observations, such as those from a wind profiler and radiosondes. Excellent agreement was found between MLH estimates of the pathfinder method and those from the wind profiler during a 12-day period (IMPACT campaign, May 2008). The comparison with collocated radiosonde data was more problematic, which we believe is due to limitations in the Richardson bulk method. Pathfinder results were also checked against manual/visual MLH retrieval applied to the same data, as well as against the results from a different algorithm, STRAT2D, applied to the same data.
In this study, pathfinder gives less scatter than STRAT2D in the comparison of a full-year analysis with manual MLH retrievals. This is due to the jumps between layers present in the STRAT2D estimates.
The pathfinder method can be used operationally on stand-alone single wavelength backscatter lidar data, provided the signal-to-noise ratio is sufficient to detect aerosol layers up to a few kilometres above ground. The typical computation time is less than 5 min for observations from a full day, based on the Leosphere ALS450 data set. An application of pathfinder to other lidar instruments, such as the Lufft CHM15k, which is now being deployed in the Dutch operational observation network, is currently under investigation.
The lidar data used in this study are publicly available from the CESAR database
The authors declare that they have no conflict of interest.
The authors thank the STRAT development team for making the STRAT2D code available, for their support and fruitful discussions. The authors also thank the reviewers for their positive comments and feedback, which made it possible to improve and clarify the paper.
Edited by: L. Bianco
Reviewed by: W. Angevine and one anonymous referee