Visible–shortwave infrared imaging spectroscopy provides valuable remote
measurements of Earth's surface and atmospheric properties. These
measurements generally rely on inversions of computationally intensive
radiative transfer models (RTMs). RTMs' computational expense makes them
difficult to use with high-volume imaging spectrometers, and forces
approximations such as lookup table interpolation and surface–atmosphere
decoupling. These compromises limit the accuracy and flexibility of the
remote retrieval; dramatic speed improvements in radiative transfer models
could significantly improve the utility and interpretability of remote
spectroscopy for Earth science. This study demonstrates that nonparametric
function approximation with neural networks can replicate radiative transfer
calculations and generate accurate radiance spectra at multiple wavelengths
over a diverse range of surface and atmosphere state parameters. We also
demonstrate such models can act as surrogate forward models for atmospheric
correction procedures. Incorporating physical knowledge into the network
structure provides improved interpretability and model efficiency. We
evaluate the approach in atmospheric correction of data from the PRISM
airborne imaging spectrometer, and demonstrate accurate emulation of
radiative transfer calculations, which run several orders of magnitude faster
than first-principles models. These results are particularly amenable to
iterative spectrum fitting approaches, providing analytical benefits
including statistically rigorous treatment of uncertainty and the potential
to recover information on spectrally broad signals.
The author's copyright for this publication is transferred to the Jet Propulsion Laboratory, California Institute of Technology.
Introduction
Remote visible–shortwave infrared (VSWIR) imaging spectroscopy, also known
as hyperspectral imaging, is a powerful approach to map the composition,
health, and biodiversity of Earth's ecosystems . Remote
sensing of the solar-reflected spectrum from 380 to 2500 nm reveals
physics and chemistry of many processes in Earth's surface–atmosphere system
, including terrestrial plant health and traits
; biodiversity ; the condition and
composition of aquatic, benthic, and nearshore ecosystems
; geology ; and trace
greenhouse gases . While Earth scientists aim to
measure these geophysical variables, remote sensors can only measure the
incident light at the sensor. Inferring geophysical properties requires
inverting the measurement with a physical model – typically one that
accounts for both absorption and scattering by the atmosphere, and the
fraction of light reflected from the surface at each wavelength
.
Radiative transfer models (RTMs) such as DISORT are a
critical component of such models, and form the core of common spectroscopy
analysis codes including ACORN , ATCOR ,
FLAASH , ATREM , and others
. The RTM posits a stratified atmosphere
populated by atmospheric gases at appropriate concentrations and
temperatures, and solves the general equations of radiative transport based
on a known solar input and presumed surface. This is an intensive
computation, requiring special care for modern high-volume imaging
spectrometers that acquire thousands or millions of spectra per second.
Because imaging spectrometers produce too much data to analyze each
measurement with an independent RTM, investigators use RTMs to pre-calculate
lookup tables of atmospheric optical properties such as scattered path
radiance or transmission for atmospheric states appropriate to the conditions
observed at image acquisition. At runtime, the model inversion estimates the
actual state from the radiance spectrum and interpolates within the lookup
table to find the associated optical properties. This informs parametric
approximations of atmospheric transport, such as the formulation by
, permitting algebraic solutions for the remaining
unknowns like surface reflectance. The sequential retrieval of atmospheric
and surface properties, a process known as atmospheric correction,
obtains a self-consistent but approximate explanation for the surface and
atmosphere system.
The lookup table solution is adequate for many needs, but imposes several
limitations. First, lookup tables can only model a few degrees of freedom in
an atmospheric state due to the “curse of dimensionality;” the number of
samples necessary to adequately represent the state space increases
exponentially with the number of input variables. To increase the fidelity of
grid samples in high dimensions, designers can leverage representative
sampling or hyperparameter search strategies such as within
the state space, or space-filling sampling methods like Latin hypercube
sampling or lattice regression methods .
However, such techniques are restricted by prohibitive computation and
storage requirements for highly dimensional state spaces, and incur increased
risks of interpolation inaccuracy. Also, because the contents of
precalculated lookup tables capture atmospheric optical properties
independently from the surface, lookup-table-based approaches preclude strong
coupling between the two. Speeding RTMs to the point at which they could run
many times faster for each spectrum would obviate the lookup table compromise
and enable more flexible, accurate, and statistically rigorous inversion
algorithms such as the optimal estimation approach used in many atmospheric
sounding missions .
Recent work suggested the use of nonparametric function approximators such as
neural networks or Gaussian
processes for this purpose. Investigators can train such
models using prior runs of radiative transfer models over relevant ranges of
surface and atmospheric conditions. After learning the underlying function
with sufficient accuracy, the trained model could act as an
instrument-specific RTM that would not have to solve the underlying
differential equations. Alternative formulations such as
and provide empirical validation of RTM assumptions
by evaluating atmospheric, transmittance, and surface interactions captured
in separate models, while other methods (e.g.,
) permit retrieval of atmospheric or
radiometric parameters based on models constructed using outputs generated by
first-principles RTMs that span multiple wavelengths. However, to date,
techniques designed to retrieve surface reflectance using learned RTM
emulators have only been demonstrated on a small number of cases with limited
surfaces and atmospheres , and
not across the VSWIR range with state vector flexibilities that would permit
a functionally useful alternative for existing atmospheric correction
routines (e.g., as a surrogate forward model). To our knowledge, this work
represents the first demonstration on real imaging spectrometer data using a nonparametric function approximation to emulate
the radiative transfer function
F(x) such that the resulting emulator can act as a forward model in an atmospheric correction
procedure, permitting surface reflectance retrievals over the entire VSWIR range under diverse imaging conditions.
This study demonstrates an accurate neural network model deployed as part of
an iterative model inversion, showing that emulation is a practical solution
for operational atmospheric correction of imaging spectroscopy data. This
opens new possible avenues of research, for both the inversion algorithm
itself (to explore further expansions of the state vector beyond the
traditional retrieved variables) and downstream analyses (to exploit the
benefits of new retrieval methods that do not require lookup tables). We
begin by describing the neural network architecture and RTM emulation
methodology, including several novel advances: an analytical decomposition of the radiative transfer function F(x) into
quantities that are individually easier to model, channelwise, monochromatic subnetworks to simplify training and prediction, and weight propagation to account for correlation between adjacent channels and
to reduce training time. We also describe an approach to partition the state
space in a manner that guides each subnetwork to generate accurate
predictions for states within the bounds of the state space. We evaluate our
approach in a case study focusing on atmospheric correction for the PRISM
imaging spectrometer, and demonstrate high-quality surface reflectance
retrievals using the optimal estimation approach of
equipped with our neural RTM as the forward model. The retrievals capture
subtle atmospheric variability such as view dependence of Rayleigh scattering
not typically handled in conventional atmospheric correction codes. Finally,
we describe paths for future development of neural network RTM emulation
technology.
Neural networks for radiative transfer modeling
Our goal is to construct a model that emulates a first-principles RTM using
precalculated outputs generated by that RTM for a representative set of
atmospheric, geometric, and surface states. More formally, we aim to model
the RTM function F(x)→y that maps a set of
m distinct state parameters {pj}j=1m with values captured in a
state vector x∈Rm to a vector y∈Rn of observed at-sensor radiances for k channels centered at
wavelengths λ={λ1,…,λk}. We use italics to signify
scalar values, boldface italics to signify vectors, and boldface capital letters to signify matrices.
We exploit two features of the problem to simplify F(x). First, we
leverage the fact that the observed radiance at any given channel is fully
specified by the observation geometry, atmospheric state, and the surface
reflectance in that channel. In statistical terms, absent any prior
distribution that couples neighboring wavelengths, the channelwise radiances
become conditionally independent of each other given the atmosphere and
observation geometry. This permits an exact decomposition of F(x)
into monochromatic functionsF(x)={fi(x)}i=1k,
where each fi(x)→yi represents the RTM function
for the channel centered at wavelength λi. Given this decomposition,
we construct a neural RTM emulator using a set of k channelwise
subnetworks, where each subnetwork is trained to model a single fi.
Figure shows the topology of one of the channelwise
subnetworks in the neural RTM. A side benefit of this approach is that the
partial derivatives of radiance channels with respect to their surface
reflectances are independent of each other, which simplifies calculations of
analytical Jacobians during iterative gradient descent inversions
.
Second, we reduce the radiance spectrum analytically to the top-of-atmosphere
reflectance, written ρobs, and solar illumination
components. The top of atmosphere reflectance is defined as
ρobs=yπ/ϕoeo, where ϕo
is the cosine of the solar zenith angle and eo the
extraterrestrial solar irradiance. ρobs is normalized
for solar illumination and, absent extreme glint, resides conveniently in the
[0,1] interval, making it an easier target for function approximation. For
any given observing geometry, the known values of ϕo and eo
can be used to infer the corresponding radiances.
Illustration of a single subnetwork in the neural RTM emulator. Each
subnetwork predicts the top-of-atmosphere reflectance ρobs(λi)
for a single channel centered at wavelength λi provided state parameters x and
surface reflectance ρs(λi). Collecting the predictions
generated by k subnetworks, each modeling distinct channels, and converting those
predictions from ρobs to radiance emulates the RTM function F(x)→y for the selected channels.
Constructing a robust neural RTM emulator from precomputed RTM outputs faces
two fundamental modeling challenges. First, the precomputed RTM outputs must
provide sufficient coverage of the state space to represent the distribution
of spectral responses in each channel. Second, the subnetworks must
accurately predict RTM outputs for intermediate state parameter
values within the bounds of the state space for all channels. Intuitively, modeling channels whose spectral responses vary
substantially with respect to small changes in state is more challenging than modeling more stable channels. For instance,
small variations in the concentration of atmospheric water vapor can produce complex, nonlinear changes in spectral responses
in the water absorption bands, while other wavelengths remain relatively unchanged. Accurately modeling unstable
channels may require generating additional RTM outputs at increased sampling
density to highlight distinct responses that are poorly represented in the
existing precomputed outputs, and additional computational resources may be required to fine-tune the
appropriate subnetworks to capture those distinctions.
To ensure each subnetwork reliably models its corresponding channel, we
measure prediction accuracy on a test set of precalculated RTM outputs
excluded from the training process. In our initial experiments, we used both traditional randomized
cross-validation and bootstrap sampling to construct the training and test sets, but after we
observed that the main sources of variability in the state space emerged from
interactions among a small number of state parameter values, we concluded
that randomized sampling of the state space without an informed sample
stratification yields optimistic or inconsistent estimates of test accuracy
and/or convergence time in cross-validation. Ultimately, we concluded that
validation using a fixed and bounded subset of the state parameter values
would provide a more informative assessment of model performance. Using a
bounded subset also permitted direct comparison to lookup-table-based
approaches, as they require upper and lower bounds on each variable to
generate intermediate values via interpolation. We describe this approach in
more detail later in this work.
We can also improve model accuracy and reduce computational demands by
exploiting characteristics of the state space in tandem with RTM modeling
assumptions. One means we use to achieve this is through a process of
weight propagation. Rather than initializing the weights for each
subnetwork from scratch, we use the converged weights of the subnetwork
modeling the previous channel to initialize the weights for the subnetwork
modeling the current channel, which are then fine-tuned to
minimize the prediction error for that channel. Using weight propagation often yields a substantial reduction in training time
in comparison to training each subnetwork from scratch, and can also improve prediction accuracy for channels whose RTM
outputs are relatively stable with respect to the state space. An additional side benefit is that weight
propagation provides an approximate means to account for channelwise coupling
for instruments whose spectral response functions partially overlap for adjacent channels.
Algorithm 1 describes the procedure to train the neural RTM emulator provided
n samples from the m-dimensional state space and their corresponding
ρobs outputs, each spanning k channels. The
output of the algorithm is a trained neural RTM that takes a state vector
x of m parameters as input and outputs a k-dimensional prediction
vector y, which we convert to radiance with
respect to ϕo and eo. The prediction vector
is the concatenated output of the trained subnetworks {fi}i=1k for each of the k
channels.
Our goal is to train each subnetwork fi to generate accurate predictions
for states explicitly included in X and more importantly
for intermediate states not explicitly included in X but
within the bounds of the state space. To achieve this, we employ a sampling
strategy that partitions the state space in a
manner that helps to guide each subnetwork to accurately predict intermediate
states. Specifically, we partition X and Y into disjoint
training (X,Y)tr and test
(X,Y)te sets such that Xtr contains all
state vectors containing the boundary values
{min(pj),max(pj)} of all state parameters. This
partitioning ensures the training set contains the convex hull of the
Euclidean subspace of Rm defined by the state parameters and
also that all test states in Xte represent intermediate states
within the hull. To capture the internal structure of the state space within
the hull, the training set should also contain one or more intermediate state
vectors for each pj satisfying min(pj)<xjtr<max(pj). Given the training and test partitions, we train each
subnetwork to model the function fi by minimizing the L2-regularized mean-squared
error (MSE) between the predicted and the observed values of the ntr
training samples (X,yi)tr representing the
ρobs responses for the ith channel.
Each subnetwork uses a feed-forward architecture with two hidden layers. The hidden layers use rectifying linear activation
functions, which have been shown to be more robust than the sigmoid or tanh activations used in traditional neural
networks , and the output layer uses a linear activation function. We use the method proposed by
to initialize any subnetwork weights that are not initialized via weight propagation, and use
the widely used error back propagation algorithm with
adaptive moment estimation to optimize the weights via
gradient descent. We train each subnetwork until we reach the maximum number of epochs nepoch, stopping early if the mean absolute error (MAE) between the true and predicted outputs for each channel converges to within 0.1 %. This level of accuracy is sufficient to make the
approximation error a smaller contributor to total uncertainty than other
unknowns in the measurement system. For example, it is generally a similar
magnitude to relative calibration error of different focal plane array
elements, which can vary slightly due to drift between calibrations
.
Neural RTM emulation for PRISM
We define a case study to demonstrate the capabilities of our RTM emulator using data acquired by the PRISM imaging
spectrometer
. PRISM uses a push-broom design and observes
a cross-track angular field of view spanning approximately 30∘, and is
designed to observe coastal ocean environments in the 350–1050 nm
spectral range at approximately 3 nm spectral sampling. The
instrument was mounted onboard a high-altitude ER-2 aircraft which overflew
Santa Monica, USA, in October 2015 at 20 km above sea level
. At this altitude, the instrument measured
the scene through nearly all of Earth's atmospheric scattering and
absorption, providing a challenging test case with relevance to future
orbital instruments.
State parameters values used in libRadtran model runs to generate ρobs
spectra to train and validate the neural RTM. State vectors containing the median value of each
auxiliary parameter (indicated by bold text) are held out for testing, while the remaining state vectors are used for training the channelwise subnetworks.
We consider a state space consisting of a single free parameter per instrument channel for the surface reflectance ρs,
along
with four
m=4 state parameters that vary independently for every spectrum in a given flight line, which include the atmospheric
aerosol optical depth at the surface, τ; the
atmospheric water vapor content of the column in grams per square centimeter, H2O;
the cosine of the observer zenith angle, cos(θv); and the relative
azimuth angle between the observer and the Sun, written ϕr. Each of
these free parameters varies independently for every spectrum in a given
flight line. Naturally, alternative parameterizations are possible, including
mixture models, continuum-absorption models, and others. However, these could
be mapped to our representation without loss of generality.
ρobs spectra for ρs=0.25 spanning the range
of the ϕr(a), cos(θv)(b), τ(c), and H2O(d)
parameters with respect to the validation grid values.
We identified a set of values for each state parameter that covered the conditions anticipated during the flight campaign,
and provided those values in Table 1. We generated RTM outputs using the libRadtran radiative transfer code
for a midlatitude summer atmosphere appropriate to the PRISM flight line we considered
We provide the template libRadtran config file in
the Supplement (prm20151026t173148_libradtran_config)
. Generating
ρobs spectra for every combination of state
parameter values yielded n=9072 total ρobs
output spectra, each of k=7101 dimensions spanning the range of the PRISM
instrument wavelengths with 0.1 nm spacing. Our test data consist of
all state vectors containing the median value of each state
parameter (shown in bold text in Table ) and the
ρobs spectra associated with those states. The
remaining states and their corresponding ρobs
spectra form our training set.
Figure depicts the changes in the
ρobs spectra with respect to parameters ϕr
(panel a), cos(θv) (panel b), τ (panel c), and
H2O (panel d), while holding the other parameters fixed at
their median values. Unsurprisingly, the most visibly dramatic changes occur
as absorption features appear with increased H2O vapor
concentrations. Of the remaining parameters, only aerosol optical
depth τ has an observable effect on spectral shape across the visible and
near-infrared wavelengths. Changes induced by varying ϕr and θv are
comparatively small and predominantly observable in the visible range.
For this case study, we focused on modeling F(x) based on libRadtran
outputs resampled to the PRISM instrument channels. This dramatically reduced
the computation required to construct the neural RTM, as we only needed to
train a total of 245 subnetworks representing each of the PRISM channels with
2.83 nm spacing, rather than the 7101 channels at 0.1 nm
spacing generated by libRadtran. We note that convolving the libRadtran spectra to the lower-resolution PRISM spectral response
function (SRF) means the ρobs values are no longer
strictly monochromatic. However, channelwise coupling is not a significant
concern as the instrument channels are well-separated. In future work, we plan to construct a more general neural RTM that
generates
ρobs predictions at 0.1 nm spacing, can be subsequently convolved to the spectral
response function associated
with a particular sensor.
We observed experimentally that subnetworks consisting of two hidden
layers with 50 units each and a training cycle of at most 500 epochs (where
one epoch consists of a full pass of gradient updates over the training set)
with batch sizes ranging from 100 to 200 training samples was sufficient for
each subnetwork to converge to our error requirements for the state space
parametrized by values in Table . Single-layer subnetworks were typically inadequate to model channels whose
ρobs responses changed in a highly variable
and/or nonlinear manner with respect to small changes in the state parameters (for instance, the
water absorption bands). We set the initial learning rate to 0.001 with the
following adaptive moment estimation parameters {β1=0.9,β2=0.999,ϵ=10-10} and set the L2
regularization penalty term α to 10-4 for each subnetwork. A
longer training cycle or additional hidden units can be used to match the RTM
output more precisely and would likely be necessary to model more complex
state parameter spaces.
(a)ρobs test prediction error
per channel using channelwise linear regressors (black line). (b) Neural
network test prediction error per channel using subnetworks trained
from scratch (NN, red line) versus subnetworks trained with weight propagation (NNWP, blue line).
As a baseline comparison, we used the channelwise training samples
((X,yi)tr in Algorithm 1) to train a
least-squares linear regressor on the ρobs
responses for each channel, and applied each regressor to generate
predictions on the associated test samples
((X,yi)te in Algorithm 1). The channelwise test errors of the linear regressors provide an upper bound on the
piecewise, locally linear interpolation error incurred using lookup tables to infer ρobs responses for
intermediate states. Figure compares
the ρobs test prediction error using the
channelwise linear regressors (panel a, black line) to the error produced
by the channelwise subnetworks trained from scratch (NN, blue line) versus
channelwise subnetworks initialized with weight propagation (NNWP, red
line). The channelwise subnetworks yield an order of magnitude reduction in
prediction error on all channels in comparison to the linear regressors, and
demonstrates potentially significant issues with lookup-table-based
approaches. Weight propagation provides an average reduction of 64 % in
channelwise error, but also yields systematically higher errors in the
H2O absorption range between 890 and 1000 nm where the
ρobs responses vary rapidly for adjacent
channels.
Difference in training epochs to converge to 0.1 % validation error.
Negative values (blue bars) show channelwise subnetworks that converged faster
using weight propagation, while positive values (red bars) indicate channels
whose subnetworks converged more quickly when trained from scratch.
Figure compares the number of epochs required to converge for channelwise
subnetworks trained from scratch versus subnetworks initialized with weight
propagation. Over the set of all PRISM instrument channels, weight
propagation permits convergence in ≈70 % fewer epochs over
subnetworks not leveraging weight propagation. In terms of raw compute time,
our scikit-learn implementation requires 2–3 min to
train a single monochromatic subnetwork from scratch on a single
processor core, while subnetworks initialized with weight propagation
typically require less than 30 s to converge. However, we note that the
channelwise subnetworks trained with weight propagation converge as quickly
in the 925–975 nm range – where their most significant prediction
errors occur – as in the remaining channels. While it is unsurprising
that the H2O absorption wavelengths are challenging to model, the fact that the two weight initialization schemes yield distinct
error distributions suggests that the subnetworks modeling those channels may not have converged.
Investigating further, we measured the average root-mean-square error (RMSE)
on the test set with respect to the pairwise interactions between ρs
and the four state parameters, and show the resulting error surfaces in
Fig. . Relatively small errors for the majority of the
parameter space indicate that the ρobs spectra
vary smoothly with respect to most state parameter values, with the most
significant variability emerging from a small range of values in the ρs∈[0.4,0.8] and H2O∈[1.0,2.0] regions of the state space.
The relatively high error in this regime is consistent with our earlier
observation that small changes in the atmospheric water vapor parameter yield
considerably different ρobs spectra, as shown in
Fig. , and the comparatively high prediction errors for the
H2O absorption bands shown in Fig. .
Pairwise contour plots showing the ρobs
test prediction error (RMSE) surfaces with respect to the state parameter values
specified in Table . Contour labels on the off-diagonal subplots
give the error levels associated with each contour. Diagonal subplots show
the average RMSE in 10 uniformly spaced bins spanning the (x axis) range of
each parameter. Vertical labels on the diagonal subplots indicate the
minimum and maximum error values for each parameter and their corresponding bins.
Atmospheric correction with the neural RTM emulator
We now evaluate the neural RTM emulator in the context of a
surface–atmosphere retrieval problem, retrieving surface reflectance for
comparison to known surface materials. To that end, we fused the optimal
estimation (OE) formalism of , following the specific
approach of for application to imaging spectroscopy.
The OE method estimates the atmosphere and surface state vector by an
iterative least-squares optimization of the forward model's match to the
measured radiances. Cost terms related to observation error and prior
probabilities of state vector elements ensure rigorous propagation of
uncertainties in the retrieval.
PRISM radiance spectrum (a) and the resulting smoothed
reflectance (b) spectrum for the beach sand target used in the vicarious calibration procedure.
Continuing our case study, we begin by computing radiometric calibration
factors for the PRISM flight line via vicarious calibration. This procedure,
similar to standard practice calibration for imaging spectrometers
, projects the residual error in retrieved surface
reflectance back into radiance space where it becomes a multiplicative
correction factor applied independently to each channel. We generate a
“standardized” surface reflectance target by performing a first-principles
retrieval for a beach sand radiance spectrum manually selected from the
target PRISM image. We smooth the resulting surface reflectance spectrum to
suppress significant atmospheric features, and use the smoothed spectrum to
generate radiometric correction factors appropriate to our flight line.
Applying the resulting factors fine tunes the calibration for optimal
results and suppresses residual errors caused by uncertainty in spectral
response or RTM inaccuracy. For reference, the beach sand radiance spectrum
and the resulting smoothed surface reflectance spectrum are shown in
Fig. .
Example surface reflectance retrieval for a PRISM vegetation spectrum.
Panel (a) shows the measured (black dashed spectrum) versus predicted (orange spectrum) radiance spectra.
Panel (b) shows the retrieved surface reflectance spectrum (blue spectrum).
Selected radiance spectra (a) and corresponding surface
reflectance retrievals (b) using the ATREM-based atmospheric correction
approach of (ATR15, green spectra) versus optimal estimation
equipped with our neural network RTM emulator as the forward model (OENN, blue spectra).
We applied the atmospheric correction procedure to a set of radiance spectra
from the PRISM flight line representing a diverse range of surface materials
including grass, rooftop materials, soil, and seafoam.
Figure shows a successful retrieval result for a radiance
spectrum representing grass on a golf course fairway. The inversion (orange
line) perfectly matches the measured radiance (black dashed line) in Fig. a. In Fig. b, the estimated surface reflectance (blue line) is
an extremely smooth and faithful estimate of a dark vegetation spectrum.
Figure shows additional radiance spectra (panel a) and their
corresponding surface reflectance retrievals (panel b). The high-quality
surface reflectance estimates – evidenced by the lack of residual bumps
caused by atmospheric absorption and the flat, low surface reflectance
profiles in the aerosol-dominated interval from 400 to 450 nm – provide
additional confidence in the network's value for atmospheric correction.
Our neural RTM emulator runs in less than 5 ms per PRISM spectrum (about
0.02 ms per channel). This represents a reduction of several orders of magnitude
in runtime in comparison to analogous first-principles RTMs (i.e.,
monochromatic RTMs that solve the coupled scattering–absorption problem in a
computationally exact manner, such as DISORT), which typically required over
10 min to generate a spectrum at 0.1 nm spacing (about 0.15 s per
channel).
Conclusions
Neural network RTM emulation offers a path to reduce both interpolation inaccuracy
and simultaneously runtime by several orders of magnitude. A well-parametrized
neural RTM is capable of modeling state parameter spaces with significantly
higher accuracy than conventional lookup-table-based approaches. Such high-capacity statistical models have potential for modeling state parameter spaces with much higher dimensionality than current methods.
The computational and theoretical advantages provided by fast and accurate
RTM emulators are particularly useful for iterative approaches that must
rerun the entire forward model many times for each spectrum. Equipping
iterative formalisms such as optimal estimation with the neural RTM forward
model also enables new retrieval approaches that jointly estimate surface and
atmospheric parameters. Joint retrieval of surface and atmospheric parameters
carries several advantages. It becomes possible to estimate arbitrary
parameters of the atmospheric state simply by adjusting the RTM dynamically
during the fitting process. A joint retrieval can represent strong coupling
between surface and atmosphere, including bidirectional reflectance
distribution function (BRDF) effects, and obviates parametric approximations.
The ability to model strong coupling is particularly important for conditions
with off-nadir views or haze. Finally, a combined model enables a rigorous,
unified, and quantitative treatment of uncertainty, respecting uncertainties
in all measurement processes and modeled variables and propagating posterior
uncertainties for downstream analysis.
Our results also demonstrate the advantages of informed sampling of the state
space. Finer grid sampling in rapidly varying regions of the state space is
advantageous to capture complex and often nonlinear interactions among state
parameters, while coarse sampling is beneficial in regions of the state space
that vary smoothly to reduce redundancy and computational overhead.
Uninformed sampling of the state space may not only lead to inaccurate
models, but can also yield overly optimistic or inconsistent results when
measuring test accuracy or convergence time during cross-validation. For
example, as Figs. and indicate, much of the
state space is relatively smooth. Traditional cross-validation strategies
that randomly partition the state space into training and test sets will
indicate the subnetworks generalize well due to sampling bias in regions of
the state space that are easy to model. Sample stratification approaches
during cross-validation can help to ensure each subnetwork accurately
captures the parameters that are more difficult to model. However, leveraging an
informed sample of the state space would not only eliminate the need for
sample stratification during cross-validation, but would also ultimately
yield more accurate models with reduced computational overhead.
Future work will investigate Bayesian optimization and smart sampling
approaches e.g., that may
help to provide a more informed sampling the state parameter space.
We will also construct a more “universal” neural RTM
designed to generate ρobs predictions for a more comprehensive set of state
parameters including aerosol optical properties at
0.1 nm spectral resolution. We also aim to reduce approximation error still further in order to keep the fractional contribution
small for very dark and/or noisy targets.
Code availability
The Python source code used to train and apply the neural RTM for optimal-estimation-based atmospheric
correction is available at https://github.com/dsmbgu8/isofit (last access: April 2019). The data used to train the RTM emulator, including the
input state vectors and corresponding ρobs spectra resampled to PRISM instrument wavelengths, are also available
in the examples/20151026_SantaMonica subdirectory at the above URL.
The supplement related to this article is available online at: https://doi.org/10.5194/amt-12-2567-2019-supplement.
Author contributions
Authors BDB and DRT conceived, implemented, and described
the methodology and experiments described in this paper, and are the
primary contributors to this work. SD contributed model
analysis and RTM data collection during a Caltech Summer Undergraduate
Research Fellowship (SURF) internship at JPL in summer 2017. ME
and ROG provided expertise in the theory and application of imaging
spectroscopy and atmospheric correction methods. VN provided
expertise in radiative transfer modeling along with software and guidance for
future efforts. TM contributed analysis and generated RTM outputs
during a Caltech SURF internship at JPL in summer 2018. MP
provided support and guidance for TM.
Competing interests
The authors declare that they have no conflict of
interest.
Acknowledgements
We thank Sven Geier, Scott Nolte, and the AVIRIS-NG instrument and Science
Data System teams for assistance in calibration and operations. Particular
thanks go to Charles E. Miller and Michael Turmon whose feedback
significantly improved the paper. We are also thankful for the counsel
of colleagues including Phil Townsend, Phil Dennison, Dar Roberts, Steven Adler-Golden,
Alexander Berk, Steven Massie, and Bruce Kindel. We acknowledge
the support of the NASA Earth Science Division for the AVIRIS-NG instrument
and the data analysis program “Utilization of Airborne Visible/Infrared
Imaging Spectrometer Next Generation Data from an Airborne Campaign in
India” NNH16ZDA001N-AVRSNG, managed by Woody Turner, for its support of the
algorithm development. We are also thankful for the support of the Jet
Propulsion Laboratory Research and Technology Development Program, the NASA
Center Innovation Fund managed in conjunction with the Jet Propulsion
Laboratory Office of the Chief Scientist and Technologist, and the Caltech
SURF program. A portion of this research took place at the Jet Propulsion
Laboratory, California Institute of Technology. US government support is
acknowledged.
Review statement
This paper was edited by Lars Hoffmann and reviewed by two
anonymous referees.
References
Asner, G., Martin, R., Knapp, D., Tupayachi, R., Anderson, C., Sinca, F.,
Vaughn, N., and Llactayo, W.: Airborne laser-guided imaging spectroscopy to
map forest trait diversity and guide conservation, Science, 355, 385–389,
2017.
Bodhaine, B. A., Wood, N. B., Dutton, E. G., and Slusser, J. R.: On Rayleigh
optical depth calculations, J. Atmos. Ocean. Tech.,
16, 1854–1861, 1999.Brajard, J., Jamet, C., Moulin, C., and Thiria, S.: Use of a
neuro-variational
inversion for retrieving oceanic and atmospheric constituents from satellite
ocean colour sensor: Application to absorbing aerosols, Neural Networks, 19,
178–185, 10.1016/j.neunet.2006.01.015, 2006.Buehler, S. A., John, V. O., Kottayil, A., Milz, M., and Eriksson, P.:
ARTICLE
IN PRESS, J. Quant. Spectr. Ra., 111,
602–615, 10.1016/j.jqsrt.2009.10.018, 2009.Emde, C., Buras-Schnell, R., Kylling, A., Mayer, B., Gasteiger, J., Hamann, U.,
Kylling, J., Richter, B., Pause, C., Dowling, T., and Bugliaro, L.: The libRadtran software
package for radiative transfer calculations (version 2.0.1), Geosci. Model Dev., 9, 1647–1672, 10.5194/gmd-9-1647-2016, 2016.ESAS: Thriving on Our Changing Planet: A Decadal Strategy for Earth
Observation from Space. A Report by the Decadal Survey on Earth Science and
Applications from Space, The National Academies Press, Washington, D.C.,
available at:
http://sites.nationalacademies.org/DEPS/esas2017/index.htm, last access:
January 2018.
Fichot, C. G., Downing, B. D., Bergamaschi, B. A., Windham-Myers, L.,
Marvin-DiPasquale, M., Thompson, D. R., and Gierach, M. M.: High-Resolution
Remote Sensing of Water Quality in the San Francisco Bay–Delta Estuary,
Environ. Sci. Technol., 50, 573–583, 2015.
Frankenberg, C., Thorpe, A. K., Thompson, D. R., Hulley, G., Kort, E. A.,
Vance, N., Borchardt, J., Krings, T., Gerilowski, K., Sweeney, C., Conley,
S., Bue, B. D., Aubrey, A. D., Hook, S., and Green, R. O.: Airborne methane
remote measurements reveal heavy-tail flux distribution in Four Corners
region, P. Natl. Acad. Sci. USA, 113, 9734–9739,
2016.
Gao, B. C., Heidebrecht, K. B., and Goetz, A. F.: Derivation of scaled
surface
reflectances from AVIRIS data, Remote Sens. Environ., 44, 165–178,
1993.
Gao, B.-C., Montes, M. J., Ahmad, Z., and Davis, C. O.: Atmospheric
correction
algorithm for hyperspectral remote sensing of ocean color from space, Appl.
Optics, 39, 887–896, 2000.
Gao, B. C., Montes, M. J., Li, R. R., Dierssen, H. M., and Davis, C. O.: An
atmospheric correction algorithm for remote sensing of bright coastal waters
using MODIS land and ocean channels in the solar spectral region, IEEE
T. Geosci. Remote, 45, 1835–1843, 2007.
Glorot, X. and Bengio, Y.: Understanding the difficulty of training deep feedforward neural networks, in: Proceedings of the thirteenth
international conference on artificial intelligence and statistics, 249–256, Sardinia, IT, 2010.
Gupta, M., Cotter, A., Pfeifer, J., Voevodski, K., Canini, K., Mangylov, A.,
Moczydlowski, W., and van Esbroeck, A.: Monotonic Calibrated Interpolated
Look-Up Tables, arXiv.org, arXiv:1505.06378, 2015.
Hochberg, E. J.: Remote sensing of coral reef processes, in: Coral Reefs: An Ecosystem in Transition, 25–35, Springer, New York, 2011.Jamet, C., Thiria, S., Moulin, C., and Crepon, M.: Use of a Neurovariational
Inversion for Retrieving Oceanic and Atmospheric Constituents from Ocean
Color Imagery: A Feasibility Study, J. Atmos. Ocean. Tech., 22, 460–475, 10.1175/JTECH1688.1, 2005.Jamet, C., Loisel, H., and Dessailly, D.: Retrieval of the spectral diffuse
attenuation coefficient Kd( λ) in open and coastal ocean waters using
a neural network inversion, J. Geophys. Res., 117, C10,
10.1029/2012JC008076, 2012.
Jetz, W., Cavender-Bares, J., Pavlick, R., Schimel, D., Davis, F. W., Asner, G. P., Guralnick, R., Kattge, J., Latimer, A. M., Moorcroft, P.,
Schaepman, M. E., Schildhauer, M. P., Schneider, F. D., Schrodt, F., Stahl, U., and Ustin, S. L.: Monitoring plant functional diversity from
space, Nature Plants, 2, 16024, 2016.
Kingma, D. P. and Ba, J.: Adam: A method for stochastic optimization, arXiv
preprint arXiv:1412.6980, 2014.Kox, S., Bugliaro, L., and Ostler, A.: Retrieval of cirrus cloud optical thickness
and top altitude from geostationary remote sensing, Atmos. Meas. Tech., 7, 3233–3246, 10.5194/amt-7-3233-2014, 2014.
Kruse, F.: Comparison of ATREM, ACORN, and FLAASH atmospheric corrections using low-altitude AVIRIS data of Boulder, CO, 13th
JPL Airborne Geoscience Workshop, Pasadena, CA, USA, Jet Propulsion Laboratory Publication 05–3, 10, 2004.
Kurucz, R. L.: Synthetic infrared spectra, in: Infrared solar physics, 523–531, Springer, New York, 1994.Loyola, D. G., Gimeno García, S., Lutz, R., Argyrouli, A., Romahn, F.,
Spurr, R. J. D., Pedergnana, M., Doicu, A., Molina García, V., and Schüssler, O.:
The operational cloud retrieval algorithms from TROPOMI on board Sentinel-5
Precursor, Atmos. Meas. Tech., 11, 409–427, 10.5194/amt-11-409-2018, 2018.Loyola R, D. G., Pedergnana, M., and García, S. G.: Smart sampling and incremental function learning for very large high dimensional data,
Neural Networks, 78, 75–87, 10.1016/j.neunet.2015.09.001, 2016.
Martino, L., Vicent, J., and Camps-Valls, G.: Automatic emulator and optimized look-up table generation for radiative transfer models, IEEE
International Geoscience And Remote Sensing Symposium, 2017.Mayer, B. and Kylling, A.: Technical note: The libRadtran software package for radiative transfer
calculations – description and examples of use, Atmos. Chem. Phys., 5, 1855–1877, 10.5194/acp-5-1855-2005, 2005.
Mouroulis, P., Green, R. O., and Wilson, D. W.: Optical design of a coastal
ocean imaging spectrometer, Opt. Express, 16, 9087–9096, 2008.
Mouroulis, P., Gorp, B. V., Green, R. O., Dierssen, H., Wilson, D. W.,
Eastwood, M., Boardman, J., Gao, B.-C., Cohen, D., Franklin, B., Loya, F.,
Lundeen, S., Mazer, A., McCubbin, I., Randall, D., Richardson, B., Rodriguez,
J. I., Sarture, C., Urquiza, E., Vargas, R., White, V., and Yee, K.: Portable
Remote Imaging Spectrometer coastal ocean sensor: design, characteristics,
and first flight results, Appl. Optics, 53, 1363–1380, 2014.
Nair, V. and Hinton, G. E.: Rectified linear units improve restricted boltzmann machines, International Conference on Machine Learning,
807–814, 2010.
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel,
O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J.,
Passos, A., Cournapeau, D., Brucher, M., Perrot, M., and Duchesnay, E.:
Scikit-learn: Machine Learning in Python, J. Mach. Learn.
Res., 12, 2825–2830, 2011.
Perkins, T., Adler-Golden, S. M., Matthew, M. W., Berk, A., Bernstein, L. S., Lee, J., and Fox, M.: Speed and accuracy improvements in FLAASH atmospheric correction of
hyperspectral imagery, Opt. Eng., 51, 111707-1–111707-7,
2012.
Richter, R. and Schlapfer, D.: Geo-atmospheric processing of airborne imaging
spectrometry data, Part 2: atmospheric/topographic correction, Int.
J. Remote Sens., 23, 2631–2649, 2002.
Rodgers, C. D.: Inverse methods for atmospheric sounding: theory and
practice, World Scientific,
2, 2000.
Schaepman, M. E., Ustin, S. L., Plaza, A. J., Painter, T. H., Verrelst, J.,
and
Liang, S.: Earth system science related imaging spectroscopy – An assessment,
Remote Sens. Environ., 113, S123–S137, 2009.
Schaepman-Strub, G., Schaepman, M. E., Painter, T. H., Dangel, S., and
Martonchik, J. V.: Reflectance quantities in optical remote
sensing – Definitions and case studies, Remote Sens. Environ., 103,
27–42, 2006.
Snoek, J., Larochelle, H., and Adams, R. P.: Practical bayesian optimization of machine learning algorithms, in: Advances in neural information
processing systems, 2951–2959, Lake Tahoe, NV, USA, 2012.
Stamnes, K., Tsay, S.-C., Wiscombe, W., and Jayaweera, K.: Numerically stable
algorithm for discrete-ordinate-method radiative transfer in multiple
scattering and emitting layered media, Appl. Optics, 27, 2502–2509, 1988.
Stein, M.: Large sample properties of simulations using Latin hypercube
sampling, Technometrics, 29, 143–151, 1987.
Swayze, G. A., Clark, R. N., Goetz, A. F., Livo, K. E., Breit, G. N., Kruse,
F. A., Sutley, S. J., Snee, L. W., Lowers, H. A., Post, J. L., Stoffregen,
R. E., and Ashley, R. P.: Mapping Advanced Argillic Alteration at Cuprite,
Nevada, Using Imaging Spectroscopy, Econ. Geol., 109, 1179, 2014.
Thompson, D. R., Gao, B.-C., Green, R. O., Roberts, D. A., Dennison, P. E.,
and
Lundeen, S. R.: Atmospheric correction for global mapping spectroscopy:
ATREM advances for the HyspIRI preparatory campaign, Remote Sens. Environ., 167, 64–77, 2015.Thompson, D. R., Boardman, J. W., Eastwood, M. L., Green, R. O., Haag, J. M.,
Mouroulis, P., and Gorp, B. V.: Imaging spectrometer stray spectral response:
In-flight characterization, correction, and validation, Remote Sens. Environ., 204, 850—860, 10.1016/j.rse.2017.09.015,
2018a.
Thompson, D. R., Cawse-Nicholson, K., Erickson, Z., Fichot, C. G.,
Frankenberg,
C., Gao, B.-C., Gierach, M. M., Green, R. O., Natraj, V., and Thompson, A.:
Optimal Estimation for Coastal Ocean Imaging Spectroscopy, Remote Sens. Environ., in review, 2018b.Thompson, D. R., Natraj, V., Green, R. O., Helmlinger, M. C., Gao, B.-C., and
Eastwood, M. L.: Optimal estimation for imaging spectrometer atmospheric
correction, Remote Sens. Environ., 216, 355–373,
10.1016/j.rse.2018.07.003, 2018c.
Trinh, R. C., Fichot, C. G., Gierach, M. M., Holt, B., Malakar, N. K.,
Hulley,
G., and Smith, J.: Application of Landsat 8 for Monitoring Impacts of
Wastewater Discharge on Coastal Water Quality, Frontiers in Marine Science,
4, 329, 10.3389/fmars.2017.00329, 2017.
Ustin, S. L., Roberts, D. A., Gamon, J. A., Asner, G. P., and Green, R. O.:
Using imaging spectroscopy to study ecosystem processes and properties,
BioScience, 54, 523–534, 2004.
Vermote, E. F., Tanré, D., Deuze, J. L., Herman, M., and Morcette, J. J.:
Second simulation of the satellite signal in the solar spectrum, 6S: An
overview, Geoscience and Remote Sensing, IEEE Transactions on, 35, 675–686,
1997.
Verrelst, J., Sabater, N., Rivera, J. P., Muñoz-Marí, J., Vicent,
J.,
Camps-Valls, G., and Moreno, J.: Emulation of Leaf, Canopy and Atmosphere
Radiative Transfer Models for Fast Global Sensitivity Analysis, Remote
Sens., 8, 673, 2016.
Verrelst, J., Rivera Caicedo, J., Muñoz-Marí, J., Camps-Valls, G., and
Moreno, J.: SCOPE-Based Emulators for Fast Generation of Synthetic Canopy
Reflectance and Sun-Induced Fluorescence Spectra, Remote Sens., 9, 927,
2017.
Werbos, P. J.: Applications of advances in nonlinear sensitivity analysis,
in:
System modeling and optimization, 762–770, Springer, Berlin, 1982.