Interactive comment on “Kalman filter physical retrieval of surface emissivity and temperature from SEVIRI infrared channels: a validation and inter-comparison study”

Some weaknesses: Implementation details are not given in enough depth, so it is unclear 1) how the multiple SEVIRI observations are used (e.g. how many observations are needed, the time difference between them and any restrictions on it, etc.), and 2) how the forward model parameterization and the identity assumption for the dynamical model operator affect the retrieval results. Some discussion should be given of: 1) the physics behind the threshold values of the Kalman filter recursive process and their impact on the retrievals; 2) any restriction on the temporal step (of the multiple observations) and its impact on the retrievals; 3) a sensitivity analysis of the Kalman filter application to surface temperature and emissivity retrieval; 4) the quality requirement for the background emissivity determination and its impact on the retrievals.
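To make the request for a sensitivity discussion concrete, the following is a minimal sketch of a scalar, linear Kalman filter whose dynamical operator is the identity (persistence). It is not the authors' implementation: the function name, the scalar state, the linear observation factor `h`, and all numerical values are assumptions chosen for illustration only. It shows where the number of observations, the model noise `q` (which in practice would scale with the time step), and the background variance `p0` enter the recursion.

```python
def kalman_identity(observations, x0, p0, q, r, h=1.0):
    """Scalar Kalman filter with an identity (persistence) dynamical operator.

    observations : sequence of scalar measurements y_k
    x0, p0       : background state and its variance (hypothetical values)
    q            : model-noise variance added at each time step
    r            : observation-noise variance
    h            : linear observation factor (assumed; the real forward
                   model is a nonlinear radiative transfer calculation)
    Returns a list of (analysis_state, analysis_variance) per step.
    """
    x, p = x0, p0
    history = []
    for y in observations:
        # Forecast: the identity operator carries the state over unchanged,
        # and the variance grows by the model noise q.
        x_f = x
        p_f = p + q
        # Analysis: weight forecast against observation with the Kalman gain.
        k = p_f * h / (h * h * p_f + r)
        x = x_f + k * (y - h * x_f)
        p = (1.0 - k * h) * p_f
        history.append((x, p))
    return history


if __name__ == "__main__":
    # Synthetic numbers, for illustration only.
    obs = [0.96, 0.97, 0.95, 0.96]
    steps = kalman_identity(obs, x0=0.98, p0=1e-4, q=1e-6, r=4e-4)
    for n, (x, p) in enumerate(steps, 1):
        print(f"after {n} observations: state={x:.4f} variance={p:.2e}")
```

Rerunning such a toy with different `q`, `p0`, and numbers of observations is the kind of sensitivity analysis requested above, although the real retrieval is multivariate and nonlinear.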

(1) My major concern is that the validation is incomplete and unbalanced. For land, only two in situ sites are used for surface temperature (of which Evora, in my opinion, is associated with too large an uncertainty to be used for validation), with no intercomparison with any other heritage products. Considering that the SEVIRI temperature/emissivity products are mostly used for land surface studies, I'm not sure what benefit the comparisons with ECMWF/MODIS/AVHRR over water have, other than possibly demonstrating the stability of the retrieval. When so few in situ sites are available for temperature validation over land, other approaches should be sought, for example the Radiance-based (R-based) validation method, which is currently used extensively to validate the NASA MODIS and VIIRS LST products. There are a number of sites over Africa and the Arabian Peninsula where field emissivity data are currently available to enable the R-based method, for example from the KIT and JPL thermal infrared groups.
(2) Why was AVHRR used over ocean to demonstrate the seasonal cycle and not ECMWF? AVHRR simply introduces more uncertainty, since you are dealing with a bulk temperature estimate, not skin. Along the same lines, I'm not sure it makes sense to compare with ECMWF skin temperature when the atmospheric state vector in the KF approach is already driven by ECMWF fields. The author needs to justify that there is no correlation between skin temperature and atmospheric fields in the ECMWF assimilation approach.
(3) Past studies (e.g. Guillevic et al. and Ermida et al.) have shown large uncertainties associated with the Evora site due to spatial heterogeneity and shading, and this paper does not do enough to convince the reader that results from this site are valid for a primary validation study. In fact, the Guillevic and Ermida studies only further expose the difficulties of using Evora as a validation site for temperature, particularly during daytime.
(4) Why was the Dahra LSA-SAF site not used for temperature validation?
(5) [...] some qualitative discussion on full disk map, but this does not constitute a validation. Emissivity spectra should be compared with either in situ emissivity measurements or lab measurements of field samples. At a minimum, an intercomparison should be made with other physically based emissivity products such as MOD11B1 v4.1 or the ASTER Global Emissivity Database available from the LPDAAC. There are methods already demonstrated to adjust the emissivity from either the MODIS or ASTER bands to any other sensor's spectral response (e.g. Goettsche and Hulley 2012).
(6) Diurnal emissivity variation is highlighted at the Gobabeb site (Fig. 11) using a 3-hr moving average, but at the Evora site a daily average is used, possibly smoothing out any diurnal variations. In order to convince the reader that the diurnal variations seen at Gobabeb are in fact due to a physical phenomenon, the emissivity variations should be smoothed over the same time window at both sites. In fact, if this kind of diurnal variation is seen over all surface types, then the close correlation with temperature variation indicates it could be some kind of retrieval artifact instead. Without in situ measurements, implying that the observed variations are due to dew or vapor adsorption is purely conjecture.
(7) It is not evident that the approach used by the author to extend the UW/BFEMIS emissivity to SEVIRI was implemented correctly. You cannot simply interpolate the UW/BFEMIS emissivity to the central bandwidth of another broadband sensor. The High Spectral Resolution (HSR) algorithm has to be applied first to generate a high spectral resolution version, and this should then be convolved with the sensor's (e.g. SEVIRI) spectral response.
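To illustrate the point in comment (7), here is a minimal sketch of the convolution step only, assuming a high spectral resolution emissivity spectrum is already available (it does not reproduce the HSR algorithm itself). The function names, the wavenumber grid, the triangular response, and the emissivity values are all synthetic assumptions. A stricter treatment might also weight by the Planck radiance, which is omitted here.

```python
def trapz(y, x):
    """Trapezoidal integral of samples y over (possibly uneven) grid x."""
    return sum(0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i])
               for i in range(len(x) - 1))


def band_emissivity(wavenumber, emissivity, srf):
    """Channel emissivity as the spectral-response-weighted mean.

    wavenumber : grid in cm^-1 (ascending)
    emissivity : high spectral resolution emissivity on that grid
    srf        : sensor spectral response on the same grid
    """
    weighted = [e * s for e, s in zip(emissivity, srf)]
    return trapz(weighted, wavenumber) / trapz(srf, wavenumber)


if __name__ == "__main__":
    # Synthetic example: a narrow emissivity dip at the channel centre.
    wn = [800.0, 850.0, 900.0, 950.0, 1000.0]
    srf = [0.0, 0.5, 1.0, 0.5, 0.0]          # hypothetical triangular response
    eps = [0.98, 0.98, 0.90, 0.98, 0.98]     # hypothetical spectrum
    centre_value = eps[2]                    # what simple interpolation returns
    band_value = band_emissivity(wn, eps, srf)
    print(f"value at channel centre: {centre_value:.3f}")
    print(f"response-weighted value: {band_value:.3f}")
```

In this toy case the value at the channel centre and the response-weighted value differ whenever the spectrum has structure inside the band, which is why interpolation to a single central value is not equivalent to the convolution.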

Minor Comments:
Abstract: Not sure about AMT but typically acronyms are spelt out in the Introduction otherwise they can be distracting in the abstract.
P4052, L2: 'Here' is confusing. Do you mean here as in the current study or here as in the previous study?
Section 3: Again, there is no brief description (even a sentence or two) of the basic physical principle used in the KF approach to separate temperature and emissivity. It is difficult for the reader to understand the approach by looking at equations alone. It is only at the end of this section that the link to an optimal estimation approach is stated.
P4060, L25: 'slightly superior'. Given the large uncertainties associated with the in situ measurements made at this site, I don't think you can claim any one product is 'superior' to another, unless there is a significantly large and systematic bias.
Section 4.4: This is really just a qualitative justification for the emissivity product. An objective intercomparison with other heritage products (e.g. ASTER GED, MOD11B1) is needed, and would be very interesting.
P4068, L1-25: This reads more like a discussion than a conclusion, since this is the first time the processing speed of the algorithm is discussed. Furthermore, it sounds more like a justification for including this approach in real-time LSA SAF processing, and is not necessarily relevant or interesting to the reader.
Figures:
Figure 2. The word descriptions within the images are nearly impossible to read.
Figures 2 and 3. A Google Earth reference is needed for both figures, since Google Earth imagery is copyrighted.
Figures 4 and 8. The x/y labels and the (a) and (b) fonts are too small.
Figure 5. The caption does not describe the location of these results or the meaning of SD.
Figure 6. It is very difficult to see or learn anything from the delta T time series shown in the bottom figure. Either a smaller range or a moving average should be shown.
Figure 17. Why is emissivity from 1400-2600 shown if only spectral responses in the 800-1200 range are shown?
Figure 20. Low emissivities (greenish values below 0.94) at 12 micron over eastern Africa and parts of Madagascar seem to be a reason for concern. The 12 micron emissivity values should very seldom go below 0.95, unless the surface consists of mafic rock types (e.g. basalt), or possibly dry grass.