A total sky cloud detection method using real clear sky background

The brightness distribution of the sky background is usually non-uniform, which creates many problems for traditional cloud detection methods, including failure to detect thin clouds in total sky images and significantly reduced retrieval accuracy in the circumsolar and near-horizon regions. This paper describes the development of a new cloud detection algorithm, named “clear sky background differencing (CSBD)”, which differences the green channel of the original image and that of the corresponding clear sky background image. First, a library of clear sky background images covering a variety of solar elevation angles is built. Image rotation and image brightness adjustment algorithms are then applied to ensure that the two images being differenced have the same solar position and similar brightness distributions. Sensitivity tests show that the cloud detection results are satisfactory when the two images have the same solar position. Several experimental cases show that, by visual comparison, the CSBD algorithm obtains good cloud recognition results, especially for thin clouds.


Introduction
Clouds are an essential part of the atmospheric energy and water cycle, and their coverage state is crucial for radiative transfer models and climate simulations. Traditional visual observations have been carried out worldwide at meteorological stations to estimate sky cloud amount for more than 100 years (Deutscher Wetterdienst, 2013). So far, human observations are still the main means of obtaining surface cloud coverage in many countries. However, the subjectivity of visual observations introduces significant uncertainty into cloud coverage measurements. In the Automated Surface Observing System program, a laser ceilometer is widely used, instead of visual observations, to retrieve whole sky cloud amount from a time series of ceilometer data. A more direct means of determining cloud cover is to capture total sky visible images with a digital camera and a fisheye lens and to apply a cloud detection algorithm to calculate cloud coverage. A number of such instruments have been reported in the literature, such as the Whole-Sky Imager (WSI, Shields et al., 1993), the Whole-Sky Camera (WSC, Calbó and Sabburg, 2008; Long et al., 2006), the All-Sky Imager (Huo and Lu, 2009; Cazorla et al., 2008), the Total-sky Cloud Imager (TCI, Yang et al., 2012), the Automatic-capturing Digital Fisheye Camera (ADFC, Yamashita and Yoshimura, 2012), the All Sky Infrared Visible Analyzer (ASIVA, Klebe et al., 2014), and the PROMES-CNRS sky imager (Chauvin et al., 2015). The Total Sky Imager (TSI, Long and Deluisi, 1998) is a commercial instrument that captures total sky images using a charge-coupled device looking downward onto a parabolic mirror. The strong forward scattering of sunlight greatly impedes the ability of a digital camera to acquire a good total sky image. Most of these imagers adopt a solar tracking shadowband to occlude the intense direct solar radiation, while others, such as TCI, ADFC, and ASIVA, use auto exposure (AE) technologies, instead of a sun shielding device, to obtain a satisfactory total sky image.
In total sky images, clouds often appear white or gray because of the Lorenz-Mie scattering of cloud particles in the visible range, while cloud-free skies appear predominantly blue because of Rayleigh scattering by atmospheric molecules. Many cloud detection algorithms have been implemented on the aforementioned devices using these properties. Methods that use the blue (B) and red (R) channels to separate cloud pixels from the sky background have been adopted by many researchers. Koehler et al. (1991) discriminated between opaque cloud, thin cloud, and clear sky by defining two thresholds in the R/B ratio space for the WSI images. Long et al. (2006) recommended a single threshold of 0.6 to classify cloudy pixels, also in the R/B space, but for the WSC images. The difference R − B replaced the R/B ratio in the method of Heinle et al. (2010), who suggested R − B = 30 as a better threshold for cloud identification. The fixed threshold (FT) algorithms obtained satisfactory detection results for opaque clouds but often failed for thin clouds and encountered some issues in the circumsolar region. The brightness distribution of total sky images changes significantly with the imaging instrument and the illumination conditions, which prevents the application of FT methods to all images. Li et al. (2011) proposed a hybrid threshold algorithm to estimate clouds in the (R − B)/(R + B) color space. Yang et al. (2012) put forward the background subtraction adaptive threshold algorithm to segment clouds in the same (R − B)/(R + B) color space. In addition to these 2-D red and blue channel algorithms, several methods have been developed to detect clouds based on the 3-D red-green-blue (RGB) color space. Sylvio et al. (2010) considered all RGB information in order to accurately detect clouds, and their algorithm combined Bayesian statistics and Euclidean geometric distance. Kazantzidis et al. (2012) followed the idea of the R − B difference and proposed a multicolor criterion to identify cloudy pixels in no-sun-shadowing whole-sky images. Yamashita and Yoshimura (2012) defined a sky index (SI) and a brightness index (BI) in the RGB color space for whole-sky images and classified the sun, blue sky, and clouds using a threshold curve developed from 2-D BI and SI coordinates. Souza-Echer et al. (2006) transformed the images from the RGB color space to the hue saturation intensity space and presented a cloud detection method using only 1-D saturation information.

Most existing cloud identification algorithms encounter great uncertainties for cloud detection in the circumsolar and near-horizon regions. To address this challenge, Long et al. (2006) established a clear-sky function to identify clear, thin, and opaque clouds for the TSI images, and then Long (2010) proposed a statistical analysis method to correct the cloud identification errors in those regions. This technique operates by simulating the clear sky background (CSB) and computing its difference from the original image to obtain improved cloud identification results. Ghonima et al. (2012) classified clear, opaque, and thin clouds in the TSI images by subtracting a clear sky library, which is a function of image zenith angle, sun-pixel angle, and solar zenith angle, in the red-blue ratio space. Chauvin et al. (2015) simulated the clear sky background using least squares fitting in the normalized red/blue ratio space.

All of these algorithms assume that the RGB information of the images is correct and reliable. However, by analyzing the imaging principle of the cameras, Yang et al. (2015) found that, because of the Bayer color filter array (CFA, Bayer, 1976) and the various demosaicing algorithms (Kimmel, 1999) involved, the RGB color images acquired from the cameras do not represent true color, and that operations combining the channels magnify this divergence from the true representation. Since the CFA contains more green pixels than blue or red pixels, to mimic the human eye's sensitivity to green light, Yang et al. (2015) suggested using the 1-D green channel instead of the 2-D red-to-blue or the 3-D RGB algorithms to perform cloud detection, especially for partly cloudy images. The green channel background subtraction adaptive threshold (GBSAT) method (Yang et al., 2015) simulated the background brightness of the green channel using a morphological operation and had better cloud identification accuracy than the traditional methods, especially in the circumsolar and near-horizon regions. However, the simulated background sometimes cannot characterize the true sky background, which induces some detection errors, especially for thin clouds.

Published by Copernicus Publications on behalf of the European Geosciences Union.
Accordingly, we put forward a new cloud detection algorithm, named "clear sky background differencing" (CSBD), which adopts a real clear sky background, rather than a simulated one, to improve the cloud detection accuracy. The imaging device and image data are introduced in Sect. 2. Section 3 describes the CSBD algorithm in detail. Several TCI images taken under different weather conditions are analyzed in Sect. 4 to verify the validity of the proposed algorithm. Finally, a summary and suggestions for future research are given in Sect. 5.

Device and data
The images used in this paper were taken by a TCI instrument (Yang et al., 2012), developed by the State Key Laboratory of Severe Weather at the Chinese Academy of Meteorological Sciences. The core unit of the TCI system consists of an industrial camera and a fisheye lens. It provides 24-bit RGB color images at a resolution of 1392 × 1024 pixels at fixed intervals. The effective area of the TCI image is a circular region with a diameter of 800 pixels, after the removal of the zero value regions and some ground objects. Unlike other total-sky imagers, TCI adopts an AE technology to capture the total sky images without occluding the sun. Since the illumination conditions differ at each imaging time, the exposure time for each TCI image may differ, and these images have significantly different brightness distributions. To better understand the relationship between the exposure time and the brightness distribution, we compared the brightness histograms of two TCI images, which were taken at the same place and with the same camera parameters but at exposure times of 300 and 600 µs, respectively. Theoretically, since the exposure time of Fig. 1b is twice that of Fig. 1a, the gray value of each pixel in Fig. 1b should be twice that in Fig. 1a. Figure 1c, in which most pixel values are zero except for some noisy pixels, is the difference of 2 × (a) and (b) for the green channel. The normalized brightness histograms and the green channel peaks of Fig. 1a and b are shown in Fig. 1d. The multiplicative relationship between the two peaks (their gray values are 23 and 46) reflects the exposure time difference of the two images. Therefore, although the lighting conditions and exposure times of any two TCI images vary, we can give the two images very similar background brightness by adjusting their gray histograms. This property will be used later in the algorithm analysis.
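The exposure relationship described above can be written as a short worked equation. The symbols are notation introduced here for illustration only: g_a and g_b are the green-channel gray values of Fig. 1a and b, t_a and t_b are their exposure times, and p_a and p_b are the histogram peak positions. The relation assumes a linear sensor response with no saturated pixels.

```latex
g_b(x, y) \approx \frac{t_b}{t_a}\, g_a(x, y), \qquad
\frac{t_b}{t_a} = \frac{600\ \mu\mathrm{s}}{300\ \mu\mathrm{s}} = 2, \qquad
\frac{p_b}{p_a} = \frac{46}{23} = 2 .
```

Under this assumption, the ratio of the histogram peaks can stand in for the ratio of exposure times when the latter is not known.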

Cloud detection algorithm
In this section, we give an overview of the proposed CSBD algorithm, introduce how to set up the database of clear sky backgrounds, and describe the details of CSBD using an example. Finally, some sensitivity tests and limitations of the algorithm are presented.

Overview
Depending on the proportion of clouds in the sky, the sky type can be classified as clear, partly cloudy, or overcast. Li et al. (2011) and Yang et al. (2015) presented a method to distinguish the sky type from total sky images using brightness histogram analysis. Radiometric data can also be used to classify these three sky types accurately (Alonso et al., 2014; Chu et al., 2014). Determining the cloud coverage of clear sky and overcast images is relatively easy because of their cloudless or all-sky cloud properties, so the proposed algorithm focuses mainly on partly cloudy images. All subsequent total sky images were obtained in Tibet, China (88.88° E, 29.25° N), by a TCI device from 20 August 2012 to 5 July 2014. The CSBD algorithm first needs to build a clear sky background library (CSBL) for a large number of predefined solar elevation angles, which is accomplished by rotating the original TCI clear sky images by angles equal to the solar azimuth. Then, for any TCI cloud image, the longitude and latitude of the TCI location and the imaging time are used to calculate the solar azimuth angle and the solar elevation angle, and the green channel of the image is selected to perform cloud detection. Based on the solar elevation angle of the cloudy image, the corresponding CSB image, which will be differenced with the partly cloudy image, can be retrieved from the CSBL. The detailed flowchart of the algorithm is shown in Fig. 2 and is explained in the following subsections.
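To make the solar geometry step concrete, the following is a minimal sketch of a low-accuracy solar position calculation. It is not one of the solar positioning algorithms cited in this paper: it uses Cooper's textbook approximation for the declination, takes local solar time as input (so the equation of time and the longitude correction are ignored), and neglects refraction. The function name `solar_position` and its parameters are hypothetical.

```python
import math

def solar_position(day_of_year, solar_hour, latitude_deg):
    """Return (elevation_deg, azimuth_deg); azimuth is clockwise from north.

    Assumptions: Cooper's declination formula, local solar time as input,
    no refraction. Undefined when the sun is exactly at the zenith or the
    site is at a pole (division by zero in the azimuth formula).
    """
    # Approximate solar declination in degrees (Cooper's formula).
    decl = 23.45 * math.sin(math.radians(360.0 * (284 + day_of_year) / 365.0))
    # Hour angle: 0 at local solar noon, negative in the morning.
    hour_angle = 15.0 * (solar_hour - 12.0)
    lat, dec, ha = (math.radians(v) for v in (latitude_deg, decl, hour_angle))

    sin_el = (math.sin(lat) * math.sin(dec)
              + math.cos(lat) * math.cos(dec) * math.cos(ha))
    sin_el = max(-1.0, min(1.0, sin_el))
    el = math.asin(sin_el)

    cos_az = (math.sin(dec) - sin_el * math.sin(lat)) / (math.cos(el) * math.cos(lat))
    cos_az = max(-1.0, min(1.0, cos_az))  # guard against rounding error
    az = math.degrees(math.acos(cos_az))
    if ha > 0:  # afternoon: the sun is west of the meridian
        az = 360.0 - az
    return math.degrees(el), az

# Example: the latitude of the observation site, near the March equinox.
elevation, azimuth = solar_position(day_of_year=81, solar_hour=12.0, latitude_deg=29.25)
```

For operational use, one of the cited higher-accuracy algorithms would replace this sketch; only the returned pair of angles matters for the later steps.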

Real clear sky background library
The brightness of the sky varies greatly throughout the day because of the direct scattering of sunlight and the Rayleigh scattering by atmospheric molecules. The brightness distribution of the sky background in a total sky image depends heavily on the position of the sun in the image, and the location of the sun in the sky depends on both the observation time and the geographic coordinates of the observation site. Many solar positioning algorithms have been developed to calculate the solar azimuth and solar elevation angle for a given time, longitude, and latitude (Spencer, 1971; Michalsky, 1988; Meeus, 1998; Blanco-Muriel et al., 2001). The TCI images are typically oriented such that north is up, south is down, west is left, and east is right. Based on the solar azimuth and solar elevation angle information, Yang et al. (2015) proposed a solution to determine the position of the sun in the TCI images. For a fixed observation site, the positions of the sun in the images are rarely the same for any two different imaging times. This makes establishing the CSBL challenging: if we built the CSBL to include the CSB images for all times, the database would be too large and complex.
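As an illustration of how a sun position in the image could be derived from the two angles, the sketch below assumes an equidistant fisheye projection (zenith at the image center, horizon at the edge of the effective circle). This projection model is an assumption made here, not the solution of Yang et al. (2015), and the function name `sun_pixel` and its parameters are hypothetical.

```python
import math

def sun_pixel(elevation_deg, azimuth_deg, center_xy, radius_px):
    """Return the (x, y) pixel of the sun under an assumed equidistant projection.

    Azimuth is measured clockwise from north. The orientation follows the
    TCI convention stated in the text: north up, east right.
    """
    cx, cy = center_xy
    # Radial distance grows linearly with zenith angle (assumption).
    r = radius_px * (90.0 - elevation_deg) / 90.0
    az = math.radians(azimuth_deg)
    x = cx + r * math.sin(az)  # east is to the right
    y = cy - r * math.cos(az)  # north is up; image rows grow downward
    return x, y

# Example with a hypothetical image center and the 800-pixel effective diameter.
x, y = sun_pixel(elevation_deg=45.0, azimuth_deg=180.0, center_xy=(400.0, 400.0), radius_px=400.0)
```

With the sun due south at 45° elevation, the point lands halfway between the center and the bottom edge, on the vertical line through the center.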
Fortunately, using the symmetry of the total sky images (Huo and Lu, 2009), we can rotate each TCI image about the center of the image, with the rotation angle equal to the solar azimuth, so that the center of the sun always falls on the same line through the image center. Figure 3a and b are two TCI images acquired in the morning and afternoon of 8 October 2012. The positions of the sun in these two images are completely different, though they have the same solar elevation angles; one is on the right of the image, and the other is on the left. After image rotation, the centers of the sun in the two images (see Fig. 3e and f) are almost completely coincident. Here, to better preserve the brightness information of the original images, the nearest-neighbor interpolation algorithm is adopted when performing image rotation. Another two TCI images (Fig. 3c and d) were captured on different dates. They also have the same solar elevation angles but different solar azimuth angles, resulting in entirely different coordinates of the sun in the two images. Figure 3g and h show the results after image rotation, in which the positions of the two solar centers are exactly the same. In fact, the four TCI images (Fig. 3a-d) have equal solar elevation angles. Using the technique of image rotation, the CSB for these four times in the CSBL can be represented using any one of Fig. 3e-h.
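The rotation step can be sketched as follows, using inverse mapping with nearest-neighbor sampling on a single channel stored as a nested list. The sign convention and the helper `rotation_to_bottom`, which turns a solar azimuth into the rotation that places the sun directly below the image center, are assumptions made for this illustration (north up, east right, azimuth clockwise from north); the names are hypothetical.

```python
import math

def rotate_nearest(image, angle_ccw_deg, fill=0):
    """Rotate a 2-D list about its center, counterclockwise as displayed.

    Nearest-neighbor sampling copies original gray values unchanged,
    so no new brightness values are created by the rotation.
    """
    h, w = len(image), len(image[0])
    cx, cy = (w - 1) / 2.0, (h - 1) / 2.0
    c = math.cos(math.radians(angle_ccw_deg))
    s = math.sin(math.radians(angle_ccw_deg))
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx, dy = x - cx, y - cy
            # Inverse mapping: locate the source pixel for this output pixel.
            sx = int(round(cx + dx * c - dy * s))
            sy = int(round(cy + dx * s + dy * c))
            if 0 <= sx < w and 0 <= sy < h:
                out[y][x] = image[sy][sx]
    return out

def rotation_to_bottom(azimuth_deg):
    """Counterclockwise angle that moves a sun at this azimuth to azimuth 180 (assumed convention)."""
    return azimuth_deg - 180.0

# Example: a sun pixel due east (right of center) is moved below the center.
img = [[0, 0, 0],
       [0, 0, 9],
       [0, 0, 0]]
rotated = rotate_nearest(img, rotation_to_bottom(90.0))
```

Pixels whose source falls outside the original frame receive the `fill` value, which corresponds to the zero-valued region outside the effective circle.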
Every day the sun rises in the east and sets in the west, and the solar azimuth angle and the solar elevation angle change constantly. For any TCI image, by rotating the image by an angle equal to the solar azimuth, the sun will always be situated on the lower half of the perpendicular bisector of the image. The distance from the center of the sun to the center of the image is determined by the solar elevation angle, so we build a database representing the real clear sky background at different solar elevation angles. The interval of the solar elevation angles in the CSBL is 1°. Figure 4 shows examples of the CSBL for some typical CSB images, which were captured on 11 June 2013. The CSBL is updated on every clear sky day throughout the year because aerosols and climate affect the brightness distribution of the clear sky. Therefore, when the proposed cloud detection algorithm is applied to a certain TCI image, the CSB image with the same solar elevation angle as the TCI image and the closest date to it is selected to perform cloud detection.
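The selection rule in the last sentence can be sketched as a lookup keyed by the 1° elevation bin, followed by a closest-date choice. The data layout (a dictionary mapping an integer elevation to a list of date and image pairs), the function name `select_csb`, and the example entries are hypothetical.

```python
from datetime import date

def select_csb(library, elevation_deg, image_date):
    """Return the CSB entry in the matching 1-degree bin with the closest date.

    `library` maps an integer solar elevation angle to a list of
    (capture_date, image) tuples. Returns None if the bin is empty.
    """
    key = int(round(elevation_deg))  # 1-degree bins, as in the CSBL
    entries = library.get(key)
    if not entries:
        return None
    closest = min(entries, key=lambda entry: abs((entry[0] - image_date).days))
    return closest[1]

# Example with placeholder strings standing in for image arrays.
csbl = {
    68: [(date(2013, 3, 20), "csb_68deg_20130320"),
         (date(2013, 5, 26), "csb_68deg_20130526")],
}
chosen = select_csb(csbl, elevation_deg=68.2, image_date=date(2013, 6, 7))
```

Returning `None` for an empty bin leaves it to the caller to decide whether to fall back to a neighboring elevation, which the sensitivity tests suggest degrades the result.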

Cloud detection method using real clear sky background
For a given TCI image, the imaging time and the site longitude and latitude are known. Using any classical solar positioning algorithm, the solar azimuth angle and solar elevation angle at the imaging time can be calculated. Using the solar azimuth angle, the original TCI image is rotated so that the center of the sun lies on the perpendicular bisector of the new image. From the CSBL, we then retrieve the CSB image that has the same solar elevation angle as the TCI image. Next, the green channels of the two images are separated from the RGB color images. By analyzing the brightness histograms of the green channels, the gray values of the cloudy image are adjusted, pixel by pixel, by multiplying or dividing by a number to ensure that the two green channel images have similar background brightness distributions. The number is determined by the ratio of the peak positions of the two histograms. Finally, the differencing and binarization methods are applied to obtain the final cloud detection result. Figure 5 shows an example to illustrate each step of the proposed CSBD algorithm. Figure 5a is the original TCI image acquired on 7 June 2013, and Fig. 5b represents the resulting image after image rotation. Figure 5c is the CSB image captured on 26 May 2013, which has the same solar elevation angle (68°) as Fig. 5b. Figure 5d and e are the green channels of Fig. 5b and c. Figure 5f denotes the new green channel after adjusting the gray values of Fig. 5d. Figure 5e and f have very similar brightness distributions, especially for the sky background, after this histogram adjustment. The resulting difference of Fig. 5f minus e is shown in Fig. 5g and has a very homogeneous background. The background value represents the scattering differences of the aerosols in the two images. By setting a threshold larger than this difference, the final cloud detection result (Fig. 5h) can be obtained using simple binarization. The accuracy of cloud detection in the circumsolar and near-horizon regions is relatively low for traditional methods but, in this case, the proposed CSBD algorithm obtains satisfactory detection results for all regions in the visual comparison. For some thin clouds, this method can also give an accurate identification. Even some bright noise spots, caused by the refraction of light, are successfully excluded from the detection result because of their constant positions and similar brightness in Fig. 5e and f.
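The histogram adjustment, differencing, and binarization steps can be sketched as below for two already rotated green channels stored as nested lists of 8-bit values. Ignoring zero-valued pixels when locating the histogram peak (to skip the masked region outside the effective circle) and clipping the adjusted values at 255 are assumptions of this sketch, and the function names and the threshold value are hypothetical.

```python
def histogram_peak(channel, min_value=1):
    """Return the most frequent gray value, ignoring values below min_value."""
    counts = [0] * 256
    for row in channel:
        for value in row:
            counts[value] += 1
    return max(range(min_value, 256), key=lambda v: counts[v])

def csbd_mask(cloudy_green, csb_green, threshold):
    """Scale the cloudy image to the CSB brightness, difference, and binarize.

    Returns a mask with 1 for cloud and 0 for sky.
    """
    # Scale factor from the ratio of the two histogram peak positions.
    scale = histogram_peak(csb_green) / histogram_peak(cloudy_green)
    mask = []
    for cloudy_row, csb_row in zip(cloudy_green, csb_green):
        mask_row = []
        for cloudy, background in zip(cloudy_row, csb_row):
            adjusted = min(255.0, cloudy * scale)  # brightness adjustment
            mask_row.append(1 if adjusted - background > threshold else 0)
        mask.append(mask_row)
    return mask

# Toy example: one bright pixel stands out after the backgrounds are matched.
csb = [[40, 40, 40],
       [40, 40, 40]]
cloudy = [[20, 20, 60],
          [20, 20, 20]]
result = csbd_mask(cloudy, csb, threshold=30)
```

In this toy case the peaks are 20 and 40, so the cloudy image is doubled before differencing; the background difference becomes zero and only the bright pixel exceeds the threshold.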

Sensitivity tests
The theoretical basis of the proposed CSBD algorithm is that two images have similar brightness distributions as long as the positions of the sun in the images are the same, so a correct CSB image is critical for accurate cloud detection. To check the influence of CSB images with different solar elevation angles on our algorithm, we performed some sensitivity tests. Figure 6 shows the cloud detection results for a TCI image based on different CSB images. Figure 6a shows the original TCI image after rotation, which was taken on 11 October 2012, with a solar elevation angle of 33°. Figure 6b represents three different CSB images. Figure 6c shows the resulting difference images, and Fig. 6d shows the corresponding cloud detection results. The CSB image in the first row has exactly the same solar elevation angle as Fig. 6a, while the solar elevation angle of the CSB image in the second row deviates by 2° from Fig. 6a, and the deviation in the last row is 5°. The cloud detection result in the first row is clearly better than the others because the CSB image in that row has a solar elevation angle identical to that of Fig. 6a. Figure 6e represents the differences between the results for the 2° and 5° offsets and the baseline result in row one. When there is a deviation of solar elevation angle between the two images, some cloud detection errors appear, especially in the circumsolar region. The greater the deviation angle, the greater the cloud detection errors.
Figure 7 shows the cloud detection results for another TCI image based on different CSB images. The design and the test are the same as in Fig. 6, but the TCI image in Fig. 7a was acquired on 20 March 2013 and its solar elevation angle is 50°, which is higher than in Fig. 6a. The test results for a third TCI image are given in Fig. 8. Unlike in the first two tests, the TCI image in Fig. 8a was obtained on 28 June 2013 and has a very high solar elevation angle (about 65°). The conclusions of these three sensitivity tests are consistent: regardless of the solar elevation angle in the TCI image, the detection result is satisfactory as long as the CSB image has the same solar elevation angle as the tested TCI image. If not, there is a large cloud detection error, especially in the circumsolar region.
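The comparison against the baseline result in these tests is visual. As one possible way to put a number on it, the following sketch computes the fraction of pixels on which two binary cloud masks disagree; this metric and the function name `mask_disagreement` are illustrative assumptions, not part of the tests reported here.

```python
def mask_disagreement(mask_a, mask_b):
    """Return the fraction of pixels whose labels differ between two masks."""
    total = 0
    differing = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            total += 1
            if a != b:
                differing += 1
    return differing / total if total else 0.0

# Toy example: a baseline mask and a mask from an offset CSB image.
baseline = [[1, 0],
            [0, 0]]
offset_result = [[1, 1],
                 [0, 0]]
fraction = mask_disagreement(baseline, offset_result)
```

Applied to the masks obtained with offset CSB images, a larger value would indicate a stronger departure from the baseline, although it does not say which mask is correct.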

Limitations
The proposed CSBD algorithm is suitable for the detection of clouds in partly cloudy total sky images, especially when the sun is not obscured by clouds. However, if the sun is blocked by clouds partially or completely, the intensity of the Mie scattering is significantly weaker than the forward scattering of the unobscured sun, resulting in failed detection by the CSBD algorithm in the circumsolar region. The examples in the first row of Fig. 9 portray this limitation. Figure 9a is the original TCI image after rotation, which was obtained on 25 August 2012; Fig. 9b shows the CSB image retrieved from the CSBL; Fig. 9c presents the difference between the two images (Fig. 9a and b); and Fig. 9d denotes the final cloud detection result. This limitation can be mitigated by combining the CSBD result with a separate detection result for the circumsolar region. Yang et al. (2015) presented a method to determine the center coordinate of the sun, and by prescribing a radius around this coordinate, we can define the extent of the circumsolar region. By setting a suitable threshold for the green channel, we can obtain the cloud pixels in the circumsolar region. Merging these results yields a more accurate cloud detection result.
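A minimal sketch of this merging step is given below. Inside a circle around the sun center, the CSBD labels are replaced by a green-channel decision; outside it they are kept. Because the threshold value and its direction are not specified above, the decision is passed in as a caller-supplied function `is_cloud`; that parameter, the function name `merge_circumsolar`, and the example values are hypothetical.

```python
def merge_circumsolar(csbd_mask, green, sun_xy, radius_px, is_cloud):
    """Overwrite CSBD labels inside the circumsolar circle with a green-channel test.

    `is_cloud` takes a green gray value and returns True for cloud.
    """
    sun_x, sun_y = sun_xy
    merged = [row[:] for row in csbd_mask]  # leave the input mask untouched
    for y, row in enumerate(green):
        for x, value in enumerate(row):
            inside = (x - sun_x) ** 2 + (y - sun_y) ** 2 <= radius_px ** 2
            if inside:
                merged[y][x] = 1 if is_cloud(value) else 0
    return merged

# Toy example with an illustrative threshold of 100 on the green channel.
csbd = [[1, 0, 0],
        [0, 0, 0],
        [0, 0, 0]]
green = [[0, 0, 0],
         [0, 200, 50],
         [0, 0, 0]]
merged = merge_circumsolar(csbd, green, sun_xy=(1, 1), radius_px=1,
                           is_cloud=lambda g: g > 100)
```

Passing the decision as a function keeps the sketch neutral about how the circumsolar threshold is chosen for a given instrument.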
The second row of Fig. 9 shows the other limitation of the CSBD algorithm. The original TCI image was taken on 23 August 2012. When the clouds are optically thick and the cloud base height is very low, the brightness of these clouds is very low, sometimes even lower than that of the corresponding pixels in the CSB image; this inevitably introduces some detection errors for these pixels (see the left region in the second row of Fig. 9). Considering that the traditional 2-D red-to-blue methods have very high detection accuracy for optically thick clouds, the two results can be merged to improve the identification accuracy of clouds.

Results comparison
To better understand the effectiveness of the proposed CSBD method, the cloud detection results for five different TCI images, using different methods, are compared in Fig. 10. In this comparison, we selected three traditional algorithms as references: 2-D R/B, 3-D multicolor, and 1-D GBSAT. For the 2-D R/B method, 0.6 is adopted as a single threshold. Figure 10a shows the original TCI images after rotation, Fig. 10b represents the results of the R/B method, Fig. 10c shows the results of the multicolor method, Fig. 10d denotes the results of GBSAT, and Fig. 10e shows the results of the proposed CSBD algorithm. In all these results, white pixels represent cloud and black pixels represent sky.
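For reference, the single-threshold R/B rule used as the first baseline can be sketched as follows: a pixel is labeled cloud when its red-to-blue ratio exceeds 0.6, since clouds are white or gray (ratio near 1) while clear sky is predominantly blue (low ratio). The treatment of pixels with a zero blue value, labeled as sky here, is an assumption of this sketch, and the function name `rb_ratio_mask` is hypothetical.

```python
def rb_ratio_mask(red, blue, threshold=0.6):
    """Return a mask with 1 where R/B exceeds the threshold, else 0.

    Pixels with blue == 0 (for example, the masked region) are labeled 0.
    """
    mask = []
    for red_row, blue_row in zip(red, blue):
        mask.append([
            1 if b > 0 and r / b > threshold else 0
            for r, b in zip(red_row, blue_row)
        ])
    return mask

# Toy example: a whitish pixel and a blue-sky pixel.
red = [[200, 60]]
blue = [[210, 200]]
rb_mask = rb_ratio_mask(red, blue)
```

Because the rule depends only on the color ratio of each pixel, a bright, nearly white sun disk is labeled cloud as well, which matches the misclassification of sun pixels noted for this method.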
Quantitatively evaluating the precision of different cloud detection methods is considerably difficult because such an assessment first requires a standard cloud mask for each TCI image. However, such a standard mask can only be obtained by manual sketching and, as such, depends heavily on the subjective judgment of the person drawing it; so, in this comparison, we simply evaluate the precision of the different cloud detection methods by visual examination. The results of the 2-D R/B method are acceptable for some thick clouds but are unacceptable for the vast majority of thin clouds. When the sun is visible in the TCI images, the 2-D R/B method misclassifies most sun pixels as cloud pixels. Of all these algorithms, the 3-D multicolor method incorrectly identified the most sky pixels as cloud pixels, especially in the circumsolar and near-horizon regions. The reason may be that some fixed multicolor criteria are not suitable for our TCI instrument and local climatic conditions. The GBSAT algorithm obtains satisfactory results in both the circumsolar and near-horizon regions but still fails to detect some thin clouds, as seen in the last two rows of the fourth column. The CSBD algorithm outperforms the traditional methods, especially when the sun is visible. The results of these cases also indicate that the CSBD algorithm is very effective for thin clouds, for which it obtains better results than the classical methods. The bright noise spots caused by the refraction of light are also successfully identified as non-cloud pixels, which is almost impossible for the traditional methods.
The purpose of this paper was to introduce a new cloud detection algorithm for partly cloudy total sky images. Traditionally, the 2-D R/B methods were widely used and accepted as standard algorithms to detect clouds. The 3-D RGB methods were developed by some researchers in order to improve the accuracy of cloud detection. However, by analyzing the imaging principle of the color camera, Yang et al. (2015) suggested using the 1-D green channel of the RGB image instead of the 2-D R/B and 3-D RGB methods for cloud detection; so, in this paper, the proposed CSBD algorithm was based on the green channel of the images. The CSBL database was built by rotating the original TCI images taken under clear sky conditions. The center of the sun was always on the perpendicular bisector of the image in the CSBL. For any single cloudy TCI image, the CSB image with the same solar elevation angle as the TCI image was retrieved from the CSBL. Histogram adjustment was performed to ensure that the two images had similar brightness distributions. Finally, differencing and binarization were applied to obtain the final cloud identification result. The test results showed that the proposed CSBD algorithm outperforms the traditional methods, especially in the circumsolar and near-horizon regions and for thin clouds when the sun is visible in the image. Additionally, some bright noise spots, due to the refraction of light, can also be correctly classified as non-cloud pixels.
It should be noted that the CSBD algorithm still has some limitations. When the overlying clouds are thick and the cloud base heights are low, the brightness values of these cloud pixels can be even lower than those of the corresponding pixels in the CSB image. The CSBD algorithm alone may misclassify these cloud pixels as clear pixels. The accurate thick cloud identification of the 2-D red-to-blue methods can be combined with the CSBD result to improve the cloud detection accuracy of this new algorithm. Additionally, when the sun is obscured by clouds, partially or completely, the identification result may miss some cloud pixels in the circumsolar region. This limitation can be mitigated by merging a separate cloud detection result for the circumsolar region with the CSBD result.

Figure 1. Relationships between the exposure time and the brightness value for the TCI images. (a) TCI image at an exposure time of 300 µs, (b) TCI image with the same lighting conditions and camera parameters as in (a), but with 600 µs exposure time, (c) difference of 2 × (a) and (b) for the green channel, and (d) normalized brightness histograms for the green channel of (a) and (b).

Figure 2. Flowchart of the proposed algorithm.

Figure 3. Several TCI images of different imaging times and the results after rotation. Panels (a) and (b) are the TCI images taken in the morning and afternoon of 8 October 2012, (c) was captured on 5 April 2013, and (d) was obtained on 11 June 2013; (e) through (h) are the corresponding rotated images of (a) to (d).

Figure 4. Several typical clear sky background images captured on 11 June 2013. The solar elevation angles of (a) to (h) range from 10 to 80° at intervals of 10°.

Figure 5. Cloud detection result based on CSB.Panel (a) is the original TCI image captured on 7 June 2013, (b) is the image after rotation, (c) is the real clear background image with the same solar elevation angle as (b), (d) shows the green channel of (b), (e) shows the green channel of (c), (f) represents the new green channel of (d) after brightness adjustment, which has very similar brightness distribution to (e) especially for sky background, (g) is the difference of (f) and (e), and (h) is the ultimate cloud detection result.

Figure 6. Sensitivity test for low solar elevation angle. (a) is the TCI image after rotation,

Figure 7. Sensitivity test as Fig. 6 but for medium solar elevation angle. (a) is the TCI image

Figure 8. Sensitivity test as Fig. 6 but for high solar elevation angle. (a) is the TCI image after

Figure 9.

Figure 10.Comparison of different cloud detection methods.Column (a) shows the TCI images after rotation, (b) represents the results of R/B, (c) shows the results of multicolor method, (d) denotes the results of GBSAT, and (e) shows the results of the proposed CSBD method.