An automated cloud detection method based on the green channel of total-sky visible images

Obtaining an accurate cloud-cover state is a challenging task. In the past, traditional two-dimensional red-toblue band methods have been widely used for cloud detection in total-sky images. By analyzing the imaging principle of cameras, the green channel has been selected to replace the 2-D red-to-blue band for detecting cloud pixels from partly cloudy total-sky images in this study. The brightness distribution in a total-sky image is usually nonuniform, because of forward scattering and Mie scattering of aerosols, which results in increased detection errors in the circumsolar and near-horizon regions. This paper proposes an automatic cloud detection algorithm, “green channel background subtraction adaptive threshold” (GBSAT), which incorporates channel selection, background simulation, computation of solar mask and cloud mask, subtraction, an adaptive threshold, and binarization. Five experimental cases show that the GBSAT algorithm produces more accurate retrieval results for all these test total-sky images.


Introduction
Clouds play an important role in the global water cycle and the Earth's radiation budget (Kokhanovsky, 2006), and their sky coverage and movements are strongly influenced by weather phenomena.Currently there are numerous satellitebased and ground-based instruments engaged in retrieving global cloud coverage.Satellites provide large-scale cloud observations but their relatively low spatial and temporal resolution often lead to some uncertainties in quantifying cloud features.In contrast, ground-based instruments can provide high temporal resolution measurements for clouds.Tapakis and Charalambides (2013) give a more fully detailed review about the advantages and disadvantages of satellite observations and ground observations.Several automatic ground-based sky imaging devices were manufactured and implemented in order to retrieve cloud coverage and type from the visible portion of the electromagnetic spectrum.These instruments include the Total-Sky Imager (TSI) (Long and DeLuisi, 1998), the Whole-Sky Imager (WSI) (Johnson et al., 1989;Shields et al.,1993), the Whole-Sky Camera (WSC) (Long et al., 2006;Calbó and Sabburg, 2008), the All-Sky Imager (Cazorla et al., 2008), the All-Sky Imager system (ASIs) (Huo and Lu, 2009), the Total-Sky Cloud Imager (TCI) (Yang et al., 2012) and the Solmirus All-Sky Infrared Visible Analyzer (ASIVA) (Morris and Klebe, 2010;Klebe et al., 2014).Each of these imagers can retrieve the color total-sky images at user-defined intervals using a color charge-coupled device (CCD) or a Complementary Metal-Oxide Semiconductor (CMOS) camera with a fisheye lens or a parabolic mirror.These instruments adopt various strategies to deal with the intense direct radiation from the sun: a solar-tracking occulter is used in most of the imagers except ASIVA and TCI; TCI uses an auto-exposure technology, instead of a shadow band, to capture unsaturated images.
The theoretical foundation of cloud detection algorithms incorporates Rayleigh scattering for the atmosphere and Mie scattering for clouds, giving a blue sky and white cloud appearance in fair sky conditions.Two-dimensional red-to-blue band threshold methods were widely used with the aforementioned instruments to discriminate cloud pixels from sky pixels for 8-bit red-green-blue (RGB) color images (the intensity range is 0-255 for each channel).The WSI team (Johnson et al., 1988;Koehler et al., 1991) adopted two fixed thresholds (FT) to discriminate clear sky, thin cloud, and opaque cloud in red-to-blue ratio (R/B) space.Long et al. (2006) detected clouds in the WSC images using 0.6 as a single FT in R/B space, and a sophisticated clear envelop function with some configurable settings was used to improve the cloud detection accuracy for a commercial TSI instrument.Heinle et al. (2010) adopted R − B difference instead of R/B ratio and recommended R − B = 30 as an optimal FT to detect clouds.Yamashita et al. (2004) defined a sky index, which is equal to (B − R)/(B + R), and considered sky index values less than 0.2 as cloud areas.The adaptive threshold (AT) method (Yang et al., 2009) and hybrid threshold algorithm (HYTA) (Li et al., 2011) were proposed to detect clouds within a small field of view in (R − B)/(R + B) space.Yang et al. (2012) improved the AT method and proposed the background subtraction adaptive threshold (BSAT) method, which is suitable for total-sky images.Combining the clear sky library (CSL) (Ghonima et al., 2012;Chow et al., 2011) and the sunshine parameter (SP) (Pfister et al., 2003), a more accurate cloud classification can be obtained compared to the FT method in the R/B space.In addition to these 2-D red-to-blue methods, Souza-Echer et al. (2006) utilized the predefined upper and lower thresholds to classify image pixels into clear sky, cloud, and undefined class in 1-D saturation component space.Sylvio et al. (2010) classified the sky and cloud pixels using Bayesian statistics methods and Euclidean geometric distance (EGD) in a 3-D RGB color space.Kazantzidis et al. (2012) proposed a multi-color criterion method to detect cloud pixels also in a 3-D RGB space.The precision of all these traditional methods heavily depends on accurate RGB color information.Cloud detection in the circumsolar and near-horizon regions encounters great difficulties because of the influence of forward scattering and Mie scattering of aerosols, which makes these regions have similar brightness distributions to clouds.Long (2010) developed a method applying a statistical analysis of images at least once per minute to correct the detection errors in these regions but, so far, the detection accuracy of clouds in these areas is still low.
In this paper, we analyze the imaging principle of color cameras and propose a new algorithm to automatically detect clouds within ground-based total-sky visible images.The imaging theory and band selection are described in Sect. 2. In Sect.3, the cloud detection algorithm is introduced in detail.Several different types of partly cloudy total-sky images are analyzed to evaluate the validity of the proposed method in Sect. 4. Finally, a summary and suggestions for future research are given in Sect. 5. Bayer grid (twice as many green pixels than red or blue pixels) (Sakamoto et al., 1998).

Imaging principle and band selection
All total-sky images used in this paper were obtained with a TCI instrument, operating in Tibet (29.25 • N, 88.88 • E) from August 2012 to July 2014.The core component of TCI is an industrial digital camera with a fisheye lens, with a view angle of 185 • .The TCI device produces a color total-sky image every 5 min; however, most digital cameras use CCD or CMOS sensors to convert true scenes into electric signals, which distinguish light intensity with little or no wavelength specificity (Nakamura, 2005).In order to get color information, a color filter array (CFA) is placed over an image sensor's light sensitive surface.The color filters filter the light by wavelength range by passing light of a specific wavelength and absorbing other wavelengths.The most commonly used color filter in photography is Bayer CFA, which comes from Bryce Bayer's patent (1976).This filter pattern is composed of 50 % green, 25 % red, and 25 % blue, taking advantage of the human eye's higher sensitivity to green light, and can be seen in Fig. 1.
Information a camera acquires after using CFA is called raw data, which needs to be converted to a full-color image by various demosaicing algorithms (Kimmel, 1999).Since the data are interpolated, they will inevitably introduce additional noise errors, meaning the color images we obtain from the cameras are not really true color.Each channel can be considered as an independent true signal with noise.To take 2-D red-to-blue bands methods as an example, the linear processing of red and blue signals result in the root-meansquare (RMS) sum of the noises, so 2-D red-to-blue band methods will inevitably increase the error for unsigned 8-bit images.
To elaborate upon this explanation, the brightness distribution of RGB channels and several plot the brightness distribution of RGB channels along a horizontal line (shown as a red line in the image) as the left panel of the bottom row in Fig. 2. It is easily understood that cloud pixels have higher brightness than sky pixels in the same or adjacent region in the RGB channels, and the brightness difference between clouds and sky is very large.When the sun is shadowed, the values of sky pixels should be relatively homogeneous, which can be seen from the pixel position 400 to 600 in the RGB channels brightness distribution.Next, we draw the luminance distribution of R/B, R − B, and (R − B)/(R + B) results for the same horizontal position, shown in the right panel.For a better comparison, all these 2-D red-to-blue results are converted to unsigned 8bit images.The results show that (1) 2-D red-to-blue methods have more noises in the transition regions of clouds and sky, and (2) the brightness difference between clouds and sky becomes even smaller than the RGB channels.
The same analysis has been done for another TCI image in which the sun is visible.Due to the forward scattering of the direct sunlight, those pixels close to the sun become brighter, which can be seen from the pixel position 300 to 600 in the RGB channels' brightness distribution in Fig. 3.However, for those 2-D red-to-blue results, not only does their luminance distribution have more noises, but the brightness difference between cloud and sky also becomes smaller than the RGB channels due to interpolation effects.
More examples have been analyzed, and similar conclusions are drawn i.e., the traditional 2-D red-to-blue methods introduce additional noise, especially in the transition regions of clouds and sky, and the single red, green or blue channel is more suitable to detect cloud pixels from the partly cloudy images.Considering the importance of the green channel in the CFA, we selected the green channel for the following processing in the proposed algorithm.Certainly, the color of sky images is determined by both imaging instruments and local atmospheric conditions, and a different white balance scheme may produce a different color for the same sky.For most of the total-sky images, the color of thin clouds or cumulus appears white and their brightness values are very high, while the brightness values of some optically thick clouds may be lower than those in the clear sky regions.Combining multiple information, including the R/B band, may improve the accuracy of thick cloud identification.

Overview
The sky type can be divided into clear, overcast, and partly cloudy according to its cloud-cover state.The brightness histograms for clear and overcast types show significant unimodal distribution, while those of cloudy types are strongly bimodal.Using histogram analysis, the cloudy images can be separated from the other images.On the other side, by analyzing the histograms of the R/B band, the peak of the clear type is on the low brightness side, while the peak of the overcast type is on the high brightness side.Given a reasonable threshold, it can distinguish between clear and overcast sky; so in this paper, the proposed cloud detection method only focuses on partly cloudy images.The GBSAT algorithm is composed of three main sub-processes: determining whether the sun is obscured by clouds and providing a solar mask, detecting clouds based on a background subtraction adaptive threshold method, and removing the impact of direct sunlight.The process is described in the flow chart illustrated in Fig. 4. The first sub-process calculates the circularity of the saturated pixels in the circumsolar region to determine whether the sun is obscured by clouds and creates the solar mask.The cloud detection process consists of a morphology opening operation, the adjustment of background information dependent on sun-shadowing, background subtraction, and using AT methods to define the cloud mask.In the third sub-process, the final cloud detection result will be obtained by subtracting the solar mask from the cloud mask.The detailed steps of the GBSAT will be explained in the following subsections.

Determining whether the sun is occulted and calculating the solar mask
The position of the sun in the sky depends on both time and the geographic coordinates of the observer.To compute the sun's position for a given time and a given longitude and latitude, we need to first calculate the sun's position in the ecliptic coordinate system, then convert it to the equatorial coordinate system, and finally convert it to the horizontal coordinate system for the given local time and geographic position.Existing solar positioning algorithms can be divided into fast algorithms for general engineering and high-precision astronomical algorithms.Typical position algorithms of the sun include Spencer (1971), Michalsky (1988) and the Plataforma Solar de Almería (PSA) al- gorithm (Blanco-Muriel et al., 2001).Meeus (1998) is representative of highly accurate astronomical algorithms.
Using any one solar positioning algorithm, we can obtain the solar zenith angle and azimuth for the imaging time at the specified location.Figure 5 shows the schematic of fisheye imaging and the position of the sun in the fisheye image.In  the schematic, W, E, S, and N represent west, east, south, and north directions, respectively.O is the center of the image.OZ represents zenith direction.P is the projection of the sun.OD is the radius of the fisheye image.H and A represent solar zenith angle and azimuth.Here, the range of azimuth is from −180 to 180 • .The southern azimuth is zero, and from south to east is negative; from south to west is positive.
The fisheye lens in the TCI device adopts an angular projection (also called a f-theta lens), which means the distance from the center of the image is proportional to the angle around the projection sphere (Schneider et al., 2009).According to the imaging principle of f-theta lens, the following equations are applied to calculate the position of the sun in the TCI image: (1) where X and Y are the central coordinates of the sun in the image; C x and C Y represent the central point of the image; and α is the half of the angle of view of fisheye lens.The circumsolar region is assumed to be circular in the TCI images.Since the location of the sun in the coordinate system has been calculated, the circumsolar region can be determined by giving a radius in the TCI green channel.The high brightness pixels (HBP) can be obtained by setting a high threshold (such as 180) for the circumsolar region in the green image.Saturated pixels (brightness is close or equal to 255 in the unsigned 8-bit image) can also be acquired in the circumsolar region.Several works (Yang et al., 2014;Kazantzidis et al., 2012;Alonso et al., 2014) described how to determine whether the sun is occulted by combining pyrheliometer measurements but, in most cases, measurement of direct radiation is not available.When there are no obstructions (i.e., clouds), the shape of the sun retrieval is roughly circular, so circularity is adopted in this paper to judge whether the sun is blocked by clouds.This circularity is computed for the circumsolar saturated pixels (CSP) using the following equation (Montero and Bribiesca, 2009): where C denotes circularity, S is the area of CSP, and L represents perimeter of CSP.Higher values of C imply situations where the possibility of the sun being occluded is low.A circularity value of 1.0 means a perfect circle.In general, when C is greater than 0.3, the sun can be considered as visible.
Figure 6 shows three cases for calculating the solar masks.The leftmost column shows the original TCI images, and the second column shows the corresponding green channels.The third and fourth columns show HBPs and CSPs, respectively.For the first row TCI image, the sun is occluded by the clouds and the circularity of CSP is very low, so the solar mask is set to empty.However, the values of circularity in the remaining two TCI images are relatively higher, which means the sun is not blocked and, thus, the solar masks are equal to the circumsolar HBPs.

Computing cloud mask using modified background subtraction adaptive threshold
Due to the effects of direct solar radiation and scattering of aerosols and clouds, the background brightness is nonuniform in the total-sky images (Long et al., 2006).The BSAT method can estimate background information from the totalsky images using a morphology opening operation, but the estimated background still has missing information, especially in the circumsolar and near-horizon regions, where the forward scattering of the solar direct light makes it much brighter than the other regions in the total-sky image.On the other hand, a low solar elevation angle leads to greater aerosol scattering, which will increase the brightness in the near-horizon region.
The proposed method adopts the same idea as the BSAT method, but instead using a green channel with the morphology opening operation used to simulate background brightness (Yang et al., 2012).In mathematical morphology, there are two fundamental operations, dilation and erosion, which can be used to expand or shrink a set A by a structure element E, respectively.The opening of A by E is given by the erosion A by E, followed by the dilation of the resulting image by E. The opening operation can simulate background information of a gray image by giving an appropriate structure element.In this study, the disk structure is adopted to get accurate sky background information.When the sun is visible in the images, the pixels in the circumsolar and near-horizon regions are brighter than the other regions in the green channel.The simulated background is adjusted with the following equations: where ABG denotes new background information after adjusted, BG represents the background simulated using the morphology opening operation, and α, β, and γ are adjusted coefficients.In general, α is larger than β and γ .The radii of the direct light and circumsolar regions are predefined according to the imaging characteristics of the camera and fisheye lens.The size of the near-horizon region mainly depends on the local climatology and weather.An image of the homogeneous background can be obtained by subtracting the adjusted background from the green channel image.Figure 7 shows the intermediate processing results for each step of the proposed method.The first column shows the green channel, and the second column shows the simulated background information using morphology opening operation.The middle column shows the new background after adjusted brightness of circumsolar and near-horizon regions.For the image in the first row, we adjusted the background brightness in the near-horizon region because the sun is occluded in this image.For the other two images in which the sun is visible, the background brightness is adjusted both in the circumsolar and near-horizon regions.
Here, the adjusted coefficient α is set to 2, and both β and γ are equal to 1.5.The fourth column shows the results of ABG subtracted from the green channel.The AT method is used to compute a threshold for the uniform background image and binarization processing is done to distinguish cloud and sky pixels.The results of the cloud mask can be seen in the rightmost of Fig. 7.

Removing the impact of direct sunlight
When the sun is visible in the image, the influence of the direct light of the sun is a large source of errors for cloud detection.After getting the solar mask and cloud mask, the ultimate cloud detection result can be obtained by subtracting the solar mask from the cloud mask.The related cases are shown in Fig. 8.In the case of an occluded sun, the final cloud detection result is the same as the cloud mask because the solar mask is empty.By removing the impact of direct sunlight, the obtained cloud detection results appear correct in a visual comparison.Due to the refraction of the light caused by the transparent protective cover, some bright spot noises in the TCI image are inevitable when the sun is visible; this phenomenon can be seen in the middle row of Figs. 7 and 8.It is very difficult to discriminate between these bright spot noises and cloud pixels because of their similar brightness distribution.

Results comparison
Five partly cloudy TCI images are taken as examples in order to evaluate the performance of the proposed algorithm.In this paper, the GBSAT results are compared with three traditional methods, R/B, R − B, and BSAT.Here, the fixed threshold 0.6 (Long et al., 2006) is adopted in the R/B method.The difference threshold 30 (Heinle et al., 2010) is given in the R−B algorithm.The precision of the cloud detection can be evaluated by subjective human examination and objective quantitative comparison.To conduct a quantitative assessment, a standard cloud segmentation result should be obtained for each test image.However, it is difficult to get such a standard cloud segmentation result, which can only be obtained through an artificial sketch.Even if we can get such a standard result, it still depends heavily on the accuracy of manual transcription.So, in this paper, we compare the precision of different methods by human examination.
The R/B method clearly detects large amounts of cloud in the circumsolar region when the sun is visible in the images (see the first two rows of the second column).For some thin clouds, R/B method also fails to give correct detection results (see the third row of the second column).These errors may be improved by using Long's clear-sky limit methodology or statistical analysis method (Long et al., 2006(Long et al., , 2010)).Of all these methods, the results of the R − B method are the weakest.This may be an indicator that the fixed difference threshold is not suitable for TCI devices and local weather conditions.The results of the BSAT method are better than R/B and R − B methods, but they still have some incorrect detection, especially in the circumsolar and near-horizon regions.The proposed GBSAT method obtains more satisfactory results in all these cases.

Conclusions
Two-dimensional red-to-blue band threshold methods historically have been used extensively for cloud detection.However, the imaging theory of cameras shows that the color images we acquire from cameras are not really true color images, which are obtained by interpolation from raw CFA data.These interpolations will inevitably introduce some noises into each channel and the use of 2-D red-to-blue bands methods will amplify this noise, resulting in significant detection errors, especially in the transition regions of the sky and clouds.In actuality, the single red, green or blue channel is more suitable for cloud detection especially for partly cloudy images.Because there are twice as many green pixels as there are blue or red, meaning the green channel represents more real information, we select this channel for the proposed algorithm.
The difficulty of detecting clouds lies in the uneven illumination in the total-sky images, which means a single threshold is inapplicable for this condition.Up to now, almost all traditional methods have many cloud detection errors in the circumsolar and near-horizon regions, especially when the sun is visible.The radius of the circumsolar region and the size of the near-horizon region depend on the imaging instrument and local climate.The proposed GBSAT method is dedicated to improving the detection accuracy in these regions.The experimental results show that the GBSAT algorithm is good for all types of tests for partly cloudy images and obtaining satisfactory visual effects.For some optically thick clouds, the results of 2-D red-to-blue methods should be combined to improve the accuracy of cloud detection.Also, it should be noted that some detection errors due to the noise caused by the refraction of direct sunlight have still not been removed.These false retrievals have a similar brightness to clouds and their positions are variable with the change of the solar zenith angle.In future work, the real clear background should be considered as a replacement for the simulated background.For a long-term in situ observation, massive numbers of clear images can be acquired and stored for different sun positions so that, for any cloudy image, we can find a corresponding clear image in which the sun position is very near or same as in the cloudy image.These two images should have not only very similar background brightness distributions, but also the same refractive noises.Thus, a more accurate cloud detection result can be taken by subtracting the clear background from the cloudy image directly.

Figure 2 .
Figure 2. Brightness distribution for different channels.

Figure 3 .
Figure 3. Same as Fig. 2 but with visible sun.

Figure 4 .
Figure 4. Flow chart of the proposed algorithm. 22

Figure 5 .
Figure 5. Schematic of fisheye imaging and position of the sun.

Figure 6 .
Figure 6.Calculation of the solar masks.

Figure 7 .
Figure 7. Modified background subtraction using the green channel.

Figure 8 .
Figure 8. Removing the impact of direct sunlight.

Figure 9 .
Figure 9. Results comparison for different cloud detection methods.