Sky camera geometric calibration using solar observations

A camera model and an associated automated calibration procedure for stationary daytime sky imaging cameras are presented. The specific modeling and calibration needs are motivated by remotely deployed cameras used to forecast solar power production, where cameras point skyward and use 180° fisheye lenses. Sun position in the sky and on the image plane provides a simple and automated approach to calibration; special equipment or calibration patterns are not required. Sun position in the sky is modeled using a solar position algorithm (requiring latitude, longitude, altitude and time as inputs). Sun position on the image plane is detected using a simple image processing algorithm. The performance evaluation focuses on the calibration of a camera employing a fisheye lens with an equisolid angle projection, but the camera model is general enough to treat most fixed focal length, central, dioptric camera systems with a photo objective lens. Calibration errors scale with the noise level of the sun position measurement in the image plane, but the calibration is robust across a large range of noise in the sun position. Calibration performance on clear days ranged from 0.94 to 1.24 pixels root mean square error.


Introduction
The power output variability of renewable energy sources poses challenges to their integration into the electricity grid. Forecasting of renewable power generation (e.g., Monteiro et al., 2009; Perez et al., 2010; Kleissl, 2013) enables more economical and reliable scheduling and dispatch of all generation resources, including renewables, which in turn accommodates a larger amount of variable supply on the electricity grid. Specifically for solar power forecasting, a number of technologies are being applied: numerical weather prediction (e.g., Lorenz et al., 2009; Mathiesen and Kleissl, 2011; Perez et al., 2013); satellite image-based forecasting (e.g., Hammer et al., 1999; Perez and Hoff, 2013); and stochastic learning methods (e.g., Bacher et al., 2009; Marquez and Coimbra, 2011; Pedro and Coimbra, 2012). For very short term (15 min ahead) solar power forecasting on the kilometer scale, sky imaging from ground stations has demonstrated utility (Chow et al., 2011; Urquhart et al., 2013; Marquez and Coimbra, 2013; Yang et al., 2014).
Some of these sky imaging methods require the camera to be geometrically calibrated; i.e., each pixel must be associated with a corresponding view direction. Together with cloud height estimates, the view direction allows geolocation of clouds and their shadow projections such that their position is known relative to solar power plants. Geometric calibration is a common task in photogrammetry and computer vision, and calibration methods have been developed for a variety of applications. Some methods for calibrating a stationary camera require the use of calibration equipment or setups (Tsai, 1987; Weng et al., 1992; Heikkilä and Silvén, 1996; Shah and Aggarwal, 1996) or planar targets (Wei and Ma, 1993; Sturm and Maybank, 1999; Zhang, 2000). Geometric scene information can be used to calibrate the camera's internal parameters (Liebowitz and Zisserman, 1998) or estimate lens distortion (Brown, 1971; Devernay and Faugeras, 2001; Tardif et al., 2006). Scenes with parallel or perpendicular lines or primitive shapes are not generally available to skyward-pointing cameras, and thus there are no structures from the built environment around which to base a generic and automated calibration procedure.
Cameras used for solar power forecasting often employ fisheye lenses, which require appropriate camera modeling and associated model parameter estimation methods due to the large distortion required to achieve the approximately 180° field of view. Many models which include lens distortion cannot account for distortion present in lenses which have a field of view equal to or exceeding 180°, because they rely on converting "distorted" image coordinates (which are finite measurements on the image plane) to "undistorted" image coordinates, which become infinite at angles 90° from the optical axis (e.g., Tsai, 1987). Gennery (2006) and Kannala and Brandt (2006) propose generic camera models suitable for fisheye lenses, and the form of the camera model presented here has features of both. The goal of the current work is to develop (1) a general camera model for a fixed focal length wide-angle dioptric sky camera with a photo objective lens and (2) a calibration method that can be automated with little user input.
Providing detailed and quantitative yet widely applicable specifications for geometric camera calibration in solar energy forecasting applications is difficult. The impact of geometric calibration errors depends on cloud size, cloud speed, forecast averaging interval, the geometry of cloud and camera relative to the plant, etc. For an illustration of the geometrical relationships and the sensitivity to cloud height errors, see Nguyen and Kleissl (2014).
The calibration approach taken here is sometimes referred to as stellar calibration, where the 3-D position of an object or set of objects is treated as known. In particular, the sun position in the sky is treated as a known input, which is used along with the corresponding measured sun position in an image to calibrate a stationary camera of fixed focal length. Sun position has been used previously for camera calibration. Lalonde et al. (2010) used manual image annotation to select the sun position in a few images, and with this estimated the focal length, principal point, and two of the three rotational degrees of freedom (the camera horizontal axis was assumed parallel to the ground). The work presented here builds on this idea and extends it using a more generalized camera model and automated sun detection. The camera model here allows any pose, non-square pixels, and both radially symmetric and decentering distortion components.
The layout of this paper is as follows. Section 2 discusses the forward and backward camera model. Section 3 discusses the imaging equipment and solar position input used for the calibration process. Section 4 provides details of the calibration procedure: initialization, linear estimation, and nonlinear estimation. Section 5 provides results for both measured solar position input and synthetic data. Synthetic data are used to assess the uncertainty in calibration performance and parameter estimation as a function of measurement uncertainty.

Camera model
The forward camera model projects points from a 3-D scene onto the image plane. The backward camera model, described in Sect. 2.2, projects points on the image plane to rays in 3-space. Both models are developed assuming that the camera-lens system is central; i.e., all refracted rays within the lens pass through a single point. This, while not physically accurate, yields a close approximation (Ramalingam et al., 2005).

Projective transformation camera model
The standard model for a camera without distortion is a 3-D to 2-D projective transformation, mapping points X = (X, Y, Z, T)^⊤ in P³ to x = (x, y, w)^⊤ in P²:

$$\mathbf{x} = P\,\mathbf{X}, \qquad (1)$$

where P is a 3×4 perspective projection transformation with 11 degrees of freedom (it is defined up to scale), and Pⁿ is the nth dimension of projective space. The points X ∈ P³ and x ∈ P² are homogeneous quantities and thus are defined only up to scale. The corresponding inhomogeneous points in Euclidean space are X̃ = (X/T, Y/T, Z/T)^⊤, X̃ ∈ R³, and x̃ = (x/w, y/w)^⊤, x̃ ∈ R². The tilde overbar indicates inhomogeneous coordinates throughout this work. When the scale factor T or w is zero, the corresponding Euclidean point is infinite. For points not lying on the plane or line at infinity, we can write X = (X̃^⊤, 1)^⊤ and x = (x̃^⊤, 1)^⊤, respectively. The point imaging transformation P is given by a composition of Euclidean, affine and perspective transformations:

$$P = K\,[\,I\,|\,\mathbf{0}\,]\begin{bmatrix} R & \mathbf{t} \\ \mathbf{0}^{\top} & 1 \end{bmatrix}, \qquad \begin{bmatrix} R & \mathbf{t} \\ \mathbf{0}^{\top} & 1 \end{bmatrix} = \begin{bmatrix} r_{11} & r_{12} & r_{13} & t_x \\ r_{21} & r_{22} & r_{23} & t_y \\ r_{31} & r_{32} & r_{33} & t_z \\ 0 & 0 & 0 & 1 \end{bmatrix},$$

where the affine transformation K is known as the camera calibration matrix (parameters defined later), the perspective transformation [I|0] projects 3-D space points to 2-D image points, and the Euclidean transformation gives the rotation and displacement of the camera center relative to the world coordinate system. In the Euclidean transformation, written above in block matrix notation, the upper left block R with components r_ij is a rotation from world coordinates into the camera coordinate system, and the upper right block t with components t_i is a translation giving the displacement from the origin of the camera coordinate system (i.e., the camera center) to the origin of the world coordinate system. The rotation matrix has only three degrees of freedom and can be represented by the angle-axis three-vector w, where R = expm([w]×); expm is the matrix exponential and the notation [ ]× indicates the 3×3 skew-symmetric matrix corresponding to the vector argument. The three rotation plus three translation parameters are known as the camera's extrinsic parameters. In inhomogeneous coordinates, the rigid body (Euclidean) transformation from world coordinates X̃ to camera coordinates X̃_cam is

$$\tilde{\mathbf{X}}_{\mathrm{cam}} = R\,\tilde{\mathbf{X}} + \mathbf{t},$$

or equivalently, using homogeneous coordinates,

$$\mathbf{X}_{\mathrm{cam}} = \begin{bmatrix} R & \mathbf{t} \\ \mathbf{0}^{\top} & 1 \end{bmatrix}\mathbf{X}.$$

The calibrated points x̂ = (x̂, ŷ, ŵ)^⊤ on the image plane, given by x̂ = [I|0] X_cam, can be converted to pixel coordinates x = (x, y, w)^⊤ by the affine transformation K:

$$K = \begin{bmatrix} \alpha_x & s & x_o \\ 0 & \alpha_y & y_o \\ 0 & 0 & 1 \end{bmatrix},$$

where α_x and α_y are the effective focal lengths (in pixels) in the x and y directions, respectively, (x_o, y_o)^⊤ = x_o is the principal point (i.e., the point of intersection of the optical axis with the image plane), and s is the skewness of the pixel coordinate axes. The five parameters in matrix K are known as the camera's intrinsic parameters. The effective focal lengths α_x = f k_x and α_y = f k_y csc ψ account for the actual focal length f (in meters) and the potential for pixel sizes 1/k_x and 1/k_y (in meters per pixel) to vary in the x and y directions, respectively. The angle ψ is the angle between the x and y axes, which is close to π/2 for our camera; thus α_y ≈ f k_y. The skewness s = α_x cot ψ is the degree to which the rows and columns of the image sensor are not orthogonal.
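As an illustration, the composition of the Euclidean, perspective, and affine transformations can be sketched in a few lines of Python. The intrinsic values below are placeholders chosen for illustration, not calibrated results:

```python
import numpy as np

# Hypothetical intrinsic values, for illustration only.
alpha_x, alpha_y, s = 800.0, 800.0, 0.0   # effective focal lengths (px), skew
x_o, y_o = 760.0, 760.0                   # principal point (px)

K = np.array([[alpha_x, s,       x_o],
              [0.0,     alpha_y, y_o],
              [0.0,     0.0,     1.0]])

R = np.eye(3)        # camera aligned with the world axes
t = np.zeros(3)      # camera center at the world origin

# Homogeneous world point X = (X, Y, Z, T)
X = np.array([1.0, 2.0, 10.0, 1.0])

# Euclidean transformation followed by perspective projection is [R|t];
# the affine transformation K then yields homogeneous pixel coordinates.
Rt = np.hstack([R, t[:, None]])           # 3x4
x = K @ Rt @ X                            # homogeneous image point (x, y, w)
x_tilde = x[:2] / x[2]                    # inhomogeneous pixel coordinates
```

With the identity pose, the point (1, 2, 10) lands at (α_x·X/Z + x_o, α_y·Y/Z + y_o) = (840, 920) pixels.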
In summary, the model of a camera given by Eq. (1) contains six extrinsic (external) and five intrinsic (internal) camera parameters and thus has 11 degrees of freedom. While the perspective projection camera model has been widely used, it does not account for lens distortion and assumes that the camera is a central projection camera. Since we seek to develop a model for use with a fisheye lens exhibiting a significant amount of distortion, the above model must be modified appropriately.

Distortion model
An equivalence class X_cam ∈ P³ in projective space (i.e., a point or vector in P³) defines a ray Ω = (θ, φ) in Euclidean space (R³), where θ is the angle between the ray and the optical axis, and φ is the angle from the positive X_cam axis to the projection of the ray onto the X_cam-Y_cam plane. The angle φ is positive in the counterclockwise direction. The incoming ray is mapped onto the image plane by a mapping D as

$$(\hat{\tilde{x}}, \hat{\tilde{y}})^{\top} = D(\theta, \phi),$$

where x̂̃, ŷ̃ are calibrated inhomogeneous coordinates in the image plane (the hat ∧ denotes a calibrated point, and the tilde ∼ denotes an inhomogeneous coordinate, defined previously). The mapping D is, in general, nonlinear and includes the distortion produced by the lens-camera system. Here we model D following Brown (1971) as

$$\hat{\tilde{x}} = r(\theta)\cos\phi + \delta_{cx}, \qquad \hat{\tilde{y}} = r(\theta)\sin\phi + \delta_{cy},$$

where r(θ) is the normalized radius on the image plane, and δ_cx and δ_cy account for decentering distortion in the x and y directions, respectively. The normalized radial distance r is obtained by dividing the actual radial distance in the image plane by the focal length f. These terms will be further discussed in the following subsections.

Radially symmetric distortion
The most common form of distortion in dioptric imaging systems with a photo objective lens is radially symmetric distortion. Several adjustments to the perspective projection model to account for radially symmetric distortion have been proposed for small field of view lenses exhibiting moderate amounts of pincushion or barrel distortion (e.g., Slama et al., 1980). In order to generate a one-to-one mapping of hemispherical radiance (180° field of view) to the image plane, fisheye lenses must introduce extreme radial distortion. For a centered lens system, δ_cx and δ_cy can be taken as zero and r(θ) can be set to one of the following projection functions (Miyamoto, 1964):

$$r(\theta) = \tan\theta \quad \text{(perspective)}, \qquad (8a)$$
$$r(\theta) = 2\tan(\theta/2) \quad \text{(stereographic)}, \qquad (8b)$$
$$r(\theta) = 2\sin(\theta/2) \quad \text{(equisolid angle)}, \qquad (8c)$$
$$r(\theta) = \theta \quad \text{(equidistant)}, \qquad (8d)$$
$$r(\theta) = \sin\theta \quad \text{(orthographic)}. \qquad (8e)$$

Equations (8b) to (8e) correspond to fisheye lens projections. Equation (8a) is the undistorted perspective projection (i.e., the same projection model as Eq. 1), but it can still be used in the camera model and calibration as described here.
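The classical projection functions of Miyamoto (1964) can be written as one-line functions of the incidence angle θ. This is a sketch for intuition; the actual radius on the sensor is the normalized value scaled by the effective focal length:

```python
import numpy as np

# Normalized radius r(theta) for the classical projection functions
# (Miyamoto, 1964). The perspective projection diverges as theta -> pi/2
# and so cannot cover a 180 deg field of view.
projections = {
    "perspective":   lambda th: np.tan(th),
    "stereographic": lambda th: 2.0 * np.tan(th / 2.0),
    "equisolid":     lambda th: 2.0 * np.sin(th / 2.0),
    "equidistant":   lambda th: th,
    "orthographic":  lambda th: np.sin(th),
}

theta = np.pi / 2  # ray at the edge of a 180 deg field of view
r_edge = {name: g(theta) for name, g in projections.items()}
# For the equisolid projection, r(pi/2) = 2 sin(pi/4) = sqrt(2), which is
# why the initialization described later uses alpha = r_img / sqrt(2).
```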
Fisheye lens designers generally strive to meet one of the above projections, but due to manufacturing and assembly tolerances, the standard projections (Eqs. 8b-e) only approximate a particular lens-camera system. In order to model wide-angle and fisheye lenses more accurately, a number of models have been proposed (e.g., Kannala et al., 2006; Shah and Aggarwal, 1996). Here, instead of modeling the radially symmetric distortion using a polynomial in θ (e.g., Kannala et al., 2006), we follow a suggestion by Gennery (2006) and use one of the standard models g_o(θ) in Eq. (8), and then fit a polynomial to the residual radial distortion as

$$r(\theta) = g_o(\theta) + \sum_{n=2}^{N} k_n \theta^n, \qquad (9)$$

where r(θ) is the normalized radius on the image plane, and the polynomial in k_n models deviations from g_o(θ). In this work, N was set to nine. The choice N = 9 was made based on a review of the residual radial distortion and was appropriate for the lens examined. A lower order polynomial should be considered if the calibration input data do not provide full incidence angle coverage (i.e., if θ does not span (0, π/2)).
For more stable curve fitting results, θ should be appropriately normalized.In Sect. 4 the coefficients k n are denoted as a vector k, where k ∈ R 8 .

Decentering distortion
In addition to radially symmetric distortion, lenses exhibit tangential distortion. This deviation from the radial alignment constraint (Tsai, 1987) causes the measured azimuth of a point ϕ to differ from its true azimuth φ. Tangential distortion is due in part to a decentering of lens elements (Conrady, 1919; Brown, 1966). Based on the paraxial optics assumption, Conrady (1919) developed radial δ^C_cr and tangential δ^C_ct distortion terms arising from decentering for a point located at (r, χ) on the image plane; to first order in the decentering magnitude ε they take the form

$$\delta^{C}_{cr} = 3p_1 r^2 \sin(\chi - \chi_1), \qquad \delta^{C}_{ct} = p_1 r^2 \cos(\chi - \chi_1),$$

where p_1 and p_2 are constants that determine the magnitude of each centering defect, r is the radius in the image plane taken from the principal point, and χ_1 and χ_2 are reference axes for the distortion effects. The constants are proportional to the lens decentering magnitude as p_1 ∝ ε and p_2 ∝ ε². Conrady (1919) did not develop terms of higher than first order in ε; i.e., only terms containing p_1 were developed. The terms containing p_2 (briefly investigated for this work) are the only higher order terms that are not constant or symmetric over the image.
The Brown-Conrady distortion model (Brown, 1971) formulates the radial δ_cr and tangential δ_ct decentering distortion components with the reference axis φ_o taken to be that of maximum tangential distortion, with φ positive counterclockwise from the x axis (χ is positive clockwise):

$$\delta_{cr} = 3P(r)\sin(\phi - \phi_o), \qquad (10a)$$
$$\delta_{ct} = P(r)\cos(\phi - \phi_o), \qquad (10b)$$

where the terms containing p_2 have been neglected, and the profile function P = p_1 r². Because Conrady did not develop terms in ε of higher order than one, Brown speculated that P could be extended as a polynomial in even powers of r (written here as a normalized radial distance):

$$P(r) = \sum_{n=1}^{\infty} p_n r^{2n}.$$

Expanding the aberrations due to decentering as developed by Conrady (1919) as in Eq. (10), one finds that the second and third order terms in ε produce only lower order terms in r (i.e., r¹ and r⁰). The zeroth order term in ε (which is thus present in centered lens systems) produces a shift in the image proportional to r³ and is commonly known as pincushion or barrel distortion. Because decentering effects in most lenses are small (p_1 r² ∼ 10⁻⁴ pixels for our lens), it is reasonable to retain only first order terms in ε and to neglect p_2.
The use of the Brown-Conrady decentering distortion model for a fisheye lens should only be considered an expedient for model fitting, and not a physical description of optical distortion. Conrady derived the decentering formulae following a paraxial method he devised to analytically obtain the five classical Seidel aberrations (Conrady, 1918). Equations (10a) and (10b) are therefore only valid under the small angle approximation sin θ ≈ θ, and are thus not valid for the large incidence angles in a fisheye lens. Additionally, there is no physical justification for Brown's extrapolation of P as an even ordered polynomial in r (recall that Conrady's original model had no terms of higher order than r³). The retention of the Brown-Conrady decentering distortion model in this work is for model fitting only.
The radial and tangential decentering distortion can be converted to the corresponding Cartesian components as

$$\delta_{cx} = \delta_{cr}\cos\phi - \delta_{ct}\sin\phi, \qquad \delta_{cy} = \delta_{cr}\sin\phi + \delta_{ct}\cos\phi,$$

which upon expanding gives

$$\delta_{cx} = P(r)\left[2\sin\phi\cos\phi\cos\phi_o - (2\cos^2\phi + 1)\sin\phi_o\right],$$
$$\delta_{cy} = P(r)\left[(2\sin^2\phi + 1)\cos\phi_o - 2\sin\phi\cos\phi\sin\phi_o\right].$$

Following Brown, by absorbing the reference angle φ_o into new coefficients proportional to −sin φ_o and cos φ_o, it can be shown that

$$\delta_{cx} = \left[p_1\left(r^2 + 2\hat{\tilde{x}}^2\right) + 2p_2\,\hat{\tilde{x}}\hat{\tilde{y}}\right]\left(1 + p_3 r^2 + p_4 r^4 + \cdots\right),$$
$$\delta_{cy} = \left[2p_1\,\hat{\tilde{x}}\hat{\tilde{y}} + p_2\left(r^2 + 2\hat{\tilde{y}}^2\right)\right]\left(1 + p_3 r^2 + p_4 r^4 + \cdots\right).$$

Here only p_1 through p_4 are used. In Sect. 4, the coefficients p_n are denoted as a vector p, where p ∈ R⁴.
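The expanded Cartesian form of the decentering distortion can be sketched as a small function. The coefficient values in the example are illustrative only, not calibrated values:

```python
import numpy as np

def decentering(x, y, p):
    """Brown-Conrady decentering distortion in Cartesian form.

    x, y : calibrated (normalized) image coordinates
    p    : (p1, p2, p3, p4); the example values below are illustrative.
    """
    p1, p2, p3, p4 = p
    r2 = x * x + y * y
    scale = 1.0 + p3 * r2 + p4 * r2 * r2   # even-order profile extension
    dx = (p1 * (r2 + 2 * x * x) + 2 * p2 * x * y) * scale
    dy = (2 * p1 * x * y + p2 * (r2 + 2 * y * y)) * scale
    return dx, dy

# On the x axis (y = 0, r = 1) only the p1 term contributes to dx:
dx, dy = decentering(1.0, 0.0, (1e-4, 0.0, 0.0, 0.0))
```

For a point on the x axis at unit normalized radius, dx = 3·p1 and dy = 0, which matches the magnitude estimate p1·r² ∼ 10⁻⁴ quoted in the text.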

Forward camera model overview
Summarizing the results of this section, the forward projection of a 3-D space point to 2-D pixel coordinates consists of the following four steps:

1. Euclidean transformation of the world point into camera coordinates (Sect. 2.1);

2. conversion of the camera coordinates to a ray direction (θ, φ) (Sect. 2.2);

3. application of the distortion mapping D to obtain calibrated image coordinates (Sect. 2.2);

4. affine transformation by the camera calibration matrix K to obtain pixel coordinates (Sect. 2.1).

The full camera model is represented by a nonlinear vector-valued function f:

$$\mathbf{x} = f(\mathbf{X}, \beta), \qquad (13)$$

where β is the vector of model parameters.
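The four steps above can be sketched as a single function, assuming an equisolid nominal projection and placeholder parameters. The helper names are hypothetical, not taken from the calibration software:

```python
import numpy as np

def forward_project(X_world, K, R, t, g_o, k=(), p=(0.0, 0.0, 0.0, 0.0)):
    """Sketch of the four-step forward model: Euclidean transformation,
    ray angles, distortion mapping D, and affine conversion to pixels.
    k holds residual radial coefficients k_n (n = 2..N); p holds the
    decentering coefficients. All values here are placeholders."""
    Xc = R @ X_world + t                            # 1. world -> camera
    theta = np.arccos(Xc[2] / np.linalg.norm(Xc))   # 2. incidence angle
    phi = np.arctan2(Xc[1], Xc[0])                  #    and azimuth
    r = g_o(theta) + sum(kn * theta**n for n, kn in enumerate(k, start=2))
    xh, yh = r * np.cos(phi), r * np.sin(phi)       # 3. radial part of D
    p1, p2, p3, p4 = p
    r2 = xh * xh + yh * yh
    scale = 1.0 + p3 * r2 + p4 * r2 * r2
    dx = (p1 * (r2 + 2 * xh * xh) + 2 * p2 * xh * yh) * scale
    dy = (2 * p1 * xh * yh + p2 * (r2 + 2 * yh * yh)) * scale
    xh, yh = xh + dx, yh + dy                       #    plus decentering
    x = K @ np.array([xh, yh, 1.0])                 # 4. affine conversion
    return x[:2] / x[2]

# Example: identity pose, equisolid nominal projection, alpha = 100 px.
K = np.diag([100.0, 100.0, 1.0])
g_eq = lambda th: 2.0 * np.sin(th / 2.0)
px = forward_project(np.array([1.0, 0.0, 1.0]), K, np.eye(3), np.zeros(3), g_eq)
```

A ray at θ = 45° maps to a radius of 100·2 sin(22.5°) ≈ 76.5 pixels from the principal point.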

Backward projection
In many cases, one is given points x in image coordinates, and what is needed is the back projection of those points into world coordinates. This is true for the application of solar forecasting, where many quantities derived from images are assigned a space angle according to their image coordinates. For example, Chow et al. (2011) back project cloud positions detected within an image to a 3-D world plane to generate a mapping of the clouds, and subsequently use this cloud map to ray trace cloud shadows. Note that obtaining the distance from the camera to an object in the scene is not possible from a single image because of the projective nature of the imaging process.
The inversion of the mapping D (Eqs. 6 and 7) is the most difficult part of developing a back projection model from a forward projection model, and Kannala et al. (2006) suggest a function inversion approach to this end. An alternative is to formulate a separate back projection model and fit it using synthetic data generated from the forward projection.
After converting to calibrated inhomogeneous image coordinates using (x̂̃, ŷ̃, 1)^⊤ = K⁻¹(x, y, 1)^⊤, the decentering distortion is formulated as a function of the polar coordinates (r′, ϕ) in the image plane, where r′ = (x̂̃² + ŷ̃²)^{1/2} and ϕ is the measured azimuth. The residual radially symmetric distortion polynomial (Eq. 9) is reformulated as a function of r:

$$\theta(r) = g_o^{-1}(r) + \sum_{n=2}^{N} k'_n r^n, \qquad (17)$$

where r, equivalent to its definition in the forward projection, is the radial coordinate after adjustment for decentering, and N for the back projection is set to nine. An image point can then be back-projected using the ray angles θ(r) and φ, where the inversion of g_o for any of the options listed in Eq. (8) is straightforward. The ray can be parameterized in the world reference frame as

$$\tilde{\mathbf{X}}(\lambda) = \tilde{\mathbf{t}} + \lambda R^{\top}\left(\sin\theta\cos\phi,\ \sin\theta\sin\phi,\ \cos\theta\right)^{\top}, \qquad (19)$$

where λ is a scalar. For sky imaging, the camera center is often considered the origin of the world coordinate system, and thus t = 0.
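A minimal back projection sketch follows, with the decentering and residual radial terms omitted for brevity. The helper `equisolid_inv` is an assumption for the equisolid case (θ = 2 arcsin(r/2)), not code from the paper:

```python
import numpy as np

def back_project(x_pix, K, R, g_o_inv, t=np.zeros(3)):
    """Back-project a pixel to a unit ray in world coordinates.
    Decentering and residual radial distortion are omitted here;
    g_o_inv is the inverse of the nominal projection function."""
    xh = np.linalg.inv(K) @ np.array([x_pix[0], x_pix[1], 1.0])
    r = np.hypot(xh[0], xh[1])            # normalized radius
    phi = np.arctan2(xh[1], xh[0])        # azimuth in the image plane
    theta = g_o_inv(r)                    # incidence angle
    ray_cam = np.array([np.sin(theta) * np.cos(phi),
                        np.sin(theta) * np.sin(phi),
                        np.cos(theta)])
    return R.T @ ray_cam + t              # lambda = 1; t = 0 for a sky imager

equisolid_inv = lambda r: 2.0 * np.arcsin(r / 2.0)

# Round trip: a pixel at radius 100 * 2 sin(pi/8) should map back to theta = 45 deg.
K = np.diag([100.0, 100.0, 1.0])
ray = back_project((100.0 * 2.0 * np.sin(np.pi / 8), 0.0), K, np.eye(3), equisolid_inv)
```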
3 Solar position input from sky imager data

Imaging equipment and setup
The University of California, San Diego (UCSD) sky imager (USI) camera system was developed for the purpose of solar power forecasting (Urquhart et al., 2013). The camera calibrated here was deployed at the Southern Great Plains (SGP) site at an altitude of 318 m. The horizon around the SGP site is free of mountainous terrain; thus the USI has a nearly 180° field of view of the sky. The camera nominally points straight up but has a slight angular offset due to the ground not being perfectly level. No leveling of the equipment was performed. Figure 1 shows the USI on its portable mounting stand.

Solar position modeling
The inputs used to calibrate the camera model (i.e., fitting the camera model parameters) are the angular position of the sun Ω_s = (θ_s, φ_s) and the corresponding position of the sun in a sky image x̃_s = (x_s, y_s)^⊤. The angular solar position Ω_s is estimated using the NREL solar position algorithm (Reda and Andreas, 2004), which adopts the procedure from Meeus (1998). The NREL algorithm takes observer position (latitude, longitude, altitude) and time as inputs, and outputs the topocentric solar zenith angle θ_s and the topocentric solar azimuth angle φ_s. The vector X_s is computed from Ω_s using Eq. (19) with λ = 1, t = 0, and R the 3×3 identity matrix. Due to the standard coordinate convention for φ_s, cos(φ) and sin(φ) in Eq. (19) must be interchanged to compute X_s. The refractive index of air is a function of its density (hence a function of temperature and pressure) along the optical path, and because the atmospheric density gradient is predominantly vertical, the apparent solar zenith angle must be corrected accordingly (Brown, 1964). A correction using annual averages of surface air pressure and temperature is included in the NREL algorithm, and the default value for the refraction magnitude at sunrise/sunset is used. Using the nominal pressure and temperature values is a suitable choice since, for example, at θ_s = 90° a change from 0.0057 to 0.0065 K m⁻¹ in lapse rate introduces a difference in refraction of only Δθ = 0.007° (Hohenkerk and Sinclair, 1985).
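For intuition only, the zenith/azimuth geometry can be sketched with textbook spherical trigonometry. This is not the NREL SPA used in the paper: it takes declination and hour angle as given and includes no ephemeris or refraction correction:

```python
import math

def solar_angles(lat_deg, decl_deg, hour_angle_deg):
    """Crude spherical-trigonometry solar position (not the NREL SPA).
    Inputs: observer latitude, solar declination, and hour angle (degrees).
    Returns (zenith, azimuth) in degrees, azimuth measured from north."""
    lat, decl, H = (math.radians(v) for v in (lat_deg, decl_deg, hour_angle_deg))
    cos_zen = (math.sin(lat) * math.sin(decl)
               + math.cos(lat) * math.cos(decl) * math.cos(H))
    zen = math.acos(cos_zen)
    cos_az = (math.sin(decl) - math.sin(lat) * cos_zen) / (math.cos(lat) * math.sin(zen))
    az = math.degrees(math.acos(max(-1.0, min(1.0, cos_az))))
    if hour_angle_deg > 0:        # afternoon: sun west of the meridian
        az = 360.0 - az
    return math.degrees(zen), az
```

At the equinox (declination 0°) at solar noon, an observer at 45° N sees the sun at a zenith angle of 45°, due south (azimuth 180°).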
(Considering this, it should be noted that the ±0.0003° uncertainty on the solar zenith angle θ_s reported by Reda and Andreas is quite generous in some cases.) A one second error in the image capture time results in an error of ±0.004° in solar hour angle. In comparison, for our lens a one pixel uncertainty in sun position measurements corresponds to approximately Δθ = 0.14° at the horizon. For the "full" calibration data set (case 3, below), 82 % of measurements were within the one pixel measurement uncertainty bounds. The sun detection process, therefore, introduces significantly more error than the solar position model and time recording errors.

Image plane calibration input
Measurement data consist of automated detections of the sun's position x̃_s in an image using a set of methods described in Appendix A. The present method assumes that only pixels in the solar disk and surrounding halo of scattered sunlight saturate, and that the saturation region is nearly circular, with symmetry about the radial line from the principal point through the sun's center. In practice, thin clouds could cause an expansion of the saturation region, especially for cameras with low dynamic range, and this would reduce the accuracy of the sun position detection. In an operational setting, an unobstructed image of the sun is less likely to occur at large zenith angles because of the longer optical path through the troposphere (Warren et al., 1986). A cloud detection process (e.g., Ghonima et al., 2012) can be used to discard images with significant cloud cover.
The detection process for a single image i results in a set {x_s}_i and a set {y_s}_i of potential sun coordinates, from each of which the median is taken as the final sun coordinate x_s,i = (x_s,i, y_s,i, 1)^⊤ to be used for calibration. Here, i is the image index and {·}_i represents the set of measurements for image i. The detection methods leverage the fact that the sun is the brightest object in a daytime sky image. For the days chosen, the sun could be seen at solar zenith angles near 90°, indicating that the horizon is at a similar altitude to the instrument. The sun could be detected reliably for images with θ_s < 89.6°. The sun detection algorithm described here was tested on predominantly clear days, which simplifies detection because clouds cause occlusion of the sun or saturation of cloudy pixels near the sun.
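A simplified stand-in for the saturation-based detection (not the full method of Appendix A) can be demonstrated on a synthetic image:

```python
import numpy as np

# Synthetic 200x200 8-bit sky image with a saturated solar disk at (120, 80).
img = np.full((200, 200), 60, dtype=np.uint8)
yy, xx = np.mgrid[0:200, 0:200]
img[(xx - 120) ** 2 + (yy - 80) ** 2 <= 15 ** 2] = 255

# Take the median of the saturated pixel coordinates as the sun center.
# For a symmetric saturation region the median recovers the disk center
# and is robust to small asymmetric contamination.
ys, xs = np.nonzero(img == 255)
sun_x, sun_y = np.median(xs), np.median(ys)
```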
The sun position is detected in a series of images collected from sunrise to sunset, yielding over 1400 calibration points per day. The set of points collected throughout a single (clear) day nominally forms a smooth arc. To evaluate the camera model and calibration performance under different solar arc input possibilities, five input cases were tested: (1) a single solar arc, (2) two solar arcs on consecutive days, (3) four solar arcs, (4) 10 solar arcs with measurement noise due to occasional clouds, and (5) a single solar arc with noise due to clouds (Table 1). The solar arcs for cases 1, 4 and 5 are shown in Fig. 2. Case 1 would be preferred in practice as it requires only one (admittedly perfectly clear) day of data. However, limitations in sun position availability during one day may not provide sufficient data to accurately fit the camera model. The improvement associated with adding more days is evaluated in cases 2 and 3. Cases 4 and 5 were designed to provide more realistic and noisy data that would be found in climates without completely clear days.

Table 1. Calibration test cases. The days included in each test case are given along with the number of sun position points and an estimate of the measurement standard deviation SD_m, which represents the extent to which the data deviate from a smooth arc. (Columns: case, day(s), number of points, SD_m in pixels.) See also Fig. 2 and Eq. (20).

The sequence of sun position detections forms an arc that should be a smooth curve. The detection process, however, is associated with errors, especially when clouds are present. The deviation of the measured data from a smooth arc can be used to quantify the calibration input error. Separately for each day, a 9th order polynomial is fit to the x and y pixel coordinates as a function of the solar hour angle H (obtained from the NREL solar position algorithm):

$$x_f(H) = \sum_{n=0}^{9} a_n H^n, \qquad y_f(H) = \sum_{n=0}^{9} b_n H^n, \qquad (20)$$

where (x_f, y_f) is the fitted pixel coordinate and a_n and b_n are the polynomial coefficients. A separate polynomial must be computed for each day. The standard deviation of the pixel-by-pixel distance between the measurements and the polynomial fit (Eq. 27c) is given in Table 1 as SD_m and is a useful estimate of the sun position measurement error.
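The SD_m estimate can be sketched as follows. The arc shape and the 0.5 px noise level are synthetic and illustrative only:

```python
import numpy as np

def arc_deviation_sd(H, x, y, deg=9):
    """Estimate the detection noise SD_m as the standard deviation of the
    distance between measured points and per-coordinate polynomial fits
    in hour angle H (degrees 0..deg, one pair of fits per day)."""
    Hn = (H - np.mean(H)) / np.ptp(H)      # normalize for stable fitting
    x_f = np.polyval(np.polyfit(Hn, x, deg), Hn)
    y_f = np.polyval(np.polyfit(Hn, y, deg), Hn)
    return np.std(np.hypot(x - x_f, y - y_f))

# Smooth synthetic arc plus 0.5 px Gaussian noise per coordinate; the
# resulting SD of the point-to-fit distance is on the order of the noise.
rng = np.random.default_rng(1)
H = np.linspace(-90.0, 90.0, 800)
x = 760.0 + 600.0 * np.sin(np.radians(H)) + rng.normal(0.0, 0.5, H.size)
y = 760.0 - 500.0 * np.cos(np.radians(H)) + rng.normal(0.0, 0.5, H.size)
sd_m = arc_deviation_sd(H, x, y)
```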
4 Calibration procedure

The calibration procedure is a three-step process: (1) generate a rough estimate of the intrinsic parameters; (2) estimate the camera pose (rotation and translation), assuming one of the projections in Eq. (8); and (3) perform a three-stage nonlinear parameter estimation using the Levenberg-Marquardt (LM) algorithm to obtain the final intrinsic and extrinsic parameters. Steps 1 and 2 are described in Sect. 4.1 and step 3 is discussed in Sect. 4.2. Calibration results are given in Sect. 5.

Model initialization
In order to apply the LM algorithm to estimate the model parameters, the parameter vector β = (α_x, α_y, s, x_o, w, t, k, p) must be initialized with a reasonably close estimate (see Sect. 2 for the definitions of the components of β). The simple estimate of the intrinsic parameters (α_x, α_y, s, x_o) given in Sect. 4.1.1 needs to be performed only once, unless the camera or lens is modified. Pose estimation (Sect. 4.1.2) for the extrinsic parameters (w, t) needs to be performed any time the camera is moved. The distortion parameters (k, p) can be initialized to zero vectors.

Intrinsic parameter estimation
In whole sky imagery, the entire sky hemisphere is visible and forms an ellipse on the image plane with an axis ratio near unity, i.e., nearly circular (e.g., Figs. 2 or 3). This enables a simple automated estimation approach. A Hough circle transform is used to obtain the approximate center x_img and radius r_img of this near circular ellipse. The principal point x_o is initialized to x_img. The x and y focal lengths are assumed to be equal, i.e., α_x = α_y = α, and are determined using an unnormalized version of Eq. (9) in which r = α g_o(θ):

$$\alpha = \frac{r_{\mathrm{img}}}{g_o(\theta_{\max})},$$

where the radius r_img from the Hough circle detection process corresponds to the maximum field of view. For the USI, θ_max is taken to be π/2 and g_o is given by Eq. (8c); thus α = r_img/√2. Initially it is assumed that s is zero; i.e., the camera pixel axes are orthogonal. The initial estimate of the camera calibration matrix is then

$$K_o = \begin{bmatrix} \alpha & 0 & x_{\mathrm{img}} \\ 0 & \alpha & y_{\mathrm{img}} \\ 0 & 0 & 1 \end{bmatrix}.$$
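The initialization can be sketched directly from the Hough circle output, assuming the equisolid projection as in the text. The numeric example values are hypothetical:

```python
import numpy as np

def initial_K(x_img, r_img, theta_max=np.pi / 2):
    """Initial camera calibration matrix from a Hough-circle estimate of
    the sky disk center x_img = (x, y) and radius r_img (pixels).
    Assumes square pixels (alpha_x = alpha_y), zero skew, and an
    equisolid nominal projection."""
    g_o = lambda th: 2.0 * np.sin(th / 2.0)   # equisolid projection
    alpha = r_img / g_o(theta_max)            # = r_img / sqrt(2) here
    return np.array([[alpha, 0.0,   x_img[0]],
                     [0.0,   alpha, x_img[1]],
                     [0.0,   0.0,   1.0]])

# Hypothetical detection: sky disk centered at (760, 740), radius 720 px.
K_o = initial_K((760.0, 740.0), 720.0)
```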

Pose estimation
The camera pose is estimated by computing the linear transformation between the inhomogeneous camera coordinates X̃_cam,i and the homogeneous world coordinates X_s,i (from the NREL solar position algorithm, Sect. 3.2):

$$\tilde{\mathbf{X}}_{\mathrm{cam},i} \sim P\,\mathbf{X}_{s,i}, \qquad P = [\,R\,|\,\mathbf{t}\,], \qquad (23)$$

where i is the data point (time/image) index for the set of points to be used in calibration. The camera coordinates X̃_cam,i are obtained by first computing the calibrated image coordinates from the measured sun position, x̂_i = K_o⁻¹ x_s,i, then obtaining the incidence angle θ_i from r̂_i using Eq. (17) with the decentering distortion set to zero, and finally projecting onto the unit sphere:

$$\tilde{\mathbf{X}}_{\mathrm{cam},i} = \left(\sin\theta_i\cos\phi_i,\ \sin\theta_i\sin\phi_i,\ \cos\theta_i\right)^{\top}.$$

The calibrated perspective projection matrix P = [R|t] in Eq. (23) can be obtained using the direct linear transform (DLT) algorithm. Premultiplying Eq. (23) by the left null space of X̃_cam,i gives

$$\left[\tilde{\mathbf{X}}_{\mathrm{cam},i}\right]_{\perp}^{\top} P\,\mathbf{X}_{s,i} = \mathbf{0},$$

where ⊥ denotes the null space of the argument (and thus transposing the input gives the left null space). Applying the vec operator yields

$$\left(\mathbf{X}_{s,i}^{\top} \otimes \left[\tilde{\mathbf{X}}_{\mathrm{cam},i}\right]_{\perp}^{\top}\right)\operatorname{vec}(P) = \mathbf{0},$$

where ⊗ is the Kronecker product. Our design matrix A then contains subblocks

$$A_i = \mathbf{X}_{s,i}^{\top} \otimes \left[\tilde{\mathbf{X}}_{\mathrm{cam},i}\right]_{\perp}^{\top}, \qquad A\,\mathbf{p} = \mathbf{0},$$

where p = vec(P). Due to measurement noise, A is rank 12, and for non-zero p the right hand side cannot be identically zero. A least squares solution is obtained by computing the singular value decomposition of A and taking p as the right singular vector corresponding to the smallest singular value (Hartley and Zisserman, 2003). As always with the DLT algorithm, appropriate data normalization is required (Hartley, 1997). P is obtained from p by inverting the vec operation.
Due to imperfect data, the left 3×3 subblock of P is likely not an orthogonal matrix in SO(3), as is required for rotation matrices. To obtain R and t from the DLT estimate of P, we write P = [M|v], where M is the 3×3 left subblock of P and v is the rightmost column vector. We then use the singular value decomposition to write M = UDV^⊤, where U comprises the left singular vectors and V comprises the right singular vectors of M (both U and V are orthogonal matrices), while the diagonal matrix D contains the singular values of M. An orthogonal matrix in SO(3) is obtained by taking R = μUV^⊤, where μ = sign(det(UV^⊤)). This gives the closest matrix R to M in the sense of the Frobenius norm. The translation vector t, which is nominally zero here, is given by t = −R P̃_⊥, where the tilde indicates that, after computing the null space of P, the resulting homogeneous 4-vector is converted to an inhomogeneous 3-vector before multiplication by R.
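The SVD-based orthogonalization is a few lines of NumPy:

```python
import numpy as np

def closest_rotation(M):
    """Closest matrix in SO(3) to M in the Frobenius norm:
    R = mu * U @ Vt, with mu chosen so that det(R) = +1."""
    U, _, Vt = np.linalg.svd(M)
    mu = np.sign(np.linalg.det(U @ Vt))
    return mu * U @ Vt

# Example: perturb a rotation and recover a proper orthogonal matrix.
rng = np.random.default_rng(0)
R_true = closest_rotation(rng.standard_normal((3, 3)))
M_noisy = R_true + 0.01 * rng.standard_normal((3, 3))
R_est = closest_rotation(M_noisy)
```

The recovered R_est is exactly orthogonal with determinant +1, and remains close to the underlying rotation for small perturbations of M.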

Nonlinear optimization of model parameters
Using Eq. (13) we define an error function ε_i(β) for a single measurement:

$$\epsilon_i(\beta) = \tilde{\mathbf{x}}_{s,i} - \tilde{f}\left(\mathbf{X}_{s,i}, \beta\right),$$

where f is the function that projects world coordinates X_s,i to image coordinates x_i = f(X_s,i, β) as a function of the parameters β. What we seek is a parameter vector β such that ‖ε(β)‖² is minimized, where ε = (ε_1, ..., ε_M, ε_c). The vector ε_c(β) is a penalty vector defined in the next subsection, and M is the number of measurements. The model is fit by minimizing the sum of squared distances between the measured and modeled inhomogeneous pixel coordinates:

$$\min_{\beta}\ \sum_{i=1}^{M} d\left(\tilde{\mathbf{x}}_{s,i}, \tilde{\mathbf{x}}_i\right)^2,$$

where d is the Euclidean distance function. In the case of the synthetic data of Sect. 5.3, the sum of squared distances is taken between the ground truth data with noise added, x̃_i + δ_i (representing noised measurements), and the modeled data (i.e., Σ_i d(x̃_i, x̃_i + δ_i)²). The nonlinear calibration of the forward model is accomplished by using the LM algorithm, for which an excellent introduction is given in Hartley and Zisserman (2003).
The calibration is performed in three successive stages: (1) take k = 0 and p = 0, i.e., do not include residual radial distortion and decentering distortion; (2) include the residual radial distortion terms k, but take p = 0; and (3) include both the residual radial distortion k and the decentering distortion p. Three stages of nonlinear optimization were used because this approach was found to be more consistent across the different test cases. The multi-stage optimization process first fits the "basic" model parameters (α_x, α_y, s, x_o, w, t) and does not include corrections to the standard distortion model g_o(θ). Additional degrees of complexity are sequentially added (i.e., radial distortion in stage 2, followed by decentering distortion in stage 3). The motivation in doing so is to avoid local minima that would result in errors in the estimation of the basic parameters. Intrinsic and extrinsic parameter estimates for stage 1 initialization are given in Sects. 4.1.1 and 4.1.2, and the subsequent stages are initialized with the results of the previous stage.
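For concreteness, the core LM update (damped normal equations with a simple accept/reject adaptation of the damping factor) can be sketched on a toy problem. This is a generic sketch, not the staged camera calibration itself:

```python
import numpy as np

def levenberg_marquardt(residual, beta0, n_iter=50, lam=1e-3):
    """Minimal LM iteration with a finite-difference Jacobian."""
    beta = np.asarray(beta0, dtype=float)
    for _ in range(n_iter):
        r = residual(beta)
        J = np.empty((r.size, beta.size))
        for j in range(beta.size):              # forward-difference Jacobian
            db = np.zeros_like(beta)
            db[j] = 1e-7
            J[:, j] = (residual(beta + db) - r) / 1e-7
        A = J.T @ J + lam * np.eye(beta.size)   # damped normal equations
        step = np.linalg.solve(A, -J.T @ r)
        if np.sum(residual(beta + step) ** 2) < np.sum(r ** 2):
            beta, lam = beta + step, lam / 10.0   # accept: trust the model more
        else:
            lam *= 10.0                           # reject: damp harder
    return beta

# Toy fit: recover (a, b) in y = a * sin(b * t) from noiseless samples.
t = np.linspace(0.0, 2.0, 50)
y = 2.0 * np.sin(1.3 * t)
res = lambda beta: beta[0] * np.sin(beta[1] * t) - y
beta_fit = levenberg_marquardt(res, [1.5, 1.0])
```

In the staged calibration described above, the same update would simply be run three times with progressively larger parameter vectors β.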

Calibration constraints
It was found necessary to enforce additional constraints in the model fitting process to ensure consistent and physically meaningful results. The LM algorithm is a type of unconstrained optimization, so constraints are enforced using a penalty vector c(β) = (c_1, ..., c_n) appended to the error vector ε. The constraints are recomputed at each iteration of the LM algorithm along with the updates to β.
To ensure that the residual and nominal radially symmetric distortion are orthogonal functions over the field of view, a constraint c_1 was used, where c_1 = 0 indicates orthogonality. Without this constraint, the LM algorithm tended to decrease α_x and α_y and increase the k_n to compensate, leaving the focal lengths at values that deviated by over 10 % from the nominal lens and sensor specifications. This constraint is very important if the formulation in Eq. (9) is to be used for the radially symmetric distortion. The specific shape of the solar arc used to calibrate the camera, particularly when only a single day was used, resulted in a skewness s that was unexpectedly large: it indicated pixel axes deviating by over 5° from orthogonal. This was corrected by applying a penalty c_2 on deviations from circularity of the ellipse formed at θ_max, parameterized by varying the azimuth angle. A simple metric such as the standard deviation of the radius of this ellipse at different azimuth angles, taken from the center x_o, is effective for this purpose. A similar and simpler approach would be to place a penalty c_2 proportional to |s|; however, this was not tested in this work. The last constraint forces r(θ) to be monotonically increasing with θ (otherwise the lens mapping would not be one-to-one) by applying a penalty c_3 if dr(θ)/dθ < 0.
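Minimal sketches of two of the penalties, assuming simple numerical forms (the exact expressions and weights used in the paper are not reproduced here): c_3 penalizes a non-monotonic r(θ), and c_2 penalizes non-circularity of the θ_max ellipse.

```python
import numpy as np

def monotonicity_penalty(r_of_theta, theta_max, weight=1e3, n=200):
    """Penalty c3 (sketch): positive when r(theta) is not monotonically
    increasing on [0, theta_max]; the weight is an assumed tuning constant."""
    theta = np.linspace(0.0, theta_max, n)
    dr = np.diff(r_of_theta(theta))
    return weight * np.abs(dr[dr < 0.0]).sum()

def circularity_penalty(radius_at_thetamax, n=360):
    """Penalty c2 (sketch): standard deviation of the image radius of the
    theta_max ellipse, sampled over azimuth from the principal point x_o."""
    az = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    return float(np.std(radius_at_thetamax(az)))
```

Both return zero for an admissible model, so appending them to the residual vector leaves the unconstrained LM solution unchanged when the constraints are satisfied.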
5 Calibration results

Calibration performance metrics
The root mean square error (RMSE), mean absolute error (MAE) and standard deviation (SD) are computed as

RMSE = ( (1/M) Σ_{i=1}^{M} |dx_i|² )^{1/2},
MAE = (1/M) Σ_{i=1}^{M} |dx_i|,
SD = ( (1/M) Σ_{i=1}^{M} |dx_i − dx̄|² )^{1/2},

where the total number of measurements is M, and dx_i = x_i − x_{s,i} is the distance vector from the modeled point x_i to the measured solar position x_{s,i}. The vertical bars | | denote the 2-norm of the argument, and dx̄ is the mean distance vector over all points i. These definitions hold for the evaluation of measurement error as well, where instead dx_i = x_{s,i} − x_{f,i} (see Sect. 3.3 for a description of the polynomial fit x_{f,i}).
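The three metrics can be computed directly from the 2-D distance vectors; a sketch:

```python
import numpy as np

def calibration_metrics(x_model, x_meas):
    """RMSE, MAE and SD of the distance vectors dx_i = x_i - x_s,i,
    where |.| is the 2-norm and dx-bar is the mean distance vector."""
    dx = x_model - x_meas                            # (M, 2) distance vectors
    norms = np.linalg.norm(dx, axis=1)               # |dx_i|
    rmse = float(np.sqrt(np.mean(norms ** 2)))
    mae = float(np.mean(norms))
    # SD is taken about the mean distance vector, not about zero
    sd = float(np.sqrt(np.mean(np.linalg.norm(dx - dx.mean(axis=0), axis=1) ** 2)))
    return rmse, mae, sd
```

Note that a constant bias between model and measurement inflates RMSE and MAE but not SD, which is why both families of metrics are reported.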

Calibration using the solar position
The results of calibrating the USI for the five different solar arc cases are shown in Table 2. In the cases using more than one solar arc (cases 2–4), the principal point is consistent to within 0.60 pixels (4.4 µm). The x and y focal lengths are consistent to within 0.77 pixels (5.7 µm) for all cases and to within 0.29 pixels (2.1 µm) for cases 2–4. The camera pose is represented by three angles in Table 2: φ_xz is the angle of rotation of the camera X_cam axis from the world X axis about the world Z axis (effectively the instrument's rotation from a northern alignment); θ_zz is the angle between the camera Z_cam and world Z axes (i.e., the degree to which the system is tilted); and ψ_z is the azimuthal direction towards which the Z_cam axis is tilted (measured clockwise from the north-pointing world Y axis).
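A sketch of extracting the three pose angles from a rotation matrix, assuming the convention that the matrix columns are the camera axes expressed in world (east, north, up) coordinates; the paper's exact rotation convention may differ.

```python
import numpy as np

def pose_angles(R_wc):
    """Pose angles (degrees) from a rotation matrix whose COLUMNS are the
    camera axes in world (east, north, up) coordinates -- an assumed convention."""
    x_cam, z_cam = R_wc[:, 0], R_wc[:, 2]
    # phi_xz: rotation of X_cam from the world X axis about the world Z axis
    phi_xz = np.degrees(np.arctan2(x_cam[1], x_cam[0]))
    # theta_zz: tilt of Z_cam away from the world Z (zenith) axis
    theta_zz = np.degrees(np.arccos(np.clip(z_cam[2], -1.0, 1.0)))
    # psi_z: azimuth of the tilt direction, clockwise from north (+Y)
    psi_z = np.degrees(np.arctan2(z_cam[0], z_cam[1])) % 360.0
    return phi_xz, theta_zz, psi_z
```

As the text notes, when θ_zz is near zero the tilt direction ψ_z becomes poorly conditioned (the arctangent of two near-zero components), which explains its increased variability across cases.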
The pose determined in all cases was very consistent, with a maximum difference in θ_zz and φ_xz of 0.2°. Because the tilt angle θ_zz was very small, there was increased variability in the tilt direction ψ_z, which is to be expected. Anecdotal observations of the USI 1.8 system as deployed at the SGP site indicate that it was tilted slightly southeast and rotated by about 45° with respect to north, which is supported by the estimated pose. The performance of camera calibration using solar position is given in Table 3 along with the estimated measurement error of the sun position. Calibration error for one or more clear days was around 1 pixel (0.94–1.24 pixels). Including cloudy days increased the error to 2.9 and 6.3 pixels for the two cloudy cases tested. Including measurement data with more dispersion, as is the case with cloudy days in this study, will always increase the calibration error: for a given set of model parameters, the projection of sun position forms a smooth arc, while the measured sun positions have some dispersion around this arc. Larger dispersion yields larger calibration error values, which is why it is important to consider calibration error in the context of measurement error.
While not a true lower bound on calibration accuracy, the measurement errors given here can be used to assess the calibration accuracy relative to the estimated accuracy of the input data. The polynomial fit to the measurement data (Eq. 20) does not have the same constraints as fitting the camera model parameters to the measurement data, so the measurement standard deviation (SD_m) and measurement root mean square difference (RMSD_m) are lower than the SD and RMSE obtained for camera calibration, with (RMSE − RMSD_m)/RMSD_m lying between 1.8 and 24 %. The proportionality of calibration and measurement error indicated in Table 3, together with the consistency of the parameter estimation (Table 2), indicates that the camera model presented here reasonably approximates the imaging process for the camera and lens tested, and that the calibration procedure consistently obtains reasonable parameter estimates for the model used. Additionally, the robustness of the model and calibration to larger measurement errors and outliers is demonstrated by case 5.

Camera model parameter uncertainty
As with any image detection algorithm, there are errors in the position of the sun obtained from the detection algorithm (Tables 1 and 3). Depending on the content of each image, such as thin clouds veiling a still-visible sun, or more opaque clouds passing near or occluding the sun, the magnitude of the detection error will vary. A Monte Carlo method was used to assess the uncertainty in model performance and parameter estimation as a function of measurement error. A ground truth synthetic calibration data set was constructed with Q = 1673 data points by simulating a single solar arc on 13 May 2013 (Fig. 3). The points in the world coordinate system, X_i = (X_i, Y_i, Z_i, 1) ∈ ℙ³, i = 1...Q, were obtained by computing the solar position X_{s,i} every 30 s from sunrise to sunset with the NREL solar position algorithm (Reda and Andreas, 2004), and then projecting the solar position onto the unit sphere centered at the camera center (i.e., similar to the procedure applied in Sect. 3.2). The ground truth pixel coordinates, x_i = (x_i, y_i, 1) ∈ ℙ², i = 1...Q, were obtained by applying the forward camera model (Sect. 2.1) to the points X_i. The ground truth camera model parameters were set according to the results from solar calibration case 3. The points X_i were treated as known points in space corresponding to synthetic measurements x_i + δ_i, where δ_i is the measurement error, generated as follows. The points x_i were taken as the mean values of Q × S independent normal probability distributions P_ij(x_i, σ_j) with standard deviations σ_j, where j = 1...S.
The standard deviation σ_j was varied from 0 to 10 pixels in steps of 0.25 pixels (thus S = 41). The synthetic sun measurements (x_i + δ_i)_kj used in calibration trial kj were obtained by sampling P_ij(x_i, σ_j). N_t = 1000 trials were performed for each j, yielding N_t × S calibration trials for each set of Q points. For the kth trial at error level σ_j, a set of model parameters β_kj = (α_x, α_y, s, x_o, w, t, k, p)_kj, k = 1...N_t, j = 1...S, is obtained. The distribution of β_kj at each j (i.e., along dimension k) is a measure of the uncertainty in the model parameters at error level σ_j.
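The trial structure can be sketched as a generic sweep over noise levels; here `calibrate` stands in for the full LM fitting procedure and is supplied by the caller.

```python
import numpy as np

def monte_carlo_calibration(x_true, calibrate, sigmas, n_trials=1000, seed=0):
    """Monte Carlo uncertainty sweep: for each noise level sigma_j, draw
    n_trials independent noisy measurement sets x_i + delta_i and re-run
    `calibrate` (a user-supplied function returning a parameter vector beta)."""
    rng = np.random.default_rng(seed)
    betas = []
    for sigma in sigmas:
        trials = []
        for _ in range(n_trials):
            noisy = x_true + rng.normal(0.0, sigma, x_true.shape)
            trials.append(calibrate(noisy))
        betas.append(np.array(trials))   # shape (n_trials, len(beta))
    return betas
```

The spread of `betas[j]` along its first axis is the parameter uncertainty at error level σ_j, which is what Fig. 5 summarizes for the intrinsic parameters.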
The distribution of the true root mean square calibration error (RMSE) for σ_j ∈ [0, 10] pixels is shown in Fig. 4. The RMSE is computed as

RMSE_kj = ( (1/Q) Σ_{i=1}^{Q} d(x_i, x_{i,kj})² )^{1/2},

where x_i is the ground truth pixel position of the ith point, and x_{i,kj} is the modeled pixel position of the ith point (i.e., the projected pixel position of X_i) for calibration trial k at error level σ_j. Even for σ_j = 10 pixels, the median RMSE is below 1 pixel. For reference, the measurement error SD in Table 1 for clear days is less than 0.75 pixels, and the worst case tested here (case 5) has a measurement SD of 5.21 pixels. Based on Fig. 4, these measurement errors correspond to true calibration errors of 0.14 ± 0.03 pixels and 0.37 ± 0.11 pixels (mean ± 90 % confidence interval), respectively, which is considerably lower than the RMSE reported in the first column of Table 3. This assumes measurement errors are normally distributed. Distributions of the estimates of four of the intrinsic parameters are shown in Fig. 5. For both α_x and α_y the 90 % uncertainty bounds are nearly linear. For our camera, these approximately follow ᾱ_P90 = ᾱ ± 0.05 SD_m, which is about 0.09 % error at SD_m = 10 pixels. The overbar indicates the mean value. Similar results hold for the 90 % uncertainty bounds of (x_o, y_o), which follow x̄_o,P90 = x̄_o ± 0.25 SD_m and give 0.29 % error at SD_m = 10 pixels. The latter error percentage is computed as 0.25 SD_m / N_p × 100 %, where N_p = 874 pixels is the radius of the usable sky image circle (Fig. 3).
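A sketch of the per-trial true RMSE and the mean/confidence-interval summary that the curves in Figs. 4 and 5 represent:

```python
import numpy as np

def true_rmse(x_true, x_model):
    """True calibration RMSE for one trial: RMS distance between the
    ground-truth pixel positions and those projected with the fitted model."""
    return float(np.sqrt(np.mean(np.sum((x_model - x_true) ** 2, axis=1))))

def summarize(values, level=90.0):
    """Mean and central `level` % interval of a per-trial statistic
    (e.g., RMSE or one model parameter) at a single noise level sigma_j."""
    half = (100.0 - level) / 2.0
    lo, hi = np.percentile(values, [half, 100.0 - half])
    return float(np.mean(values)), (float(lo), float(hi))
```

Applying `summarize` to the RMSE of every trial at each σ_j yields the mean and 90 % bounds as functions of measurement error, which is the form in which the results above (e.g., 0.14 ± 0.03 pixels) are quoted.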

Conclusions
The increasing use of stationary daytime sky imaging instruments for solar forecasting applications has motivated the need for automatic geometric camera calibration methods and an associated general camera model. The camera model presented is not specific to fisheye lenses and is generally applicable to most fixed focal length dioptric camera systems with a photo objective lens. We have proposed a method to automatically detect and use the sun position over a sequence of images to calibrate the proposed camera model. Calibration performance on clear days ranged from 0.94 to 1.24 pixels root mean square error (RMSE). An uncertainty analysis indicated that if measurement errors are normally distributed, this corresponds to a true calibration error of 0.14 ± 0.03 to 0.16 ± 0.03 pixels RMSE (0.07 ± 0.02 to 0.08 ± 0.02 pixels SD), respectively. A back-projection model, which may be more useful for many applications, is proposed as a straightforward extension of the forward projection model. The uncertainty in the forward model parameters was analyzed and is provided graphically as a function of the solar position measurement error.

Data availability
The sky imagery data used in this work were collected as part of the UCSD Sky Imager Cloud Position Study (https://www.arm.gov/campaigns/sgp2013sicps). Data from that study are available upon request to the corresponding author.

For fitting the back-projection model, the calibration constraints were found to be unnecessary because the process involves fitting only the residual radially symmetric and decentering distortion. The focal lengths α_x and α_y are already set, so the orthogonality constraint is not required. The other two constraints treat the specific shape of the solar arc, whereas the back-projection parameters are fit using synthetic data points generated from the forward projection that cover the whole image; these constraints are therefore also not required. To initialize LM for back-projection fitting, the coefficients b_n and q_n can be set to zero.

Figure 2 .
Figure 2. Solar position measurements on (a) 13 May 2013 (case 1); (b) three days in May and seven days in June 2013 with an image on 11 June 2013 (case 4); (c) 7 June 2013 (case 5).Measurements are overlaid on example images.

Figure 3 .
Figure 3. Synthetic data set point distribution. The 1673 points are generated by taking the solar position every 30 s from sunrise to sunset on 13 May 2013 and projecting it onto the image plane using a set of ground truth camera model parameters. The points shown are ground truth with no noise added. The background image (for visual reference only) is from 3 May 2013.

Figure 4 .
Figure 4. Root mean square calibration error distribution (Eq. 28) as a function of the simulated measurement error standard deviation σ_j. The mean and the 50 and 90 % confidence intervals are shown as curves.

Figure 5 .
Figure 5. Distributions of (a) the x focal length α_x; (b) the y focal length α_y; and (c), (d) the principal point (x_o, y_o), shown as a function of the measurement error standard deviation σ_j. The mean and the 50 and 90 % confidence intervals are shown as curves.
The USI 1.8 (Urquhart et al., 2015) uses an Allied Vision GE-2040C camera, which contains a 15.15 mm × 15.15 mm, 2048 × 2048 pixel Truesense KAI-04022 interline transfer charge-coupled device (CCD). The lens is a Sigma circular fisheye lens with a 4.5 mm nominal focal length and an equisolid angle projection (Eq. 8c). Images cropped to 1748 × 1748 pixels (3.1 MP) are captured every 30 s during daylight hours, which for this experiment yielded over 1400 images per day. The USI uses 3 exposures at integration times of 3, 12, and 48 ms to generate a composite HDR image. The system clock is regularly updated using the network time protocol, so image capture times are accurate to within a second. Extensive details of the USI can be found in Urquhart et al. (2015). The USI used in this work was deployed at the Department of Energy, Atmospheric Radiation Measurement (ARM) Program, Southern Great Plains (SGP) Climate Research Facility from 11 March to 4 November 2013 at a longitude and latitude of 97.484856° W, 36.604043° N.

Table 2 .
Camera model parameters (excluding distortion terms) determined from the five test cases of solar input data. The mean and standard deviation are also given. The unit denoted [pixels f⁻¹] is pixels per focal length.

Table 3 .
Calibration error metrics for each case: root mean square error (RMSE); mean absolute error (MAE); standard deviation (SD); and measurement root mean square difference (RMSD m ) and measurement standard deviation (SD m ).