This is the html version of the file https://arxiv.org/pdf/1709.08681. Google automatically generates html versions of documents as we crawl the web.
Tip: To quickly find your search term on this page, press Ctrl+F or ⌘-F (Mac) and use the find bar.
arXiv:1709.08681v1 [astro-ph.IM] 25 Sep 2017
Page 1
Draft version May 9, 2018
Typeset using LATEX twocolumn style in AASTeX61
THE POINTING SELF CALIBRATION ALGORITHM FOR APERTURE SYNTHESIS RADIO TELESCOPES
S. Bhatnagar1
and T.J. Cornwell
1, 2
1National Radio Astronomy Observatory, 1003 Lopezville Road, Socorro, NM, 87801, USA.
2Tim Cornwell Consulting
(Dated: Received: June 13, 2017; Accepted: Sept. 22, 2017)
ABSTRACT
This paper is concerned with algorithms for calibration of direction dependent effects (DDE) in aperture synthesis radio tele-
scopes (ASRT). After correction of Direction Independent Effects (DIE) using self-calibration, imaging performance can be
limited by the imprecise knowledge of the forward gain of the elements in the array. In general, the forward gain pattern is
directionally dependent and varies with time due to a number of reasons. Some factors, such as rotation of the primary beam
with Parallactic Angle for Azimuth-Elevation mount antennas are known a priori. Some, such as antenna pointing errors and
structural deformation/projection effects for aperture-array elements cannot be measured a priori. Thus, in addition to algorithms
to correct for DD effects known a priori, algorithms to solve for DD gains are required for high dynamic range imaging. Here, we
discuss a mathematical framework for antenna-based DDE calibration algorithms and show that this framework leads to compu-
tationally efficient optimal algorithms which scale well in a parallel computing environment. As an example of an antenna-based
DD calibration algorithm, we demonstrate the Pointing SelfCal algorithm to solve for the antenna pointing errors. Our analysis
show that the sensitivity of modern ASRT is sufficient to solve for antenna pointing errors and other DD effects. We also discuss
the use of the Pointing SelfCal algorithm in real-time calibration systems and extensions for antenna Shape SelfCal algorithm for
real-time tracking and corrections for pointing offsets and changes in antenna shape.
Keywords: Methods: data analysis – Techniques: interferometry – image processing
Corresponding author: S. Bhatnagar
sbhatnag@nrao.edu
arXiv:1709.08681v1 [astro-ph.IM] 25 Sep 2017

Page 2
2
Bhatnagar and Cornwell
1. INTRODUCTION
The scientific deliverables of modern and next generation
interferometric radio telescopes, some under operation or in
advanced stages of commissioning, naturally require high
sensitivity and high dynamic range imaging. All these tele-
scopes typically promise at least a ten-fold increase in in-
stantaneous sensitivity compared to previous generation tele-
scopes. The projected achievable thermal noise in the im-
ages from these telescopes is in the range of 1−10 µJy/beam
corresponding to typical imaging dynamic ranges of 105−7,
particularly at frequencies <few GHz.
The underlying assumption in sensitivity calculations is
that the data processing procedure will remove systematic ef-
fects due to the instrument, atmosphere/ionosphere or sky to
a level significantly below the thermal noise limit so that the
RMS noise decreases as square root of the total number of in-
dependent measurements (the product of the total observing
bandwidth (∆ν) and total on-source integration time (∆T)).
In practice, the data are corrupted due to a number of direc-
tion independent (DIE) and direction dependent (DDE) ef-
fects. DI effects are constant across the field-of-view (FoV)
and correspond to a single value per antenna as a function
of time, frequency, and polarisation. Efficient algorithms
to solve for DI effects have been in use for many decades
(Thompson et al. 2017). However some instrumental and
atmospheric/ionospheric effects are directionally dependent.
Designing efficient solvers for DDE is more difficult and lag
behind solvers for DIE.
Aperture synthesis radio telescopes synthesize an aper-
ture equivalent to the largest projected separation between
the individual antennas in the array by cross-correlating
the signals from each antenna-pair in the array (Thomp-
son et al. 2017). Before being correlated, imprinted on the
signals are the effects of a number of instrumental and at-
mospheric/ionospheric components. The far-field complex-
valued electric field pattern (EFP) of the antenna and the
feed further modifies the incoming wave before it is detected
as a voltage fluctuation. In general, the cumulative effect
of all these is time-, frequency-, polarization- and direction-
dependent.
The EFP of the individual elements in an aperture synthesis
array constitutes the strongest instrumental DD effect. While
it is fundamentally direction-, frequency- and polarization-
dependent, it can also vary significantly with time for a num-
ber of reasons. The primary beam (PB) is typically rota-
tionally asymmetric and for 2-axis Azimuth-Elevation mount
(Az-El mount) dish antennas, it rotates with respect to the
sky as a function of the Parallactic Angle (PA). This leads to
time varying DDE gains within the field-of-view (FoV) and
constitutes the strongest time varying DDE for such systems.
For aperture-array elements, the beam geometry is fixed in
Earth-based coordinates, and thus varies in celestial coordi-
nates. This constitutes the strongest instrumental DD effect
for aperture-array telescopes. Given an antenna aperture il-
lumination function measured (or modeled) a priori, the ex-
isting A-Projection algorithm can be used to correct for these
known effects during image deconvolution (Bhatnagar et al.
2008; Bhatnagar et al. 2013).
Antenna pointing errors affect the approximately static
case of snapshot imaging with telescopes with high instanta-
neous sensitivity. In observations with long integration time
(for improved sensitivity or uv-coverage, or both), time vary-
ing antenna pointing offsets also lead to time-varying DDE
gains comparable in magnitude to those due to the rotation
of PB with Parallactic angle (PA). In this paper, we describe
a general mathematical framework for algorithms to solve for
the unknown DDE. We apply it to the specific case of antenna
pointing errors.
2. THEORETICAL FRAMEWORK
The measurement equation including DDE for an aperture
synthesis telescopes can be compactly written using the no-
tation in Hamaker et al. (1996) as1
VObs
ij
=MDI
ij
MDD
ij (s) I(s) e2πιs·bi j ds
=MDI
ij
(
Aij ⋆ V) .
(1)
The appropriate symbols here and in the rest of the paper rep-
resent quantities in the instrumental polarization basis (cir-
cular or linear polarization). VObs
ij
is the observed and V
the true full polarization visibility vectors. Aij is the Fourier
transform of MDD
ij
and the symbol ’⋆’ represents the standard
matrix-vector multiplication, except that the multiplication
operations are replaced by the convolution operation in the
algebra. MDI
ij and MDD
ij (s) are the outer products of the DI
and DD Jones matrices JDI and JDD(s) respectively. We refer
to these outer products as the Radio Mueller matrices2 for DI
and DD gains. s is a direction in the sky, I is the image and
bij is the projected separation between the antennas i and j
in units of the wavelength. The goal of imaging is to esti-
mate the true sky brightness distribution I(s) in the presence
of known or unknown gains MDI
ij and MDD
ij (s).
2.1. Overview of direction independent calibration
1 The symbol ’ι’ is used as a symbol for iota throughout the text and
should not be confused with i which is used as an antenna-index subscript.
2 In the original optical literature (Jones 1941; Mueller 1948), Jones ma-
trices are defined in the Stokes basis and an outer product of the Jones matri-
ces is called the Mueller Matrix. Radio interferometric measurements are an
outer product of electric fields measured at the two antennas in the feed po-
larization basis (circular or linear) which is the natural basis for calibration.
However since this is also specific to radio interferometric measurements,
we use the term Radio Mueller matrix. This Radio Mueller matrix is related
to the optical Mueller matrix via a unitary transform (see e.g., Hamaker et al.
1996).

Page 3
The Pointing Selfcal algorithm
3
Direction independent effects (DIE) (MDI
ij in Eq. 1) are due
to the telescope electronics and atmospheric/ionospheric ef-
fects at scales much large than the antenna FoV. Calibra-
tion of such gains is done using the self-calibration tech-
nique (Cornwell & Wilkinson 1981; Cornwell 1999). Ignor-
ing MDD
ij (s) and factoring MDI
ij into the antenna based DIE Ji,
a convenient specialization of Eq. 1 for DIE calibration can
be written as
VObs
ij
= (Ji ⊗ J
j
)
V
(2)
and Ji solved-for using χ2−minimization techniques. For
model visibilities VM
ij , the residual visibilities are
Rij = VObs
ij
− (Ji ⊗ J
j )VM
ij
(3)
and
χ2 = ∑
ij
R
ij· Wij · Rij.
(4)
Wij is a diagonal matrix of weights proportional to the inverse
of the measurement variance. Equation 4 is a sum over all
baselines of the weighted L2-norm of the full-polarization
residual vector Rij. For clarity, we show below the expansion
of Eq. 4 in terms of the elements of Rij and Wij:
χ2 =∑
ij


[
rpp
rpq
rqp
rqq
]
ij
wpp
0
wpq
0 wqp
wqq

ij
rpp
rpq
rqp
rqq

ij


=∑
ij
(
|rpp|2 wpp + |rpq|2 wpq + |rqp|2 wqp + |rqq|2 wqq)
ij
(5)
where r and w are the elements of the vector Rij and the ma-
trix Wij respectively and subscripts p and q represent orthog-
onal polarization states (circular or linear).
The Jones matrix Ji in Eq. 3 are direction independent with
its elements consisting of single numbers (and not 2D func-
tions). The evaluation of MDI
ij in Eq. 1 therefore requires
computationally simpler outer-product operator (as against
the outer-convolution operator required for MDD
ij ; see Sec-
tion 2.2) and application of
(
Ji⊗ J
j
)
is a simple matrix mul-
tiplication. This observation leads to an efficient DI calibra-
tion algorithm. For the relevant case of parallel-hand only
calibration,
Ji =
 gp
0
0 gq

i
(6)
where the superscripts p and q represent the two orthogonal
polarization pairs. Rij can be written as
Rij = Xij − (Ji ⊗ J
j )1
(7)
where the symbol 1 is a 4×1 column-vector with all elements
equal to one and
Xij = [diag(VM
ij )
]−1
VObs
ij .
The function diag(a) returns a diagonal matrix with the vec-
tor a as its diagonal. For a simple minimization algorithms
such as the Steepest Descent algorithm, equating the gradient
∂χ2
J
i
to zero for an Nant antenna-array leads to Nant simulta-
neous non-linear equations which are then solved using the
following iterative equation:
Ji∣∣∣n⊗ 1 = (1 − γ)
[
Ji∣∣∣n−1⊗ 1] +
γ [S oW]−1
∑
j,ji
(
Jj∣∣∣n−1⊗ 1
)
WijXij

(8)
where
S oW =
j,ji
(
Jj∣∣∣n−1
⊗ J
j ∣∣∣n−1
)
Wij
The symbol |n represents value at iteration n, 0 < γ < 1 is
the standard feedback loop-gain of non-linear minimization
algorithms and 1 is the 4 × 4 identity matrix. When VM
ij is a
wide-band prediction of the sky brightness distribution in the
FoV, Xij is a weak function of time and frequency and it may
be pre-averaged for the solution interval for improved signal-
to-noise ratio (SNR). Using previous solutions or unity as
an initial guess for Ji, convergence is typically achieved in a
few iterations (see Thompson & Daddario (1982); Bhatnagar
(1998)3 for a detailed derivation and an intuitive interpreta-
tion).
The primary focus of this paper is DD calibration – de-
scribed in the following sections – which is fundamentally
coupled to imaging. DD calibration is therefore better de-
scribed in a Mueller-matrix framework. To establish equiv-
alence between DI and DD calibration and to show that DD
calibration is a generalization of the DI calibration, we also
described the DI calibration above in a Mueller matrix for-
mulation. In practice, most software implementations of
Eq. 8 implement the matrix arithmetic by-hand for better
code optimization where the distinction between Jones- and
Mueller-matrix based formulations is not important. For DI-
only implementations which may benefit from directly using
the matrix arithmetic, it is more efficient to re-formulate the
DI algorithm using VObs
ij
= JiVJ
j where all matrices are 2×2
matrices, instead of Eq. 2. However it is important to point
out here that while this may offer computational advantages,
it does not lead to a fundamentally new algorithm.
Since VM in Eq. 3 is independent of J, it is treated as
a constant in the minimization algorithm. Calibration for
3 Also accessible from
http://www.aoc.nrao.edu/∼sbhatnag/GMRT Offline/antsol/antsol.html

Page 4
4
Bhatnagar and Cornwell
DIE gains and imaging to solve for I(s) are alternated iter-
atively. Thus calibration can be done keeping the model of
the sky fixed and imaging is done keeping the DI calibration
terms fixed. Iterating between imaging using calibrated data
VC
ij = (Ji⊗ J
j )−1VObs
ij
to make the model image IM and using
IM to solve for Jis forms a closed-loop solution to account
for DIE. We retain this overall structure for our DDE calibra-
tion approach and substitute an algorithm for estimating the
DDE’s while keeping the image fixed.
2.2. Direction dependent effects antenna-based calibration
Direction dependent effects (DDE) are represented by MDD
ij
in Eq. 1. Mathematically the effect of these terms is indistin-
guishable from I(s) and the model data VM cannot be eval-
uated independent of MDD. Consequently, to compute the
residual vector (the difference between data and the model),
the integral in Eq. 1 must be evaluated for each measured
data point (i.e., for all i, j, frequency and polarization mea-
surements). With typical modern data sizes in the multi-Tera
bytes regime the computational and the data I/O costs be-
come very high even with only a few unresolved sources for
a direct evaluation of the integral, or evaluating it separately
for different directions in the FoV. These costs are prohibitive
for dense fields (as is typical at frequencies ≤ f ew GHz) and
for fields with a combination of compact and significant ex-
tended emission (as is typical for observations in the Galactic
Plane and for mosaic observations at any frequency).
Direction dependent effects that are fundamentally aper-
ture plane effects can often be compactly modeled with a few
parameters in the aperture plane. For example, the effect of
antenna pointing errors can be modelled by two parameters
per antenna for the entire FoV. Solving for these effects in
the image plane requires solving for the antenna gains to-
wards multiple sources. This has significant numerical and
computational disadvantages. On the other hand, solving for
these effects directly in the aperture plane has lower compu-
tational complexity and optimal utilization of the full SNR
based on the integrated flux in the FoV. The Pointing Self-
cal (PSC) algorithm described below is an example of such
an aperture-plane calibration algorithm for antenna pointing
errors.
The Aij in Eq. 1 can be factored into antenna-based Jones
matrices. The observed visibilities VObs
ij
calibrated for MDI
can then be written as
VObs
ij
= [Ai⊛A
j] ⋆ VM,
(9)
where VM= F I(s) – the Fourier transform of the sky bright-
ness distribution on a grid. The symbol ’⊛’, introduced in
Bhatnagar et al. (2008), represents the outer-convolution op-
erator4. Ai is an antenna DD Jones matrix given by
Ai =
Ap
0
0 Aq

i
.
(10)
The elements along the diagonal of this matrix describe the
electric field distribution across the antenna aperture for the
two orthogonal polarizations.
The gradient of χ2 w.r.t. antenna-based parameters a
(which are, in general, complex-valued) for the more general
Eq. 9 is given by5
∂χ2
∂a
i ∣∣∣∣∣∣n
= −2 ∑
j,ji
(
R
ij∣∣∣∣n
· Wij ·
[
iVM
ij ∣∣∣n
]) ,
(11)
where
iAij =
∂Aij
∂a
i
=
∂AM
i
∂a
i
⊛AM
j
(12)
and
iVM
ij ∣∣∣n = ∇iAij∣∣∣n
⋆ VM.
The symbol
represents the real part of its argument, which
evaluates to a scalar in the same way as Eq. 4. AM
i
– the
model for the true Ai – is parametrized by the parameter a
and Rij∣∣∣n = VObs
ij
[
AM
i ⊛AM
j
]
n
⋆ VM
is the residual vector
computed at iteration n. The parameters are then updated
iteratively as
a|n = a|n−1 + f
( ∂χ2
∂a∣∣∣∣∣∣n
)
(13)
where f is a function that depends on the details of the
non-linear minimization algorithm used. For minimization
algorithms that assume a diagonally-dominant Hessian ma-
trix (such as the Steepest descent algorithm), f(x) = γx.
More sophisticated minimization algorithms involve poten-
tially expensive evaluation of the Jacobi/covariance matrix
(see Press et al. 1992, or later editions for detailed discus-
sions).
3. THE POINTING SELFCAL ALGORITHM
Our algorithm works in the general way of self-calibration
algorithms, iterating between estimation of the sky bright-
ness holding the calibration parameters fixed, and estima-
tion of the calibration parameters holding the sky brightness
model fixed (Cornwell 1999). For the first part, we use the
4 The element-by-element algebra of the outer-convolution operator is
the same as that of the outer-product operator used in the DI description of
Hamaker et al. (1996), except that the complex multiplications are replaced
by convolutions.
5 Equations 11 and 12 are a general form of equations for antenna-based
calibration and include the DI case. As a test and for intuitive understanding,
replacing ⋆ by dot product, Aij by Ji⊗ J
i , a
i by J
i and equating Eq. 11 to
zero recovers Eq. 8.

Page 5
The Pointing Selfcal algorithm
5
combination of the Wide-band A-Projection (Bhatnagar et al.
2013) and the Multi-term Multi-Frequency Synthesis algo-
rithms (Rau & Cornwell 2011). Hence in this section, we will
concentrate on the second part: estimating the calibration pa-
rameters (pointing errors) while holding the sky brightness
fixed.
In Eq. 11 the evaluation of the residual vector R, and in
this case also ∇Aij, can be expensive. Both Rij and ∇iAij
in Eq. 11 involves evaluation of Aij and
[Ai
∂a
i
⊛A
j
]
at each
iteration of the minimization algorithm. For an array with
Nant antennas, the computational cost of these evaluations is
O
(
4 × 2N2
A log(NA) × 2N2
ant
)
, where NA is the size in pixels
of the quantized representation of the elements of A. This
cost can be prohibitive for modern telescopes with Nant and
NA in the range of few×102−3 antenna elements and 105−7
pixels respectively. Thus treating the PB as being completely
unknown is not viable computationally nor can we expect it
to be well-conditioned. Hence a form for the PB with less
degrees of freedom is necessary.
The mechanical antenna pointing errors can be represented
with fewer degrees of freedom; as a phase gradient across
the antenna aperture illumination pattern. Purely mechani-
cal fractional pointing errors are also the same for both po-
larization and at all frequencies. Aperture-plane solvers for
the pointing errors can therefore easily benefit from the in-
stantaneous continuum sensitivity of modern wide-band re-
ceivers. For simplicity and to facilitate analysis, we model
the main-lobe of the EFP for antenna i as a gaussian of the
form e−(l−li)2α2
where l is a direction on the sky with respect
to the pointing direction, li is the pointing error and α is 2−1/2
times the inverse of the standard deviation. It can be replaced
with a more accurate function for the antenna PB without the
pointing errors in the final results. Aij (the Fourier transform
of the PB) can then be expressed as
Aij = A
ij
e
(li−l j)2α2
2
 eιπu(li+lj)
(14)
and Eq. 12 for a
i = li becomes
iAij =
∂Aij
∂li
= Aij
[(
lj − li) α2 + ιπu]
(15)
where u is the Fourier-conjugate variable for l (in units of
wavelengths and radians respectively). The equations above
are written for one dimension only for clarity and can be triv-
ially extended for the other dimension and for the heteroge-
neous array case. li and lj are the antenna pointing errors
for antennas i and j. A
ij is the pre-computed version of A
and includes all PB effects known a priori – but not the me-
chanical antenna pointing errors – like polarization squint,
off-axis polarization effects, effects of antenna blockages and
feeds, rotation with PA, dependence on frequency, etc. Us-
ing A-Projection to apply Aij and ∇iAij, both Rij and ∂χ2/∂a
i
can be computed at full continuum sensitivity by integration
across time, frequency and polarization without loss of ac-
curacy. The term inside the square brackets in Eq. 14 is the
reduction in the amplitude because of the decorrelation of
the measured visibilities due to the pointing errors at each
antenna of the baseline. It is of order unity for small values
of (li − lj) – the difference in the pointing errors at the two
antennas – and may be ignored (for a typical maximum value
for (li −lj) of order 1−2% of the width of the antenna PB, this
term constitutes an error of less than 0.1% in the amplitude).
Algorithm 1 The Pointing Selfcal algorithm: estimation of
pointing errors
1: Pre-compute VM= F
[
IM(s)
]
2: Pre-compute A
ij(t,ν) for all required i, j, t and ν
3: for all data do
4:
for all t and ν in the interval (τsol,∆νsol) do
5:
for Iteration n do
6:
Chi[] = 0.0; dChi[] = 0.0;
7:
for all i do
8:
for all j, j i do
9:
Compute Aij(t,ν) and ∇iAij(t,ν)
(Eqs. 14, 15)
10:
Use A-Projection algorithm to compute:
VM
ij = Aij ⋆ VM
iVM
ij = ∇iAij ⋆ VM
11:
Compute Rij = VObs
ij
(t,ν) − VM
ij (t,ν)
12:
Accumulate
Chi[j]=Chi[j]+R
ijWijRij
dChi[j]=dChi[j]+
(
R
ijWij
[
iVM
ij
])
13:
if Terminate(Chi[],dChi[],n) then
14:
break
15:
Update li|n = li|n−1 + f (dChi[j])
16:
Save all li for solution intervals (τsol, ∆νsol)
To predict the model data at each iteration, modified A as
in Eq. 14 is used with the A-Projection algorithm to compute
the model data, including the effects of the antenna pointing
errors and subtracted from VObs
ij
to compute Rij. Similarly,
modified A as in Eq. 15 is used to compute ∇iAij∣∣∣n
⋆ VM
.
See Algorithm 1 for the various computational steps and the
nesting of the loops involved. The functions Terminate() and
f() in steps 13 and 15 depend on the details of the minimiza-
tion algorithm.
4. RESULTS
4.1. Simulations
To verify the numerical correctness and performance of
the PSC algorithm, we simulated data for the Karl G. Jan-
sky Very Large Array (VLA) (Perley et al. 2011) which in-
cluded the effects of the time-varying pointing errors at each
antenna. The simulation was for an L-Band observation and
used a sky model derived from the NVSS source list. The

Page 6
6
Bhatnagar and Cornwell
total integrated flux in the FoV, including the first sidelobe of
the PB, was 1.5 Jy distributed across the beam. For numeri-
cal accuracy, the data for the sky model was predicted using
direct Fourier transform and a model for the antenna PB with
pointing offsets. The pointing offsets for the antennas were
uniformly distributed between ±20 , which corresponds to
∼ ±1% of the beam-width at 1.5 GHz. Independent pointing
offset errors of ±5 were added for each antenna as a function
of time to simulate short term time-varying offsets. Finally,
random noise corresponding to a continuum sensitivity limit
of 1 µJy/beam in eight hours of observing with an integra-
tion time of 10 sec per sample was added to the visibilities to
simulate thermal noise.
Figure 1 shows the result of application of the PSC algo-
rithm to this simulated data. The continuous curves show the
antenna pointing errors for 4 of the 27 antennas (for clarity)
as a function of time. The over-plotted filled circles are the
PSC solutions with a solution interval of 30 sec. The resid-
uals per baselines for solutions with 10 sec solution interval
were consistent with the thermal noise limit in the simula-
tion (see Bhatnagar et al. 2004), verifying the basic numeri-
cal correctness of the algorithm.
Figure 1. The solutions from the Pointing Selfcal algorithm us-
ing simulated data to verify the algorithm. The continuous lines
show the pointing offsets as a function of time for four representa-
tive antennas while the filled-circles show the pointing offset solu-
tions with 30 sec. solution interval.
4.2. Application to VLA data
To test the algorithm with real data, we used a wide-band
observation of the IC2233 field with the VLA at L-Band.
This data was imaged using about ∼600 MHz of bandwidth
to generate the continuum sky model used as an input for the
PSC algorithm. The VLA antenna optics creates an offset
between the parallel-hand PBs (polarization squint) of ∼6%
of the beam at any frequency, corresponding to ∼ 110 at
1.5 GHz. To gain confidence in any antenna pointing offset
that the solutions may show, we set up the PSC algorithm
to solve for the offsets independently for both the polariza-
tions (R and L). Difference in R- and L-solutions will then be
a measure of the polarization squint and the mean value of
these solutions ([Ro f f set +Lo f f set]/2) will be a measure of the
antenna mechanical pointing error (which should be the same
for both polarizations). In the antenna Az-El plane, the sep-
aration between the R and L solutions will then correspond
to the length of the polarization squint vector, while a vector
from the center of the squint vector to the origin of the Az-
El plane would correspond to the actual antenna mechanical
pointing error vector.
The results on the Az-El plane for two of the antennas in
the array are shown in Fig. 2. The vector labeled “Squint
Vector” shows the separation between the median values of
the two offsets and is equal to ∼120 and 105 for the two
antennas. The vector labeled “Pointing Error Vector” from
the center of the Squint Vector to the origin is a measure of
the antenna mechanical pointing offsets, which has a magni-
tude of 3.5 and 0.5 for these two antennas. These are large
systematic pointing offsets, which were subsequently veri-
fied independently and corrected in the telescope software.
As expected, after corrections, particularly for the first an-
tenna, an improved antenna sensitivity was measured giving
us a verification of the solutions and the sign convention used
in the software.
The Az- and El-offsets of the RR and LL beams as a func-
tion of time from the nominal antenna pointing direction
for a set of representative antennas with solution-intervals of
5 min and 600 MHz in time and frequency respectively are
shown in Fig. 3. The PB model was derived using a geo-
metric optics simulator for the antenna illumination patterns
which includes the effects of aperture blockage, off-axis feed
locations, and illumination taper (Brisken 2003). The rota-
tion with PA and scaling of the PB with frequency was in-
cluded in the model using the A-Projection algorithm. The
sky brightness for this observation is dominated by two com-
pact sources separated by ∼25 with other weaker sources
spread across the FoV. The scatter in the pointing solutions is
due to three factors:
1. The actual sky brightness is dominated by two strong
sources and so the constraints on the perpendicular di-
rections are weak.
2. Imperfections in the sky model
3. The limitations of the PB model used, particularly as a
function of frequency.

Page 7
The Pointing Selfcal algorithm
7
Figure 2. The results from the application of the Pointing SelfCal algorithm to the VLA data. Pointing error solutions along the elevation and
azimuth axis for the R- and L-beams are shown with red and blue points. The average separation between these set of points corresponds to the
polarization squint of the VLA antennas due to the off-axis feed location. The separation between the center of the squint vector and the origin
corresponds to the antenna mechanical pointing error.
−70
−60
−50
−40
−30
−20
−10
0
10
Ant. 1
−90−80−70−60−50−40−30−20−100
Ant. 2
−70
−60
−50
−40
−30
−20
−10
0
10
Ant. 3
−30
−20
−10
0
10
20
30
40
50
Ant. 7
0
10
20
30
40
Time (min)
−20−10010203040506070
Ant. 18
Az offset (arcsec)
−50
0
50
100
150
200
250
Ant. 1
−50
0
50
100
150
200
250
300
Ant. 2
−60−40−20020406080100120
Ant. 3
−100
−80
−60
−40
−20
0
20
40
Ant. 7
0
10
20
30
40
Time (min)
−140
−120
−100
−80
−60
−40
−20
0
Ant. 18
El offset (arcsec)
Figure 3. The pointing offsets in the antenna azimuth (left) and elevation (right) axises as a function of time for several antennas. The two
curve in each panel correspond to the offsets for the R- and L-polarizations. The separation between the two curves in each panel correspond to
the azimuth and elevation components of the polarization squint for each antenna.
Analysis of independent holographic measurements (Per-
ley 2016) show that in addition to systematic deviations from
the expected value there are oscillations in the magnitude of
the squint vector as a function of frequency due to standing

Page 8
8
Bhatnagar and Cornwell
waves in the antenna optics (Jagannathan et al. 2017), which
were not included in the PB model. Some variations in the
pointing offset solutions, including in the length of the squint
vector could therefore be also real. The derived offsets as a
function of time along the azimuth axis for a few antennas in
the array are shown in Fig. 4. The solutions show significant
differences in the antenna pointing errors between the anten-
nas, as well as slow drifts over longer timescales (∼ 30 min)
– both of which are expected based on the regular pointing
model measurements and anecdotal evidence from imaging
results.
Figure 4. The solved antenna pointing offsets in the azimuth axis
with time for a few representative antennas of the VLA.
4.3. Noise budget
In general, the stability and the error on the solved param-
eters depend directly on the SNR in the measured data (the
RHS of Eq. 9) and on the number of free parameters. It is
therefore important to devise algorithms that maximize data-
SNR using as few free parameters as possible.
The equivalent thermal noise that contribute to the variance
in the solution is reduced by a factor equal to the square root
of the number of statistically independent samples averaged.
The residuals in the PSC solver are averaged at each iteration
across all baselines with a given antenna and for the duration
of the solution intervals in time and frequency. Assuming
that the pointing offsets at each antenna are statistically in-
dependent and random in nature, the noise contribution in
the solver due to the thermal noise in the data is reduced by
a factor equal to
∆νsolτsol (Nant − 1) for solution intervals
of ∆νsol and τsol in frequency and time. On the other hand,
the signal for the solver is the differential apparent-flux with
respect to the pointing offsets in the FoV. Comparing this sig-
nal with the effective noise allows an estimate for the mag-
nitude and the distribution of the sky brightness required for
−2
−1
0
1
2
Angular offset (σ1GHz)
0.0
0.2
0.4
0.6
0.8
1.0
1.2
1.4
A
m
plitude
E @ 1 GHz
|(∂E/∂l)E| @ 1 GHz
|(∂E/∂l)E| @ 2 GHz
Figure 5. The curves show slices across the ∣∣∣E(ν)∂E(ν)/∂l∣∣∣ function
at the edges of a 1-GHz wide band (in green and red). The main-
lobe of E at 1 GHz is approximated as a gaussian of width σ1GHz (in
blue).
deriving pointing offset solutions. For a homogeneous array
case with identical antennas, the SNR per parameter for an
antenna-based pointing offsets solver is
S NRPS C =
|∇S|
SEFD
√∆νsolτsol(Nant − 1)
(16)
where
SEFD =
2kbTsys
AπR2)
× 1026 (Jy)
kb is the Boltzmann’s constant. R, ηA and Tsys are the antenna
aperture radius (in meters), efficiency and system tempera-
ture (in Kelvin) respectively. ∇S is the differential apparent
integrated flux in the FoV with respect to the antenna point-
ing errors is given by:
∇S =
∫ (∂E
∂l
⊗ E
)
IM(s) ds (Jy)
(17)
where E is the antenna far-field EFP6 and IM is a model for
the sky brightness distribution as a function of the direction
s. For the VLA at L-band, Nant = 27 and SEFD ≈ 358 Jy
(Tsys = 35 K, ηA = 0.55 and R = 12.5 m). The total apparent
flux in the VLA FoV at L-band above the thermal noise limit
for a usable bandwidth of 800 MHz and 1 min of integration
in time is estimated to be few×100 mJy. In the absence of any
other source of noise, solutions for antenna pointing offsets
should be possible at reasonably high accuracy on very short
timescales. Some numerical experiments suggest that reli-
able solutions are possible at several minutes timescale due
6 Recall that E is related to Eq. 1 via MDD
ij
= Ei⊗ E
i and the Fourier
transform of E is the DD equivalent of J in Eq. 2

Page 9
The Pointing Selfcal algorithm
9
to addition numerical noise, e.g. due to gridding, rotation of
AM with PA, inaccuracies in IM and AM, etc.
Analysis of Eq. 17 offers a few useful thumb-rules. The
function ∣∣∣∣
E
∂l
⊗ E∣∣∣∣ is a double-hump shaped curve with a null
at the location of the peak of E (see Fig. 5). Therefore,
while the sky brightness at the center of the PB does not
contribute signal for the solver, the contribution of the flux
around the center increases with the magnitude of the point-
ing error. This function peaks around the half-power-point
of the PB. The flux in this region of the PB therefore con-
tributes the maximum signal. The PSC algorithm therefore
works well for fields where the sky brightness distribution
is spread across the PB. This is almost always the case at
low frequencies (⪅few GHz) and for mosaic imaging which
is typically the observing mode at high frequencies. Note
that the antenna pointing errors also adversely affect imaging
performance only for fields with significant flux away from
the pointing center. The area under this curve is indepen-
dent of frequency (changes in ∂E/∂l and E compensates for
each other as a function of frequency). Since the mechanical
pointing errors are independent of frequency and on an av-
erage the low-frequency radio flux varies as ν−0.7, the signal
for the pointing solver will increase with bandwidth, partic-
ularly at lower radio frequencies. More precise estimates for
the SNR and solution timescales will require simulations us-
ing models for E, the sky brightness and its spectral index
distribution.
5. DISCUSSIONS
5.1. Run-time performance analysis
The run-time cost of PSC has two components: compu-
tation of Aij and ∇iAij and their application to VObs
ij
via A-
Projection. Using Eqs. 14 and 15 with a precomputed A
ij,
the cost of evaluating Aij and ∇iAij at each iteration becomes
relatively insignificant and the total run-time cost is domi-
nated by the cost of their application to VObs
ij . For a support
size of Nsup for Aij, this cost scales as NvisN2
sup, where Nvis is
the total number of data samples within the solution interval.
Since the computations for each data sample is independent,
this cost reduces linearly with parallelization by partitioning
Nvis across multiple computing cores. As a test case using a
single CPU running at 1.2 GHz clock, each iteration of the
PSC took about 1 sec per frequency channel for the VLA.
Using all the 16 computing cores available, the run-time was
reduced to ∼ 70 millisec per channel, with the total run-time
for convergence of ∼ 2 min per solution.
Multi-threaded gridders (e.g., Golap (2015)) which can
benefit from a much larger number of compute-cores avail-
able on massively parallel hardware like the modern GPUs
may help in reducing the run-time cost by a large factor.
For arrays that are non-coplanar for long-integration obser-
vations, the run-time cost of PSC increases with the W-term
(Cornwell et al. 2008). However, for short solution interval
the run-time cost may be largely independent of the W-term
– by using the fact that arrays are instantaneously co-planar
(Cornwell et al. 2012). However, more work is needed in
these areas to determine the optimal computing architecture
for PSC.
5.2. Use in real-time-calibration
Some modern-era radio telescopes assume that a Local Sky
Model is available and use it to determine calibration param-
eters in real time without resorting to iteration over the model
(see e.g. Tasse et al. (2012)). With some additional cost in
computing, our algorithm for estimation of the pointing er-
rors could be used to track any antenna pointing errors in real
time and possibly correct the pointing errors in real-time.
5.3. Solving for PB shape
The PSC algorithm described above solves for the tip-tilt
of the antenna by solving for a phase gradient across the an-
tenna aperture. This does not however alter the Hermitian
nature of an ideal aperture illumination pattern and therefore
does not alter the predicted shape of the antenna PB. In prac-
tice, change in the shape of the antenna PB arises due to a
variety of reasons including de-focus and astigmatism in the
antenna optics, distortion of the main reflector with eleva-
tion, misplaced or misaligned elements in the antenna optics,
and projection effects in aperture-array elements. Many of
these terms result in a non-Hermitian aperture illumination
pattern which, in addition to shape distortions also leads to
complex-value PB. While pointing errors constitute the dom-
inant direction-dependent error, errors due to the shape of the
PB are also significant for the sensitivity offered by all mod-
ern radio telescopes. Its calibration is therefore also neces-
sary.
The low-order A-Solver approach (Jagannathan et al.
2017) offers a method for a closed-loop Shape Selfcal (SSC)
similar to the PSC algorithm. The A-Solver uses a physi-
cally motivated parametrized model of the antenna structure
in a geometric optics (GO) simulator (Brisken 2003) for the
antenna illumination pattern (AIP). Starting with reasonable
values for these physical parameters, the A-Solver solves
for these parameters by minimizing the difference between
the predicted and holographically measured AIP. Since the
model AIP needs to be predicted in the optimization iter-
ations, it is important to use a simulator with a reasonably
short run-time. Full-EM simulators are typically quite expen-
sive (both, in capital and run-time costs). GO simulators on
the other hand are less computationally complex and when
used in the A-Solver, capture the dominant electromagnetic
effects in the resulting optimized model. Jagannathan et al.
(2017) show that this approach captures otherwise difficult
to model effects like the effect of Standing Waves in the an-
tenna optics and other higher order phase terms across the

Page 10
10
Bhatnagar and Cornwell
antenna aperture. These higher order phase terms severely
affect the off-axis polarization leakage patterns and the A-
Solver approach offers an effective method that can enable
noise-limited full-Stokes imaging (not just Stoke-I imaging).
The error analysis in Sec.4.3 above suggests that the use of
a sky-model instead of the holographic measurements may
be possible (from an SNR point of view) in the A-Solver
for a closed-loop SSC algorithm. Note however that for
many antenna arrays, shape changes are smooth and grad-
ual and it may be sufficient to first derive the AIP model
using holographic measurements and include the tempo-
ral evolution of the derived parameters separately. For the
aperture-array elements, where these evolutions are more
severe and faster, a closed-loop SSC may be required, and
possible, using a model for the sky-brightness distribution
(e.g. a pre-determined Global Sky Model).
6. CONCLUSIONS
In this paper we present the mathematical framework for
calibration of direction-dependent effects (DDE) not known
a priori. We also include a brief overview of the direction-
independent (DI) calibration and the mathematical formula-
tion of the existing DI SelfCal algorithm.
As an example of a DD calibration algorithm, we present
the Pointing Selfcal (PSC) algorithm. As in DI SelfCal, given
a model of the sky brightness distribution, the pointing off-
set vector per antenna is solved by iteratively minimizing
the residual vector with respect to the antenna-based point
offsets. We verified the PSC solver, first by applying it to
simulated wide-band data for the VLA at L-band and show
that the pointing offset vector is recovered correctly. We then
also apply the PSC algorithm to on-sky data from the VLA
at L-band. The VLA antenna optics has a polarization squint
which results in an angular separation between the right- and
left-circular polarization beams. To verify the PSC algo-
rithm, the solver was set-up to solve for the pointing offset
vector separately for the two polarizations. In the antenna
Az-El plane, the difference between the pointing offset vec-
tors for the two polarizations is a measure of the squint vec-
tor and the separation of the center of squint vector from the
origin gives a measure of the mechanical antenna pointing
offset. We verified that the PSC solver indeed recovers the
average squint vector, though antenna-to-antenna variations
were also large and significant. Some of the antennas had
large mechanical pointing offsets, which were subsequently
verified via independent measurement of the expected im-
provement in the antenna gain after correcting for them in
the telescope software.
We also discuss the noise budget for the PSC algorithm.
Analysis of the signal-to-noise ratio (SNR) available for the
PSC solver as a function of the wide-band sky brightness dis-
tribution and telescope parameters leads to the following con-
clusions:
1. The PSC algorithm is optimal in utilizing the SNR due
to the sky brightness distribution in the entire antenna
field of view, rather than, for example, a few bright
sources.
2. Antenna pointing offsets can be solved-for at high sig-
nificance with the instantaneous sensitivity of most
modern radio interferometric telescopes with wide-
band receivers and typical sky brightness distributions.
3. While the sky brightness at the center of the PB does
not contribute signal for the PSC solver, the contribu-
tion from around the center increases with the mag-
nitude of the pointing offsets. This signal also peaks
around the half-power points of the PB. The PSC al-
gorithm therefore works well for observations where
the sky brightness is distributed across the FoV. The
degradation in the imaging performance due to the an-
tenna pointing errors is also more significant for such
observations. Emission spread across the FoV is typi-
cal at frequencies below a few GHz and at much higher
frequencies where mosaic imaging of emission much
larger than the antenna PB is often necessary.
4. We expect PSC to scale well in a parallel comput-
ing environment. Simple parallelization by data-
partitioning is possible and efficient. Reduction in
the run time by large factors using multi-threaded re-
samplers (Golap 2015) deployed on massively parallel
hardware remains a possibility, though more work is
needed in this area to arrive at an optimal computing
architecture.
5. The PSC is typically set-up for relatively short solu-
tion interval in time. For low-frequency observations,
the increase in the run time due to the w-term may be
mitigated by treating the array as co-planar for each
solution interval (Cornwell et al. 2012).
The mathematical framework for DD calibration presented
here can be extended for a Shape SelfCal (SSC) algorithm
to account for the change in the shape of the aperture us-
ing the low-order A-Solver approach. In the A-Solver ap-
proach, the parameters describing the physical structure of
the antenna are determined using a geometric optics predictor
for the antenna aperture illumination pattern (AIP) and holo-
graphic measurements of the AIP (Jagannathan et al. 2017).
Based on our estimate of the SNR typically available, it may
be possible to develop an SSC algorithm using a model of
the sky brightness distribution. More work is required, and
in progress, in this area.

Page 11
The Pointing Selfcal algorithm
11
Finally, our algorithm could be used in the real-time cali-
bration systems of modern-era radio telescopes where a Lo-
cal Sky Model is assumed to be available. This will allow
tracking of any antenna pointing errors in real time and pos-
sibly real-time corrections.
This work was done using the R&D branch of the CASA
code base. We wish to thank the referee, Daniel Mitchell,
for very useful comments and pointing out some corrections
in the equations and their interpretation. The National Radio
Astronomy Observatory is a facility of the National Science
Foundation operated under cooperative agreement by Asso-
ciated Universities, Inc.
Software: CASA7, Python, MatPlotLib
REFERENCES
Bhatnagar, S. 1998, Computation Of Antenna Dependent Complex
Gains, Tech. Rep. No. R00l72, National Centre for Radio
Astrophysics, Pune. http://www.aoc.nrao.edu/
˜sbhatnag/GMRT_Offline/antsol/antsol.html
Bhatnagar, S., Cornwell, T. J., & Golap, K. 2004, Solving for the
antenna based pointing errors, Tech. rep., EVLA Memo 84.
http://www.aoc.nrao.edu/vla/geninfo/memoseries/
evlamemo84.ps
Bhatnagar, S., Cornwell, T. J., Golap, K., & Uson, J. M. 2008,
A&A, 487, 419. http:
//adsabs.harvard.edu/abs/2008A%26A...487..419B
Bhatnagar, S., Rau, U., & Golap, K. 2013, The Astrophysical
Journal, 770, 91.
http://stacks.iop.org/0004-637X/770/i=2/a=91
Brisken, W. 2003, Using Grasp8 To Study The VLA Beam, Tech.
rep., EVLA Memo 58. "http://www.aoc.nrao.edu/vla/
geninfo/memoseries/evlamemo58.ps"
Cornwell, T., Voronkov, M., & Humphreys, B. 2012, Proc. SPIE
8500, Image Reconstruction from Incomplete Data VII: Wide
field imaging for the Square Kilometre Array
Cornwell, T. J. 1999, in ASP Conf. Ser. 180: Synthesis Imaging in
Radio Astronomy II, ed. G. B. Taylor, C. L. Carilli, & R. A.
Perley. http://adsabs.harvard.edu/cgi-bin/nph-bib_
query?bibcode=1999sira.conf.....T&db_key=AST
Cornwell, T. J., Golap, K., & Bhatnagar, S. 2008, IEEE Journal of
Selected Topics in Signal Processing, 2, 647
Cornwell, T. J., & Wilkinson, P. N. 1981, MNRAS, 196, 1067.
http://adsabs.harvard.edu/cgi-bin/nph-bib_query?
bibcode=1981MNRAS.196.1067C&db_key=AST
Golap, K. 2015, Multi-threading gridders without copying and
locks, Tech. rep., EVLA Memo 191. https://library.
nrao.edu/public/memos/evla/EVLAM_191.pdf
Hamaker, J. P., Bregman, J. D., & Sault, R. J. 1996, A&AS, 117,
137. http://adsabs.harvard.edu/cgi-bin/nph-bib_
query?bibcode=1996A%26AS..117..137H&db_key=AST
Jagannathan, P., Bhatnagar, S., Brisken, W., & Taylor, A. R. 2017,
The Astrophysical Journal (submitted)
Jones, R. C. 1941, J. Opt. Soc. America, 31, 488
Mueller, H. 1948, J. Opt. Soc. America, 38, 661
Perley, R. 2016, Jansky Very Large Array Primary Beam
Characteristics, Tech. rep., EVLA Memo 195. "https://
library.nrao.edu/public/memos/evla/EVLAM_195.pdf"
Perley, R. A., Chandler, C. J., Butler, B. J., & Wrobel, J. M. 2011,
The Astrophysical Journal Letters, 739, L1.
http://stacks.iop.org/2041-8205/739/i=1/a=L1
Press, W., Teukolsky, S., Vetterling, W., & Flannery, B. 1992,
Numerical Recipes in C (Cambridge: Cambridge University
Press)
Rau, U., & Cornwell, T. J. 2011, A&A, 532, 17.
http://stacks.iop.org/0004-637X/770/i=2/a=91
Tasse, C., van Diepen, G., van der Tol, S., et al. 2012, Comptes
Rendus Physique, 13, 28
Thompson, A. R., & Daddario, L. R. 1982, Radio Science, 17, 357.
http://adsabs.harvard.edu/cgi-bin/nph-bib_query?
bibcode=1982RaSc...17..357T&db_key=AST
Thompson, A. R., Moran, J. M., & Swenson, G. W., J. 2017,
Interferometry and Synthesis in Radio Astronomy (John Wiley
& Sons, Inc.)
7 https://casa.nrao.edu