arXiv:1709.08681v1 [astro-ph.IM] 25 Sep 2017

Page 1

Draft version May 9, 2018

Typeset using LATEX twocolumn style in AASTeX61

THE POINTING SELF CALIBRATION ALGORITHM FOR APERTURE SYNTHESIS RADIO TELESCOPES

S. Bhatnagar1

and T.J. Cornwell

1, 2

1National Radio Astronomy Observatory, 1003 Lopezville Road, Socorro, NM, 87801, USA.

2Tim Cornwell Consulting

(Dated: Received: June 13, 2017; Accepted: Sept. 22, 2017)

ABSTRACT

This paper is concerned with algorithms for calibration of direction dependent effects (DDE) in aperture synthesis radio tele-

scopes (ASRT). After correction of Direction Independent Effects (DIE) using self-calibration, imaging performance can be

limited by the imprecise knowledge of the forward gain of the elements in the array. In general, the forward gain pattern is

directionally dependent and varies with time due to a number of reasons. Some factors, such as rotation of the primary beam

with Parallactic Angle for Azimuth-Elevation mount antennas are known a priori. Some, such as antenna pointing errors and

structural deformation/projection effects for aperture-array elements cannot be measured a priori. Thus, in addition to algorithms

to correct for DD effects known a priori, algorithms to solve for DD gains are required for high dynamic range imaging. Here, we

discuss a mathematical framework for antenna-based DDE calibration algorithms and show that this framework leads to compu-

tationally efficient optimal algorithms which scale well in a parallel computing environment. As an example of an antenna-based

DD calibration algorithm, we demonstrate the Pointing SelfCal algorithm to solve for the antenna pointing errors. Our analysis

show that the sensitivity of modern ASRT is sufficient to solve for antenna pointing errors and other DD effects. We also discuss

the use of the Pointing SelfCal algorithm in real-time calibration systems and extensions for antenna Shape SelfCal algorithm for

real-time tracking and corrections for pointing offsets and changes in antenna shape.

Keywords: Methods: data analysis – Techniques: interferometry – image processing

Corresponding author: S. Bhatnagar

sbhatnag@nrao.edu

arXiv:1709.08681v1 [astro-ph.IM] 25 Sep 2017

Page 2

Bhatnagar and Cornwell

1. INTRODUCTION

The scientific deliverables of modern and next generation

interferometric radio telescopes, some under operation or in

advanced stages of commissioning, naturally require high

sensitivity and high dynamic range imaging. All these tele-

scopes typically promise at least a ten-fold increase in in-

stantaneous sensitivity compared to previous generation tele-

scopes. The projected achievable thermal noise in the im-

ages from these telescopes is in the range of 1−10 µJy/beam

corresponding to typical imaging dynamic ranges of 105−7,

particularly at frequencies <few GHz.

The underlying assumption in sensitivity calculations is

that the data processing procedure will remove systematic ef-

fects due to the instrument, atmosphere/ionosphere or sky to

a level significantly below the thermal noise limit so that the

RMS noise decreases as square root of the total number of in-

dependent measurements (the product of the total observing

bandwidth (∆ν) and total on-source integration time (∆T)).

In practice, the data are corrupted due to a number of direc-

tion independent (DIE) and direction dependent (DDE) ef-

fects. DI effects are constant across the field-of-view (FoV)

and correspond to a single value per antenna as a function

of time, frequency, and polarisation. Efficient algorithms

to solve for DI effects have been in use for many decades

(Thompson et al. 2017). However some instrumental and

atmospheric/ionospheric effects are directionally dependent.

Designing efficient solvers for DDE is more difficult and lag

behind solvers for DIE.

Aperture synthesis radio telescopes synthesize an aper-

ture equivalent to the largest projected separation between

the individual antennas in the array by cross-correlating

the signals from each antenna-pair in the array (Thomp-

son et al. 2017). Before being correlated, imprinted on the

signals are the effects of a number of instrumental and at-

mospheric/ionospheric components. The far-field complex-

valued electric field pattern (EFP) of the antenna and the

feed further modifies the incoming wave before it is detected

as a voltage fluctuation. In general, the cumulative effect

of all these is time-, frequency-, polarization- and direction-

dependent.

The EFP of the individual elements in an aperture synthesis

array constitutes the strongest instrumental DD effect. While

it is fundamentally direction-, frequency- and polarization-

dependent, it can also vary significantly with time for a num-

ber of reasons. The primary beam (PB) is typically rota-

tionally asymmetric and for 2-axis Azimuth-Elevation mount

(Az-El mount) dish antennas, it rotates with respect to the

sky as a function of the Parallactic Angle (PA). This leads to

time varying DDE gains within the field-of-view (FoV) and

constitutes the strongest time varying DDE for such systems.

For aperture-array elements, the beam geometry is fixed in

Earth-based coordinates, and thus varies in celestial coordi-

nates. This constitutes the strongest instrumental DD effect

for aperture-array telescopes. Given an antenna aperture il-

lumination function measured (or modeled) a priori, the ex-

isting A-Projection algorithm can be used to correct for these

known effects during image deconvolution (Bhatnagar et al.

2008; Bhatnagar et al. 2013).

Antenna pointing errors affect the approximately static

case of snapshot imaging with telescopes with high instanta-

neous sensitivity. In observations with long integration time

(for improved sensitivity or uv-coverage, or both), time vary-

ing antenna pointing offsets also lead to time-varying DDE

gains comparable in magnitude to those due to the rotation

of PB with Parallactic angle (PA). In this paper, we describe

a general mathematical framework for algorithms to solve for

the unknown DDE. We apply it to the specific case of antenna

pointing errors.

2. THEORETICAL FRAMEWORK

The measurement equation including DDE for an aperture

synthesis telescopes can be compactly written using the no-

tation in Hamaker et al. (1996) as1

VObs

=MDI

MDD

ij (s) I(s) e2πιs· ~bi j ds

=MDI

Aij ? V◦ .

(1)

The appropriate symbols here and in the rest of the paper rep-

resent quantities in the instrumental polarization basis (cir-

cular or linear polarization). VObs

is the observed and V◦

the true full polarization visibility vectors. Aij is the Fourier

transform of MDD

and the symbol ’?’ represents the standard

matrix-vector multiplication, except that the multiplication

operations are replaced by the convolution operation in the

algebra. MDI

ij and MDD

ij (s) are the outer products of the DI

and DD Jones matrices JDI and JDD(s) respectively. We refer

to these outer products as the Radio Mueller matrices2 for DI

and DD gains. s is a direction in the sky, I is the image and

~bij is the projected separation between the antennas i and j

in units of the wavelength. The goal of imaging is to esti-

mate the true sky brightness distribution I(s) in the presence

of known or unknown gains MDI

ij and MDD

ij (s).

2.1. Overview of direction independent calibration

1 The symbol ’ι’ is used as a symbol for iota throughout the text and

should not be confused with i which is used as an antenna-index subscript.

2 In the original optical literature (Jones 1941; Mueller 1948), Jones ma-

trices are defined in the Stokes basis and an outer product of the Jones matri-

ces is called the Mueller Matrix. Radio interferometric measurements are an

outer product of electric fields measured at the two antennas in the feed po-

larization basis (circular or linear) which is the natural basis for calibration.

However since this is also specific to radio interferometric measurements,

we use the term Radio Mueller matrix. This Radio Mueller matrix is related

to the optical Mueller matrix via a unitary transform (see e.g., Hamaker et al.

1996).

Page 3

The Pointing Selfcal algorithm

Direction independent effects (DIE) (MDI

ij in Eq. 1) are due

to the telescope electronics and atmospheric/ionospheric ef-

fects at scales much large than the antenna FoV. Calibra-

tion of such gains is done using the self-calibration tech-

nique (Cornwell & Wilkinson 1981; Cornwell 1999). Ignor-

ing MDD

ij (s) and factoring MDI

ij into the antenna based DIE Ji,

a convenient specialization of Eq. 1 for DIE calibration can

be written as

VObs

= Ji ⊗ J∗

V◦

(2)

and Ji solved-for using χ2−minimization techniques. For

model visibilities VM

ij , the residual visibilities are

Rij = VObs

− (Ji ⊗ J∗

j )VM

(3)

and

χ2 = X

†

ij· Wij · Rij.

(4)

Wij is a diagonal matrix of weights proportional to the inverse

of the measurement variance. Equation 4 is a sum over all

baselines of the weighted L2-norm of the full-polarization

residual vector Rij. For clarity, we show below the expansion

of Eq. 4 in terms of the elements of Rij and Wij:

χ2 =X







rpp∗

rpq∗

rqp∗

rqq∗



wpp

wpq

0 wqp

wqq







rpp

rpq

rqp

rqq











|rpp|2 wpp + |rpq|2 wpq + |rqp|2 wqp + |rqq|2 wqq

(5)

where r and w are the elements of the vector Rij and the ma-

trix Wij respectively and subscripts p and q represent orthog-

onal polarization states (circular or linear).

The Jones matrix Ji in Eq. 3 are direction independent with

its elements consisting of single numbers (and not 2D func-

tions). The evaluation of MDI

ij in Eq. 1 therefore requires

computationally simpler outer-product operator (as against

the outer-convolution operator required for MDD

ij ; see Sec-

tion 2.2) and application of

Ji⊗ J∗

is a simple matrix mul-

tiplication. This observation leads to an efficient DI calibra-

tion algorithm. For the relevant case of parallel-hand only

calibration,

Ji =



 gp

0 gq





(6)

where the superscripts p and q represent the two orthogonal

polarization pairs. Rij can be written as

Rij = Xij − (Ji ⊗ J∗

j )1

(7)

where the symbol 1 is a 4×1 column-vector with all elements

equal to one and

Xij = hdiag(VM

ij )

i−1

VObs

ij .

The function diag(ã) returns a diagonal matrix with the vec-

tor ã as its diagonal. For a simple minimization algorithms

such as the Steepest Descent algorithm, equating the gradient

∂χ2

∂J∗

to zero for an Nant antenna-array leads to Nant simulta-

neous non-linear equations which are then solved using the

following iterative equation:

Jin⊗ 1 = (1 − γ)

Jin−1⊗ 1i +

γ [S oW]−1



X

j,j,i

Jjn−1⊗ 1

WijXij





(8)

where

S oW =

j,j,i

Jjn−1

⊗ J∗

j n−1

Wij

The symbol |n represents value at iteration n, 0 < γ < 1 is

the standard feedback loop-gain of non-linear minimization

algorithms and 1 is the 4 × 4 identity matrix. When VM

ij is a

wide-band prediction of the sky brightness distribution in the

FoV, Xij is a weak function of time and frequency and it may

be pre-averaged for the solution interval for improved signal-

to-noise ratio (SNR). Using previous solutions or unity as

an initial guess for Ji, convergence is typically achieved in a

few iterations (see Thompson & Daddario (1982); Bhatnagar

(1998)3 for a detailed derivation and an intuitive interpreta-

tion).

The primary focus of this paper is DD calibration – de-

scribed in the following sections – which is fundamentally

coupled to imaging. DD calibration is therefore better de-

scribed in a Mueller-matrix framework. To establish equiv-

alence between DI and DD calibration and to show that DD

calibration is a generalization of the DI calibration, we also

described the DI calibration above in a Mueller matrix for-

mulation. In practice, most software implementations of

Eq. 8 implement the matrix arithmetic by-hand for better

code optimization where the distinction between Jones- and

Mueller-matrix based formulations is not important. For DI-

only implementations which may benefit from directly using

the matrix arithmetic, it is more efficient to re-formulate the

DI algorithm using VObs

= JiV◦J∗

j where all matrices are 2×2

matrices, instead of Eq. 2. However it is important to point

out here that while this may offer computational advantages,

it does not lead to a fundamentally new algorithm.

Since VM in Eq. 3 is independent of J, it is treated as

a constant in the minimization algorithm. Calibration for

3 Also accessible from

http://www.aoc.nrao.edu/∼sbhatnag/GMRT Offline/antsol/antsol.html

Page 4

Bhatnagar and Cornwell

DIE gains and imaging to solve for I(s) are alternated iter-

atively. Thus calibration can be done keeping the model of

the sky fixed and imaging is done keeping the DI calibration

terms fixed. Iterating between imaging using calibrated data

ij = (Ji⊗ J∗

j )−1VObs

to make the model image IM and using

IM to solve for Jis forms a closed-loop solution to account

for DIE. We retain this overall structure for our DDE calibra-

tion approach and substitute an algorithm for estimating the

DDE’s while keeping the image fixed.

2.2. Direction dependent effects antenna-based calibration

Direction dependent effects (DDE) are represented by MDD

in Eq. 1. Mathematically the effect of these terms is indistin-

guishable from I(s) and the model data VM cannot be eval-

uated independent of MDD. Consequently, to compute the

residual vector (the difference between data and the model),

the integral in Eq. 1 must be evaluated for each measured

data point (i.e., for all i, j, frequency and polarization mea-

surements). With typical modern data sizes in the multi-Tera

bytes regime the computational and the data I/O costs be-

come very high even with only a few unresolved sources for

a direct evaluation of the integral, or evaluating it separately

for different directions in the FoV. These costs are prohibitive

for dense fields (as is typical at frequencies ≤ f ew GHz) and

for fields with a combination of compact and significant ex-

tended emission (as is typical for observations in the Galactic

Plane and for mosaic observations at any frequency).

Direction dependent effects that are fundamentally aper-

ture plane effects can often be compactly modeled with a few

parameters in the aperture plane. For example, the effect of

antenna pointing errors can be modelled by two parameters

per antenna for the entire FoV. Solving for these effects in

the image plane requires solving for the antenna gains to-

wards multiple sources. This has significant numerical and

computational disadvantages. On the other hand, solving for

these effects directly in the aperture plane has lower compu-

tational complexity and optimal utilization of the full SNR

based on the integrated flux in the FoV. The Pointing Self-

cal (PSC) algorithm described below is an example of such

an aperture-plane calibration algorithm for antenna pointing

errors.

The Aij in Eq. 1 can be factored into antenna-based Jones

matrices. The observed visibilities VObs

calibrated for MDI

can then be written as

VObs

= hAi~A∗

ji ? VM◦ ,

(9)

where VM◦ = F I(s) – the Fourier transform of the sky bright-

ness distribution on a grid. The symbol ’~’, introduced in

Bhatnagar et al. (2008), represents the outer-convolution op-

erator4. Ai is an antenna DD Jones matrix given by

Ai =



Ap

0 Aq





(10)

The elements along the diagonal of this matrix describe the

electric field distribution across the antenna aperture for the

two orthogonal polarizations.

The gradient of χ2 w.r.t. antenna-based parameters a

(which are, in general, complex-valued) for the more general

Eq. 9 is given by5

∂χ2

∂a∗

i n

= −2 X

j,j,i

†

ijn

· Wij ·

∇iVM

ij n

i ,

(11)

where

∇iAij =

∂Aij

∂a∗

∂AM

∂a∗

~AM∗

(12)

and

∇iVM

ij n = ∇iAijn

? VM◦ .

The symbol < represents the real part of its argument, which

evaluates to a scalar in the same way as Eq. 4. AM

– the

model for the true Ai – is parametrized by the parameter a

and Rijn = VObs

−

i ~AM∗

? VM◦

is the residual vector

computed at iteration n. The parameters are then updated

iteratively as

a|n = a|n−1 + f

∂χ2

∂a∗ n

(13)

where f is a function that depends on the details of the

non-linear minimization algorithm used. For minimization

algorithms that assume a diagonally-dominant Hessian ma-

trix (such as the Steepest descent algorithm), f(x) = γx.

More sophisticated minimization algorithms involve poten-

tially expensive evaluation of the Jacobi/covariance matrix

(see Press et al. 1992, or later editions for detailed discus-

sions).

3. THE POINTING SELFCAL ALGORITHM

Our algorithm works in the general way of self-calibration

algorithms, iterating between estimation of the sky bright-

ness holding the calibration parameters fixed, and estima-

tion of the calibration parameters holding the sky brightness

model fixed (Cornwell 1999). For the first part, we use the

4 The element-by-element algebra of the outer-convolution operator is

the same as that of the outer-product operator used in the DI description of

Hamaker et al. (1996), except that the complex multiplications are replaced

by convolutions.

5 Equations 11 and 12 are a general form of equations for antenna-based

calibration and include the DI case. As a test and for intuitive understanding,

replacing ? by dot product, Aij by Ji⊗ J∗

i , a∗

i by J∗

i and equating Eq. 11 to

zero recovers Eq. 8.

Page 5

The Pointing Selfcal algorithm

combination of the Wide-band A-Projection (Bhatnagar et al.

2013) and the Multi-term Multi-Frequency Synthesis algo-

rithms (Rau & Cornwell 2011). Hence in this section, we will

concentrate on the second part: estimating the calibration pa-

rameters (pointing errors) while holding the sky brightness

fixed.

In Eq. 11 the evaluation of the residual vector R, and in

this case also ∇Aij, can be expensive. Both Rij and ∇iAij

in Eq. 11 involves evaluation of Aij and

∂Ai

∂a∗

~A∗

at each

iteration of the minimization algorithm. For an array with

Nant antennas, the computational cost of these evaluations is

4 × 2N2

A log(NA) × 2N2

ant

, where NA is the size in pixels

of the quantized representation of the elements of A. This

cost can be prohibitive for modern telescopes with Nant and

NA in the range of few×102−3 antenna elements and 105−7

pixels respectively. Thus treating the PB as being completely

unknown is not viable computationally nor can we expect it

to be well-conditioned. Hence a form for the PB with less

degrees of freedom is necessary.

The mechanical antenna pointing errors can be represented

with fewer degrees of freedom; as a phase gradient across

the antenna aperture illumination pattern. Purely mechani-

cal fractional pointing errors are also the same for both po-

larization and at all frequencies. Aperture-plane solvers for

the pointing errors can therefore easily benefit from the in-

stantaneous continuum sensitivity of modern wide-band re-

ceivers. For simplicity and to facilitate analysis, we model

the main-lobe of the EFP for antenna i as a gaussian of the

form e−(l−li)2α2

where l is a direction on the sky with respect

to the pointing direction, li is the pointing error and α is 2−1/2

times the inverse of the standard deviation. It can be replaced

with a more accurate function for the antenna PB without the

pointing errors in the final results. Aij (the Fourier transform

of the PB) can then be expressed as

Aij = A◦



e−

(li−l j)2α2



 eιπu(li+lj)

(14)

and Eq. 12 for a∗

i = li becomes

∇iAij =

∂Aij

∂li

= Aij

lj − li α2 + ιπui

(15)

where u is the Fourier-conjugate variable for l (in units of

wavelengths and radians respectively). The equations above

are written for one dimension only for clarity and can be triv-

ially extended for the other dimension and for the heteroge-

neous array case. li and lj are the antenna pointing errors

for antennas i and j. A◦

ij is the pre-computed version of A

and includes all PB effects known a priori – but not the me-

chanical antenna pointing errors – like polarization squint,

off-axis polarization effects, effects of antenna blockages and

feeds, rotation with PA, dependence on frequency, etc. Us-

ing A-Projection to apply Aij and ∇iAij, both Rij and ∂χ2/∂a∗

can be computed at full continuum sensitivity by integration

across time, frequency and polarization without loss of ac-

curacy. The term inside the square brackets in Eq. 14 is the

reduction in the amplitude because of the decorrelation of

the measured visibilities due to the pointing errors at each

antenna of the baseline. It is of order unity for small values

of (li − lj) – the difference in the pointing errors at the two

antennas – and may be ignored (for a typical maximum value

for (li −lj) of order 1−2% of the width of the antenna PB, this

term constitutes an error of less than 0.1% in the amplitude).

Algorithm 1 The Pointing Selfcal algorithm: estimation of

pointing errors

1: Pre-compute VM◦ = F

IM(s)

2: Pre-compute A◦

ij(t,ν) for all required i, j, t and ν

3: for all data do

for all t and ν in the interval (τsol,∆νsol) do

for Iteration n do

Chi[] = 0.0; dChi[] = 0.0;

for all i do

for all j, j , i do

Compute Aij(t,ν) and ∇iAij(t,ν)

(Eqs. 14, 15)

10:

Use A-Projection algorithm to compute:

ij = Aij ? VM◦

∇iVM

ij = ∇iAij ? VM◦

11:

Compute Rij = VObs

(t,ν) − VM

ij (t,ν)

12:

Accumulate

Chi[j]=Chi[j]+R†

ijWijRij

dChi[j]=dChi[j]+<

R†

ijWij

∇iVM

13:

if Terminate(Chi[],dChi[],n) then

14:

break

15:

Update li|n = li|n−1 + f (dChi[j])

16:

Save all li for solution intervals (τsol, ∆νsol)

To predict the model data at each iteration, modified A as

in Eq. 14 is used with the A-Projection algorithm to compute

the model data, including the effects of the antenna pointing

errors and subtracted from VObs

to compute Rij. Similarly,

modified A as in Eq. 15 is used to compute ∇iAijn

? VM◦

See Algorithm 1 for the various computational steps and the

nesting of the loops involved. The functions Terminate() and

f() in steps 13 and 15 depend on the details of the minimiza-

tion algorithm.

4. RESULTS

4.1. Simulations

To verify the numerical correctness and performance of

the PSC algorithm, we simulated data for the Karl G. Jan-

sky Very Large Array (VLA) (Perley et al. 2011) which in-

cluded the effects of the time-varying pointing errors at each

antenna. The simulation was for an L-Band observation and

used a sky model derived from the NVSS source list. The

Page 6

Bhatnagar and Cornwell

total integrated flux in the FoV, including the first sidelobe of

the PB, was 1.5 Jy distributed across the beam. For numeri-

cal accuracy, the data for the sky model was predicted using

direct Fourier transform and a model for the antenna PB with

pointing offsets. The pointing offsets for the antennas were

uniformly distributed between ±2000, which corresponds to

∼ ±1% of the beam-width at 1.5 GHz. Independent pointing

offset errors of ±500 were added for each antenna as a function

of time to simulate short term time-varying offsets. Finally,

random noise corresponding to a continuum sensitivity limit

of 1 µJy/beam in eight hours of observing with an integra-

tion time of 10 sec per sample was added to the visibilities to

simulate thermal noise.

Figure 1 shows the result of application of the PSC algo-

rithm to this simulated data. The continuous curves show the

antenna pointing errors for 4 of the 27 antennas (for clarity)

as a function of time. The over-plotted filled circles are the

PSC solutions with a solution interval of 30 sec. The resid-

uals per baselines for solutions with 10 sec solution interval

were consistent with the thermal noise limit in the simula-

tion (see Bhatnagar et al. 2004), verifying the basic numeri-

cal correctness of the algorithm.

Figure 1. The solutions from the Pointing Selfcal algorithm us-

ing simulated data to verify the algorithm. The continuous lines

show the pointing offsets as a function of time for four representa-

tive antennas while the filled-circles show the pointing offset solu-

tions with 30 sec. solution interval.

4.2. Application to VLA data

To test the algorithm with real data, we used a wide-band

observation of the IC2233 field with the VLA at L-Band.

This data was imaged using about ∼600 MHz of bandwidth

to generate the continuum sky model used as an input for the

PSC algorithm. The VLA antenna optics creates an offset

between the parallel-hand PBs (polarization squint) of ∼6%

of the beam at any frequency, corresponding to ∼ 11000 at

1.5 GHz. To gain confidence in any antenna pointing offset

that the solutions may show, we set up the PSC algorithm

to solve for the offsets independently for both the polariza-

tions (R and L). Difference in R- and L-solutions will then be

a measure of the polarization squint and the mean value of

these solutions ([Ro f f set +Lo f f set]/2) will be a measure of the

antenna mechanical pointing error (which should be the same

for both polarizations). In the antenna Az-El plane, the sep-

aration between the R and L solutions will then correspond

to the length of the polarization squint vector, while a vector

from the center of the squint vector to the origin of the Az-

El plane would correspond to the actual antenna mechanical

pointing error vector.

The results on the Az-El plane for two of the antennas in

the array are shown in Fig. 2. The vector labeled “Squint

Vector” shows the separation between the median values of

the two offsets and is equal to ∼12000 and 10500 for the two

antennas. The vector labeled “Pointing Error Vector” from

the center of the Squint Vector to the origin is a measure of

the antenna mechanical pointing offsets, which has a magni-

tude of 3.50and 0.50 for these two antennas. These are large

systematic pointing offsets, which were subsequently veri-

fied independently and corrected in the telescope software.

As expected, after corrections, particularly for the first an-

tenna, an improved antenna sensitivity was measured giving

us a verification of the solutions and the sign convention used

in the software.

The Az- and El-offsets of the RR and LL beams as a func-

tion of time from the nominal antenna pointing direction

for a set of representative antennas with solution-intervals of

5 min and 600 MHz in time and frequency respectively are

shown in Fig. 3. The PB model was derived using a geo-

metric optics simulator for the antenna illumination patterns

which includes the effects of aperture blockage, off-axis feed

locations, and illumination taper (Brisken 2003). The rota-

tion with PA and scaling of the PB with frequency was in-

cluded in the model using the A-Projection algorithm. The

sky brightness for this observation is dominated by two com-

pact sources separated by ∼2500 with other weaker sources

spread across the FoV. The scatter in the pointing solutions is

due to three factors:

1. The actual sky brightness is dominated by two strong

sources and so the constraints on the perpendicular di-

rections are weak.

2. Imperfections in the sky model

3. The limitations of the PB model used, particularly as a

function of frequency.

Page 7

The Pointing Selfcal algorithm

Figure 2. The results from the application of the Pointing SelfCal algorithm to the VLA data. Pointing error solutions along the elevation and

azimuth axis for the R- and L-beams are shown with red and blue points. The average separation between these set of points corresponds to the

polarization squint of the VLA antennas due to the off-axis feed location. The separation between the center of the squint vector and the origin

corresponds to the antenna mechanical pointing error.

−70

−60

−50

−40

−30

−20

−10

Ant. 1

−90−80−70−60−50−40−30−20−100

Ant. 2

−70

−60

−50

−40

−30

−20

−10

Ant. 3

−30

−20

−10

Ant. 7

Time (min)

−20−10010203040506070

Ant. 18

Az offset (arcsec)

−50

100

150

200

250

Ant. 1

−50

100

150

200

250

300

Ant. 2

−60−40−20020406080100120

Ant. 3

−100

−80

−60

−40

−20

Ant. 7

Time (min)

−140

−120

−100

−80

−60

−40

−20

Ant. 18

El offset (arcsec)

Figure 3. The pointing offsets in the antenna azimuth (left) and elevation (right) axises as a function of time for several antennas. The two

curve in each panel correspond to the offsets for the R- and L-polarizations. The separation between the two curves in each panel correspond to

the azimuth and elevation components of the polarization squint for each antenna.

Analysis of independent holographic measurements (Per-

ley 2016) show that in addition to systematic deviations from

the expected value there are oscillations in the magnitude of

the squint vector as a function of frequency due to standing

Page 8

Bhatnagar and Cornwell

waves in the antenna optics (Jagannathan et al. 2017), which

were not included in the PB model. Some variations in the

pointing offset solutions, including in the length of the squint

vector could therefore be also real. The derived offsets as a

function of time along the azimuth axis for a few antennas in

the array are shown in Fig. 4. The solutions show significant

differences in the antenna pointing errors between the anten-

nas, as well as slow drifts over longer timescales (∼ 30 min)

– both of which are expected based on the regular pointing

model measurements and anecdotal evidence from imaging

results.

Figure 4. The solved antenna pointing offsets in the azimuth axis

with time for a few representative antennas of the VLA.

4.3. Noise budget

In general, the stability and the error on the solved param-

eters depend directly on the SNR in the measured data (the

RHS of Eq. 9) and on the number of free parameters. It is

therefore important to devise algorithms that maximize data-

SNR using as few free parameters as possible.

The equivalent thermal noise that contribute to the variance

in the solution is reduced by a factor equal to the square root

of the number of statistically independent samples averaged.

The residuals in the PSC solver are averaged at each iteration

across all baselines with a given antenna and for the duration

of the solution intervals in time and frequency. Assuming

that the pointing offsets at each antenna are statistically in-

dependent and random in nature, the noise contribution in

the solver due to the thermal noise in the data is reduced by

a factor equal to

√

∆νsolτsol (Nant − 1) for solution intervals

of ∆νsol and τsol in frequency and time. On the other hand,

the signal for the solver is the differential apparent-flux with

respect to the pointing offsets in the FoV. Comparing this sig-

nal with the effective noise allows an estimate for the mag-

nitude and the distribution of the sky brightness required for

−2

−1

Angular offset (σ1GHz)

0.0

0.2

0.4

0.6

0.8

1.0

1.2

1.4

plitude

E @ 1 GHz

|(∂E/∂l)E| @ 1 GHz

|(∂E/∂l)E| @ 2 GHz

Figure 5. The curves show slices across the E(ν)∂E(ν)/∂l function

at the edges of a 1-GHz wide band (in green and red). The main-

lobe of E at 1 GHz is approximated as a gaussian of width σ1GHz (in

blue).

deriving pointing offset solutions. For a homogeneous array

case with identical antennas, the SNR per parameter for an

antenna-based pointing offsets solver is

S NRPS C =

|∇S|

SEFD

p∆νsolτsol(Nant − 1)

(16)

where

SEFD =

2kbTsys

(ηAπR2)

× 1026 (Jy)

kb is the Boltzmann’s constant. R, ηA and Tsys are the antenna

aperture radius (in meters), efficiency and system tempera-

ture (in Kelvin) respectively. ∇S is the differential apparent

integrated flux in the FoV with respect to the antenna point-

ing errors is given by:

∇S =

Z ∂E

∂l

⊗ E∗

IM(s) ds (Jy)

(17)

where E is the antenna far-field EFP6 and IM is a model for

the sky brightness distribution as a function of the direction

s. For the VLA at L-band, Nant = 27 and SEFD ≈ 358 Jy

(Tsys = 35 K, ηA = 0.55 and R = 12.5 m). The total apparent

flux in the VLA FoV at L-band above the thermal noise limit

for a usable bandwidth of 800 MHz and 1 min of integration

in time is estimated to be few×100 mJy. In the absence of any

other source of noise, solutions for antenna pointing offsets

should be possible at reasonably high accuracy on very short

timescales. Some numerical experiments suggest that reli-

able solutions are possible at several minutes timescale due

6 Recall that E is related to Eq. 1 via MDD

= Ei⊗ E∗

i and the Fourier

transform of E is the DD equivalent of J in Eq. 2

Page 9

The Pointing Selfcal algorithm

to addition numerical noise, e.g. due to gridding, rotation of

AM with PA, inaccuracies in IM and AM, etc.

Analysis of Eq. 17 offers a few useful thumb-rules. The

function

∂E

∂l

⊗ E∗ is a double-hump shaped curve with a null

at the location of the peak of E (see Fig. 5). Therefore,

while the sky brightness at the center of the PB does not

contribute signal for the solver, the contribution of the flux

around the center increases with the magnitude of the point-

ing error. This function peaks around the half-power-point

of the PB. The flux in this region of the PB therefore con-

tributes the maximum signal. The PSC algorithm therefore

works well for fields where the sky brightness distribution

is spread across the PB. This is almost always the case at

low frequencies (/few GHz) and for mosaic imaging which

is typically the observing mode at high frequencies. Note

that the antenna pointing errors also adversely affect imaging

performance only for fields with significant flux away from

the pointing center. The area under this curve is indepen-

dent of frequency (changes in ∂E/∂l and E compensates for

each other as a function of frequency). Since the mechanical

pointing errors are independent of frequency and on an av-

erage the low-frequency radio flux varies as ν−0.7, the signal

for the pointing solver will increase with bandwidth, partic-

ularly at lower radio frequencies. More precise estimates for

the SNR and solution timescales will require simulations us-

ing models for E, the sky brightness and its spectral index

distribution.

5. DISCUSSIONS

5.1. Run-time performance analysis

The run-time cost of PSC has two components: compu-

tation of Aij and ∇iAij and their application to VObs

via A-

Projection. Using Eqs. 14 and 15 with a precomputed A◦

ij,

the cost of evaluating Aij and ∇iAij at each iteration becomes

relatively insignificant and the total run-time cost is domi-

nated by the cost of their application to VObs

ij . For a support

size of Nsup for Aij, this cost scales as NvisN2

sup, where Nvis is

the total number of data samples within the solution interval.

Since the computations for each data sample is independent,

this cost reduces linearly with parallelization by partitioning

Nvis across multiple computing cores. As a test case using a

single CPU running at 1.2 GHz clock, each iteration of the

PSC took about 1 sec per frequency channel for the VLA.

Using all the 16 computing cores available, the run-time was

reduced to ∼ 70 millisec per channel, with the total run-time

for convergence of ∼ 2 min per solution.

Multi-threaded gridders (e.g., Golap (2015)) which can

benefit from a much larger number of compute-cores avail-

able on massively parallel hardware like the modern GPUs

may help in reducing the run-time cost by a large factor.

For arrays that are non-coplanar for long-integration obser-

vations, the run-time cost of PSC increases with the W-term

(Cornwell et al. 2008). However, for short solution interval

the run-time cost may be largely independent of the W-term

– by using the fact that arrays are instantaneously co-planar

(Cornwell et al. 2012). However, more work is needed in

these areas to determine the optimal computing architecture

for PSC.

5.2. Use in real-time-calibration

Some modern-era radio telescopes assume that a Local Sky

Model is available and use it to determine calibration param-

eters in real time without resorting to iteration over the model

(see e.g. Tasse et al. (2012)). With some additional cost in

computing, our algorithm for estimation of the pointing er-

rors could be used to track any antenna pointing errors in real

time and possibly correct the pointing errors in real-time.

5.3. Solving for PB shape

The PSC algorithm described above solves for the tip-tilt

of the antenna by solving for a phase gradient across the an-

tenna aperture. This does not however alter the Hermitian

nature of an ideal aperture illumination pattern and therefore

does not alter the predicted shape of the antenna PB. In prac-

tice, change in the shape of the antenna PB arises due to a

variety of reasons including de-focus and astigmatism in the

antenna optics, distortion of the main reflector with eleva-

tion, misplaced or misaligned elements in the antenna optics,

and projection effects in aperture-array elements. Many of

these terms result in a non-Hermitian aperture illumination

pattern which, in addition to shape distortions also leads to

complex-value PB. While pointing errors constitute the dom-

inant direction-dependent error, errors due to the shape of the

PB are also significant for the sensitivity offered by all mod-

ern radio telescopes. Its calibration is therefore also neces-

sary.

The low-order A-Solver approach (Jagannathan et al.

2017) offers a method for a closed-loop Shape Selfcal (SSC)

similar to the PSC algorithm. The A-Solver uses a physi-

cally motivated parametrized model of the antenna structure

in a geometric optics (GO) simulator (Brisken 2003) for the

antenna illumination pattern (AIP). Starting with reasonable

values for these physical parameters, the A-Solver solves

for these parameters by minimizing the difference between

the predicted and holographically measured AIP. Since the

model AIP needs to be predicted in the optimization iter-

ations, it is important to use a simulator with a reasonably

short run-time. Full-EM simulators are typically quite expen-

sive (both, in capital and run-time costs). GO simulators on

the other hand are less computationally complex and when

used in the A-Solver, capture the dominant electromagnetic

effects in the resulting optimized model. Jagannathan et al.

(2017) show that this approach captures otherwise difficult

to model effects like the effect of Standing Waves in the an-

tenna optics and other higher order phase terms across the

Page 10

Bhatnagar and Cornwell

antenna aperture. These higher order phase terms severely

affect the off-axis polarization leakage patterns and the A-

Solver approach offers an effective method that can enable

noise-limited full-Stokes imaging (not just Stoke-I imaging).

The error analysis in Sec.4.3 above suggests that the use of

a sky-model instead of the holographic measurements may

be possible (from an SNR point of view) in the A-Solver

for a closed-loop SSC algorithm. Note however that for

many antenna arrays, shape changes are smooth and grad-

ual and it may be sufficient to first derive the AIP model

using holographic measurements and include the tempo-

ral evolution of the derived parameters separately. For the

aperture-array elements, where these evolutions are more

severe and faster, a closed-loop SSC may be required, and

possible, using a model for the sky-brightness distribution

(e.g. a pre-determined Global Sky Model).

6. CONCLUSIONS

In this paper we present the mathematical framework for

calibration of direction-dependent effects (DDE) not known

a priori. We also include a brief overview of the direction-

independent (DI) calibration and the mathematical formula-

tion of the existing DI SelfCal algorithm.

As an example of a DD calibration algorithm, we present

the Pointing Selfcal (PSC) algorithm. As in DI SelfCal, given

a model of the sky brightness distribution, the pointing off-

set vector per antenna is solved by iteratively minimizing

the residual vector with respect to the antenna-based point

offsets. We verified the PSC solver, first by applying it to

simulated wide-band data for the VLA at L-band and show

that the pointing offset vector is recovered correctly. We then

also apply the PSC algorithm to on-sky data from the VLA

at L-band. The VLA antenna optics has a polarization squint

which results in an angular separation between the right- and

left-circular polarization beams. To verify the PSC algo-

rithm, the solver was set-up to solve for the pointing offset

vector separately for the two polarizations. In the antenna

Az-El plane, the difference between the pointing offset vec-

tors for the two polarizations is a measure of the squint vec-

tor and the separation of the center of squint vector from the

origin gives a measure of the mechanical antenna pointing

offset. We verified that the PSC solver indeed recovers the

average squint vector, though antenna-to-antenna variations

were also large and significant. Some of the antennas had

large mechanical pointing offsets, which were subsequently

verified via independent measurement of the expected im-

provement in the antenna gain after correcting for them in

the telescope software.

We also discuss the noise budget for the PSC algorithm.

Analysis of the signal-to-noise ratio (SNR) available for the

PSC solver as a function of the wide-band sky brightness dis-

tribution and telescope parameters leads to the following con-

clusions:

1. The PSC algorithm is optimal in utilizing the SNR due

to the sky brightness distribution in the entire antenna

field of view, rather than, for example, a few bright

sources.

2. Antenna pointing offsets can be solved-for at high sig-

nificance with the instantaneous sensitivity of most

modern radio interferometric telescopes with wide-

band receivers and typical sky brightness distributions.

3. While the sky brightness at the center of the PB does

not contribute signal for the PSC solver, the contribu-

tion from around the center increases with the mag-

nitude of the pointing offsets. This signal also peaks

around the half-power points of the PB. The PSC al-

gorithm therefore works well for observations where

the sky brightness is distributed across the FoV. The

degradation in the imaging performance due to the an-

tenna pointing errors is also more significant for such

observations. Emission spread across the FoV is typi-

cal at frequencies below a few GHz and at much higher

frequencies where mosaic imaging of emission much

larger than the antenna PB is often necessary.

4. We expect PSC to scale well in a parallel comput-

ing environment. Simple parallelization by data-

partitioning is possible and efficient. Reduction in

the run time by large factors using multi-threaded re-

samplers (Golap 2015) deployed on massively parallel

hardware remains a possibility, though more work is

needed in this area to arrive at an optimal computing

architecture.

5. The PSC is typically set-up for relatively short solu-

tion interval in time. For low-frequency observations,

the increase in the run time due to the w-term may be

mitigated by treating the array as co-planar for each

solution interval (Cornwell et al. 2012).

The mathematical framework for DD calibration presented

here can be extended for a Shape SelfCal (SSC) algorithm

to account for the change in the shape of the aperture us-

ing the low-order A-Solver approach. In the A-Solver ap-

proach, the parameters describing the physical structure of

the antenna are determined using a geometric optics predictor

for the antenna aperture illumination pattern (AIP) and holo-

graphic measurements of the AIP (Jagannathan et al. 2017).

Based on our estimate of the SNR typically available, it may

be possible to develop an SSC algorithm using a model of

the sky brightness distribution. More work is required, and

in progress, in this area.

Page 11

The Pointing Selfcal algorithm

Finally, our algorithm could be used in the real-time cali-

bration systems of modern-era radio telescopes where a Lo-

cal Sky Model is assumed to be available. This will allow

tracking of any antenna pointing errors in real time and pos-

sibly real-time corrections.

This work was done using the R&D branch of the CASA

code base. We wish to thank the referee, Daniel Mitchell,

for very useful comments and pointing out some corrections

in the equations and their interpretation. The National Radio

Astronomy Observatory is a facility of the National Science

Foundation operated under cooperative agreement by Asso-

ciated Universities, Inc.

Software: CASA7, Python, MatPlotLib

REFERENCES

Bhatnagar, S. 1998, Computation Of Antenna Dependent Complex

Gains, Tech. Rep. No. R00l72, National Centre for Radio

Astrophysics, Pune. http://www.aoc.nrao.edu/

˜sbhatnag/GMRT_Offline/antsol/antsol.html

Bhatnagar, S., Cornwell, T. J., & Golap, K. 2004, Solving for the

antenna based pointing errors, Tech. rep., EVLA Memo 84.

http://www.aoc.nrao.edu/vla/geninfo/memoseries/

evlamemo84.ps

Bhatnagar, S., Cornwell, T. J., Golap, K., & Uson, J. M. 2008,

A&A, 487, 419. http:

//adsabs.harvard.edu/abs/2008A%26A...487..419B

Bhatnagar, S., Rau, U., & Golap, K. 2013, The Astrophysical

Journal, 770, 91.

http://stacks.iop.org/0004-637X/770/i=2/a=91

Brisken, W. 2003, Using Grasp8 To Study The VLA Beam, Tech.

rep., EVLA Memo 58. "http://www.aoc.nrao.edu/vla/

geninfo/memoseries/evlamemo58.ps"

Cornwell, T., Voronkov, M., & Humphreys, B. 2012, Proc. SPIE

8500, Image Reconstruction from Incomplete Data VII: Wide

field imaging for the Square Kilometre Array

Cornwell, T. J. 1999, in ASP Conf. Ser. 180: Synthesis Imaging in

Radio Astronomy II, ed. G. B. Taylor, C. L. Carilli, & R. A.

Perley. http://adsabs.harvard.edu/cgi-bin/nph-bib_

query?bibcode=1999sira.conf.....T&db_key=AST

Cornwell, T. J., Golap, K., & Bhatnagar, S. 2008, IEEE Journal of

Selected Topics in Signal Processing, 2, 647

Cornwell, T. J., & Wilkinson, P. N. 1981, MNRAS, 196, 1067.

http://adsabs.harvard.edu/cgi-bin/nph-bib_query?

bibcode=1981MNRAS.196.1067C&db_key=AST

Golap, K. 2015, Multi-threading gridders without copying and

locks, Tech. rep., EVLA Memo 191. https://library.

nrao.edu/public/memos/evla/EVLAM_191.pdf

Hamaker, J. P., Bregman, J. D., & Sault, R. J. 1996, A&AS, 117,

137. http://adsabs.harvard.edu/cgi-bin/nph-bib_

query?bibcode=1996A%26AS..117..137H&db_key=AST

Jagannathan, P., Bhatnagar, S., Brisken, W., & Taylor, A. R. 2017,

The Astrophysical Journal (submitted)

Jones, R. C. 1941, J. Opt. Soc. America, 31, 488

Mueller, H. 1948, J. Opt. Soc. America, 38, 661

Perley, R. 2016, Jansky Very Large Array Primary Beam

Characteristics, Tech. rep., EVLA Memo 195. "https://

library.nrao.edu/public/memos/evla/EVLAM_195.pdf"

Perley, R. A., Chandler, C. J., Butler, B. J., & Wrobel, J. M. 2011,

The Astrophysical Journal Letters, 739, L1.

http://stacks.iop.org/2041-8205/739/i=1/a=L1

Press, W., Teukolsky, S., Vetterling, W., & Flannery, B. 1992,

Numerical Recipes in C (Cambridge: Cambridge University

Press)

Rau, U., & Cornwell, T. J. 2011, A&A, 532, 17.

http://stacks.iop.org/0004-637X/770/i=2/a=91

Tasse, C., van Diepen, G., van der Tol, S., et al. 2012, Comptes

Rendus Physique, 13, 28

Thompson, A. R., & Daddario, L. R. 1982, Radio Science, 17, 357.

http://adsabs.harvard.edu/cgi-bin/nph-bib_query?

bibcode=1982RaSc...17..357T&db_key=AST

Thompson, A. R., Moran, J. M., & Swenson, G. W., J. 2017,

Interferometry and Synthesis in Radio Astronomy (John Wiley

& Sons, Inc.)

7 https://casa.nrao.edu