Smithsonian Astrophysical Observatory
Stelircam Pipeline - SAO Telescope Data Center

Stelircam Pipeline - details

Last update: 2007-03-22 by Bill Wyatt

Before reading this document, you should be familiar with the overview, found at http://www.cfa.harvard.edu/ircam/pipedoc.shtml.


Contents

runpipe - Top routine for initialization
pipe.sh - Master processing routine
    Locating date-relative parameters
lincor - Image Linearization
update_headers - Add BPM key to headers
trimfiles - Image edge removal
mkfiledb - Create database of image headers
    Also, how to modify for other instruments
darkcreate - Create dark correction files
amdarkcheck - Compare afternoon and morning dark exposures
darksub - Subtract dark correction
flatcreate - Create flatfield files
flatdiv - Divide files by flatfield
skysubtract - Create sky-subtracted files

runpipe - Top routine for initialization

The runpipe script serves to initialize the environment for running the rest of the scripts in the stelircam pipeline. The initialization consists of:

The argument list is then passed to pipe.sh using an eval command, so any arguments containing whitespace or shell metacharacters must be quoted or escaped. See the pipe.sh details, below. One useful convention I have adopted is that the -x option prints a usage message showing the current value of each option, whether default or set on the command line, and then exits immediately.

Examples:


pipe.sh - Master processing routine

The pipe.sh script performs appropriate setup on the data in the specified directory (usually the current directory) and controls the scripts that do the actual processing, passing appropriate arguments to them.

The setup consists of moving any raw files, including compressed ones, into the raw subdirectory. Files with the suffixes .r .b .r.gz .b.gz .r.fits .r.fits.gz .b.fits .b.fits.gz, and also simply .gz, are recognized. Compressed files are uncompressed as needed.
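
The suffix test can be sketched in a few lines of Python (the pipeline itself is written in shell). This is only an illustration: the function names are hypothetical, the compressed blue-channel FITS suffix is assumed to be .b.fits.gz, and "simply .gz" is read as "any file ending in .gz".

```python
# Suffixes that mark a file as raw data (assumed reading of the list above).
RAW_SUFFIXES = (
    ".r", ".b", ".r.gz", ".b.gz",
    ".r.fits", ".r.fits.gz", ".b.fits", ".b.fits.gz",
    ".gz",
)


def is_raw_file(name):
    """Return True if the file name ends with one of the raw-data suffixes."""
    return name.endswith(RAW_SUFFIXES)


def uncompressed_name(name):
    """Return the name the file would have after gzip decompression."""
    return name[:-3] if name.endswith(".gz") else name
```

With these definitions, a name such as obj.0042.r.fits.gz is recognized as raw and would be stored as obj.0042.r.fits after decompression.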

Locating date-relative parameters

The stelircam pipeline is designed to operate on data taken at any epoch. Because of equipment changes, especially the blue channel array replacement in the summer of 2001, some parameters and calibration files for earlier data are not appropriate for the later data. Also, of course, the parameters and calibrations for one channel are unlikely to be correct for the other channel.

The user can completely specify his or her own set of parameters, by supplying either a complete path/value for each one or supplying a directory that contains them. For example, if the runpipe arguments are:

-L '"-c 0 -1 /home/foo/path/c0_001.fits -2 10.0 -k /home/coeffs"'
(note the two sets of quotes around the option string), then lincor takes the channel 0 coefficient 1 from /home/foo/path/c0_001.fits, the channel 0 coefficient 2 as 10.0 explicitly, and searches for all other parameters, including those for channel 1, under /home/coeffs. I anticipate, however, that the user will normally let the system provide the nominal values as appropriate.
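
The need for two sets of quotes follows from the eval step in runpipe: the argument list is parsed once by the user's shell and a second time by eval. The following Python sketch models that double parsing with the standard shlex module. It assumes that eval simply joins the words with spaces and re-parses them once; the helper names are hypothetical and this is not the pipeline's own code.

```python
import shlex


def shell_words(command_line):
    """Split a command line into words roughly as a POSIX shell would."""
    return shlex.split(command_line)


def after_eval(words):
    """Model eval: join the words with spaces, then parse the result again."""
    return shlex.split(" ".join(words))


# Outer single quotes protect the inner double quotes from the first parse.
double_quoted = shell_words(
    "-L '\"-c 0 -1 /home/foo/path/c0_001.fits -2 10.0 -k /home/coeffs\"'"
)
# With only one set of quotes, the option string falls apart at the eval.
single_quoted = shell_words("-L '-c 0 -1 /home/foo/path/c0_001.fits'")

print(after_eval(double_quoted))
print(after_eval(single_quoted))
```

In the first case the lincor options survive as a single word after -L; in the second they are split into separate words, so -c and the rest would be read as options to pipe.sh itself.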

To accommodate this variety of possibilities, the stelircam pipeline will search under a master calibration directory (default /home/ircam/Lincor): first, for files that match the parameter name and second, for subdirectories whose names consist of dates in the form YYYYMMDD that are equal to or earlier than the date of the data set being processed.

In these ``dated'' directories, there are files or coefficients with a particular syntax related to the channel and parameter name. When one of these is located, it is returned to the calling program. If the parameter is not located, the next earlier dated directory is searched.

For example, suppose the following directory and file hierarchy exists:

/home/ircam/Lincor/
    c0_imexpr.txt
    19971201/
        c0_bpm.pl
        c1_bpm.pl
        c1_imexpr.txt
    20010531/
        c1_bpm.pl
        c0_imexpr.txt

Thus, if the data to be processed is from 2001-09-30, the channel 0 bad pixel mask file (c0_bpm) is found in /home/ircam/Lincor/19971201/c0_bpm.pl, while the channel 1 bad pixel mask is found in /home/ircam/Lincor/20010531/c1_bpm.pl. However, the channel 0 linearization algorithm (c0_imexpr) will always be located first as /home/ircam/Lincor/c0_imexpr.txt; the version under 20010531 is ignored.

The file names within the dated directories have one of two forms:

cn_Param.Ext    or    cn_Param=Value
where cn is the letter ``c'' (lower case) plus the channel number (0 or 1), the Param is the parameter name, and the .Ext is ignored for location purposes but is passed back as part of the file name. In the second version above, if an equals sign is embedded in the file name, the string after it is stripped off and passed back instead as a value result.
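
The search order and file-name forms described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the pipeline's shell implementation: the function names are hypothetical, the observation date is passed as a YYYYMMDD string, and make_example_tree exists only to recreate the example hierarchy for experimentation.

```python
import os
import re
import tempfile


def _match_in(directory, name):
    """Look for name, name.Ext or name=Value among one directory's entries."""
    for entry in sorted(os.listdir(directory)):
        if entry == name or entry.startswith(name + "."):
            return ("file", os.path.join(directory, entry))
        if entry.startswith(name + "="):
            # Text after the equals sign is returned as a value, not a path.
            return ("value", entry.split("=", 1)[1])
    return None


def find_param(root, name, obs_date):
    """Search root first, then dated subdirectories from newest to oldest."""
    hit = _match_in(root, name)
    if hit is not None:
        return hit
    dated = [
        d for d in os.listdir(root)
        if re.fullmatch(r"\d{8}", d)
        and d <= obs_date  # YYYYMMDD strings compare correctly as text
        and os.path.isdir(os.path.join(root, d))
    ]
    for d in sorted(dated, reverse=True):
        hit = _match_in(os.path.join(root, d), name)
        if hit is not None:
            return hit
    return None


def make_example_tree(root):
    """Recreate the example hierarchy as empty files (demonstration only)."""
    layout = {
        "": ["c0_imexpr.txt"],
        "19971201": ["c0_bpm.pl", "c1_bpm.pl", "c1_imexpr.txt"],
        "20010531": ["c1_bpm.pl", "c0_imexpr.txt"],
    }
    for sub, files in layout.items():
        os.makedirs(os.path.join(root, sub), exist_ok=True)
        for f in files:
            open(os.path.join(root, sub, f), "w").close()
```

After calling make_example_tree on a directory created with tempfile.mkdtemp(), find_param(root, "c0_bpm", "20010930") returns the file under 19971201, while find_param(root, "c1_bpm", "20010930") returns the one under 20010531.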

The current set of ``dated'' parameters is:

cn_bpm Bad pixel mask for channel n, for example, c0_bpm.pl
cn_trimsec Trimming image section for channel n, for example, c0_trimsec=[6:251,6:251]
cn_statsec Statistics image section for channel n, for example, c1_statsec=[10:230,75:230]
cn_imexpr Linearization formula for channel n, for example, c0_imexpr.txt, passed to the IRAF routine imexpr. See the lincor details for more information on how this is used.
cn_satlevel Saturation clipping level for linearization and flat creation for channel n. For example:   c1_satlevel=16000
cn_000 Linearization coefficient 0 for channel n, used in lincor. For example, c0_000.fits would be a fits file of coefficients to use at each pixel location.
cn_001 Linearization coefficient 1 for channel n, used in lincor. For example c0_001=2.305e-4 would be a coefficient to use on all pixels in the linearization.
cn_002 Linearization coefficient 2 for channel n, used in lincor.
cn_003 Linearization coefficient 3 for channel n, used in lincor.
cn_004 Linearization coefficient 4 for channel n, used in lincor.
cn_005 Linearization coefficient 5 for channel n, used in lincor.

Options to pipe.sh

Where an option has a default, it is shown immediately after the option letter; a name in <...> marks an argument with no fixed default:
-c <chan> Select channel (0, 1, ...)
-b <bpm[chan]> Pixel mask file path for selected channel chan
-s <statsect[chan]> Statistics image section for channel chan
-t <trimsection[chan]> Image section to trim for channel chan
-i ./ The starting data directory
-j 1 (one) The number of processes to run per CPU. If zero, only one process will be run at a time, despite the presence of multiple CPUs.
-k /home/ircam/Lincor Parameter directory (``dated'' directories)
-l pipe.log Highest-level output results file
-m 20000 Pixel saturation level (* coadds) used in linearization and flat creation for channel chan
-n 0.0 Minimum pixel (* coadds) used in linearization
BUG: currently not passed to lincor script!
-p 5.0 Maximum percent saturated pixels allowed to use an image in flat or sky creation
-o raw Raw data subdirectory
-h /home/ircam/unixiraf irafhome, as used with IRAF
-u ${irafhome}/uparm/ Unixiraf parameter directory
-d allfiles.db Master raw data header database
 
-C /bin/cat Name of program or script filter used by mkfiledb to convert input FITS keywords to ones expected by the pipeline. For example, ``UTC'' is converted to ``UT'' even if this option is not used.
-L <opts> Options sent to lincor
-T <opts> Options sent to trimfiles
-D <opts> Options sent to darkcreate, darksub and amdarkcheck
-F <opts> Options sent to flatcreate and flatdiv
-S <opts> Options sent to skysubtract
-U <opts> Options sent to update_headers
 
-x print usage then exit

lincor - Image Linearization

lincor makes the linearity correction to Stelircam images. It uses the IRAF task imexpr to do this. The correction formula is read from the cn_imexpr file in the ``dated'' parameter directories.

For example, the original red-channel (channel 0) algorithm, in /home/ircam/Lincor/19971201/c0_imexpr.txt is:

d * min(b, max(c, a/d - j * (f + (a/d)*(g+h*a/d))**2 ))
The following is the correspondence between the imexpr letters and lincor's coefficients:
imexpr  opt  what
a            input file
b       -m   maximum pixel allowed, default 20,000 * number of coadds
c       -n   minimum pixel allowed, default 0.0 * number of coadds
d            number of coadds in this image
e       -0   C0 (not used in this example algorithm)
f       -1   C1
g       -2   C2
h       -3   C3
i       -4   C4 (not used in this example algorithm)
j       -5   C5
Thus, the above formula in other terms is:
S(true) = S(obs) - C5 * [C1 + C2 * S(obs) + C3 * S(obs)**2]**2

where S(obs) is the (observed data)/(# of co-adds) and C1, C2, C3 and C5 are coefficients given by Tollestrup. The coefficients are stored as a 256x256x5 data cube, i.e. each Cn is a pixel file with a number for each pixel at that coordinate.

Pixels below the minimum value (default is 0.0) or above a maximum value (default 22,000.0 for the red side, 18,000 for the blue) are replaced with the minimum or maximum values, respectively, times the number of co-adds in the exposure.
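
As an illustration, the example red-channel expression can be written out for a single pixel in Python. This is only a sketch of the arithmetic; the real correction is applied by imexpr to whole images, with each coefficient possibly varying per pixel. The argument names follow the imexpr letters in the table above, and the defaults for b and c are taken from that table.

```python
def linearize(a, d, f, g, h, j, b=20000.0, c=0.0):
    """Apply the example channel 0 linearization to one pixel value.

    a: observed pixel value (summed over coadds)
    d: number of coadds
    f, g, h, j: coefficients C1, C2, C3 and C5
    b, c: maximum and minimum allowed values per coadd
    """
    s_obs = a / d  # observed signal per coadd
    # S(true) = S(obs) - C5 * [C1 + C2*S(obs) + C3*S(obs)**2]**2
    s_true = s_obs - j * (f + s_obs * (g + h * s_obs)) ** 2
    # Clip per coadd, then scale back up by the number of coadds.
    return d * min(b, max(c, s_true))
```

For example, with all coefficients zero a two-coadd pixel of 50000 counts is clipped to 2 * 20000 = 40000, and a negative pixel is raised to zero.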

Options to lincor

-c <chan> channel select
-a <value> linearization algorithm for channel chan. May be a full path or relative to the parameter directory (see -k).
-b <bpm[chan]> bad pixel mask file for channel chan
-m 20000 b, Maximum pixel value for channel chan
-i raw input data directory
-o linear output data directory
-j 1 number of jobs per cpu to run
-k /home/ircam/Lincor Path to parameter directory
-n 0.0 c, Minimum pixel value
-0 <path> or <value> e, Coefficient 0
-1 <path> or <value> f, Coefficient 1
-2 <path> or <value> g, Coefficient 2
-3 <path> or <value> h, Coefficient 3
-4 <path> or <value> i, Coefficient 4
-5 <path> or <value> j, Coefficient 5
-x   Print usage and exit

update_headers - Add keywords to headers of linearized files

Since the bad pixel masking file is not used by default by the pipeline (except in creating the darkfiles and flatfields), and since the user may need or want to apply it later after the pipeline finishes, the simplest way to carry the information forward is in the FITS header.

This step writes that Bad Pixel Mask (BPM) file location to the header. If there is already a BPM value in the header, or no BPM parameter exists for this channel, the header is not changed.

If there is no CHANNEL keyword in the header, the BPM file for channel 0 is supplied by default.
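
The decision rules above are simple enough to state as a short Python sketch. The header is modelled as a plain dictionary and the function name is hypothetical; the actual step edits FITS headers in place.

```python
def add_bpm_keyword(header, bpm_by_channel):
    """Return a header with BPM set according to the rules above.

    header: dict of FITS keywords and values for one image
    bpm_by_channel: dict mapping channel number to a BPM file path
    """
    if "BPM" in header:
        return header  # an existing BPM value is never overwritten
    channel = int(header.get("CHANNEL", 0))  # channel 0 when CHANNEL is absent
    bpm_path = bpm_by_channel.get(channel)
    if bpm_path is None:
        return header  # no BPM parameter for this channel: leave unchanged
    updated = dict(header)
    updated["BPM"] = bpm_path
    return updated
```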

Options to update_headers

-c <chan> Select channel (0, 1, ...)
-b <bpm[chan]> Pixel mask file path for selected channel chan
-i ./linear The starting data directory
-j 1 (one) The number of processes to run per CPU. If zero, only one process will be run at a time, despite the presence of multiple CPUs.

trimfiles - Trim edges off images

Since the Stelircam arrays have some bad pixels around the edges, this step trims off rows and columns from the edges of the data set. The image section to be output is set with the cn_trimsec parameter in the calibration directory, or by the runpipe command line with the -T option and those given below.

The IRAF routine imcopy is used to copy the image section, so the image header is updated with LTV1 and LTV2 values.

Analogously to the lincor step, the output images are written into the trim subdirectory, with the extension updated to become ``.lt.fits''.
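
To make the effect of a trim section concrete, here is a small Python sketch that applies a section such as [6:251,6:251] to an image held as a list of rows. It assumes the usual IRAF conventions: sections are 1-based and inclusive, the first range is along the x axis, and copying a section that starts at pixel x1 records LTV1 = 1 - x1 (and likewise for LTV2). The function names are hypothetical.

```python
import re


def parse_section(section):
    """Parse an IRAF-style section '[x1:x2,y1:y2]' into four integers."""
    m = re.fullmatch(r"\[(\d+):(\d+),(\d+):(\d+)\]", section.strip())
    if m is None:
        raise ValueError("not a simple image section: %r" % section)
    x1, x2, y1, y2 = (int(v) for v in m.groups())
    return x1, x2, y1, y2


def trim(image, section):
    """Return the trimmed image and the (LTV1, LTV2) offsets it implies."""
    x1, x2, y1, y2 = parse_section(section)
    # Convert 1-based inclusive bounds to Python's 0-based half-open slices.
    trimmed = [row[x1 - 1:x2] for row in image[y1 - 1:y2]]
    return trimmed, (1 - x1, 1 - y1)
```

Applied to a 256x256 image, the section [6:251,6:251] yields a 246x246 image with LTV1 = LTV2 = -5.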

Options to trimfiles:

-c <chan> select channel
-j 1 (one) The number of processes to run per CPU. If zero, only one process will be run at a time, despite the presence of multiple CPUs.
-t <[row:col,row:col]> Image section to trim for channel chan
-s <statsect[chan]> statistics image section for channel chan
-i linear input data directory
-o trim output data directory
-x print usage and exit

mkfiledb - Create database of header keywords and values

The Stelircam instrument has a consistent set of header keywords, and of course the pipeline was written with that set in mind. The pipeline database needs entries under the following keywords to operate correctly:

DATE_OBS  UT        RRA       RDEC      OBJECT
EXPTIME   CO_ADDS  
FILTER    CHANNEL   LEN     
ITIME     CAMMODE   CLKMODE   FASTMODE  SAM_MD  NUM_SAM  SLOW_CNT

Note - keyword explanations and legal values need to be added

Datasets from other instruments must be made to have (at least) these keywords. The -C option to runpipe is passed to mkfiledb. It must contain the name of an executable filter to manipulate the database of header keywords and values to conform to the pipeline's expectations.

Note that the data headers are not altered - the change is only made to the pipeline's database, which it uses for all subsequent decisions about how to group objects with darks or flats, etc.

Options to mkfiledb:

-f allfiles.db Output database file name
-i ./trim input data directory
-C /bin/cat Filter to alter database output rows or columns.

Filter example 1

#!/bin/sh
# Filter for mkfiledb: header database in on stdin, converted database out.
PATH=${PATH}:/data/oir/bin/
export PATH
# Pass all columns through (-a), rename UTC, RA and DEC, and add
# ITIME and CO_ADDS columns.
column -a UTC=UT RA=RRA DEC=RDEC ITIME CO_ADDS | 
validate '
   toupper(FILTER) ~ "DARK" { FILTER = "Blank" }
   { if (ITIME ~ "") { ITIME = EXPTIME } }
   { if (CO_ADDS ~ "") { CO_ADDS = 1 } }
   ITIME == "5."    { ITIME = "5.00";   EXPTIME = "5.00" }
   ITIME == "5.0"   { ITIME = "5.00";   EXPTIME = "5.00" }
   ITIME == "5.01"  { ITIME = "5.00";   EXPTIME = "5.00" } '

The above filter script uses the starbase set of scripts to manipulate the database columns and headers.

The column command takes data on its standard input. The -a option tells it to pass through all columns not otherwise mentioned; the remaining arguments rename the existing column "UTC" to "UT", "RA" to "RRA" and "DEC" to "RDEC", and add two extra columns, ITIME and CO_ADDS, which are initially empty. Its output goes to the validate program, which is basically a front end to an enhanced implementation of awk called tawk.

Each line of the awk script is tested against each row of the input database. Since the pipeline distinguishes dark frames by a filter value of "Blank", any input row with FILTER matching "DARK" is set to "Blank" instead. Similarly, if ITIME is empty, the value of the EXPTIME element is copied into it. The CO_ADDS element is set to 1 if it was empty. Finally, since the pipeline distinguishes among different darkframe sets using exact string matches of the ITIME, the last three lines in essence reformat this value to a single canonical form, "5.00".

So, if the above script were in file /home/user/myconvert and executable, the way to use it would be:

runpipe -C /home/user/myconvert [<more options>]

Filter example 2

#!/bin/sh
PATH=${PATH}:/data/oir/bin/
export PATH
# Exclude by unique file number
# This excludes files *.0979.* through *.0994.* and also
# files *.1003.* through *.1010.*
validate '{
  split(HDRFILE, arr, ".")
  val = arr[2] + 0
  if((val >= 979 && val <= 994) || (val >= 1003 && val <= 1010)) { 
     USEFLAT = "F"
     USESKY  = "F" 
  }
}'

The above filter script splits the file name, HDRFILE, into component parts, at the period separator. It assumes all file names are of the form name.number.extension(s). The file number is used to select two subranges of files to NOT be used in flatfield creation and sky subtraction. Note that the file will still itself be flatfielded and skysubtracted; it just won't be used to calculate the flatfield or skysubtraction to be applied to any other files.

A USEDARK flag similarly controls whether a file is included in the darkfile creation phase.


darkcreate - Create dark correction files

At this stage, the pipeline scans the database of file header information, allfiles.db, to determine what type of dark exposure is required for each object, and prints a summary table giving a name, the Exposure Type (abbreviated ExpType) and the total exposure time. The exposure type is a concatenation of all the exposure characteristics that distinguish it, joined by underscores for readability. Thus, an exposure type of 1_5.0000_basic_arc_0_2_1_4 decodes to:
 
1    Channel (0=Red, 1=Blue)
5.0000    Frame time, seconds
basic    Camera Mode
arc    Clocking or Array Readout Mode
0    Fastmode (0=off, 1=on)
2    Sample Mode (1=single, 2=double)
1    Number of Readouts
4    Number of Slow Counts
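
The decoding in the table above amounts to splitting the string at underscores. A minimal Python sketch follows; the field names are hypothetical labels chosen for this illustration, and the approach assumes that no individual component (for example a camera mode name) itself contains an underscore.

```python
# Labels for the eight components, in the order shown in the table above.
EXPTYPE_FIELDS = (
    "channel", "frame_time", "camera_mode", "clock_mode",
    "fastmode", "sample_mode", "num_readouts", "slow_counts",
)


def decode_exptype(exptype):
    """Split an exposure type string into a dict of labelled components."""
    parts = exptype.split("_")
    if len(parts) != len(EXPTYPE_FIELDS):
        raise ValueError("unexpected exposure type: %r" % exptype)
    return dict(zip(EXPTYPE_FIELDS, parts))


def encode_exptype(fields):
    """Join labelled components back into an exposure type string."""
    return "_".join(str(fields[name]) for name in EXPTYPE_FIELDS)
```

Components are kept as strings, since the pipeline matches exposure types by exact string comparison.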

For each required dark exposure type, darkcreate creates a dark frame that is the median, on a pixel-by-pixel basis, of each set of dark exposures of that type, not including any files whose USEDARK flag is set to "F". The output file name format is dark.<exptype>.fits, e.g. dark.1_5.0000_basic_arc_0_2_1_4.fits. If no dark exposures are available, a warning diagnostic is printed.
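
A pixel-by-pixel median with USEDARK filtering can be sketched in pure Python as below. Frames are modelled as lists of rows and the function names are hypothetical; the pipeline itself works on FITS files, and this sketch only illustrates the combination rule and the naming convention.

```python
from statistics import median


def median_dark(frames, use_flags):
    """Median-combine dark frames, skipping those whose USEDARK flag is 'F'.

    frames: list of images, each a list of rows of pixel values
    use_flags: list of USEDARK values, one per frame
    """
    kept = [f for f, flag in zip(frames, use_flags) if flag != "F"]
    if not kept:
        return None  # the pipeline prints a warning in this situation
    rows, cols = len(kept[0]), len(kept[0][0])
    return [
        [median(f[r][c] for f in kept) for c in range(cols)]
        for r in range(rows)
    ]


def dark_file_name(exptype):
    """Build the output name dark.<exptype>.fits."""
    return "dark." + exptype + ".fits"
```

The median is used rather than the mean so that a single deviant frame, such as the third frame in a set with pixel values 1, 3 and 100, does not pull the result away from the typical level.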

Options to darkcreate

-c <chan> select channel
-s <statsect[chan]> statistics image section for channel chan
-d unused - dummy arg for compatibility with amdarkcheck and darksub
-i trim input data directory
-o darksub output data directory
-f darktypes.db Output table of dark exposure types.
-j 1 number of jobs per cpu to run
-l   unused - dummy arg for compatibility with darksub
-x   print usage and exit

amdarkcheck - Compare afternoon and morning dark exposures

As an extra check on stability, some observers take a set of darks in the afternoon before observing and in the morning after observing. The amdarkcheck script looks for these corresponding sets and compares them.

Generally, the camera is stable enough that the difference in the two sets is one count or less, and in any case well under the one-sigma level.

Options to amdarkcheck

-c <chan> select channel
-s <statsect[chan]> statistics image section for channel chan
-d darkdiff.db table of PM--AM differences
-i trim input data directory
-o darksub output data directory
-f darktypes.db input table of dark exposure types.
-j 1 number of jobs per cpu to run
-l   unused - dummy arg for compatibility with darksub
-x   print usage and exit

darksub - Subtract dark correction

Once the median dark frame files are created, this stage subtracts the appropriate dark from each object file, scaling if necessary. If the underlying frame time of the dark and object files is different, the program chooses to scale the dark frame with the largest exposure time where the other components of the exposure type are the same. That is, if an object has a frame time of 2.5 seconds and no darks have the same frame time, a dark of the same channel, camera mode, fast mode, etc. but perhaps a frame time of 10 seconds could be used.
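
The choice of dark can be sketched as follows, using exposure type strings of the form channel_frametime_mode_..., such as 1_5.0000_basic_arc_0_2_1_4. Two things here are assumptions for the sake of illustration rather than documented behaviour: the function name, and the scale factor, which is taken to be the ratio of object frame time to dark frame time.

```python
def choose_dark(object_exptype, dark_exptypes):
    """Pick a dark for an object; return (dark_exptype, scale) or None."""
    if object_exptype in dark_exptypes:
        return object_exptype, 1.0  # exact match, no scaling needed
    obj = object_exptype.split("_")
    best = None
    for dark in dark_exptypes:
        parts = dark.split("_")
        # Everything except the frame time (second component) must agree.
        if len(parts) != len(obj) or parts[0] != obj[0] or parts[2:] != obj[2:]:
            continue
        if best is None or float(parts[1]) > float(best.split("_")[1]):
            best = dark  # prefer the longest frame time
    if best is None:
        return None
    # Assumed scaling: ratio of object frame time to dark frame time.
    return best, float(obj[1]) / float(best.split("_")[1])
```

For the case in the text, an object with a 2.5 second frame time and darks of 5 and 10 seconds would be matched with the 10 second dark and, under the scaling assumption, a factor of 0.25.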

Analogously to the earlier routines, the output files are written into the output directory with the extension .ltd.fits.

Options to darksub

-c   unused - dummy arg for compatibility with darkcreate and amdarkcheck
-s   unused - dummy arg for compatibility with darkcreate and amdarkcheck
-i trim input data directory
-o darksub output data directory
-d darkdiff.db table of PM--AM differences
-f allfiles.db input table of raw file header keywords
-j 1 number of jobs per cpu to run
-l darksub.log Output summary of processing
-x   print usage and exit

flatcreate - Create flatfield files

As with the creation of dark frames, this stage creates flatfields. A median of up to 150 object files is used to create a flatfield for each channel, filter and lens combination, with the minimum number defaulting to 20 (see the -g option). Exposures of 10 seconds or less are ignored unless there are no longer-exposure files in the set, on the principle that there is not enough S/N in the sky. This means in practice that standard star exposures are not used. Any file with the USEFLAT value set to "F" in the database is not used. Also, all exposures with more than 5% saturated pixels (>=20,000 counts per frame) are ignored as presumably being in error or at least too dense with objects. The saturation limit is settable with the -p option and the saturation level is settable (per channel) with the -m option.

As is usual practice, the median file is divided through by its mean to normalize it to 1.0. Also, because of sometimes limited memory and swap space, only up to 150 files are used in the median for each flatfield file.
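
The selection rules and the normalization can be sketched in Python as below. Database rows are modelled as dictionaries; the saturated_pct key and the function names are hypothetical, and taking the first 150 usable files is an assumption, since the text does not say which files are kept when more are available.

```python
from statistics import mean


def select_flat_inputs(rows, max_saturated_pct=5.0, min_files=20,
                       max_files=150, short_exptime=10.0):
    """Choose the files that go into one flatfield, or [] if too few."""
    usable = [
        r for r in rows
        if r.get("USEFLAT") != "F" and r["saturated_pct"] <= max_saturated_pct
    ]
    # Short exposures are used only when no longer exposures exist.
    long_exp = [r for r in usable if r["EXPTIME"] > short_exptime]
    chosen = (long_exp or usable)[:max_files]
    return chosen if len(chosen) >= min_files else []


def normalize_flat(median_image):
    """Divide an image by its mean so that the result averages 1.0."""
    level = mean(p for row in median_image for p in row)
    return [[p / level for p in row] for row in median_image]
```

The defaults mirror the -p, -g and -m related limits described above: 5% saturated pixels, a minimum of 20 files and a cap of 150.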

The output file name is of the format flat.cn.<filter-name>.fits, where n is the channel number. An example might be flat.c1.H_Barr.fits

Options to flatcreate

-c <chan> select channel
-s <statsect[chan]> statistics image section for channel chan
-m 20000 pixel saturation level (* coadds) for channel chan
-d flattypes.db Table of flat types in data set
-i darksub input data directory
-g 20 Minimum number of files required to create a flat-field
-o flat output data directory
-f allfiles.db Input table of raw header keywords
-j 1 number of jobs per cpu to run
-p 5.0 maximum percent saturated pixels allowed in images used to create a flatfield
-l   unused - dummy arg for compatibility with flatdiv
-x   print usage and exit

flatdiv - Divide files by flatfield

Once the flatfield images have been created, this step runs through the list of all objects, matching up the correct flat and dividing through by it.

Analogously to the earlier routines, the output files are written into the output directory with the extension .ltdf.fits.

Options to flatdiv

-c   unused - dummy arg for compatibility with flatcreate
-s   unused - dummy arg for compatibility with flatcreate
-d flattypes.db Input table of flat types in data set
-i darksub input data directory
-o flat output data directory
-f allfiles.db Input table of raw header keywords
-j 1 number of jobs per cpu to run
-l do_flats.log output processing log
-m   unused - dummy arg for compatibility with flatcreate
-p   unused - dummy arg for compatibility with flatcreate

skysubtract - Create sky-subtracted files

The sky subtraction step involves calculating an optimum sky file for each object file.

For each image, a set of images (not including itself) with the same filter and nearly the same sky level are chosen, excepting any with the USESKY value set to "F". As with the flatfield creation step, there is a distinction between images with exposure times less than 10 seconds and those with longer exposures.

For images with exposures less than 10 seconds, only other short-exposure images are selected, and the set size is at least 14 and at most 22 images. The lower number is used if the background level differs by 5% or more; if the background level stays the same to within 5%, the larger number is used.

For images with longer exposures, only similar long-exposure images are selected, and the set size is at least 9 up to 18.
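
The selection rules can be summarized in a short Python sketch. Several points are assumptions made for illustration: the function names, the use of a single background ratio to characterize how much the sky level changes, and the application of the same 5% rule to long exposures, for which the text gives only the bounds of 9 and 18.

```python
def sky_set_size(exptime, background_ratio, short_exptime=10.0, tolerance=0.05):
    """Return how many images to combine into the sky for one object."""
    low, high = (14, 22) if exptime < short_exptime else (9, 18)
    # Stable background (within 5%): use the larger set; otherwise the smaller.
    stable = abs(background_ratio - 1.0) < tolerance
    return high if stable else low


def sky_candidates(target, rows, short_exptime=10.0):
    """Return rows eligible to contribute to the sky for the target image."""
    is_short = target["EXPTIME"] < short_exptime
    return [
        r for r in rows
        if r is not target                      # never include the image itself
        and r["FILTER"] == target["FILTER"]     # same filter only
        and r.get("USESKY") != "F"              # honour the USESKY flag
        and (r["EXPTIME"] < short_exptime) == is_short  # same exposure class
    ]
```

The rationale for the two set sizes is that a rapidly changing background makes distant frames poor sky estimates, so fewer, closer frames are preferred.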

The output log file summarizes how the object differs from each component of its sky file: the difference in time of exposure in minutes, the separation on the sky in arc-minutes, the absolute value of the background level in counts, and the ratio of background levels. Note that only the ratio is important in selecting images for sky correction.

Once the data set is chosen, it is median-combined into a sky image. The sky is subtracted from the selected image, which is written to the skysub subdirectory with the extension .ltdfs.fits. The sky image is written to the skysub/sky subdirectory, prefixed with the string ``sky_for.''.

Options to skysubtract

-c <chan> select channel
-s <statsect> statistics image section for channel chan
-d flattypes.db Table of flat types in data set
-i flat input data directory
-o skysub output data directory
-f allfiles.db Input table of raw header keywords
-j 1 number of jobs per cpu to run