NAME
kepoutlier -- Remove or replace statistical outliers from time
series data
USAGE
kepoutlier infile outfile datacol nsig stepsize npoly niter operation ranges
plot plotfit clobber verbose logfile
PARAMETERS
infile = string
The name of a MAST standard format FITS file containing a Kepler light curve
within the first data extension.
outfile = string
The name of the output FITS file. outfile will be a direct copy of
infile with either data outliers removed (i.e. the table will have
fewer rows) or the outliers will be corrected according to a best-fit
function and a noise model.
datacol = string
The column name containing data stored within extension 1 of infile.
This data will be searched for outliers. Typically this name is
SAP_FLUX (Simple Aperture Photometry fluxes) or PDCSAP_FLUX
(Pre-search Data Conditioning fluxes).
nsig = float
The sigma clipping threshold. Data deviating from a best fit function by
more than the threshold will be either removed or corrected according to
the user selection of operation.
stepsize = float
The data within datacol is unlikely to be well represented by a single
polynomial function. stepsize splits the data up into a series of time
blocks, each of which is fit independently by a separate function. The user can provide
an informed choice of stepsize after inspecting the data with the
kepdraw tool. Units are days.
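The splitting of a time series into stepsize blocks can be sketched as follows. This is a minimal illustration, not the task's own code: the function name split_into_blocks is hypothetical, the times are assumed to be sorted, and how kepoutlier actually places block boundaries is not stated here.

```python
def split_into_blocks(times, stepsize):
    """Group sample indices into consecutive time blocks of `stepsize` days.

    Assumes `times` is sorted in increasing order (an assumption of this sketch).
    """
    if stepsize <= 0:
        raise ValueError("stepsize must be positive")
    blocks = []
    current = []
    block_start = None
    for i, t in enumerate(times):
        if block_start is None:
            block_start = t
        # Start a new block once the sample falls outside the current window.
        if t - block_start >= stepsize:
            blocks.append(current)
            current = []
            block_start = t
        current.append(i)
    if current:
        blocks.append(current)
    return blocks


# Twelve samples spaced one day apart, split into 5-day blocks.
times = [2455000.0 + day for day in range(12)]
blocks = split_into_blocks(times, 5.0)
```

Each block of indices would then be fit independently by its own polynomial.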
npoly = integer
The polynomial order of each best-fit function.
niter = integer
If outliers are found in a particular data section, that data will be removed
temporarily and the time series fit again. This will be iterated niter
times before freezing upon the best available fit.
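The interplay of nsig, npoly and niter within one time block can be sketched as below. This is an illustration under stated assumptions rather than the task's implementation: polyfit, polyval and clip_outliers are hypothetical helper names, the fit is an ordinary least-squares solve of the normal equations, and the clipping threshold is taken to be nsig times the rms of the residuals.

```python
import math


def polyfit(x, y, order):
    """Least-squares polynomial coefficients, lowest power first."""
    n = order + 1
    # Normal equations a * coeffs = b for the monomial basis.
    a = [[sum(xi ** (r + c) for xi in x) for c in range(n)] for r in range(n)]
    b = [sum(yi * xi ** r for xi, yi in zip(x, y)) for r in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= f * a[col][c]
            b[r] -= f * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(a[r][c] * coeffs[c] for c in range(r + 1, n))
        coeffs[r] = (b[r] - s) / a[r][r]
    return coeffs


def polyval(coeffs, x):
    return sum(c * x ** k for k, c in enumerate(coeffs))


def clip_outliers(times, flux, nsig, npoly, niter):
    """Return (outlier indices, final coefficients) for one time block."""
    t0 = times[0]
    x = [t - t0 for t in times]  # offset times to keep the fit well conditioned
    keep = list(range(len(x)))
    coeffs = polyfit(x, flux, npoly)
    for _ in range(niter):
        resid = [flux[i] - polyval(coeffs, x[i]) for i in keep]
        rms = math.sqrt(sum(r * r for r in resid) / len(resid))
        survivors = [i for i, r in zip(keep, resid) if abs(r) <= nsig * rms]
        # Stop when nothing new is clipped or too few points remain to fit.
        if len(survivors) == len(keep) or len(survivors) <= npoly:
            break
        keep = survivors
        coeffs = polyfit([x[i] for i in keep], [flux[i] for i in keep], npoly)
    kept = set(keep)
    return [i for i in range(len(x)) if i not in kept], coeffs


# Linear trend with small alternating scatter and one large spike at index 7.
times = [2455000.0 + i for i in range(20)]
flux = [100.0 + 0.5 * i + (0.1 if i % 2 else -0.1) for i in range(20)]
flux[7] += 50.0
outliers, coeffs = clip_outliers(times, flux, nsig=3.0, npoly=1, niter=5)
```

In this synthetic example the spike is clipped on the first pass and the second pass finds nothing further, so the loop ends before niter is reached.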
operation = string (remove|replace)
There are only two options. remove throws away outliers, so the output data
table will be smaller than or equal in size to the input table. replace replaces
each outlier with a value consistent with the best-fit polynomial function
plus a random component. The random component is defined by the rms of the
data relative to the fit and is calculated using the inverse normal cumulative
function and a random number generator.
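One way to picture the replace operation is sketched below: a uniform random number is passed through the inverse normal cumulative function to give a Gaussian deviate, which is scaled by the rms and added to the fit value. The function name replacement_value is hypothetical and the exact formula used by the task is an assumption of this sketch.

```python
import random
from statistics import NormalDist


def replacement_value(fit_value, rms, rng=random):
    """Fit value plus Gaussian scatter of width `rms` (assumed noise model)."""
    u = rng.random()
    while u == 0.0:  # inv_cdf is only defined for 0 < u < 1
        u = rng.random()
    # The inverse normal CDF maps a uniform deviate to a standard normal deviate.
    return fit_value + rms * NormalDist().inv_cdf(u)


# Draw replacements around a fit value of 100.0 with an rms of 2.0.
rng = random.Random(42)
values = [replacement_value(100.0, 2.0, rng) for _ in range(2000)]
```

Passing a seeded random.Random instance, as here, makes the replaced values reproducible between runs.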
ranges = string
The user can choose specific time ranges of data on which to work. This could,
for example, avoid removing known stellar flares from a dataset. Ranges can
be supplied using one of two methods.
- Time ranges are supplied as comma-separated pairs of Barycentric Julian
Dates (BJDs). Multiple ranges are separated by a semi-colon. An example
containing two time ranges is:
'2455012.48517,2455014.50072;2455022.63487,2455025.08231'.
If the user wants to correct the entire time series then providing
ranges = '0,0' will tell the task to operate on the whole time
series.
- The user can provide time ranges within a pre-prepared ascii file containing
one time range per line, e.g.:
2455012.48517,2455014.50072
2455022.63487,2455025.08231
etc
The file 'arbitraryname.txt' is provided to the task using ranges =
@arbitraryname.txt, where the '@' tells the task that a file is being provided.
Files containing time ranges can be generated manually or with the aid of
data inspection using the task keprange.
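The two ways of supplying ranges can be sketched with a small parser. This is an illustration only: parse_ranges and in_ranges are hypothetical names, and returning an empty list to mean "the whole time series" is a convention chosen for this sketch.

```python
def parse_ranges(ranges):
    """Return a list of (start, stop) BJD pairs.

    An empty list stands for the whole time series, i.e. ranges = '0,0'.
    """
    if ranges.startswith("@"):
        # '@name' means the ranges are stored one per line in the file 'name'.
        with open(ranges[1:]) as handle:
            entries = [line.strip() for line in handle if line.strip()]
    else:
        entries = [part.strip() for part in ranges.split(";") if part.strip()]
    pairs = []
    for entry in entries:
        start, stop = (float(value) for value in entry.split(","))
        pairs.append((start, stop))
    if pairs == [(0.0, 0.0)]:
        return []
    return pairs


def in_ranges(t, pairs):
    """True if time `t` should be searched for outliers."""
    return not pairs or any(start <= t <= stop for start, stop in pairs)


pairs = parse_ranges("2455012.48517,2455014.50072;2455022.63487,2455025.08231")
```

A time stamp such as 2455013.0 then satisfies in_ranges, while one at 2455020.0 lies between the two ranges and is left untouched.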
plot = boolean
Plot the data and outliers?
plotfit = boolean
Overlay the polynomial fits upon the plot?
clobber = boolean (optional)
Overwrite the output file? If clobber = no and an existing file has
the same name as outfile then the task will stop with an error.
verbose = boolean (optional)
Print informative messages and warnings to the shell and logfile?
logfile = string (optional)
Name of the logfile containing error and warning messages.
status = integer
Exit status of the script. It will be non-zero if the task halted with an
error. This parameter is set by the task and should not be modified by the
user.
DESCRIPTION
kepoutlier identifies data outliers relative to piecewise best-fit
polynomials. Outliers are either removed from the output time series or
replaced by a noise-treated value defined by the polynomial fit. Identified
outliers and the best fit functions are optionally plotted for inspection
purposes.
EXAMPLE
- Replace data outliers with noise-treated model:
- kepoutlier infile=kplr002437145-2009350155506_llc.fits
outfile=new.fits datacol=SAP_FLUX nsig=4 stepsize=5 npoly=2
niter=10 operation=replace ranges=@range.txt plot=y plotfit=y
- Remove data outliers, replace existing file:
- kepoutlier infile=kplr002437145-2009350155506_llc.fits
outfile=new.fits datacol=SAP_FLUX nsig=3 stepsize=5 npoly=2
niter=10 operation=remove ranges=@range.txt plot=y plotfit=y
clobber=y
TIME REQUIREMENTS
Processing one quarter of long cadence data for a single Kepler target on a
3.06 GHz Intel Core 2 Duo Mac running OS 10.6.4 takes a few seconds. Running
times increase by a factor of several if the input data contain NaNs. NaNs are
filtered out before task execution.
BUGS AND LIMITATIONS
The Kepler PyRAF package is privately-developed software made available to
the community through the contributed software page of the GO program at
http://keplergo.arc.nasa.gov/ContributedSoftware.shtml. It is not an
official software product of the Kepler mission. Bugs and errors are not
the responsibility of NASA or the Kepler Team. Please send bug reports and
suggestions to keplergo@mail.arc.nasa.gov.
HISTORY
- Initial software release (MS)
- Updated for Kepler FITS v2.0 (MS)
- Fixed bug which allowed only integer sigma-clipping (TB)
- Code can now be run from the command line (TB)
- More reliable plot rendering on Linux operating systems (MS)
SEE ALSO
keprange