NASA - National Aeronautics and Space Administration
Kepler Guest Observer Program

Contributed Software - KEPOUTLIER

Software: PyKE
Version: 2.0.1
Contributor:

NAME
kepoutlier -- Remove or replace statistical outliers from time series data

USAGE
kepoutlier infile outfile datacol nsig stepsize npoly niter operation ranges plot plotfit clobber verbose logfile

PARAMETERS
infile = string
The name of a MAST standard format FITS file containing a Kepler light curve within the first data extension.

outfile = string
The name of the output FITS file. outfile will be a direct copy of infile, except that data outliers are either removed (i.e. the table will have fewer rows) or corrected according to a best-fit function and a noise model.

datacol = string
The column name containing data stored within extension 1 of infile. This data will be searched for outliers. Typically this name is SAP_FLUX (Simple Aperture Photometry fluxes) or PDCSAP_FLUX (Pre-search Data Conditioning fluxes).

nsig = float
The sigma clipping threshold. Data deviating from a best fit function by more than the threshold will be either removed or corrected according to the user selection of operation.

stepsize = float
The data within datacol are unlikely to be well represented by a single polynomial function. stepsize splits the data into a series of time blocks, each of which is fit independently by a separate function. The user can make an informed choice of stepsize after inspecting the data with the kepdraw tool. Units are days.
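
As an illustration of the blocking described above, the following Python sketch groups samples into consecutive windows of stepsize days. It is not taken from the kepoutlier source: the function name split_into_blocks and the choice to measure blocks from the earliest timestamp are assumptions made for the example.

```python
import numpy as np


def split_into_blocks(time, stepsize):
    """Return a list of index arrays, one per block of `stepsize` days.

    Hypothetical helper: blocks are counted from the earliest timestamp,
    which is an assumption, not documented kepoutlier behaviour.
    """
    time = np.asarray(time, dtype=float)
    if stepsize <= 0:
        raise ValueError("stepsize must be positive")
    # Integer block number of every sample.
    block_id = np.floor((time - time.min()) / stepsize).astype(int)
    # One array of sample indices per non-empty block.
    return [np.flatnonzero(block_id == b) for b in np.unique(block_id)]
```

Each returned index array can then be used to select the time and flux values that one polynomial is fit to.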

npoly = integer
The polynomial order of each best-fit function.

niter = integer
If outliers are found in a particular data section, those outliers will be removed temporarily and the section fit again. This will be iterated niter times before freezing upon the best available fit.
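
The interplay of nsig, npoly and niter can be sketched in Python as an iterative sigma clip against a polynomial fit. This is a minimal sketch under stated assumptions, not the kepoutlier implementation: the function name find_outliers is hypothetical, the scatter is taken to be the rms of the residuals of the currently accepted points, and the loop stops early once the set of flagged points no longer changes.

```python
import numpy as np


def find_outliers(time, flux, nsig=3.0, npoly=2, niter=10):
    """Flag points deviating from a polynomial fit by more than nsig * rms.

    Hypothetical sketch; returns (outlier_mask, model, rms) for one block.
    """
    time = np.asarray(time, dtype=float)
    flux = np.asarray(flux, dtype=float)
    t = time - time.mean()  # centre the times to keep the fit well conditioned
    good = np.ones(time.size, dtype=bool)
    for _ in range(max(niter, 1)):
        # Fit only the points currently accepted as good.
        coeffs = np.polyfit(t[good], flux[good], npoly)
        model = np.polyval(coeffs, t)
        resid = flux - model
        rms = np.sqrt(np.mean(resid[good] ** 2))
        outliers = np.abs(resid) > nsig * rms
        if np.array_equal(~outliers, good):
            break  # assumption: stop once the flagged set is stable
        good = ~outliers
    return outliers, model, rms
```

The returned mask, model and rms all refer to the last fit performed, so they are mutually consistent whether the loop ended by convergence or by reaching niter.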

operation = string (remove|replace)
There are only two options. remove throws away outliers, so the output data table will be smaller than or equal in size to the input table. replace substitutes each outlier with a value that is consistent with the best-fit polynomial function plus a random component. The random component is defined by the rms of the data relative to the fit and is calculated using the inverse normal cumulative function and a random number generator.
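
One way to read the replace recipe is inverse-transform sampling: draw a uniform random number, map it through the inverse normal cumulative function, scale by the rms and add the result to the fitted value. The Python sketch below follows that reading. The function name noise_treated_value is hypothetical and the exact recipe used by kepoutlier may differ.

```python
import random
from statistics import NormalDist


def noise_treated_value(model_value, rms, rng=random):
    """Return the fitted value plus Gaussian noise of standard deviation rms.

    Hypothetical sketch of the 'replace' operation for a single outlier.
    """
    u = rng.random()  # uniform deviate in [0, 1)
    while u <= 0.0:
        u = rng.random()  # the inverse CDF is undefined at exactly 0
    # Map the uniform deviate to a standard normal deviate, then scale it.
    return model_value + rms * NormalDist().inv_cdf(u)
```

Passing a seeded random.Random instance as rng makes the replacement values reproducible.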

ranges = string
The user can choose specific time ranges of data on which to work. This could, for example, avoid removing known stellar flares from a dataset. Ranges can be supplied using one of two methods.

  1. Time ranges are supplied as comma-separated pairs of Barycentric Julian Dates (BJDs). Multiple ranges are separated by a semi-colon. An example containing two time ranges is:
    '2455012.48517,2455014.50072;2455022.63487,2455025.08231'.
    To operate on the entire time series, provide ranges = '0,0'.
  2. The user can provide time ranges within a pre-prepared ascii file containing one time range per line, e.g.:
    2455012.48517,2455014.50072
    2455022.63487,2455025.08231
    etc
    The file 'arbitraryname.txt' is provided to the task using ranges = @arbitraryname.txt, where the '@' tells the task that a file is being provided. Files containing time ranges can be generated manually or with the aid of data inspection using the task keprange.
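
Both input methods carry the same information, so they can be read by one routine. The following Python sketch parses either form into a list of (start, stop) pairs. The function name parse_ranges and the choice to return None for the '0,0' whole-series case are assumptions made for illustration, not part of PyKE.

```python
def parse_ranges(ranges):
    """Parse a kepoutlier-style ranges string into (start, stop) BJD pairs.

    Hypothetical helper. Returns None when ranges is '0,0', meaning
    "operate on the whole time series".
    """
    if ranges.startswith("@"):
        # '@name' points at an ASCII file with one 'start,stop' pair per line.
        with open(ranges[1:]) as fh:
            items = [line.strip() for line in fh]
    else:
        # Inline form: pairs separated by semicolons.
        items = [part.strip() for part in ranges.split(";")]
    pairs = []
    for item in items:
        if not item:
            continue  # skip blank lines and empty fragments
        start, stop = (float(x) for x in item.split(","))
        pairs.append((start, stop))
    if pairs == [(0.0, 0.0)]:
        return None
    return pairs
```

For example, parse_ranges('@range.txt') would read the pairs from a file named range.txt in the working directory.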

plot = boolean
Plot the data and outliers?

plotfit = boolean
Overlay the polynomial fits upon the plot?

clobber = boolean (optional)
Overwrite the output file? If clobber = no and an existing file has the same name as outfile then the task will stop with an error.

verbose = boolean (optional)
Print informative messages and warnings to the shell and logfile?

logfile = string (optional)
Name of the logfile containing error and warning messages.

status = integer
Exit status of the script. It will be non-zero if the task halted with an error. This parameter is set by the task and should not be modified by the user.

DESCRIPTION
kepoutlier identifies data outliers relative to piecewise best-fit polynomials. Outliers are either removed from the output time series or replaced by a noise-treated value defined by the polynomial fit. Identified outliers and the best-fit functions are optionally plotted for inspection purposes.

EXAMPLE

  1. Replace data outliers with noise-treated model:
    • kepoutlier infile=kplr002437145-2009350155506_llc.fits outfile=new.fits datacol=SAP_FLUX nsig=4 stepsize=5 npoly=2 niter=10 operation=replace ranges=@range.txt plot=y plotfit=y

  2. Remove data outliers, replace existing file:
    • kepoutlier infile=kplr002437145-2009350155506_llc.fits outfile=new.fits datacol=SAP_FLUX nsig=3 stepsize=5 npoly=2 niter=10 operation=remove ranges=@range.txt plot=y plotfit=y clobber=y

TIME REQUIREMENTS
Running the task to completion on one quarter of Kepler long cadence data for a single target takes a few seconds on a 3.06 GHz Intel Core 2 Duo Mac running OS 10.6.4. Running times increase several-fold if the input data contain NaNs. These will be filtered out before task execution.
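
A user who prefers to strip NaNs before handing data to any fitting step can do so with a finite-value mask. The Python sketch below is illustrative only: the function name drop_nonfinite is hypothetical, and treating infinities the same way as NaNs is an assumption.

```python
import numpy as np


def drop_nonfinite(time, flux):
    """Return time and flux with NaN/inf samples removed, plus the keep mask.

    Hypothetical helper; not part of PyKE.
    """
    time = np.asarray(time, dtype=float)
    flux = np.asarray(flux, dtype=float)
    # Keep a sample only if both its timestamp and its flux are finite.
    keep = np.isfinite(time) & np.isfinite(flux)
    return time[keep], flux[keep], keep
```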

BUGS AND LIMITATIONS
The Kepler PyRAF package is privately-developed software made available to the community through the contributed software page of the GO program at http://keplergo.arc.nasa.gov/ContributedSoftware.shtml. It is not an official software product of the Kepler mission. Bugs and errors are not the responsibility of NASA or the Kepler Team. Please send bug reports and suggestions to keplergo@mail.arc.nasa.gov.

HISTORY

Date        Version  Description
2010-06-29  1.0.0    Initial software release (MS)
2011-07-01  1.0.1    Updated for Kepler FITS v2.0 (MS)
2011-07-01  1.0.2    Fixed bug which allowed only integer sigma-clipping (TB)
2012-03-27  2.0.0    Code can now be run from the command line (TB)
2012-06-07  2.0.1    More reliable plot rendering on Linux operating systems (MS)

SEE ALSO
keprange


Editor: Martin Still
NASA Official: Jessie Dotson
Last Updated: Jan 6, 2012