pyPetal Output Files and Directories

Depending on the modules the user chooses to run and the parameters chosen for each module, the diagnostic output from pyPetal and the structure of the output directory will differ. In general, running all of the modules with three lines (named “cont”, “line1”, and “line2”) will produce the following directory structure:

output_directory/
├── cont/
│   ├── drw_rej/
│   └── detrend.pdf
├── line1/
│   ├── drw_rej/
│   ├── mica2/
│   ├── pyccf/
│   ├── pyzdcf/
│   ├── javelin/
│   ├── weights/
│   └── detrend.pdf
├── line2/
│   ├── drw_rej/
│   ├── mica2/
│   ├── pyccf/
│   ├── pyzdcf/
│   ├── javelin/
│   ├── weights/
│   └── detrend.pdf
├── processed_lcs/
├── pyroa/
├── pyroa_lcs/
├── light_curves/
├── mica2_weights_res.pdf
├── pyccf_weights_res.pdf
├── pyroa_weights_res.pdf
└── javelin_weights_res.pdf

Each line will have its own subdirectory, labeled with the name given for it in the line_names argument of pypetal.pipeline.run_pipeline. Each of these line subdirectories will contain one subdirectory per module, depending on which modules are run.
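As a rough illustration of this layout, the sketch below checks which module subdirectories exist for each line. The helper name find_module_outputs is hypothetical (it is not part of pyPetal), and the list of subdirectory names is simply copied from the directory tree and module sections on this page.

```python
import tempfile
from pathlib import Path

# Per-line module subdirectory names, copied from this page (not queried from pyPetal).
MODULE_DIRS = ("drw_rej", "pyccf", "pyzdcf", "pyroa", "mica2", "javelin", "weights")


def find_module_outputs(output_dir, line_names):
    """Map each line name to the module subdirectories that exist for it."""
    found = {}
    for name in line_names:
        line_dir = Path(output_dir) / name
        found[name] = [m for m in MODULE_DIRS if (line_dir / m).is_dir()]
    return found


# Demo on a throwaway directory that mimics part of the layout.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "cont" / "drw_rej").mkdir(parents=True)
    (Path(tmp) / "line1" / "pyccf").mkdir(parents=True)
    (Path(tmp) / "line1" / "weights").mkdir(parents=True)
    layout = find_module_outputs(tmp, ["cont", "line1", "line2"])

print(layout)
```

A line with no module output (here line2) simply maps to an empty list, which makes it easy to spot modules that were skipped.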

The processed_lcs subdirectory contains the light curves for all input lines after the processing steps (i.e., DRW-based outlier rejection and detrending). This subdirectory will not exist if neither DRW-based outlier rejection nor detrending is run.

The light_curves subdirectory contains the original light curves for all input lines before processing steps. In addition to the light curves, these files will contain the masks produced in the DRW-based outlier rejection module if it is run.

In addition, each of these subdirectories contains files with the results of the pyPetal modules. The files output by each module are described below.

Light Curves

Regardless of the modules run, pyPetal will produce a light_curves subdirectory within the main output_directory. This contains the original light curves obtained through the arg2 argument of pypetal.pipeline.run_pipeline. These light curves will be named {line_name}.dat, where line_name is the name of the given light curve input to the pyPetal pipeline. These will be formatted as CSV files, with the first three columns representing the times, values, and uncertainties of the light curve.

If the DRW-based outlier rejection module is run, these light curve files will contain a fourth column with the DRW-rejection mask. This mask will consist of booleans, where True means a point was rejected.
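The following is a minimal sketch of reading such a file with the standard library. It assumes comma-delimited rows with no header line and a mask written as the text True/False; neither detail is stated on this page, so check a real file before relying on it. The function name read_light_curve is hypothetical.

```python
import csv
import io


def read_light_curve(fileobj):
    """Parse rows of time, value, uncertainty[, mask] from an open CSV file."""
    times, values, errors, masks = [], [], [], []
    for row in csv.reader(fileobj):
        if not row or row[0].lstrip().startswith("#"):
            continue  # skip blank lines and comment lines
        times.append(float(row[0]))
        values.append(float(row[1]))
        errors.append(float(row[2]))
        if len(row) > 3:
            # Assumption: the mask is stored as the text True/False (or 1/0).
            masks.append(row[3].strip().lower() in ("true", "1", "1.0"))
    return times, values, errors, (masks or None)


# In-memory data standing in for light_curves/line1.dat.
demo = io.StringIO("1.0,10.2,0.3,False\n2.0,55.0,0.4,True\n3.0,10.5,0.3,False\n")
t, y, yerr, mask = read_light_curve(demo)

# True means rejected, so keep the points whose mask entry is False.
kept = [ti for ti, rejected in zip(t, mask) if not rejected]
print(kept)
```

When the file has only three columns, the returned mask is None, which signals that the DRW rejection module was not run.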

Module: DRW-based Outlier Rejection

The DRW Rejection module file output is unique in that it depends on the user’s reject_data argument. For all lines with reject_data=True, their subdirectory will contain a drw_rej/ subdirectory. This subdirectory will contain the following files:

| Filename | Description | Format | Columns |
| --- | --- | --- | --- |
| {line_name}_chain.dat | The MCMC chains for the DRW parameters from the fit. | CSV | \(\tau_{\rm DRW}, \sigma_{\rm DRW}, \sigma_n\) |
| {line_name}_drw_fit.dat | The DRW fit to the light curve. | CSV | time, value, uncertainty |
| {line_name}_mask.dat | The DRW-based outlier rejection mask. | CSV | |
| {line_name}_drw_fit.pdf | A figure describing the DRW fit to the light curve (see the DRW rejection tutorial). | PDF | |

Note

If jitter=False for the module, there will only be two columns in {line_name}_chain.dat, and the jitter term \(\sigma_n\) will not be included.

In addition, pyPetal will save the light curve excluding the rejected points to the processed_lcs subdirectory under the name {line_name}_data.dat.
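Because the number of columns in {line_name}_chain.dat depends on jitter, code that reads the chain can infer the parameter names from the column count. The sketch below does this and reports the median of each parameter. It assumes a comma-delimited file with no header, and the helper name summarize_chain is hypothetical. Whether the stored values are linear or logarithmic is not stated here, so the medians are reported as stored.

```python
import csv
import io
import statistics

# Column order as listed for {line_name}_chain.dat; sigma_n is absent if jitter=False.
PARAM_NAMES = ("tau_DRW", "sigma_DRW", "sigma_n")


def summarize_chain(fileobj):
    """Return {parameter: median} for a DRW chain file with 2 or 3 columns."""
    rows = [[float(x) for x in row] for row in csv.reader(fileobj) if row]
    ncol = len(rows[0])
    if ncol not in (2, 3):
        raise ValueError(f"expected 2 or 3 columns, found {ncol}")
    names = PARAM_NAMES[:ncol]
    columns = list(zip(*rows))  # transpose rows into per-parameter columns
    return {name: statistics.median(col) for name, col in zip(names, columns)}


# In-memory stand-in for a chain written with jitter=False (two columns).
demo = io.StringIO("10.0,0.5\n12.0,0.7\n11.0,0.6\n")
summary = summarize_chain(demo)
print(summary)
```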

Module: Detrending

There is only one file output from the detrending module, which will appear in each line’s subdirectory: a plot named detrend.pdf showing the linear fit to the original light curve before subtraction.

In addition, the detrended light curve will be saved to the processed_lcs subdirectory under the name {line_name}_detrended.dat.

Warning

The detrending module takes place after the DRW rejection module. Therefore, the detrended and rejected results will overwrite the purely rejected results in the processed_lcs/ directory under the same filename.

Module: pyCCF

Each line subdirectory (excluding the continuum) will contain a subdirectory pyccf/ for all results from the pyCCF module. This subdirectory will contain the following files:

| Filename | Description | Format | Columns |
| --- | --- | --- | --- |
| {line_name}_ccf_dists.dat | The CCCD and CCPD. | CSV | CCCD, CCPD |
| {line_name}_ccf.dat | The CCF. | CSV | Time lags, CCF |
| {line_name}_ccf.pdf | A figure showing the CCF and output pyCCF distributions (see the pyCCF tutorial). | PDF | |
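A common way to turn the CCCD into a lag estimate is to take its median with the 16th and 84th percentiles as the uncertainty bounds. The sketch below applies that convention to the first column of {line_name}_ccf_dists.dat. It is an illustration rather than pyPetal's own calculation, and it assumes a comma-delimited file with no header. The helper names percentile and lag_from_cccd are hypothetical.

```python
import csv
import io


def percentile(sorted_vals, q):
    """Linearly interpolated percentile (0 <= q <= 100) of a sorted list."""
    pos = (len(sorted_vals) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def lag_from_cccd(fileobj):
    """Return (median, lower error, upper error) from the CCCD column."""
    cccd = sorted(float(row[0]) for row in csv.reader(fileobj) if row)
    med = percentile(cccd, 50)
    return med, med - percentile(cccd, 16), percentile(cccd, 84) - med


# Synthetic stand-in for {line_name}_ccf_dists.dat: CCCD, CCPD on each row.
demo = io.StringIO("".join(f"{i}.0,{i}.5\n" for i in range(101)))
lag, err_lo, err_hi = lag_from_cccd(demo)
print(lag, err_lo, err_hi)
```

The same function can be pointed at the CCPD by reading row[1] instead of row[0].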

Module: pyZDCF

Each line subdirectory (excluding the continuum) will contain a subdirectory pyzdcf/ for all results from the pyZDCF module. This subdirectory will contain the following files:

| Filename | Description | Format | Columns |
| --- | --- | --- | --- |
| {line_name}_{prefix}.dcf | The ZDCF file from pyZDCF. | ASCII | tau, -sig(tau), +sig(tau), dcf, -err(dcf), +err(dcf), #bin |
| {line_name}_zdcf.pdf | A figure showing the ZDCF (see the pyZDCF tutorial). | PDF | |
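The .dcf file can be read as whitespace-separated text using the seven column names listed above. The sketch below is one way to do that, assuming each data row holds exactly seven numeric fields; rows that do not fit (such as a possible header) are skipped. The helper name read_dcf is hypothetical.

```python
import io

# Column names as listed for the {line_name}_{prefix}.dcf file.
DCF_COLUMNS = ("tau", "-sig(tau)", "+sig(tau)", "dcf", "-err(dcf)", "+err(dcf)", "#bin")


def read_dcf(fileobj):
    """Parse a whitespace-separated ZDCF file into a list of row dicts."""
    rows = []
    for line in fileobj:
        parts = line.split()
        if len(parts) != len(DCF_COLUMNS):
            continue  # skip blank or malformed lines
        try:
            values = [float(p) for p in parts]
        except ValueError:
            continue  # skip non-numeric lines such as a header
        rows.append(dict(zip(DCF_COLUMNS, values)))
    return rows


# In-memory stand-in for a small .dcf file.
demo = io.StringIO(
    "-5.0 1.0 1.0 0.10 0.05 0.05 12\n"
    " 0.0 1.0 1.0 0.45 0.04 0.04 15\n"
    " 5.0 1.0 1.0 0.80 0.03 0.03 14\n"
)
zdcf = read_dcf(demo)

# Locate the lag bin with the highest correlation coefficient.
peak = max(zdcf, key=lambda r: r["dcf"])
print(peak["tau"], peak["dcf"])
```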

Module: PLIKE

If PLIKE is run under the pyZDCF module, its results will be stored in the pyzdcf/ directory for a given line. It will add the following additional files:

| Filename | Description | Format | Columns |
| --- | --- | --- | --- |
| {line_name}_plike.out | The PLIKE results. | ASCII | num, lag, -dr, +dr, r, likelihood |

Module: PyROA

Unlike the previous modules, the layout of the output directory and the structure of the files depend on the together parameter.

If together=True, the output directory for all lines will be output_directory/pyroa/. If together=False, each line will have its PyROA results in its own subdirectory, labeled pyroa/.

In addition, PyROA necessitates a directory for all light curves with names and contents in a specific format. This will be the output_directory/pyroa_lcs/ directory.

Each PyROA directory (whether together is True or False) will have the following files:

| Filename | Description | Format | Columns |
| --- | --- | --- | --- |
| samples.obj | The PyROA MCMC samples. | pickle | see below |
| samples_flat.obj | The PyROA MCMC samples, flattened. | pickle | see below |
| Lightcurve_models.obj | The models for the light curves (including the continuum). | pickle | There will be one model for each light curve, and each model will have the time, value, and error for the modeled light curve. |
| X_t.obj | The driving continuum light curve model. | pickle | time, value, error |
| trace_plot.pdf | A figure showing the MCMC trace plots for each parameter, and the cutoff for the specified burn-in. | PDF | |
| histogram_plot.pdf | A figure showing the MCMC posterior histograms for each parameter (excluding burn-in). | PDF | |
| corner_plot.pdf | A figure showing the MCMC corner plot for all parameters (excluding burn-in). | PDF | |
| fits_plot.pdf | A figure analogous to the PyROA fit plots, showing the light curve fits to the data, the time lag distributions, and the delay_dist distributions (if delay_dist=True). | PDF | |

If together=True, the columns of the samples files will be:

| add_var | delay_dist | Columns |
| --- | --- | --- |
| False | False | \(A_0, B_0, \tau_0, A_1, B_1, \tau_1, ..., \Delta\) |
| True | False | \(A_0, B_0, \tau_0, \sigma_0, A_1, B_1, \tau_1, \sigma_1, ..., \Delta\) |
| False | True | \(A_0, B_0, \tau_0, A_1, B_1, \tau_1, \Delta_1, A_2, B_2, \tau_2, \Delta_2, ..., \Delta\) |
| True | True | \(A_0, B_0, \tau_0, \sigma_0, A_1, B_1, \tau_1, \Delta_1, \sigma_1, A_2, B_2, \tau_2, \Delta_2, \sigma_2, ..., \Delta\) |

If together=False, the columns will be the same as for together=True, except the file for each line will only contain samples for the continuum, and that line.
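The column orderings above follow a regular pattern, which the sketch below reproduces as a list of labels. This is only a reading of the table, not a function provided by pyPetal or PyROA. The name pyroa_sample_columns is hypothetical, and index 0 is taken to be the continuum.

```python
def pyroa_sample_columns(n_lines, add_var=False, delay_dist=False):
    """Build column labels; index 0 is the continuum, 1..n_lines are the lines."""
    labels = []
    for i in range(n_lines + 1):
        labels += [f"A_{i}", f"B_{i}", f"tau_{i}"]
        if delay_dist and i > 0:
            labels.append(f"Delta_{i}")  # per-line delay width, lines only
        if add_var:
            labels.append(f"sigma_{i}")  # extra variance term for every light curve
    labels.append("Delta")  # the final, shared Delta column
    return labels


# together=True with two lines, both options enabled.
cols = pyroa_sample_columns(2, add_var=True, delay_dist=True)
print(cols)

# together=False: each file holds the continuum plus one line.
single = pyroa_sample_columns(1)
print(single)
```

Since samples.obj and samples_flat.obj are listed as pickle files, pickle.load is the natural way to open them. The shape of the stored array is not specified here, so compare its last dimension against len(cols) before assigning labels.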

Module: MICA2

Like the PyROA module, the output of this module depends on the together parameter.

If together=True, the output directory for all lines will be output_directory/mica2/. If together=False, each line will have its MICA2 results in its own subdirectory, labeled mica2/.

Each MICA2 directory (whether together is True or False) will have both a data/ and param/ directory, which were used by MICA2 to store the CDNest sampling and output information. To learn more about this data, see the MICA2 and CDNest documentation.

In general, the names of the files will depend on the number of gaussians/tophats used in the analysis. There will be a file for every gaussian used, indicated by a number indexed at 1.

Only two files in the data/ directory are of particular note, both of which are figures:

| Filename | Description |
| --- | --- |
| cdnest_{ngauss}.pdf | A figure showing the post-processing analysis of the diffusive nested sampling process. |
| fig_{ngauss}.pdf | A figure showing the quality of the MICA2 fits, including the center/centroid histograms, the transfer function, and the fits to the light curves. |

Additionally, pyPetal will save the following files in the mica2/ directory:

| Filename | Description | Format | Columns |
| --- | --- | --- | --- |
| cont_recon.dat | The reconstructed continuum light curve. | CSV | time, value, uncertainty |
| {line_name}_recon.dat | The reconstructed line light curve. | CSV | time, value, uncertainty |
| {line_name}_centers_{ngauss}.dat | The output samples for the centers of the gaussians for a given line and gaussian/tophat. | CSV | value |
| {line_name}_centroids_{ngauss}.dat | The output samples for the centroids of the gaussians for a given line and gaussian/tophat. | CSV | value |
| {line_name}_transfunc.dat | The transfer function for a given line and gaussian/tophat. | CSV | tau, transfer_function, lower_uncertainty, upper_uncertainty |

If together=True, the only difference will be that the transfer function file will be named transfunc.dat.

If together=False and no_order=False, the data/ and param/ directories will be located in output_directory/mica2/ and the individual sample files will be located in output_directory/{line_name}/mica2/.

Module: JAVELIN

Unlike the other modules, the layout of the output directory and the structure of the files depend on multiple parameters, in particular together, rm_type, and fixed/p_fix.

If together=True, the output directory for all lines will be output_directory/javelin/. If together=False, each line will have its JAVELIN results in its own subdirectory, labeled javelin/.

If together=True, the output directory will contain the following files:

| Filename | Description | Format | Columns |
| --- | --- | --- | --- |
| burn_cont.txt | The burn-in samples for the initial continuum fit. | ASCII | \(\log(\sigma_{\rm DRW}), \log(\tau_{\rm DRW})\) |
| burn_rmap.txt | The burn-in samples for the total JAVELIN fit. | ASCII | \(\log(\sigma_{\rm DRW}), \log(\tau_{\rm DRW})\), tophat parameters for each line |
| chain_cont.txt | The MCMC chains for the initial continuum fit. | ASCII | \(\log(\sigma_{\rm DRW}), \log(\tau_{\rm DRW})\) |
| chain_rmap.txt | The MCMC chains for the total JAVELIN fit. | ASCII | \(\log(\sigma_{\rm DRW}), \log(\tau_{\rm DRW})\), tophat parameters for each line |
| logp_cont.txt | The log-probability for the initial continuum fit. | ASCII | |
| logp_rmap.txt | The log-probability for the total JAVELIN fit. | ASCII | |
| cont_lcfile.dat | The continuum light curve in JAVELIN format. | ASCII | |
| {line_name}_lcfile.dat | The line light curve in JAVELIN format. There will be one file for each line. | ASCII | |
| {line_name}_lc_fits.dat | The best-fit light curves for each line. There will be one file for each line. | CSV | time, value, uncertainty |
| javelin_histogram.pdf | A figure showing the histograms of the MCMC chains for each parameter. | PDF | |
| javelin_bestfit.pdf | A figure showing the best-fit light curves for each line. | PDF | |
| javelin_corner.pdf | A corner plot for all JAVELIN parameters. | PDF | |

If together=False, the output directory for each line will contain the following files:

| Filename | Description | Format | Columns |
| --- | --- | --- | --- |
| burn_cont.txt | The burn-in samples for the initial continuum fit. | ASCII | \(\log(\sigma_{\rm DRW}), \log(\tau_{\rm DRW})\) |
| burn_rmap.txt | The burn-in samples for the total JAVELIN fit. | ASCII | \(\log(\sigma_{\rm DRW}), \log(\tau_{\rm DRW})\), tophat parameters for the line |
| chain_cont.txt | The MCMC chains for the initial continuum fit. | ASCII | \(\log(\sigma_{\rm DRW}), \log(\tau_{\rm DRW})\) |
| chain_rmap.txt | The MCMC chains for the total JAVELIN fit. | ASCII | \(\log(\sigma_{\rm DRW}), \log(\tau_{\rm DRW})\), tophat parameters for the line |
| logp_cont.txt | The log-probability for the initial continuum fit. | ASCII | |
| logp_rmap.txt | The log-probability for the total JAVELIN fit. | ASCII | |
| cont_lcfile.dat | The continuum light curve in JAVELIN format. | ASCII | |
| tot_lcfile.dat | All light curves in JAVELIN format. | ASCII | |
| {line_name}_lc_fits.dat | The best-fit light curves for the line. | CSV | time, value, uncertainty |
| javelin_histogram.pdf | A figure showing the histograms of the MCMC chains for each parameter. | PDF | |
| javelin_bestfit.pdf | A figure showing the best-fit light curves for each line. | PDF | |
| javelin_corner.pdf | A corner plot for all JAVELIN parameters. | PDF | |

Note

If both DRW parameters (i.e. the first two) are fixed, then there will not be a burn_cont.txt or chain_cont.txt file.

Note

If any parameters are fixed, there will not be a javelin_corner.pdf file.

The number of tophat parameters in the burn and chain files depends on the rm_type argument. If rm_type="spec", there will be 3 tophat parameters for each line (t, w, s). If rm_type="phot", there will be 4 tophat parameters for each line (t, w, s, \(\alpha\)).

If together=True, the tophat parameters will be grouped by line in order. For example, if rm_type="spec", the columns of the chain and burn files will be \(\log(\sigma_{\rm DRW}), \log(\tau_{\rm DRW}), t_1, w_1, s_1, t_2, w_2, s_2, ...\).
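The column layout described above can be written out as a small label generator. The sketch below follows the parameter names listed for each rm_type and assumes that no parameters are fixed and that the "phot" columns are grouped by line in the same way as the "spec" example. The name javelin_chain_columns is hypothetical and is not part of pyPetal or JAVELIN.

```python
# Tophat parameter names per rm_type, as listed in the text above.
TOPHAT_PARAMS = {"spec": ("t", "w", "s"), "phot": ("t", "w", "s", "alpha")}


def javelin_chain_columns(n_lines, rm_type="spec"):
    """Build labels for chain_rmap.txt / burn_rmap.txt (no fixed parameters)."""
    if rm_type not in TOPHAT_PARAMS:
        raise ValueError(f"unknown rm_type: {rm_type!r}")
    labels = ["log_sigma_DRW", "log_tau_DRW"]
    for i in range(1, n_lines + 1):
        # Tophat parameters are grouped by line, in line order.
        labels += [f"{p}_{i}" for p in TOPHAT_PARAMS[rm_type]]
    return labels


# together=True with two lines and rm_type="spec".
spec_cols = javelin_chain_columns(2, rm_type="spec")
print(spec_cols)

# together=False (one line per file) with rm_type="phot".
phot_cols = javelin_chain_columns(1, rm_type="phot")
print(phot_cols)
```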

Module: Weighting

The output of the weighting module depends on whether the pyCCF, JAVELIN, and PyROA modules are run. All results will be stored either in the weights/ subdirectory for each line or in the main output_directory/.

If the pyCCF module is run, the weights/ subdirectory will contain the following files:

| Filename | Description | Format | Columns |
| --- | --- | --- | --- |
| pyccf_weights.dat | The distributions needed to weight the CCCD for the line. | CSV | lags \(\tau\), \(N(\tau)\), \(w(\tau)\), ACF, smoothed CCCD, smoothed weighted CCCD |
| pyccf_weighted_cccd.dat | The downsampled CCCD after weighting and finding the primary peak. | CSV | |

If the JAVELIN module is run, the weights/ subdirectory will contain the following files:

| Filename | Description | Format | Columns |
| --- | --- | --- | --- |
| javelin_weights.dat | The distributions needed to weight the JAVELIN lag distribution \(t\) for the line. | CSV | lags \(\tau\), \(N(\tau)\), \(w(\tau)\), ACF, smoothed \(t\), smoothed weighted \(t\) |
| javelin_weighted_lag_dist.dat | The downsampled \(t\) after weighting and finding the primary peak. | CSV | |

If the PyROA module is run, the weights/ subdirectory will contain the following files:

| Filename | Description | Format | Columns |
| --- | --- | --- | --- |
| pyroa_weights.dat | The distributions needed to weight the PyROA lag distribution \(t\) for the line. | CSV | lags \(\tau\), \(N(\tau)\), \(w(\tau)\), ACF, smoothed \(t\), smoothed weighted \(t\) |
| pyroa_weighted_lag_dist.dat | The downsampled \(t\) after weighting and finding the primary peak. | CSV | |

In addition, the weighting module will always output the following files in the weights/ subdirectory:

| Filename | Description | Format | Columns |
| --- | --- | --- | --- |
| {line_name}_weights.pdf | A figure showing the distributions needed to weight the CCCD, JAVELIN lag distribution, or PyROA lag distribution. | PDF | |
| weight_summary.fits | A FITS table containing the results of the weighting and auxiliary information from the weighting. | FITS | See below |

The weight_summary.fits file contains the following information for each module (pyCCF, JAVELIN, and/or PyROA):

| Name | Description | Type |
| --- | --- | --- |
| k | The exponent used to calculate \(P(\tau)\). | float |
| n0_(module) | The value of \(N(0)\). Given for both the CCCD and \(t\). | float |
| peak_bounds_(module) | The bounds of the primary peak of the weighted distribution. Given as [lower bound, peak, upper bound] for both the CCCD and \(t\). | list of float |
| peak_(module) | The peak of the primary peak. Given for both the CCCD and \(t\). | float |
| lag_(module) | The median of the downsampled lag distribution. Given for both the CCCD and \(t\). | float |
| lag_err_(module) | The uncertainty on the lag. Given as [lower error, upper error] for both the CCCD and \(t\). | list of float |
| frac_rejected_(module) | The fraction of the original distribution that was rejected to obtain the downsampled distribution. Given for both the CCCD and \(t\). | float |
| rmax_(module) | The maximum value of the CCCD within the region covered by the downsampled JAVELIN lag distribution. | float |

where (module) is either pyccf, javelin, or pyroa.

Note

If a module is not run, its values in weight_summary.fits will be NaN.

Note

If pyCCF is not run, rmax_(module) will be NaN.
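Since entries for modules that were not run are NaN, downstream code should filter them out before comparing lags. The sketch below shows the idea on a plain dictionary with hypothetical values standing in for one row of weight_summary.fits. Reading the FITS table itself is left out, and the helper name modules_with_lags is an assumption.

```python
import math

# Module suffixes used in the weight_summary.fits column names.
MODULES = ("pyccf", "javelin", "pyroa")


def modules_with_lags(summary):
    """Return the modules whose lag_(module) entry is not NaN."""
    return [m for m in MODULES if not math.isnan(summary[f"lag_{m}"])]


# Hypothetical values: JAVELIN was not run, so its lag is NaN.
summary = {"lag_pyccf": 12.3, "lag_javelin": float("nan"), "lag_pyroa": 11.8}
available = modules_with_lags(summary)
print(available)
```

Testing with math.isnan avoids the common mistake of comparing against NaN with ==, which is always False.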

In addition, the following files will be placed in the main output_directory/:

| Filename | Description | Format | Columns |
| --- | --- | --- | --- |
| pyccf_weights_res.pdf | A figure showing the output of the weighting process for the CCCD. | PDF | |
| javelin_weights_res.pdf | A figure showing the output of the weighting process for the JAVELIN lag distribution. | PDF | |
| pyroa_weights_res.pdf | A figure showing the output of the weighting process for the PyROA lag distribution. | PDF | |