pyPetal Output Files and Directories

The diagnostic output from pyPetal, and the structure of the output directory, depend on which modules are run and on the parameters chosen for each module. In general, running all of the modules with three lines (named “cont”, “line1”, and “line2”) will produce the following directory structure:

output_directory/
├── cont/
│   ├── drw_rej/
│   └── detrend.pdf
├── line1/
│   ├── drw_rej/
│   ├── mica2/
│   ├── pyccf/
│   ├── pyzdcf/
│   ├── javelin/
│   ├── weights/
│   └── detrend.pdf
├── line2/
│   ├── drw_rej/
│   ├── mica2/
│   ├── pyccf/
│   ├── pyzdcf/
│   ├── javelin/
│   ├── weights/
│   └── detrend.pdf
├── processed_lcs/
├── pyroa/
├── pyroa_lcs/
├── light_curves/
├── mica2_weights_res.pdf
├── pyccf_weights_res.pdf
├── pyroa_weights_res.pdf
└── javelin_weights_res.pdf

Each line will have its own subdirectory, labeled with the name given for it in the line_names argument of pypetal.pipeline.run_pipeline. Each line subdirectory will in turn contain one subdirectory per module, depending on which modules are run.

The processed_lcs subdirectory contains the light curves for all input lines after the processing steps (i.e., DRW-based outlier rejection and detrending). This subdirectory will not exist if neither DRW-based outlier rejection nor detrending is run.

The light_curves subdirectory contains the original light curves for all input lines before processing steps. In addition to the light curves, these files will contain the masks produced in the DRW-based outlier rejection module if it is run.

In addition, each of these subdirectories contains files with the results of the pyPetal modules. We describe the files output from each of these modules below.
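As a rough illustration of the layout described above, the sketch below reports which module subdirectories exist for each line. The helper name `find_module_dirs` and the `MODULE_DIRS` tuple are hypothetical and not part of pyPetal; the subdirectory names are taken from the directory tree shown earlier.

```python
from pathlib import Path

# Module subdirectory names as they appear in the directory tree above.
MODULE_DIRS = ("drw_rej", "pyccf", "pyzdcf", "pyroa", "mica2", "javelin", "weights")


def find_module_dirs(output_dir, line_names):
    """Map each line name to the module subdirectories that exist for it."""
    output_dir = Path(output_dir)
    found = {}
    for name in line_names:
        line_dir = output_dir / name
        # Keep only the module directories actually present on disk.
        found[name] = [m for m in MODULE_DIRS if (line_dir / m).is_dir()]
    return found
```

Calling `find_module_dirs("output_directory", ["cont", "line1", "line2"])` returns a dictionary keyed by line name, with an empty list for any line that has no module output.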

Light Curves

Regardless of the modules run, pyPetal will produce a light_curves subdirectory within the main output_directory. This contains the original light curves obtained through the arg2 argument of pypetal.pipeline.run_pipeline. These light curves will be named {line_name}.dat, where line_name is the name of the given light curve input to the pyPetal pipeline. These will be formatted as CSV files, with the first three columns representing the times, values, and uncertainties of the light curve.

If the DRW-based outlier rejection module is run, these light curve files will contain a fourth column with the DRW-rejection mask. This mask will consist of booleans, where True means a point was rejected.
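A minimal sketch of reading such a file with the standard library follows. It assumes comma-separated values, that any header or comment line starts with `#`, and that the mask is written as the text `True`/`False`; none of these details are confirmed here. The function name `read_light_curve` is hypothetical.

```python
import csv
import io


def read_light_curve(fileobj):
    """Parse times, values, uncertainties, and an optional rejection mask."""
    times, values, errors, mask = [], [], [], []
    for row in csv.reader(fileobj):
        # Skip blank lines and comment/header lines (assumed to start with '#').
        if not row or row[0].lstrip().startswith("#"):
            continue
        times.append(float(row[0]))
        values.append(float(row[1]))
        errors.append(float(row[2]))
        if len(row) > 3:
            # Assumption: the mask column holds the text "True" or "False".
            mask.append(row[3].strip() == "True")
    # Return None for the mask when the file has only three columns.
    return times, values, errors, (mask or None)


# Usage with an in-memory stand-in for {line_name}.dat:
sample = io.StringIO("1.0,2.5,0.1,False\n2.0,9.9,0.1,True\n")
t, f, e, rejected = read_light_curve(sample)
kept_times = [ti for ti, r in zip(t, rejected) if not r]
```

Since True marks a rejected point, the retained data are the rows where the mask is False.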

Module: DRW-based Outlier Rejection

The DRW Rejection module file output is unique in that it depends on the user’s reject_data argument. For all lines with reject_data=True, their subdirectory will contain a drw_rej/ subdirectory. This subdirectory will contain the following files:

Filename	Description	Format	Columns
`{line_name}_chain.dat`	The MCMC chains for the DRW parameters from the fit.	CSV	\(\tau_{\rm DRW}, \sigma_{\rm DRW}, \sigma_n\)
`{line_name}_drw_fit.dat`	The DRW fit to the light curve.	CSV	time, value, uncertainty
`{line_name}_mask.dat`	The DRW-based outlier rejection mask.	CSV
`{line_name}_drw_fit.pdf`	A figure describing the DRW fit to the light curve (see the DRW rejection tutorial).	PDF

Note

If jitter=False for the module, there will only be two columns in {line_name}_chain.dat, and the jitter term \(\sigma_n\) will not be included.

In addition, pyPetal will save the light curve excluding the rejected points to the processed_lcs subdirectory under the name {line_name}_data.dat.
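Because the number of columns in {line_name}_chain.dat depends on jitter, a reader may want to label the columns from the column count. The sketch below does this, assuming comma-separated values, `#`-prefixed comment lines, and the column order given in the table above. The names `read_drw_chain` and `DRW_LABELS` are hypothetical.

```python
import csv
import io

# Column order as listed in the table above; sigma_n is absent when jitter=False.
DRW_LABELS = ("tau_DRW", "sigma_DRW", "sigma_n")


def read_drw_chain(fileobj):
    """Return {label: samples}, labelling 2 or 3 columns in table order."""
    rows = []
    for row in csv.reader(fileobj):
        # Skip blank lines and comment lines (assumed to start with '#').
        if not row or row[0].lstrip().startswith("#"):
            continue
        rows.append([float(x) for x in row])
    if not rows:
        return {}
    n_columns = len(rows[0])
    if n_columns not in (2, 3):
        raise ValueError(f"expected 2 or 3 columns, got {n_columns}")
    return {DRW_LABELS[i]: [r[i] for r in rows] for i in range(n_columns)}


# Usage with an in-memory two-column chain (the jitter=False case):
chain = read_drw_chain(io.StringIO("100.0,0.3\n120.0,0.4\n"))
has_jitter = "sigma_n" in chain
```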

Module: Detrending

There is only one file output from the detrending module, which will appear in each line’s subdirectory. This is a plot named detrend.pdf, showing the linear fit to the original light curve before subtraction.

In addition, the detrended light curve will be saved to the processed_lcs subdirectory under the name {line_name}_data.dat.

Warning

The detrending module takes place after the DRW rejection module. Therefore, the detrended and rejected results will overwrite the purely rejected results in the processed_lcs/ directory under the same filename.

Module: PyCCF

Each line subdirectory (excluding the continuum) will contain a subdirectory pyccf/ for all results from the pyCCF module. This subdirectory will contain the following files:

Filename	Description	Format	Columns
`{line_name}_ccf_dists.dat`	The CCCD and CCPD.	CSV	CCCD, CCPD
`{line_name}_ccf.dat`	The CCF.	CSV	Time lags, CCF
`{line_name}_ccf.pdf`	A figure showing the CCF and output pyCCF distributions (see the pyCCF tutorial).	PDF

Module: pyZDCF

Each line subdirectory (excluding the continuum) will contain a subdirectory pyzdcf/ for all results from the pyZDCF module. This subdirectory will contain the following files:

Filename	Description	Format	Columns
`{line_name}_{prefix}.dcf`	The ZDCF file from pyZDCF.	ASCII	tau, -sig(tau), +sig(tau), dcf, -err(dcf), +err(dcf), #bin
`{line_name}_zdcf.pdf`	A figure showing the ZDCF (see the pyZDCF tutorial).	PDF
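The sketch below parses a ZDCF file into rows keyed by the column names in the table above. It assumes the file is whitespace-delimited with seven numeric columns per row, and it skips any line that does not fit that shape. The names `read_zdcf` and `ZDCF_COLUMNS` are hypothetical.

```python
import io

# Column names as listed in the table above.
ZDCF_COLUMNS = ("tau", "-sig(tau)", "+sig(tau)", "dcf", "-err(dcf)", "+err(dcf)", "#bin")


def read_zdcf(fileobj):
    """Parse a whitespace-delimited ZDCF file into a list of row dictionaries."""
    rows = []
    for line in fileobj:
        parts = line.split()
        if len(parts) != len(ZDCF_COLUMNS):
            continue  # skip blank or malformed lines
        try:
            numbers = [float(p) for p in parts]
        except ValueError:
            continue  # skip non-numeric lines such as a header
        rows.append(dict(zip(ZDCF_COLUMNS, numbers)))
    return rows


# Usage with an in-memory stand-in for {line_name}_{prefix}.dcf:
text = "-10.0 1.0 1.0 0.20 0.05 0.05 12\n5.0 1.5 1.5 0.80 0.04 0.04 15\n"
rows = read_zdcf(io.StringIO(text))
peak_lag = max(rows, key=lambda r: r["dcf"])["tau"]
```

Here `peak_lag` is simply the lag of the largest ZDCF value in the sample; it is not a substitute for the PLIKE estimate described below.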

Module: PLIKE

If PLIKE is run under the pyZDCF module, its results will be stored in the pyzdcf/ directory for a given line. It will add the following additional files:

Filename	Description	Format	Columns
`{line_name}_plike.out`	The PLIKE results.	ASCII	num, lag, -dr, +dr, r, likelihood

Module: PyROA

Unlike the previous modules, the layout of the output directory and the structure of the files depend on the together parameter.

If together=True, the output directory for all lines will be output_directory/pyroa/. If together=False, each line will have its PyROA results in its own subdirectory, labeled pyroa/.

In addition, PyROA necessitates a directory for all light curves with names and contents in a specific format. This will be the output_directory/pyroa_lcs/ directory.
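The effect of together on the directory layout can be sketched as a small path helper. The function name `pyroa_result_dirs` is hypothetical, and `line_names` is assumed to list only the lines that receive their own PyROA results.

```python
from pathlib import Path


def pyroa_result_dirs(output_dir, line_names, together):
    """Return the directories where PyROA results are expected to be written."""
    output_dir = Path(output_dir)
    if together:
        # One shared directory for all lines.
        return [output_dir / "pyroa"]
    # One pyroa/ subdirectory inside each line's own directory.
    return [output_dir / name / "pyroa" for name in line_names]
```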

Each PyROA directory (whether together is True or False) will have the following files:

Filename	Description	Format	Columns
`samples.obj`	The PyROA MCMC samples.	pickle	see below
`samples_flat.obj`	The PyROA MCMC samples, flattened.	pickle	see below
`Lightcurve_models.obj`	The models for the light curves (including the continuum).	pickle	There will be one model for each light curve, and each model will have the time, value, and error for the modeled light curve.
`X_t.obj`	The driving continuum light curve model.	pickle	time, value, error
`trace_plot.pdf`	A figure showing the MCMC trace plots for each parameter, and the cutoff for the specified burn-in.	PDF
`histogram_plot.pdf`	A figure showing the MCMC posterior histograms for each parameter (excluding burn-in).	PDF
`corner_plot.pdf`	A figure showing the MCMC corner plot for all parameters (excluding burn-in).	PDF
`fits_plot.pdf`	A figure analogous to the PyROA fit plots, showing the light curve fits to the data, the time lag distributions, and the `delay_dist` distributions (if `delay_dist=True`).	PDF

If together=True, the columns of the samples files will be:

`add_var`	`delay_dist`	Columns
`False`	`False`	\(A_0, B_0, \tau_0, A_1, B_1, \tau_1, ..., \Delta\)
`True`	`False`	\(A_0, B_0, \tau_0, \sigma_0, A_1, B_1, \tau_1, \sigma_1, ..., \Delta\)
`False`	`True`	\(A_0, B_0, \tau_0, A_1, B_1, \tau_1, \Delta_1, A_2, B_2, \tau_2, \Delta_2, ..., \Delta\)
`True`	`True`	\(A_0, B_0, \tau_0, \sigma_0, A_1, B_1, \tau_1, \Delta_1, \sigma_1, A_2, B_2, \tau_2, \Delta_2, \sigma_2, ..., \Delta\)

If together=False, the columns will be the same as for together=True, except the file for each line will only contain samples for the continuum, and that line.
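The column layouts in the table can be generated programmatically, which may help when labelling a samples array. The sketch below follows the table's pattern, with index 0 as the continuum. The function name `pyroa_sample_columns` and the plain-text labels are hypothetical, and the layout should be checked against the actual samples before use.

```python
def pyroa_sample_columns(n_lines, add_var=False, delay_dist=False):
    """Label sample columns for a continuum (index 0) plus n_lines lines."""
    columns = []
    for i in range(n_lines + 1):
        columns += [f"A_{i}", f"B_{i}", f"tau_{i}"]
        # Per the table, only the lines (i > 0) carry a delay-distribution width.
        if delay_dist and i > 0:
            columns.append(f"Delta_{i}")
        # With add_var=True, every light curve gets an extra-variance term.
        if add_var:
            columns.append(f"sigma_{i}")
    columns.append("Delta")  # the final column in every layout
    return columns
```

For together=False, each line's file contains only the continuum and that line, which corresponds to `n_lines=1`.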

Module: MICA2

Like the PyROA module, the output of this module depends on the together parameter.

If together=True, the output directory for all lines will be output_directory/mica2/. If together=False, each line will have its MICA2 results in its own subdirectory, labeled mica2/.

Each MICA2 directory (whether together is True or False) will have both a data/ and param/ directory, which were used by MICA2 to store the CDNest sampling and output information. To learn more about this data, see the MICA2 and CDNest documentation.

In general, the names of the files will depend on the number of gaussians/tophats used in the analysis. There will be a file for every gaussian used, indicated by a number indexed at 1.

Only two files in the data/ directory are of note, both of which are figures:

Filename	Description
`cdnest_{ngauss}.pdf`	A figure showing the post-processing analysis of the diffusive nested sampling process.
`fig_{ngauss}.pdf`	A figure showing the quality of the MICA2 fits, including the center/centroid histograms, the transfer function, and the fits to the light curves.

Additionally, pyPetal will save the following files in the mica2/ directory:

Filename	Description	Format	Columns
`cont_recon.dat`	The reconstructed continuum light curve.	CSV	time, value, uncertainty
`{line_name}_recon.dat`	The reconstructed line light curve.	CSV	time, value, uncertainty
`{line_name}_centers_{ngauss}.dat`	The output samples for the centers of the gaussians for a given line and gaussian/tophat.	CSV	value
`{line_name}_centroids_{ngauss}.dat`	The output samples for the centroids of the gaussians for a given line and gaussian/tophat.	CSV	value
`{line_name}_transfunc.dat`	The transfer function for a given line and gaussian/tophat.	CSV	tau, transfer_function, lower_uncertainty, upper_uncertainty

If together=True, the only difference will be that the transfer function file will be named transfunc.dat.

If together=False and no_order=False, the data/ and param/ directories will be located in output_directory/mica2/ and the individual sample files will be located in output_directory/{line_name}/mica2/.

Module: JAVELIN

Unlike the other modules, the layout of the output directory and the structure of the files depend on multiple parameters, in particular together, rm_type, and fixed/p_fix.

If together=True, the output directory for all lines will be output_directory/javelin/. If together=False, each line will have its JAVELIN results in its own subdirectory, labeled javelin/.

If together=True, the output directory will contain the following files:

Filename	Description	Format	Columns
`burn_cont.txt`	The burn-in samples for the initial continuum fit.	ASCII	\(\log(\sigma_{\rm DRW}), \log(\tau_{\rm DRW})\)
`burn_rmap.txt`	The burn-in samples for the total JAVELIN fit.	ASCII	\(\log(\sigma_{\rm DRW}), \log(\tau_{\rm DRW})\), tophat parameters for each line
`chain_cont.txt`	The MCMC chains for the initial continuum fit.	ASCII	\(\log(\sigma_{\rm DRW}), \log(\tau_{\rm DRW})\)
`chain_rmap.txt`	The MCMC chains for the total JAVELIN fit.	ASCII	\(\log(\sigma_{\rm DRW}), \log(\tau_{\rm DRW})\), tophat parameters for each line
`logp_cont.txt`	The log-probability for the initial continuum fit.	ASCII
`logp_rmap.txt`	The log-probability for the total JAVELIN fit.	ASCII
`cont_lcfile.dat`	The continuum light curve in JAVELIN format.	ASCII
`{line_name}_lcfile.dat`	The line light curve in JAVELIN format. There will be one file for each line.	ASCII
`{line_name}_lc_fits.dat`	The best-fit light curves for each line. There will be one file for each line.	CSV	time, value, uncertainty
`javelin_histogram.pdf`	A figure showing the histograms of the MCMC chains for each parameter.	PDF
`javelin_bestfit.pdf`	A figure showing the best-fit light curves for each line.	PDF
`javelin_corner.pdf`	A corner plot for all JAVELIN parameters.	PDF

If together=False, the output directory for each line will contain the following files:

Filename	Description	Format	Columns
`burn_cont.txt`	The burn-in samples for the initial continuum fit.	ASCII	\(\log(\sigma_{\rm DRW}), \log(\tau_{\rm DRW})\)
`burn_rmap.txt`	The burn-in samples for the total JAVELIN fit.	ASCII	\(\log(\sigma_{\rm DRW}), \log(\tau_{\rm DRW})\), tophat parameters for the line
`chain_cont.txt`	The MCMC chains for the initial continuum fit.	ASCII	\(\log(\sigma_{\rm DRW}), \log(\tau_{\rm DRW})\)
`chain_rmap.txt`	The MCMC chains for the total JAVELIN fit.	ASCII	\(\log(\sigma_{\rm DRW}), \log(\tau_{\rm DRW})\), tophat parameters for the line
`logp_cont.txt`	The log-probability for the initial continuum fit.	ASCII
`logp_rmap.txt`	The log-probability for the total JAVELIN fit.	ASCII
`cont_lcfile.dat`	The continuum light curve in JAVELIN format.	ASCII
`tot_lcfile.dat`	All light curves in JAVELIN format.	ASCII
`{line_name}_lc_fits.dat`	The best-fit light curves for the line.	CSV	time, value, uncertainty
`javelin_histogram.pdf`	A figure showing the histograms of the MCMC chains for each parameter.	PDF
`javelin_bestfit.pdf`	A figure showing the best-fit light curves for each line.	PDF
`javelin_corner.pdf`	A corner plot for all JAVELIN parameters.	PDF

Note

If both DRW parameters (i.e. the first two) are fixed, then there will not be a burn_cont.txt or chain_cont.txt file.

Note

If any parameters are fixed, there will not be a javelin_corner.pdf file.

The number of tophat parameters in the burn and chain files depends on the rm_type argument. If rm_type="spec", there will be 3 tophat parameters for each line (t, w, s). If rm_type="phot", there will be 4 tophat parameters for each line (t, w, s, \(\alpha\)).

If together=True, the tophat parameters will be grouped by line in order. For example, if rm_type="spec", the columns of the chain and burn files will be \(\log(\sigma_{\rm DRW}), \log(\tau_{\rm DRW}), t_1, w_1, s_1, t_2, w_2, s_2, ...\).
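The column ordering described above can be sketched as a label generator for the chain_rmap.txt and burn_rmap.txt files. The function name `javelin_chain_columns` and the label strings are hypothetical. The sketch also ignores the fixed/p_fix arguments, so it should be treated as a guide only when no parameters are fixed.

```python
def javelin_chain_columns(line_names, rm_type="spec"):
    """Label the DRW and per-line tophat columns of a JAVELIN rmap chain."""
    # Tophat parameters per line for each rm_type, as described above.
    per_line = {"spec": ("t", "w", "s"), "phot": ("t", "w", "s", "alpha")}
    if rm_type not in per_line:
        raise ValueError(f"rm_type must be 'spec' or 'phot', got {rm_type!r}")
    # The two DRW parameters always come first.
    columns = ["log_sigma_DRW", "log_tau_DRW"]
    # Tophat parameters are grouped by line, in the order the lines were given.
    for name in line_names:
        columns += [f"{param}_{name}" for param in per_line[rm_type]]
    return columns
```

For together=False, the same function applies with a single-element `line_names` list.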

Module: Weighting

The output of the weighting module depends on whether the pyCCF, JAVELIN, and PyROA modules are run. All results will be stored either in the weights/ subdirectory for each line or in the main output_directory/.

If the pyCCF module is run, the weights/ subdirectory will contain the following files:

Filename	Description	Format	Columns
`pyccf_weights.dat`	The distributions needed to weight the CCCD for the line.	CSV	lags \(\tau\) , \(N(\tau)\), \(w(\tau)\), ACF, smoothed CCCD, smoothed weighted CCCD
`pyccf_weighted_cccd.dat`	The downsampled CCCD after weighting and finding the primary peak.	CSV

If the JAVELIN module is run, the weights/ subdirectory will contain the following files:

Filename	Description	Format	Columns
`javelin_weights.dat`	The distributions needed to weight the JAVELIN lag distribution \(t\) for the line.	CSV	lags \(\tau\) , \(N(\tau)\), \(w(\tau)\), ACF, smoothed \(t\), smoothed weighted \(t\)
`javelin_weighted_lag_dist.dat`	The downsampled \(t\) after weighting and finding the primary peak.	CSV

If the PyROA module is run, the weights/ subdirectory will contain the following files:

Filename	Description	Format	Columns
`pyroa_weights.dat`	The distributions needed to weight the PyROA lag distribution \(t\) for the line.	CSV	lags \(\tau\) , \(N(\tau)\), \(w(\tau)\), ACF, smoothed \(t\), smoothed weighted \(t\)
`pyroa_weighted_lag_dist.dat`	The downsampled \(t\) after weighting and finding the primary peak.	CSV

In addition, the weighting module will always output the following files in the weights/ subdirectory:

Filename	Description	Format	Columns
`{line_name}_weights.pdf`	A figure showing the distributions needed to weight the CCCD, JAVELIN lag distribution, or PyROA lag distribution.	PDF
`weight_summary.fits`	A FITS table containing the results of the weighting and auxiliary information from the weighting.	FITS	See below

The weight_summary.fits file contains the following information for each module (pyCCF, JAVELIN, and/or PyROA):

Name	Description	Type
`k`	The exponent used to calculate \(P(\tau)\)	`float`
`n0_(module)`	The value of \(N(0)\). Given for both the CCCD and \(t\).	`float`
`peak_bounds_(module)`	The bounds of the primary peak of the weighted distribution. Given as [lower bound, peak, upper bound] for both the CCCD and \(t\).	list of `float`
`peak_(module)`	The peak of the primary peak. Given for both the CCCD and \(t\).	`float`
`lag_(module)`	The median of the downsampled lag distribution. Given for both the CCCD and \(t\).	`float`
`lag_err_(module)`	The uncertainty on the lag. Given as [lower error, upper error] for both the CCCD and \(t\).	list of `float`
`frac_rejected_(module)`	The fraction of the original distribution that was rejected to obtain the downsampled distribution. Given for both the CCCD and \(t\)	`float`
`rmax_(module)`	The maximum value of the CCCD within the region covered by the downsampled JAVELIN lag distribution.	`float`

where (module) is either pyccf, javelin, or pyroa.

Note

If a module is not run, its values in weight_summary.fits will be NaN.

Note

If pyCCF is not run, rmax_(module) will be NaN.

In addition, the following files will be placed in the main output_directory/:

Filename	Description	Format
`pyccf_weights_res.pdf`	A figure showing the output of the weighting process for the CCCD.	PDF
`javelin_weights_res.pdf`	A figure showing the output of the weighting process for the JAVELIN lag distribution.	PDF
`pyroa_weights_res.pdf`	A figure showing the output of the weighting process for the PyROA lag distribution.	PDF