Detection Methodology

How S.O.L.A.R.I.S. identifies exoplanet candidates in NASA TESS photometric data

Data Source: NASA TESS Mission

S.O.L.A.R.I.S. analyzes photometric time-series data from NASA's Transiting Exoplanet Survey Satellite (TESS), an all-sky survey mission launched in 2018 that monitors over 200,000 stars for brightness variations. The satellite observes each sector of sky for approximately 27 days, capturing precise brightness measurements at regular intervals.

The pipeline ingests 2-minute cadence SPOC (Science Processing Operations Center) light curves distributed as FITS files through the Mikulski Archive for Space Telescopes (MAST). Each light curve contains thousands of flux measurements recording how a star's brightness changes over time. We preferentially use PDCSAP (Pre-search Data Conditioning Simple Aperture Photometry) flux, which has already been corrected for instrumental systematics by the SPOC pipeline.

Cadence: 2-minute exposures yielding ~19,000 data points per 27-day sector
Format: FITS binary tables with TIME, PDCSAP_FLUX, and PDCSAP_FLUX_ERR columns
Target Selection: M-dwarf stars (Teff < 4,000 K) prioritized for closer habitable zones and deeper transits
Archive: Downloaded via the lightkurve Python library from the MAST archive

Our M-dwarf targeting strategy focuses on cool red dwarf stars with effective temperatures below 4,000 K. These stars are ideal transit-detection targets because their smaller radii produce deeper transit signals for a given planet size, and their closer-in habitable zones yield shorter orbital periods, increasing the probability of observing multiple transits within a single TESS sector.
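To make the depth argument concrete, the sketch below uses the standard approximation that transit depth is about (Rp/Rs)^2, ignoring limb darkening. The function name and the two host-star radii (1.0 and 0.3 solar radii) are illustrative assumptions, not values taken from the pipeline.

```python
# Approximate transit depth: depth ~ (Rp / Rs)^2, ignoring limb darkening.
R_SUN_KM = 695_700.0    # nominal solar radius
R_EARTH_KM = 6_378.1    # Earth's equatorial radius


def transit_depth_ppm(planet_radius_earth, star_radius_sun):
    """Return the approximate transit depth in parts per million."""
    ratio = (planet_radius_earth * R_EARTH_KM) / (star_radius_sun * R_SUN_KM)
    return ratio ** 2 * 1e6


# Illustrative comparison: the same Earth-size planet around two host stars.
depth_sun_like = transit_depth_ppm(1.0, 1.0)  # Sun-like host
depth_m_dwarf = transit_depth_ppm(1.0, 0.3)   # hypothetical 0.3 R_sun M dwarf
```

Under these assumptions the M-dwarf transit is deeper by a factor of (1 / 0.3)^2, roughly 11, which is why the same planet is far easier to detect around the smaller star.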

Detection Pipeline: 7-Stage Processing

Every star passes through a fully automated seven-stage pipeline. Each stage applies validated algorithms and quantitative thresholds to progressively refine raw photometric data into scored exoplanet candidates.

Stage 1: Data Acquisition & Normalization

TESS light curves are downloaded from NASA's MAST archive via the lightkurve library. Long-term stellar variability trends are removed with Savitzky-Golay (local polynomial) or spline detrending. Flux values are then divided by their median, giving dimensionless relative flux centered at 1.0 so that stars of different magnitudes can be compared directly.

source: MAST/TESS SPOC  |  flux_type: PDCSAP
normalization: median division  |  detrending: Savitzky-Golay or spline
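The median-division step can be sketched as follows. This is a minimal illustration, not the pipeline's code: the function name is hypothetical, and dropping non-finite cadences before normalizing is an assumption about how missing values are handled.

```python
import math
from statistics import median


def median_normalize(time, flux, flux_err):
    """Drop non-finite cadences, then divide flux and errors by the median flux."""
    rows = [
        (t, f, e)
        for t, f, e in zip(time, flux, flux_err)
        if math.isfinite(t) and math.isfinite(f) and math.isfinite(e)
    ]
    if not rows:
        raise ValueError("no finite cadences to normalize")
    med = median(f for _, f, _ in rows)
    if med <= 0:
        raise ValueError("median flux must be positive to normalize")
    norm_time = [t for t, _, _ in rows]
    norm_flux = [f / med for _, f, _ in rows]
    norm_err = [e / med for _, _, e in rows]  # errors scale by the same factor
    return norm_time, norm_flux, norm_err
```

Dividing the uncertainties by the same median keeps the relative errors unchanged, so later signal-to-noise estimates are unaffected by the rescaling.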

Stage 2: Noise Filtering

Outlier data points caused by cosmic rays, spacecraft jitter, or instrumental anomalies are removed via iterative sigma-clipping. Residual systematics are suppressed with Savitzky-Golay smoothing filters. Cotrending Basis Vector (CBV) corrections address correlated noise shared across targets on the same CCD. Gaps longer than 0.5 days, such as those around spacecraft momentum dumps, are flagged and handled so that they do not produce false periodicity signals.

sigma_clip: 5-sigma upper, 3-sigma lower
savgol_window: 101 cadences  |  savgol_order: 2
cbv_correction: first 4 basis vectors  |  gap_threshold: >0.5 days
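The clipping and gap-handling settings above can be illustrated with a short sketch. It assumes clipping around the median with the population standard deviation and a fixed iteration cap; the function names and the `max_iter` parameter are hypothetical, and the pipeline may use a different centre or spread estimator.

```python
from statistics import median, pstdev


def sigma_clip(time, flux, upper=5.0, lower=3.0, max_iter=5):
    """Iteratively drop points above median + upper*std or below median - lower*std."""
    t, f = list(time), list(flux)
    for _ in range(max_iter):
        if len(f) < 2:
            break
        center, spread = median(f), pstdev(f)
        kept = [
            (ti, fi)
            for ti, fi in zip(t, f)
            if center - lower * spread <= fi <= center + upper * spread
        ]
        if len(kept) == len(f):
            break  # nothing removed, so the clipping has converged
        t = [ti for ti, _ in kept]
        f = [fi for _, fi in kept]
    return t, f


def split_at_gaps(time, gap_threshold=0.5):
    """Return (start, stop) index pairs for segments separated by long gaps (days)."""
    if not time:
        return []
    segments, start = [], 0
    for i in range(1, len(time)):
        if time[i] - time[i - 1] > gap_threshold:
            segments.append((start, i))
            start = i
    segments.append((start, len(time)))
    return segments
```

The thresholds are asymmetric because transits are downward excursions, so the lower bound determines how much genuine in-transit signal survives clipping. Splitting at gaps lets each continuous segment be filtered on its own.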

Stage 3: Transit Detection (BLS)

The Box Least Squares algorithm (Kovacs, Zucker & Mazeh 2002) searches for periodic box-shaped brightness dips characteristic of planetary transits. The algorithm evaluates over 5,000 trial periods between 0.5 and 15 days, fitting a box-shaped model at each period to find the best-matching transit signal. The highest-power period is reported along with the transit epoch and depth.

algorithm: Box Least Squares (BLS)
trial_periods: 5,000+ between 0.5 – 15 days
min_transit_duration: 0.01 × period  |  max_transit_duration: 0.05 × period
outputs: best period, epoch (t0), depth, SNR, FAP
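For readers unfamiliar with BLS, the following is a deliberately simple, brute-force sketch of the idea in pure Python. It is not the pipeline's implementation, and an optimized routine such as astropy's BoxLeastSquares would normally be preferable. The function name, the `n_bins` parameter, and the returned dictionary are assumptions for illustration. It uses equal weights for all points and does not compute SNR or FAP.

```python
def bls_search(time, flux, periods, n_bins=200, q_min=0.01, q_max=0.05):
    """Brute-force BLS: return the highest-power box fit over the trial periods.

    q_min and q_max bound the transit duration as a fraction of the period.
    The returned t0 is the box midpoint modulo the period, relative to time zero.
    """
    n = len(time)
    mean_flux = sum(flux) / n
    resid = [f - mean_flux for f in flux]
    min_width = max(1, round(q_min * n_bins))
    max_width = max(min_width, round(q_max * n_bins))
    best = {"power": 0.0, "period": None, "t0": None, "depth": None}
    for period in periods:
        # Fold at this trial period and accumulate residuals in phase bins.
        bin_sum = [0.0] * n_bins
        bin_count = [0] * n_bins
        for t, r in zip(time, resid):
            k = int(((t / period) % 1.0) * n_bins) % n_bins
            bin_sum[k] += r
            bin_count[k] += 1
        # Slide a box over the bins, wrapping around phase 1.
        for start in range(n_bins):
            s, c = 0.0, 0
            for width in range(1, max_width + 1):
                k = (start + width - 1) % n_bins
                s += bin_sum[k]
                c += bin_count[k]
                if width < min_width or c == 0 or c == n or s >= 0:
                    continue  # too narrow, empty, full, or not a dip
                power = s * s / (c * (n - c))  # equal-weight signal residue squared
                if power > best["power"]:
                    best = {
                        "power": power,
                        "period": period,
                        "t0": (start + width / 2) / n_bins * period,
                        "depth": -s * n / (c * (n - c)),  # out-of-box minus in-box level
                    }
    return best
```

A hypothetical call with the period range from this stage would be `bls_search(time, flux, [0.5 + 14.5 * i / 4999 for i in range(5000)])`. Pure Python is slow at that grid size, which is one reason a vectorized library routine is the practical choice.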

Stage 4: Orbital Parameter Fitting (MCMC)

Once a candidate transit signal is identified, Markov Chain Monte Carlo sampling via emcee (Foreman-Mackey et al. 2013) is used to determine precise orbital and physical parameters. The sampler explores the posterior probability distribution of five key parameters. For each parameter, the median (50th percentile) of the marginalized posterior is reported as the best-fit value, and the 16th and 84th percentiles define its uncertainty interval.

sampler: emcee  |  walkers: 32  |  steps: 2,500 (500 burn-in)
param_1: orbital period (P)
param_2: transit epoch (t0)
param_3: planet-star radius ratio (Rp/Rs)
param_4: scaled semi-major axis (a/Rs)
param_5: orbital inclination (i)
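The percentile summary described above can be sketched for a single parameter as follows. The input is assumed to be a flat list of posterior samples with burn-in already discarded (for example, one column of a flattened emcee chain). The function names are hypothetical, and linear interpolation between order statistics is one common percentile convention among several.

```python
def percentile(sorted_values, q):
    """Linearly interpolated percentile (q in [0, 100]) of pre-sorted values."""
    if not sorted_values:
        raise ValueError("need at least one sample")
    pos = (len(sorted_values) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1.0 - frac) + sorted_values[hi] * frac


def summarize_posterior(samples):
    """Return (median, minus_err, plus_err) from the 16th/50th/84th percentiles."""
    ordered = sorted(samples)
    p16, p50, p84 = (percentile(ordered, q) for q in (16, 50, 84))
    return p50, p50 - p16, p84 - p50
```

Reporting separate lower and upper errors preserves any asymmetry in the posterior, which is common for parameters such as inclination that are bounded near 90 degrees.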

Stage 5: Candidate Scoring

Each candidate receives a composite confidence score from 0 to 100, weighted across six diagnostic metrics. This score quantifies how strongly the data supports a genuine planetary transit versus noise or astrophysical false positives. Only candidates scoring above 60 proceed to the validation stage.

SNR: signal-to-noise ratio (25% weight)
FAP: false alarm probability (20% weight)
MCMC convergence: Gelman-Rubin statistic (15% weight)
Odd-even consistency: transit depth comparison (15% weight)
Transit symmetry: ingress/egress duration match (15% weight)
No secondary eclipse: absence of secondary dip (10% weight)
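As background for the MCMC convergence metric, here is a minimal sketch of the Gelman-Rubin statistic (R-hat) for one parameter, computed from several equal-length chains. The function name is hypothetical, and how the pipeline maps R-hat onto a normalized score is not specified in this document, so that step is left out.

```python
from statistics import mean, variance


def gelman_rubin(chains):
    """Return R-hat for one parameter from a list of equal-length sample lists."""
    m = len(chains)
    n = len(chains[0]) if m else 0
    if m < 2 or n < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least two chains of equal length >= 2")
    chain_means = [mean(c) for c in chains]
    within = mean([variance(c) for c in chains])  # W: mean within-chain variance
    between = n * variance(chain_means)           # B: between-chain variance
    if within == 0:
        raise ValueError("chains have zero variance")
    pooled = (n - 1) / n * within + between / n   # pooled variance estimate
    return (pooled / within) ** 0.5
```

Values close to 1 indicate that the chains agree with one another, while values well above 1 suggest the sampler has not yet converged.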

Stage 6: False Positive Rejection

A battery of automated tests eliminates common astrophysical false positives. Eclipsing binary stars produce transit-like signals but with characteristically deeper dips and detectable secondary eclipses. The pipeline applies strict rejection criteria to filter these impostors before any candidate reaches the classification stage.

EB filter: reject if depth > 3% (30,000 ppm)
Radius ratio: reject if Rp/Rs > 0.15
Secondary eclipse: reject if secondary depth > 50% of primary
Odd-even: reject if depth difference > 3-sigma
Variability: reject if stellar RMS > 5× expected noise
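The rejection criteria above translate directly into a rule check. In this sketch the `Candidate` container and its field names are hypothetical, depths are expressed as fractions of flux, and combining the odd and even depth uncertainties in quadrature is an assumption about how the 3-sigma test is evaluated.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical container; field names are illustrative only."""
    depth: float            # primary transit depth (fraction of flux)
    radius_ratio: float     # Rp/Rs
    secondary_depth: float  # secondary eclipse depth (fraction of flux)
    odd_depth: float        # depth measured from odd-numbered transits
    even_depth: float       # depth measured from even-numbered transits
    odd_err: float          # 1-sigma uncertainty on odd_depth
    even_err: float         # 1-sigma uncertainty on even_depth
    stellar_rms: float      # measured out-of-transit RMS
    expected_noise: float   # expected photometric noise


def rejection_reasons(c):
    """Return the list of rejection rules the candidate violates (empty if none)."""
    reasons = []
    if c.depth > 0.03:                      # deeper than 3% (30,000 ppm)
        reasons.append("eb_depth")
    if c.radius_ratio > 0.15:
        reasons.append("radius_ratio")
    if c.secondary_depth > 0.5 * c.depth:   # secondary deeper than half the primary
        reasons.append("secondary_eclipse")
    sigma = math.hypot(c.odd_err, c.even_err)
    if sigma > 0 and abs(c.odd_depth - c.even_depth) > 3.0 * sigma:
        reasons.append("odd_even")
    if c.stellar_rms > 5.0 * c.expected_noise:
        reasons.append("variability")
    return reasons
```

Returning every violated rule, rather than stopping at the first, makes it easier to audit why a signal was discarded.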

Stage 7: Candidate Classification & Validation

Surviving candidates undergo automated re-verification by an independent distributed worker node to confirm reproducibility. The pipeline then calculates habitable zone boundaries using the models of Kopparapu et al. (2013), computes an Earth Similarity Index (ESI) based on radius, density, escape velocity, and surface temperature, and flags candidates with biosignature potential for priority follow-up.

re-verification: independent worker reprocessing
HZ model: Kopparapu et al. (2013) conservative/optimistic
ESI: weighted geometric mean of planetary similarity metrics
biosignature_flags: O2, CH4, O3, H2O spectral markers
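The ESI calculation can be sketched as a weighted geometric mean over the four properties listed above. The weight exponents (0.57, 1.07, 0.70, 5.58) and the 288 K reference temperature are the values commonly quoted from Schulze-Makuch et al. (2011); they are assumed here and are not confirmed by this document. Radius, density, and escape velocity are taken in Earth units, and the function name is hypothetical.

```python
# (Earth reference value, weight exponent) for each property.
# Assumed values, commonly quoted from Schulze-Makuch et al. (2011).
ESI_TERMS = {
    "radius": (1.0, 0.57),           # Earth radii
    "density": (1.0, 1.07),          # Earth densities
    "escape_velocity": (1.0, 0.70),  # Earth escape velocities
    "surface_temp": (288.0, 5.58),   # kelvin
}


def earth_similarity_index(radius, density, escape_velocity, surface_temp):
    """Return the ESI in [0, 1]; exactly 1.0 for Earth's own values."""
    values = {
        "radius": radius,
        "density": density,
        "escape_velocity": escape_velocity,
        "surface_temp": surface_temp,
    }
    n = len(ESI_TERMS)
    esi = 1.0
    for name, (ref, weight) in ESI_TERMS.items():
        x = values[name]
        if x <= 0:
            raise ValueError(f"{name} must be positive")
        # Each term is 1 when x equals the reference and falls toward 0 otherwise.
        esi *= (1.0 - abs(x - ref) / (x + ref)) ** (weight / n)
    return esi
```

Because surface temperature carries by far the largest exponent under these assumed weights, uncertainty in a candidate's temperature estimate dominates the uncertainty in its ESI.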

Phase Folding: Visualizing Periodic Transits

Phase folding transforms transit events scattered across weeks of data into a single, clean transit profile. Each timestamp t is converted to an orbital phase, the fractional part of (t - t0) / P for the detected period P and epoch t0, so that all transits stack on top of each other. For uncorrelated noise, combining N transits improves the signal-to-noise ratio by roughly the square root of N and reveals the true transit shape.

[Figure: raw light curve with scattered transits (left) and phase-folded light curve with stacked transits (right)]

In the raw light curve (left), individual transits appear as small, noisy dips separated by the orbital period. After phase folding (right), all transit events are superimposed at phase 0, producing a clear, high-SNR transit profile that reveals ingress, flat bottom, and egress morphology.
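Phase folding itself takes only a few lines. The sketch below assumes the period and epoch come from the transit search, and centres the phase on the transit so that it runs from -0.5 to 0.5 with mid-transit at 0. The function name is hypothetical.

```python
def phase_fold(time, flux, period, t0):
    """Return (phase, flux) sorted by phase, with phase in [-0.5, 0.5) and transits at 0."""
    folded = []
    for t, f in zip(time, flux):
        # Shift by half a cycle before wrapping so mid-transit lands at phase 0.
        phase = ((t - t0) / period + 0.5) % 1.0 - 0.5
        folded.append((phase, f))
    folded.sort(key=lambda pair: pair[0])
    return [p for p, _ in folded], [f for _, f in folded]
```

Centring on phase 0 keeps ingress and egress on either side of the origin, which makes asymmetries in the folded profile easy to see.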

Confidence Scoring: Composite Metric Breakdown

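The composite confidence score is a weighted sum of six diagnostic metrics, each normalized to [0, 1] so that higher values indicate a more planet-like signal. Because the weights sum to 1, the weighted sum C also lies in [0, 1]; multiplying it by 100 gives the 0 to 100 score used in the scoring stage. The formula balances detection strength (SNR, FAP) with physical consistency checks (symmetry, odd-even, absence of a secondary eclipse) and MCMC fit quality.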

Signal-to-Noise Ratio: 25%
False Alarm Probability: 20%
MCMC Convergence: 15%
Odd-Even Consistency: 15%
Transit Symmetry: 15%
No Secondary Eclipse: 10%
C = 0.25·SNR + 0.20·FAP + 0.15·MCMC + 0.15·OddEven + 0.15·Symmetry + 0.10·NoSecondary
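The weighted sum above can be written as a small function. This sketch assumes each metric has already been normalized to [0, 1] with higher values being better (how the pipeline performs that normalization is not specified here) and that the 0 to 100 score is the weighted sum multiplied by 100. The dictionary keys and function names are hypothetical.

```python
WEIGHTS = {
    "snr": 0.25,
    "fap": 0.20,
    "mcmc": 0.15,
    "odd_even": 0.15,
    "symmetry": 0.15,
    "no_secondary": 0.10,
}
SCORE_THRESHOLD = 60.0  # candidates must score above this to proceed


def composite_score(metrics):
    """Return the 0-100 composite score from six metrics normalized to [0, 1]."""
    missing = WEIGHTS.keys() - metrics.keys()
    if missing:
        raise ValueError(f"missing metrics: {sorted(missing)}")
    for name in WEIGHTS:
        if not 0.0 <= metrics[name] <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1]")
    return 100.0 * sum(WEIGHTS[name] * metrics[name] for name in WEIGHTS)


def passes_threshold(score):
    """True if the score is strictly above the validation threshold."""
    return score > SCORE_THRESHOLD
```

With these weights, a candidate with perfect SNR and FAP metrics but zero on every other check would score 45 and would not proceed, so detection strength alone is never sufficient.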

Data Transparency: Important Disclaimers

Public Data Source

All analyzed data originates from the NASA TESS mission and is publicly available through the Mikulski Archive for Space Telescopes (MAST) at mast.stsci.edu.

Statistical Candidates Only

All detected signals are statistical exoplanet candidates. They have not been independently confirmed and should not be cited as confirmed planets.

Follow-Up Required

Professional confirmation requires independent observations: radial velocity measurements, direct imaging, transit timing variations, or spectroscopic analysis by ground-based or space telescopes.

Independent Project

S.O.L.A.R.I.S. is an independent citizen-science initiative. It is not affiliated with, endorsed by, or funded by NASA, ESA, or any government space agency.

Scientific References

  • Kovacs, G.; Zucker, S.; Mazeh, T. (2002)
    "A box-fitting algorithm in the search for periodic transits"
    Astronomy & Astrophysics, 391, 369-377
    arXiv:astro-ph/0206099
  • Foreman-Mackey, D.; Hogg, D. W.; Lang, D.; Goodman, J. (2013)
    "emcee: The MCMC Hammer"
    Publications of the Astronomical Society of the Pacific, 125, 306-312
    arXiv:1202.3665
  • Kopparapu, R. K.; Ramirez, R.; Kasting, J. F.; et al. (2013)
    "Habitable zones around main-sequence stars: New estimates"
    The Astrophysical Journal, 765, 131
    arXiv:1301.6674
  • Ricker, G. R.; Winn, J. N.; Vanderspek, R.; et al. (2015)
    "Transiting Exoplanet Survey Satellite (TESS)"
    Journal of Astronomical Telescopes, Instruments, and Systems, 1, 014003
    arXiv:1406.0151
  • Jenkins, J. M.; Twicken, J. D.; McCauliff, S.; et al. (2016)
    "The TESS science processing operations center"
    Proceedings of SPIE, 9913, 99133E
    doi:10.1117/12.2233418