When the TESS spacecraft captures photons from a distant star, the signal that reaches us is far from pristine. Raw telescope data contains instrumental noise, thermal fluctuations, spacecraft jitter, and systematic effects that can easily mask the subtle dips caused by transiting exoplanets. The difference between raw and processed data can be dramatic: the same observation, left untouched, might suggest there is no planet there, while carefully detrended data reveals a world orbiting a star trillions of miles away. This is the hidden art of space telescope data reduction: knowing what to remove, how much to remove, and when you've removed too much.
The Two Faces of TESS Data: SAP and PDCSAP
Every TESS light curve you download exists in at least two versions, and understanding the difference between them is fundamental to working with space telescope data. The Simple Aperture Photometry (SAP) flux is the rawest form you'll encounter: the summed flux from a small set of pixels selected around your target star (an aperture that follows the pixel grid and is generally not circular), with minimal further processing. It's what the pixels saw, essentially unvarnished.
The Presearch Data Conditioning Simple Aperture Photometry (PDCSAP) flux, by contrast, is the NASA pipeline's attempt to remove systematic errors while preserving real astrophysical signals. PDCSAP data has been run through detrending algorithms, corrected for known instrumental effects, and processed to enhance detectability of planetary transits. For most exoplanet hunters, PDCSAP is the working product—it's what citizen scientists in projects like S.O.L.A.R.I.S. typically analyze when searching for new worlds in TESS observations.
But here's the critical insight: neither version is objectively "correct." SAP data often carries too many uncorrected systematics for reliable planet detection, while PDCSAP data has already been shaped by assumptions about what constitutes signal versus noise. Understanding what happened between SAP and PDCSAP is essential for interpreting your results accurately.
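To make the contrast concrete, here is a minimal Python sketch (using NumPy) that compares the scatter of two flux series. In TESS light curve FITS files the two versions are stored in the SAP_FLUX and PDCSAP_FLUX columns; the arrays below are synthetic stand-ins rather than real data, and the drift amplitude, noise level, and the helper name scatter_summary are assumptions made purely for illustration.

```python
import numpy as np

def scatter_summary(flux):
    """Return (total_rms, point_to_point_sigma) for a median-normalized flux array."""
    flux = np.asarray(flux, dtype=float)
    flux = flux[np.isfinite(flux)]            # drop NaN cadences
    norm = flux / np.median(flux)
    total_rms = np.std(norm)                  # includes slow drifts
    diffs = np.diff(norm)
    mad = np.median(np.abs(diffs - np.median(diffs)))
    # 1.4826 * MAD approximates a Gaussian sigma; sqrt(2) undoes the differencing
    p2p_sigma = 1.4826 * mad / np.sqrt(2.0)
    return total_rms, p2p_sigma

# Synthetic stand-ins for the two flux columns (not real TESS data)
rng = np.random.default_rng(0)
t = np.linspace(0.0, 27.0, 2000)              # days
noise = rng.normal(0.0, 5e-4, t.size)
sap_like = 1000.0 * (1.0 + 0.004 * np.sin(2 * np.pi * t / 13.7) + noise)
pdcsap_like = 1000.0 * (1.0 + noise)

sap_rms, sap_p2p = scatter_summary(sap_like)
pdc_rms, pdc_p2p = scatter_summary(pdcsap_like)
print(f"SAP-like:    rms={sap_rms:.5f}  p2p={sap_p2p:.5f}")
print(f"PDCSAP-like: rms={pdc_rms:.5f}  p2p={pdc_p2p:.5f}")
```

In this toy example the point-to-point noise is nearly the same in both series. What differs is the slow drift, which is the kind of systematic the conditioning step is meant to remove.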
Spacecraft Motion and Thermal Breathing
Space telescopes live in hostile environments. The TESS spacecraft orbits Earth in a highly elliptical path, completing one orbit every 13.7 days, and its thermal environment changes as its distance from Earth and its orientation change over that cycle. These thermal cycles cause the spacecraft structure itself to expand and contract, an effect engineers call "thermal breathing." When your telescope literally changes shape, the point spread function of your images shifts, and the photons from your target star land on slightly different pixels than they did yesterday.
Additionally, TESS uses reaction wheels to maintain pointing stability. These wheels gradually accumulate angular momentum, requiring periodic momentum dumps: brief thruster firings that reset the wheel speeds. Each dump is short, typically a matter of minutes, but while the pointing settles the spacecraft's attitude wobbles noticeably, introducing sharp discontinuities and spikes in the light curve.
The TESS pipeline detects these momentum dumps automatically and flags the affected data points, but the thermal effects are more insidious. Unlike a momentum dump's sharp jolt, thermal breathing induces long-term drift in the photometric baseline. A star that appears to brighten or fade over the course of an orbit may actually be doing nothing at all: the spacecraft is simply expanding and contracting, moving the stellar image back and forth on the detector.
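Flagged cadences are usually recorded as bits in an integer quality column, so discarding them comes down to a bitwise test. The sketch below assumes that the value 32 marks a reaction wheel desaturation (momentum dump) event, which should be checked against the documentation for your data release. The function name flagged_mask and the pad parameter are illustrative choices, not part of any pipeline.

```python
import numpy as np

DESAT_BIT = 32  # assumed bit value for a momentum dump; verify for your data

def flagged_mask(quality, bitmask=DESAT_BIT, pad=0):
    """Boolean array, True where a cadence should be discarded.

    pad widens each flagged stretch by that many cadences on both sides,
    to allow for the pointing settling after the thrusters fire.
    """
    quality = np.asarray(quality, dtype=np.int64)
    bad = (quality & bitmask) != 0
    if pad > 0:
        kernel = np.ones(2 * pad + 1, dtype=int)
        bad = np.convolve(bad.astype(int), kernel, mode="same") > 0
    return bad

# Toy example: one dump at index 2, an unrelated flag (128) at index 5
quality = np.array([0, 0, 32, 0, 0, 128, 0])
flux = np.array([1.00, 1.01, 1.30, 1.02, 1.00, 0.99, 1.00])

bad = flagged_mask(quality, pad=1)
clean_flux = flux[~bad]
print(clean_flux)
```

Padding is a judgment call: too little leaves the settling wobble in the data, too much throws away good cadences.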
Data Gaps and Their Consequences
TESS doesn't observe continuously. The spacecraft operates in sectors lasting approximately 27 days, after which it must reorient to observe a new patch of sky. Additionally, momentum dumps, data transmission windows, and occasional instrument anomalies create gaps scattered throughout each light curve.
These gaps aren't mere missing data points—they break the continuity of your time series in ways that standard detrending methods struggle to handle. When you apply a smoothing filter across a gap, you introduce edge effects and spurious correlations. When you fit a polynomial trend to data with discontinuities, the polynomial can ring and oscillate unpredictably around the gap boundaries.
The TESS pipeline addresses this by masking problematic data points and handling gaps explicitly within its detrending algorithms. For researchers working with raw data, gaps must be identified, isolated, and often handled separately before any filtering or smoothing is applied.
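One simple way to isolate gaps, sketched here with NumPy, is to split the time series wherever the step between consecutive timestamps is much longer than the typical cadence. The threshold of five cadences and the function name split_at_gaps are assumptions for illustration; the right threshold depends on your data.

```python
import numpy as np

def split_at_gaps(time, gap_factor=5.0):
    """Return a list of index arrays, one per contiguous segment.

    A gap is any step in `time` longer than gap_factor times the median cadence.
    """
    time = np.asarray(time, dtype=float)
    dt = np.diff(time)
    cadence = np.median(dt)
    breaks = np.flatnonzero(dt > gap_factor * cadence) + 1
    return np.split(np.arange(time.size), breaks)

# Toy time axis: ten cadences, a two-day gap, then ten more
time = np.concatenate([0.1 * np.arange(10), 3.0 + 0.1 * np.arange(10)])
segments = split_at_gaps(time)
for seg in segments:
    print(f"{seg.size} cadences, t = {time[seg[0]]:.1f} to {time[seg[-1]]:.1f}")
```

Each index array can then be filtered or fitted on its own, so no window or polynomial ever straddles a gap.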
Key point: Data gaps aren't just inconvenient—they fundamentally alter how detrending algorithms behave. Always inspect your light curves around gap boundaries for suspicious artifacts.
Filtering and Smoothing: The Double-Edged Sword
Once systematic effects and gaps are accounted for, the next step is to suppress noise and slow trends while preserving the brief, hours-long dips of planetary transits. This is where median filtering and Savitzky-Golay smoothing become invaluable tools.
Median filtering works by replacing each data point with the median value of its neighbors within a specified window. Unlike mean filtering, which can be distorted by outliers, the median is robust: a single cosmic ray or bad pixel barely affects it. Median filters are a common early detrending step for TESS light curves. In that role the window is usually chosen to be several times longer than the expected transit duration, so that the running median traces the slow trend without following the transit itself, and the light curve is then divided by that trend.
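A running median is short enough to write out directly. The sketch below uses NumPy's sliding_window_view and reflects the data at the edges, which is one of several reasonable edge treatments. The window lengths and the toy light curve are assumptions chosen to show how the window interacts with a five-cadence dip.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def running_median(flux, window):
    """Running median with an odd window; edges are padded by reflection."""
    if window % 2 == 0:
        raise ValueError("window must be odd")
    flux = np.asarray(flux, dtype=float)
    half = window // 2
    padded = np.pad(flux, half, mode="reflect")
    return np.median(sliding_window_view(padded, window), axis=1)

# Toy light curve: flat, with one outlier and a 5-cadence, 1% dip
flux = np.ones(101)
flux[20] = 10.0          # cosmic-ray-like spike
flux[40:45] = 0.99       # transit-like dip

trend_wide = running_median(flux, 21)    # window much longer than the dip
trend_narrow = running_median(flux, 5)   # window equal to the dip length

print((flux / trend_wide)[40:45])    # dip survives division by the wide trend
print((flux / trend_narrow)[40:45])  # dip is absorbed into the narrow trend
```

The spike at index 20 leaves both trends untouched, which is the robustness described above. The dip, however, survives only when the window is much longer than the dip itself.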
Savitzky-Golay smoothing takes a different approach, fitting low-order polynomials to small moving windows of data. The beauty of this method is that it's designed to preserve the shapes of features while smoothing noise—a carefully tuned Savitzky-Golay filter can preserve sharp dips like planetary transits while flattening noise spikes around them.
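For readers who want to see what is under the hood, here is a minimal NumPy sketch of a Savitzky-Golay smoother: it derives the weights that evaluate a least-squares polynomial fit at the center of each window, then applies them as a convolution. SciPy offers a ready-made version as scipy.signal.savgol_filter. The function names and the reflected edge padding below are illustrative choices.

```python
import numpy as np

def savgol_coeffs(window, polyorder):
    """Weights that evaluate a least-squares polynomial fit at the window center."""
    if window % 2 == 0 or polyorder >= window:
        raise ValueError("window must be odd and larger than polyorder")
    half = window // 2
    offsets = np.arange(-half, half + 1)
    # Columns are 1, x, x^2, ... evaluated at each offset in the window
    design = np.vander(offsets, polyorder + 1, increasing=True)
    # Row 0 of the pseudo-inverse gives the fitted value at x = 0
    return np.linalg.pinv(design)[0]

def savgol_smooth(flux, window, polyorder):
    """Smooth `flux`; edges are handled by reflecting the data."""
    flux = np.asarray(flux, dtype=float)
    half = window // 2
    padded = np.pad(flux, half, mode="reflect")
    # The weights are symmetric, so convolution and correlation coincide
    return np.convolve(padded, savgol_coeffs(window, polyorder), mode="valid")

# A quadratic is reproduced exactly away from the edges when polyorder >= 2
x = np.arange(50, dtype=float)
y = 2.0 + 0.5 * x + 0.01 * x**2
smoothed = savgol_smooth(y, window=7, polyorder=2)
print(np.max(np.abs(smoothed[3:-3] - y[3:-3])))
```

Because any feature that a low-order polynomial can describe within one window passes through unchanged, the choice of window length and polynomial order decides which shapes count as signal.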
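However, both methods can do damage when tuned badly. Used as a smoother, a wider median window removes more noise but also risks flattening shallow transits; used to estimate a trend that is then divided out, the risk reverses, and a window too close to the transit duration lets the trend follow the dip and erase it. A higher-order Savitzky-Golay polynomial fits the data more closely but smooths less effectively. The art of data reduction lies in choosing filter parameters that balance noise removal against signal preservation.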
Co-Trending Basis Vectors: The TESS Secret Weapon
The most sophisticated part of the TESS detrending pipeline involves Co-Trending Basis Vectors (CBVs)—a technique that removes systematic errors without assuming a particular functional form for them. Rather than fitting a polynomial or spline to each light curve, CBVs leverage the fact that all stars on the same detector chip experience similar spacecraft-induced systematics.
The TESS pipeline computes CBVs by analyzing the common variance across hundreds of stars observed on the same detector. If many stars brighten and fade in sync with spacecraft thermal breathing, that synchronized variation becomes a basis vector. The algorithm then determines how much of each basis vector is present in your target star's light curve and removes it.
This approach is remarkably effective because it's based on direct observation of what all stars experience, rather than on models or assumptions about spacecraft behavior. CBVs adapt to detector-specific issues, sector-to-sector changes, and other systematic variations without requiring human intervention for each new dataset.
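The core idea can be sketched in a few lines of NumPy: take the dominant shared patterns from an ensemble of light curves with a singular value decomposition, then fit and subtract them from the target. This is a deliberately simplified illustration, not the pipeline's actual algorithm, which is considerably more elaborate. The function names, the single basis vector, and the synthetic ensemble are all assumptions.

```python
import numpy as np

def basis_vectors(ensemble, n_vectors):
    """ensemble: (n_stars, n_cadences) array of normalized fluxes.

    Returns (n_vectors, n_cadences) orthonormal vectors of shared variation.
    """
    ensemble = np.asarray(ensemble, dtype=float)
    centered = ensemble - ensemble.mean(axis=1, keepdims=True)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:n_vectors]

def cotrend(flux, vectors):
    """Least-squares fit of the basis vectors to one light curve, then subtract."""
    flux = np.asarray(flux, dtype=float)
    coeffs, *_ = np.linalg.lstsq(vectors.T, flux - flux.mean(), rcond=None)
    return flux - vectors.T @ coeffs

# Synthetic ensemble: every star shares one systematic with a different amplitude
rng = np.random.default_rng(42)
n_stars, n_cad = 200, 1000
t = np.linspace(0.0, 27.0, n_cad)
systematic = np.sin(2 * np.pi * t / 13.7)
amplitudes = rng.uniform(0.002, 0.006, n_stars)
ensemble = (1.0 + amplitudes[:, None] * systematic
            + rng.normal(0.0, 2e-4, (n_stars, n_cad)))

# Target star (kept out of the ensemble) with a 0.2% box-shaped dip
target = 1.0 + 0.004 * systematic + rng.normal(0.0, 2e-4, n_cad)
target[500:520] -= 0.002

vectors = basis_vectors(ensemble, n_vectors=1)
corrected = cotrend(target, vectors)
print(f"rms before: {np.std(target):.5f}  after: {np.std(corrected):.5f}")
```

Using more basis vectors removes more of the shared variation but also gives the fit more freedom to absorb real signal, which is the same trade-off that runs through every other detrending method.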
Citizen science projects analyzing TESS data, including work at S.O.L.A.R.I.S., rely heavily on the CBV corrections already applied in PDCSAP data. This makes the preprocessed products far more suitable for amateur analysis than the raw SAP data would be.
The Peril of Over-Detrending
With all these powerful tools at our disposal—median filters, polynomial fits, basis vectors, smoothing algorithms—it's tempting to apply them aggressively and remove every blemish from a light curve. But this path leads to disaster.
Key point: Real astrophysical signals can look like noise. Over-aggressive detrending can erase genuine planets and turn valid detections into false negatives.
Consider a long-period planet, one that completes its orbit over months and whose transit can last the better part of a day. A detrending filter flexible enough to follow changes on that timescale, such as a high-order polynomial or a short-window running median, might misinterpret the dip as part of the underlying brightness trend and fit it away entirely. In your detrended light curve, the planet simply disappears.
Or imagine a star with gradual intrinsic variability: perhaps it's slightly active, with dark starspots rotating in and out of view. If your detrending algorithm assumes all long-term trends are instrumental and removes them, you've erased evidence of the star's magnetic activity, which you may need later to characterize the star and any planets orbiting it.
The solution is validation. Compare your detrended light curves against the original SAP data. Inject synthetic transit signals and confirm they survive your detrending pipeline. Use multiple independent detrending methods and check that your discoveries appear in all of them. At S.O.L.A.R.I.S., we maintain rigorous validation protocols precisely because aggressive data processing is easy; knowing whether it's actually helped is the hard part.
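An injection-recovery test of this kind can be sketched in a few lines of NumPy. The example injects a box-shaped transit into a synthetic light curve, detrends it with a running median, and reports what fraction of the injected depth survives. The box shape, the window lengths, and every function name here are illustrative assumptions, not a description of any particular project's protocol.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def running_median(flux, window):
    """Running median with an odd window; edges are padded by reflection."""
    half = window // 2
    padded = np.pad(np.asarray(flux, dtype=float), half, mode="reflect")
    return np.median(sliding_window_view(padded, window), axis=1)

def inject_box_transit(flux, start, duration, depth):
    """Return a copy of `flux` with a fractional dip of `depth` injected."""
    injected = np.array(flux, dtype=float)
    injected[start:start + duration] *= (1.0 - depth)
    return injected

def recovered_fraction(flux, start, duration, depth, window):
    """Fraction of the injected depth that survives running-median detrending."""
    injected = inject_box_transit(flux, start, duration, depth)
    detrended = injected / running_median(injected, window)
    in_transit = np.zeros(injected.size, dtype=bool)
    in_transit[start:start + duration] = True
    measured = np.median(detrended[~in_transit]) - np.median(detrended[in_transit])
    return measured / depth

# Synthetic light curve: slow drift plus white noise (not real TESS data)
rng = np.random.default_rng(1)
n = 2000
trend = 1.0 + 0.003 * np.sin(2 * np.pi * np.arange(n) / 1500.0)
flux = trend + rng.normal(0.0, 2e-4, n)

wide = recovered_fraction(flux, start=1000, duration=30, depth=0.002, window=301)
narrow = recovered_fraction(flux, start=1000, duration=30, depth=0.002, window=31)
print(f"wide window:   {wide:.2f} of injected depth recovered")
print(f"narrow window: {narrow:.2f} of injected depth recovered")
```

Repeating this over a grid of depths, durations, and filter settings shows where a given pipeline stops being able to see planets, before you trust it with real candidates.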
The invisible work of noise reduction is what separates real exoplanet discoveries from artifacts and mirages. Every planet you've heard about in a TESS light curve is there because someone carefully, thoughtfully decided what to keep and what to remove. That's not computation; that's science.