Discussing the Impact of Our Exoplanet Study: How This Impacts Everything

This paper is available on arxiv under CC 4.0 license.

Authors:

(1) Eleonora Alei, ETH Zurich, Institute for Particle Physics & Astrophysics & National Center of Competence in Research PlanetS;
(2) Björn S. Konrad, ETH Zurich, Institute for Particle Physics & Astrophysics & National Center of Competence in Research PlanetS;
(3) Daniel Angerhausen, ETH Zurich, Institute for Particle Physics & Astrophysics, National Center of Competence in Research PlanetS & Blue Marble Space Institute of Science;
(4) John Lee Grenfell, Department of Extrasolar Planets and Atmospheres (EPA), Institute for Planetary Research (PF), German Aerospace Centre (DLR);
(5) Paul Mollière, Max-Planck-Institut für Astronomie;
(6) Sascha P. Quanz, ETH Zurich, Institute for Particle Physics & Astrophysics & National Center of Competence in Research PlanetS;
(7) Sarah Rugheimer, Department of Physics, University of Oxford;
(8) Fabian Wunderlich, Department of Extrasolar Planets and Atmospheres (EPA), Institute for Planetary Research (PF), German Aerospace Centre (DLR);
(9) LIFE collaboration, www.life-space-mission.com.

Table of Links

Abstract & Introduction
Methods
Results
Discussion
Conclusions Next Steps & References
Appendix A: Scattering of terrestrial exoplanets
Appendix B: Corner Plots
Appendix C: Bayes’ factor analysis: other epochs
Appendix D: Cloudy scenarios: additional figures

4. Discussion

In Section 4.1 we compare the results we obtain for the cloud-free Modern Earth twin with the results from the similar study performed in Paper III. As previously mentioned, we retrieved spectra of cloudy exoplanets, while neglecting clouds in the forward model of the retrieval framework. We describe this effect in Section 4.2. We discuss the impact of the quality of the data on the retrievals in Section 4.3 and the systematic effects for the retrieval runs in Section 4.4. Finally, we quantify the potential that LIFE has in differentiating the various epochs (Section 4.5).
For completeness, we mention that we also tested the impact of varying the complexity of the theoretical spectral model on retrievals, by including and excluding scattering and/or CIA in the calculation. Through the analysis and comparison of ancillary retrieval grids, we could confirm that including or neglecting scattering and CIA in the calculation does not influence the quality of the results. We show the results in Appendix C.

4.1. Comparison with Paper III

To allow for a proper comparison, we selected the model from Paper III that uses the same R, S/N, and wavelength range (4 − 18.5 µm, R = 50, and S/N = 10). The major difference between these two retrieval studies is that in Paper III retrievals were performed using the same theoretical atmospheric model that was also used to generate the input spectra. In contrast, in this work, the atmospheric model used in the retrieval is different from the one that was used to generate the spectrum. Further, in our previous study we assumed abundance profiles that were vertically constant, while the spectra calculated by Rugheimer & Kaltenegger (2018) were based on a self-consistent, altitude-dependent atmospheric composition. In Figure 8, we compare the retrieval results for the constrained planetary parameters and abundances from Paper III to our findings for the clear Modern Earth case.

In the upper panel of Figure 8, we plot the retrieval results for the planetary parameters. The planetary radius Rpl is well constrained with respect to the assumed prior distribution and both posteriors are roughly centered on the corresponding truths. However, the spread of the clear Modern Earth Rpl posterior is larger than in Paper III, which indicates that the radius is slightly less well constrained. For Mpl our results are comparable to Paper III. Our results for the surface pressure P0 and surface temperature T0 agree less well with the results presented in Paper III.
This is probably caused by small differences in the input P-T profiles, as well as potential systematic errors, which we will discuss in more detail in Section 4.4.

In the lower panel of Figure 8, we show the results obtained for the abundances of the trace gases that were constrained (C- or SL-type posteriors) by our retrieval analysis. We observe that the clear Modern Earth retrieval tends to overestimate the true abundances, while the estimates from Paper III appear more accurate. The retrieved posterior types match for all of the atmospheric gases considered. Additionally, for CO2, O3, and H2O, the spread of the posteriors for the clear Modern Earth runs is comparable to the results from Paper III. The larger spread in our CH4 abundance is the result of a slightly reduced sensitivity, which is most likely caused by differences between the atmospheric scenarios used to generate the input spectrum and those used in the retrievals. These differences will reduce the overall accuracy of the retrieval results.

4.2. Impact of clouds on retrieval results

As pointed out in Section 3, using a cloud-free atmospheric model in our retrievals will likely have introduced biases into our results for the cloudy scenarios. The presence of clouds in an atmosphere will reduce the MIR continuum emission of the observed exoplanet. The emission spectra of terrestrial exoplanets are typically dominated by the lowest, non-opaque atmospheric layers. In cloudy exoplanets, part of this thermal emission is hidden below the clouds. Thus, the atmospheric layers above the clouds contribute more to the overall spectrum. Because the atmospheric temperature at the top of the clouds is typically lower than the surface temperature of the planet, cloud coverage will generally lead to a cooler retrieved temperature with reduced continuum flux. This reduction in continuum flux can be clearly seen when comparing the clear to the cloudy spectra in Figure 1.
Since the theoretical spectral model we use for our retrievals assumes a cloud-free atmosphere, the continuum emission must be reduced in other ways to achieve a satisfactory fit to the input spectrum. This reduction can be obtained by reducing the radius (and thus the emitting surface) and/or the surface temperature of the exoplanet. Due to these compensation effects, most of our retrievals of cloudy input spectra yielded smaller radii and/or cooler surface temperatures than the cloud-free inputs (see Figure 4). This is also valid at higher spectral resolutions and signal-to-noise ratios, showing that the compensation effects are independent of the quality of the data (see Figure D.1).

In addition to the surface conditions, the thermal structure of the layers and the chemical composition of the atmosphere also play a major role in shaping the emission spectrum, especially in the absorption/emission features. This could yield biased retrieval results for other parameters, such as the a_i coefficients of the polynomial P-T profile. In our results, the cloudy Prebiotic Earth (PRE-C) model shows the clearest signature of this compensation effect (see Figure 3). This degeneracy between cloud coverage and thermal structure in retrievals was also found in other studies (see e.g. Mollière et al. 2020, and references therein).

Such biased results could be misleading, especially when trying to analyze the habitability of an observed exoplanet. If, by neglecting clouds in our theoretical spectral model, we underestimate the surface temperature, we could misclassify habitable exoplanets as uninhabitable. A clear example is the cloudy Modern Earth (MOD-C) scenario: using our cloud-free forward model to retrieve this cloudy spectrum causes the retrieved ground temperature to be colder than 275 K. Such low temperatures suggest a potentially uninhabitable planet, which we know is not the correct interpretation for the cloudy Modern Earth spectrum.
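The compensation between clouds, radius, and surface temperature described above can be illustrated with a toy calculation. This is only a sketch and not the retrieval model used in this work: it assumes the planet emits as a single-temperature blackbody, so that the continuum flux scales as Rpl² times the Planck function, and it assumes a purely hypothetical 20% cloud-induced reduction of the continuum at 11.2 µm. The function names and the 288 K reference temperature are likewise illustrative choices.

```python
import math

H = 6.62607015e-34    # Planck constant [J s]
C = 2.99792458e8      # speed of light [m / s]
K_B = 1.380649e-23    # Boltzmann constant [J / K]


def planck_lambda(wavelength_m, temperature_k):
    """Blackbody spectral radiance B_lambda(T) in W m^-3 sr^-1."""
    x = H * C / (wavelength_m * K_B * temperature_k)
    return 2.0 * H * C**2 / wavelength_m**5 / math.expm1(x)


def radius_for_flux_ratio(flux_ratio):
    """Radius (in units of the true radius) that reproduces the reduced
    flux at fixed temperature, using flux proportional to R^2."""
    return math.sqrt(flux_ratio)


def temperature_for_flux_ratio(flux_ratio, wavelength_m, true_temperature_k):
    """Temperature that reproduces the reduced flux at fixed radius.

    Solves B(T') = flux_ratio * B(T) analytically at one wavelength.
    """
    x_true = H * C / (wavelength_m * K_B * true_temperature_k)
    x_new = math.log1p(math.expm1(x_true) / flux_ratio)
    return H * C / (wavelength_m * K_B * x_new)


# Hypothetical inputs: a 288 K surface and a 20% continuum reduction.
ASSUMED_FLUX_RATIO = 0.8
WAVELENGTH_M = 11.2e-6
TRUE_TEMPERATURE_K = 288.0

shrunk_radius = radius_for_flux_ratio(ASSUMED_FLUX_RATIO)
cooled_temperature = temperature_for_flux_ratio(
    ASSUMED_FLUX_RATIO, WAVELENGTH_M, TRUE_TEMPERATURE_K
)
print(f"radius factor at fixed temperature: {shrunk_radius:.3f}")
print(f"temperature at fixed radius: {cooled_temperature:.1f} K")
```

In this simplified picture, a cloud-free model can absorb the missing continuum flux either entirely in the radius, entirely in the temperature, or in any combination of the two, which is the degeneracy the retrievals run into.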
On the other hand, when looking at the retrieved chemical abundances (see Figure 5), we observe only minor variations in the shape of the posteriors for all the major absorbing gases in the atmosphere. This indicates that, despite having a major impact on the retrieved physical parameters (Rpl and T0), retrieving cloudy spectra with a cloud-free model does not significantly impact the chemical characterization of the atmosphere.

Therefore, including a cloud model in the theoretical spectral model that we use for retrievals could improve the quality of the results. However, this depends on the goal of the analysis. If we aim to characterize the chemical composition of the atmosphere, it may be sufficient to use cloud-free retrievals. This would be a sensible strategy, considering that including a somewhat realistic cloud treatment in the theoretical spectral model significantly increases the number of retrieved parameters and consequently the run time.

Performing retrievals on input spectra that include visible/near-infrared data in addition to the MIR observations will likely provide additional information about the cloud composition and structure. In this sense, combining data acquired by HabEx/LUVOIR and LIFE may significantly improve retrieval results. We will compare the retrieval performance for different cloud models and discuss the capabilities of joint reflected light/thermal emission retrievals in future publications.

4.3. Increasing the quality of the input spectra

The results of the retrievals performed assuming other combinations of R and S/N, described in Section 3.5, show that increasing the S/N to 20 will allow us to detect both O3 and CH4 in more cases. Increasing to R = 100 would also improve our results, especially when combined with an increase in S/N. This is an interesting finding for multiple reasons.
From the scientific point of view, simultaneously detecting O3 (which can provide an indirect estimate of O2) and CH4 would be a strong indicator of chemical disequilibrium in the atmosphere, possibly hinting at the existence of biological activity. Such a detection would make the respective exoplanet a high-priority target for the search for life beyond the Solar System. This concept will be further explored in Section 4.5.

From the technical point of view, it would mean that one needs to consider longer integration times, while maintaining a stable architecture of the interferometer array. For the assumptions in our baseline case (see Table 2), doubling the resolution would roughly correspond to a doubling of the integration time (from ∼ 50 to ∼ 100 days), while doubling the S/N would translate into integration times roughly four times longer (from ∼ 50 to ∼ 200 days). This poses challenges in terms of mission technical feasibility as well as mission scheduling. Increasing the instrument throughput, for which we assumed a conservative value (cf. Paper I), or the aperture size would bring the required integration times down. Also, the nearest rocky exoplanets orbiting within the habitable zone (HZ) of their Solar-type host stars may not be 10 pc away. Bryson et al. (2021) estimated that with 95% confidence the nearest HZ planet around G and K dwarfs is ∼6 pc away and they predict ∼4 HZ rocky planets around G and K dwarfs within 10 pc of the Sun.

Taking all this together, we would therefore recommend sticking to the baseline requirements for LIFE of R = 50 and S/N = 10, as proposed in Paper III, since they allow for a reliable and quantitative characterization of the most important physical and chemical properties of the considered atmospheres. The most promising targets could then be observed further to increase the S/N, thus allowing a more precise characterization of the atmosphere.

4.4. Systematics and current challenges

Thus far, we can confidently conclude that our Bayesian framework can retrieve consistent and robust results. This is not only valid for simulated observations generated with petitRADTRANS (see Paper III), but also for input spectra produced by other radiative transfer models (here by Rugheimer & Kaltenegger 2018). These results are highly promising in the context of analyzing real observational data in the future. However, as we mentioned in the previous sections, our work has identified some aspects which may lead to biased results.

Some issues are linked to the intrinsic limitations of the Bayesian retrieval routine we described in Section 2.3. Ideally, these can be mitigated to improve the results, for example by choosing a different P-T profile parametrization, or by adding a cloud model to the retrieval. Further, we purposely chose to perform our retrievals assuming uniform priors for most parameters, where all values within a specified, wide range were possible (see Section 2 for details). However, for future observations, the prior space might already be constrained (e.g. if one or more parameters are already measured by independent observations) and this would likely improve our retrieval results.

Despite these possibilities, we will eventually be limited by two factors. The first is the number of parameters that the Bayesian framework can handle within reasonable computing time. This limit on the number of parameters will remain unless novel parameter estimation algorithms emerge. An example would be the use of machine-learning retrieval routines (e.g. Waldmann 2016; Márquez-Neila et al. 2018; Cobb et al. 2019). Second, for a given resolution the information content of the spectrum is limited. Therefore, considering additional parameters in the retrieval framework could bias the results, for example by causing a false positive inference of an atmospheric species.
However, the most relevant issues are independent of the parameter estimation routine. They are rooted in the intrinsic differences between individual radiative transfer models used to produce the MIR input spectra and the theoretical spectra in the retrievals. Such discrepancies may be caused by a slightly different treatment of physical or chemical processes, or differences in the assumed opacity tables.

To investigate these issues, we computed the MIR spectra for the four clear-sky scenarios (MOD-CF, NOE-CF, GOE-CF, and PRE-CF) using petitRADTRANS. We assumed exactly the same input parameters (i.e. P-T profile, abundances, planetary dimensions) that Rugheimer & Kaltenegger (2018) used to produce their spectra. We show the results for R = 50 in Figure 9. The petitRADTRANS spectra are plotted as solid lines using the color scheme from Table 1. The input spectra from Rugheimer & Kaltenegger (2018) are shown as black dots. The error bars indicate the LIFEsim uncertainty used in the main grid of retrievals (S/N = 10 at 11.2 µm).

We observe that the petitRADTRANS spectra deviate (mostly within the LIFEsim uncertainty) from the spectra calculated by Rugheimer & Kaltenegger (2018), despite both models assuming the exact same input. While the absorption features are generally in agreement with each other, the spectra produced by petitRADTRANS show a higher continuum flux, especially around 8−12 µm. This discrepancy is likely linked to differences in the opacity tables used by the two radiative transfer models. As stated in Section 2.3, these differ with respect to:

1. Wing Cutoff. To prevent the wings of the pressure-broadened lines from extending to infinity (non-physical), it is necessary to introduce a wing cutoff. However, different radiative transfer models assume different cutoff thresholds (see the comparisons performed by e.g. Lee et al. 2019; Baudino et al. 2017; Barstow et al. 2020). Rugheimer & Kaltenegger (2018) used a wing cutoff at 25 cm⁻¹ from the line center. In contrast, the line cutoff used for the petitRADTRANS opacity tables assumes an exponential line wing decrease (for details see Mollière et al. 2019). This may explain the higher continuum emission observed for all petitRADTRANS spectra.

2. Line List Databases. The default opacity tables used by petitRADTRANS stem from different sources. They are calculated from the HITEMP, HITRAN 2012, or ExoMol line lists (see Table 4). In contrast, the spectra from Rugheimer & Kaltenegger (2018) were computed using only the HITRAN 2016 line lists, which in some cases are more recent than the ones adopted in our study. At the pressures and temperatures of interest in the study, we would not expect large variations in the line lists, provided all the databases are synchronous. The use of different versions of the same database (e.g. HITRAN 2012 versus HITRAN 2016) might cause variations in the opacities, since more recently updated databases generally include more transition lines (see e.g. Gordon et al. 2017). Furthermore, the default petitRADTRANS opacities only account for transitions of the main isotope, while the opacity tables used in Rugheimer & Kaltenegger (2018) can account for additional isotopes.

3. Pressure Broadening Coefficients. To compute the line profiles, it is necessary to account for collision-driven line broadening. This depends on the pressure and composition of each atmospheric layer. For most molecules both models assume air broadening, which is based on a Modern Earth-like atmospheric composition. However, for CH4, the petitRADTRANS opacity table assumes a theoretical broadening model based on Equation 15 in Sharp & Burrows (2007), which was experimentally validated. Another exception is N2O, for which H-He broadening was assumed (see Chubb et al. 2021). However, at the pressures and temperatures of interest, we do not expect large differences due to pressure broadening (Sharp & Burrows 2007; Mollière et al. 2019; Gharib-Nezhad & Line 2019; Chubb et al. 2021). We mention it here for completeness.

These differences likely also account for a substantial part of the offsets we find in the retrieved parameter values. Future inter-comparison studies could help us define a “best practice” upon which to agree, as a community, to compute opacity tables for retrievals in order to minimize these systematic effects. Furthermore, ongoing experimental work will be necessary to improve the completeness of the transition line databases and reduce discrepancies.

4.5. Differentiating the epochs

A quantitative approach to differentiate between the various scenarios is through the results of the retrieval analyses. We performed a first qualitative step in this direction in Sections 3.2, 3.3, and 3.4, where we visually compared the retrieved P-T profiles and the posteriors for the planetary parameters and abundances. Through visual comparison, we found that differentiating the epochs via the retrieved P-T structure and planetary parameters is challenging. By studying the retrieved abundance posteriors, we found that the best candidates to perform such differentiation are O3 and CH4. This finding is especially interesting since the O2-CH4 pair is generally considered the strongest biosignature (see Lovelock 1965; Lederberg 1965) and O2 can be constrained from O3 through atmospheric chemistry models. Thus, the detection of one or both of these molecules will likely trigger follow-up observations and could allow us to distinguish between potentially alive and lifeless planets. However, a more in-depth characterization of the atmospheres is limited by the large variance of the posteriors of all these species, which typically exceeds one order of magnitude.

Thus, small values of ∆ indicate that the compared posterior distributions only show small differences relative to each other. In this case it is hard to differentiate between the retrieved posteriors.
On the other hand, larger values of ∆ indicate that the differences between the two posteriors are likely to correspond to different underlying true values of the considered parameter.

We can calculate ∆ for all the combinations of the various scenarios and for all parameters. We get particularly interesting results for CH4 and O3. Figure 10 shows the cumulative distribution functions for all the combinations of the clear sky scenarios (MOD-CF, NOE-CF, GOE-CF, PRE-CF) calculated from the posteriors of CH4 and O3, for R = 50 and S/N = 10. Each subplot of the corner plot is annotated with the value of ∆ (in percent) for the corresponding combination. On the diagonals, the retrieved posteriors for every scenario are shown for reference. We keep the color scheme defined by Table 1.

Regarding CH4, we can fairly confidently distinguish between the clear prebiotic Earth (PRE-CF) and the Earth after the GOE (GOE-CF), for which ∆ = 95%, as well as between PRE-CF and the Earth after the NOE (NOE-CF), for which ∆ = 90%. The distinction between the prebiotic Earth and the Modern Earth (MOD-CF), as well as between the NOE and the GOE Earth, is more difficult (∆ ≤ 31%).

For O3, we observe a clear division into two subgroups: on one side the Modern and NOE Earth, where we have a clear detection of O3, and on the other the GOE and prebiotic Earth, where we only retrieve an upper limit on the abundance. The high value of ∆ ∼ 90% between all combinations of MOD-CF/NOE-CF versus GOE-CF/PRE-CF clearly allows such a distinction. This is in agreement with what we concluded from Figure 5 in Section 3.4. However, in contrast to the qualitative discussion based on the appearance of the posteriors, ∆ provides a promising metric to quantify the magnitude of these differences.

In Figure 11 we summarize the calculated ∆ values for all combinations of cloud-free input spectra and for the different R-S/N pairs considered in Section 4.3, for a total of four tables.
Within the tables, each cell shows the ∆ value (in percent) for a given parameter (columns) and for a comparison of two specific scenarios (rows). The cells are also colored according to the value of ∆, with darker hues for larger values of ∆. As mentioned above, the biggest differences between the posteriors at R = 50, S/N = 10 can be observed for the molecules CH4 and O3. Furthermore, we observe some differences for CO2 and H2O, as well as for the P0 posteriors. However, as discussed previously, these differences are rooted in degeneracies between the pressure and the abundances (see Section 4.1) and not caused by large physical differences in the underlying atmospheres. These findings are generally still valid as we move to higher R and S/N. Similar conclusions can also be drawn for the cloudy inputs (see Appendix D).

Since we are able to confidently detect O3 for a clear Earth after the NOE and for the Modern Earth, and we can distinguish these two epochs from earlier scenarios (Prebiotic and GOE Earth), we can infer that LIFE would be able to detect traces of life as we know it in an Earth-like atmosphere when the abundance of O2 has passed the 10% PAL threshold. This is consistent with other studies that focused on different wavelength ranges, such as the work by Kawashima & Rugheimer (2019) based on LUVOIR. The biosignature pair CH4-O3 might be even easier to detect when the abundance of O2 is around 10% PAL (NOE Earth) than for the Modern Earth. The NOE Earth scenario is particularly favored since the atmosphere contains enough O3 to be detectable, but has a low enough abundance of O2 that the CH4 in the atmosphere is not depleted. These results are also consistent with the results shown in Kawashima & Rugheimer (2019).

In other words, if LIFE were to observe the Earth at various stages of its evolution orbiting the Sun at 10 pc distance, it would be able to detect strong indicators of life starting from around 0.8 Ga (NOE Earth).
The detection of CH4 with an upper limit on O3 would also allow a tentative detection of potential biological activity up to 2.0 Ga (GOE Earth).

We must keep in mind that the epochs that we chose for our study are momentary "snapshots" in the continuous evolution of Earth, even though these four scenarios represent the major changes in our atmosphere. Still, other evolutionary paths are possible in the context of exoplanets, especially when considering other stellar classes. Realistically, all promising candidates would be followed up with additional observations within the LIFE mission.

It is beyond the scope of this work to conclusively infer the presence of a biosphere from the measured spectra of potentially habitable candidates. As discussed in other works such as Meadows et al. (2018) or Krissansen-Totton et al. (2022), we would require a thorough discussion of the context information available for the observed planetary system before claiming a “life detection”. However, the presented retrieval results are certainly an important piece of information for the development of frameworks for systematically assessing biosignature detections (e.g. Catling et al. 2018; Walker et al. 2018).
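As an illustration of how a metric like the ∆ of Section 4.5 can be computed from two sets of posterior samples, the following sketch may be useful. The definition of ∆ is not reproduced in this text, so the code makes an explicit assumption: it takes ∆ to be the maximum vertical distance between the two empirical cumulative distribution functions, expressed in percent (a Kolmogorov-Smirnov-style statistic, chosen because Figure 10 compares cumulative distribution functions). The function name and the Gaussian sample sets are hypothetical and are not retrieval outputs.

```python
import random
from bisect import bisect_right


def max_cdf_distance(samples_a, samples_b):
    """Maximum distance between two empirical CDFs, in percent.

    ASSUMPTION: this stands in for the Delta metric discussed in the
    text, whose exact definition is not given here.
    """
    a = sorted(samples_a)
    b = sorted(samples_b)
    if not a or not b:
        raise ValueError("both sample sets must be non-empty")
    largest = 0.0
    # The empirical CDFs only change at sample values, so it is
    # sufficient to evaluate the difference there.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        largest = max(largest, abs(cdf_a - cdf_b))
    return 100.0 * largest


# Hypothetical posterior samples of a log10 abundance for three cases.
rng = random.Random(0)
case_1 = [rng.gauss(-6.0, 0.5) for _ in range(2000)]
case_2 = [rng.gauss(-5.9, 0.5) for _ in range(2000)]  # nearly the same
case_3 = [rng.gauss(-3.5, 0.5) for _ in range(2000)]  # clearly shifted

print("similar posteriors:", round(max_cdf_distance(case_1, case_2), 1))
print("distinct posteriors:", round(max_cdf_distance(case_1, case_3), 1))
```

Under this assumed definition, strongly overlapping posteriors give a small value and well-separated posteriors approach 100%, which matches the qualitative behavior of ∆ described in Section 4.5.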