Our Approach to Data Analysis

Overview

We conduct data analysis in pursuit of three interconnected goals. First, we want to mechanistically understand past changes in the Earth system. Second, we want to project plausible future changes in the Earth system. Finally, we want to use this information to design sustainable, scientifically sound, technologically feasible, economically efficient, and ethically defensible risk management strategies. Our data analysis draws from a wide range of disciplines including engineering, Earth sciences, economics, philosophy, decision science, and statistics.

We believe that for our analyses to be useful and used, they must be transparent, rigorous, interpretable, and reusable.

The checklist below, designed by Klaus Keller, James Doss-Gollin, and Vivek Srikrishnan, provides the contours of the workflows we use to achieve our analysis goals:

  1. Setting up the analysis
    1. Legal and ethical questions assessed
    2. Clear question(s) defined
    3. Link to decision-making established
    4. Possible design errors discussed (types I, II, III, IV)
    5. License defined
    6. Copyright defined
    7. Authors defined
  2. Choosing a methods portfolio
    1. Bayesian vs. frequentist approach chosen
    2. Analytical vs computational approach chosen
    3. Data generating model defined
    4. Likelihood function defined
    5. (Implicit) prior (with correlations) defined
    6. Required approximations defined (likelihood function, model structure, uncertainties, …)
    7. Assumptions spelled out
    8. Positive controls (across key methods and assumptions) performed
    9. Negative controls (across key methods and assumptions) performed
    10. Expected results (with time-stamp) documented
  3. Performing the analysis
    1. Convergence tested (if using numerical methods)
    2. Assumptions tested (e.g., residuals consistent with likelihood function)
    3. Prior predictive checks passed
    4. Posterior predictive checks passed
    5. Size and effects of deep uncertainties and forking paths characterized
  4. Documenting the analysis
    1. Authors named and reachable
    2. Environment variables documented
    3. Assumptions and robustness discussed
    4. Code reviewed for logic and consistency with documentation
    5. Figures publication ready
    6. Documentation accessible to intended audience(s)
    7. All sources cited
    8. Non-author contributors acknowledged
    9. Funding acknowledged
    10. Conflict of interest declared
    11. Materials uploaded to repository
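
To make a few of the checklist items concrete, here is a minimal Python sketch of how a positive control, a negative control, and a posterior predictive check might look for a toy problem. It is an illustration under stated assumptions, not a description of the workflow used in any particular analysis: the data-generating model is an assumed Normal distribution with known standard deviation, the prior on the mean is an assumed wide Normal, and all function names and numeric values (simulate_data, posterior_for_mean, predictive_p_value, TRUE_MU, ASSUMED_SIGMA) are hypothetical.

```python
import math
import random
import statistics


def simulate_data(mu, sigma, n, rng):
    """Data-generating model: n independent draws from Normal(mu, sigma)."""
    return [rng.gauss(mu, sigma) for _ in range(n)]


def posterior_for_mean(data, sigma, prior_mean, prior_sd):
    """Conjugate Normal-Normal update for the mean, treating sigma as known."""
    prior_precision = 1.0 / prior_sd**2
    data_precision = len(data) / sigma**2
    post_var = 1.0 / (prior_precision + data_precision)
    post_mean = post_var * (
        prior_precision * prior_mean + data_precision * statistics.fmean(data)
    )
    return post_mean, math.sqrt(post_var)


def predictive_p_value(data, sigma, post_mean, post_sd, rng, n_rep=2000):
    """Posterior predictive check using the sample maximum as test statistic.

    Returns the fraction of replicated data sets whose maximum is at least
    as large as the observed maximum. Values near 0 or 1 flag a mismatch
    between the fitted model and the data.
    """
    observed_max = max(data)
    exceed = 0
    for _ in range(n_rep):
        mu_draw = rng.gauss(post_mean, post_sd)  # draw from the posterior
        replicate = simulate_data(mu_draw, sigma, len(data), rng)
        if max(replicate) >= observed_max:
            exceed += 1
    return exceed / n_rep


rng = random.Random(42)  # fixed seed so the run is reproducible
TRUE_MU, ASSUMED_SIGMA = 2.0, 1.0  # hypothetical values for illustration

# Positive control: synthetic data that match the assumed model.
# The method should recover the known mean and pass the check.
good_data = simulate_data(TRUE_MU, ASSUMED_SIGMA, n=50, rng=rng)
good_mean, good_sd = posterior_for_mean(good_data, ASSUMED_SIGMA, 0.0, 10.0)
good_p = predictive_p_value(good_data, ASSUMED_SIGMA, good_mean, good_sd, rng)

# Negative control: the true spread is five times larger than assumed.
# The same check should now flag the misspecified likelihood.
bad_data = simulate_data(TRUE_MU, 5.0 * ASSUMED_SIGMA, n=50, rng=rng)
bad_mean, bad_sd = posterior_for_mean(bad_data, ASSUMED_SIGMA, 0.0, 10.0)
bad_p = predictive_p_value(bad_data, ASSUMED_SIGMA, bad_mean, bad_sd, rng)

print(f"positive control: mean {good_mean:.2f} +/- {good_sd:.2f}, p = {good_p:.3f}")
print(f"negative control: p = {bad_p:.3f}")
```

The sample maximum is only one possible test statistic; it is chosen here because it is sensitive to an underestimated spread. In a real analysis the statistic, the prior, and the thresholds for "passing" would follow from the question and decision context defined when setting up the analysis.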

Motivating References

  • Doss-Gollin, J., & Keller, K. (2022). A subjective Bayesian framework for synthesizing deep uncertainties in climate risk management. Earth’s Future. https://doi.org/10.1029/2022ef003044

  • Gelman, A., Vehtari, A., Simpson, D., Margossian, C. C., Carpenter, B., Yao, Y., et al. (2020, November 3). Bayesian Workflow. arXiv [stat.ME]. Retrieved from http://arxiv.org/abs/2011.01808

  • Pollack, A., Auermuller, L., Burleyson, C., Campbell, J. E., Condon, M., Cooper, C., et al. (2023, December). Investing in open and FAIR practices for more usable and equitable climate-risk research. Preprint. Retrieved from http://dx.doi.org/10.31219/osf.io/29nhv