Forschungspraktikum 1+2: Computational Social Science

Session 11: Open Science

Dr. Christian Czymara

Agenda

  • Evaluation
  • Social science’s replication crisis
  • The philosophy of Open Science
  • Exercise: Setting up GitHub

Evaluation


Replication Crisis

  • A systemic issue in science where many studies fail to produce the same results when repeated
  • Affects psychology, medicine, economics, biology, and other disciplines

Emergence of the Crisis

Causes of the Replication Crisis

  • Publication Bias: Preference for novel, positive results
  • File drawer problem: Null results are more likely to go unpublished
  • P-Hacking: Re-analyzing or selectively reporting data until a statistically significant result appears (e.g., \(p<0.05\))
  • Low statistical power: Small sample sizes lead to unreliable results
  • Lack of transparency: Inadequate sharing of data, methods, and materials
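
The effect of p-hacking on false positives can be illustrated with a small simulation. This is only a sketch under stated assumptions: there is no true effect, each hypothetical study compares two groups with a simple z-test (known unit variance), and a "hacked" study reports the first of ten unrelated outcomes that reaches \(p<0.05\). All function names, sample sizes, and the number of outcomes are chosen for illustration.

```python
import math
import random


def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def z_statistic(group_a, group_b):
    """z-test for a difference in means, assuming known unit variance."""
    n_a, n_b = len(group_a), len(group_b)
    diff = sum(group_a) / n_a - sum(group_b) / n_b
    return diff / math.sqrt(1 / n_a + 1 / n_b)


def false_positive_rate(n_outcomes, n_per_group=30, n_studies=2000, seed=1):
    """Share of null studies that find at least one outcome with p < 0.05."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_studies):
        for _ in range(n_outcomes):
            # Both groups come from the same distribution: no true effect.
            a = [rng.gauss(0, 1) for _ in range(n_per_group)]
            b = [rng.gauss(0, 1) for _ in range(n_per_group)]
            if two_sided_p(z_statistic(a, b)) < 0.05:
                hits += 1
                break  # stop at the first "significant" outcome
    return hits / n_studies


honest = false_positive_rate(n_outcomes=1)
hacked = false_positive_rate(n_outcomes=10)
print(f"one pre-specified outcome: {honest:.3f}")
print(f"best of ten outcomes:      {hacked:.3f}")
```

With one pre-specified outcome, the false positive rate should stay near the nominal 5%. With ten independent outcomes, the expected rate is \(1 - 0.95^{10} \approx 0.40\), although nothing real is being measured.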

Impact of the Crisis

  • Loss of trust in published findings
  • Waste of resources on irreproducible studies
  • Misguided policies or treatments based on flawed research
  • Erosion of public trust in science

Replication Studies in the Social Sciences

  • Studies aimed at repeating previous studies to verify their results
  • (Usually) same methods, conditions, and context as the original study
  • Related idea: Do independent teams of researchers reach the same conclusion when testing the same hypothesis with the same data? (“many-analysts approach”)

Replication Studies: Examples

Criticism of Replication Studies

Addressing the Crisis

  • Pre-Registration: Register hypotheses, methods, and analyses before conducting research
  • Open Science: Sharing datasets, code, and protocols
  • Replication Studies
  • Statistical Reforms: Bayesian methods, confidence intervals, larger sample sizes
  • Cultural change in science: Shift the focus from novelty to reliability (e.g., PLOS ONE: “The editors make decisions on submissions based on scientific rigor, regardless of novelty.”)
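
Why larger samples help can be sketched with a short power calculation. The example assumes a two-sided two-group z-test with known unit variance and a hypothetical standardized effect of 0.3; the function names are illustrative and not taken from any package.

```python
import math


def normal_cdf(x):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * math.erfc(-x / math.sqrt(2))


def power_two_group_z(effect_size, n_per_group, z_crit=1.959964):
    """Power of a two-sided z-test comparing two equally sized groups.

    effect_size is the true mean difference in standard deviation units;
    z_crit is the critical value for a 5% significance level.
    """
    shift = effect_size * math.sqrt(n_per_group / 2)
    return (1 - normal_cdf(z_crit - shift)) + normal_cdf(-z_crit - shift)


# Hypothetical effect of 0.3 standard deviations, growing sample sizes.
for n in (20, 50, 100, 200, 400):
    print(f"n per group = {n:3d}  power = {power_two_group_z(0.3, n):.2f}")
```

Under these assumptions, a study with 20 participants per group detects the effect in fewer than one out of five attempts, while 200 per group exceeds the conventional 80% power threshold.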

Introduction to Open Science

  • A movement to make scientific research accessible, transparent, and reproducible
  • Core principles
    • Open data
    • Open methods
    • Open access
    • Open collaboration

The Core Tenets of Open Science

  • Transparency
    • Sharing data, methods, and code
    • Clear documentation of workflows
  • Reproducibility
    • Ensuring results can be independently verified
    • Avoiding the replication crisis
  • Collaboration
    • Facilitating interdisciplinary and global teamwork
    • Examples: Open-source projects, collaborative platforms

Tools for Open Science

A (Slowly) Changing Culture?

Sharing Materials

Open Science Framework (OSF)

GitHub

  • GitHub is a platform for hosting and collaborating on software development projects using Git, a version control system (VCS)
  • Offers version control, collaboration, and project management
  • Tracks who made changes and when
  • Allows reverting to previous versions
  • Supports branching for parallel development

GitHub

  • Cloud-Based Hosting: Stores Git repositories online
  • Collaboration Tools: Issue tracking, pull requests, and team discussions
  • Community: Hosts a large number of open-source and private projects

Key Features of GitHub

  • Repositories (Repos): Central location for a project’s files and history
  • Branches: Separate versions of a project for different tasks
  • Pull Requests: Propose and discuss changes before merging into the main branch
  • Issues and Discussions: Track bugs, suggest features, and foster collaboration

Basic GitHub Workflow

  • Create or Clone a Repository: Start a new project or download an existing one to your local machine
  • Make Changes: Edit files and track changes using Git
  • Commit Changes: Save a snapshot of your work
  • Push Changes: Upload updates to the GitHub repository
  • Pull Requests: Submit changes for review and merging
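
For those who prefer the command line, the workflow above corresponds roughly to the Git commands assembled in the following sketch. The repository URL, folder, file name, and commit message are placeholders, and the branch name `main` is an assumption. By default the script only prints the commands (dry run); the pull request itself is opened on the GitHub website after pushing.

```python
import subprocess


def workflow_commands(repo_url, folder, files, message, branch="main"):
    """Return the Git commands for clone -> stage -> commit -> push."""
    return [
        ["git", "clone", repo_url, folder],              # download the project
        ["git", "-C", folder, "add", *files],            # stage edited files
        ["git", "-C", folder, "commit", "-m", message],  # save a snapshot
        ["git", "-C", folder, "push", "origin", branch],  # upload to GitHub
    ]


def run(commands, dry_run=True):
    """Print each command; execute it only if dry_run is False."""
    for cmd in commands:
        print("$", " ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


commands = workflow_commands(
    repo_url="https://github.com/your-user/your-repo.git",  # placeholder URL
    folder="your-repo",                                     # placeholder folder
    files=["analysis.R"],                                   # placeholder file
    message="Add analysis script",
)
run(commands)  # dry run: nothing is executed
```

Editing the files happens between the clone and the `add` step; `run(commands, dry_run=False)` would execute the commands and requires Git to be installed and the repository to exist.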

Using GitHub

  • GitHub Desktop is a graphical user interface (GUI) application for managing Git repositories
  • Simplifies Git workflows without requiring command-line usage
  • Alternatively, use Git and GitHub directly from within RStudio

Integrating GitHub with RStudio

  • Install Git and create a GitHub account
  • Open RStudio > Tools > Global Options > Git/SVN
  • Set the Git executable path (e.g., C:/Program Files/Git/bin/git.exe)
  • In RStudio, click File > New Project > Version Control > Git
  • Enter the repository URL from GitHub
  • Use the Git tab in RStudio to stage, commit, and push changes to your GitHub repository
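
If the Git executable path is unknown, a small script can look it up. This is a minimal sketch that assumes Git is on the system PATH; if it is installed elsewhere, the path has to be located manually.

```python
import shutil
import subprocess

# Search the PATH for the Git executable; returns None if it is not found.
git_path = shutil.which("git")

if git_path is None:
    print("Git was not found on the PATH. Is it installed?")
else:
    # Ask Git for its version to confirm that the executable works.
    result = subprocess.run(
        [git_path, "--version"], capture_output=True, text=True, check=False
    )
    print("Git executable:", git_path)
    print("Reported version:", result.stdout.strip())
```

The printed path is the value to enter under Tools > Global Options > Git/SVN.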

More Resources

Uploading Your Materials to GitHub

  • Please share your term paper code via GitHub
  • Create a GitHub repository (e.g., Term_Paper_Project_YourName)
  • Upload your code using GitHub Desktop, RStudio, or GitHub’s web interface

Sharing Your Materials on GitHub

  • Share your repository with me
  • Option 1: Make the repository public and share the link
  • Option 2: Keep the repository private and invite me as a collaborator
  • Go to the repository > Settings > Collaborators (labeled “Manage access” in older versions of the interface) > Add people

Tutorial 11.