Creating presentations with R Markdown

Brief overview of ioslides

Creating slides with R Markdown

  • To create an ioslides presentation from R Markdown, specify ioslides_presentation in the YAML header
    • Other presentation formats include beamer_presentation and slidy_presentation
---
title: "Formatted Output & R Markdown Presentations"
output:
  ioslides_presentation: default
---

Creating slides with R Markdown

  • Use # for a new section, ## for a new slide, and | to add a subtitle
  • Add {.smaller} to a slide title to display smaller text, or {.build} to reveal the slide content incrementally, as in this slide.
## Slide title {.smaller .build}
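The pieces above can be put together in a small test file. The following is a minimal sketch that writes a placeholder .Rmd file from R and knits it only if rmarkdown and pandoc are available; the file name, title, and slide text are made up for illustration.

```r
# Minimal sketch: write a small .Rmd file that uses the slide syntax above.
# The file name and slide text are placeholders, not taken from this deck.
rmd_lines <- c(
  "---",
  'title: "My presentation"',
  "output:",
  "  ioslides_presentation: default",
  "---",
  "",
  "# A new section | With a subtitle",
  "",
  "## Slide title {.smaller .build}",
  "",
  "- First point",
  "- Second point"
)
rmd_file <- file.path(tempdir(), "example_slides.Rmd")
writeLines(rmd_lines, rmd_file)

# Knit to HTML only if rmarkdown and pandoc are available
if (requireNamespace("rmarkdown", quietly = TRUE) && rmarkdown::pandoc_available()) {
  rmarkdown::render(rmd_file, quiet = TRUE)
}
```

In practice you would write the .Rmd file by hand and use the Knit button; the script form is only a convenient way to experiment with the syntax.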

Here is a short tutorial. See Chapter 4.1 in R Markdown: The Definitive Guide for a more comprehensive treatment.

Including output in your slides

Loading packages

This presentation uses some familiar packages to generate and display results:

  • tidyverse
  • fixest
  • lmtest
  • multiwayvcov
  • kableExtra

Plus some packages for creating tables with formatted output:

  • stargazer
  • modelsummary

Loading data objects for this presentation

Load the data objects you need, which you created and saved in your R script.

You can load an .RData file containing many objects:

load("formatted_output_slides.RData") # includes objects from different lessons

Or you can load an .rds file for a single object:

feDF <- readRDS(file = "feDF.rds") # a single data frame from the Medicaid lesson
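For completeness, here is a sketch of the saving step that would live in the analysis script, followed by the corresponding loading step. The object names, file names, and the use of tempdir() are placeholders chosen for illustration.

```r
# Hypothetical objects standing in for results created in an analysis script
example_df <- data.frame(state = c("A", "B"), value = c(1.5, 2.5))
example_note <- "created in analysis script"

out_dir <- tempdir()  # placeholder location; use your project folder in practice

# Save several objects together, or a single object on its own
save(example_df, example_note, file = file.path(out_dir, "example_objects.RData"))
saveRDS(example_df, file = file.path(out_dir, "example_df.rds"))

# In the .Rmd file: load() restores objects under their original names,
# while readRDS() returns the object so you can choose its name
loaded_env <- new.env()
load(file.path(out_dir, "example_objects.RData"), envir = loaded_env)
reloaded_df <- readRDS(file.path(out_dir, "example_df.rds"))
```

Loading into a separate environment is optional; calling load() without envir places the objects directly in the current workspace.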

Difference-in-means tables

Suppose we’re interested in comparing the poverty rate and vfund_per_poor across regencies on Java and Sulawesi.

Here is a tutorial for making this kind of table using tidyverse functions.

Variable                               Java      Sulawesi   Difference   p-value
Population in poverty (1000s)          137.60    26.25      111.35       0.000
VF allocation/poor resident ($1000s)   1773.33   4603.95    -2830.62     0.000
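A tidyverse pipeline for this kind of table might look like the sketch below. The data are simulated and the column names (island, poverty, vfund_per_poor) are assumptions. The p-value comes from a Welch t-test via t.test(), which may differ from the test behind the table above.

```r
library(dplyr)
library(tidyr)

set.seed(1)
# Simulated regency-level data (not the lesson's data); column names are assumptions
regencies <- tibble(
  island = rep(c("Java", "Sulawesi"), each = 30),
  poverty = c(rnorm(30, 140, 40), rnorm(30, 25, 10)),
  vfund_per_poor = c(rnorm(30, 1800, 500), rnorm(30, 4600, 900))
)

diff_table <- regencies %>%
  # One row per regency-variable pair, so each variable can be summarised in turn
  pivot_longer(c(poverty, vfund_per_poor),
               names_to = "Variable", values_to = "value") %>%
  group_by(Variable) %>%
  summarise(
    Java = mean(value[island == "Java"]),
    Sulawesi = mean(value[island == "Sulawesi"]),
    Difference = Java - Sulawesi,
    # Welch two-sample t-test for equality of means
    p_value = t.test(value[island == "Java"],
                     value[island == "Sulawesi"])$p.value,
    .groups = "drop"
  )

diff_table
```

The resulting data frame can then be passed to a table formatter such as kableExtra for display on a slide.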

Difference-in-means tables

Here is another example using the datasummary_balance() function in the modelsummary package.

Difference-in-means during year of Medicaid implementation by state eligibility

               High-Eligibility (N=24)   Low-Eligibility (N=25)
               Mean    Std. Dev.         Mean    Std. Dev.        Diff. in Means   p
afdc_rate      3.01    0.73              1.56    0.33             -1.45            <0.01
income_pc      3.39    0.68              3.21    0.32             -0.18            0.33
hospitals_pc   0.03    0.01              0.03    0.02             0.00             0.40
beds_pc        4.90    0.74              5.09    0.71             0.19             0.44
Note: States are weighted by child population.

Upside to this function:

  • Can show formatted output without much work

Downside:

  • Isn’t as flexible as the previous approach
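A call along these lines could produce a balance table of this type. This is a sketch on simulated data: the variable names mirror the table above but the values do not, and the title is a placeholder. Depending on the installed modelsummary version, the difference-in-means columns may rely on the estimatr package, so the sketch requests them only when that package is available.

```r
library(modelsummary)

set.seed(2)
# Simulated state-level data: variable names mirror the table above, values do not
states <- data.frame(
  eligibility = rep(c("High-Eligibility", "Low-Eligibility"), times = c(24, 25)),
  afdc_rate   = runif(49, 1, 4),
  income_pc   = runif(49, 2.5, 4)
)

# The difference-in-means columns may rely on estimatr (an assumption about the
# installed modelsummary version), so request them only if it is available
balance_tab <- datasummary_balance(
  ~ eligibility,
  data = states,
  dinm = requireNamespace("estimatr", quietly = TRUE),
  title = "Difference-in-means by state eligibility (simulated data)"
)
balance_tab
```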

Using Stargazer for formatted output

“stargazer is an R package that creates LaTeX code, HTML code and ASCII text for well-formatted regression tables, with multiple models side-by-side, as well as for summary statistics tables, data frames, vectors and matrices.”

Here is a good tutorial with numerous stargazer examples and formatting options.

Regression results

Make sure we know your population regression function (PRF) before reporting regression results!

We should also know how you estimate standard errors and weight observations, along with any other critical details.

For our Medicaid example, here is the PRF for our base difference-in-differences model with state and year fixed effects,

\[Y_{st} = \beta D_{st} + X_{st}' \gamma + \mu_s + \tau_t + \varepsilon_{st}\]

where \(Y_{st}\) is the outcome of interest in state \(s\) and year \(t\); \(D_{st}\) is a binary variable indicating treatment status for a state-year; \(X_{st}\) is a vector of time-varying controls (income, hospitals, and hospital beds per capita); \(\mu_s\) represents state fixed effects; \(\tau_{t}\) represents year fixed effects; and \(\varepsilon_{st}\) is an idiosyncratic error term.
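One way to estimate a PRF of this form is with feols() from fixest, as in the sketch below. The panel is simulated, and the names stfips, year, and child_pop are assumptions standing in for the state identifier, the year, and the weighting variable. Weighting by child population and clustering by state follow the table notes in this deck, not a confirmed copy of the original code.

```r
library(fixest)

set.seed(3)
# Simulated state-year panel; column names are illustrative assumptions
panel <- expand.grid(stfips = 1:20, year = 1960:1975)
panel$D <- as.integer(panel$stfips <= 10 & panel$year >= 1966)  # treated state-years
panel$income_pc <- rnorm(nrow(panel), 3, 0.5)
panel$hospitals_pc <- rnorm(nrow(panel), 0.03, 0.01)
panel$beds_pc <- rnorm(nrow(panel), 5, 0.7)
panel$child_pop <- runif(nrow(panel), 0.5, 2)
panel$Y <- 0.05 * panel$D + 0.02 * panel$income_pc + rnorm(nrow(panel), sd = 0.05)

# Controls enter before the bar; state and year fixed effects enter after it
did_fit <- feols(
  Y ~ D + income_pc + hospitals_pc + beds_pc | stfips + year,
  data = panel,
  weights = ~child_pop,   # weight observations
  cluster = ~stfips       # cluster standard errors by state
)
summary(did_fit)
```

The coefficient on D corresponds to \(\beta\) in the PRF; the fixed effects \(\mu_s\) and \(\tau_t\) are absorbed and not reported.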

Regression tables

Regression tables using modelsummary() with minimal customization:

FE 1  FE 2  FE 3  FE 4  FE 5
D 0.101*** 0.051** 0.038*** 0.038** 0.045***
(0.018) (0.021) (0.013) (0.015) (0.014)
hospitals_pc 0.513 2.017 2.605** 1.208
(1.107) (1.397) (1.072) (1.443)
beds_pc −0.011 −0.013 −0.009 −0.007
(0.011) (0.009) (0.010) (0.010)
income_pc 0.025*** 0.024 0.040* 0.036*
(0.004) (0.020) (0.020) (0.021)
Num.Obs. 685 685 685 685 685
R2 0.662 0.824 0.906 0.888 0.923
R2 Adj. 0.636 0.810 0.886 0.860 0.890
R2 Within 0.421 0.699 0.197 0.197 0.222
R2 Within Adj. 0.420 0.697 0.191 0.191 0.215
RMSE 0.04 0.03 0.03 0.02 0.02
Std.Errors by: stfips by: stfips by: stfips by: stfips by: stfips
FE: stfips X X X X X
FE: region^year X X
FE: year_mcaid^year X X
* p < 0.1, ** p < 0.05, *** p < 0.01

Regression tables

Regression tables using modelsummary() with additional customization:

Effect of high Medicaid eligibility on public insurance use
FE 1  FE 2  FE 3  FE 4  FE 5
High-eligibility 0.101*** 0.051** 0.038*** 0.038** 0.045***
(0.018) (0.021) (0.013) (0.015) (0.014)
Hospitals per capita 0.513 2.017 2.605** 1.208
(1.107) (1.397) (1.072) (1.443)
Hospital beds per capita −0.011 −0.013 −0.009 −0.007
(0.011) (0.009) (0.010) (0.010)
Income per capita 0.025*** 0.024 0.040* 0.036*
(0.004) (0.020) (0.020) (0.021)
N 685 685 685 685 685
State FEs X X X X X
Region-Year FEs X X
Medicaid timing-by-Year FEs X X
* p < 0.1, ** p < 0.05, *** p < 0.01
Robust standard errors clustered by state are shown in parentheses. Observations are weighted by the child population in each state.

See this resource for more details on modelsummary().
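The kind of customization shown in the second table can be sketched with a few modelsummary() arguments: coef_map to relabel and select coefficients, gof_omit to drop goodness-of-fit rows, and stars, title, and notes for the remaining formatting. The models below are simple lm() fits on simulated data, used only to keep the example self-contained.

```r
library(modelsummary)

set.seed(4)
# Simulated data; names D and income_pc mirror the table above, values do not
df <- data.frame(D = rbinom(200, 1, 0.5), income_pc = rnorm(200, 3, 0.5))
df$y <- 0.1 * df$D + 0.02 * df$income_pc + rnorm(200, sd = 0.1)

models <- list(
  "Model 1" = lm(y ~ D, data = df),
  "Model 2" = lm(y ~ D + income_pc, data = df)
)

tab <- modelsummary(
  models,
  # Relabel coefficients; anything not listed (e.g. the intercept) is dropped
  coef_map = c("D" = "High-eligibility", "income_pc" = "Income per capita"),
  # Drop goodness-of-fit rows whose names match this pattern, keeping Num.Obs.
  gof_omit = "R2|AIC|BIC|Log.Lik|F|RMSE",
  stars = c("*" = 0.1, "**" = 0.05, "***" = 0.01),
  title = "Illustrative regression table (simulated data)",
  notes = "Standard errors are shown in parentheses."
)
tab
```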

Regression tables

Here is another example from the Detroit water shutoff analysis that relies on lm() for FE estimation and stargazer for formatting results. The full PRF is:

\[total\_obs\_1000_{zt} = \beta_0 + \beta_1 si\_1000_{zt} + \beta_2 vac\_res\_p100_{zt} + \phi_z + \theta_t + u_{zt}\]

Columns 1 and 2 show results without controlling for vacancy rates; columns 3 and 4 include the vacancy rate as a control variable.

                 Dependent variable: total_obs_1000
                 (1)         (2)         (3)         (4)
si_1000          0.013**     0.013*      0.016***    0.016**
                 (0.006)     (0.007)     (0.006)     (0.007)
vac_res_p100                             -0.156***   -0.156***
                                         (0.012)     (0.059)
Constant         17.562***   17.562***   18.351***   18.351***
                 (0.302)     (0.251)     (0.274)     (0.307)
Zip code FE      Yes         Yes         Yes         Yes
Month FE         Yes         Yes         Yes         Yes
Clustered SEs    No          Zipcode     No          Zipcode
Mean dep var     12.43       12.43       12.43       12.43
Observations     2,880       2,880       2,784       2,784
Adjusted R2      0.835       0.835       0.846       0.846
Note: *p<0.1; **p<0.05; ***p<0.01
Robust standard errors are shown in parentheses unless otherwise indicated.
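A table with this layout can be sketched with lm(), multiwayvcov, and stargazer as follows. The data are simulated, the variable names follow the PRF above, and only two columns are shown (default versus zip-clustered standard errors). This is one plausible workflow, not necessarily the one used to build the table above.

```r
library(stargazer)
library(multiwayvcov)

set.seed(5)
# Simulated zip-by-month panel; names follow the PRF above, values are made up
zips <- expand.grid(zip = factor(1:10), month = factor(1:12))
zips$si_1000 <- runif(nrow(zips), 0, 50)
zips$total_obs_1000 <- 10 + 0.015 * zips$si_1000 + rnorm(nrow(zips))

# Fixed effects entered as zip and month dummies
fit <- lm(total_obs_1000 ~ si_1000 + zip + month, data = zips)

# Cluster-robust standard errors by zip code; pmax() guards against tiny
# negative variances from rounding on the fixed-effect dummies
cl_se <- sqrt(pmax(diag(cluster.vcov(fit, zips$zip)), 0))

stargazer(
  fit, fit,
  type = "text",             # use "html" with results = 'asis' when knitting slides
  se = list(NULL, cl_se),    # column 1: default SEs, column 2: clustered SEs
  omit = c("zip", "month"),  # hide the fixed-effect dummies
  add.lines = list(
    c("Zip code FE", "Yes", "Yes"),
    c("Month FE", "Yes", "Yes"),
    c("Clustered SEs", "No", "Zipcode")
  ),
  omit.stat = c("f", "ser")
)
```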

Additional ioslides formatting

Creating bullet point lists

  • This is a bullet point
  • This is another bullet point
    • With a sub bullet

Add a plot

Make sure to clearly label your axes and legend!

Note that when you’re working with panel data, descriptive time series plots by group can help motivate your analysis.
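A descriptive time series plot by group might be sketched with ggplot2 as below. The data are simulated, and the group labels, variable names, and axis text are placeholders.

```r
library(ggplot2)

set.seed(6)
# Simulated panel: a yearly outcome for two groups (labels and values are placeholders)
plot_df <- expand.grid(year = 1960:1975,
                       group = c("High-eligibility", "Low-eligibility"))
plot_df$outcome <- 0.10 + 0.005 * (plot_df$year - 1960) +
  0.03 * (plot_df$group == "High-eligibility") +
  rnorm(nrow(plot_df), sd = 0.005)

# One line per group, with explicit axis and legend labels
p <- ggplot(plot_df, aes(x = year, y = outcome, color = group)) +
  geom_line() +
  geom_point() +
  labs(x = "Year", y = "Outcome (simulated)", color = "Eligibility group",
       title = "Outcome over time by group")
p
```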

Tips for your presentation

  • Remember to do all of your work in R script(s) and only load the objects you need into your .Rmd file (use .RData and .rds files)
  • Don’t describe your code or nonessential data management details; describe your analysis and results.
  • Having extra appendix slides on hand is fine, but don’t try to cover too much! Stick to the essential parts of your story.
  • Don’t forget to spellcheck and review your knitted document!