Panel data visualization
panelView.Rd
Visualizes missing values, treatment and outcome variables, and their relationships in panel data
Usage
panelview(data, formula = NULL, Y = NULL, D = NULL,
X = NULL, index,
ignore.treat = FALSE, type = "treat",
outcome.type = "continuous",
treat.type = NULL, by.group = FALSE, by.group.side = FALSE,
by.timing = FALSE, theme.bw = TRUE,
xlim = NULL, ylim = NULL,
xlab = NULL, ylab = NULL,
gridOff = FALSE, legendOff = FALSE,
legend.labs = NULL, main = NULL,
pre.post = NULL, id = NULL, show.id = NULL,
color = NULL, axis.adjust = FALSE, axis.lab = "both",
axis.lab.gap = c(0, 0), axis.lab.angle = NULL, shade.post = TRUE,
cex.main = 15, cex.main.sub = 12, cex.axis = 8,
cex.axis.x = NULL, cex.axis.y = NULL,
cex.lab = 12, cex.legend = 12, background = NULL,
style = NULL, by.unit = FALSE, lwd = 0.2, leave.gap = FALSE,
display.all = NULL, by.cohort = FALSE,
collapse.history = NULL, report.missing = FALSE)
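A minimal sketch of a call, assuming a small simulated panel. The data frame df and its columns unit, time, D and Y are hypothetical and not shipped with the package. Going by the argument descriptions, the formula interface and the Y/D interface should select the same variables. The calls are skipped when panelView is not installed.

```r
# Simulated balanced panel: 6 units observed over 5 periods (illustrative data only).
set.seed(1)
df <- expand.grid(unit = 1:6, time = 1:5)

# Units 1-3 switch into treatment from period 3 onward; units 4-6 are never treated.
df$D <- as.integer(df$unit <= 3 & df$time >= 3)
df$Y <- 1 + 0.5 * df$D + rnorm(nrow(df))

# Only call the package if it is available.
has_pv <- requireNamespace("panelView", quietly = TRUE)

# Formula interface: outcome on the left, treatment first on the right-hand side.
if (has_pv) panelView::panelview(data = df, formula = Y ~ D,
                                 index = c("unit", "time"), type = "treat")

# Naming the outcome and treatment variables directly instead of using a formula.
if (has_pv) panelView::panelview(data = df, Y = "Y", D = "D",
                                 index = c("unit", "time"), type = "treat")
```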
Arguments
- data
a data frame. The panel does not have to be balanced.
- formula
an object of class "formula": a symbolic description of the model to be fitted. The first variable on the right-hand side is designated as the treatment indicator if ignore.treat = FALSE. If there are no covariates, the formula should be like Y ~ 1, where Y is the outcome variable.
- Y
variable name of the outcome. Ignored if formula is provided.
- D
variable name of the treatment. Ignored if formula is provided.
- X
variable name(s) of the time-varying covariates. Ignored if formula is provided.
- index
a two-element string vector specifying the unit (group) and time indicators. Must be of length 2.
- ignore.treat
a logical flag indicating whether to ignore the treatment variable, for example when the data contain none. Default value is ignore.treat = FALSE.
- type
a string that specifies the type of the plot. Must be one of "treat" (default), which plots the treatment status of each unit at each time point; "missing", which plots the missing data; "outcome", which plots the raw outcome data; or "bivariate", which plots time series of outcome and treatment in one graph.
- outcome.type
a string that specifies the type of the outcome variable. Must be either "continuous" (default) or "discrete". For a continuous outcome, time-series lines for the specified units are plotted; for a discrete outcome, jittered points at each time period are plotted.
- treat.type
a string that specifies the type of the treatment variable. Must be either "continuous" or "discrete". The default is NULL, in which case the type is chosen from the number of unique treatment values: if that number is greater than 10, it is set to "continuous"; otherwise, it is set to "discrete".
- by.group
a logical flag indicating whether, in the outcome plot, the data should be plotted in separate groups based on changes in treatment status, with the subfigures arranged in a column.
- by.group.side
a logical flag indicating whether to arrange the subfigures of by.group = TRUE in a row rather than in a column.
- by.timing
a logical flag indicating whether the units should be sorted by the timing of receiving the treatment in the treat plot.
- theme.bw
a logical flag specifying whether to use a black-and-white theme.
- xlim
a two-element numeric vector specifying the range of the x-axis. When the time variable is of class string, the range of strings to be shown must still be specified numerically, e.g. xlim = c(1, 30).
- ylim
a two-element numeric vector specifying the range of the y-axis.
- xlab
a string indicating the label of the x-axis.
- ylab
a string indicating the label of the y-axis.
- gridOff
a logical flag controlling whether to show the grid lines on the treat plot.
- legendOff
a logical flag controlling whether to show the legend.
- legend.labs
a vector specifying the legend labels. Ignored when legendOff = TRUE.
- main
a string that controls the title of the plot.
- pre.post
a logical flag indicating whether to distinguish control status of treated units from that of control units. Only used for staggered data in the treat and outcome plots.
- id
a vector specifying units to be shown in the plot. Useful when the number of units is very large.
- show.id
a numeric vector or sequence specifying the sorted order of units to be shown in the "treat" plot. Useful when the number of units is very large. Ignored if id is specified, i.e. !is.null(id).
- color
a string vector specifying color setting for the plot.
- axis.adjust
a logical flag indicating whether to adjust labels on the x-axis. Useful when the time variable is of class string and there are many time periods.
- axis.lab
a string indicating whether labels on the x- and y-axis will be shown. There are four options: "both" (default): labels on both axes will be shown; "unit": only labels on the y-axis will be shown; "time": only labels on the x-axis will be shown; "none": no labels will be shown.
- axis.lab.gap
a numeric vector setting the gaps between labels on the x- or y-axis for the plot. Default is axis.lab.gap = c(0, 0), which means that all labels will be shown. Useful for datasets with large N or T.
- axis.lab.angle
a numeric value setting the angle (degrees) of the labels shown on the x-axis. Must be between 0 and 90.
- shade.post
a logical flag controlling whether to shade the post-treatment periods. Ignored if type = "treat" or no treatment variable is supplied.
- cex.main
a numeric value (pt) specifying the font size of the main title.
- cex.main.sub
a numeric value (pt) specifying the font size of the subtitles. Ignored if type = "treat" or by.group = FALSE.
- cex.axis
a numeric value (pt) specifying the font size of the text on the axes; overridden by cex.axis.x or cex.axis.y.
- cex.axis.x
a numeric value (pt) specifying the fontsize of the texts on the x-axis.
- cex.axis.y
a numeric value (pt) specifying the fontsize of the texts on the y-axis.
- cex.lab
a numeric value (pt) specifying the fontsize of the axis titles.
- cex.legend
a numeric value (pt) specifying the fontsize of the legend.
- background
a character specifying the background color.
- by.unit
a logical flag indicating whether to plot each specified unit separately or to plot the means of D and Y against time in the same graph.
- style
a string vector to set line/connected line/bar styles for the outcome and treatment variables.
- lwd
a numeric value (pt) specifying the line width when plotting time series of treatment and outcome variables.
- leave.gap
a logical flag indicating whether to keep time gaps as white bars if time is not evenly distributed (possibly due to missing data). Default value is leave.gap = FALSE.
- display.all
a logical flag indicating whether to show all units when the number of units exceeds 500. Otherwise, 500 randomly selected units are shown.
- by.cohort
a logical flag indicating whether to plot the average outcome lines based on unique treatment histories in an "outcome" plot.
- collapse.history
a logical flag indicating whether to collapse units by treatment history in a "treat" plot.
- report.missing
a logical flag indicating whether to report missingness in the included variables.
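The default rule described for treat.type (more than 10 unique treatment values means "continuous", otherwise "discrete") can be sketched in base R as follows. The helper guess_treat_type is hypothetical and is not part of panelView. Dropping missing values before counting is an assumption of this sketch, not something the description above states.

```r
# Hypothetical helper (not a panelView function): mirrors the documented
# default for treat.type = NULL.
guess_treat_type <- function(d) {
  # Assumption of this sketch: missing values are not counted as a treatment level.
  n_unique <- length(unique(d[!is.na(d)]))
  if (n_unique > 10) "continuous" else "discrete"
}

# A binary treatment has two unique values.
binary_type <- guess_treat_type(c(0, 1, 1, 0, NA))

# A dosage measured on a fine grid has 21 unique values.
dosage_type <- guess_treat_type(seq(0, 1, by = 0.05))
```

Under this sketch, binary_type is "discrete" and dosage_type is "continuous". Setting treat.type explicitly avoids relying on the threshold when a discrete treatment happens to take more than 10 values.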
Details
panelview visualizes the treatment status, missing values, and raw outcome data of a time-series cross-sectional dataset.
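As an illustration of the three views named above, the following sketch requests each plot type in turn on a simulated panel with staggered treatment adoption and a few missing outcomes. The data frame df, its column names, and the adoption years are hypothetical. The calls are skipped when panelView is not installed.

```r
# Simulated panel: 8 units over 2001-2006 (illustrative data only).
set.seed(2)
df <- expand.grid(unit = 1:8, time = 2001:2006)

# Staggered adoption: units 1-4 adopt in 2002-2005; units 5-8 are never treated.
adopt <- c(2002, 2003, 2004, 2005, Inf, Inf, Inf, Inf)
df$D <- as.integer(df$time >= adopt[df$unit])
df$Y <- df$unit / 4 + df$D + rnorm(nrow(df))

# Blank out five outcome values so the missing-data view has something to show.
df$Y[sample(nrow(df), 5)] <- NA

has_pv <- requireNamespace("panelView", quietly = TRUE)

# Treatment status of each unit at each time point.
if (has_pv) panelView::panelview(data = df, formula = Y ~ D,
                                 index = c("unit", "time"), type = "treat")

# Pattern of missing values.
if (has_pv) panelView::panelview(data = df, formula = Y ~ D,
                                 index = c("unit", "time"), type = "missing")

# Raw outcome series.
if (has_pv) panelView::panelview(data = df, formula = Y ~ D,
                                 index = c("unit", "time"), type = "outcome")
```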
Author
Hongyu Mou <hongyumou@g.ucla.edu>
Licheng Liu <liulch@mit.edu>
Yiqing Xu <yiqingxu@stanford.edu>