Generating: To make a model, provide a DAG statement to make_model, for instance "X -> Y", "X -> M -> Y <- X", or "Z -> X -> Y <-> X".
Graphing: Once you have made a model you can inspect the DAG:
Simple summaries: You can access a simple summary using summary():
summary(xy_model)
#> 
#> Causal statement: 
#> X -> Y
#> 
#> Nodal types: 
#> $X
#> 0  1
#> 
#>   node position display interpretation
#> 1    X       NA      X0          X = 0
#> 2    X       NA      X1          X = 1
#> 
#> $Y
#> 00  10  01  11
#> 
#>   node position display interpretation
#> 1    Y        1   Y[*]*      Y | X = 0
#> 2    Y        2   Y*[*]      Y | X = 1
#> 
#> Number of types by node:
#> X Y 
#> 2 4 
#> 
#> Number of causal types:  8
#> 
#> Note: Model does not contain: posterior_distribution, stan_objects;
#> to include these objects use update_model()
#> 
#> Note: To pose causal queries of this model use query_model()
Alternatively, you can examine model details using inspect().
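The count of 8 causal types in the summary is the product of the nodal type counts (2 for X, 4 for Y). The sketch below reproduces that enumeration in base R rather than with package functions. It assumes the label convention shown in the summary, where a Y label "ab" means Y = a when X = 0 and Y = b when X = 1.

```r
# Nodal types as listed in the summary: X has 2, Y has 4
x_types <- c("0", "1")
y_types <- c("00", "10", "01", "11")

# A causal type is one nodal type per node
causal_types <- expand.grid(X = x_types, Y = y_types,
                            stringsAsFactors = FALSE)

# Outcome implied by each causal type: read the character of the
# Y label that corresponds to the realised value of X
x_val <- as.integer(causal_types$X)
causal_types$Y_obs <- as.integer(
  substr(causal_types$Y, x_val + 1, x_val + 1)
)

nrow(causal_types)
causal_types
```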
Inspecting: The model has a set of parameters and a default distribution over these.
xy_model |> inspect("parameters_df")
#> 
#> parameters_df
#> Mapping of model parameters to nodal types: 
#> 
#>   param_names: name of parameter
#>   node:        name of endogeneous node associated
#>                with the parameter
#>   gen:         partial causal ordering of the
#>                parameter's node
#>   param_set:   parameter groupings forming a simplex
#>   given:       if model has confounding gives
#>                conditioning nodal type
#>   param_value: parameter values
#>   priors:      hyperparameters of the prior
#>                Dirichlet distribution 
#> 
#>   param_names node gen param_set nodal_type given param_value priors
#> 1         X.0    X   1         X          0              0.50      1
#> 2         X.1    X   1         X          1              0.50      1
#> 3        Y.00    Y   2         Y         00              0.25      1
#> 4        Y.10    Y   2         Y         10              0.25      1
#> 5        Y.01    Y   2         Y         01              0.25      1
#> 6        Y.11    Y   2         Y         11              0.25      1
Tailoring: These features can be edited using set_restrictions, set_priors and set_parameters.
Here is an example of setting a monotonicity restriction (see
?set_restrictions for more):
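The example code is not shown in this section. One possible sketch, assuming the CausalQueries package and its decreasing() helper, applies the restriction to the "Z -> X -> Y <-> X" statement mentioned earlier. The choice of model and the name iv_model are assumptions made for illustration.

```r
library(CausalQueries)

# Monotonicity: rule out units for which Z has a negative effect on X
iv_model <- make_model("Z -> X -> Y <-> X") |>
  set_restrictions(decreasing("Z", "X"))

# The decreasing X type (label 10) should no longer be listed
inspect(iv_model, "parameters_df")
```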
Here is an example of setting priors (see ?set_priors
for more):
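The example code is not shown in this section. A minimal sketch, assuming the CausalQueries package and the distribution argument of set_priors; the name xy_jeffreys is illustrative.

```r
library(CausalQueries)

# Replace the default flat priors (hyperparameters of 1) with
# Jeffreys priors (hyperparameters of 0.5)
xy_jeffreys <- make_model("X -> Y") |>
  set_priors(distribution = "jeffreys")

# The priors column should now show 0.5 for every parameter
inspect(xy_jeffreys, "parameters_df")
```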
Simulation: Data can be drawn from a model like this:
| Z | X | Y | 
|---|---|---|
| 0 | 0 | 1 | 
| 0 | 1 | 0 | 
| 0 | 1 | 0 | 
| 1 | 0 | 0 | 
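The call that produced this table is not shown. The columns Z, X and Y match the "Z -> X -> Y <-> X" statement, so the sketch below assumes that model and uses make_data from the CausalQueries package. The seed and object names are arbitrary, and the rows drawn will generally differ from those in the table.

```r
library(CausalQueries)

set.seed(1)  # arbitrary seed so the draw is reproducible

iv_model <- make_model("Z -> X -> Y <-> X")

# Draw four observations at the model's current parameter values
sim_data <- make_data(iv_model, n = 4)
sim_data
```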
Updating: Update the model with data using update_model.
All rstan arguments can be passed through update_model.
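library(dplyr)  # provides mutate()
# Simulated data: X is a fair coin flip; Y = 1 with probability 0.25 + 0.5 * X
df <-
  data.frame(X = rbinom(100, 1, .5)) |>
  mutate(Y = rbinom(100, 1, .25 + X*.5))
# refresh = 0 is passed on to rstan and silences sampling progress output
xy_model <-
  xy_model |>
  update_model(df, refresh = 0)
Inspecting: You can access the posterior distribution on model parameters directly thus: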
| X.0 | X.1 | Y.00 | Y.10 | Y.01 | Y.11 | 
|---|---|---|---|---|---|
| 0.4802981 | 0.5197019 | 0.1754291 | 0.1730648 | 0.5101839 | 0.1413222 | 
| 0.5969120 | 0.4030880 | 0.0672990 | 0.1458238 | 0.5314693 | 0.2554079 | 
| 0.4081154 | 0.5918846 | 0.1279818 | 0.0784327 | 0.6366884 | 0.1568971 | 
| 0.5074739 | 0.4925261 | 0.1346880 | 0.0945238 | 0.6796534 | 0.0911348 | 
| 0.5293336 | 0.4706664 | 0.1725529 | 0.0041493 | 0.4037340 | 0.4195638 | 
| 0.5379008 | 0.4620992 | 0.0359858 | 0.1687144 | 0.6990939 | 0.0962059 | 
where each row is a draw of parameters.
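Because each parameter set forms a simplex (see param_set in parameters_df), the parameters for each node should sum to one within every draw. Here is a small base R check on the first three draws from the table, with values copied as displayed and a tolerance to allow for rounding.

```r
# First three draws from the table above
draws <- data.frame(
  X.0  = c(0.4802981, 0.5969120, 0.4081154),
  X.1  = c(0.5197019, 0.4030880, 0.5918846),
  Y.00 = c(0.1754291, 0.0672990, 0.1279818),
  Y.10 = c(0.1730648, 0.1458238, 0.0784327),
  Y.01 = c(0.5101839, 0.5314693, 0.6366884),
  Y.11 = c(0.1413222, 0.2554079, 0.1568971)
)

# Group columns by node (the part of the name before the dot)
node <- sub("\\..*$", "", names(draws))

# Sum the parameters within each node, draw by draw
sums_by_node <- sapply(split.default(draws, node), rowSums)
round(sums_by_node, 5)
```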
Querying: You can ask arbitrary causal queries of the model.
Examples of unconditional queries:
xy_model |>
  query_model("Y[X=1] > Y[X=0]",
              using = c("priors", "posteriors"))
#> 
#> Causal queries generated by query_model (all at population level)
#> 
#> |label           |using      |  mean|    sd| cred.low| cred.high|
#> |:---------------|:----------|-----:|-----:|--------:|---------:|
#> |Y[X=1] > Y[X=0] |priors     | 0.252| 0.192|    0.008|     0.702|
#> |Y[X=1] > Y[X=0] |posteriors | 0.586| 0.088|    0.401|     0.740|
This query asks the probability that \(Y(1) > Y(0)\).
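In this model the population share of units with \(Y(1) > Y(0)\) is the Y.01 parameter, so the priors row can be approximated by simulation. The sketch below uses base R only and assumes the flat Dirichlet(1, 1, 1, 1) prior on Y's nodal types shown in parameters_df. rdirichlet_flat is a hypothetical helper written for this example, not a package function, and the seed and number of draws are arbitrary.

```r
set.seed(42)  # arbitrary seed

# Hypothetical helper: n draws from a flat Dirichlet over k categories,
# built by normalising independent Gamma(1, 1) variates
rdirichlet_flat <- function(n, k) {
  g <- matrix(rgamma(n * k, shape = 1), nrow = n, ncol = k)
  g / rowSums(g)
}

prior_draws <- rdirichlet_flat(4000, 4)
colnames(prior_draws) <- c("Y.00", "Y.10", "Y.01", "Y.11")

# Prior distribution of the query "Y[X=1] > Y[X=0]"
positive_effect <- prior_draws[, "Y.01"]
c(mean = mean(positive_effect),
  sd = sd(positive_effect),
  quantile(positive_effect, c(0.025, 0.975)))
```

The simulated mean and standard deviation should land near the 0.25 and 0.19 reported in the priors row, up to Monte Carlo error.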
Examples of conditional queries:
xy_model |>
  query_model("Y[X=1] > Y[X=0] :|: X == 1 & Y == 1", using = c("priors", "posteriors"))
#> 
#> Causal queries generated by query_model (all at population level)
#> 
#> |label                                 |using      |  mean|    sd| cred.low| cred.high|
#> |:-------------------------------------|:----------|-----:|-----:|--------:|---------:|
#> |Y[X=1] > Y[X=0] given X == 1 & Y == 1 |priors     | 0.504| 0.285|    0.030|     0.972|
#> |Y[X=1] > Y[X=0] given X == 1 & Y == 1 |posteriors | 0.737| 0.106|    0.528|     0.940|
This query asks the probability that \(Y(1) > Y(0)\) given \(X=1\) and \(Y=1\); it is a type of “causes of effects” query. Note that “:|:” separates the main query from the conditioning statement; “|” alone would be ambiguous because it is reserved for the “or” operator.
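To see what the conditioning does, you can compute the same quantity by hand at a single parameter vector. The sketch below uses base R only and assumes the default param_value entries from parameters_df (0.5 for each X type, 0.25 for each Y type) and no confounding. It evaluates the query at that one parameter vector, not over the prior or posterior distribution.

```r
# Default parameter values from parameters_df
p_x <- c("0" = 0.5, "1" = 0.5)
p_y <- c("00" = 0.25, "10" = 0.25, "01" = 0.25, "11" = 0.25)

# All causal types with their probabilities (X and Y types independent)
types <- expand.grid(x = names(p_x), y = names(p_y),
                     stringsAsFactors = FALSE)
types$prob <- unname(p_x[types$x] * p_y[types$y])

# Potential outcomes encoded in the Y label: "ab" means Y(0) = a, Y(1) = b
types$y0 <- as.integer(substr(types$y, 1, 1))
types$y1 <- as.integer(substr(types$y, 2, 2))
types$y_obs <- ifelse(types$x == "1", types$y1, types$y0)

# Condition on X == 1 & Y == 1, then ask how often Y(1) > Y(0)
given <- types$x == "1" & types$y_obs == 1
query <- types$y1 > types$y0
cond_prob <- sum(types$prob[given & query]) / sum(types$prob[given])
cond_prob
```

At these values the result is 0.5, in line with the prior mean of about 0.50 in the table. Only the 01 and 11 types for Y are consistent with X = 1 and Y = 1, and they are equally likely here.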
Queries can even be conditional on counterfactual quantities. Here is the probability of a positive effect given that there is some effect:
xy_model |>
  query_model("Y[X=1] > Y[X=0] :|: Y[X=1] != Y[X=0]",
              using = c("priors", "posteriors"))
#> 
#> Causal queries generated by query_model (all at population level)
#> 
#> |label                                  |using      |  mean|    sd| cred.low| cred.high|
#> |:--------------------------------------|:----------|-----:|-----:|--------:|---------:|
#> |Y[X=1] > Y[X=0] given Y[X=1] != Y[X=0] |priors     | 0.501| 0.290|    0.027|     0.973|
#> |Y[X=1] > Y[X=0] given Y[X=1] != Y[X=0] |posteriors | 0.863| 0.074|    0.725|     0.989|
Note that we use “:|:” rather than “|” to separate the base query from the condition, to avoid confusion with logical operators.
Query output is ready for printing as tables, but can also be plotted, which is especially useful with batch requests:
batch_queries <- xy_model |>
  query_model(queries = list(ATE = "Y[X=1] - Y[X=0]",
                             `Positive effect given any effect` = "Y[X=1] > Y[X=0] :|: Y[X=1] != Y[X=0]"),
              using = c("priors", "posteriors"),
              expand_grid = TRUE)
# kable() comes from the knitr package
batch_queries |> knitr::kable(digits = 2, caption = "tabular output")
| label | query | given | using | case_level | mean | sd | cred.low | cred.high | 
|---|---|---|---|---|---|---|---|---|
| ATE | Y[X=1] - Y[X=0] | - | priors | FALSE | 0.01 | 0.32 | -0.62 | 0.64 | 
| ATE | Y[X=1] - Y[X=0] | - | posteriors | FALSE | 0.49 | 0.08 | 0.32 | 0.64 | 
| Positive effect given any effect | Y[X=1] > Y[X=0] | Y[X=1] != Y[X=0] | priors | FALSE | 0.50 | 0.29 | 0.02 | 0.98 | 
| Positive effect given any effect | Y[X=1] > Y[X=0] | Y[X=1] != Y[X=0] | posteriors | FALSE | 0.86 | 0.07 | 0.72 | 0.99 |
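The priors row for the ATE can be checked against closed-form Dirichlet moments, since in this model the ATE equals the share of positive-effect types (Y.01) minus the share of negative-effect types (Y.10). The sketch below uses base R only and assumes the flat Dirichlet(1, 1, 1, 1) prior shown in parameters_df.

```r
alpha <- c(Y.00 = 1, Y.10 = 1, Y.01 = 1, Y.11 = 1)  # flat prior
a0 <- sum(alpha)

# Dirichlet means, variances, and the covariance between two components
m <- alpha / a0
v <- alpha * (a0 - alpha) / (a0^2 * (a0 + 1))
cov_01_10 <- -alpha["Y.01"] * alpha["Y.10"] / (a0^2 * (a0 + 1))

# ATE = share with positive effects minus share with negative effects
ate_mean <- unname(m["Y.01"] - m["Y.10"])
ate_sd <- unname(sqrt(v["Y.01"] + v["Y.10"] - 2 * cov_01_10))

round(c(mean = ate_mean, sd = ate_sd), 2)
```

This gives a prior mean of 0 and a standard deviation of about 0.32, close to the simulated 0.01 and 0.32 in the table.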