sem: The SEM Function

Fits structural equation models (SEM) and particular cases using the rstan interface

sem(
  data,
  blocks,
  paths,
  exogenous,
  signals,
  row_names = rownames(data),
  prior_specs = list(beta = c("normal(0,1)"), sigma2 = c("inv_gamma(2.1, 1.1)"), gamma0
    = c("normal(0,1)"), gamma = c("normal(0,1)"), tau2 = c("inv_gamma(2.1, 1.1)")),
  cores = parallel::detectCores(),
  pars = c("alpha", "lambda", "sigma2"),
  iter = 2000,
  chains = 4,
  scaled = FALSE,
  verbose = FALSE,
  refresh = 100,
  ...
)

Arguments

data	a mandatory 'matrix' object where the columns are variables and the rows are observations
blocks	a mandatory named list of column names (or integers in 1:ncol(data)) indicating the manifest variables corresponding to each block; generic names are assigned to latent variables internally if not defined
paths	list describing the inner model paths; a list of characters or integers referring to the relationships among scores; the first j latent variables are explained if names(paths) is NULL
exogenous	list describing the exogenous part of the inner model; a list of characters or integers referring to the relationships between exogenous and latent variables; the first l columns are explained if names(exogenous) is NULL
signals	list giving the signs of the initial values of the factor loadings; must satisfy (length(signals) == length(blocks)) && all(lengths(signals) == lengths(blocks)); (not allowed in runShiny)
row_names	optional identifier for the observations (observation = row)
prior_specs	prior settings for the Bayesian approach; only `normal` and `cauchy` are available for gamma0, gamma and beta; `gamma`, `lognormal` and `inv_gamma` are available for sigma2 and tau2; prior specifications that the fitted model (FA or SEM) does not need are ignored
cores	number of core threads to be used
pars	allows parameters to be omitted from the outcome; options are any subset of the default c("alpha", "lambda", "sigma2")
iter	number of iterations
chains	number of chains
scaled	logical; indicates whether to center and scale the data; default FALSE
verbose	logical; see `sampling`; default FALSE
refresh	see `sampling`; default 100
...	further arguments passed to Stan such as warmup, adapt_delta and others, see `sampling`.
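
The shape condition on `signals` can be checked before calling `sem`. The following is a minimal sketch in base R, not part of the package: the data matrix, the block names `F1` and `F2`, and the use of +1/-1 as sign values are all assumptions made for illustration. Because `lengths()` returns a vector, the elementwise comparison is wrapped in `all()`.

```r
# Hypothetical data: 10 observations (rows) of 6 manifest variables (columns).
set.seed(1)
data <- matrix(rnorm(60), nrow = 10, ncol = 6,
               dimnames = list(NULL, paste0("x", 1:6)))

# Two illustrative blocks, given by column names of `data`.
blocks <- list(F1 = c("x1", "x2", "x3"), F2 = c("x4", "x5", "x6"))

# One sign per manifest variable, laid out like `blocks`
# (the +1/-1 coding is an assumption for this sketch).
signals <- list(F1 = c(1, 1, 1), F2 = c(1, -1, 1))

# Condition stated for `signals`: same number of blocks, same block sizes.
signals_ok <- length(signals) == length(blocks) &&
  all(lengths(signals) == lengths(blocks))

# Every block entry should refer to an existing column of `data`.
blocks_ok <- all(unlist(blocks) %in% colnames(data))
```

If either flag is `FALSE`, the lists should be corrected before they are passed on as `blocks` and `signals`.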

Value

An object of class bsem; a list of 14 to 19 components:

stanfit

S4 object of class stanfit

posterior

the list of posterior draws separated by chains

model

character; pointer to pre-defined stan model

mean_alpha

matrix of factor loadings posterior means

mean_lambda

matrix of factor scores posterior means

mean_sigma2

vector of error variances posterior means

mean_beta

vector of regression coefficients posterior means

mean_tau2

vector of inner paths error variances posterior means

mean_gamma

vector of inner paths regression coefficients posterior means

mean_gamma0

vector of inner paths intercept posterior means

stats

posterior descriptive statistics

blocks

list of blocks

paths

list of paths

credint

highest posterior density (HPD) intervals

vector of posterior communalities

PTVE

vector of total variance proportions

adjusted coefficient of determination

SQE

explained sums of squares

SQT

total sums of squares

Details

Fits the SEM to specific data

Consider:

- the outer model as:

-- outer blocks: $$X_{p \times n} = \alpha_{p \times k}\lambda_{k \times n} + \epsilon_{p \times n}$$ where $X$ is the data matrix with variables in the rows and sample elements in the columns, $\alpha_{p \times j}$ is the column vector of loadings for the $j$th latent variable and $\lambda_{j \times n}$ is the row vector of scores for the $j$th unobserved variable, $j = 1,\dots,k$. Normality is assumed for the errors as $\epsilon_{ij} \sim N(0, \sigma_i^2)$ for $i = 1,\dots,p$.

- the inner model as:

-- inner paths: $$\lambda_{j \times n} = \beta \lambda^{(-j)} + \nu$$ where $\beta$ is a column vector of constant coefficients and $\lambda^{(-j)}_{(k-1) \times n}$ represents a subset of the matrix of scores, i.e. at least excluding the $j$th row of scores. The error is assumed to follow $\nu_j \sim N(0,1)$.

-- inner exogenous: $$Y_{l \times n} = \gamma_0 + \gamma \lambda + \xi$$ where $\gamma$ is a column vector of constant coefficients and $\gamma_0$ is the intercept. $\lambda_{k \times n}$ is the matrix of scores and the error is assumed to follow $\xi_l \sim N(0,\tau_l^2)$.
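
To make the three equations concrete, the sketch below simulates data from them in base R. It is only an illustration under stated assumptions: the dimensions, loadings, coefficients and variances are arbitrary values chosen here, a single exogenous variable ($l = 1$) is used, and the code does not reproduce how `bsem::simdata()` generates data.

```r
set.seed(42)
p <- 6; k <- 2; n <- 50   # manifest variables, latent variables, sample size

# Inner paths: lambda_2 = beta * lambda_1 + nu, with nu ~ N(0, 1).
beta <- 0.7                              # illustrative coefficient
lambda1 <- rnorm(n)
lambda2 <- beta * lambda1 + rnorm(n)
lambda <- rbind(lambda1, lambda2)        # k x n matrix of scores

# Outer blocks: X (p x n) = alpha (p x k) %*% lambda (k x n) + epsilon (p x n).
alpha <- matrix(0, nrow = p, ncol = k)
alpha[1:3, 1] <- c(0.9, 0.8, 0.7)        # loadings of block 1 (illustrative)
alpha[4:6, 2] <- c(0.6, 0.8, 0.9)        # loadings of block 2 (illustrative)
sigma2 <- rep(0.25, p)                   # one error variance per manifest variable
# sd is recycled down each column, so row i has variance sigma2[i].
epsilon <- matrix(rnorm(p * n, sd = sqrt(sigma2)), nrow = p, ncol = n)
X <- alpha %*% lambda + epsilon

# Inner exogenous: Y (l x n) = gamma0 + gamma %*% lambda + xi, xi ~ N(0, tau2).
l <- 1
gamma0 <- 0.5
gamma <- matrix(c(1.0, -0.5), nrow = l, ncol = k)
tau2 <- 0.1
xi <- matrix(rnorm(l * n, sd = sqrt(tau2)), nrow = l, ncol = n)
Y <- gamma0 + gamma %*% lambda + xi

# The `data` argument has observations in rows, so X is transposed.
data <- t(X)                             # n x p
```

The equations put variables in the rows of $X$, whereas the `data` argument expects observations in rows, which is why the last line transposes the simulated matrix.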

Examples

dt <- bsem::simdata()
names(dt)
#> [1] "data"      "real"      "blocks"    "signals"   "paths"     "exogenous"
if (FALSE) {

semfit <- bsem::sem(
  data = dt$data,
  blocks = dt$blocks,
  paths = dt$paths,
  exogenous = dt$exogenous,
  signals = dt$signals,
  iter = 2000,
  warmup = 1000,
  chains = 4
)
summary(semfit)
}
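
Non-default priors are supplied through `prior_specs`. The sketch below builds such a list and checks each family name against the families listed under Arguments. The helper `prior_family` is hypothetical and defined here only for illustration, and the string `"cauchy(0,1)"` is an assumed spelling by analogy with the default `"normal(0,1)"`. The final call is not run because it requires the package and Stan sampling.

```r
# Prior families listed as available in the `prior_specs` description.
allowed <- list(
  beta   = c("normal", "cauchy"),
  gamma0 = c("normal", "cauchy"),
  gamma  = c("normal", "cauchy"),
  sigma2 = c("gamma", "lognormal", "inv_gamma"),
  tau2   = c("gamma", "lognormal", "inv_gamma")
)

# Custom priors; only `beta` differs from the defaults shown in the usage.
my_priors <- list(
  beta   = c("cauchy(0,1)"),          # assumed spelling of the cauchy prior
  sigma2 = c("inv_gamma(2.1, 1.1)"),
  gamma0 = c("normal(0,1)"),
  gamma  = c("normal(0,1)"),
  tau2   = c("inv_gamma(2.1, 1.1)")
)

# Hypothetical helper: keep the family name that precedes "(".
prior_family <- function(spec) sub("\\(.*$", "", trimws(spec))

# TRUE for each parameter whose families are all in the allowed set.
families_ok <- vapply(names(my_priors), function(nm) {
  all(prior_family(my_priors[[nm]]) %in% allowed[[nm]])
}, logical(1))

if (FALSE) {  # not run: requires bsem and Stan sampling
  dt <- bsem::simdata()
  semfit <- bsem::sem(
    data = dt$data, blocks = dt$blocks, paths = dt$paths,
    exogenous = dt$exogenous, signals = dt$signals,
    prior_specs = my_priors,
    iter = 2000, warmup = 1000, chains = 4
  )
}
```

Checking the family names up front is cheap compared with starting a sampling run that fails on an unsupported prior.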
