Structural Equation Models (SEM) and particular cases using the rstan interface

sem(
  data,
  blocks,
  paths,
  exogenous,
  signals,
  row_names = rownames(data),
  prior_specs = list(
    beta = c("normal(0,1)"),
    sigma2 = c("inv_gamma(2.1, 1.1)"),
    gamma0 = c("normal(0,1)"),
    gamma = c("normal(0,1)"),
    tau2 = c("inv_gamma(2.1, 1.1)")
  ),
  cores = parallel::detectCores(),
  pars = c("alpha", "lambda", "sigma2"),
  iter = 2000,
  chains = 4,
  scaled = FALSE,
  verbose = FALSE,
  refresh = 100,
  ...
)

data | a mandatory 'matrix' object where the columns are variables and the rows are observations |
---|---|

blocks | a mandatory named list of colnames (or integers in 1:ncol(data)) indicating the manifest variables corresponding to each block; generic names are assumed internally for the latent variables if not defined |

paths | list referring to the inner model paths; a list of characters or integers describing the relationships between the scores; the first j latent variables are taken as the explained ones if names(paths) is NULL |

exogenous | list referring to the inner model exogenous variables; a list of characters or integers describing the relationships between the exogenous and latent variables; the first l columns are taken as the explained ones if names(exogenous) is NULL |

signals | list referring to the signals of the initial values of the factor loadings; the following must hold: (length(signals) == length(blocks)) && all(lengths(signals) == lengths(blocks)); (not allowed in runShiny) |

row_names | optional identifier for the observations (observation = row) |

prior_specs | prior settings for the Bayesian approach; only `normal` and `cauchy` are available for gamma0, gamma and beta; `gamma`, `lognormal` and `inv_gamma` are available for sigma2 and tau2. Prior specifications that the fitted model (FA or SEM) does not need are ignored |

cores | number of core threads to be used |

pars | allows parameters to be omitted from the outcome; options are any subset of the default c("alpha", "lambda", "sigma2") |

iter | number of iterations |

chains | number of chains |

scaled | logical; indicates whether to center and scale the data; default FALSE |

verbose | logical; see |

refresh | defaults to 100; see |

... | further arguments passed to Stan such as warmup, adapt_delta and others, see |
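The shape requirement on `signals` can be checked before calling `sem()`. The following is a minimal sketch in base R, assuming hypothetical variable names (`x1` to `x6`), hypothetical block names (`F1`, `F2`) and signal values of 1 or -1; none of these come from the package itself.

```r
# Hypothetical data: 20 observations (rows) of 6 manifest variables (columns)
set.seed(1)
data <- matrix(rnorm(20 * 6), nrow = 20,
               dimnames = list(NULL, paste0("x", 1:6)))

# Two blocks given by column names; the list names label the latent variables
blocks <- list(F1 = c("x1", "x2", "x3"),
               F2 = c("x4", "x5", "x6"))

# One entry per manifest variable, mirroring the shape of `blocks`
# (the values 1 / -1 are an assumption made for illustration)
signals <- list(F1 = c(1, 1, 1),
                F2 = c(1, -1, 1))

# Shape condition for `signals`; all() collapses the element-wise comparison
shapes_ok <- length(signals) == length(blocks) &&
  all(lengths(signals) == lengths(blocks))

# Every block entry should also be a column of `data`
cols_ok <- all(unlist(blocks) %in% colnames(data))

print(c(shapes_ok = shapes_ok, cols_ok = cols_ok))
```

Wrapping the per-block comparison in `all()` keeps the condition a single logical value, which `&&` requires in recent versions of R.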

An object of class `bsem`; a list of 14 to 19 elements:

- stanfit
S4 object of class stanfit

- posterior
the list of posterior draws separated by chains

- model
character; pointer to pre-defined stan model

- mean_alpha
matrix of factor loadings posterior means

- mean_lambda
matrix of factor scores posterior means

- mean_sigma2
vector of error variances posterior means

- mean_beta
vector of regression coefficients posterior means

- mean_tau2
vector of inner paths error variances posterior means

- mean_gamma
vector of inner paths regression coefficients posterior means

- mean_gamma0
vector of inner paths intercept posterior means

- stats
posterior descriptive statistics

- blocks
list of blocks

- paths
list of paths

- credint
Highest posterior density intervals (HPD)

- h
vector of posterior communalities

- PTVE
vector of total variance proportions

- R2
adjusted coefficient of determination

- SQE
explained sums of squares

- SQT
total sums of squares

Fits the SEM to specific data

Consider:

- the outer model as: -- outer blocks:

$$X_{p \times n} = \alpha_{p \times k}\lambda_{k \times n} + \epsilon_{p \times n}$$ where \(X\) is the data matrix with variables in the rows and sample elements in the columns, \(\alpha_{p \times j}\) is the column vector of loadings for the \(j\)th latent variable and \(\lambda_{j \times n}\) is the row vector of scores for the \(j\)th unobserved variable, \(j = 1,\dots,k\). Normality is assumed for the errors as \(\epsilon_{ij} \sim N(0, \sigma_i^2)\) for \(i = 1,\dots,p\).

- the inner model as:

-- inner paths: $$\lambda_{j \times n} = \beta \lambda^{(-j)} + \nu$$ where \(\beta\) is a column vector of constant coefficients and \(\lambda^{(-j)}_{(k-1) \times n}\) represents a subset of the matrix of scores, i.e. at least excluding the \(j\)th row of scores. The error assumes \(\nu_j \sim N(0,1)\).

-- inner exogenous: $$Y_{l \times n} = \gamma_0 + \gamma \lambda + \xi$$ where \(\gamma\) is a column vector of constant coefficients and \(\gamma_0\) is the intercept. \(\lambda_{k \times n}\) is the matrix of scores and the error assumes \(\xi_l \sim N(0,\tau_l^2)\).
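To make the three equations concrete, the sketch below simulates one data set from them in base R. The dimensions, loadings, coefficients and variances are all hypothetical values chosen for illustration, and the code does not call the package.

```r
set.seed(42)
p <- 6; k <- 2; n <- 50   # manifest variables, latent variables, observations

# Inner paths: the scores of latent variable 1 are explained by those of 2
beta    <- 0.7
lambda2 <- rnorm(n)
lambda1 <- beta * lambda2 + rnorm(n)   # nu ~ N(0, 1)
lambda  <- rbind(lambda1, lambda2)     # k x n matrix of scores

# Block-structured loadings: variables 1-3 load on factor 1, 4-6 on factor 2
alpha <- matrix(0, nrow = p, ncol = k)
alpha[1:3, 1] <- c(0.9, 0.8, 0.7)
alpha[4:6, 2] <- c(0.6, 0.8, 0.9)

# Outer model: X = alpha %*% lambda + epsilon, epsilon_ij ~ N(0, sigma_i^2)
sigma2  <- rep(0.25, p)
epsilon <- matrix(rnorm(p * n, sd = rep(sqrt(sigma2), times = n)), nrow = p)
X <- alpha %*% lambda + epsilon        # p x n, variables in rows

# Inner exogenous: Y = gamma0 + gamma %*% lambda + xi, xi ~ N(0, tau^2)
gamma0 <- 1
gamma  <- matrix(c(0.5, -0.3), nrow = 1)   # 1 x k
tau2   <- 0.5
Y <- gamma0 + gamma %*% lambda + rnorm(n, sd = sqrt(tau2))   # 1 x n

# The `data` argument takes observations in rows, so X is transposed
data <- t(X)
colnames(data) <- paste0("x", 1:p)
dim(data)
```

The transpose in the last step reflects the difference between the model notation, where \(X\) has variables in rows, and the `data` argument, where variables are columns.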

#> [1] "data" "real" "blocks" "signals" "paths" "exogenous"

if (FALSE) {
  semfit <- bsem::sem(
    data = dt$data,
    blocks = dt$blocks,
    paths = dt$paths,
    exogenous = dt$exogenous,
    signals = dt$signals,
    iter = 2000,
    warmup = 1000,
    chains = 4
  )
  summary(semfit)
}