gemmr.sample_size.linear_model.prep_data_for_lm

gemmr.sample_size.linear_model.prep_data_for_lm(ds, n_reqs, include_latent_explained_vars, include_pc_var_decay_constants, include_pdiff)

Prepare outcome data for use with a linear model.

Constructs a predictor data matrix with columns representing linear model predictors, and rows representing stacked synthetic datasets (stacked dimensions are ‘px’, ‘r’, ‘Sigma_id’).

Parameters:
  • ds (xr.Dataset) – outcome dataset
  • n_reqs (xr.DataArray) – required sample sizes
  • include_latent_explained_vars (bool) – whether to include a predictor for the latent explained variance in the linear model
  • include_pc_var_decay_constants (bool) – whether to include a predictor for the principal component spectrum decay constant in the linear model
  • include_pdiff (bool) – whether to include a predictor for \(|p_X - p_Y|\) in the linear model
Returns:
  • X ((n_synth_datasets, n_predictors)) – predictor data matrix
  • y ((n_synth_datasets,)) – dependent variable
  • coef_names (list) – labels for included linear model coefficients (first one is “const”)
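
The shape of the return values can be illustrated with a minimal sketch. It is not gemmr's implementation: plain NumPy arrays stand in for the xarray objects, the helper `build_design_matrix` and the predictor names "px", "r" and "pdiff" are hypothetical, and no transformation of the predictors or of the required sample sizes is applied. It only shows how quantities defined on a ('px', 'r', 'Sigma_id') grid can be stacked into one row per synthetic dataset, with a leading "const" column and an optional predictor switched on by a flag.

```python
import numpy as np


def build_design_matrix(px, py, r, n_req, include_pdiff=False):
    """Hypothetical helper: stack gridded quantities into (X, y, coef_names)."""
    # Flatten each (px, r, Sigma_id) grid so that every synthetic dataset is one row
    y = np.asarray(n_req, dtype=float).reshape(-1)
    columns = {
        "const": np.ones_like(y),  # intercept column, listed first
        "px": np.asarray(px, dtype=float).reshape(-1),
        "r": np.asarray(r, dtype=float).reshape(-1),
    }
    if include_pdiff:
        # Optional predictor |p_X - p_Y|, added only when requested
        pdiff = np.abs(np.asarray(px, dtype=float) - np.asarray(py, dtype=float))
        columns["pdiff"] = pdiff.reshape(-1)
    coef_names = list(columns)
    X = np.column_stack([columns[name] for name in coef_names])
    return X, y, coef_names


# Toy grid (assumed sizes): 3 values of px, 2 values of r, 4 covariance matrices
shape = (3, 2, 4)
px = np.broadcast_to(np.array([2.0, 4.0, 8.0])[:, None, None], shape)
py = np.full(shape, 4.0)
r = np.broadcast_to(np.array([0.3, 0.5])[None, :, None], shape)
n_req = np.random.default_rng(0).integers(50, 500, size=shape)  # made-up values

X, y, coef_names = build_design_matrix(px, py, r, n_req, include_pdiff=True)
print(X.shape, y.shape, coef_names)
```

With these toy sizes there are 3 × 2 × 4 = 24 stacked datasets, so `X` has 24 rows and one column per entry in `coef_names`. Setting `include_pdiff=False` drops the last column and the last coefficient name together, which keeps `X` and `coef_names` aligned.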