gemmr.cca_sample_size
- gemmr.cca_sample_size(X, Y, ax=None, ay=None, rs=(0.1, 0.3, 0.5), criterion='combined', algorithm='linear_model', target_power=0.9, target_error=0.1, expl_var_ratio=0.5, data_home=None)
Suggest sample size for CCA.
Suggested sample sizes are estimated with a linear model that interpolates and extrapolates from parameter settings for which sample sizes were previously calculated with the generative model.
- Parameters:
X (np.ndarray (n_samples, n_X_features) or int >= 2) – either a data matrix or directly the number of features for data matrix \(X\)
Y (np.ndarray (n_samples, n_Y_features) or int >= 2) – either a data matrix or directly the number of features for data matrix \(Y\)
ax (float < 0 or None) – principal component spectrum decay constant if X is not a data matrix, None otherwise
ay (float < 0 or None) – principal component spectrum decay constant if Y is not a data matrix, None otherwise
rs (list-like) – true correlations for which sample sizes are estimated
criterion (str) –
criterion according to which sample sizes are estimated. Can be:
'combined'
'power'
'association_strength'
'weight'
'score'
'loading'
'crossloading'
algorithm (str) –
algorithm used to calculate sample sizes. Can be:
'linear_model'
target_power (float between 0 and 1) – if criterion is 'combined' or 'power', the sample size is chosen to obtain at least target_power power
target_error (float between 0 and 1) – if criterion is not 'power', the sample size is chosen to obtain at most target_error error in the error metric(s)
expl_var_ratio (float) – if X or Y is a data matrix, ax or ay, respectively, is estimated directly from the data using the number of principal components that explain this fraction of the variance
data_home (None or str) – path where outcome data are stored, None indicates the default path
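The expl_var_ratio parameter implies that a decay constant can be estimated from a data matrix using only the leading principal components. The sketch below shows one way such an estimate could be obtained, assuming the principal component variances follow a power law (the variance of component i is proportional to i**a) and fitting a straight line in log-log space. The function name estimate_decay_constant and the fitting procedure are illustrative assumptions, not gemmr's actual implementation.

```python
import numpy as np


def estimate_decay_constant(X, expl_var_ratio=0.5):
    """Hypothetical helper: fit a power-law decay constant to the
    principal component spectrum of the data matrix X.

    Only the leading components that together explain at least
    `expl_var_ratio` of the total variance enter the fit.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                      # center each feature
    s = np.linalg.svd(Xc, compute_uv=False)      # singular values, descending
    variances = s ** 2 / (X.shape[0] - 1)        # principal component variances

    # Number of leading components needed to reach the requested variance share
    cumulative = np.cumsum(variances) / variances.sum()
    n_keep = int(np.searchsorted(cumulative, expl_var_ratio)) + 1
    n_keep = min(max(n_keep, 2), len(variances))  # a line fit needs >= 2 points

    # Slope of log(variance) against log(rank) is the decay constant
    ranks = np.arange(1, n_keep + 1)
    slope, _intercept = np.polyfit(np.log(ranks), np.log(variances[:n_keep]), 1)
    return float(slope)
```

For data whose spectrum really decays, the returned value is negative, which matches the documented constraint that ax and ay are floats below 0.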
- Returns:
suggested_sample_sizes – suggested sample sizes for correlations rs
- Return type:
dict
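A call with feature counts instead of data matrices might look like the following sketch. It uses only the parameters listed in the signature above. The helpers check_cca_sample_size_args and suggest_sample_sizes are hypothetical wrappers added for illustration: the first checks the arguments against the documented constraints, the second forwards them to gemmr.cca_sample_size. The example values (10 and 20 features, decay constants -1.0 and -0.5) are arbitrary.

```python
def check_cca_sample_size_args(n_x, n_y, ax, ay, rs=(0.1, 0.3, 0.5),
                               criterion='combined',
                               target_power=0.9, target_error=0.1):
    """Hypothetical helper: validate arguments against the documented
    constraints and return them as keyword arguments."""
    for name, n in (('X', n_x), ('Y', n_y)):
        if not isinstance(n, int) or n < 2:
            raise ValueError(f"{name} must be an int >= 2, got {n!r}")
    for name, a in (('ax', ax), ('ay', ay)):
        if not a < 0:
            raise ValueError(f"{name} must be < 0, got {a!r}")
    for name, v in (('target_power', target_power),
                    ('target_error', target_error)):
        if not 0 <= v <= 1:
            raise ValueError(f"{name} must be between 0 and 1, got {v!r}")
    return dict(X=n_x, Y=n_y, ax=ax, ay=ay, rs=tuple(rs),
                criterion=criterion,
                target_power=target_power, target_error=target_error)


def suggest_sample_sizes(*args, **kwargs):
    """Hypothetical wrapper: validate, then call gemmr.cca_sample_size."""
    call_kwargs = check_cca_sample_size_args(*args, **kwargs)
    import gemmr  # imported lazily so the validation works without gemmr
    return gemmr.cca_sample_size(**call_kwargs)
```

With gemmr installed, `suggest_sample_sizes(10, 20, ax=-1.0, ay=-0.5)` would return the dict of suggested sample sizes described under Returns.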