gemmr.sample_analysis.analyzers.analyze_subsampled
gemmr.sample_analysis.analyzers.analyze_subsampled(estr, X, Y, Xorig=None, Yorig=None, x_align_ref=None, y_align_ref=None, addons=(), ns=(), n_rep=10, n_perm=100, n_test=0, postprocessors=(), n_jobs=1, show_progress=True, random_state=None, **kwargs)
Analyze subsampled versions of a dataset with a given estimator.
Parameters:
- estr (sklearn-style estimator) – for performing CCA or PLS. Must have a method fit and (after fitting) attributes assocs_, x_rotations_, y_rotations_, x_scores_, y_scores_
- X (np.ndarray (n_samples, n_features)) – dataset X
- Y (np.ndarray (n_samples, n_features)) – dataset Y
- Xorig (None or np.ndarray (n_samples, n_orig_features)) – if None, set to X. Allows providing an alternative set of X features for calculating loadings. The implicit assumption is that the rows of X and Xorig correspond to the same samples (subjects).
- Yorig (None or np.ndarray (n_samples, n_orig_features)) – if None, set to Y. Allows providing an alternative set of Y features for calculating loadings. The implicit assumption is that the rows of Y and Yorig correspond to the same samples (subjects).
- x_align_ref ((n_features,)) – after fitting, the sign of the X weights is chosen such that the cosine similarity between the fitted X weights and x_align_ref is positive
- y_align_ref ((n_features,)) – after fitting, the sign of the Y weights is chosen such that the cosine similarity between the fitted Y weights and y_align_ref is positive
- addons (list-like of add-on functions) – after fitting the estimator and saving association strengths, weights and loadings in results, additional analyses can be performed with these functions. They are called in the given order, must have the signature addana_fun(estr, X, Y, Xorig, Yorig, x_align_ref, y_align_ref, results, **kwargs), and are expected to save their respective outcome features in results. Various such functions are provided in the module sample_analysis_addons
- ns (list-like of int) – subsamples of these sizes are used
- n_rep (int) – number of times a subsample of a given size is drawn
- n_perm (int) – each subsample is permuted n_perm times to generate a null distribution of outcome quantities
- n_test (int) – number of subjects to use as test set. max(ns) + n_test must be <= n_samples
- postprocessors (list-like of functions) – functions are called after the final dataset has been concatenated and take that xr.Dataset as their only argument
- n_jobs (int or None) – number of parallel jobs (see joblib.Parallel)
- show_progress (bool) – whether to show a progress bar
- random_state (None, int or random number generator instance) – used to generate random numbers
- kwargs (dict) – forwarded to additional analysis functions
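How ns, n_rep, n_perm and n_test interact can be illustrated with a small sketch. This is not gemmr's implementation: the helper names subsample_plan and permuted_rows are hypothetical, the standard-library random module stands in for random_state, and it is an assumption that the test rows are drawn together with, and kept disjoint from, the rows of each subsample.

```python
import random


def subsample_plan(n_samples, ns, n_rep, n_test=0, seed=None):
    """Yield (n, rep, train_idx, test_idx) for every subsample size and repetition.

    Hypothetical helper: illustrates the parameter semantics only.
    """
    # Constraint stated in the parameter description of n_test
    if max(ns) + n_test > n_samples:
        raise ValueError("max(ns) + n_test must be <= n_samples")
    rng = random.Random(seed)
    for n in ns:
        for rep in range(n_rep):
            # Draw n + n_test distinct row indices (assumption: no replacement)
            idx = rng.sample(range(n_samples), n + n_test)
            yield n, rep, idx[:n], idx[n:]


def permuted_rows(train_idx, n_perm, rng):
    """Yield n_perm shuffled copies of train_idx (hypothetical helper).

    Reordering the rows of one dataset relative to the other with such a
    permutation is one common way to build a null distribution.
    """
    for _ in range(n_perm):
        perm = list(train_idx)
        rng.shuffle(perm)
        yield perm


# 2 sizes x 3 repetitions = 6 subsamples, each with 10 held-out test rows
plan = list(subsample_plan(n_samples=100, ns=(20, 50), n_rep=3, n_test=10, seed=0))
perms = list(permuted_rows(plan[0][2], n_perm=5, rng=random.Random(1)))
```

With these settings the estimator would be fitted len(ns) * n_rep times on unpermuted data, plus n_perm permuted fits per subsample if permutations are requested.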
Returns: results – containing data variables for outcome features generated by analyses
Return type: xr.Dataset
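An add-on function following the signature given under addons might look like the sketch below. The function name addon_x_weight_similarity and the stored key x_weights_ref_cossim are hypothetical, a plain dict stands in for the results container, and it is assumed that x_rotations_ has the layout (n_features, n_components) so that its first column holds the X weights of the first mode.

```python
import math
from types import SimpleNamespace


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length 1-D sequences."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def addon_x_weight_similarity(estr, X, Y, Xorig, Yorig,
                              x_align_ref, y_align_ref, results, **kwargs):
    """Hypothetical add-on: store how similar the first X weight vector is to x_align_ref."""
    # First column of x_rotations_ (assumed layout: n_features x n_components)
    first_x_weights = [row[0] for row in estr.x_rotations_]
    # Add-ons are expected to save their outcome in `results`
    results["x_weights_ref_cossim"] = cosine_similarity(first_x_weights, x_align_ref)


# Stand-in for a fitted estimator; only the attribute used above is provided
fake_estr = SimpleNamespace(x_rotations_=[[0.6, 0.0], [0.8, 1.0]])
results = {}  # stand-in for the real results container
addon_x_weight_similarity(fake_estr, None, None, None, None, [0.6, 0.8], None, results)
```

The function returns nothing; its only effect is the entry written into results, which matches the description that add-ons save their outcome there.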
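The sign convention described for x_align_ref and y_align_ref can be sketched as follows. The helper align_sign is hypothetical and not part of gemmr; it relies on the fact that the sign of the cosine similarity between two vectors equals the sign of their dot product.

```python
def align_sign(weights, ref):
    """Return a copy of weights, sign-flipped if its dot product with ref is negative.

    Hypothetical helper illustrating the alignment convention only.
    """
    dot = sum(w * r for w, r in zip(weights, ref))
    if dot < 0:
        return [-w for w in weights]
    return list(weights)


# The fitted weights point away from the reference, so they are flipped
aligned = align_sign([-0.6, -0.8], [1.0, 1.0])
```

Weight vectors from CCA or PLS are only defined up to sign, so fixing the sign against a reference makes weights from different subsamples directly comparable.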