API reference

Sample size calculation

`cca_sample_size`(X, Y[, rs, criterion, …])	Suggest sample size for CCA.
`pls_sample_size`(X, Y[, ax, ay, rs, …])	Suggest sample size for PLS.
`pearson_sample_size`([rs, criterion, …])	Calculate required sample sizes for accurate estimation of Pearson correlation.
`sample_size.linear_model.cca_req_corr`(X, Y, …)	Determines the minimum required true correlation to achieve power and error levels.
`sample_size.linear_model.pls_req_corr`(X, Y, …)	Determines the minimum required true correlation to achieve power and error levels.

Estimators

`estimators.SVDPLS`([n_components, …])	Partial Least Squares estimators based on singular value decomposition.
`estimators.SVDCCA`([n_components, …])	Canonical Correlation Analysis estimator based on singular value decomposition.
`estimators.NIPALSPLS`([n_components, scale, …])	Identical to `sklearn.cross_decomposition.PLSCanonical`, except that `fit` creates additional attributes for compatibility with `SVDPLS` and `SVDCCA`.
`estimators.NIPALSCCA`([n_components, scale, …])	Identical to `sklearn.cross_decomposition.CCA`, except that `fit` creates additional attributes for compatibility with `SVDPLS` and `SVDCCA`.
`estimators.SparseCCA`

Synthetic data generation

`generative_model.setup_model`(model[, …])	Generate a joint covariance matrix for X and Y.
`generative_model.generate_data`(Sigma, px, n)	Generate synthetic data for a given model.
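
As a rough illustration of the idea behind these two functions, the sketch below builds a small joint covariance matrix for X and Y and then draws samples from it with NumPy. It does not call the package itself: the helper names `toy_joint_covariance` and `draw_xy` are hypothetical, and only the roles of `Sigma`, `px` and `n` are taken from the signature listed above.

```python
import numpy as np

def toy_joint_covariance(r):
    """Hypothetical helper: joint covariance of one X and one Y feature,
    both with unit variance and between-set correlation r."""
    return np.array([[1.0, r],
                     [r, 1.0]])

def draw_xy(Sigma, px, n, seed=0):
    """Hypothetical helper: draw n samples from a zero-mean normal with
    covariance Sigma and split the columns into X (first px) and Y (rest)."""
    rng = np.random.default_rng(seed)
    XY = rng.multivariate_normal(np.zeros(len(Sigma)), Sigma, size=n)
    return XY[:, :px], XY[:, px:]

Sigma = toy_joint_covariance(0.3)
X, Y = draw_xy(Sigma, px=1, n=5000)
```

With a few thousand samples, the sample correlation between `X` and `Y` should lie close to the value encoded in `Sigma`.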

Analysis of CCA/PLS results

`sample_analysis.analyzers.analyze_dataset`(…)	Analyze a given dataset with a given estimator.
`sample_analysis.analyzers.analyze_resampled`(…)	Analyze a given dataset and resampled versions of it with a given estimator.
`sample_analysis.analyzers.analyze_subsampled`(…)	Analyze subsampled versions of a dataset with a given estimator.
`sample_analysis.analyzers.analyze_model`(…)	Analyze synthetic datasets drawn from a model with a given estimator.
`sample_analysis.analyzers.analyze_model_parameters`(model)	Parameter-dependent models are set up and resulting synthetic datasets are analyzed.

Analysis add-ons

The functions in sample_analysis.analyzers only fit an estimator and return association strengths, weights and loadings. Additional analyses can be specified in the form of add-on functions. The following functions are provided, and arbitrary custom ones can be used as long as they have the same function signature.

`sample_analysis.addon.remove_weights_loadings`(…)	Removes weights and loadings from `results` dataset to save storage space.
`sample_analysis.addon.remove_cv_weights`(…)	Removes `x_weights_cv` and `y_weights_cv` from `results` dataset to save storage space.
`sample_analysis.addon.weights_true_cossim`(…)	Calculates cosine similarity between estimated and true weights.
`sample_analysis.addon.scores_true_spearman`(…)	Calculates Spearman correlations between estimated and true test scores.
`sample_analysis.addon.loadings_true_pearson`(…)	Calculates Pearson correlations between estimated and true test loadings.
`sample_analysis.addon.test_scores`(estr, X, …)	Calculates test scores.
`sample_analysis.addon.remove_test_scores`(…)	Removes `x_test_scores` and `y_test_scores` from `results`.
`sample_analysis.addon.assoc_test`(estr, X, Y, …)	Calculates Pearson correlations between test scores.
`sample_analysis.addon.weights_pc_cossim`(…)	Calculates cosine-similarities of principal component axes of X and Y with corresponding weights.
`sample_analysis.addon.sparseCCA_penalties`(…)	Store penalties of a fitted SparseCCA estimator.
`sample_analysis.addon.cv`(estr, X, Y, Xorg, …)	Calculates cross-validated outcome metrics.
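
Several of the add-ons above compare weight vectors by cosine similarity. The following sketch shows that quantity for two plain NumPy vectors. The function name `cosine_similarity` and the example vectors are illustrative only and are not part of the package.

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine of the angle between vectors u and v."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Made-up "true" and "estimated" weight vectors for illustration.
w_true = np.array([1.0, 0.0, 0.0])
w_est = np.array([0.8, 0.6, 0.0])
sim = cosine_similarity(w_true, w_est)
```

A value of 1 means the two vectors point in the same direction, 0 means they are orthogonal, and -1 means they point in opposite directions.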

Some of these add-ons need helper functions to set them up:

`sample_analysis.addon.mk_scorers_for_cv`([…])	Create scorers to use with `cv()`.
`sample_analysis.addon.mk_test_statistics_scores`(…)	Calculate scores for test subjects.

Analyses that look into relations across datasets, and therefore need the outcomes of more than the current dataset, can be specified as postprocessors:

`sample_analysis.postproc.power`(res[, alpha])	Calculate power.
`sample_analysis.postproc.remove_between_assocs_perm`(res)	Removes `between_assocs_perm` from results dataset.
`sample_analysis.postproc.weights_pairwise_cossim_stats`(res)	Calculate cosine similarity between weights for all pairs of repetitions.
`sample_analysis.postproc.scores_pairwise_spearmansim_stats`(res)	Calculate Spearman correlation between scores for all pairs of repetitions.
`sample_analysis.postproc.remove_weights_loadings`(res)	Removes weights and loadings from result dataset.
`sample_analysis.postproc.remove_test_scores`(res)	Removes test scores from result dataset.
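
To make the notion of a postprocessor concrete, here is a minimal sketch of an empirical power estimate. It assumes that power is taken as the fraction of repetitions whose p-value falls below `alpha`. The function name `empirical_power` and the list of p-values are made up for illustration, and the package's own `power` may differ in its details.

```python
def empirical_power(p_values, alpha=0.05):
    """Fraction of repetitions that reject the null hypothesis at level alpha."""
    p_values = list(p_values)
    return sum(p < alpha for p in p_values) / len(p_values)

# Made-up p-values from five hypothetical repetitions.
power = empirical_power([0.001, 0.20, 0.03, 0.04, 0.60])
```

Such a quantity cannot be computed from a single dataset, which is why it belongs with the postprocessors and not with the add-ons.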

Finally, there are a number of analysis building blocks that we found useful:

`sample_analysis.macros.calc_p_value`(estr, X, Y)	Calculate permutation-based p-value.
`sample_analysis.macros.analyze_subsampled_and_resampled`(…)	Analyzes the given data with the given estimator.
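
The idea of a permutation-based p-value can be sketched without the package. The example below uses the absolute Pearson correlation of two one-dimensional variables as a stand-in for the association strength of a fitted estimator. The function `permutation_p_value` and its parameters are hypothetical and do not reproduce `calc_p_value`.

```python
import numpy as np

def permutation_p_value(x, y, n_perm=999, seed=0):
    """Permutation p-value for the absolute Pearson correlation of x and y."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    observed = abs(np.corrcoef(x, y)[0, 1])
    # Shuffling y breaks its pairing with x and so simulates the null hypothesis.
    null = np.array([
        abs(np.corrcoef(x, rng.permutation(y))[0, 1]) for _ in range(n_perm)
    ])
    # Count the observed statistic as one permutation so that p is never 0.
    return float((1 + np.sum(null >= observed)) / (1 + n_perm))

# Made-up data with a strong linear relationship.
rng = np.random.default_rng(1)
x = rng.normal(size=200)
y = x + 0.5 * rng.normal(size=200)
p = permutation_p_value(x, y)
```

Because the smallest attainable p-value is `1 / (n_perm + 1)`, the number of permutations limits the resolution of the test.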

Model selection

`model_selection.max_min_detector`(X, Y, p_max)	Hypothesis-test based method to jointly determine number of PCA and between-set components.
`model_selection.n_components_to_explain_variance`(…)	Given a covariance matrix, find the number of components necessary to explain at least `variance_threshold` of the variance.
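
The calculation described for `n_components_to_explain_variance` can be sketched as follows, assuming that `variance_threshold` is a fraction between 0 and 1 of the total variance. The function name `n_components_for_variance` is hypothetical, and the sketch is not the package's implementation.

```python
import numpy as np

def n_components_for_variance(Sigma, variance_threshold=0.9):
    """Smallest number of principal components whose cumulative share of
    the total variance reaches variance_threshold."""
    # Eigenvalues of a covariance matrix are the principal component variances.
    eigvals = np.linalg.eigvalsh(Sigma)[::-1]  # sort in descending order
    explained = np.cumsum(eigvals) / np.sum(eigvals)
    # Index of the first cumulative share >= threshold, converted to a count.
    k = int(np.searchsorted(explained, variance_threshold)) + 1
    # Guard against rounding error pushing the result past the matrix size.
    return min(k, len(eigvals))

# Made-up diagonal covariance: components explain 50%, 30%, 10%, 10%.
Sigma = np.diag([5.0, 3.0, 1.0, 1.0])
k = n_components_for_variance(Sigma, variance_threshold=0.85)
```

In this toy example the first two components explain 80% of the variance, so three components are needed to reach 85%.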

Plotting

`plot.mean_metric_curve`(metric[, rs, …])	Plots mean curves for given `rs` as a function of `n_per_ftr`.
`plot.heatmap_n_req`(n_req[, clabel])	Plots a heatmap of required number of samples as a function of number of features and true correlation.
`plot.polar_hist`(angles[, bins, mark_mean])	Plot a polar histogram.

Data

Preprocessing

`data.preprocessing.preproc_smith`(fc, sm[, …])	Data preprocessing pipeline from Smith et al.

Handling of included data files

`data.loaders.set_data_home`(data_home)	Set directory in which outcome data is stored.
`data.load_outcomes`(model[, estr, tag, …])	Load previously generated outcome data.
`data.generate_example_dataset`(model[, px, …])	Convenience function returning an example dataset for use with CCA or PLS.
`data.print_ds_stats`(ds)	Print outcome dataset statistics.

Utility functions

`util.rank_based_inverse_normal_trafo`(x[, c])	Rank-based inverse normal transformation.
`util.pc_spectrum_decay_constant`([X, …])	Estimate power-law decay constant.
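
For readers unfamiliar with the rank-based inverse normal transformation, the sketch below shows the common form Φ⁻¹((r − c) / (n − 2c + 1)), where r is the rank of a value among n observations. It assumes that `c` is the usual offset, with c = 3/8 (the Blom offset) as a default. The function `rank_inverse_normal` is a hypothetical standalone version that uses only the standard library and, unlike a full implementation, does not average the ranks of tied values.

```python
from statistics import NormalDist

def rank_inverse_normal(x, c=3.0 / 8):
    """Map values to standard-normal quantiles according to their ranks."""
    n = len(x)
    # Ranks 1..n in ascending order; ties are broken by position.
    order = sorted(range(n), key=lambda i: x[i])
    ranks = [0] * n
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    std_normal = NormalDist()
    return [std_normal.inv_cdf((r - c) / (n - 2 * c + 1)) for r in ranks]

# Made-up, heavily skewed data.
z = rank_inverse_normal([10.0, 0.1, 3.0, 250.0, 1.0])
```

The transformed values keep the ordering of the input but are spaced like quantiles of a standard normal distribution, so extreme values such as 250.0 no longer dominate.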