API reference
Sample size calculation
|
Suggest sample size for CCA. |
|
Suggest sample size for PLS. |
|
Calculate required sample sizes for accurate estimation of Pearson correlation. |
|
Determines the minimum required true correlation to achieve power and error levels. |
|
Determines the minimum required true correlation to achieve power and error levels. |
Estimators
|
Partial Least Squares estimators based on singular value decomposition. |
|
Canonical Correlation Analysis estimator based on singular value decomposition. |
|
Identical to sklearn.cross_decomposition.PLSCanonical, except that fit creates additional attributes for compatibility with SVDPLS and SVDCCA. |
|
Identical to sklearn.cross_decomposition.CCA, except that fit creates additional attributes for compatibility with SVDPLS and SVDCCA. |
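As a rough illustration of what an SVD-based CCA estimator computes, the following sketch whitens X and Y and takes the singular value decomposition of the whitened cross-covariance matrix. The function name `svd_cca` and the toy data are hypothetical and not part of this package's API; the estimators listed above may differ in details such as scaling and sign conventions.

```python
import numpy as np


def svd_cca(X, Y):
    """Sketch of CCA via SVD of the whitened cross-covariance matrix.

    Hypothetical helper for illustration only, not the package's estimator.
    Assumes both within-set covariance matrices are full rank.
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    Sxx = X.T @ X / (n - 1)
    Syy = Y.T @ Y / (n - 1)
    Sxy = X.T @ Y / (n - 1)

    def inv_sqrt(S):
        # Inverse matrix square root via eigendecomposition
        evals, evecs = np.linalg.eigh(S)
        return evecs @ np.diag(evals ** -0.5) @ evecs.T

    Sxx_is, Syy_is = inv_sqrt(Sxx), inv_sqrt(Syy)
    U, s, Vt = np.linalg.svd(Sxx_is @ Sxy @ Syy_is, full_matrices=False)
    # Map singular vectors back to the original variable space
    x_weights = Sxx_is @ U
    y_weights = Syy_is @ Vt.T
    return s, x_weights, y_weights


# Toy data: one shared latent variable z plus independent noise
rng = np.random.default_rng(0)
z = rng.normal(size=(500, 1))
X = np.hstack([z + rng.normal(size=(500, 1)), rng.normal(size=(500, 2))])
Y = np.hstack([z + rng.normal(size=(500, 1)), rng.normal(size=(500, 1))])
corrs, wx, wy = svd_cca(X, Y)
```

The singular values `corrs` are the canonical correlations, and the correlation between the scores `X @ wx[:, 0]` and `Y @ wy[:, 0]` equals `corrs[0]`.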
Synthetic data generation
|
Generate a joint covariance matrix for X and Y. |
|
Generate synthetic data for a given model. |
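To make the two steps above concrete, here is a minimal sketch that builds a joint covariance matrix for X and Y and then draws normally distributed data from it. The names `joint_covariance` and `sample_xy` are hypothetical, and the model (identity within-set covariances, a single correlated pair of variables) is a deliberately simple assumption; the package's generators are presumably more general.

```python
import numpy as np


def joint_covariance(px, py, r):
    """Joint covariance of (X, Y) for a toy model (hypothetical helper).

    Within-set covariances are identity matrices; only the first X variable
    and the first Y variable are correlated, with correlation r (|r| < 1).
    """
    Sigma = np.eye(px + py)
    Sigma[0, px] = Sigma[px, 0] = r
    return Sigma


def sample_xy(Sigma, px, n, seed=0):
    """Draw n samples from N(0, Sigma) and split the columns into X and Y."""
    rng = np.random.default_rng(seed)
    data = rng.multivariate_normal(np.zeros(len(Sigma)), Sigma, size=n)
    return data[:, :px], data[:, px:]


Sigma = joint_covariance(px=3, py=2, r=0.5)
X, Y = sample_xy(Sigma, px=3, n=1000)
```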
Analysis of CCA/PLS results
Analyze a given dataset with a given estimator. |
|
Analyze a given dataset and resampled versions of it with a given estimator. |
|
Analyze subsampled versions of a dataset with a given estimator. |
|
Parameter-dependent models are set up and resulting synthetic datasets are analyzed. |
Analysis add-ons
The functions in sample_analysis.analyzers only fit an estimator and return association strengths, weights and loadings. Additional analyses can be specified as add-on functions. The following add-ons are provided; custom ones can be used as long as they have the same function signature.
Removes weights and loadings from |
|
Removes |
|
Calculates cosine-distance between estimated and true weights. |
|
Calculates Spearman correlations between estimated and true test scores. |
|
Calculates Pearson correlations between estimated and true test scores. |
|
Calculates Pearson correlations between estimated and true loadings. Loadings are taken with respect to the possibly transformed variables, i.e. those in the columns of X and Y (not Xorig, Yorig). |
|
|
Calculates test scores. |
Removes |
|
|
Calculates Pearson correlations between test scores. |
Calculates cosine-similarities of principal component axes of X and Y with corresponding weights. |
|
Store penalties of a fitted SparseCCA estimator. |
|
|
Calculates cross-validated outcome metrics. |
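Several of the add-ons above compare weight vectors through their cosine similarity. The sketch below shows that quantity for a single pair of vectors. The function name and example vectors are hypothetical, and taking the absolute value before converting to a distance is an assumption made here because the sign of a weight vector is arbitrary; the package may resolve signs differently.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (hypothetical helper)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


w_true = np.array([1.0, 0.0, 0.0])
w_est = np.array([0.9, 0.1, -0.2])

similarity = cosine_similarity(w_est, w_true)
# Assumption: treat w and -w as equivalent when converting to a distance
distance = 1 - abs(similarity)
```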
Some of these add-ons require helper functions to set them up:
Create scorers to use with |
|
Calculate scores for test subjects. |
Analyses that look into relations across datasets, and therefore require the outcomes of more than the current dataset, can be specified as postprocessors:
|
Calculate power |
Removes between_assocs_perm from results dataset. |
|
Calculate cosine similarity between weights for all pairs of repetitions. |
|
|
Calculate cosine similarity between weights for all pairs of repetitions. |
Removes weights and loadings from result dataset. |
|
Removes test scores from result dataset. |
Finally, there are a number of analysis building blocks that we found useful:
|
Calculate permutation-based p-value. |
|
Analyzes the given data with the given estimator. |
|
Calculates statistics of the weight-similarities from pairs of synthetic datasets. |
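For orientation, a permutation-based p-value like the one mentioned among the building blocks above is commonly computed as the fraction of permuted statistics that are at least as large as the observed one. The sketch below uses the add-one correction so that the p-value is never exactly zero; whether the package applies this correction is an assumption, and `permutation_pvalue` is a hypothetical name.

```python
import numpy as np


def permutation_pvalue(observed, permuted):
    """One-sided permutation p-value with add-one correction (sketch)."""
    permuted = np.asarray(permuted)
    n_as_extreme = np.sum(permuted >= observed)
    return (1 + n_as_extreme) / (1 + len(permuted))


# Toy example: absolute Pearson correlation as the test statistic
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 0.5 * x + rng.normal(size=200)

observed = abs(np.corrcoef(x, y)[0, 1])
# Permuting y breaks the association between x and y
permuted = [abs(np.corrcoef(x, rng.permutation(y))[0, 1]) for _ in range(999)]
p = permutation_pvalue(observed, permuted)
```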
Model selection
|
Hypothesis-test based method to jointly determine number of PCA and between-set components. |
Given a covariance matrix, find the number of components necessary to explain at least variance_threshold of the variance.
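The variance-threshold rule can be sketched with the eigenvalues of the covariance matrix. This assumes that variance_threshold is a fraction between 0 and 1 and that components are ordered by decreasing explained variance; the function name is hypothetical and the package's implementation may differ.

```python
import numpy as np


def n_components_for_variance(cov, variance_threshold=0.9):
    """Smallest number of principal components whose cumulative explained
    variance fraction reaches variance_threshold (hypothetical helper)."""
    evals = np.linalg.eigvalsh(cov)[::-1]           # eigenvalues, descending
    explained = np.cumsum(evals) / np.sum(evals)    # cumulative fraction
    n = int(np.searchsorted(explained, variance_threshold)) + 1
    return min(n, len(evals))                       # guard against round-off


cov = np.diag([5.0, 3.0, 1.0, 1.0])
n_comp = n_components_for_variance(cov, variance_threshold=0.85)
```

With the example matrix the cumulative explained variance fractions are 0.5, 0.8, 0.9 and 1.0, so a threshold of 0.85 needs three components.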
Plotting
|
Plots mean curves for given |
|
Plots a heatmap of required number of samples as a function of number of features and true correlation. |
|
Plot a polar histogram. |
Data
Preprocessing
|
Data preprocessing pipeline from Smith et al. (2015). |
Handling of included data files
|
Set directory in which outcome data is stored. |
|
Load previously generated outcome data. |
|
Convenience function returning an example dataset for use with CCA or PLS. |
|
Print outcome dataset statistics. |
Utility functions
Rank-based inverse normal transformation. |
|
|
Estimate powerlaw decay constant. |
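The two utilities above can be sketched as follows. Both function names are hypothetical. The rank-based inverse normal transformation is shown with the Blom offset c = 3/8 and without special handling of ties, and the decay constant is estimated as the least-squares slope in log-log space; both choices are assumptions that may not match the package.

```python
from statistics import NormalDist

import numpy as np


def rank_int(x, c=3.0 / 8):
    """Rank-based inverse normal transformation (sketch, ties not averaged)."""
    x = np.asarray(x, dtype=float)
    ranks = np.argsort(np.argsort(x)) + 1            # ranks 1..n
    quantiles = (ranks - c) / (len(x) - 2 * c + 1)   # strictly inside (0, 1)
    nd = NormalDist()
    return np.array([nd.inv_cdf(q) for q in quantiles])


def powerlaw_decay_constant(values):
    """Slope of log(values) against log(index), with the index starting at 1.

    Assumes all values are positive and follow roughly values[i] ~ (i+1)**a.
    """
    values = np.asarray(values, dtype=float)
    idx = np.arange(1, len(values) + 1)
    slope, _intercept = np.polyfit(np.log(idx), np.log(values), deg=1)
    return slope


z = rank_int([3.0, 1.0, 2.0])
a = powerlaw_decay_constant(np.arange(1, 21, dtype=float) ** -1.5)
```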