Evaluation module
Created on 12/14/2021. Evaluation: implements evaluation of emission, arrangement, or full models. Assumes that data, likelihoods, and estimates come as NxKxP tensors. First are basic functions for save evaluation; second are more complex functions that use different criteria.
Author: dzhi, jdiedrichsen
- evaluation.ARI(U, Uhat, sparse=True)
Compute 1 - (adjusted Rand index) between the two parcellations
- Parameters:
U – The true U’s
Uhat – The estimated U’s from fitted model
- Returns:
the adjusted rand index score
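The adjusted Rand index itself can be computed from a contingency table of the two labelings. The following is a minimal sketch, assuming hard label vectors rather than probabilistic U's; `ari_sketch` is a hypothetical name and not the module's implementation.

```python
from collections import Counter
from math import comb

def ari_sketch(labels_true, labels_pred):
    """Adjusted Rand index for two hard labelings (illustrative only)."""
    n = len(labels_true)
    # Contingency counts and marginal counts of the two labelings
    pairs = Counter(zip(labels_true, labels_pred))
    a = Counter(labels_true)
    b = Counter(labels_pred)
    sum_ij = sum(comb(c, 2) for c in pairs.values())
    sum_a = sum(comb(c, 2) for c in a.values())
    sum_b = sum(comb(c, 2) for c in b.values())
    expected = sum_a * sum_b / comb(n, 2)   # index expected by chance
    max_index = 0.5 * (sum_a + sum_b)       # upper bound of the index
    return (sum_ij - expected) / (max_index - expected)

# Same partition with swapped label names: ARI = 1, so 1 - ARI = 0
ari = ari_sketch([0, 0, 1, 1], [1, 1, 0, 0])
one_minus_ari = 1 - ari
```

The sketch does not handle the degenerate case in which both labelings consist of a single cluster (zero denominator).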
- evaluation.BIC(loglik, N, d)
Bayesian Information Criterion
- Parameters:
loglik – the log-likelihood of the model
N – the number of examples in the training dataset
d – the number of parameters in the model
- Returns:
BIC statistic
References
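Page 235, The Elements of Statistical Learning, 2016. BIC = -2 * LL + log(N) * d, where LL is the log-likelihood (loglik), N the number of training examples, and d the number of parameters.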
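The formula can be written out directly. This is a small sketch using the standard library; `bic_sketch` and the example numbers are made up for illustration.

```python
import math

def bic_sketch(loglik, n, d):
    """BIC = -2 * LL + log(N) * d (illustrative only)."""
    return -2.0 * loglik + math.log(n) * d

# Two hypothetical models fit to the same 100 examples
bic_small = bic_sketch(loglik=-120.5, n=100, d=3)
bic_large = bic_sketch(loglik=-119.0, n=100, d=10)
```

With this sign convention a lower BIC is preferred, so the extra parameters of the second model must buy enough log-likelihood to offset the log(N) * d penalty.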
- evaluation.KL_divergence(p, q)
Computes the KL divergence between two multinomial distributions p and q. p and q are assumed to have the same shape (nsub, K, P) or mutually broadcastable shapes. For example, (nsub, K, P) and (K, P) are valid shapes.
- Parameters:
p (ndarray or tensor) – the true probability distribution. Typically it has a shape of (sub, K, P), where the dimension K represents the multinomial distribution, which must sum to 1
q (ndarray or tensor) – the predicted probability distribution. Typically it has a shape of (sub, K, P), where the dimension K represents the multinomial distribution, which must sum to 1
- Returns:
KL_div (float) – The KL divergence between p and q.
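As a rough illustration of the quantity, the sketch below computes KL(p || q) for a single (K, P) pair stored as nested lists. Averaging over the P locations is an assumption made here; the reduction used by the module is not stated above. `kl_divergence_sketch` is a hypothetical name.

```python
import math

def kl_divergence_sketch(p, q):
    """KL(p || q) for K x P nested lists; each column sums to 1 over K."""
    K, P = len(p), len(p[0])
    total = 0.0
    for j in range(P):
        for k in range(K):
            if p[k][j] > 0:  # terms with p = 0 contribute 0
                total += p[k][j] * math.log(p[k][j] / q[k][j])
    return total / P  # assumption: average over locations

p = [[1.0, 0.5],
     [0.0, 0.5]]
q = [[0.5, 0.5],
     [0.5, 0.5]]
kl = kl_divergence_sketch(p, q)
```

The divergence is zero only where p and q agree, and it is not symmetric in its arguments.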
- evaluation.align_models(models, in_place=True)
Aligns the marginal probabilities across different models. If in_place = True, it changes arrangement and emission models … Note that the models will be changed!
- Parameters:
models (list) – List of full models
in_place (bool) – Changes the models in place
- Returns:
Marginal probability – (n_models x K x n_vox) tensor
- evaluation.calc_consistency(params, dim_rem=None)
Calculates consistency across a number of different solutions to a problem (after alignment of the rows/columns). Computes cosine similarities across the entire matrix of params.
- Parameters:
params (pt.tensor) – n_sol x N x P array of data
dim_rem (int) – Dimension along which to remove the mean. None: no mean removal; 0: remove the overall mean of the matrices; 1: remove the row mean; 2: remove the column mean
- Returns:
R (pt.tensor) – n_sol x n_sol matrix of cosine similarities
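The computation described above can be sketched with nested lists in place of tensors. Reading "row mean" as the mean of each row and "column mean" as the mean of each column is an assumption, and `consistency_sketch` is a hypothetical name.

```python
import math

def consistency_sketch(params, dim_rem=None):
    """Cosine similarity between flattened N x P solutions (illustrative)."""
    flat = []
    for mat in params:
        rows = [list(r) for r in mat]
        n, p = len(rows), len(rows[0])
        if dim_rem == 0:    # remove the overall mean of the matrix
            m = sum(sum(r) for r in rows) / (n * p)
            rows = [[v - m for v in r] for r in rows]
        elif dim_rem == 1:  # remove the mean of each row
            rows = [[v - sum(r) / p for v in r] for r in rows]
        elif dim_rem == 2:  # remove the mean of each column
            col = [sum(r[j] for r in rows) / n for j in range(p)]
            rows = [[v - col[j] for j, v in enumerate(r)] for r in rows]
        flat.append([v for r in rows for v in r])
    norms = [math.sqrt(sum(v * v for v in f)) for f in flat]
    return [[sum(a * b for a, b in zip(fi, fj)) / (ni * nj)
             for fj, nj in zip(flat, norms)]
            for fi, ni in zip(flat, norms)]

sols = [[[1, 2], [3, 4]],
        [[1, 2], [3, 4]],
        [[-1, -2], [-3, -4]]]
R = consistency_sketch(sols)
```

Identical solutions give a similarity of 1 and sign-flipped solutions give -1, which is why the rows/columns need to be aligned before the comparison.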
- evaluation.calc_test_error(M, tdata, U_hats, coserr_type='expected', coserr_adjusted=True, fit_emission='full')
Evaluates the prediction (cosine error) for group or individual parcellations on some new test data. For this evaluation, we need to obtain a V for the new test data. The V is learned from N-1 subjects and then used to evaluate the left-out subject for each parcellation. If fit_emission is:
‘full’: The emission model and individual Uhats are fully refit for each fold (the arrangement model is fixed)
‘use_Uhats’: Using the individual Uhats, the V is estimated using a single M-step
Because the emission model is retrained for each subject (and that can take a bit of time), the function evaluates a whole set of different parcellations (group, noise-floor, individual) in one go.
- Parameters:
M (full model) – Full model including emission model for test data.
tdata (ndarray or 3d-pt.tensor) – (numsubj x N x P) array or tensor of test data
U_hats (list) – List of strings and/or tensors. Each element of the list can be:
3d-pt.tensor: (nsubj x K x P) tensor of individual parcellations
2d-pt.tensor: (K x P) tensor of group parcellation
‘group’: Group parcellation from arrangement model
‘floor’: Noise floor (E-step on left-out subject)
fit_emission (str) – ‘full’: fit the emission model and individual Uhats; ‘use_Uhats’: use the individual Uhats to derive V
coserr_type (str) – Type of cosine error (‘hard’, ‘average’, ‘expected’)
coserr_adjusted (bool) – Whether to use the adjusted cosine error
- Returns:
A num_eval x num_subj matrix of cosine errors, 1 row for each element in U_hats
- evaluation.coserr(Y, V, U, adjusted=False, soft_assign=True)
For backwards compatibility
- evaluation.coserr_2(Y, V, U, adjusted=False, soft_assign=True)
For backwards compatibility
- evaluation.cosine_error(Y, V, U, adjusted=False, type='expected')
Compute the cosine error between the data and the prediction of the probabilistic model. For mathematical details, see https://hierarchbayesparcel.readthedocs.io/en/latest/math.html
- Parameters:
Y (pt.tensor) – the test data, with a shape (num_sub, N, P) or (N,P) for one subject
V (pt.tensor) – the predicted mean directions (unit length) per parcel (N, K)
U (pt.tensor) – the expected U’s from the trained emission model (n_subj,K,P) or (K,P) for group model
adjusted (bool) – Is the weight of each voxel adjusted by the magnitude of the data? If yes, the cosine error is 2(1-R^2)
type (str) – ‘hard’: do a hard assignment and use the V from the parcel with maximal probability; ‘average’: compute the cosine error for the average prediction (across parcels); ‘expected’: compute the average cosine error across all predictions of the parcels
- Returns:
Cosine Error (pt.tensor) – (num_subj) tensor of cosine errors, ranging from 0 (same direction) to 2 (opposite direction)
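The three error types can be compared on a toy example. The sketch below is one reading of the descriptions above for a single subject and the unadjusted case only, with nested lists in place of tensors; `cosine_error_sketch` is a hypothetical name and the averaging over voxels is an assumption.

```python
import math

def _cos(a, b):
    """Cosine similarity between two vectors."""
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return sum(x * y for x, y in zip(a, b)) / (na * nb)

def cosine_error_sketch(Y, V, U, type="expected"):
    """Y: N x P data, V: N x K unit-length directions, U: K x P probabilities."""
    N, P, K = len(Y), len(Y[0]), len(U)
    v_cols = [[V[n][k] for n in range(N)] for k in range(K)]
    errs = []
    for p in range(P):
        y = [Y[n][p] for n in range(N)]
        u = [U[k][p] for k in range(K)]
        if type == "hard":        # V of the most probable parcel
            k_max = max(range(K), key=lambda k: u[k])
            errs.append(1.0 - _cos(y, v_cols[k_max]))
        elif type == "average":   # error of the probability-weighted prediction
            pred = [sum(u[k] * v_cols[k][n] for k in range(K)) for n in range(N)]
            errs.append(1.0 - _cos(y, pred))
        elif type == "expected":  # probability-weighted average of the errors
            errs.append(sum(u[k] * (1.0 - _cos(y, v_cols[k])) for k in range(K)))
        else:
            raise ValueError("type must be 'hard', 'average' or 'expected'")
    return sum(errs) / P  # assumption: unweighted mean over voxels

V = [[1.0, 0.0],
     [0.0, 1.0]]          # two orthogonal unit directions
Y = [[2.0, 0.0],
     [0.0, 3.0]]          # voxel 0 follows parcel 0, voxel 1 follows parcel 1
U_sure = [[1.0, 0.0], [0.0, 1.0]]
U_flat = [[0.5, 0.5], [0.5, 0.5]]
err_sure = cosine_error_sketch(Y, V, U_sure, type="expected")
err_flat = cosine_error_sketch(Y, V, U_flat, type="expected")
```

With an uninformative U the ‘expected’ error rises while the ‘average’ error stays lower, because averaging the predictions before comparing partially cancels the mismatch.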
- evaluation.cross_entropy(p, q)
Computes the cross-entropy between two multinomial distributions p and q. p and q are assumed to have the same shape (nsub, K, P) or mutually broadcastable shapes. For example, (nsub, K, P) and (K, P) are valid shapes.
- Parameters:
p (ndarray or tensor) – the true probability distribution. Typically it has a shape of (sub, K, P), where the dimension K represents the multinomial distribution, which must sum to 1
q (ndarray or tensor) – the predicted probability distribution. Typically it has a shape of (sub, K, P), where the dimension K represents the multinomial distribution, which must sum to 1
- Returns:
ce (float) – The cross-entropy between p and q.
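For a single distribution over K categories, the cross-entropy decomposes into the entropy of p plus the KL divergence from p to q. The sketch below checks this identity numerically; all function names are hypothetical, and the reduction over subjects and locations used by the module is not covered.

```python
import math

def entropy_sketch(p):
    return -sum(pk * math.log(pk) for pk in p if pk > 0)

def cross_entropy_sketch(p, q):
    return -sum(pk * math.log(qk) for pk, qk in zip(p, q) if pk > 0)

def kl_sketch(p, q):
    return sum(pk * math.log(pk / qk) for pk, qk in zip(p, q) if pk > 0)

p = [0.7, 0.2, 0.1]   # "true" distribution over K = 3 parcels
q = [0.5, 0.3, 0.2]   # predicted distribution
ce = cross_entropy_sketch(p, q)
gap = ce - (entropy_sketch(p) + kl_sketch(p, q))  # numerically zero
```

Because the entropy of p does not depend on q, ranking predictions by cross-entropy or by KL divergence against a fixed p gives the same order.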
- evaluation.dice_coefficient(labels1, labels2, label_matching=True, separate=False)
Compute the Dice coefficient between two parcellations
- Parameters:
labels1 (pt.Tensor) – a 1d tensor of parcellation 1
labels2 (pt.Tensor) – a 1d tensor of parcellation 2
- Returns:
the Dice coefficient between the two parcellations
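A per-label Dice score is 2|A ∩ B| / (|A| + |B|), where A and B are the sets of locations carrying that label in each parcellation. The sketch below averages this score over labels and assumes the labels are already matched between the two parcellations; it does not reproduce the label_matching or separate options, and `dice_sketch` is a hypothetical name.

```python
def dice_sketch(labels1, labels2):
    """Mean per-label Dice coefficient for two 1d label sequences."""
    scores = []
    for lab in sorted(set(labels1) | set(labels2)):
        a = {i for i, v in enumerate(labels1) if v == lab}
        b = {i for i, v in enumerate(labels2) if v == lab}
        scores.append(2 * len(a & b) / (len(a) + len(b)))
    return sum(scores) / len(scores)

# Label 0: 2*1/(2+1) = 2/3, label 1: 2*2/(2+3) = 4/5
dice = dice_sketch([0, 0, 1, 1], [0, 1, 1, 1])
```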
- evaluation.evaluate_U(U_true, U_predict, crit='u_prederr')
Evaluates an emission model on a given data set using a given criterion. This data set can be the training data set (including U and signal, if applicable) or a new data set.
- Parameters:
U_predict – the predicted arrangement
U_true – the reference (true) arrangement
crit – the criterion to be used to evaluate the models
- Returns:
evaluation results
- evaluation.evaluate_completion_arr(arM, emM, data, part, crit='Ecos_err', Utrue=None)
Evaluates an arrangement model on a new data set using pattern completion from partition to partition, using a leave-one-partition-out cross-validation approach.
- Parameters:
arM (ArrangementModel) – arrangement model
emloglik (tensor) – Emission log-likelihood
part (Partitions) – P tensor with partition indices
crit (str) – ‘logpY’, ‘u_abserr’, ‘cos_err’
Utrue (tensor) – For u_abserr you need to provide the true U’s
- Returns:
mean evaluation criterion
- evaluation.evaluate_full_arr(emM, data, Uhat, crit='cos_err')
Evaluates an arrangement model on a new data set using pattern completion from partition to partition, using a leave-one-partition-out cross-validation approach.
- Parameters:
emM (EmissionModel)
data (tensor) – Y-data or U true (depends on crit)
Uhat (tensor) – Probability for each node (expected U)
crit (str) – ‘logpy’, ‘u_abserr’, ‘cos_err’, ‘Ecos_err’
- Returns:
evaluation criterion
- evaluation.extract_V(models)
Extracts emission model vectors from a list of models
- Parameters:
models (list) – List of FullMultiModel
- Returns:
list – of ndarrays (n_models x M x K) for each emission model
- evaluation.extract_kappa(model)
Summarizes kappas from a model. All emission models need to be either uniform or non-uniform.
- Parameters:
model (FullMultiModel) – Model to extract kappas from
- Returns:
pt_tensor – (n_emission,K) matrix or (n_emission,) vector
- evaluation.extract_marginal_prob(models)
Extracts marginal probability values
- Parameters:
models (list) – List of FullMultiModel
- Returns:
Marginal probability – (n_models x K x n_vox) tensor
- evaluation.homogeneity(Y, U_hat, soft_assign=False, z_transfer=False, single_return=True)
Compute the global homogeneity measure for a given parcellation. The homogeneity is defined as the averaged correlation (Pearson’s) between all vertex pairs within a parcel. The global homogeneity is then calculated as the mean across all parcels.
- Parameters:
Y – The underlying data used to compute the correlation. It must have a shape of (N, P), where N is the number of task activations and P is the number of brain locations.
U_hat – the given probabilistic parcellation, shape (K, P)
soft_assign – if True, compute the expected homogeneity. Otherwise, compute the homogeneity from the hard assignment.
z_transfer – if True, apply r-to-z transformation
single_return – if True, return the global homogeneity measure only. Otherwise, return the homogeneity measure per parcel, and the number of vertices within each parcel.
- Returns:
g_homogeneity – the global resting-state homogeneity measure
global_homo – the mean homogeneity for each parcel
N – the valid (non-NaN) number of vertices in each parcel
Notes
For the homogeneity measure, a higher value indicates that the parcellation performs better. For the inhomogeneity measure, a lower value indicates better parcellation performance.
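The definition can be sketched for the hard-assignment case without the r-to-z transform. Nested lists stand in for tensors, parcels with fewer than two vertices are skipped (an assumption), and `homogeneity_sketch` is a hypothetical name.

```python
import math

def _pearson(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

def homogeneity_sketch(Y, U_hat):
    """Y: N x P data, U_hat: K x P probabilities. Returns (global, per parcel)."""
    N, P, K = len(Y), len(Y[0]), len(U_hat)
    # Hard assignment: each vertex goes to its most probable parcel
    labels = [max(range(K), key=lambda k: U_hat[k][p]) for p in range(P)]
    per_parcel = []
    for k in range(K):
        members = [p for p in range(P) if labels[p] == k]
        pairs = [(i, j) for a, i in enumerate(members) for j in members[a + 1:]]
        if not pairs:  # parcel has fewer than two vertices
            continue
        cols = {p: [Y[n][p] for n in range(N)] for p in members}
        per_parcel.append(
            sum(_pearson(cols[i], cols[j]) for i, j in pairs) / len(pairs))
    return sum(per_parcel) / len(per_parcel), per_parcel

Y = [[1, 2, 1, 3],
     [2, 4, 2, 2],
     [3, 6, 3, 1]]
U_hat = [[0.9, 0.8, 0.1, 0.2],
         [0.1, 0.2, 0.9, 0.8]]
g_homogeneity, per_parcel = homogeneity_sketch(Y, U_hat)
```

In this toy example the first parcel groups two perfectly correlated vertices and the second groups two perfectly anti-correlated ones, so the per-parcel values are 1 and -1.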
- evaluation.logpY(emloglik, Uhat)
Averaged log of <p(Y|U)>q. Not sure anymore that this criterion makes a lot of sense.
- evaluation.matching_U(U_true, U_predict)
Matches the parcel labels of U_hat with the true Us.
- Parameters:
U_true – The ground truth Us
U_predict – U_hat, the predicted Us
- Returns:
The U_hat with aligned labels
- evaluation.matching_greedy(Y_target, Y_source)
Matches the rows of the Y_source matrix to the rows of Y_target, using row-wise correlation and matching the most highly correlated pairs consecutively.
- Parameters:
Y_target – Matrix to align to
Y_source – Matrix that is being aligned
- Returns:
indx – New indices, so that Y_source[indx,:] ~= Y_target
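The greedy procedure described above can be sketched as follows: compute all row-by-row correlations, then repeatedly accept the best remaining pair whose target row and source row are both still unmatched. Nested lists stand in for matrices and `matching_greedy_sketch` is a hypothetical name.

```python
import math

def _pearson(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

def matching_greedy_sketch(Y_target, Y_source):
    """Return indx such that row indx[i] of Y_source matches row i of Y_target."""
    K = len(Y_target)
    corr = {(i, j): _pearson(Y_target[i], Y_source[j])
            for i in range(K) for j in range(K)}
    indx = [None] * K
    used = set()
    # Visit pairs from highest to lowest correlation
    for (i, j), _ in sorted(corr.items(), key=lambda kv: kv[1], reverse=True):
        if indx[i] is None and j not in used:
            indx[i] = j
            used.add(j)
    return indx

Y_target = [[1, 2, 3, 4],
            [4, 3, 2, 1]]
Y_source = [[8, 6, 4, 2],   # resembles target row 1
            [2, 4, 6, 8]]   # resembles target row 0
indx = matching_greedy_sketch(Y_target, Y_source)
aligned = [Y_source[j] for j in indx]
```

A greedy match is fast but not guaranteed to maximize the total correlation over all pairs.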
- evaluation.mean_adjusted_sse(data, prediction, U_hat, adjusted=True, soft_assign=True)
Calculate the adjusted squared error for goodness of model fitting
- Parameters:
data – the real mean-centered data, shape (n_subject, n_conditions, n_locations)
prediction – the predicted mu with shape (n_conditions, n_clusters)
U_hat – the probability that brain location i belongs to cluster k
adjusted – if True, calculate the adjusted SSE; otherwise, the normal SSE
soft_assign – if True, use the expected U over all k clusters; if False, take the argmax over the k probabilities
- Returns:
The adjusted SSE
- evaluation.nmi(U, Uhat)
Compute the normalized mutual information score
- Parameters:
U – The real U’s
Uhat – The estimated U’s from fitted model
- Returns:
the normalized mutual information score
- evaluation.permutations(res, nums, l, h)
Recursive backtracking algorithm to find all permutations
- Parameters:
res – resultant combinations
nums – the original array to find permutations
l – left pointer
h – right pointer
- Returns:
recursive return of `res`
- evaluation.permute(nums)
Function to get the permutations
- Parameters:
nums – The input array to find all permutations
- Returns:
All permutations without replicates
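A swap-based backtracking scheme with a left pointer l and a right pointer h is one common way to implement such a pair of functions. The sketch below follows the parameter names above but is not taken from the module; it assumes the input elements are distinct, since repeated values would produce repeated permutations.

```python
def permutations_sketch(res, nums, l, h):
    """Collect all permutations of nums[l..h] into res by swapping in place."""
    if l == h:
        res.append(nums.copy())
        return res
    for i in range(l, h + 1):
        nums[l], nums[i] = nums[i], nums[l]      # choose element i for slot l
        permutations_sketch(res, nums, l + 1, h)
        nums[l], nums[i] = nums[i], nums[l]      # backtrack: undo the swap
    return res

def permute_sketch(nums):
    res = []
    permutations_sketch(res, list(nums), 0, len(nums) - 1)
    return res

perms = permute_sketch([0, 1, 2])
```

The number of permutations grows as K!, so exhaustive search of this kind is only practical for a small number of parcels.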
- evaluation.pt_nanstd(tensor, dim=None)
Compute the standard deviation of tensor along the specified dimension.
- Parameters:
tensor (torch.Tensor) – the given pytorch tensor
dim (int) – the dimension along which to compute the standard deviation. If None, compute the standard deviation of the flattened tensor.
- Returns:
the standard deviation of tensor along the specified dimension
- evaluation.rmse_YUhat(U_pred, data, prediction, soft_assign=True)
Compute the RMSE between true and predicted V’s
- Parameters:
U_pred – the inferred U hat on the training set using the fitted model
data – the true data, shape (num_sub, N, P)
prediction – the predicted V’s, shape (N, K)
soft_assign – if True, compute the expected RMSE; otherwise, use the argmax
- Returns:
the RMSE
- evaluation.task_inhomogeneity(Y, U_hat, z_transfer=True, single_return=True)
Compute the global inhomogeneity measure for a given parcellation. The task inhomogeneity is defined as the standard deviation of activation z-values within each parcel. The task inhomogeneity of a contrast is then calculated as the weighted mean across all parcels. Finally, the global task inhomogeneity is averaged across all task contrasts.
- Parameters:
Y – The underlying task contrasts with a shape of (N, P), where N is the number of task activations and P is the number of brain locations.
U_hat – the given probabilistic parcellation, shape (K, P)
z_transfer – if True, apply r-to-z transformation
single_return – if True, return the global homogeneity measure only. Otherwise, return the homogeneity measure per parcel, and the number of vertices within each parcel.
- Returns:
global_homo – the mean homogeneity for each parcel
N – the valid (non-NaN) number of vertices in each parcel
Notes
For the inhomogeneity measure, a lower value indicates better parcellation performance.
- evaluation.u_abserr(U, uhat)
Absolute error on U
- Parameters:
U (tensor) – Real U’s
uhat (tensor) – Estimated U’s from arrangement model
- evaluation.u_prederr(U, uhat, expectation=True)
Prediction error on U
- Parameters:
U – The true U (tensor like)
uhat – The predicted U’s from emission model
expectation – if True, calculate the expected error; otherwise, calculate the hard-assignment error between the true U and the inferred Uhat
- Returns:
urpred – the averaged prediction error
- evaluation.unravel_index(index, shape)
Equivalent of np.unravel_index() that supports PyTorch tensors
- Parameters:
index – An integer array whose elements are indices into the flattened version of an array of dimensions shape.
shape – the shape of the Tensor to use for unraveling indices.
- Returns:
unraveled coords – Each array in the tuple has the same shape as the indices tensor.
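For a single integer index, the unraveling amounts to repeated division by the dimension sizes, starting from the last dimension (row-major order). The sketch below handles one scalar index only, not an index tensor; `unravel_index_sketch` is a hypothetical name.

```python
def unravel_index_sketch(index, shape):
    """Convert a flat row-major index into a tuple of coordinates."""
    coords = []
    for dim in reversed(shape):
        coords.append(index % dim)   # position within this dimension
        index //= dim                # carry the remainder to the next one
    return tuple(reversed(coords))

coords = unravel_index_sketch(22, (7, 6))   # 22 = 3 * 6 + 4
```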