Full_model module

class full_model.FullMultiModel(arrange, emission)

The full generative model contains an arrangement model and multiple emission models, for training across datasets.

Estep(separate_ll=False)

E-step for the full model. Runs one full pass of the EM procedure on both the arrangement and emission models.

Parameters:

separate_ll – if True, return separate ll_A and ll_E

Returns:

Uhat – the prediction U_hat
ll – the log-likelihood for arrangement and emission models

clear()

Clears the data from all emission models and the temporary statistics from the arrangement model.

collect_evidence(emloglik)

Collects evidence over the different data sets. For subjects that are in multiple data sets, it sums the log evidence.

Parameters:: emloglik (list) – List of emission log-likelihoods
Returns:: emloglik_comb (ndarray) – ndarray of emission log-likelihoods
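
The summation over shared subjects can be sketched in plain Python as follows. This is only an illustration, not the library's implementation: the function name `collect_evidence_sketch`, the `subj_ind` and `n_subj` arguments, and the use of scalars in place of per-subject arrays are all assumptions made for the example.

```python
def collect_evidence_sketch(emloglik, subj_ind, n_subj):
    """Sum per-dataset log evidence into one entry per unique subject.

    emloglik: list of datasets, each a list of per-subject values
              (scalars here, standing in for per-subject arrays).
    subj_ind: subj_ind[d][i] is the unique-subject index of the
              i-th subject in dataset d (hypothetical argument).
    n_subj:   number of unique subjects (hypothetical argument).
    """
    combined = [0.0] * n_subj
    for ll_d, ind_d in zip(emloglik, subj_ind):
        for value, s in zip(ll_d, ind_d):
            combined[s] += value  # log evidence adds across datasets
    return combined


emloglik = [[-1.0, -2.0], [-0.5, -3.0]]
subj_ind = [[0, 1], [1, 2]]  # subject 1 appears in both datasets
combined = collect_evidence_sketch(emloglik, subj_ind, n_subj=3)
```

Here subject 1 receives the sum of its two entries, while subjects 0 and 2 keep the value from their single dataset.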

distribute_evidence(Uhat)

Splits the evidence among the different emission models

Parameters:: Uhat (pt.tensor) – tensor of estimates in the space of the arrangement model
Returns:: Usplit (list) – List of Uhats (per emission model)
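
A minimal sketch of the splitting step, assuming each emission model keeps a list of unique-subject indices. The function name `distribute_evidence_sketch` and the `subj_ind` argument are hypothetical, and strings stand in for the per-subject slices of Uhat.

```python
def distribute_evidence_sketch(Uhat, subj_ind):
    """Return one list of per-subject estimates for each emission model.

    Uhat:     list with one entry per unique subject.
    subj_ind: subj_ind[d] lists the unique-subject indices that
              belong to emission model d (hypothetical argument).
    """
    return [[Uhat[s] for s in ind_d] for ind_d in subj_ind]


Uhat = ["u0", "u1", "u2"]
Usplit = distribute_evidence_sketch(Uhat, [[0, 1], [1, 2]])
```

A subject that belongs to several datasets (index 1 here) appears in the output for each of them.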

fit_em(iter=30, tol=0.01, seperate_ll=False, fit_emission=True, fit_arrangement=True, first_evidence=False)

Runs the EM algorithm on a full model. This requires that both the emission and arrangement models have a full E-step and M-step and can calculate the likelihood, including the partition function.

Parameters:

iter (int) – Maximal number of iterations (def:30)
tol (double) – Tolerance on overall likelihood (def: 0.01)
seperate_ll (bool) – Return arrangement and emission LL separately
fit_emission (list / array of bools) – If True, fit emission model. Otherwise, freeze it
fit_arrangement – If True, fit the arrangement model. Otherwise, freeze it
first_evidence (bool or list of bool) – Determines whether evidence is passed from the emission models to the arrangement model on the first iteration. Default: False. This is meant to improve estimation for emission models that start from random values, as the initial guess is then determined by the initialization of the arrangement model. If a list of bools, it determines this for each emission model separately. In this case the first log-likelihood is not valid and is set to -inf.

Returns:

model (Full Model) – fitted model (also updated)
ll (ndarray) – Log-likelihood of the full model as a function of iteration. If seperate_ll is True, the first column is ll_A and the second is ll_E
theta (ndarray) – History of the parameter vector
Uhat (pt.tensor) – (n_subj,K,P) matrix of estimates. Note that this is in the space of the arrangement model; call distribute_evidence(Uhat) to get it in the space of the emission models
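
To illustrate how an iteration cap and a likelihood tolerance interact in an EM loop, here is a toy sketch. It is not the library's code: it fits the two means of an equal-weight, unit-variance Gaussian mixture, and it assumes the loop stops once the log-likelihood gain falls below `tol`. The name `em_toy_sketch` is hypothetical, and `max_iter` plays the role of `iter`.

```python
import math


def em_toy_sketch(data, mu, max_iter=30, tol=0.01):
    """Toy EM for a two-component, unit-variance, equal-weight mixture.

    Returns the final means and the log-likelihood history.
    """
    ll = []
    for i in range(max_iter):
        # E-step: responsibilities and log-likelihood under current means
        resp, ll_i = [], 0.0
        for y in data:
            p = [0.5 * math.exp(-0.5 * (y - m) ** 2) / math.sqrt(2 * math.pi)
                 for m in mu]
            total = p[0] + p[1]
            resp.append([p[0] / total, p[1] / total])
            ll_i += math.log(total)
        ll.append(ll_i)
        # Stop once the gain in log-likelihood drops below tol
        if i > 0 and ll[i] - ll[i - 1] < tol:
            break
        # M-step: responsibility-weighted means
        mu = [
            sum(r[k] * y for r, y in zip(resp, data)) / sum(r[k] for r in resp)
            for k in range(2)
        ]
    return mu, ll


data = [-2.1, -1.9, -2.0, 2.0, 1.8, 2.2]
mu_hat, ll_hist = em_toy_sketch(data, mu=[-1.0, 1.0])
```

The returned history has one entry per completed E-step, so its length shows how many iterations ran before the tolerance was met.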

fit_em_ninits(n_inits=20, first_iter=7, iter=30, tol=0.01, fit_emission=True, fit_arrangement=True, init_emission=True, init_arrangement=True, align='arrange', verbose=True)

Runs the EM algorithm on a full model from n_inits random initializations and escapes local maxima by selecting the model with the highest likelihood after first_iter iterations. This requires that both the emission and arrangement models have a full E-step and M-step and can calculate the likelihood, including the partition function.

Parameters:

n_inits – the number of random inits
first_iter – the number of initial iterations run for each random initialization, used to find the initial parameters with maximal likelihood
iter – the number of iterations for full EM process
tol – Tolerance on overall likelihood (def: 0.01)
fit_emission (list) – If True, fit emission model. Otherwise, freeze it
fit_arrangement – If True, fit arrangement model. Otherwise, freeze it
init_emission (list or bool) – Randomly initialize emission models before fitting?
init_arrangement (bool) – Randomly initialize arrangement model before fitting?
align (None, ‘arrange’, or int) – Determines how alignment on the first step is performed:

None: Not performed. Emission models may not get aligned.
‘arrange’: From the arrangement model only (works only with a spatially non-flat, i.e. random, initialization).
int: From the emission model with number i. Works with a spatially flat initialization of the arrangement model.
verbose – if set to true, gives memory update for each iteration

Returns:

model (Full Model) – fitted model (also updated)
ll (ndarray) – Log-likelihood of the best full model as a function of iteration. The initial iterations are included
theta (ndarray) – History of the parameter vector
Uhat – the predicted U (probabilistic)
first_lls – the log-likelihoods for the n_inits random parameters after first_iter runs
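
The multi-start strategy can be sketched on a toy problem: run a few EM iterations from each random start, keep the start with the highest likelihood, then continue from it. This is an illustration under stated assumptions, not the library's implementation. The toy model (an equal-weight, unit-variance mixture of two Gaussians), the helper `em_step`, the function `fit_ninits_sketch`, and the `seed` argument are all hypothetical; `max_iter` plays the role of `iter`. Unlike the documented return value, the history here covers only the continued run.

```python
import math
import random


def em_step(data, mu):
    """One EM iteration; returns (updated means, log-likelihood under mu)."""
    resp, ll = [], 0.0
    for y in data:
        p = [0.5 * math.exp(-0.5 * (y - m) ** 2) / math.sqrt(2 * math.pi)
             for m in mu]
        total = p[0] + p[1]
        resp.append([p[0] / total, p[1] / total])
        ll += math.log(total)
    new_mu = [
        sum(r[k] * y for r, y in zip(resp, data)) / sum(r[k] for r in resp)
        for k in range(2)
    ]
    return new_mu, ll


def fit_ninits_sketch(data, n_inits=20, first_iter=7, max_iter=30, seed=0):
    rng = random.Random(seed)
    best_mu, best_ll, first_lls = None, -math.inf, []
    for _ in range(n_inits):
        # Random start, followed by a short run of first_iter iterations
        mu = [rng.uniform(-3.0, 3.0), rng.uniform(-3.0, 3.0)]
        for _ in range(first_iter):
            mu = em_step(data, mu)[0]
        ll = em_step(data, mu)[1]  # likelihood of the parameters reached
        first_lls.append(ll)
        if ll > best_ll:
            best_mu, best_ll = mu, ll
    # Continue the remaining iterations from the best start only
    mu, ll_hist = best_mu, []
    for _ in range(max_iter - first_iter):
        mu, ll = em_step(data, mu)
        ll_hist.append(ll)
    return mu, ll_hist, first_lls


data = [-2.1, -1.9, -2.0, 2.0, 1.8, 2.2]
mu_hat, ll_hist, first_lls = fit_ninits_sketch(data)
```

Comparing `first_lls` across starts shows how much the short runs differ, which is what makes the selection step worthwhile.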

fit_sml(iter=60, batch_size=None, stepsize=0.8, estep='sample', seperate_ll=False, fit_emission=True, fit_arrangement=True, first_evidence=True)

Runs a stochastic maximum likelihood algorithm on a full model. The emission model is still assumed to have an E-step and an M-step. The arrangement model has a positive- and negative-phase E-step and a gradient M-step. The arrangement likelihood is not necessarily

FUTURE EXTENSIONS:
* Sampling of subjects from training set
* Initialization of parameters
* Adaptive stopping criteria
* Adaptive stepsizes
* Gradient acceleration methods

Parameters:

Y (3d-ndarray) – numsubj x N x numvoxel array of data
iter (int) – Maximal number of iterations
stepsize (double) – Fixed step size for MStep

Returns:

model (Full Model) – fitted model (also updated)
ll (ndarray) – Log-likelihood of the full model as a function of iteration. If seperate_ll is True, the first column is ll_A and the second is ll_E
theta (ndarray) – History of the parameter vector

get_param_indices(name)

Return the indices for the full model theta vector

Parameters:: name (str) – Parameter name in the format of ‘arrange.logpi’ or ‘emissions.<X>.V’ where <X> is the index of emission model. For example ‘emissions.0.V’ will return the Vs from self.emissions[0]
Returns:: indices (np.ndarray) – 1-d numpy array of indices into the theta vector
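
The relation between parameter names and positions in a concatenated theta vector can be sketched as below. The function `build_index_sketch` and the parameter sizes are hypothetical. The sketch also assumes each model contributes one contiguous block, in the order arrangement model first and then each emission model, whereas a real emission model may hold more parameters than V.

```python
def build_index_sketch(param_sizes):
    """Map each parameter name to its indices in the concatenated vector.

    param_sizes: ordered list of (name, number_of_parameters) pairs.
    """
    indices, start = {}, 0
    for name, n in param_sizes:
        indices[name] = list(range(start, start + n))
        start += n
    return indices


# Hypothetical sizes, chosen only for illustration
idx = build_index_sketch([
    ("arrange.logpi", 4),
    ("emissions.0.V", 3),
    ("emissions.1.V", 2),
])
theta = [0.1, 0.2, 0.3, 0.4, 1.0, 1.1, 1.2, 2.0, 2.1]
V0 = [theta[i] for i in idx["emissions.0.V"]]
```

Indexing theta with the returned indices recovers the block that belongs to the named parameter.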

get_params()

Gets the concatenated parameters from the arrangement and emission models

Returns:: theta (ndarray)

initialize(Y=None, subj_ind='separate')

Initializes the model for fitting. If Y or subj_ind is given, it replaces the existing value. If set to None, the existing value is used.

Parameters:

Y (list) – List of (numsubj x N x numvoxel) arrays of data
subj_ind (list) – List of unique subject indicators OR ‘separate’: sets separate subjects for each data set OR None: Don’t change anything

marginal_prob()

Convenience function that returns marginal probability for the arrangement model

Returns:: Prob (pt.tensor) – KxP marginal probabilities

move_to(device='cpu')

Recursively moves all torch.Tensor objects in the full model class to the target device

Parameters:

M (FullMultiModel) – Full model
device (str or pt.device) – the target device on which to store the tensors. Default: ‘cpu’

random_params(init_arrangement=True, init_emission=True)

Sets all arrangement and emission model parameters to random values

Parameters:

init_arrangement (bool) – Defaults to True.
init_emission (bool) – Defaults to True.

sample(num_subj=None, U=None)

Takes the number of subjects to sample for each emission model and samples data from the model

Parameters:

num_subj (list) – list of subject numbers, e.g. [2, 3, 4], OR list of subject indices, e.g. [[0,1,2],[0,1,2],[2,3,4]]

Returns:

U – the true Us of all subjects concatenated vertically, shape (num_subs, P)
Y – data sampled from emission models, shape (num_subs, N, P)

set_num_subj(num_subj=None)

Sets the number of subjects for simulations

Parameters:: num_subj (list) – list of subject numbers, e.g. [2, 3, 4], one for each dataset, OR list of subject indices, e.g. [[0,1,2],[0,1,2],[2,3,4]], for overlapping subjects
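
The two accepted forms of num_subj can be reduced to a single representation, one list of subject indices per dataset. The sketch below is an assumption about how that might look, not the library's code: the function `expand_num_subj_sketch` is hypothetical, and it assumes that plain counts give every dataset its own non-overlapping block of subjects.

```python
def expand_num_subj_sketch(num_subj):
    """Turn either accepted form into one list of subject indices per dataset."""
    if all(isinstance(n, int) for n in num_subj):
        # Counts: each dataset gets its own block of subjects (assumption)
        subj_ind, start = [], 0
        for n in num_subj:
            subj_ind.append(list(range(start, start + n)))
            start += n
        return subj_ind
    # Index lists: pass through, so repeated indices mean shared subjects
    return [list(ind) for ind in num_subj]


separate = expand_num_subj_sketch([2, 3, 4])
overlapping = expand_num_subj_sketch([[0, 1, 2], [0, 1, 2], [2, 3, 4]])
```

With counts, nine distinct subjects result; with index lists, subject 2 is shared by all three datasets.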

set_params(theta)

Sets the parameters from a parameter vector ‘theta’. The parameter order is assumed to be arrange, emissions[0], emissions[1],…, emissions[X]

Parameters:: theta (np.ndarray or pt.tensor) – Input parameters as vector.