Full_model module

class full_model.FullMultiModel(arrange, emission)

The full generative model contains an arrangement model and multiple emission models, for training across datasets.

Estep(separate_ll=False)

E-step for the full model. Runs one full pass of the EM procedure on both the arrangement and emission models.

Parameters:

separate_ll – if True, return ll_A and ll_E separately

Returns:
  • Uhat – the prediction U_hat

  • ll – the log-likelihood for arrangement and emission models

clear()

Clears the data from all emission models and temporary statistics from the arrangement model

collect_evidence(emloglik)

Collects evidence over the different datasets. For subjects that are in multiple datasets, it sums the log evidence.

Parameters:

emloglik (list) – List of emission log-likelihoods

Returns:

emloglik_comb (ndarray) – ndarray of emission log-likelihoods
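The summation over shared subjects can be pictured with a minimal NumPy sketch. This is an illustration, not the library's implementation: the helper `combine_evidence`, the `subj_ind` layout, and the (n_subj, K, P) array shapes are assumptions made for the example.

```python
import numpy as np

def combine_evidence(emloglik, subj_ind, n_subj):
    """Sum per-dataset log evidence into one array per unique subject.

    emloglik: list of arrays, each of shape (n_subj_in_dataset, K, P)
    subj_ind: list of index lists mapping each dataset's rows to rows
              of the combined array (hypothetical layout)
    """
    K, P = emloglik[0].shape[1:]
    combined = np.zeros((n_subj, K, P))
    for ll, ind in zip(emloglik, subj_ind):
        # np.add.at accumulates correctly even if an index repeats
        np.add.at(combined, np.asarray(ind), ll)
    return combined

# Two datasets; subject 1 appears in both, so its log evidence is summed
ll_a = np.full((2, 3, 4), -1.0)   # rows are subjects 0 and 1
ll_b = np.full((2, 3, 4), -2.0)   # rows are subjects 1 and 2
combined = combine_evidence([ll_a, ll_b], [[0, 1], [1, 2]], n_subj=3)
```

Summing log evidence corresponds to multiplying the likelihoods of independent datasets for the same subject.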

distribute_evidence(Uhat)

Splits the evidence to the different emission models

Parameters:

Uhat (pt.tensor) – tensor of estimated U in the space of the arrangement model

Returns:

Usplit (list) – List of Uhats (per emission model)
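A rough sketch of the split, using NumPy arrays in place of torch tensors so that it runs without extra dependencies. The helper `split_by_dataset` and the per-dataset subject index lists are assumptions for illustration only.

```python
import numpy as np

def split_by_dataset(Uhat, subj_ind):
    """Return one (n_subj_in_dataset, K, P) array per emission model."""
    return [Uhat[np.asarray(ind)] for ind in subj_ind]

# Three unique subjects, K=2 parcels, P=4 locations
Uhat = np.arange(3 * 2 * 4, dtype=float).reshape(3, 2, 4)
# Subject 1 belongs to both datasets, so it appears in both splits
Usplit = split_by_dataset(Uhat, [[0, 1], [1, 2]])
```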

fit_em(iter=30, tol=0.01, seperate_ll=False, fit_emission=True, fit_arrangement=True, first_evidence=False)

Runs the EM algorithm on a full model. This requires that both the emission and arrangement models have a full Estep and Mstep and can calculate the likelihood, including the partition function.

Parameters:
  • iter (int) – Maximal number of iterations (def:30)

  • tol (double) – Tolerance on overall likelihood (def: 0.01)

  • seperate_ll (bool) – Return arrangement and emission LL separately

  • fit_emission (list / array of bools) – If True, fit emission model. Otherwise, freeze it

  • fit_arrangement – If True, fit the arrangement model. Otherwise, freeze it

  • first_evidence (bool or list of bool) – Determines whether evidence is passed from the emission models to the arrangement model on the first iteration. Default = False. With False, the initial guess is determined by the initialization of the arrangement model, which improves estimation for emission models that start from random values. If a list of bools, it determines this for each emission model separately. In this case the first log-likelihood is not valid and is set to -inf.

Returns:
  • model (Full Model) – fitted model (also updated)

  • ll (ndarray) – Log-likelihood of the full model as a function of iteration. If seperate_ll is True, the first column is ll_A and the second is ll_E

  • theta (ndarray) – History of the parameter vector

  • Uhat (pt.tensor) – (n_subj, K, P) matrix of estimates. Note that this is in the space of the arrangement model; call distribute_evidence(Uhat) to get it in the space of the emission models
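To show how iter and tol interact in an EM loop of this kind, here is a self-contained toy: a two-component 1-D Gaussian mixture with unit variances. It stands in for the full model and is not the library's code; the function name and the starting values are assumptions. The argument name `iter` mirrors the signature above and shadows the Python builtin inside the function.

```python
import numpy as np

def em_two_gaussians(y, iter=30, tol=0.01):
    """Toy EM for a two-component 1-D Gaussian mixture (unit variances)."""
    mu = np.array([y.min(), y.max()])      # crude starting values
    pi = np.array([0.5, 0.5])
    ll = np.full(iter, np.nan)
    for i in range(iter):
        # E-step: log joint of each point under each component
        log_joint = (np.log(pi) - 0.5 * np.log(2 * np.pi)
                     - 0.5 * (y[:, None] - mu) ** 2)
        log_norm = np.logaddexp(log_joint[:, 0], log_joint[:, 1])
        ll[i] = log_norm.sum()             # likelihood before this update
        # Stop once the log-likelihood improves by less than tol
        if i > 0 and ll[i] - ll[i - 1] < tol:
            break
        resp = np.exp(log_joint - log_norm[:, None])
        # M-step: update mixing weights and means
        pi = resp.mean(axis=0)
        mu = (resp * y[:, None]).sum(axis=0) / resp.sum(axis=0)
    return mu, pi, ll[: i + 1]

rng = np.random.default_rng(0)
y = np.concatenate([rng.normal(-2, 1, 200), rng.normal(3, 1, 200)])
mu, pi, ll = em_two_gaussians(y)
```

The returned ll is the log-likelihood history up to the iteration at which the loop stopped, analogous to the ll return value described above.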

fit_em_ninits(n_inits=20, first_iter=7, iter=30, tol=0.01, fit_emission=True, fit_arrangement=True, init_emission=True, init_arrangement=True, align='arrange', verbose=True)

Runs the EM algorithm on a full model from n_inits random initializations, and escapes local maxima by selecting the model with the highest likelihood after first_iter iterations. This requires that both the emission and arrangement models have a full Estep and Mstep and can calculate the likelihood, including the partition function.

Parameters:
  • n_inits – the number of random inits

  • first_iter – the number of initial iterations run for each random init, used to find the initial parameters with maximal likelihood

  • iter – the number of iterations for the full EM process

  • tol – Tolerance on overall likelihood (def: 0.01)

  • fit_emission (list) – If True, fit emission model. Otherwise, freeze it

  • fit_arrangement – If True, fit arrangement model. Otherwise, freeze it

  • init_emission (list or bool) – Randomly initialize emission models before fitting?

  • init_arrangement (bool) – Randomly initialize arrangement model before fitting?

  • align (None, ‘arrange’, or int) – Determines how alignment on the first step is performed:

    None: Not performed. Emission models may not get aligned.

    ‘arrange’: From the arrangement model only (works only with a spatially non-flat, i.e. random, initialization).

    int: From the emission model with number i. Works with a spatially flat initialization of the arrangement model.

  • verbose – if set to true, gives memory update for each iteration

Returns:
  • model (Full Model) – fitted model (also updated)

  • ll (ndarray) – Log-likelihood of the best full model as a function of iteration; the initial iterations are included

  • theta (ndarray) – History of the parameter vector

  • Uhat – the predicted U (probabilistic)

  • first_lls – the log-likelihoods for the n_inits random parameters after first_iter runs
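The multi-start strategy can be sketched on the same kind of toy problem: a unit-variance two-Gaussian mixture fitted by EM. Everything here (`em_steps`, `fit_multistart`, the way random starts are drawn) is a hypothetical stand-in chosen for illustration, not the library's procedure.

```python
import numpy as np

def em_steps(y, mu, pi, n_iter):
    """Run n_iter EM updates of a unit-variance two-Gaussian mixture."""
    ll = np.zeros(n_iter)
    for i in range(n_iter):
        log_joint = (np.log(pi) - 0.5 * np.log(2 * np.pi)
                     - 0.5 * (y[:, None] - mu) ** 2)
        log_norm = np.logaddexp(log_joint[:, 0], log_joint[:, 1])
        ll[i] = log_norm.sum()             # likelihood before this update
        resp = np.exp(log_joint - log_norm[:, None])
        pi = resp.mean(axis=0)
        mu = (resp * y[:, None]).sum(axis=0) / resp.sum(axis=0)
    return mu, pi, ll

def fit_multistart(y, n_inits=20, first_iter=7, n_iter=30, seed=0):
    rng = np.random.default_rng(seed)
    candidates, first_lls = [], np.zeros(n_inits)
    for k in range(n_inits):
        mu0 = rng.normal(y.mean(), y.std(), size=2)   # random start
        mu, pi, ll = em_steps(y, mu0, np.array([0.5, 0.5]), first_iter)
        candidates.append((mu, pi, ll))
        first_lls[k] = ll[-1]
    # Keep only the start with the highest likelihood after first_iter
    mu, pi, ll_first = candidates[int(np.argmax(first_lls))]
    # Continue the winner for the remaining iterations
    mu, pi, ll_rest = em_steps(y, mu, pi, n_iter - first_iter)
    return mu, pi, np.concatenate([ll_first, ll_rest]), first_lls

rng = np.random.default_rng(1)
y = np.concatenate([rng.normal(-2, 1, 200), rng.normal(3, 1, 200)])
mu, pi, ll, first_lls = fit_multistart(y)
```

As in the return values above, the likelihood history of the winning start includes its initial iterations, and first_lls holds one value per random start.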

fit_sml(iter=60, batch_size=None, stepsize=0.8, estep='sample', seperate_ll=False, fit_emission=True, fit_arrangement=True, first_evidence=True)

Runs a stochastic maximum likelihood algorithm on a full model. The emission model is still assumed to have an E-step and an M-step. The arrangement model has a positive-phase and a negative-phase E-step, and a gradient M-step. The arrangement likelihood is not necessarily

FUTURE EXTENSIONS:
  • Sampling of subjects from training set

  • Initialization of parameters

  • Adaptive stopping criteria

  • Adaptive stepsizes

  • Gradient acceleration methods

Parameters:
  • Y (3d-ndarray) – numsubj x N x numvoxel array of data

  • iter (int) – Maximal number of iterations

  • stepsize (double) – Fixed step size for MStep

Returns:
  • model (Full Model) – fitted model (also updated)

  • ll (ndarray) – Log-likelihood of the full model as a function of iteration. If seperate_ll is True, the first column is ll_A and the second is ll_E

  • theta (ndarray) – History of the parameter vector

get_param_indices(name)

Return the indices for the full model theta vector

Parameters:

name (str) – Parameter name in the format of ‘arrange.logpi’ or ‘emissions.<X>.V’ where <X> is the index of emission model. For example ‘emissions.0.V’ will return the Vs from self.emissions[0]

Returns:

indices (np.ndarray) – 1-d numpy array of indices into the theta vector
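One way to picture the lookup is a running offset over named parameter blocks. The sketch below is an assumption about the general idea, not the library's implementation: the block sizes, the `kappa` entries, and the helper `param_indices` are made up for the example.

```python
import numpy as np

# Hypothetical parameter blocks in theta order: (name, number of entries)
blocks = [
    ("arrange.logpi", 6),
    ("emissions.0.V", 8),
    ("emissions.0.kappa", 1),
    ("emissions.1.V", 4),
    ("emissions.1.kappa", 1),
]

def param_indices(blocks, name):
    """Return the indices of `name` within the concatenated theta vector."""
    start = 0
    for block_name, size in blocks:
        if block_name == name:
            return np.arange(start, start + size)
        start += size
    raise KeyError(name)

idx = param_indices(blocks, "emissions.0.V")   # indices 6 through 13
```

The resulting index array can then be used to slice the matching entries out of a theta vector or a theta history.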

get_params()

Get the concatenated parameters from the arrangement and emission models

Returns:

theta (ndarray)

initialize(Y=None, subj_ind='separate')

Initializes the model for fitting. If Y or subj_ind is given, it replaces the existing value. If set to None, the existing value is used.

Parameters:
  • Y (list) – List of (numsubj x N x numvoxel) arrays of data

  • subj_ind (list) – List of unique subject indicators, OR ‘separate’: sets separate subjects for each dataset, OR None: don’t change anything

marginal_prob()

Convenience function that returns marginal probability for the arrangement model

Returns:

Prob (pt.tensor) – KxP marginal probabilities

move_to(device='cpu')

Recursively moves all torch.Tensor objects in the full model class to the target device

Parameters:
  • M (FullMultiModel) – Full model

  • device (str or pt.device) – the target device on which to store the tensors. Default: ‘cpu’

random_params(init_arrangement=True, init_emission=True)

Sets all arrangement and emission model parameters to random values

Parameters:
  • init_arrangement (bool) – Defaults to True.

  • init_emission (bool) – Defaults to True.

sample(num_subj=None, U=None)

Take in the number of subjects to sample for each emission model

Parameters:

num_subj (list) – list of subject numbers, e.g. [2, 3, 4], or list of subject indices, e.g. [[0,1,2],[0,1,2],[2,3,4]]

Returns:
  • U – the true Us of all subjects concatenated vertically, shape (num_subs, P)

  • Y – data sampled from emission models, shape (num_subs, N, P)

set_num_subj(num_subj=None)

Sets the number of subjects for simulations

Parameters:

num_subj (list) – list of subject numbers for each dataset, e.g. [2, 3, 4], OR list of subject indices for overlapping subjects, e.g. [[0,1,2],[0,1,2],[2,3,4]]
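The two accepted forms of num_subj can be thought of as two ways of writing per-dataset subject indices. The sketch below assumes that plain counts are expanded into consecutive, non-overlapping indices; that expansion rule and the helper `to_subject_indices` are assumptions for illustration, not confirmed library behaviour.

```python
def to_subject_indices(num_subj):
    """Turn [2, 3] into [[0, 1], [2, 3, 4]]; pass index lists through."""
    if all(isinstance(n, int) for n in num_subj):
        out, start = [], 0
        for n in num_subj:
            # Each dataset gets its own block of consecutive subject indices
            out.append(list(range(start, start + n)))
            start += n
        return out
    # Already a list of index lists (subjects may overlap across datasets)
    return [list(ind) for ind in num_subj]

separate = to_subject_indices([2, 3, 4])
overlapping = to_subject_indices([[0, 1, 2], [0, 1, 2], [2, 3, 4]])
```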

set_params(theta)

Sets the parameters from a parameter vector ‘theta’. Assumes the parameter order is arrange, emissions[0], emissions[1], …, emissions[X].

Parameters:

theta (np.ndarray or pt.tensor) – Input parameters as vector.
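The assumed ordering means theta can be cut into consecutive segments, one per sub-model. A minimal NumPy sketch with made-up segment sizes (the sizes and variable names are hypothetical):

```python
import numpy as np

# Hypothetical parameter counts: arrange, emissions[0], emissions[1]
sizes = [6, 9, 5]
theta = np.arange(sum(sizes), dtype=float)

# Split at the cumulative boundaries: first segment is the arrangement
# model, the remaining segments are the emission models in order
arrange_part, *emission_parts = np.split(theta, np.cumsum(sizes)[:-1])
```

Each segment would then be handed to the corresponding sub-model's own parameter setter.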