.. ipython:: python
    :suppress:

    import numpy as np
    import pandas as pd
    import matplotlib as mpl
    import matplotlib.pyplot as plt

    plt.style.use('default')
    mpl.rcParams['axes.labelsize'] = 'large'
    mpl.rcParams['savefig.bbox'] = 'tight'
    mpl.rcParams['savefig.pad_inches'] = 0.1

    pd.options.display.max_rows = 15

==================
Evaluating a model
==================

For a given model and set of parameters, we can measure the overall fit
to a dataset by calculating the log likelihood. The *likelihood* is the
probability of the data according to the model. The probability of each
full recall sequence is calculated, and these probabilities are then
multiplied across recall sequences to obtain an overall probability of
the data. In practice, this product is often too small to represent
accurately in floating-point arithmetic. Therefore, we use log
probabilities, which are summed instead of multiplied. For both
likelihood and log likelihood, greater values indicate a better fit of
the model to the data.

First, load some sample data:

.. ipython:: python

    from cymr import fit, cmr
    data = fit.sample_data('Morton2013_mixed').query('subject <= 3')

Patterns and Weights
~~~~~~~~~~~~~~~~~~~~

To simulate free recall using the CMR-Distributed model, we must first
define pre-experimental weights for the network. For this example, we'll
define localist patterns, which are distinct for each presented item.
They can be represented by an identity matrix with one row for each item.

.. ipython:: python

    n_items = 768
    loc_patterns = np.eye(n_items)

We also need to define the item pool that corresponds to those patterns.
We can get this information from the data:

.. ipython:: python

    study = data.query("trial_type == 'study'")
    items = study.groupby('item_index')['item'].first().to_numpy()

To indicate where the patterns should be used in the network, they are
specified as :code:`vector` (for the :math:`\mathrm{M}^{FC}` and/or
:math:`\mathrm{M}^{CF}` matrices) or :code:`similarity` (for the
:math:`\mathrm{M}^{FF}` matrix).
We also label each pattern with a name; here, we'll refer to the
localist patterns as :code:`'loc'`.

.. ipython:: python

    patterns = {'items': items, 'vector': {'loc': loc_patterns}}

Parameters
~~~~~~~~~~

:py:class:`~cymr.parameters.Parameters` objects define how parameter
values will be interpreted. One use of them is to define the layers and
sublayers of a network.

Each pattern is placed in a *region* of the connection matrix. The region
is defined by the sublayer and segment of the :math:`f` and :math:`c`
layers. Conventionally, the :math:`f` layer has only one *sublayer*
called :code:`'task'`. The :math:`c` layer may have multiple sublayers
with different names. Here, we'll just use one, also called
:code:`'task'`.

First, we indicate which sublayers will be included in the network.

.. ipython:: python

    param_def = cmr.CMRParameters()
    param_def.set_sublayers(f=['task'], c=['task'])

Patterns may include multiple components that may be weighted
differently. Weight parameters are used to set the weighting of each
component. Here, we only have one component, to which we assign a weight
based on the value of the :code:`w_loc` parameter.

When setting the weights, we first indicate the region to apply weights
to, followed by an expression. This expression may reference parameters
and/or patterns.

.. ipython:: python

    weights = {(('task', 'item'), ('task', 'item')): 'w_loc * loc'}
    param_def.set_weights('fc', weights)
    param_def.set_weights('cf', weights)

Segments for simulating the start of the list will also be added
automatically.

Finally, we define the parameters that we want to evaluate by creating a
dictionary with a name and value for each parameter. We'll get a
different log likelihood for each parameter set. For a model to be
evaluated, all parameters expected by that model must be defined,
including any parameters used for setting weights (here, :code:`w_loc`).

.. ipython:: python

    param = {
        'B_enc': 0.7,
        'B_start': 0.3,
        'B_rec': 0.9,
        'w_loc': 1,
        'Lfc': 0.15,
        'Lcf': 0.15,
        'P1': 0.2,
        'P2': 2,
        'T': 0.1,
        'X1': 0.001,
        'X2': 0.25
    }

Evaluating log likelihood
~~~~~~~~~~~~~~~~~~~~~~~~~

Define a model (here, :code:`cmr.CMR`) and use
:py:meth:`~cymr.fit.Recall.likelihood` to evaluate the log likelihood of
the observed data according to that model and these parameter values.
Greater (i.e., less negative) log likelihood values indicate a better
fit. In :doc:`/guide/fitting`, we'll use a parameter search to estimate
the best-fitting parameters for a model.

.. ipython:: python

    model = cmr.CMR()
    results = model.likelihood(data, param, param_def=param_def, patterns=patterns)
    results
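
To see why we work with log likelihood rather than raw likelihood,
consider the following minimal sketch. It does not use cymr; the
per-sequence probabilities in :code:`seq_prob` are made-up values chosen
only for illustration, and the variable names are hypothetical. It shows
that multiplying many small probabilities underflows to zero, while
summing their logs gives a finite value that can still be compared
between parameter sets.

.. ipython:: python

    import numpy as np

    # Hypothetical probabilities for 500 recall sequences (illustration only)
    rng = np.random.default_rng(0)
    seq_prob = rng.uniform(1e-6, 1e-3, size=500)

    # Multiplying the probabilities directly underflows to exactly zero
    likelihood = np.prod(seq_prob)

    # Summing the log probabilities gives a finite, negative log likelihood
    log_likelihood = np.sum(np.log(seq_prob))
    print(likelihood, log_likelihood)

Because the logarithm is monotonic, the parameter set with the greatest
log likelihood is also the one with the greatest likelihood.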