1. data Package¶
Provides easy access to some data.
1.1. data Module¶
1.1.1. Using the ni.Data data structures¶
The Data class is designed to be easily accessible to the ni models. Each Data object contains an index that separates the time series into different cells, trials and conditions.
Conditions are mostly for the user, as they are ignored by the model classes. They should be used to separate data before a model is fitted, such that only data from a certain subset of trials (i.e. one or more experimental conditions) is used for the fit. If a dataset that is passed to a model contains multiple conditions, the model should treat them as additional trials.
Trials assume a common time frame, i.e. bin 0 of each trial corresponds to the same time relative to a stimulus, such that rate fluctuations can be averaged over trials.
Cells signify spike trains that are recorded from different sources (or separated by spike sorting), such that there can be correlations between cells in a certain trial.
The index is hierarchical: for each condition there are several trials, each of which has several cells. But since modelling is mainly used to distinguish varying behaviour of the same ensemble of cells, the number of cells per trial and the number of trials per condition must be the same throughout the dataset.
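The balance requirement above can be checked with plain pandas. The following is only a sketch: the sizes are made up for illustration, and the check is not part of the ni toolbox itself.

```python
import pandas as pd

# Hypothetical sizes, chosen only for illustration.
nr_conditions, nr_cells, nr_trials = 2, 2, 3

# Every (condition, cell, trial) combination appears exactly once.
index = pd.MultiIndex.from_product(
    [range(nr_conditions), range(nr_cells), range(nr_trials)],
    names=["Condition", "Cell", "Trial"],
)

# Count the trials for each (condition, cell) pair; a balanced index
# yields the same count everywhere.
trials_per_cell = index.to_frame(index=False).groupby(["Condition", "Cell"]).size()
is_balanced = trials_per_cell.nunique() == 1
print(trials_per_cell)
print("balanced:", is_balanced)
```

If a recording is missing some trials for one cell, the count for that pair differs and `is_balanced` becomes false.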
1.1.2. Storing Spike Data in Python with Pandas¶
The pandas package allows for easy storage of large data objects in Python. The structure used by this toolbox is a pandas.DataFrame with a pandas.MultiIndex, i.e. an index that has multiple levels.
The index contains at least the levels 'Cell', 'Trial' and 'Condition'. Additional index levels can be used (e.g. 'Bootstrap Sample' for bootstrap samples), but keep in mind that when a model is fitted only 'Cell' and 'Trial' should remain; all other dimensions are collapsed into additional sets of trials, which may be indistinguishable after the fit.
Condition | Cell | Trial | t (Timeseries of specific trial) |
---|---|---|---|
0 | 0 | 0 | 0,0,0,0,1,0,0,0,0,1,0... |
0 | 0 | 1 | 0,0,0,1,0,0,0,0,1,0,0... |
0 | 0 | 2 | 0,0,1,0,1,0,0,1,0,1,0... |
0 | 1 | 0 | 0,0,0,1,0,0,0,0,0,0,0... |
0 | 1 | 1 | 0,0,0,0,0,1,0,0,0,1,0... |
... | ... | ... | ... |
1 | 0 | 0 | 0,0,1,0,0,0,0,0,0,0,1... |
1 | 0 | 1 | 0,0,0,0,0,1,0,1,0,0,0... |
... | ... | ... | ... |
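To illustrate how extra levels can be collapsed into additional trials, here is a minimal pandas sketch. It uses random toy data and renumbers the trials per cell; this is an assumption about one reasonable way to do it, not the toolbox's own implementation.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
index = pd.MultiIndex.from_product(
    [range(2), range(2), range(3)], names=["Condition", "Cell", "Trial"]
)
# 12 rows (2 conditions x 2 cells x 3 trials), 20 time bins each.
frame = pd.DataFrame(rng.integers(0, 2, size=(len(index), 20)), index=index)

# Fold 'Condition' into 'Trial': within each cell, number the rows 0..N-1.
flat = frame.reset_index()
flat["Trial"] = flat.groupby("Cell").cumcount()
collapsed = flat.drop(columns="Condition").set_index(["Cell", "Trial"]).sort_index()
print(collapsed.index.tolist())
```

After this step each cell has six trials, and the information about which condition a trial came from is gone.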
To put your own data into a pandas.DataFrame so that it can be used by the models in this toolbox, create a MultiIndex, for example like this:
import ni
import pandas as pd
# nr_conditions, nr_trials, nr_cells and Spike_times_STC are not defined here; they stand for your own dataset
d = []       # one binary spike train per row
tuples = []  # the matching (condition, trial, cell) index entries
for con in range(nr_conditions):
    for t in range(nr_trials):
        for c in range(nr_cells):
            spikes = list(ni.model.pointprocess.getBinary(Spike_times_STC.all_SUA[0][0].spike_times[con,t,c].flatten()*1000))
            if spikes:  # skip empty spike trains
                d.append(spikes)
                tuples.append((con,t,c))
index = pd.MultiIndex.from_tuples(tuples, names=['Condition','Trial','Cell'])
data = ni.data.data.Data(pd.DataFrame(d, index = index))
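The exact behaviour of ni.model.pointprocess.getBinary is not shown here. As a rough sketch of what turning spike times into a binary time series can look like, the hypothetical helper below (not part of the toolbox) assumes spike times in milliseconds and 1 ms bins:

```python
import numpy as np

def spike_times_to_binary(spike_times_ms, nr_bins):
    """Hypothetical helper: mark each 1 ms bin that contains a spike with 1."""
    binary = np.zeros(nr_bins, dtype=int)
    bins = np.floor(np.asarray(spike_times_ms, dtype=float)).astype(int)
    bins = bins[(bins >= 0) & (bins < nr_bins)]  # drop spikes outside the window
    binary[bins] = 1
    return binary

train = spike_times_to_binary([4.2, 9.7], nr_bins=11)
print(train.tolist())  # [0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0]
```

Two spikes that fall into the same bin are recorded as a single 1 in this sketch.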
If you only have one trial of several cells, or one cell with a few trials, the data can be indexed like this:
from ni.data.data import Data
import pandas as pd
index = pd.MultiIndex.from_tuples([(0,0,i) for i in range(len(d))], names=['Condition','Cell','Trial'])
data = Data(pd.DataFrame(d, index = index))
To use the data you can use ni.data.data.Data.filter():
only_first_trials = data.filter(0, level='Trial')
# filter returns a copy of the Data object
only_the_first_trial = data.filter(0, level='Trial').filter(0, level='Cell').filter(0, level='Condition')
only_the_first_trial = data.condition(0).cell(0).trial(0) # condition(), cell() and trial() are shortcuts to filter that set *level* accordingly
only_some_trials = data.trial(range(3,15))
# using slices, ranges or boolean indexing causes the DataFrame to be indexed again from 0 to N, in this case 0:11
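For comparison, similar selections can be written directly on a MultiIndexed pandas.DataFrame. This sketch uses toy data and plain pandas only; it does not reproduce the re-indexing that filter() performs and is not the toolbox's implementation.

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_product(
    [range(2), range(2), range(20)], names=["Condition", "Cell", "Trial"]
)
frame = pd.DataFrame(np.zeros((len(index), 5), dtype=int), index=index)

# Several labels: keep only the rows whose 'Trial' label lies in 3..14.
mask = frame.index.get_level_values("Trial").isin(list(range(3, 15)))
some_trials = frame[mask]
print(some_trials.index.get_level_values("Trial").unique().tolist())

# Single label: xs drops the selected level from the index.
first_trials = frame.xs(0, level="Trial")
print(list(first_trials.index.names))
```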
The pandas ix and xs operations can also be useful (note that ix was removed in pandas 1.0, where loc is the usual replacement):
plot(data.data.ix[(0,0,0):(0,3,-1)].transpose().cumsum())
plot(data.data.xs(0,level='Condition').xs(0,level='Cell').ix[:5].transpose().cumsum())
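In pandas versions where ix is no longer available (it was removed in pandas 1.0), a label-based selection with loc gives a comparable result. The sketch below uses random toy data in place of data.data and leaves out the plotting call:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
index = pd.MultiIndex.from_product(
    [range(2), range(2), range(8)], names=["Condition", "Cell", "Trial"]
)
frame = pd.DataFrame(rng.integers(0, 2, size=(len(index), 50)), index=index)

# Label-based selection: condition 0, cell 0, trials 0..5 (inclusive).
subset = frame.xs(0, level="Condition").xs(0, level="Cell").loc[:5]

# One column per trial, cumulative spike count over the time bins.
cumulative = subset.transpose().cumsum()
print(cumulative.shape)
```

The last row of `cumulative` holds the total spike count of each selected trial.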
- class ni.data.data.Data(matrix, dimensions=[], key_index='i', resolution=1000)[source]¶
Spike data container
Contains a pandas DataFrame with a MultiIndex. Can save to and load from files.
The index contains at least Trial, Cell and Condition and can be extended.
- as_list_of_series(list_conditions=True, list_cells=True, list_trials=False, list_additional_indizes=True)[source]¶
Returns one timeseries, collapsing only certain indices (by default only trials). All non-collapsed indices
- as_series()[source]¶
Returns one timeseries, collapsing all indices.
The output has dimensions (N,1), with N being the length of one trial x nr_trials x nr_cells x nr_conditions (x additional indices).
If cells, conditions or trials should be separated, use as_list_of_series() instead.
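The (N,1) shape can be illustrated with a plain pandas and numpy sketch. The order in which rows are concatenated here is an assumption and may differ from what as_series() does internally.

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_product(
    [range(2), range(3), range(4)], names=["Condition", "Cell", "Trial"]
)
trial_length = 10
frame = pd.DataFrame(np.zeros((len(index), trial_length), dtype=int), index=index)

# Concatenate all rows end to end into one column vector.
series = frame.to_numpy().reshape(-1, 1)
print(series.shape)  # (240, 1): 10 bins x 4 trials x 3 cells x 2 conditions
```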
- cell(cells=False)[source]¶
filters for an array of cells -> see ni.data.data.Data.filter()
- condition(conditions=False)[source]¶
filters for an array of conditions -> see ni.data.data.Data.filter()
- filter(array=False, level='Cell')[source]¶
filters for arbitrary index levels
array
a number, list or numpy array of indices that are to be filtered
level
the level of index that is to be filtered. Default: 'Cell'
- firing_rate(smooth_width=0, trials=False)[source]¶
computes the firing rate of the data for each cell separately.
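As a rough illustration of a per-cell firing rate, the sketch below averages binary spike trains over trials and scales by the resolution. It assumes that resolution means bins per second (1000 for 1 ms bins) and ignores smoothing; the actual firing_rate() method may compute this differently.

```python
import numpy as np
import pandas as pd

resolution = 1000  # assumption: bins per second, i.e. 1 ms bins
index = pd.MultiIndex.from_product(
    [range(1), range(2), range(4)], names=["Condition", "Cell", "Trial"]
)
frame = pd.DataFrame(np.zeros((len(index), 5), dtype=int), index=index)
# Toy data: cell 0 spikes in bin 2 in two of its four trials.
frame.loc[(0, 0, 0), 2] = 1
frame.loc[(0, 0, 1), 2] = 1

# Mean spike probability per bin for each cell, converted to spikes per second.
rate_hz = frame.groupby(level="Cell").mean() * resolution
print(rate_hz.loc[0, 2])  # 500.0
```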
- getFlattend(all_in_one=True, trials=False)[source]¶
Deprecated since version 0.1: Use as_list_of_series() and as_series() instead
Returns one timeseries for all trials.
The all_in_one flag determines whether 'Cell' and 'Condition' should also be collapsed. If set to False and the number of conditions and/or cells is greater than 1, a list of timeseries will be returned. If both are greater than 1, a list is returned that contains, for each condition, a list with one timeseries per cell.
- interspike_intervals(smooth_width=0, trials=False)[source]¶
computes inter-spike intervals in the data for each cell separately.
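For a single binary spike train, inter-spike intervals can be sketched with numpy as the differences between the bins that contain a spike. This is only an illustration on one toy trial, not the method's implementation.

```python
import numpy as np

# One binary spike train (1 = spike in that bin).
train = np.array([0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0])

spike_bins = np.flatnonzero(train)  # bins that contain a spike
isis = np.diff(spike_bins)          # distance between successive spikes, in bins
print(spike_bins.tolist(), isis.tolist())  # [2, 4, 7, 9] [2, 3, 2]
```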
- shape(level)[source]¶
Returns the shape of the specified level:
>>> data.shape('Trial')
100
>>> data.shape('Cell') == data.nr_cells
True
- time(begin=None, end=None)[source]¶
gives a copy of the data that contains only a part of the timeseries for all trials, cells and conditions.
This resets the indices for the timeseries to 0...(end-begin)
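The effect of cutting out a time window and renumbering the bins can be sketched in plain pandas as follows. The toy frame stands in for the underlying DataFrame; the method itself may be implemented differently.

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_product(
    [range(1), range(2), range(3)], names=["Condition", "Cell", "Trial"]
)
# Each row holds the bin numbers 0..99, so the selected window is easy to see.
frame = pd.DataFrame(np.tile(np.arange(100), (len(index), 1)), index=index)

begin, end = 20, 50
window = frame.iloc[:, begin:end].copy()
window.columns = range(window.shape[1])  # relabel the time bins from 0
print(window.iloc[0, 0], window.shape)  # 20 (6, 30)
```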
- trial(trials=False)[source]¶
filters for an array of trials -> see ni.data.data.Data.filter()
- ni.data.data.matrix_to_dataframe(matrix, dimensions)[source]¶
converts a trial x cells matrix into a DataFrame
1.2. decoding_data Module¶
Loads data into a pandas DataFrame
1.3. monkey Module¶
- ni.data.monkey.Data(file_nr='101a03', resolution=1000, trial=[], condition=[], cell=[])[source]¶
Loads data into a DataFrame
Expects a file number. Available file numbers are in ni.data.monkey.available_files:
>>> print ni.data.monkey.available_files
['101a03', '104a10', '107a03', '108a08', '112a03', '101a03', '104a11', '107a04', '109a04', '112b02', '101a04', '105a04', '108a05', '110a03', '113a04', '102a09', '105a05', '108a06', '111a03', '113a05', '103a03', '106a03', '108a07', '111a04']
trial
number of trial to load or list of trials to load. Non-existent trial numbers are ignored.
condition
number of condition to load or list of conditions to load. Non-existent condition numbers are ignored.
cell
number of cell to load or list of cells to load. Non-existent cell numbers are ignored.
Example:
data = ni.data.monkey.Data(file_nr = ni.data.monkey.available_files[3], trial=range(10), condition = 0)