Analysis of pupil data¶
There are many valid approaches to the analysis of pupillometry data, but the optimal approach will depend on the type of experiment being run and the research question in mind. Kelbsch et al. (2019) provide an informative view on standards in pupillometry, and there are some great examples of community-developed packages for streamlining pupillometry data analysis. Whilst it is always worth exploring the wider options available, PyPlr includes a set of pandas-reliant (you won't regret learning pandas!) scripting tools for implementing a standard data processing pipeline that is optimised to work with some of the idiosyncrasies of Pupil Labs data. So far we have found these tools to be adequate for analysing our own data, but we welcome suggestions for improvements.
See here for a general introduction to analysing Pupil Labs pupillometry data. What follows is a typical PyPlr analysis pipeline using tools from pyplr.utils and pyplr.preproc.
Export with Pupil Player¶
The first step in our pipeline is to export the data with Pupil Player, making sure the required plugins (e.g., the Annotation Player plugin) are enabled. Currently this must be done individually for each recording. The file structure of a typical recording after export is printed below (more info on the recording format here).
[1]:
from pyplr import utils
# Pupil Labs recording directory
rec_dir = '/Users/jtm/OneDrive - Nexus365/protocols/pipr_protocol/JTM'
utils.print_file_structure(rec_dir)
JTM/
    .DS_Store
    annotation.pldata
    annotation_player.pldata
    annotation_player_timestamps.npy
    annotation_timestamps.npy
    blinks.pldata
    blinks_timestamps.npy
    eye1.intrinsics
    eye1.mp4
    eye1_lookup.npy
    eye1_timestamps.npy
    fixations.pldata
    fixations_timestamps.npy
    gaze.pldata
    gaze_timestamps.npy
    info.player.json
    notify.pldata
    notify_timestamps.npy
    pupil.pldata
    pupil_timestamps.npy
    surface_definitions_v01
    user_info.csv
    world.intrinsics
    world.mp4
    world_lookup.npy
    world_timestamps.npy
    analysis/
        plr_extraction.png
        plr_metrics.csv
        plr_overall_metrics.png
        processed.csv
        pupil_processing.png
    exports/
        000/
            annotations.csv
            export_info.csv
            gaze_positions.csv
            pupil_gaze_positions_info.txt
            pupil_positions.csv
            world.mp4
            world_timestamps.csv
            world_timestamps.npy
            pyplr_analysis/
    offline_data/
        tokens/
            gaze_positions_consumer_Vis_Polyline.token
            gaze_positions_producer_GazeFromRecording.token
            pupil_positions_consumer_Pupil_From_Recording.token
            pupil_positions_consumer_System_Timelines.token
            pupil_positions_producer_Pupil_From_Recording.token
Load exported data¶
Now we can set up some constants and use pyplr.utils to get things moving. new_subject(...) returns a dictionary containing the root directory, recording ID, data directory and a newly created output directory for the analysis results. Then, passing the data directory to load_pupil(...), we load the specified columns from the exported pupil_positions.csv file.
[2]:
# Columns to load
use_cols = ['confidence',
'method',
'pupil_timestamp',
'eye_id',
'diameter_3d',
'diameter']
# Get a handle on a subject
s = utils.new_subject(
rec_dir, export='000', out_dir_nm='pyplr_analysis')
# Load pupil data
samples = utils.load_pupil(
s['data_dir'], eye_id='best', cols=use_cols)
samples
************************************************************
*************************** JTM ****************************
************************************************************
Loaded 48552 samples
[2]:
| pupil_timestamp | eye_id | confidence | diameter | method | diameter_3d |
| --- | --- | --- | --- | --- | --- |
| 85838.895658 | 1 | 0.977211 | 50.102968 | 3d c++ | 6.105771 |
| 85838.903656 | 1 | 0.997867 | 50.037086 | 3d c++ | 6.099682 |
| 85838.911644 | 1 | 0.998295 | 49.702628 | 3d c++ | 6.060059 |
| 85838.919551 | 1 | 0.997833 | 49.968637 | 3d c++ | 6.093759 |
| 85838.927504 | 1 | 0.998183 | 49.883191 | 3d c++ | 6.084404 |
| ... | ... | ... | ... | ... | ... |
| 86246.730916 | 1 | 0.818574 | 56.684975 | 3d c++ | 6.966634 |
| 86246.739307 | 1 | 0.947873 | 56.824819 | 3d c++ | 6.983579 |
| 86246.746889 | 1 | 0.969337 | 57.176728 | 3d c++ | 7.027931 |
| 86246.758588 | 1 | 0.956524 | 56.838490 | 3d c++ | 6.988390 |
| 86246.762987 | 1 | 0.927155 | 56.933094 | 3d c++ | 7.000156 |
48552 rows × 5 columns
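Before preprocessing, it can be worth a quick sanity check on the loaded samples. For example, here is a short sketch (plain numpy, not a PyPlr function) that estimates the sampling rate from the pupil_timestamp index; this recording was made at 120 Hz, which is the value assumed by SAMPLE_RATE in the next section.
import numpy as np
# Estimate the sampling rate from the median inter-sample interval
deltas = np.diff(samples.index.to_numpy())
print(f'Approximate sample rate: {1 / np.median(deltas):.1f} Hz')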
Preprocessing¶
Preprocessing pupillometry data is relatively straightforward and typically involves dealing with signal loss due to eye blinks and smoothing out any high-frequency noise. There are sophisticated algorithms for detecting blinks in a pupil timecourse, but Pupil Player has its own Blink Detection plugin that detects blinks based on rapid changes in the confidence metric. Whilst you can export these blink events and use the timestamps to index and mask the pupil timecourse (see the sketch after the code below), we find that it is effective simply to mask the pupil data with thresholds on the first derivative and the confidence metric, and then to follow up with linear interpolation and smoothing.
[3]:
from pyplr import graphing
from pyplr import preproc
# Sampling frequency
SAMPLE_RATE = 120
# Pupil columns to analyse
pupil_cols = ['diameter_3d', 'diameter']
# Make figure for processing
f, axs = graphing.pupil_preprocessing(nrows=5, subject='Example')
# Plot the raw data
samples[pupil_cols].plot(title='Raw', ax=axs[0], legend=True)
axs[0].legend(loc='center right', labels=['mm', 'pixels'])
# Mask first derivative
samples = preproc.mask_pupil_first_derivative(
samples, threshold=3.0, mask_cols=pupil_cols)
samples[pupil_cols].plot(
title='Masked 1st deriv (3*SD)', ax=axs[1], legend=False)
# Mask confidence
samples = preproc.mask_pupil_confidence(
samples, threshold=0.8, mask_cols=pupil_cols)
samples[pupil_cols].plot(
title='Masked confidence (<0.8)', ax=axs[2], legend=False)
# Interpolate
samples = preproc.interpolate_pupil(
samples, interp_cols=pupil_cols)
samples[pupil_cols].plot(
title='Linear interpolation', ax=axs[3], legend=False)
# Smooth with a 3rd order Butterworth filter
# (cut-off normalised to the Nyquist frequency: 4 Hz / 60 Hz)
samples = preproc.butterworth_series(
    samples, fields=pupil_cols, filt_order=3,
    cutoff_freq=4/(SAMPLE_RATE/2))
samples[pupil_cols].plot(
title='3rd order Butterworth filter with 4 Hz cut-off',
ax=axs[4], legend=False);
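As noted above, an alternative is to mask the pupil data using the blink events exported by Pupil Player's Blink Detection plugin. Here is a minimal sketch of how that might look, assuming the plugin was enabled so that a blinks.csv file sits alongside pupil_positions.csv in the export directory (the start_timestamp and end_timestamp column names are assumptions, so inspect your own export to confirm):
import os
import pandas as pd
# Sketch only: mask samples falling within exported blink events
blinks = pd.read_csv(os.path.join(s['data_dir'], 'blinks.csv'))
for _, blink in blinks.iterrows():
    in_blink = ((samples.index >= blink['start_timestamp'])
                & (samples.index <= blink['end_timestamp']))
    samples.loc[in_blink, pupil_cols] = float('nan')
If you take this route, follow up with interpolation and smoothing as above.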
Extraction¶
Having cleaned the timecourse, we need to extract the events of interest. This requires the annotation events sent during the recording, which carry the timestamps needed for extraction. These can be loaded with utils.load_annotations(...).
[4]:
events = utils.load_annotations(s['data_dir'])
events
Loaded 6 events
[4]:
| timestamp | index | label | duration | color | creation_time | creator | protocol | pulse_duration | pulse_spec |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 85854.348667 | 1835 | LIGHT_ON | NaN | red | 2020-11-10 09:39:10.888989 | jtm | pulse | 1000 | [0, 0, 0, 0, 0, 0, 0, 0, 0, 1979] |
| 85919.820362 | 9629 | LIGHT_ON | NaN | blue | 2020-11-10 09:39:10.882007 | jtm | pulse | 1000 | [0, 0, 0, 2500, 0, 0, 0, 0, 0, 0] |
| 85985.280630 | 17422 | LIGHT_ON | NaN | blue | 2020-11-10 09:39:10.882007 | jtm | pulse | 1000 | [0, 0, 0, 2500, 0, 0, 0, 0, 0, 0] |
| 86050.768628 | 25217 | LIGHT_ON | NaN | red | 2020-11-10 09:39:10.888989 | jtm | pulse | 1000 | [0, 0, 0, 0, 0, 0, 0, 0, 0, 1979] |
| 86116.228025 | 33008 | LIGHT_ON | NaN | blue | 2020-11-10 09:39:10.882007 | jtm | pulse | 1000 | [0, 0, 0, 2500, 0, 0, 0, 0, 0, 0] |
| 86181.688172 | 40801 | LIGHT_ON | NaN | red | 2020-11-10 09:39:10.888989 | jtm | pulse | 1000 | [0, 0, 0, 0, 0, 0, 0, 0, 0, 1979] |
Now we can pass our samples and events to utils.extract(...) along with some parameters defining the number of samples to extract and how much to offset the data relative to the event. Here we extract 65-second epochs (7800 samples at 120 Hz) offset by a 5-second (600-sample) baseline period.
[5]:
# Number of samples to extract and which sample
# should mark the onset of the event
DURATION = 7800
ONSET_IDX = 600
# Extract the event ranges
ranges = utils.extract(
samples,
events,
offset=-ONSET_IDX,
duration=DURATION,
borrow_attributes=['color'])
ranges
Extracted ranges for 6 events
[5]:
| event | onset | eye_id | confidence | diameter | method | diameter_3d | interpolated | orig_idx | color |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 0 | 1.0 | 0.998989 | 48.404937 | 3d c++ | 5.990967 | 0 | 85849.311553 | red |
| 0 | 1 | 1.0 | 0.998972 | 48.436117 | 3d c++ | 5.994949 | 0 | 85849.319757 | red |
| 0 | 2 | 1.0 | 0.999134 | 48.469848 | 3d c++ | 5.999237 | 0 | 85849.327708 | red |
| 0 | 3 | 1.0 | 0.998426 | 48.505896 | 3d c++ | 6.003802 | 0 | 85849.335675 | red |
| 0 | 4 | 1.0 | 0.998895 | 48.544031 | 3d c++ | 6.008612 | 0 | 85849.343669 | red |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 5 | 7795 | 1.0 | 0.922093 | 54.375155 | 3d c++ | 6.707360 | 0 | 86242.127040 | red |
| 5 | 7796 | 1.0 | 0.977674 | 54.338056 | 3d c++ | 6.703963 | 0 | 86242.135026 | red |
| 5 | 7797 | 1.0 | 0.933031 | 54.304633 | 3d c++ | 6.701016 | 0 | 86242.142965 | red |
| 5 | 7798 | 1.0 | 0.970404 | 54.274990 | 3d c++ | 6.698505 | 0 | 86242.150957 | red |
| 5 | 7799 | 1.0 | 0.903852 | 54.249187 | 3d c++ | 6.696413 | 0 | 86242.162996 | red |
46800 rows × 8 columns
utils.extract(...) returns a pandas DataFrame with a two-level (event, onset) hierarchical index. The original index of the data is preserved in a column, along with any attributes borrowed from the annotations. Now is a good time to create additional columns expressing the pupil data as percent signal change from baseline (i.e., 100 * (sample - baseline) / baseline).
[6]:
# Calculate baselines: the average of the pre-onset samples for each event
# (note: in newer pandas, .mean(level=0) becomes .groupby(level=0).mean())
baselines = ranges.loc[:, range(0, ONSET_IDX), :].mean(level=0)
# New columns for percent signal change
ranges = preproc.percent_signal_change(
ranges, baselines, pupil_cols)
ranges
[6]:
| event | onset | eye_id | confidence | diameter | method | diameter_3d | interpolated | orig_idx | color | diameter_3d_pc | diameter_pc |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 0 | 1.0 | 0.998989 | 48.404937 | 3d c++ | 5.990967 | 0 | 85849.311553 | red | -6.907039 | -7.009474 |
| 0 | 1 | 1.0 | 0.998972 | 48.436117 | 3d c++ | 5.994949 | 0 | 85849.319757 | red | -6.845173 | -6.949574 |
| 0 | 2 | 1.0 | 0.999134 | 48.469848 | 3d c++ | 5.999237 | 0 | 85849.327708 | red | -6.778534 | -6.884774 |
| 0 | 3 | 1.0 | 0.998426 | 48.505896 | 3d c++ | 6.003802 | 0 | 85849.335675 | red | -6.707607 | -6.815521 |
| 0 | 4 | 1.0 | 0.998895 | 48.544031 | 3d c++ | 6.008612 | 0 | 85849.343669 | red | -6.632865 | -6.742260 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 5 | 7795 | 1.0 | 0.922093 | 54.375155 | 3d c++ | 6.707360 | 0 | 86242.127040 | red | 8.891820 | 8.796684 |
| 5 | 7796 | 1.0 | 0.977674 | 54.338056 | 3d c++ | 6.703963 | 0 | 86242.135026 | red | 8.836673 | 8.722455 |
| 5 | 7797 | 1.0 | 0.933031 | 54.304633 | 3d c++ | 6.701016 | 0 | 86242.142965 | red | 8.788822 | 8.655579 |
| 5 | 7798 | 1.0 | 0.970404 | 54.274990 | 3d c++ | 6.698505 | 0 | 86242.150957 | red | 8.748060 | 8.596268 |
| 5 | 7799 | 1.0 | 0.903852 | 54.249187 | 3d c++ | 6.696413 | 0 | 86242.162996 | red | 8.714104 | 8.544640 |
46800 rows × 10 columns
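Because the color attribute was borrowed from the annotations, plain pandas is all that is needed to average the traces by condition. For example, a sketch (not a PyPlr function) that averages the percent-signal-change traces for the blue and red stimuli:
# Average the percent signal change traces separately for each
# light condition, giving one column per condition indexed by onset
averages = (ranges.reset_index()
                  .groupby(['color', 'onset'])['diameter_3d_pc']
                  .mean()
                  .unstack(level='color'))
averages.plot();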
Plotting and parametrisation¶
It is common practice when analysing the PLR to describe it in terms of parameters relating to time, velocity and acceleration. So once you have an array of data representing a pupil's response to light, be it from a single trial or an average of multiple trials, simply pass it to pyplr.plr.PLR along with some basic parameters.
[7]:
from pyplr.plr import PLR
# Average the diameter_3d traces across events at each onset (in newer
# pandas, use ranges.groupby(level='onset').mean() instead of mean(level=1))
average_plr = ranges.mean(level=1)['diameter_3d'].to_numpy()
plr = PLR(average_plr,
sample_rate=SAMPLE_RATE,
onset_idx=ONSET_IDX,
stim_duration=1)
Now you can do a quick plot with options to display the velocity and acceleration profiles and the derived parameters:
[8]:
fig = plr.plot(vel=True, acc=True, print_params=True)
Or simply access the parameters as a DataFrame.
[9]:
params = plr.parameters()
params
[9]:
| parameter | value |
| --- | --- |
| Baseline | 6.751031 |
| Latency_a | 0.225000 |
| Latency_b | 0.283333 |
| T2MaxVel | 0.425000 |
| T2MaxCon | 1.550000 |
| T2Rec75pc | 18.825000 |
| PeakCon | 3.164023 |
| ConAmplitude | 3.587008 |
| VelConMax | 5.940593 |
| VelConAve | 2.670336 |
| AccConMax | 35.127363 |
| ConTime | 1.325000 |
| VelRedMax | 0.999670 |
| AccRedMax | 7.021978 |
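Finally, the parameters and figure can be saved into the output directory that new_subject(...) created at the start of the analysis. A sketch, assuming the dictionary key for that directory is 'out_dir' (inspect s to confirm):
import os
# Save the PLR parameters and summary figure for this subject
params.to_csv(os.path.join(s['out_dir'], 'plr_metrics.csv'))
fig.savefig(os.path.join(s['out_dir'], 'plr_overall_metrics.png'))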