Interpolate data and concatenate streams
Raw Neon data is not always easy to work with. A data stream (e.g., gaze, eye states, IMU) might not have been sampled at a constant rate. Additionally, streams may have differing sampling rates and lack synchronized start timestamps. All of these issues add complexity to continuous data analysis and sensor fusion.
This tutorial demonstrates how to deal with these issues by interpolating data streams and concatenating them into a single DataFrame. We will use the same boardView dataset as in the previous tutorial.
[1]:
import numpy as np
from pyneon import get_sample_data, NeonRecording
import matplotlib.pyplot as plt
import seaborn as sns
recording_dir = (
    get_sample_data("boardView")
    / "Timeseries Data + Scene Video"
    / "boardview1-d4fd9a27"
)
We now access raw data from gaze, eye states, and IMU streams.
[2]:
recording = NeonRecording(recording_dir)
gaze = recording.gaze
eye_states = recording.eye_states
imu = recording.imu
Irregular sampling in data streams
Data points from each stream are indexed by timestamp [ns], which denotes the UTC time of the sample in nanoseconds.
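For convenience, these nanosecond UTC timestamps can also be converted to human-readable datetimes or to seconds relative to the stream start. A minimal sketch, assuming (as in the plots below) that gaze.ts is a NumPy array of int64 nanosecond timestamps:

import pandas as pd

# Convert nanosecond UTC timestamps to timezone-aware datetimes
gaze_datetimes = pd.to_datetime(gaze.ts, unit="ns", utc=True)
# Or express them as seconds relative to the first gaze sample
gaze_rel_seconds = (gaze.ts - gaze.ts[0]) / 1e9
print(gaze_datetimes[:3])
print(gaze_rel_seconds[:3])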
Are these samples uniformly distributed over time? We can examine the initial samples from each stream to assess their distribution; due to device boot-up, sampling may be irregular at the start of a recording.
[3]:
# Take the first 0.3 seconds of gaze data
gaze_begin = gaze.crop(0, 0.3, by="time")
# And the corresponding eye states and IMU data
eye_states_begin = eye_states.restrict(gaze_begin)
imu_begin = imu.restrict(gaze_begin)


# Define a function to plot the timestamps of the gaze, eye states, and IMU data
def plot_timestamps(gaze, eye_states, imu, concat_stream=None):
    _, ax = plt.subplots(figsize=(8, 2))
    ax.scatter(gaze.ts, np.ones_like(gaze.ts), s=5)
    ax.scatter(eye_states.ts, np.ones_like(eye_states.ts) * 2, s=5)
    ax.scatter(imu.ts, np.ones_like(imu.ts) * 3, s=5)
    # If a concatenated stream (explained later) is provided, plot its timestamps as well
    if concat_stream is not None:
        ax.scatter(concat_stream.ts, np.ones_like(concat_stream.ts) * 4, s=5)
        ax.set_yticks([1, 2, 3, 4])
        ax.set_yticklabels(["Gaze", "Eye states", "IMU", "Concatenated"])
        ax.set_ylim(0.5, 4.5)
    else:
        ax.set_yticks([1, 2, 3])
        ax.set_yticklabels(["Gaze", "Eye states", "IMU"])
        ax.set_ylim(0.5, 3.5)
    ax.set_xlabel("Timestamp (ns)")
    plt.show()


plot_timestamps(gaze_begin, eye_states_begin, imu_begin)

As shown in the figure above, besides the apparently later onset of the IMU data, the gaze and eye states data indeed suffer from dropouts during the first 0.3 seconds.
In addition to dropouts, irregular sampling may also occur, especially (in our experience) for IMU data. For example, it can be observed in the middle of this recording (5 to 5.3 seconds):
[4]:
gaze_middle = gaze.crop(5, 5.3, by="time")
eye_states_middle = eye_states.restrict(gaze_middle)
imu_middle = imu.restrict(gaze_middle)
plot_timestamps(gaze_middle, eye_states_middle, imu_middle)

How frequent are these irregularities? We now examine the distribution of time differences between consecutive samples and compare it to the time difference expected at the nominal sampling rate specified by Pupil Labs.
[5]:
# Report nominal vs. effective (empirical) sampling frequencies
print(
    f"Nominal sampling frequency of gaze: {gaze.sampling_freq_nominal} Hz. "
    f"Actual: {gaze.sampling_freq_effective:.1f} Hz"
)
print(
    f"Nominal sampling frequency of eye states: {eye_states.sampling_freq_nominal} Hz. "
    f"Actual: {eye_states.sampling_freq_effective:.1f} Hz"
)
print(
    f"Nominal sampling frequency of IMU: {imu.sampling_freq_nominal} Hz. "
    f"Actual: {imu.sampling_freq_effective:.1f} Hz"
)

# Histograms of time differences between consecutive samples,
# with the nominal (expected) time difference marked in red
fig, axs = plt.subplots(3, 1, tight_layout=True)
axs[0].hist(gaze.ts_diff, bins=50)
axs[0].axvline(1e9 / gaze.sampling_freq_nominal, c="red", label="Nominal")
axs[0].set_title("Gaze")
axs[1].hist(eye_states.ts_diff, bins=50)
axs[1].axvline(1e9 / eye_states.sampling_freq_nominal, c="red", label="Nominal")
axs[1].set_title("Eye states")
axs[2].hist(imu.ts_diff, bins=50)
axs[2].axvline(1e9 / imu.sampling_freq_nominal, c="red", label="Nominal")
axs[2].set_title("IMU")
axs[2].set_xlabel("Time difference [ns]")
for i in range(3):
    axs[i].set_yscale("log")
    axs[i].set_ylabel("Counts")
    axs[i].legend()
plt.show()
Nominal sampling frequency of gaze: 200 Hz. Actual: 199.4 Hz
Nominal sampling frequency of eye states: 200 Hz. Actual: 199.4 Hz
Nominal sampling frequency of IMU: 110 Hz. Actual: 113.9 Hz

For gaze and eye states data, the empirical distribution of time differences is close to the expected value. However, some time differences fall at integer multiples of the nominal sampling interval, suggesting occasional eye video frame drops. For IMU data, the distribution is much wider.
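To get a feel for how frequent such gaps are, one can simply count the time differences that exceed the nominal interval by some margin. A minimal sketch (the 1.5x threshold is an arbitrary illustrative choice):

for stream, name in [(gaze, "gaze"), (eye_states, "eye states"), (imu, "IMU")]:
    nominal_diff_ns = 1e9 / stream.sampling_freq_nominal  # nominal interval in ns
    n_gaps = int(np.sum(stream.ts_diff > 1.5 * nominal_diff_ns))
    print(f"{name}: {n_gaps} of {len(stream.ts_diff)} intervals exceed 1.5x the nominal interval")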
Interpolating data streams
Given the presence of irregular sampling, interpolation is necessary if you want to perform analyses that assume continuous data streams. PyNeon uses the scipy.interpolate.interp1d function (API reference) to interpolate data streams. The interpolate() method of NeonStream creates a new object with interpolated data.
With the default parameters, the interpolated data has the same start timestamp as the original data, and the sampling rate is set to the nominal sampling frequency specified by Pupil Labs (200 Hz for gaze and eye states, 110 Hz for IMU).
[6]:
# Interpolate to the nominal sampling frequency
gaze_resampled = gaze.interpolate()

# Three ways you can check if the resampling was successful:
# 1. Compare the effective sampling frequency to the nominal sampling frequency
print(
    f"Nominal sampling frequency of gaze: {gaze_resampled.sampling_freq_nominal} Hz. "
    f"Actual (after interpolation): {gaze_resampled.sampling_freq_effective:.1f} Hz"
)
# 2. Check the number of unique time differences
print(f"Only one unique time difference: {np.unique(gaze_resampled.ts_diff)}")
# 3. Call the `is_uniformly_sampled` property (boolean)
print(
    f"The new gaze stream is uniformly sampled: {gaze_resampled.is_uniformly_sampled}"
)

print(gaze_resampled.data.dtypes)
Nominal sampling frequency of gaze: 200 Hz. Actual (after interpolation): 200.0 Hz
Only one unique time difference: [5000000]
The new gaze stream is uniformly sampled: True
gaze x [px] float64
gaze y [px] float64
worn Int32
fixation id Int32
blink id Int32
azimuth [deg] float64
elevation [deg] float64
dtype: object
Note that after interpolation, the data types of the columns are preserved.
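A quick way to verify this (a minimal check, not part of the original tutorial output) is to compare the column dtypes before and after interpolation:

# Column dtypes should be identical before and after interpolation
print(gaze.data.dtypes.equals(gaze_resampled.data.dtypes))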
Alternatively, one can also resample the gaze data to any desired timestamps by specifying the new_ts parameter. This is especially helpful when synchronizing different data streams. For example, we can resample the gaze data (~200 Hz) to the timestamps of the IMU data (~110 Hz).
[7]:
print(f"Original gaze data length: {len(gaze)}")
print(f"Original IMU data length: {len(imu)}")
gaze_resampled_to_imu = gaze.interpolate(new_ts=imu.ts)
print(f"Gaze data length after resampling to IMU: {len(gaze_resampled_to_imu)}")
Original gaze data length: 6091
Original IMU data length: 3459
Gaze data length after resampling to IMU: 3459
By default, float-type data is interpolated using the 'cubic' method. We can examine its behavior by comparing the interpolated data with the raw data.
[8]:
# Compare the raw and interpolated gaze x traces over the first 0.3 seconds
gaze_resampled_begin = gaze_resampled.restrict(gaze_begin)

fig, ax = plt.subplots(figsize=(8, 2))
sns.lineplot(
    ax=ax,
    data=gaze_begin.data,
    x=gaze_begin.ts,
    y="gaze x [px]",
    marker="o",
    label="Raw",
)
sns.lineplot(
    ax=ax,
    data=gaze_resampled_begin.data,
    x=gaze_resampled_begin.ts,
    y="gaze x [px]",
    marker="^",
    alpha=0.6,
    label="Interpolated",
)
plt.legend()
plt.show()
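Because PyNeon interpolates via scipy.interpolate.interp1d, you can also inspect how a different interpolation kind would behave by calling scipy directly on the raw samples. This is only an illustrative sketch that bypasses PyNeon's interpolate() and reuses the objects defined above:

from scipy.interpolate import interp1d

# Work with timestamps relative to the window start for numerical stability
t0 = gaze_begin.ts[0]
raw_t = gaze_begin.ts - t0
new_t = gaze_resampled_begin.ts - t0
raw_x = gaze_begin.data["gaze x [px]"].to_numpy()

for kind in ("linear", "cubic"):
    f = interp1d(raw_t, raw_x, kind=kind, bounds_error=False)
    print(f"{kind}: first interpolated values {np.round(f(new_t)[:3], 2)}")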

Concatenating different streams
There might be cases where you want to concatenate different streams into a single one to facilitate further analysis (e.g., epoching). This is possible by interpolating the streams’ data to common timestamps and concatenating them into a single DataFrame.
The concat_streams() method of NeonRecording provides such functionality. It takes a list of stream names and resamples them to common timestamps, defined by the latest start and earliest end timestamps of the streams. The new sampling frequency can either be directly specified or taken from the lowest/highest sampling frequency of the streams.
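If you would rather resample to the highest rate among the streams, or to an explicit rate, this choice can be passed to the method. A minimal sketch, assuming the parameter is named sampling_freq and accepts "min", "max", or a number in Hz (check the concat_streams() API reference for the exact name and accepted values):

# Hypothetical parameter name `sampling_freq` -- verify against the PyNeon API reference
concat_max = recording.concat_streams(["gaze", "eye_states", "imu"], sampling_freq="max")
concat_50hz = recording.concat_streams(["gaze", "eye_states", "imu"], sampling_freq=50)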
In the following example, we concatenate the gaze, eye states, and IMU streams into a single DataFrame using the default parameters.
[9]:
concat_stream = recording.concat_streams(["gaze", "eye_states", "imu"])
print(concat_stream.data.head())
print(concat_stream.data.columns)
Concatenating streams:
Gaze
3D eye states
IMU
Using lowest sampling rate: 110 Hz (['imu'])
Using latest start timestamp: 1732621490607650343 (['imu'])
Using earliest last timestamp: 1732621520979070343 (['gaze' '3d_eye_states'])
gaze x [px] gaze y [px] worn fixation id blink id \
1732621490607650343 705.518843 554.990998 1 1 <NA>
1732621490616741252 704.882466 553.793144 1 1 <NA>
1732621490625832161 707.703787 556.712159 1 1 <NA>
1732621490634923070 711.389879 553.846843 1 1 <NA>
1732621490644013979 709.281775 555.543777 1 1 <NA>
azimuth [deg] elevation [deg] pupil diameter left [mm] \
1732621490607650343 -7.085339 3.473196 3.346414
1732621490616741252 -7.126717 3.550038 3.363306
1732621490625832161 -6.944033 3.363048 3.368352
1732621490634923070 -6.707339 3.547815 3.365432
1732621490644013979 -6.842683 3.438366 3.374732
pupil diameter right [mm] eyeball center left x [mm] \
1732621490607650343 3.360563 -32.282935
1732621490616741252 3.359459 -32.249418
1732621490625832161 3.350918 -32.216781
1732621490634923070 3.361014 -32.249155
1732621490644013979 3.365781 -32.234159
... acceleration x [g] acceleration y [g] \
1732621490607650343 ... -0.067383 -0.340820
1732621490616741252 ... -0.062619 -0.315917
1732621490625832161 ... -0.052682 -0.329993
1732621490634923070 ... -0.058795 -0.334090
1732621490644013979 ... -0.060815 -0.322199
acceleration z [g] roll [deg] pitch [deg] yaw [deg] \
1732621490607650343 0.932129 1.923968 -20.230545 132.920122
1732621490616741252 0.925714 1.923949 -20.227639 132.924402
1732621490625832161 0.924432 1.920479 -20.228228 132.927574
1732621490634923070 0.931812 1.916587 -20.228839 132.931756
1732621490644013979 0.925476 1.916104 -20.227092 132.937429
quaternion w quaternion x quaternion y quaternion z
1732621490607650343 0.395828 -0.085287 -0.154390 0.901227
1732621490616741252 0.395796 -0.085271 -0.154370 0.901246
1732621490625832161 0.395766 -0.085242 -0.154389 0.901258
1732621490634923070 0.395728 -0.085207 -0.154411 0.901275
1732621490644013979 0.395683 -0.085190 -0.154403 0.901297
[5 rows x 34 columns]
Index(['gaze x [px]', 'gaze y [px]', 'worn', 'fixation id', 'blink id',
'azimuth [deg]', 'elevation [deg]', 'pupil diameter left [mm]',
'pupil diameter right [mm]', 'eyeball center left x [mm]',
'eyeball center left y [mm]', 'eyeball center left z [mm]',
'eyeball center right x [mm]', 'eyeball center right y [mm]',
'eyeball center right z [mm]', 'optical axis left x',
'optical axis left y', 'optical axis left z', 'optical axis right x',
'optical axis right y', 'optical axis right z', 'gyro x [deg/s]',
'gyro y [deg/s]', 'gyro z [deg/s]', 'acceleration x [g]',
'acceleration y [g]', 'acceleration z [g]', 'roll [deg]', 'pitch [deg]',
'yaw [deg]', 'quaternion w', 'quaternion x', 'quaternion y',
'quaternion z'],
dtype='object')
We show an example of the sampling of the gaze, eye states, IMU, and concatenated data below. It can be seen that the concatenated data is regularly sampled at the nominal sampling rate of the IMU.
[10]:
concat_stream_middle = concat_stream.restrict(gaze_middle)
plot_timestamps(gaze_middle, eye_states_middle, imu_middle, concat_stream_middle)

Despite being sampled at different timestamps than the original streams, the concatenated data still respects the original values. For example, the gaze x [px] and acceleration x [g] traces from the raw and concatenated streams are quite comparable.
[11]:
# Compare raw vs. concatenated data in the 5-5.3 s window:
# gaze x [px] (top) and acceleration x [g] (bottom)
fig, axs = plt.subplots(2, 1, figsize=(8, 4), tight_layout=True)
sns.lineplot(
    ax=axs[0],
    data=gaze_middle.data,
    x=gaze_middle.ts,
    y="gaze x [px]",
    marker="o",
    label="Raw gaze",
)
sns.lineplot(
    ax=axs[0],
    data=concat_stream_middle.data,
    x=concat_stream_middle.ts,
    y="gaze x [px]",
    marker="^",
    label="Concatenated",
)
sns.lineplot(
    ax=axs[1],
    data=imu_middle.data,
    x=imu_middle.ts,
    y="acceleration x [g]",
    marker="o",
    label="Raw IMU",
)
sns.lineplot(
    ax=axs[1],
    data=concat_stream_middle.data,
    x=concat_stream_middle.ts,
    y="acceleration x [g]",
    marker="^",
    label="Concatenated",
)
plt.legend()
plt.show()
