Exporting to BIDS Formats#

What is BIDS and Why Use It?#

The Brain Imaging Data Structure (BIDS) is a standard for organizing and sharing behavioral, physiological, and neuroimaging data. Converting datasets to BIDS is a widely adopted practice, particularly when curating datasets that follow the FAIR principles (Findable, Accessible, Interoperable, Reusable).

Key benefits of using BIDS:

  • Standardization: Consistent naming conventions and directory structures across datasets

  • Interoperability: Enables automated analysis pipelines and data sharing

  • Reproducibility: Comprehensive metadata ensures experiments can be understood and replicated

  • Community adoption: Widely accepted format in neuroscience research

The general framework of BIDS is described in the following publication:

Gorgolewski, K., Auer, T., Calhoun, V. et al. The brain imaging data structure, a format for organizing and describing outputs of neuroimaging experiments. Sci Data 3, 160044 (2016). https://doi.org/10.1038/sdata.2016.44

BIDS Extensions for Neon Data#

For datasets from the Pupil Labs Neon eye tracker, we use two BIDS extensions:

  1. Motion-BIDS (BEP029): Organizes motion data including acceleration, angular velocity (gyroscope), and orientation from the IMU sensor

  2. Eye-Tracking-BIDS (BEP020): Organizes gaze position, pupil size/diameter data, and eye-tracking events (fixations, saccades, blinks)

In this tutorial, we demonstrate how to export Neon recordings to these BIDS formats using PyNeon’s export_motion_bids() and export_eye_tracking_bids() methods. These functions handle all file naming, metadata generation, and formatting requirements automatically.

[1]:
import json
from pathlib import Path
import pandas as pd
from seedir import seedir
from pyneon import Dataset, get_sample_data

# Load sample data
dataset = Dataset(get_sample_data("markers", format="cloud"))
rec = dataset.recordings[1]

Exporting to Motion-BIDS#

The Motion-BIDS specification provides a standardized way to organize motion sensor data from devices like IMUs (Inertial Measurement Units):

Jeung, S., Cockx, H., Appelhoff, S. et al. Motion-BIDS: an extension to the brain imaging data structure to organize motion data for reproducible research. Sci Data 11, 716 (2024). https://doi.org/10.1038/s41597-024-03559-8

Understanding the BIDS Prefix#

The export_motion_bids() method requires a prefix string that specifies the experimental context. The prefix follows this standardized format (fields in brackets are optional):

sub-<label>[_ses-<label>]_task-<label>_tracksys-<label>[_acq-<label>][_run-<index>]

Required fields:

  • sub-<label>: Subject/participant identifier (e.g., sub-01, sub-Alice)

  • task-<label>: Name of the experimental task (e.g., task-Navigation, task-Reading)

  • tracksys-<label>: Tracking system used (for Neon IMU: tracksys-NeonIMU)

Optional fields:

  • ses-<label>: Session identifier for multi-session experiments

  • acq-<label>: Acquisition parameters or protocol

  • run-<index>: Run number for repeated acquisitions
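As a sketch, a valid prefix can be assembled programmatically from its entities. The helper name `make_bids_prefix` below is illustrative, not part of PyNeon:

```python
def make_bids_prefix(sub, task, tracksys, ses=None, acq=None, run=None):
    """Assemble a BIDS prefix from entity labels (illustrative helper)."""
    parts = [f"sub-{sub}"]
    if ses is not None:
        parts.append(f"ses-{ses}")
    parts.append(f"task-{task}")
    parts.append(f"tracksys-{tracksys}")
    if acq is not None:
        parts.append(f"acq-{acq}")
    if run is not None:
        parts.append(f"run-{run}")
    return "_".join(parts)

print(make_bids_prefix("01", "LabMuse", "NeonIMU", ses="1", run=1))
# sub-01_ses-1_task-LabMuse_tracksys-NeonIMU_run-1
```

Note that the entity order is fixed by the BIDS specification, which is why optional fields are interleaved rather than appended at the end.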

Adding Custom Metadata#

You can include additional experiment-specific metadata by passing a dictionary to the extra_metadata argument. This information will be saved in the JSON metadata file and is crucial for documenting your experimental setup.

Let’s export the motion data:

[2]:
# Create a BIDS directory
motion_dir = Path("export") / "BIDS" / "sub-01" / "ses-1" / "motion"

# Export the motion data to BIDS format
prefix = "sub-01_ses-1_task-LabMuse_tracksys-NeonIMU_run-1"
extra_metadata = {
    "TaskName": "LabMuse",
    "TaskDescription": "Watching artworks on the screen",
    "InstitutionName": "Streeling University",
    "InstitutionAddress": "Trantor, Galactic Empire",
    "InstitutionalDepartmentName": "Department of Psychohistory",
}

rec.export_motion_bids(motion_dir, prefix=prefix, extra_metadata=extra_metadata)

seedir(motion_dir.parent.parent)
sub-01/
└─ses-1/
  ├─motion/
  │ ├─sub-01_ses-1_task-LabMuse_tracksys-NeonIMU_run-1_channels.json
  │ ├─sub-01_ses-1_task-LabMuse_tracksys-NeonIMU_run-1_channels.tsv
  │ ├─sub-01_ses-1_task-LabMuse_tracksys-NeonIMU_run-1_motion.json
  │ ├─sub-01_ses-1_task-LabMuse_tracksys-NeonIMU_run-1_motion.tsv
  │ ├─sub-01_ses-1_task-LabMuse_tracksys-NeonIMU_run-1_physio.json
  │ ├─sub-01_ses-1_task-LabMuse_tracksys-NeonIMU_run-1_physio.tsv.gz
  │ ├─sub-01_ses-1_task-LabMuse_tracksys-NeonIMU_run-1_physioevents.json
  │ └─sub-01_ses-1_task-LabMuse_tracksys-NeonIMU_run-1_physioevents.tsv.gz
  └─sub-01_ses-1_scans.tsv

Understanding the Motion-BIDS File Structure#

The export creates four files that work together to fully describe the IMU data:

  1. ``_motion.tsv``: Tab-separated file containing the raw IMU time-series data (no header)

  2. ``_motion.json``: Metadata describing the recording setup, device, and data characteristics

  3. ``_channels.tsv``: Information about each data channel (type, units, sampling rate)

  4. ``_channels.json``: Coordinate system information for the motion data

Additionally, a ``_scans.tsv`` file is created in the parent directory to log all acquisitions for the subject/session.

Let’s examine each file in detail.

1. Motion Time-Series Data (_motion.tsv)#

This file contains the continuous IMU measurements. Each row is a sample, and each column is a sensor channel (13 in total: 3 gyroscope + 3 accelerometer + 7 orientation, i.e. roll/pitch/yaw plus a 4-component quaternion):

[3]:
motion_tsv_path = motion_dir / f"{prefix}_motion.tsv"
motion_df = pd.read_csv(motion_tsv_path, sep="\t", header=None)
print(f"Motion data shape: {motion_df.shape}")
print(motion_df.head())
Motion data shape: (3579, 13)
         0          1          2         3         4         5         6   \
0 -4.850388  64.683914  37.141800  0.044434  0.021484  1.003906 -1.620918
1 -5.055847  64.009297  36.934723  0.021241  0.007817  1.005563 -1.143437
2 -5.092621  63.636188  37.067672  0.009610  0.004351  0.992939 -0.669902
3 -5.049325  63.301292  37.320572 -0.005809  0.009269  0.995885  0.321108
4 -4.925333  64.229119  37.439189 -0.008550  0.001859  0.989226  1.114043

         7           8         9         10        11        12
0 -0.014557 -101.157675  0.634954 -0.011007 -0.008884 -0.772421
1 -0.052766 -100.864351  0.636961 -0.007987 -0.006000 -0.770828
2 -0.092127 -100.564013  0.638999 -0.005013 -0.003115 -0.769180
3 -0.166917  -99.932651  0.643222  0.001200  0.002926 -0.765656
4 -0.223694  -99.436470  0.646499  0.006153  0.007776 -0.762848

2. Motion Metadata (_motion.json)#

This file contains crucial metadata about the recording setup and data characteristics. Note how our custom metadata (TaskName, InstitutionName, etc.) has been included:

[4]:
motion_json = motion_dir / f"{prefix}_motion.json"
with open(motion_json, "r") as f:
    motion_metadata = json.load(f)
print(json.dumps(motion_metadata, indent=4))
{
    "TaskName": "LabMuse",
    "TaskDescription": "Watching artworks on the screen",
    "Instructions": "",
    "DeviceSerialNumber": "114837",
    "Manufacturer": "TDK InvenSense & Pupil Labs",
    "ManufacturersModelName": "ICM-20948",
    "SoftwareVersions": "App version: 2.9.26-prod; Pipeline version: 2.8.0",
    "InstitutionName": "Streeling University",
    "InstitutionAddress": "Trantor, Galactic Empire",
    "InstitutionalDepartmentName": "Department of Psychohistory",
    "SamplingFrequency": 110,
    "ACCELChannelCount": 3,
    "ANGACCELChannelCount": 0,
    "GYROChannelCount": 3,
    "JNTANGChannelCount": 0,
    "LATENCYChannelCount": 0,
    "MAGNChannelCount": 0,
    "MISCChannelCount": 0,
    "MissingValues": "n/a",
    "MotionChannelCount": 0,
    "ORNTChannelCount": 7,
    "POSChannelCount": 0,
    "SamplingFrequencyEffective": 110.0000011,
    "SubjectArtefactDescription": "",
    "TrackedPointsCount": 0,
    "TrackingSystemName": "Neon IMU",
    "VELChannelCount": 0
}

3. Channel Information (_channels.tsv)#

This file provides detailed information about each channel in the motion data, including the sensor type, spatial component (x/y/z), units, and sampling frequency:

[5]:
channels_tsv_path = motion_dir / f"{prefix}_channels.tsv"
channels_df = pd.read_csv(channels_tsv_path, sep="\t")
print(channels_df)
              name component   type tracked_point      units  \
0           gyro x         x   GYRO          Head      deg/s
1           gyro y         y   GYRO          Head      deg/s
2           gyro z         z   GYRO          Head      deg/s
3   acceleration x         x  ACCEL          Head          g
4   acceleration y         y  ACCEL          Head          g
5   acceleration z         z  ACCEL          Head          g
6             roll         x   ORNT          Head        deg
7            pitch         y   ORNT          Head        deg
8              yaw         z   ORNT          Head        deg
9     quaternion w    quat_w   ORNT          Head  arbitrary
10    quaternion x    quat_x   ORNT          Head  arbitrary
11    quaternion y    quat_y   ORNT          Head  arbitrary
12    quaternion z    quat_z   ORNT          Head  arbitrary

                    placement  sampling_frequency status  status_description
0   Head-mounted Neon glasses          110.000001   good                 NaN
1   Head-mounted Neon glasses          110.000001   good                 NaN
2   Head-mounted Neon glasses          110.000001   good                 NaN
3   Head-mounted Neon glasses          110.000001   good                 NaN
4   Head-mounted Neon glasses          110.000001   good                 NaN
5   Head-mounted Neon glasses          110.000001   good                 NaN
6   Head-mounted Neon glasses          110.000001   good                 NaN
7   Head-mounted Neon glasses          110.000001   good                 NaN
8   Head-mounted Neon glasses          110.000001   good                 NaN
9   Head-mounted Neon glasses          110.000001   good                 NaN
10  Head-mounted Neon glasses          110.000001   good                 NaN
11  Head-mounted Neon glasses          110.000001   good                 NaN
12  Head-mounted Neon glasses          110.000001   good                 NaN

Sensor types in Neon IMU:

  • GYRO: Angular velocity (rotation rate) in degrees/second

  • ACCEL: Linear acceleration in g-force units

  • ORNT: Orientation quaternion (w, x, y, z) in arbitrary units
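To sanity-check the ORNT channels, a quaternion sample can be converted back to Euler angles. This sketch assumes SciPy is installed (it is not required by the export itself); note that SciPy expects quaternions in (x, y, z, w) order, while the channels above list w first:

```python
from scipy.spatial.transform import Rotation as R

# First ORNT sample from the motion data above: quaternion (w, x, y, z)
w, x, y, z = 0.634954, -0.011007, -0.008884, -0.772421
rot = R.from_quat([x, y, z, w])  # reorder to SciPy's (x, y, z, w) convention

# Decompose using the ZXY rotation order declared in _channels.json
angles = rot.as_euler("ZXY", degrees=True)
print(angles)  # first angle is approximately -101.16, matching the yaw column
```

How the three resulting angles map onto the roll/pitch/yaw columns follows from the ZXY order and right-hand rule declared in _channels.json; verify against your own data before relying on it.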

4. Coordinate System (_channels.json)#

This file defines the reference frame for interpreting the motion data. For Neon, the global reference frame is defined by the IMU axes (X=right, Y=anterior, Z=superior):

[6]:
channels_json_path = motion_dir / f"{prefix}_channels.json"
with open(channels_json_path, "r") as f:
    channels_metadata = json.load(f)
print(json.dumps(channels_metadata, indent=4))
{
    "reference_frame": {
        "Levels": {
            "global": {
                "SpatialAxes": "RAS",
                "RotationOrder": "ZXY",
                "RotationRule": "right-hand",
                "Description": "This global reference frame is defined by the IMU axes: X right, Y anterior, Z superior. The scene camera frame differs from this frame by a 102-degree rotation around the X-axis. All motion data are expressed relative to the IMU frame for consistency."
            }
        }
    }
}
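Building on the frame description above, the IMU-to-scene-camera relationship can be sketched as a rotation about the X-axis. This again assumes SciPy, and the sign of the angle is an assumption; verify the direction against Pupil Labs' coordinate-system documentation before relying on it:

```python
from scipy.spatial.transform import Rotation as R

# Hypothetical sketch: rotate a vector from the IMU frame into the scene
# camera frame via the 102-degree rotation about X described above.
# The sign of the rotation is an assumption.
imu_to_cam = R.from_euler("x", 102, degrees=True)

v_imu = [0.0, 0.0, 1.0]  # the IMU "superior" (Z) axis
print(imu_to_cam.apply(v_imu))
```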

Exporting to Eye-Tracking-BIDS#

The Eye-Tracking-BIDS specification standardizes how gaze position, pupil data, and eye-tracking events should be organized:

Szinte, M., Bach, D. R., Draschkow, D., Esteban, O., Gagl, B., Gau, R., Gregorova, K., Halchenko, Y. O., Huberty, S., Kling, S. M., Kulkarni, S., Maintainers, T. B., Markiewicz, C. J., Mikkelsen, M., Oostenveld, R., & Pfarr, J.-K. (2026). Eye-Tracking-BIDS: The Brain Imaging Data Structure extended to gaze position and pupil data. bioRxiv. https://doi.org/10.64898/2026.02.03.703514

Export Configuration#

The export_eye_tracking_bids() method accepts arguments similar to those of export_motion_bids():

  • output_dir: Directory where files will be saved

  • prefix: BIDS naming prefix (must include sub-<label> and task-<label> at minimum). If not provided, the function will attempt to infer it from existing files in the output directory.

  • extra_metadata: Optional dictionary of additional metadata

Important: When exporting eye-tracking data, it’s best practice to use the same output directory as your motion (or other modality) data to link them together as part of the same recording session. The function will automatically detect and use the matching prefix from existing files, or you can explicitly provide one. In BIDS, eye-tracking data is stored as physiological (physio) data.

[7]:
rec.export_eye_tracking_bids(motion_dir)
seedir(motion_dir)
Warning: Duplicated indices found and removed.
C:\Users\qian.chu\Documents\GitHub\PyNeon\pyneon\preprocess\preprocess.py:67: UserWarning: 23 out of 6496 requested timestamps are outside the data time range and will have empty data.
  warn(
motion/
├─sub-01_ses-1_task-LabMuse_tracksys-NeonIMU_run-1_channels.json
├─sub-01_ses-1_task-LabMuse_tracksys-NeonIMU_run-1_channels.tsv
├─sub-01_ses-1_task-LabMuse_tracksys-NeonIMU_run-1_motion.json
├─sub-01_ses-1_task-LabMuse_tracksys-NeonIMU_run-1_motion.tsv
├─sub-01_ses-1_task-LabMuse_tracksys-NeonIMU_run-1_physio.json
├─sub-01_ses-1_task-LabMuse_tracksys-NeonIMU_run-1_physio.tsv.gz
├─sub-01_ses-1_task-LabMuse_tracksys-NeonIMU_run-1_physioevents.json
└─sub-01_ses-1_task-LabMuse_tracksys-NeonIMU_run-1_physioevents.tsv.gz

Understanding the Eye-Tracking-BIDS File Structure#

Eye-Tracking-BIDS creates four main files:

  1. ``_physio.tsv.gz``: Compressed time-series data for gaze and pupil measurements

  2. ``_physio.json``: Metadata describing the eye-tracking setup and data columns

  3. ``_physioevents.tsv.gz``: Event data (fixations, saccades, blinks, custom messages)

  4. ``_physioevents.json``: Metadata for the events file

1. Physiological Time-Series Data (_physio.tsv.gz)#

This compressed file contains continuous gaze and pupil data with 5 columns:

  • timestamp: UTC timestamp in nanoseconds

  • x_coordinate: Horizontal gaze position in pixels

  • y_coordinate: Vertical gaze position in pixels

  • left_pupil_diameter: Left pupil diameter in millimeters

  • right_pupil_diameter: Right pupil diameter in millimeters

Let’s inspect the data:

[8]:
physio_tsv_path = motion_dir / f"{prefix}_physio.tsv.gz"
physio_df = pd.read_csv(physio_tsv_path, sep="\t", compression="gzip", header=None)
print(f"Eye-tracking data shape: {physio_df.shape}")
print(physio_df.head())
Eye-tracking data shape: (6496, 5)
                     0        1        2       3       4
0  1758493906829570307  493.391  519.903  5.2033  4.4699
1  1758493906839570307  483.811  509.177  5.2051  4.4298
2  1758493906844570307  480.526  510.026  5.2209  4.4105
3  1758493906849577307  480.566  511.044  5.2965  4.4155
4  1758493906854570307  482.622  513.314  5.2680  4.4286

2. Physiological Data Metadata (_physio.json)#

This file provides comprehensive metadata about the eye-tracking data, including column definitions, sampling frequency, and device information:

[9]:
physio_json = motion_dir / f"{prefix}_physio.json"
with open(physio_json, "r") as f:
    physio_metadata = json.load(f)
print(json.dumps(physio_metadata, indent=4))
{
    "SamplingFrequency": 199.66054326173642,
    "StartTime": 0,
    "Columns": [
        "timestamp",
        "x_coordinate",
        "y_coordinate",
        "left_pupil_diameter",
        "right_pupil_diameter"
    ],
    "DeviceSerialNumber": "114837",
    "Manufacturer": "Pupil Labs",
    "ManufacturersModelName": "Neon",
    "SoftwareVersions": "App version: 2.9.26-prod; Pipeline version: 2.8.0",
    "PhysioType": "eyetrack",
    "EnvironmentCoorinates": "top-left",
    "RecordedEye": "cyclopean",
    "SampleCoordinateSystem": "gaze-in-world",
    "EyeTrackingMethod": "real-time neural network",
    "timestamp": {
        "Description": "UTC timestamp in nanoseconds of the sample",
        "Units": "ns"
    },
    "x_coordinate": {
        "Description": "X-coordinate of the mapped gaze point in world camera pixel coordinates.",
        "Units": "pixel"
    },
    "y_coordinate": {
        "Description": "Y-coordinate of the mapped gaze point in world camera pixel coordinates.",
        "Units": "pixel"
    },
    "left_pupil_diameter": {
        "Description": "Physical diameter of the pupil of the left eye",
        "Units": "mm"
    },
    "right_pupil_diameter": {
        "Description": "Physical diameter of the pupil of the right eye",
        "Units": "mm"
    },
    "TaskName": "LabMuse"
}
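Since _physio.tsv.gz has no header row, the Columns field of _physio.json supplies the column names. A minimal sketch of pairing the two, using inline stand-ins for the exported files (values copied from the output above):

```python
import io
import json

import pandas as pd

# Inline stand-ins for the exported files (illustrative excerpt only)
physio_json_text = json.dumps({
    "Columns": [
        "timestamp", "x_coordinate", "y_coordinate",
        "left_pupil_diameter", "right_pupil_diameter",
    ]
})
physio_tsv_text = "1758493906829570307\t493.391\t519.903\t5.2033\t4.4699\n"

# Apply the metadata column names to the headerless TSV
columns = json.loads(physio_json_text)["Columns"]
gaze = pd.read_csv(io.StringIO(physio_tsv_text), sep="\t", header=None, names=columns)
print(gaze["x_coordinate"].iloc[0])  # 493.391
```

With the real files, the same pattern applies: pass the loaded metadata's "Columns" list as the names argument when reading the _physio.tsv.gz file.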

3. Eye-Tracking Events (_physioevents.tsv.gz)#

This file contains all detected eye-tracking events and custom messages. Each row represents one event with columns:

  • onset: Event start time in nanoseconds

  • duration: Event duration in seconds (for fixations, saccades, blinks)

  • trial_type: Type of event (fixation, saccade, blink)

  • message: Custom event messages/markers (when applicable)

PyNeon automatically exports all available events from the recording, including algorithmically detected events (fixations, saccades, blinks) and user-defined messages.

[10]:
physioevents_tsv_path = motion_dir / f"{prefix}_physioevents.tsv.gz"
physioevents_df = pd.read_csv(
    physioevents_tsv_path, sep="\t", header=None, compression="gzip"
)
print(f"Total events: {physioevents_df.shape[0]}")
print(physioevents_df.head(10))
Total events: 143
                     0      1         2                3
0  1758493904395000000    NaN       NaN  recording.begin
1  1758493906489203307  0.270     blink              NaN
2  1758493906839570307  0.180  fixation              NaN
3  1758493907019693307  0.010   saccade              NaN
4  1758493907029691307  0.576  fixation              NaN
5  1758493907605304307  0.090   saccade              NaN
6  1758493907695302307  0.150  fixation              NaN
7  1758493907845424307  0.065   saccade              NaN
8  1758493907910548307  0.210  fixation              NaN
9  1758493908120793307  0.040   saccade              NaN
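The headerless events table can be labeled the same way, using the Columns field from _physioevents.json, and then filtered by event type. A sketch with inline sample rows copied from the output above (real code would read the _physioevents.tsv.gz file instead):

```python
import io

import pandas as pd

# Inline sample rows copied from the output above (illustrative excerpt only)
events_tsv_text = (
    "1758493904395000000\t\t\trecording.begin\n"
    "1758493906489203307\t0.270\tblink\t\n"
    "1758493906839570307\t0.180\tfixation\t\n"
    "1758493907019693307\t0.010\tsaccade\t\n"
    "1758493907029691307\t0.576\tfixation\t\n"
)
columns = ["onset", "duration", "trial_type", "message"]
events = pd.read_csv(io.StringIO(events_tsv_text), sep="\t", header=None, names=columns)

# Select only fixations and summarize their durations
fixations = events[events["trial_type"] == "fixation"]
print(len(fixations), fixations["duration"].mean())  # 2 fixations, mean 0.378 s
```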

4. Events Metadata (_physioevents.json)#

This file describes the structure and meaning of the events data:

[11]:
physioevents_json = motion_dir / f"{prefix}_physioevents.json"
with open(physioevents_json, "r") as f:
    physioevents_metadata = json.load(f)
print(json.dumps(physioevents_metadata, indent=4))
{
    "Columns": [
        "onset",
        "duration",
        "trial_type",
        "message"
    ],
    "Description": "Eye events and messages logged by Neon",
    "OnsetSource": "timestamp",
    "onset": {
        "Description": "UTC timestamp in nanoseconds of the start of the event",
        "Units": "ns"
    },
    "duration": {
        "Description": "Event duration",
        "Units": "s"
    },
    "trial_type": {
        "Description": "Type of trial event",
        "Levels": {
            "fixation": {
                "Description": "Fixation event"
            },
            "saccade": {
                "Description": "Saccade event"
            },
            "blink": {
                "Description": "Blink event"
            }
        }
    },
    "TaskName": "LabMuse"
}