# ZUNA: EEG Foundation Model
ZUNA is a 380M-parameter masked diffusion autoencoder trained to reconstruct, denoise, and upsample scalp-EEG signals. Given a subset of EEG channels, ZUNA can:
- Denoise existing EEG channels
- Reconstruct missing EEG channels
- Predict novel channel signals given physical coordinates on the scalp
ZUNA was trained on approximately 2 million channel-hours of EEG data from a wide range of publicly available sources. At 380M parameters, it is lightweight enough to run on a consumer GPU and can be used on CPU for many workloads.
## Performance
ZUNA substantially outperforms standard methods for channel denoising, reconstruction, and upsampling. As a baseline we used MNE's default spherical spline interpolation: across a range of unseen datasets, including those preprocessed with a different pipeline, ZUNA achieves higher reconstruction accuracy. Its advantage is particularly striking at higher upsampling ratios, indicating that it effectively exploits general priors learned through large-scale pretraining.
## Installation
```bash
# (1) Download tutorial and sample data from GitHub
git clone --depth 1 --filter=blob:none --sparse https://github.com/Zyphra/zuna.git
cd zuna
git sparse-checkout set tutorials

# (2) Pip install zuna
pip install zuna
```
Or install in development mode:
```bash
# (1) Download the ZUNA codebase from GitHub
git clone https://github.com/Zyphra/zuna.git
cd zuna

# (2) Pip install zuna in development mode
pip install -e .
```
## Quick Start
See tutorials/run_zuna_pipeline.py for a complete working example.
A version of this script is also available on Google Colaboratory for free GPU access.
Edit the paths and options, then run:
```bash
python tutorials/run_zuna_pipeline.py
```
Input .fif files must have a channel montage set with 3D positions (see Setting Montages below). The pipeline runs 4 steps:
| Step | Function | Description |
|---|---|---|
| 1 | zuna.preprocessing() | .fif → .pt (resample, filter, epoch, normalize) |
| 2 | zuna.inference() | .pt → .pt (model reconstruction) |
| 3 | zuna.pt_to_fif() | .pt → .fif (denormalize, concatenate) |
| 4 | zuna.compare_plot_pipeline() | Generate comparison plots |
Model weights are automatically downloaded from HuggingFace on first run.
The pipeline creates this directory structure:
```
working_dir/
  1_fif_filter/  - Preprocessed .fif files (for comparison)
  2_pt_input/    - Preprocessed .pt files (model input)
  3_pt_output/   - Model output .pt files
  4_fif_output/  - Final reconstructed .fif files
  FIGURES/       - Comparison plots
```
## API Reference
For detailed documentation on any function, use help():
```python
import zuna

help(zuna.preprocessing)
help(zuna.inference)
help(zuna.pt_to_fif)
help(zuna.compare_plot_pipeline)
```
### Preprocessing
Preprocess .fif files to .pt format (resample to 256 Hz, filter, epoch into 5s segments, normalize).
```python
from zuna import preprocessing

preprocessing(
    input_dir="path/to/fif/files",
    output_dir="path/to/working/2_pt_input",
    apply_notch_filter=False,        # Automatic line noise removal
    apply_highpass_filter=True,      # 0.5 Hz highpass
    apply_average_reference=True,    # Average reference
    target_channel_count=['AF3', 'AF4', 'F1', 'F2'],  # Add channels from 10-05 montage
    bad_channels=['Cz', 'Fz'],       # Zero out known bad channels
    preprocessed_fif_dir="path/to/working/1_fif_filter",  # Save filtered .fif for comparison
)
```
Note: Sampling rate (256 Hz), epoch duration (5s), and batch size (64 epochs per file) are fixed to match the pretrained model and should not be changed.
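These fixed values pin down the tensor sizes the model was trained on; a quick sanity check of the arithmetic (the sample counts below are derived from the note above, not read from the zuna source):

```python
SAMPLING_RATE_HZ = 256  # fixed by the pretrained model
EPOCH_SECONDS = 5       # fixed epoch duration
EPOCHS_PER_FILE = 64    # fixed batch size per .pt file

samples_per_epoch = SAMPLING_RATE_HZ * EPOCH_SECONDS
print(samples_per_epoch)                    # 1280 samples per 5 s epoch
print(EPOCHS_PER_FILE * samples_per_epoch)  # 81920 samples (320 s of signal) per file
```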
### Inference
Run the ZUNA model on preprocessed .pt files. Model weights are downloaded from HuggingFace automatically.
```python
from zuna import inference

inference(
    input_dir="path/to/working/2_pt_input",
    output_dir="path/to/working/3_pt_output",
    gpu_device=0,                   # GPU ID (default: 0), or "" for CPU
    tokens_per_batch=100000,        # Increase for higher GPU utilization
    data_norm=10.0,                 # Normalization denominator (ZUNA expects std=0.1)
    diffusion_cfg=1.0,              # Classifier-free guidance (1.0 = no CFG)
    diffusion_sample_steps=50,      # Diffusion steps
    plot_eeg_signal_samples=False,  # Plot per-sample reconstructions (slow; for debugging)
    inference_figures_dir="./FIGURES",  # Where to save per-sample plots
)
```
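The `data_norm` denominator rescales the preprocessed signal so its standard deviation lands near the 0.1 the model expects. A toy illustration with synthetic data (the unit-standard-deviation input is an assumption about the preprocessing output, not taken from the zuna source):

```python
import random
import statistics

random.seed(0)
# Synthetic stand-in for a preprocessed channel with unit standard deviation.
signal = [random.gauss(0.0, 1.0) for _ in range(100_000)]

data_norm = 10.0
scaled = [x / data_norm for x in signal]

print(round(statistics.pstdev(scaled), 2))  # ~0.1, the scale ZUNA expects
```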
### Reconstruction
Convert model output .pt files back to .fif format, reversing normalization and stitching epochs back together.
```python
from zuna import pt_to_fif

pt_to_fif(
    input_dir="path/to/working/3_pt_output",
    output_dir="path/to/working/4_fif_output",
)
```
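Conceptually, this step reverses the two transformations applied earlier: each epoch is scaled back up and the epochs are concatenated in recording order. A toy, list-based sketch of that stitching logic, assuming denormalization is a simple multiplication by `data_norm` (this is an illustration, not the actual zuna implementation):

```python
def stitch_epochs(epochs, data_norm=10.0):
    """Denormalize epochs and concatenate them into one continuous signal."""
    signal = []
    for epoch in epochs:  # epochs arrive in recording order
        signal.extend(x * data_norm for x in epoch)
    return signal

# Two tiny "epochs" standing in for model output at std ~0.1.
out = stitch_epochs([[0.01, -0.02], [0.03, 0.0]])
print(out)  # [0.1, -0.2, 0.3, 0.0]
```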
### Visualization
Generate comparison plots between pipeline input and output.
```python
from zuna import compare_plot_pipeline

compare_plot_pipeline(
    input_dir="path/to/original/fif/files",
    fif_input_dir="path/to/working/1_fif_filter",
    fif_output_dir="path/to/working/4_fif_output",
    pt_input_dir="path/to/working/2_pt_input",
    pt_output_dir="path/to/working/3_pt_output",
    output_dir="path/to/working/FIGURES",
    plot_pt=True,    # Compare .pt files (epoch-level)
    plot_fif=True,   # Compare .fif files (full recording)
    num_samples=2,   # Number of files to compare
)
```
## Setting Montages
Input .fif files must have a channel montage with 3D positions. If your files don't have one:
```python
import mne

raw = mne.io.read_raw_fif('data.fif', preload=True)
montage = mne.channels.make_standard_montage('standard_1005')
raw.set_montage(montage)
raw.save('data_with_montage.fif', overwrite=True)
```
## Citation
For more information, please see our technical whitepaper and blog. If you find ZUNA useful in your work, please cite accordingly.
Organizations or researchers interested in collaborating with Zyphra to improve future versions for specific needs or use cases should contact bci@zyphra.com.
## Disclaimer
This software and related services ("Services") are provided for research use only and are not intended for use in the diagnosis, cure, mitigation, treatment, or prevention of any disease or health condition. The Services have not been validated for any medical or clinical use. The information provided through the Services is for general informational purposes only and is not a substitute for any professional medical or healthcare advice. We do not warrant that any information provided through the Services is accurate, complete, or useful to you. Any reliance you place on such information is strictly at your own risk.