TRIBE v2
A multimodal model for brain response prediction
...The system integrates state-of-the-art encoders—including LLaMA for text, V-JEPA for video, and Wav2Vec-BERT for audio—into a unified Transformer architecture. This combined representation is mapped onto the cortical surface to predict fMRI responses across thousands of brain regions. TRIBE v2 allows researchers to simulate and analyze brain activity without requiring direct human experiments. Overall, it provides a powerful tool for studying perception, cognition, and multimodal processing in the brain.