Brain embeddings with shared geometry to artificial contextual embeddings, as a code for representing language in the human brain
Abstract
Contextual embeddings, derived from deep language models (DLMs), provide a continuous vectorial representation of language. This embedding space differs fundamentally from the symbolic representations posited by traditional psycholinguistics. Do language areas in the human brain, like DLMs, rely on a continuous embedding space to represent language? To test this hypothesis, we recorded neural activity in the inferior frontal gyrus (IFG, also known as Broca's area) of three participants using dense intracranial arrays while they listened to a 30-minute podcast. From these fine-grained spatiotemporal neural recordings, we derived for each patient a continuous vectorial representation of each word (i.e., a brain embedding). Using stringent zero-shot mapping, we demonstrated that the brain embedding space in the IFG and the DLM contextual embedding space have strikingly similar geometry. This shared geometry allows us to precisely triangulate the position of unseen words in both the brain embedding space (zero-shot encoding) and the DLM contextual embedding space (zero-shot decoding). The continuous brain embedding space provides an alternative computational framework for how natural language is represented in cortical language areas.
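The zero-shot mapping described above can be illustrated with a small sketch: fit a linear (ridge) map from one embedding space to the other on all words except a held-out word, predict the held-out word's embedding, and check whether its nearest neighbor in the target space is the correct word. This is a minimal illustration on synthetic data, not the authors' pipeline; all names (`dlm_emb`, `brain_emb`, dimensions, the noise level) are illustrative assumptions.

```python
# Minimal sketch of zero-shot encoding between two embedding spaces.
# Synthetic data only; dimensions and noise level are assumptions.
import numpy as np

rng = np.random.default_rng(0)
n_words, d_dlm, d_brain = 50, 16, 8

# Synthetic "DLM" embeddings and a noisy linear "brain" counterpart.
dlm_emb = rng.standard_normal((n_words, d_dlm))
W_true = rng.standard_normal((d_dlm, d_brain))
brain_emb = dlm_emb @ W_true + 0.1 * rng.standard_normal((n_words, d_brain))

def zero_shot_encode(test_idx, alpha=1.0):
    """Predict a held-out word's brain embedding from its DLM embedding,
    using a ridge map fit on all other words (leave-one-out)."""
    train = np.delete(np.arange(n_words), test_idx)
    X, Y = dlm_emb[train], brain_emb[train]
    # Closed-form ridge regression: W = (X^T X + alpha*I)^-1 X^T Y
    W = np.linalg.solve(X.T @ X + alpha * np.eye(d_dlm), X.T @ Y)
    return dlm_emb[test_idx] @ W

def nearest_word(pred):
    """Index of the word whose brain embedding is most cosine-similar."""
    sims = (brain_emb @ pred) / (
        np.linalg.norm(brain_emb, axis=1) * np.linalg.norm(pred))
    return int(np.argmax(sims))

correct = sum(nearest_word(zero_shot_encode(i)) == i for i in range(n_words))
print(f"zero-shot top-1 accuracy: {correct}/{n_words}")
```

Because the map is never fit on the held-out word, above-chance accuracy implies the two spaces share geometry, which is the logic of the zero-shot test; zero-shot decoding is the same procedure with the roles of the two spaces swapped.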