Overview: Area A

Thematic Area A: Dynamics of Crossmodal Adaptation

Projects in Area A focus on the dynamics of the crossmodal learning process, to discover what occurs and what should occur during learning when information is arriving through multiple disparate modalities.

The projects in Thematic Area A are designed around refined research questions and challenging real-world experiments, guided by the insights and results gained in the first funding period. They will investigate how crossmodal signals are integrated in humans (A1, A3), animals (A2), and machines (A4-A6). Project A5 also explicitly integrates human experiments with neurorobotic experiments based on parallel behavioural and robotic development. All projects in Area A contribute mainly to objectives O1 (learning architecture) and O2 (learning strategies). Objective O3 (robustness) is addressed by projects A4-A6, and also by a study on audiovisual (AV) integration in healthy subjects (A1) and by a comparative study on healthy and diseased subjects (A3). Projects A4-A6 will develop new complementary neural architectures; through common links in their methodology and tight collaborations, they will generate the necessary critical mass for studying complementary computational principles for crossmodal learning. All six projects will contribute to the II-T theory and II-M models integrators, and some (A3-A6) also to the II-R robot and systems integrator. A brief overview of the projects in this area is provided below.

Project A1 (Bruns, Röder, X. Fu) will investigate basic principles and neural mechanisms of the dynamic process of crossmodal binding and their development in humans. According to current hierarchical causal inference models, the degree to which humans integrate stimuli from different sensory modalities (e.g., a visual and an auditory stimulus) depends on the human’s assessment of how likely it is that the stimuli originated from the same event. Results from the first funding phase demonstrated that this assessment is flexible and depends on both the sensory evidence and the human’s prior knowledge about stimulus correspondences (Habets et al., 2017; Zierul et al., 2019). In the second funding phase, EEG techniques and developmental studies in children will be used to determine how learned priors and sensory evidence interact in the human brain and how these processes emerge during ontogenetic development.
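To make the causal inference idea concrete, the following is a minimal sketch of one commonly used Gaussian formulation, in which the probability of a common cause is computed from a visual and an auditory measurement. It is an illustration only and not the model used in A1: the function name `common_cause_posterior`, the Gaussian noise and spatial prior, and all parameter values are assumptions made for this example.

```python
import math


def common_cause_posterior(x_v, x_a, sigma_v, sigma_a, sigma_p, p_common, mu_p=0.0):
    """Posterior probability that two measurements share one cause (hypothetical helper).

    x_v, x_a         : noisy visual and auditory location measurements
    sigma_v, sigma_a : assumed Gaussian sensory noise (standard deviations)
    sigma_p, mu_p    : assumed Gaussian prior over source locations
    p_common         : prior probability of a common cause (the learned prior)
    """
    var_v, var_a, var_p = sigma_v ** 2, sigma_a ** 2, sigma_p ** 2

    # Likelihood of the measurements if both come from ONE source (C = 1).
    denom_1 = var_v * var_a + var_v * var_p + var_a * var_p
    quad_1 = ((x_v - x_a) ** 2 * var_p
              + (x_v - mu_p) ** 2 * var_a
              + (x_a - mu_p) ** 2 * var_v) / denom_1
    like_1 = math.exp(-0.5 * quad_1) / (2 * math.pi * math.sqrt(denom_1))

    # Likelihood if the measurements come from TWO independent sources (C = 2).
    denom_2 = (var_v + var_p) * (var_a + var_p)
    quad_2 = ((x_v - mu_p) ** 2 / (var_v + var_p)
              + (x_a - mu_p) ** 2 / (var_a + var_p))
    like_2 = math.exp(-0.5 * quad_2) / (2 * math.pi * math.sqrt(denom_2))

    # Bayes' rule combines sensory evidence with the prior p_common.
    return like_1 * p_common / (like_1 * p_common + like_2 * (1.0 - p_common))


# Illustrative values only (degrees of visual angle, not taken from the project).
near = common_cause_posterior(0.0, 0.0, sigma_v=2.0, sigma_a=8.0, sigma_p=15.0, p_common=0.5)
far = common_cause_posterior(-20.0, 20.0, sigma_v=2.0, sigma_a=8.0, sigma_p=15.0, p_common=0.5)
print(near, far)
```

In this sketch, spatially coincident stimuli yield a higher common-cause probability than widely separated ones, and raising `p_common` raises the posterior for the same sensory evidence, which mirrors the interplay of sensory evidence and prior knowledge described above.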

Project A2 (Guan, Hilgetag) will study cellular population activity and connectivity in the cerebral cortex to answer the question of how local and long-range circuits are used and adapted in the representation of crossmodal memories (Guan et al., 2016; Kitamura et al., 2017). Combining cellular functional imaging and connectomics with computational analysis and modelling, the project will dissect the laminar circuits and plasticity mechanisms underlying the neural dynamics of crossmodal memory in the rodent neocortex (Zhu et al., 2016). The project builds on the key findings from the first funding period, in which it identified and characterized cortical memory engrams, which are persistently firing neurons that are sparsely distributed across cortical layers and areas, and developed computational models to understand the distribution of engram activities and their formation. In the second funding period, neural signals of calcium events and immediate early gene expression will be recorded across mouse sensory cortex at very high resolution. This will be done using in vivo two-photon imaging in the context of behavioural paradigms presenting different types of sensory inputs during short-term and long-term memory tasks. The local and long-distance connectivity network will be dissected through virus tracking and optogenetic manipulation in local and long-range projection sources, and characteristic functional contributions and interactions of different layers will be assessed by data analysis and modelling.

Project A3 (Gerloff, Xue) aims to provide a deeper neuro-representational and neuro-computational analysis of crossmodal integration in healthy adults, to extend this knowledge to understand the impairment of crossmodal integration in neurological model diseases, and to transfer findings on beneficial effects of crossmodal integration to clinical applications such as virtual-reality (VR) enhanced training and brain stimulation. During the second funding period, the project will evaluate the developmental changes of neural representations and computations during crossmodal associative learning in healthy young and old adults by introducing a novel behavioural task and new analysis methods (Brodt et al., 2018). The unique feature of this project is that it will transfer findings on mechanisms of crossmodal integration to clinical applications. In particular, A3 will establish rehabilitative tasks based on crossmodal processing and transfer these to a VR scenario, with the ultimate goal of moving towards robust and adaptive human-cyber-physical interactions of high intensity.

Project A4 (C. Zhang, Wang, J. Zhang) approaches the “dynamics of crossmodal adaptation” topic from a computer science and robotics perspective, with the goal of learning crossmodal representations for robust robot behaviour. Both meta-learning (Vilalta & Drissi, 2002) and multi-agent collaborative techniques, which respectively focus on shared and dynamic crossmodal representation, will be studied in a challenging clothes-hanging scenario. Regarding transfer, the learned classifiers and contact models will be tested on humanoid platforms. The developed datasets and simulation environment will result in standardized robot benchmarking scenarios.

Project A5 (Wermter, X. Liu) focuses on developing crossmodal integration mechanisms for human-robot environments which have the potential to provide robots with more robust (Parisi et al., 2019), less ambiguous, and more accurate behaviour in different decision-making scenarios (Sheridan, 2016). In the first phase of the TRR, A5 focused on the integration of different modalities on a sensory level (Barros & Wermter, 2016). Initial neurocomputational midbrain models were developed based on the results of the investigation of different psychological mechanisms for auditory and visual integration. For the second funding phase, A5 will move beyond the sensory level and study the integration of different multimodal social cues to improve the robustness of Human-Robot Interaction (HRI) (Belpaeme et al., 2018). This will be done through the development of joint attention mechanisms based on the integration of affective behaviour, social engagement, and spatial references in communication (deictic communication).

Project A6 (Hu, Frintrop, Gerkmann) aims to develop novel algorithms to jointly process visual and acoustic cues to improve signal processing in both domains, concentrating on robust methods that can cope with cluttered real-world data, various noise types, and deliberately crafted adversarial data. In the first phase of the TRR, A6 focused on developing several deep learning models for visual and auditory information processing inspired by neuroscience. While current deep learning models have achieved great success in processing audio and visual signals in specific settings (Yu et al., 2018; Isola et al., 2017), they are not necessarily robust to arbitrary environmental noise or deliberately crafted perturbations (e.g., so-called adversarial noise). In the second phase of the TRR, A6 aims to develop robust crossmodal learning methods that can cope with the above-mentioned challenges. This will be done by using biologically inspired architectures and the principle of information gain via multiple modalities and sensors. To handle the data robustly and efficiently, attentional mechanisms (e.g., saliency) will be used to determine which parts of the data are most promising for processing.