The three thematic areas of the planned TRR
The projects of the planned TRR fall into three thematic areas: A) dynamics of crossmodal adaptation, B) efficient crossmodal generalization and prediction, and C) crossmodal learning in human-machine interaction. Each of these represents a key aspect of crossmodal learning, serving as a common category for the closely related projects within each area—as described next.
Thematic Area A: Dynamics of crossmodal adaptation
Projects in Area A will investigate the dynamics of the crossmodal learning process itself, attempting to discover what occurs and what should occur during learning when information arrives through multiple disparate modalities. While thematic areas B and C are concerned with the learning process as well, the projects in Area B focus more closely on the outcome of learning: the improved predictions and the improved integration that result from learning. Similarly, the projects in Area C focus on crossmodal learning as the basis for human-machine cooperation: the more intuitive interaction that will result when a machine’s conceptualization of the world is also grounded in crossmodal learning. The projects in Area A will investigate research issues such as how crossmodal signals are integrated in humans and machines and how this integration can improve over time (A1–6), how network structure and prior knowledge affect the crossmodal-learning process (A1–2, 5–6), how crossmodal representations promote abstraction, which in turn supports more efficient learning (A2–3, 6), and how crossmodal learning can change over time (A1, 3–5).
Thematic Area B: Efficient crossmodal generalization and prediction
Whereas the projects in Area A will focus on crossmodal learning and integration as dynamic processes, the projects in Area B will investigate the way in which crossmodal learning and integration impact generalization and prediction. Multimodal stimuli generally provide more information than their component unimodal stimuli, yet that extra information is only useful insofar as the stimuli can be integrated. The integrated percept allows better prediction and enhanced generalization. Projects in Area B investigate the processes by which crossmodal information can enhance generalization and prediction beyond that possible with unimodal information alone. The projects in this Area address research questions such as how biological brains create and exploit crossmodal memories that store multimodal information (B1, 3–4); how to resolve conflicts between unimodal components of crossmodal predictions (B2, 4–5); how humans incorporate signals from one modality to improve generalization and prediction in another (B1–5); and how such insights can be transferred both to statistical models to improve their predictions (B2) and to robots to improve their responses in complex, multisensory environments (B5). A better grasp of these issues will be instrumental in understanding the dynamics of crossmodal learning (Area A) as well as the application of crossmodal learning to human-machine interaction (Area C).
Thematic Area C: Crossmodal learning in human-machine interaction
Whereas Area A focuses on the dynamics of crossmodal learning, and Area B focuses on crossmodal prediction and generalization, the projects in Area C will investigate crossmodal learning from the perspective of human-machine interaction, addressing issues that specifically relate to the shared multimodal signals that are perceived by both human and machine, particularly during their interaction. The projects in this thematic area will explore issues such as how crossmodal signals are integrated and learned for speech perception and language understanding (C1, 4–5); how multiple sensory modalities are combined for interpreting signals of social communication, such as hand gestures, facial expressions, and vocal utterances (C3–4); and how motor control (such as eye movements and speech articulation) can provide the information needed to disambiguate rich multisensory information (such as vision and audition) to support a clearer understanding of both spoken and written language (C1–2, 5). What unites these research issues is their potential for improving human-machine interaction: by transferring to machines the knowledge we gain from these projects, we will provide the artificial systems greater common ground upon which interaction with humans can become more natural.
Three Integration Initiatives: II-T, II-M, II-R
Integration of research approaches: Integration initiatives and demonstrator
Due to the highly interdisciplinary nature of our research, our planned centre will place strong emphasis on collaboration and interdisciplinary cooperation. Every project is a proposal for collaborative research involving at least two PIs with different, complementary backgrounds. This emphasis on integration is woven deeply into both our overall research strategy and the focus of the research itself: each project combines multiple modalities, multiple computational or neurocognitive disciplines, and multiple approaches across a range of perspectives in different international research groups from both Hamburg and Beijing. Integration is therefore an intrinsic part of the centre, supported by the six common research objectives and the three integration initiatives described below. Because the focus of our investigation, crossmodal learning, is not yet a well-established discipline, we will introduce novel, interdisciplinary metrics of success. Thus, to focus, evaluate, and demonstrate the progress of the centre’s research, we will have three Integration Initiatives (IIs) in the first four-year phase of this TRR. While each project has its own specific goals, the Integration Initiatives will organize the collaborative efforts of the participants, advance the state of the art, make progress towards our six objectives (see above), and produce the robotics demonstrator. In all, there will be three initiatives:
- II-T: Theory,
- II-M: Models,
- II-R: Robotics.