Using Richard Mayer and Roxana Moreno's (2003) research as a framework, we designed our multimodal courses around dual-coding theory, which posits that learners process verbal representations differently from nonverbal ones. As Mayer and Moreno suggested, "the human information-processing system consists of two separate channels—an auditory/verbal channel for processing auditory input and verbal representations and a visual/pictorial channel for processing visual input and pictorial representations" (p. 44). Because learners use different processing systems to understand what they see and hear, presenting words and pictures simultaneously requires these two channels to work together. Students learn to make referential connections between the information presented verbally and visually, which leads to a greater understanding of the material (Mayer & Sims, 1994).
Indeed, prior research indicates that multimodal instruction can improve learning more than instruction with only one medium (Mayer & Sims, 1994; Mayer, Moreno, Boire, & Vagge, 1999). For example, in a study of the cognitive learning processes of undergraduate students, Mayer and colleagues (1999) found that learners "are better able to construct mental models when corresponding visual and verbal representations are in working memory at the same time," adding that learners process information more easily when narration and animation are presented together than when animation is followed by narration (p. 642). Furthermore, Charles Fadel (2008) claimed that "students engaged in learning that incorporates multimodal designs, on average, outperform students who learn using traditional approaches with single modes" (p. 13).