Virtual agenda for the future

The rapid development of technology is creating new ways of storytelling, transforming traditional narratives into something more interactive and multisensory.

This can be observed, for instance, in Sensory Stories, a series of pieces on exhibit at the Museum of the Moving Image in New York until July 26, 2015. All pieces engage visitors' sight, hearing, touch, and smell through virtual reality (VR) experiences, interactive films, participatory installations, and speculative interfaces.

Birdly lets one fly around Manhattan’s buildings, giving the viewer control to maneuver through the borough; Evolution of Verse is a film that allows viewers to float over miles of lakes and mountains; Herders and Clouds over Sidra are short documentaries whose characters seem like actual people rather than performers; Hidden Stories comprises a series of objects on the museum wall with sensors that reveal audio about the objects; listeners can even record their own “snippets”. A list of all the pieces can be found on the Museum's website.

Birdly (Image: Thanassi Karageoriou, Museum of the Moving Image)

Hidden Stories (Image: Thanassi Karageoriou, Museum of the Moving Image)

Charlie Melcher, founder and president of Melcher Media and the Future of Storytelling, examines this technological shift from passively reading text to engaging with stories in more active, virtual forms. In a Wired article, Melcher explains that “we are leaving this age defined by the alphabet. … We are in a process literally of transforming from an alphabet mind into one that is networked, that is more based on connections between things rather than hierarchies.”

From Text to Scene

According to Rouhizadeh et al., today’s professionals and researchers are bridging the gap between language, graphics, and knowledge by converting text into “a new type of semantic representation”—that is, a virtual three-dimensional scene.

One such effort is the MUSE Project (Machine Understanding for Interactive StorytElling), which is developing a translation system to transform texts into three-dimensional virtual worlds. Specifically, this system-in-the-making would work by processing the language of a given text and turning it into actions, characters, situations, plots, settings, and objects configured in virtual three-dimensional worlds, “in which the user can explore the text through interaction, re-enactment, and guided game play”.

Thus far, Prof. Marie-Francine Moens, the project's coordinator, and her team have successfully created a system capable of processing texts in terms of semantic roles in sentences (who, what, where, when, and how), spatial relations between objects, and the chronology of events.
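To picture what extracting semantic roles from a sentence involves, here is a deliberately simplified, rule-based sketch. It is an invented illustration, not the MUSE system's actual code; real systems use statistical semantic role labeling rather than the toy preposition-splitting below, which only handles sentences of the form "subject, verb phrase, in place, at time":

```python
# Toy illustration of semantic-role extraction: split a simple
# sentence into who / what / where / when fields. The patterns
# here are invented for illustration; real semantic role labeling
# is far more robust.

def extract_roles(sentence: str) -> dict:
    """Split a simple sentence into who/what/where/when fields."""
    roles = {"who": None, "what": None, "where": None, "when": None}
    rest = sentence.rstrip(".")
    # Peel off a trailing "at <time>" phrase, if present.
    if " at " in rest:
        rest, time = rest.rsplit(" at ", 1)
        roles["when"] = time
    # Then a trailing "in <place>" phrase.
    if " in " in rest:
        rest, place = rest.rsplit(" in ", 1)
        roles["where"] = place
    # Treat the first word as the subject, the rest as the action.
    words = rest.split()
    roles["who"] = words[0]
    roles["what"] = " ".join(words[1:])
    return roles

roles = extract_roles("Anna reads a book in the garden at noon.")
# roles["who"] == "Anna", roles["where"] == "the garden"
```

Once a pipeline has filled in roles like these for each sentence, along with spatial relations and event order, a renderer can instantiate the corresponding characters and settings in a virtual world.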

Additionally, this European Union-funded project has also been experimenting with children’s stories and patient education materials, “translating natural language utterances into instructions in a graphical world”. A video demonstration of the project can be found on their website.

In a CORDIS (Community Research and Development Information Service) announcement, the team reveals their plans to bring this text-to-scene technology to market and make it commercially available to the public.

The Text-to-Scene Trend

Other up-and-coming systems are following suit, converting texts into graphical worlds in the hopes of reaching the market.

For instance, a web application called WordsEye also allows users to create three-dimensional scenes from basic textual descriptions, an action they refer to as ‘typing a picture’. These descriptions consist not only of spatial relations, but also of actions performed. Programs such as WordsEye make creating three-dimensional graphics effortless, immediate, and less time-consuming, requiring no special skills or training. Bob Coyne from Columbia University and Richard Sproat from Oregon Health and Science University report that “there is a certain kind of magic in seeing one’s words turned into pictures” using such software.
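The 'type a picture' idea can be sketched in miniature: spatial relations parsed from a description become coordinate offsets that place objects in a 3D scene. The relation vocabulary and offset values below are hypothetical, chosen only to illustrate the principle, and are not WordsEye's actual internals:

```python
# Hypothetical sketch of text-to-scene placement: each spatial
# relation maps to an (x, y, z) offset applied to a reference
# object's position. Relations and offsets are invented here.

OFFSETS = {
    "on": (0.0, 1.0, 0.0),        # stacked one unit above
    "left of": (-1.5, 0.0, 0.0),  # shifted along the x-axis
    "behind": (0.0, 0.0, -1.5),   # shifted along the z-axis
}

def build_scene(relations):
    """Place objects from (object, relation, reference) triples."""
    positions = {}
    for obj, relation, ref in relations:
        # An unseen reference object is anchored at the origin.
        base = positions.setdefault(ref, (0.0, 0.0, 0.0))
        dx, dy, dz = OFFSETS[relation]
        positions[obj] = (base[0] + dx, base[1] + dy, base[2] + dz)
    return positions

scene = build_scene([
    ("cup", "on", "table"),
    ("chair", "left of", "table"),
])
# The cup lands one unit above the table; the chair sits to its left.
```

A full system layers much more on top of this, such as resolving object sizes, orientations, and default poses, but the core move is the same: parsed language becomes scene geometry.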

Similarly, Learn Immersive helps teach languages using VR by “[generating] scene descriptions and text translations” of real-world environments. According to co-founder Tony Diepenbrock, who spoke to Gizmag, becoming fluent in a foreign language in a sensible timeframe requires full immersion. Diepenbrock described his frustration with how American schools teach languages: “I studied French for 12 years, but when I tried to speak it in the country, often times foreigners would respond to me in English. … You need to immerse yourself in situations where you need to figure out what to say”. Learn Immersive addresses this problem by transporting users to environments where the target language is native and predominant.

Learn Immersive (Image: Panoptic Group)

How Does This Trend Affect Us?

Text-to-scene technology can have tremendous impact on multiple industries, in particular the gaming industry. CORDIS points out that the success of some history-based video game franchises has demonstrated that interactivity can make telling and listening to stories more appealing and memorable.

Thus, programs such as WordsEye, Learn Immersive, and the MUSE Project may be used to ignite a passion for learning, particularly in children. Used in schools, text-to-scene technology can make teaching programs more effective and impactful.

Researchers from Columbia University and Oregon Health and Science University suggest applications of this technology in education, including ESL, EFL, special needs learning, vocabulary building, and creative storytelling. In 2007, K-12 public school teachers from the Albemarle county school system in Virginia tested WordsEye in their classes. A 5th and 6th grade teacher, who used the program to improve students’ descriptive writing, reported that students were “very eager to use the program and came up with some great pictures”.

These findings coincide with the view of Prof. Moens, who believes that the MUSE Project will eventually lead students to “become part of the story … instead of [just] reading or studying a rather boring historical text”. Moens also adds that “such a [virtual] environment would stimulate the understanding of the text and the memorization of its content”, as the visuals can be adapted to correspond to the reading level of the individual.

What’s Next?

The potential benefits of this new technology are undoubtedly exciting, but we need to keep in mind that text-to-scene conversion systems are still works in progress. Fundamental follow-up research still needs to be conducted, in particular on multimodal representation learning.

Until then, we just have to wait until we can see our stories.

Forecasted start year: 
2020 to 2025

