JMM 8, Winter 2009, section 2
Mariana Julieta Lopez & Sandra Pauletto
THE DESIGN OF AN AUDIO FILM
Portraying Story, Action and Interaction through Sound
Film, television, theatre performances and museum tours use audio description to enable visually impaired people to access these forms of art. A consequence of including these descriptions, however, is that visually impaired audiences cannot access the work directly; they have to rely on a describer.
The aim of this project was to design an alternative to audio description for films.
In order to explore the potential of this format an example based on Roald Dahl’s Lamb to the Slaughter (1954) was designed for a 6.1 surround sound setup.
Lamb to the Slaughter is the story of Mary Maloney, who, after finding out that her husband plans to leave her, murders him with a frozen leg of lamb and feeds the murder weapon to the police.
In this project the term audio film was chosen for two main reasons: firstly, because the final work is to be experienced in a cinema environment, and, secondly, because certain elements of the filmmaking process might be adapted for the conveyance of a story through sound, creating an experience equivalent to the cinematic experience.
The environment in which we look at/listen to a piece and the social circumstances in which we experience it contribute enormously to how we define that piece. In particular, the experience of going to the cinema is determined not only by the nature of the work we are consuming (storytelling through an audio-visual medium), but also by how we perceive it. A cinema has very distinct sound characteristics: it is acoustically isolated from the sonic world that surrounds it, and the sound is reproduced in surround through high quality speakers with a wide dynamic range and frequency spectrum. The degree to which the environment provides acoustic immersion plays a fundamental role in how well the film captivates the spectator. Going to the cinema is also a social experience that we share with friends and strangers, in a non-domestic place, where we cannot do anything other than watch/hear the film. These characteristics make it a very different experience from, say, hearing a radio play or watching a film at home on television, where we can be distracted and where the acoustic and technological characteristics cannot create the immersion required by the cinematic experience.
Moreover, in a cinematic environment the sound designer can manipulate the audience’s perception of the surround sound world in great detail, allowing him/her to tell the story by driving the imagination, expectation, understanding and emotions of the audience.
In this context, Chion’s division of listening into three modes is useful. These three modes are: causal listening, semantic listening and reduced listening (Chion 1994, 25). Causal listening “consists of listening to a sound in order to gather information about its cause (or source)” (Chion 1994, 25), semantic listening refers to listening to a specific code in order to understand a message, and reduced listening refers to the process of focusing on “the traits of the sound itself, independent of its cause and of its meaning” (Chion 1994, 29).
These three listening modes are used in the audio film. Causal listening is constantly employed since the listener needs to identify the objects referred to in order to follow the plot. Semantic listening is used when a musical theme indicates an idea or situation. Finally, reduced listening might also be brought to bear when listening to the music just as music and not considering its meaning in context.
A further aspect explored in this project is the adaptation of different cinema languages, in particular the use of the master scene and the interpersonal cinema language.
Ron Richards defines the master scene as the use of one shot “that establishes the environment and the people, and records the event or action in its entirety.” (Richards 1992, 89) This can be used to introduce the different spaces by presenting all the sounds heard in that particular environment. In this way the space is aurally established as it would be visually established by means of a long shot.
Also, the interpersonal cinema language, which focuses on involving the audience with the emotional states of the characters, can be achieved through the content of the lines delivered, the expressiveness in the voices and the use of music to emphasise the characters’ feelings.
The notion of cuts through sound can be explored to indicate parallel actions taking place in different spaces. Also, an effect comparable to that of a tracking shot can be achieved by editing the sounds in a way in which a character’s movements are followed through different spaces.
2.2. Related Fields And Applications
2.2.1. Auditory Displays and Sonic Interaction Design
The concept and design of an audio film is of great interest for two related research areas: auditory displays and sonic interaction design (SID) (Rocchesso et al 2008).
Auditory displays focus on the study of how information (data, messages, feedback, etc.) can be portrayed through - usually - non-speech sound. Sonic interaction design concentrates on how sound can be designed to portray interactions between humans and/or objects in an informative way. In both research areas the designer needs to address two very important aspects: the informative aspect and the aesthetic/appropriateness aspect.
These are the same challenges faced by the designer of an audio film. In an audio film the sounds (ambiences, sound effects, sounds of actions, vocal sounds, music, etc.) need to be informative so that the storytelling is effectively portrayed. Strategies are found to either use or overcome the ambiguities in meaning of non-speech sound. Moreover, to maintain audience engagement, the aesthetic aspect of sound is considered in depth.
For these reasons understandings and techniques developed by means of the practice of designing an audio film are useful for the more general fields of sonic interaction design and auditory displays. A workshop on the relationship between sonic interaction design and film and theatre sound design was organised by the second author in the Department of Theatre, Film and TV (The University of York) in April 2009 and more information on this can be found in Pauletto et al (2009).
2.2.2. Audio Description
One major application of an audio film is to provide an alternative to audio description. The concept of audio description was first introduced and studied by Gregory Frazier in 1975 and consists of an audio track in which “a describer inserts spoken words to provide representations of information contained in the visual field of the production” (Piety 2003, 1).
An important analysis of audio description is the one provided by Philip J. Piety. Piety points out four components of audio description: source text, modified text, describer and consumer (Piety 2003, 28). The source text is the original text, generally aimed at a sighted audience. The modified text is the conjunction of the source text and the descriptions that have been inserted. The describer refers to the person in charge of the creation and the placement of the descriptions and the consumer to the recipient of the modified text (Piety 2003, 29).
Although audio description has enabled visually impaired people to access visual forms of art, it presents several limitations. Firstly, it evaluates and summarizes the scene described, leaving the consumer “less opportunity to independently assign meaning” (Piety 2003, 35).
Secondly, audio description must not overlap the dialogues. This means that visual aspects are not described during dialogues, leaving the listener without part of the information. Additionally, descriptions are constrained to the gaps between lines, which may not be long enough to provide the necessary descriptions.
A further limitation is related to the situations in which descriptions are synchronized to the sound effects. Descriptions mask what might be important audio clues, which might be interpreted without the need of descriptions.
Another limitation is that consumers cannot experience the source text on its own because it does not contain enough information for them to follow the plot.
The audio film proposed attempts to solve most of these limitations by providing just a source text that enables the visually impaired listener to experience the audio film in the same way as it would be experienced by a sighted listener. Also, by eliminating descriptions, the listener will be able to use his/her imagination more freely. Furthermore, the fact that this format does not need a describer would eliminate another disadvantage of audio description: the way in which the describer decides to present the information modifies the way in which the film is experienced (Piety 2003, 73).
2.2.3. Audio-Only Games
The design of an audio film can learn from techniques used in audio-only games.
An interesting approach is the one taken by the TiM (Tactile Interactive Multimedia) project carried out by SITREC (Stockholm International Toy Research Centre).
SITREC has worked on the design of Tim’s Journey, an adventure game based on surround sound, which takes place on an island divided into different spaces. In order to enable the player to recognize them, each of these spaces is characterized by a musical theme. Also, in order to make navigation easier, the footsteps of the avatar provide information regarding the surface on which it is walking (Friberg & Gärdenfors 2004).
Both design strategies are used for the design of an audio film in order to clarify the plot.
2.2.4. Radio Drama
The concept of audio film bears some resemblance to radio drama. The main differences are that an audio film does not use narration and employs surround sound to convey information.
William Ash defines a radio play as “a story told in dramatic form by means of sound alone.” (Ash 1985, 1) This definition points out the main aspect of a radio play: its sole reliance on sound for storytelling.
Rattigan asserts that radio drama is a format that in itself does not need visual elements, it is “not handicapped by the absence of any visual output, on the contrary, its ‘sightlessness’ is the basis of its unique appeal, which promotes an imaginative visualization on the part of its listeners” (Rattigan 2002, 1).
An audio film is also a format that in itself does not need images. On the contrary, it is designed with a purely aural approach, which stimulates the listener’s imagination.
An audio film also includes the same elements present in the radio drama: silence, pauses, voice, sound effects, utterances and music (Rattigan 2002).
As regards sound effects, Rattigan argues that they need a verbal contextualization to acquire meaning for the listener regarding their intention and their consequence. At the same time, he warns that the intentions stated should not be naïve or overstated (Rattigan 2002, 154).
Crisell also suggests this need for contextualization by stating that “it seems doubtful whether any radio sound is ultimately meaningful without the help of speech” (Crisell 1986, 141).
Although it could be agreed that sound effects that are not contextualized might disorient the audience, speech contextualization might not be the only alternative. Music could be used to denote certain meanings in those cases in which verbal statements would destroy the mood of a scene of a radio drama or an audio film.
Furthermore, it seems that by considering that sound effects must be contextualized through speech, a great stress is being placed on the voice while denying the possibility of communicating full meaning through non-speech sounds.
These opinions seem to underestimate the ability of listeners to put together different sound effects in their minds to reconstruct meanings, or to use clues other than speech to interpret those sounds. They also seem to suggest that sounds need to be clarified immediately, whereas meanings might become clearer as the story evolves in time and in the listeners’ minds.
In the design of the audio film the different uses of music in films are also of relevance, including its use to convey information about a character’s emotional state, create a general atmosphere or a specific mood, and its employment as a means of narration (Levinson 1996).
2.3. The Production Stage
2.3.1. The Adaptation Of Roald Dahl’s Story
A script was written providing details on the sound elements that make up the soundscape as well as on the actions included. Details on the feelings of the main characters were included both to provide guidance for the performance and to be used as hints for the selection of music. Dialogues and internal monologues were based on Dahl’s story as well as on Hitchcock’s adaptation for the series Alfred Hitchcock Presents (1958).
2.3.2. Recordings and Editing
All voice recordings were done using an AKG C 414 B-XLS microphone. In addition to the dialogues, the actors were asked to perform breathing and chewing sounds and utterances. In the case of Mary Maloney, different recordings of sobbing and laughs were also done.
The sound effects employed in Lamb to the Slaughter include original recordings (done in studio and on location) as well as sounds taken from sound libraries (Sound Ideas Series 6000 General Sound Effects Library, www.soundsnap.com and Hollywood Edge Sound Effects Library).
The editing process was done using Logic Pro 8, with the exception of the blow with the frozen leg of lamb, which was designed using Pro Tools LE 7.4.
It was decided that some of the dialogues ought to overlap to achieve a more dynamic and realistic work. This choice was based on Robert Altman’s ideas regarding film sound. In order to achieve life-like movies, Altman concentrated in his work “not in rapid alternating of short lines of dialogue but on dialogue simultaneously spoken by two or more characters: a wall of sound, a Tower of Babel” (Schreger 1985, 349).
Sonnenschein defines a soundmark as a type of sound that “establishes a particular place, as does a landmark, possessing some unique quality for only that location” (Sonnenschein 2001, 183).
Soundmarks have been used for indoor and outdoor spaces to aid the listener in their recognition. Indoor soundmarks include a cuckoo clock and a freezer while outdoor soundmarks include two types. The first conveys the idea of a residential area during daytime, while the second is used to communicate the idea of nighttime in the same area.
Mary finds Patrick dead: cuckoo clock soundmark (music from Gitimalya by Toru Takemitsu)
Footsteps were considered of the utmost importance, since they are the means by which movement through different spaces is represented. When editing the footsteps different issues were considered. Firstly, they were deemed essential as a means of communicating information regarding the movements of the characters as well as their emotional states.
Secondly, an attempt was made to create a sense of distance through the footsteps. The number of footsteps that it takes a character to get from one space to another should be as consistent as possible.
Thirdly, footsteps are used to remind the listeners of the presence of those characters who in certain moments might not be taking part in dialogues, but are meant to reappear later on in other conversations.
2.3.6. Sonic Interaction Design: Character-Object Interaction
This category refers to those sounds that represent objects with which the characters interact. When necessary, careful attention was paid to presenting complete sequences of actions. In other circumstances it was, however, considered that reducing the number of sounds used to represent an event was more effective.
On the other hand, some sound effects were included because they clarified the meaning of certain events. This is the case of the door bell included when Mary enters the grocery store.
Sound layering was also used in an attempt to communicate meaning more effectively. This technique was used in the design of the murder scene in particular for the sound of Mary hitting Patrick with the frozen leg of lamb.
Crime scene, murder with leg of lamb (music from Gitimalya by Toru Takemitsu)
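As a rough illustration of sound layering, the sketch below sums several recordings into one composite effect. It is not the authors' Pro Tools session: the function name, the gains and the sample offsets are all hypothetical, and the source does not state which component sounds made up the blow.

```python
import numpy as np

def layer_sounds(layers):
    """Sum (signal, gain, offset_in_samples) layers into one mono track.

    All gains and offsets are illustrative assumptions, not values
    taken from the audio film.
    """
    length = max(offset + len(sig) for sig, _, offset in layers)
    mix = np.zeros(length)
    for sig, gain, offset in layers:
        # Place each layer at its offset and add it to the running mix.
        mix[offset:offset + len(sig)] += gain * np.asarray(sig, dtype=float)
    # Scale down only if the summed layers would clip.
    peak = np.max(np.abs(mix))
    return mix / peak if peak > 1.0 else mix
```

Each layer can then be adjusted independently, for instance by delaying a low-frequency thud slightly behind a sharper impact sound.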
It is essential to mention that the character-object interaction sounds are the ones which might cause listeners more difficulty when assigning them to a particular landscape; this is, “the source from which we imagine the sounds to come” (Wishart 1996, 136).
In the editing process, however, it was considered that contextualization might aid the listener in the recognition of sound sources. This contextualization is determined by other sound effects, as well as by dialogues, music, and spatialisation.
2.3.7. Internal Sounds – Utterances
In the audio film Mary was assigned internal sounds (Chion 1994) in the form of breathing in order to try to generate sympathy for her character and aid the listener in understanding her state of mind. When no dialogues were present, verbal exclamations were also used to make actions clearer to the listeners and avoid confusion.
Utterances were also important to indicate that, although some of the characters might not be delivering lines, they were present in the different scenes.
Two music pieces were included in Lamb to the Slaughter: Steve Reich’s New York Counterpoint (Second Movement) and Toru Takemitsu’s Gitimalya.
New York Counterpoint was used at the beginning to convey the idea of tranquillity and expectation described in the script. It was chosen as a means of indicating Mary’s initial sweetness towards her husband. It was decided that the theme would be used to accompany Mary’s attempts to communicate with Patrick, and the music would fade out as those attempts fail. Instead of using music to fill in the silences, it was used to accompany the dialogue and emphasize Patrick’s rudeness and his silence.
Beginning of Audio Film (music from New York Counterpoint by Steve Reich)
Gitimalya is used throughout the audio film with various functions associated with film music. Firstly, music is employed to clarify Mary’s feelings towards the various situations she goes through. This is particularly important in three different moments: when Patrick tells Mary he wants a divorce, when she is getting ready to go to buy the leg of lamb and finally, when she finds Patrick dead.
In these moments, besides clarifying Mary’s emotional state, music is employed to attempt to generate a feeling of empathy for the main character (Sonnenschein 2001, 182). This music editing decision was made with the hope that it might cause the audience to feel that Mary’s actions are at least partially justified.
The use of music when Mary sees her husband’s dead body is of the utmost importance, since it was considered necessary to make as clear as possible the fact that she was genuinely shocked.
Secondly, music is also applied as a tool to create tension. This is particularly important in the murder scene in which music was employed to enhance the tension and impact the listeners. This can be related to the concept of modifying music. According to this concept, music, due to its expressive qualities, is capable of modifying a scene (Carroll 1991, 219).
Noël Carroll points out that in a film scene, elements such as dialogue, visuals and sound effects can serve as indicators that tell the viewers what the scene is about, however,
The music then modifies or characterizes what the scene is about in terms of some expressive quality. In a manner of speaking, the music tells us something, of an emotive significance, about what the scene is about; then supplies us with, so to say, a description (or, better, a presentation) of the emotive properties the film attaches to the referents of the scene. (Carroll 1991, 221).
This seems of significance for the audio film format, since the only elements that the audience counts on for interpreting a scene are aural; these are dialogues, sound effects and music. This is also of relevance in this particular scene since, besides the music, the only elements included are Mary’s footsteps and the sound of the blow with the leg of lamb. If the music were to be deleted, these two sounds would probably not convey enough information for the audience to understand the scene and might cause confusion in the listener.
Thirdly, the marimba theme present in Takemitsu’s work is used as a leitmotif, that is, a musical theme assigned to a “main character or key thematic idea of the narrative” (Chion 1994, 51). This theme is used as a leitmotif for the murder weapon. It is included every time the subject of the weapon is mentioned, to aid the listener in understanding the story. If the listener does not grasp the fact that Mary killed Patrick with the frozen leg of lamb, this might help him/her understand this key idea.
This motif is also employed with a slightly different function in the scene in which Mary offers the men the leg of lamb for dinner. In this particular case the motif starts before Mary mentions the leg of lamb. Its function is that which Richard Davis describes as “Revealing unseen implications […] The music can tip us off to what is going to happen, both in a suspenseful way, and in a way that resolves a situation” (Davis 1999, 143-144). By including the murder weapon theme before Mary offers the meat, the listener might anticipate that she is going to get rid of the murder weapon.
This theme also accompanies Mary’s final laugh as the men finish up the leg of lamb, emphasizing the fact that she got rid of it.
Additionally, the use of Gitimalya in the audio film has the function of connecting different scenes, creating continuity throughout the work and helping “to create a consistency of tone or feeling” (Piety 2003, 511).
Furthermore, it was considered that the inclusion of music was essential to try to keep the listener’s attention and as a consequence immerse him/her in the sound work. As Alec Nisbett points out, the pure sound medium can create solid magic sceneries, but these can rapidly dissolve if the listener’s attention is lost (Nisbett 1995, 328).
2.4. Sound Processing
An important aspect of sound processing was the use of reverberation to provide the listener with information about the different aural spaces and provide every sound with an environmental context (Murphy 2000) that might allow the listener to identify where an event is taking place. This is of particular importance in the case of visually impaired people since they have a strong motivation to develop spatial awareness due to its importance as an orientation tool (Blesser & Salter 2007). For this reason, artificial reverberation was considered essential to enable listeners to recognize the different spaces presented.
The voices were processed using the Space Designer plug-in of Logic Pro 8, and a different setting was assigned to each space.
Reverberation was also applied to indicate that some of the lines delivered by Mary were part of her thoughts.
Footsteps were considered an important element since they are a way in which listeners may be aware of certain characters and might even identify them. Consequently, the footsteps belonging to different characters needed to be differentiated through sound processing to try to aid the listener in their distinction. In order to endeavour to achieve these differences the Channel EQ plug-in of Logic Pro 8 was employed.
Footsteps were also processed with the Space Designer plug-in to convey the idea of the different surfaces characters walk on. Furthermore, the same reverb settings used for the voices in the different spaces were mostly maintained.
2.4.3. Sounds Heard Through Windows
Another important part of the sound processing stage was the processing of those sounds that were supposed to be heard through windows. In an attempt to create this effect reverb was added by means of the Space Designer plug-in and equalization was applied to reduce the low frequencies and boost the high frequencies.
2.4.4. Sounds Heard From Other Rooms
Another effect added through processing was that of sounds being heard from other rooms, which was achieved with the Space Designer plug-in. This effect was of particular importance at the end of the audio film. It was a creative decision to end the audio film by simulating a cut from the kitchen, where the policemen are eating the leg of lamb and commenting on the murder weapon, to the living room, where Mary is listening to their conversation. This is significant because until that moment the whole audio film had been centred on the spaces in which Mary was present, and this is the only moment in which the listener is taken to a room Mary is not in. To emphasize this, the sound of the clock ticking, which characterizes the living room, was processed with this reverb setting. As the audio film cuts to Mary sitting in the living room, this shift of spaces is indicated by the removal of the reverb effect from the clock, the addition of the effect to the sounds coming from the kitchen and, more explicitly, by the processing applied to Jack’s voice in the kitchen.
2.4.5. The Back Voice
The term back voice refers to those cases in which sound is processed to indicate that someone is speaking with his/her back to the listener (Chion 1994). The back voice was of significance in the design of the murder scene, since, as Mary approaches Patrick with the leg of lamb, he talks to her with his back to her, so he does not see her approaching him with the murder weapon. Since there are no visual elements, it was necessary to indicate this by applying reverberation to Patrick’s voice and reducing the direct sound output.
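The back-voice treatment, more reverberation and less direct sound, can be sketched as a wet/dry mix. The example below is only a sketch under stated assumptions: it stands in for the Space Designer plug-in with a synthetic impulse response (exponentially decaying noise), and the function name, gains and decay time are hypothetical rather than the settings used in the audio film.

```python
import numpy as np

def back_voice(dry, sample_rate=48000, dry_gain=0.3, wet_gain=0.7,
               decay_seconds=0.4, seed=0):
    """Blend an attenuated direct signal with a reverberated copy.

    The impulse response is synthetic and the default gains are
    illustrative assumptions, not measured or published values.
    """
    rng = np.random.default_rng(seed)
    n = int(sample_rate * decay_seconds)
    t = np.arange(n) / sample_rate
    # Noise shaped by an exponential envelope that falls by roughly
    # 60 dB over decay_seconds (exp(-6.9) is about 0.001).
    ir = rng.standard_normal(n) * np.exp(-6.9 * t / decay_seconds)
    ir /= np.sqrt(np.sum(ir ** 2))  # normalise to unit energy
    wet = np.convolve(dry, ir)      # reverberated copy, longer than dry
    out = wet_gain * wet
    out[:len(dry)] += dry_gain * dry  # add the reduced direct sound
    return out
```

Lowering `dry_gain` relative to `wet_gain` pushes the voice further "away" from the listener; the right balance would be set by ear.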
2.4.6. Spatialisation – Visualisation and Panning
The approach to spatialisation in an audio film can break with the common cinematic conventions that see dialogue, the majority of sound effects and music coming from the front speakers and only some effects and ambiences coming from the surround speakers. In an audio film there is no screen and therefore there is no strong bias towards the front: this format allows full experimentation with 3D audio. The results of this experimentation could stimulate new uses of surround sound in conventional cinema, making it a more interesting device for the process of storytelling (Manolas and Pauletto 2009).
In order to plan the process of spatialisation in this project the layouts of the living room, the kitchen and the bedroom were designed employing the 3D interior design software, Interiors Professional. See example of visualisation in Figure 1.
Figure 1 Layout of living room designed with Interiors Professional.
Following the visualisation of rooms the surround panning was planned. See Figure 2 for example.
Figure 2 Layout of living room according to the 6.1 surround sound setup.
In the scenes in the street the outdoor sounds were spread throughout all the speakers to indicate that Mary can hear these sounds around her.
The most challenging aspect of the process of spatialisation was to indicate the movement of the characters through different spaces. Since the same channels are being used for every space and for the connections between them, it was taken into consideration that listeners would have difficulties recognizing the transitions from one space to another.
In order to attempt to solve this problem, each room entrance was assigned to a specific channel. These entrances were kept consistent throughout the audio film, and the spaces between them were considered as corridors connecting the rooms.
Figure 3 Connections between spaces according to the 6.1 surround sound setup.
Another problem encountered was how to indicate that the characters were moving away from a specific space. In order to achieve this effect volume automation was employed, lowering the volume of specific soundmarks as the characters walk away from a room. The panning remains the same, however, failing to indicate the positions of the rooms in the house. For example, when Mary exits the living room, the cuckoo clock is still panned to the C channel, even when she is in a corridor and the sound would have changed its position in relation to Mary.
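The volume automation described here amounts to a gain ramp applied to a soundmark while its panning stays fixed. A minimal sketch follows, assuming a ramp that is linear in decibels; the function name and the -24 dB end point are hypothetical, since the source does not give the automation values.

```python
import numpy as np

def fade_soundmark(signal, start_gain_db=0.0, end_gain_db=-24.0):
    """Apply a gain ramp, linear in decibels, across the whole signal.

    The default end gain is an illustrative assumption.
    """
    gains_db = np.linspace(start_gain_db, end_gain_db, num=len(signal))
    # Convert decibels to linear amplitude factors before scaling.
    return signal * 10.0 ** (gains_db / 20.0)
```

Because only the level changes, the soundmark still appears to come from the same channel, which is the limitation noted above.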
In most cases the footsteps in the rooms were panned in a circular manner to make them more evident. This was done in this manner even though the visualisation process might have suggested more direct paths for the characters’ movements within the rooms.
Regarding Mary’s footsteps going up and downstairs, the best option found was to use circular motion as if it were a spiral staircase.
When Mary is walking in the street, since no fixed direction was determined, circular motion was also employed.
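Circular panning of footsteps can be sketched as pairwise constant-power panning between adjacent loudspeakers. The speaker azimuths below are assumptions (the source only states 3 front and 3 surround channels plus a subwoofer), and the function names are hypothetical; the sketch is not a description of the Logic Pro surround panner.

```python
import math

# Assumed speaker azimuths in degrees (0 = front centre, clockwise positive).
SPEAKERS = [("C", 0.0), ("R", 30.0), ("Rs", 110.0),
            ("Cs", 180.0), ("Ls", 250.0), ("L", 330.0)]

def pan_gains(azimuth_deg):
    """Constant-power gains for a source at the given azimuth."""
    az = azimuth_deg % 360.0
    gains = {name: 0.0 for name, _ in SPEAKERS}
    for i, (name_a, ang_a) in enumerate(SPEAKERS):
        name_b, ang_b = SPEAKERS[(i + 1) % len(SPEAKERS)]
        span = (ang_b - ang_a) % 360.0   # arc between the speaker pair
        offset = (az - ang_a) % 360.0    # source position within that arc
        if offset < span:
            frac = offset / span
            # Sine/cosine law keeps the summed power constant.
            gains[name_a] = math.cos(frac * math.pi / 2)
            gains[name_b] = math.sin(frac * math.pi / 2)
            break
    return gains

def circular_path(steps):
    """Gains for `steps` footsteps spread evenly around one full circle."""
    return [pan_gains(360.0 * k / steps) for k in range(steps)]
```

Each footstep sample would be multiplied by the gains for its position on the circle, so that successive steps travel around the listener.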
In some cases footsteps were also sent to all the channels to indicate that the characters are approaching the centre of the room. Music was reproduced from all the channels to create a sense of envelopment. The same treatment was given to Mary’s interior monologues to emphasize the fact that they are thoughts, so they do not need to be placed in an exact position.
2.5. Audience Feedback
In order to test the effectiveness of the audio film, trial sessions were arranged in which 13 attendees were asked to listen to the work and complete a questionnaire (see Appendix) that included questions regarding the perceived clarity of the audio film, plot understanding, and the recognition of different characters, spaces and sound sources.
These sessions had two purposes: firstly, to verify whether this format might be of interest for the general public, and secondly, to assess whether it might be used as an alternative to audio description. Contacts were made with various associations of visually impaired people, but unfortunately, in the limited time available for the project, it was not possible to arrange trial sessions. We hope to organise trial sessions with visually impaired people in the near future. For this reason, the results reported here correspond to the feedback received from sighted volunteers.
The sessions took place in the Audio Production Suite at the Department of Theatre, Film and Television at the University of York. The work was played using Genelec loudspeakers, which were arranged using a 6.1 surround sound setup (3 front channels, 3 surround channels and 1 subwoofer). The subjects listened to the piece individually so that they could sit in the ‘sweet spot’ (the best possible listening position) at the centre of the speaker setup.
The subjects were selected from groups of students from different programmes such as theatre, writing and performance and visual effects in postproduction at the University of York. It was considered of importance to select subjects from various areas of study in order to evaluate if this format could attract people with different interests and not just an audience interested specifically in sound and sonic art.
We first asked whether the attendees had prior knowledge of the plot (see Figure 4); the majority did not. Then, after listening to the audio film, we asked the attendees to summarise it.
Figure 4 Measure of prior knowledge of the plot.
Figure 5 Accuracy of the summaries provided by the listeners who declared no prior knowledge of the plot.
Figure 5 shows the accuracy of the summaries provided by the listeners who declared no prior knowledge of the plot. This result is encouraging since the majority were able to understand the storyline. It is important to note that the volunteers who did not understand the plot were non-native speakers who mentioned having problems following the dialogues.
We measured the clarity of the audio film by asking the audience to name the characters they recognised, to name the rooms and spaces they identified and to mention which sound elements helped in the identification.
Figure 6 Measure of perceived clarity of audio film.
The main characters (Mary and Patrick) were recognised by all the listeners (see Figure 6). The majority of the listeners recognised two other important characters: Jack (the main policeman) and Sam (the man at the grocery). The recognition of the remaining characters, however, seems to have caused difficulties, and none of the listeners were able to list all of them. This is probably because the number of characters increases as the plot moves forward, and it becomes more challenging to remember each of them distinctly.
Figure 7 Measure of recognition of spaces.
The spaces that were most widely recognised are those that were assigned either soundmarks or sound effects typically related to those spaces.
Figure 8 Measure of elements that enabled the recognition of different spaces.
The two most prominent clues for space recognition mentioned in the questionnaires were the presence of characteristic sound effects such as cutlery in the kitchen, and the presence of soundmarks, such as the clock or street sounds.
The design of Lamb to the Slaughter has demonstrated that it is possible to present a clear storyline solely through sound by employing sound effects, sound processing and surround sound to convey information, eliminating the need for a narrator.
Sound effects were used both to represent actions and as soundmarks to aid listeners in identifying the different spaces included in the audio film. In addition, artificial reverberation was employed to give each space a characteristic sound, providing further differentiation.
Surround sound was employed to suggest the layout of the spaces and to indicate the movement of the characters.
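As an illustration of how the movement of a character between two loudspeakers can be suggested, the following is a minimal sketch of constant-power panning. The article does not state which panning law was used in the project, so this is an assumption chosen for illustration only; the function name `constant_power_gains` and its `position` parameter are hypothetical.

```python
import math


def constant_power_gains(position):
    """Return (gain_a, gain_b) for a source placed between two speakers.

    position = 0.0 places the source fully at speaker A,
    position = 1.0 places it fully at speaker B.
    The gains follow a sine/cosine law so that gain_a**2 + gain_b**2 == 1,
    which keeps the perceived loudness roughly constant during the move.
    """
    if not 0.0 <= position <= 1.0:
        raise ValueError("position must lie between 0.0 and 1.0")
    angle = position * math.pi / 2.0
    return math.cos(angle), math.sin(angle)


# Example: a character crossing from speaker A to speaker B in five steps.
for step in range(5):
    gain_a, gain_b = constant_power_gains(step / 4.0)
    print(f"step {step}: A={gain_a:.3f}  B={gain_b:.3f}")
```

Multiplying the character's signal by these two gains, and updating `position` over time, gives a simple impression of movement between the two channels.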
Music can be used, as in films, to indicate the characters’ feelings, to enhance tension and, as a leitmotif, to clarify the plot.
This project provides some initial insight into the potential of this format to convey a storyline without the need for visual elements or a narrator. It is important to point out that the listener does not depend on a describer or a narrator to interpret the piece. On the contrary, it is a format in which the listener can reconstruct the meaning of the piece in a personal way, and this reconstruction might vary from one listener to another.
Further work needs to be done to develop this format and to analyse its viability in depth. It is necessary to explore possible ways of finding sound equivalents of elements from the film medium, such as different types of shots. It is also important to work on the portrayal of the connections between the different spaces. An attempt has been made to indicate these connections through imaginary corridors, signalled by changes in the reverb settings, and by assigning the entrances to the spaces to different channels, but this does not seem sufficient.
Moreover, this format could benefit from the use of more sophisticated spatialisation setups. Two examples are Ambisonics, a system of recording and encoding sound in 3D that can then be decoded for a variety of speaker setups, and 10.2, a reproduction system developed by Tom Holman at TMH which includes speakers to reproduce height. These should improve the perception of location and height in 3D.
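To make the idea of Ambisonic encoding more concrete, the following is a minimal sketch of how a single mono sample can be encoded into traditional first-order B-format (the W, X, Y and Z channels). It was not part of the project described here. The function name `encode_first_order` is hypothetical, and the sketch assumes azimuth measured anticlockwise from the front and elevation measured upwards from the horizontal plane, with the conventional 1/√2 weighting on W.

```python
import math


def encode_first_order(sample, azimuth_deg, elevation_deg=0.0):
    """Encode one mono sample into first-order B-format (W, X, Y, Z).

    Assumed conventions (not taken from the article):
      - azimuth is measured anticlockwise from straight ahead,
      - elevation is measured upwards from the horizontal plane,
      - W carries the omnidirectional signal scaled by 1/sqrt(2).
    """
    azimuth = math.radians(azimuth_deg)
    elevation = math.radians(elevation_deg)
    w = sample / math.sqrt(2.0)                           # omnidirectional
    x = sample * math.cos(azimuth) * math.cos(elevation)  # front-back
    y = sample * math.sin(azimuth) * math.cos(elevation)  # left-right
    z = sample * math.sin(elevation)                      # up-down
    return w, x, y, z


# Example: a source directly to the listener's left, at ear height.
print(encode_first_order(1.0, azimuth_deg=90.0))
```

Because the direction is stored in the four channels rather than in speaker feeds, the same encoded material could in principle be decoded for different loudspeaker layouts, including ones with height speakers.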
Additionally, it is necessary to explore the challenges that this format would present when applied to different genres and aimed at different age groups.
Finally, it is of the utmost importance to test this format with visually impaired people, both to assess whether it could be used as a replacement for audio description and to explore ways in which it could be improved.
Part of this project was supported by the COST-IC0601 Action and carried out at the Portuguese Catholic University in Porto under the supervision of Dr. Álvaro Barbosa. This project would not have been possible without the help of the actors who very kindly agreed to take part in it. Very special thanks to Naomi Sheldon, Pete Dean, Matt Springett, Chris Hogg, Robert Kodama and Kieran Grant Myers.
Special thanks to John Mateer for his suggestions and feedback during the development of the project.
Finally, thanks to all those who agreed to take part in the trial sessions.
2.8. Appendix: The Questionnaire
1. Were you familiar with the story? Yes / No
If so, could you rate the level of familiarity with the story on a scale of 1 (Very familiar) to 5 (Vaguely familiar):
2. Based on what you have just listened to, could you provide a summary of the plot?
3. Please rate the clarity of the audio film on a scale of 1 (Very confusing) to 5 (Very clear):
4. Please list the characters you could recognize.
C. SPACE RECOGNITION:
5. Please list the different spaces you recognized in the audio film.
6. For each space you recognized, could you name the factors that aided you in their recognition.
7. Please describe the layout of the spaces you recognized. You may include diagrams.
D. SOUND SOURCE RECOGNITION:
8. Were there any sounds whose source you found difficult to recognize? Yes / No
If so, could you list them? What do you think is the cause of their non-recognition? Do you think this undermined your understanding of the plot?
9. Please add any general comments.