In the 1970s, Wilson Markle pioneered a new film enhancement process. To the delight of many film buffs, Markle showed that it was possible to add colour to black and white films. While his initial efforts resulted in weak colours, the concept of film colourization took hold. Results improved in the 1980s and have since led to the colourization of many classic films, such as It’s a Wonderful Life and King Kong.
Is it possible to do the same for audio?
Of course, your first thought may turn to audio restoration. Apps such as iZotope’s RX can clean up archival audio, vinyl records, and other damaged audio recordings.
That’s not a perfect comparison, though. Why? Well, the source audio is already present; it just needs to be restored. Here’s a more challenging question: is it possible to recover audio that wasn’t recorded in the first place? Is it feasible to extract audio from a silent scene?
The Visual Microphone
A friend of mine pointed me towards a TED Talk about visual microphones that explores this idea.
The TED conference gathers the brightest minds on the planet to talk about new ideas. In this TED Talk, Michael Rubinstein presented a new type of image processing that amplifies the smallest movements in digital video.
For example, the open-source technology is able to exaggerate flesh tones to depict blood as it pulses through a face. It can amplify a chest rising and falling to reassure parents of a sleeping infant’s subtle breathing.
Using Video to Create Audio
Rubinstein and his colleagues took the idea a step further. As we know, sound is a series of pressure waves travelling through the air. The scientists reasoned that these pressure waves would cause vibrations in any object they encounter. Of course, the vibrations would be minute, imperceptible to the naked eye.
However, using a video camera and the new technology, the team were able to magnify the tiny movements this air pressure caused as it hit objects such as a plant. They exaggerated these minute movements to see the effect of sound vibrating the leaves.
What he did next was impressive. First, Rubinstein analyzed these amplified movements. Then he used them to reconstruct the source audio from the video alone. With a camera running behind soundproof glass, the technology was able to extract a melody Rubinstein sang solely from the minute shuddering of a bag of chips.
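The underlying principle can be sketched numerically. The following is a toy illustration only, not the team’s actual algorithm (which decomposes video frames spatially before amplifying motion): it treats a single pixel’s brightness over time as a signal, then boosts the faint frequency band where a sound-induced flutter lives. All the numbers here (frame rate, flutter frequency, amplification factor) are hypothetical.

```python
import numpy as np

# Hypothetical high-speed capture: 300 frames per second for 1 second.
fs = 300.0
t = np.arange(0, 1.0, 1.0 / fs)

# One pixel's brightness: a constant baseline plus an invisible
# 110 Hz flutter caused by sound pressure hitting the object.
baseline = 0.5
flutter = 1e-4 * np.sin(2 * np.pi * 110 * t)
pixel = baseline + flutter

def amplify_band(signal, fs, lo, hi, alpha):
    """Boost frequency components between lo and hi Hz by a factor alpha."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    band = (freqs >= lo) & (freqs <= hi)
    spectrum[band] *= alpha
    return np.fft.irfft(spectrum, n=len(signal))

magnified = amplify_band(pixel, fs, lo=90, hi=130, alpha=100)

# The once-invisible flutter is now a hundred times larger.
print(round(np.ptp(pixel), 4))      # ~0.0002 peak-to-peak
print(round(np.ptp(magnified), 2))  # ~0.02 peak-to-peak
```

In the real system, a signal like `magnified` (pooled across many pixels) is what gets interpreted as a sound waveform; the higher the camera’s frame rate, the higher the audio frequencies that can be recovered.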
Here’s the TED Talk video. The audio portion starts at 7:24, but I’d recommend watching the entire presentation.
The possibilities of Rubinstein’s discovery are intriguing. The technology isn’t limited to freshly shot video. It can amplify these movements in older scenes, too. That means sound can be extracted from footage of events whose audio was lost long ago.
Imagine you are watching a scene from James Cameron’s classic science fiction film, Aliens. Of course, the post sound crew laid a soundtrack over the top of the finished picture. But what was the actual audio on set while the scene was shot?
This technology could find out. Using it would reveal the sound of the director, the crew, camera noise, and more at the time the scene was shot.
That wouldn’t really be useful for a sound crew, of course. However, there are more tantalizing possibilities for dialogue people. The technology could extract room tone from a silent scene long after it’s shot. Was a shot captured MOS? Did a fatal production sound error lose a take of dialogue? Is there an off-camera outtake hidden in the picture? This technology could recreate the audio from the video alone.
A New Way of Considering Audio
Now, as you’ll hear in the video, the audio is a bit rough. It’s not ready for production. However, it’s likely that a combination of evolving technology and audio restoration tools will provide improved results in the future.
Just the same, what’s interesting about this is not that visual microphones are a viable way of building a post audio soundtrack. That is likely years away. Instead, it’s how this kind of technology shifts the perception that capturing audio is solely a succeed-or-fail prospect.
That was true in the past. Ten years ago, distorted or noisy audio was tossed out without a second thought. Fixing clipped or crackly audio took a mastering tech more time than reshooting the performance would. The sound fx would simply be recorded again.
With the introduction of tools such as iZotope’s RX software and Cedar boxes, that calculation became less clear-cut. It became possible to salvage audio with less effort than recording the clip again would take. Of course, it’s always better to record a sound effect correctly than to restore a damaged one. However, audio restoration became a viable way to save rare or valuable field recordings that would be impossible to capture again. This option becomes more attractive as the tools grow more sophisticated. In particular, the latest version of RX makes it especially easy to repair audio.
That’s why I like the idea of visual microphones, even if they’re not currently practical. After all, neither was audio restoration ten years ago. The most valuable idea is not whether such a speculative technology can be used in our projects today. Instead, it reminds sound pros that the line between a salvageable sound effect and one fit for the trash is no longer as absolute as it was in the past.
There’s a more important benefit, too. As with all restoration tools, it’s vital to avoid using them as a crutch to save inferior sound effects. After all, restoring a damaged, weak sound clip will only produce a cleaner-sounding weak clip. Instead, these tools suggest a new way to perceive the value of a sound effect. They encourage us to evolve our recordings beyond tech specs to include the most valuable parts of field recording: style, expressiveness, and creativity, which only you can provide.
Check out the TED Talks video “See invisible motion, hear silent sounds.”
Hitchcock image courtesy of Sanna Dullaway.