Michael Carnes has been developing surround reverbs for more than 20 years. He was the lead developer behind what may have been the first true surround reverb – the Lexicon 960L, shipped in 2000.
A fully-loaded 960L could provide two 5.1 reverbs at 96kHz and cost a little under $20,000 – 30 times the cost of an Exponential Audio plug-in that will provide up to 22 channels of reverb at up to 384kHz with money left over for popcorn.
In the first part of this interview, Carnes examined early reflections in reverberation – what they are, how we perceive them and their part in artificial reverb both current and historical. He now turns his attention to surround sound mixing, and how it differs from mixing in stereo.
It appears that there’s growing interest in mixing music in surround. Are there different rules from mixing in stereo?
‘Rules might be too strong a word, but there are some commonsense approaches that can really help. It might help to look at just a few of the things we’ve learned to do in stereo.
‘If you listen to some early stereo recordings, you might hear the technology being shown off. Guitars jumped from channel to channel and parts of the mix may have been panned hard left or hard right – mixes like that sound really dated and obvious now. The lesson here is that we quickly get used to technology and we become attuned to excessive use of the new powers of any tech. All of that panning around became distracting and we toned down what we were doing. Our mixes may have lost some adventure, but they gained polish.
‘We also learned to work around issues that arose in LP mastering. We drove lower frequency instruments toward the centre of the mix so that the stylus would stay in the groove. That’s a technical restriction that we still respect, even though the reasons for it may have largely disappeared.
‘Mixing in surround brings practical limitations we need to be aware of and it also tempts us in ways we might regret a few years down the road. So let’s talk about some of the things to keep in mind.’
There have been surround mixes in movies for many years, but there’s very little in music. Why is that?
‘Well it’s not for lack of trying! Twenty years ago there was serious discussion in the music mixing world along with a few actual releases from more adventurous members of that community. It was exciting, but it foundered. I think that a big reason was the inventory problem…
‘We had several competing formats: DVD-A, SACD, HD-DVD, Blu-ray and more. You had to pity the poor buyer at the record store (remember those?) who might be faced with stocking albums in all of those formats. The buyer chose not to stock any of them and it’s easy to understand why. But we now have discrete surround broadcast/cable/streaming, and Blu-ray is the only physical format left standing (in fairness, a few SACD releases are still coming out). There’s also interest in VR, possibly for gaming or for listening over headphones.
‘The inventory problem is gone. We’re now starting to see serious activity in surround music, both in pop forms and in classical music. A large number of homes now have surround systems in their TV rooms. Many of the old roadblocks are gone, so we’re again looking at what we might do. There are already several titles available and many more are coming.’
What are the greatest differences between mixing music and mixing for picture?
‘Picture is the important word here. Mixing for film or TV is about combining sound and image to tell a story. Sometimes it’s a straightforward narrative and sometimes it’s a fantasy world built on sound design. The mix has to combine all sorts of elements, moving seamlessly from dialogue to music to effects to environments. The mix is always about helping to tell the story, usually without calling attention to itself.
‘Music, whether in stereo or surround, is the story. The listener is focused entirely on the world created within the speakers. In one sense it’s freeing, since there’s no picture to follow. But the same dangers from the early world of stereo are still there – how to use the power of the medium without creating distractions that will be embarrassing ten years down the road.’
What do mixers really do in post?
‘In music there are usually two stages in making a recording (speaking in a very general way): tracking and mixing. The first stage consists of gathering up all the pieces of the musical performance. The second consists of blending these pieces into a finished product. It’s similar in post, except that there are a lot more pieces.
‘The first part is, of course, dialogue. This is captured on set as the actors speak their lines. Everything possible is done to use this dialogue, but oftentimes reality (in the form of camera noises, generators, traffic) gets in the way. Some of those lines must get recorded again, months later, in a studio far from the original location. The dialogue editor has to manage EQ and ambience processing in a way that sounds seamless, while eliminating all those noises that don’t belong in the production. The end result is a session (not printed tracks) that will be a key part of the production.
‘Then there’s Foley – the recording of footsteps, chair scrapes, door slams, fist impacts and hundreds of other sounds that give reality to a scene. Those tracks – often the most subtle part of a film – become part of the final mix.
‘After that comes sound design, which creates convincing environmental sounds, whether distant cannons, swooping space ships or an uncomfortably close dentist’s drill. Sound designers use all sorts of sound. Sometimes it’s gathered in the field. Sometimes it’s made by recording and mangling real-world objects. Sometimes it’s synthesised. A good sound designer may work with the film composer to make sure that the effects and the music don’t interfere with one another.
‘Score mixing might be the most familiar idea – after all, it’s music – but there are challenges that don’t have an equivalent in the music world. The composer may have to deliver the same cue in varying lengths to make room for edits. There may be alternative instrumentations of the same cue, just in case one instrument is too much for a given moment. The score mixer may provide dozens – in some cases hundreds – of multichannel stems (submixes) to provide maximum flexibility in the final mix. If you have a music/dialogue conflict, it might be better to duck one stem (let’s say strings) rather than the whole orchestra. There’s a level of flexibility in film scoring that just doesn’t exist in the music world. Having to book a scoring stage and an orchestra for an avoidable fix-up doesn’t go over well with the producer. The score mixer organises and premixes all of these stems and that session is then made available for the final dub.
‘It’s important to point out that all of these processes work in parallel. Dialogue is being edited at the same time special effects are built and Foley is being recorded. Scoring may come just a little later, but there’s no guarantee of that. Each group may have – at best – partial mixes from the other groups. A little production dialogue and a temp score may be more likely.
‘All of these stems then head over to the re-recording mixer, who will have the responsibility of putting together hundreds (or thousands) of tracks for the theatrical release. This mix is done to picture and may run several weeks for a big show. For a good amount of time the director will be present, along with trusted ears and other stakeholders in the production. It can be a stressful time to say the least.’
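Carnes’ remark about ducking a single stem against the dialogue rather than the whole orchestra is essentially sidechain gain reduction keyed from the dialogue track. Here is a minimal Python sketch of the idea; the threshold, reduction amount and attack/release times are placeholder values chosen for illustration, not settings from any real dub stage.

```python
import numpy as np

def duck_stem(stem, key, fs=48000, threshold_db=-30.0, reduction_db=-6.0,
              attack_ms=10.0, release_ms=250.0):
    """Toy sidechain ducker: when the key signal (e.g. dialogue) rises above
    the threshold, pull the stem (e.g. a strings stem) down by a fixed amount,
    with simple one-pole attack/release smoothing. 'stem' and 'key' are mono
    float arrays of equal length; all parameter values are illustrative."""
    thresh_lin = 10 ** (threshold_db / 20.0)
    duck_lin = 10 ** (reduction_db / 20.0)
    a_att = np.exp(-1.0 / (fs * attack_ms / 1000.0))
    a_rel = np.exp(-1.0 / (fs * release_ms / 1000.0))
    gain = np.ones_like(stem)
    smoothed = 1.0  # start at unity gain
    for n in range(len(stem)):
        target = duck_lin if abs(key[n]) > thresh_lin else 1.0
        coeff = a_att if target < smoothed else a_rel
        smoothed = coeff * smoothed + (1.0 - coeff) * target
        gain[n] = smoothed
    return stem * gain
```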
How does this compare to mixing music?
‘If you’re mixing music for surround, your world is a lot simpler. While it’s true that your tracks may someday be remixed in a future format, you know right now what surround format you’re working in. The way you hear things in your mix room is the way you’d like it to be heard in the listener’s home. Your biggest decision will be how enveloping the mix will be – where things go. If it’s a classical mix or a concert video, you may want a proscenium effect, with the instruments across the front of the mix and only ambience above and around. But if it’s choral or organ music you might get more adventurous, with more content in the surrounds. On the other hand, a mix of popular music can be whatever you want. You just don’t want the mix to sound like a demonstration record when you listen to it several years from now. Too showy a mix can take the listener out of the music.’
So in the end, it’s just like stereo? The mixer determines which speakers an instrument lives on?
‘Wouldn’t it be nice if things were that simple? But music and post mixers (especially those mixers doing television) have an additional concern. The listener’s speaker budget may have been blown on the front pair. There may or may not be a centre speaker. The surround speakers may be satellite speakers that don’t give you much below 100Hz. Or they may not be there at all. Maybe there’s a subwoofer and maybe there isn’t.
‘This means that you still have to be careful about where you tell the story. Too much low frequency in the surrounds may overwhelm small speakers. The bass management system in the home receiver may route the lower frequencies to the sub or even to the front speakers. So that dramatic effect of the five-string bass in the back has a good chance of being underwhelming on a lot of systems. It’s always a good idea to check your mix on small speakers (just like you would a stereo mix) to make sure you can live with it.
‘One thing most veteran music mixers keep an ear on is mono compatibility. You may not be looking for a perfect mono mix, but it’s got to be acceptable. The stakes are higher in surround. Your 7.1 mix might be heard in anything from a 7.1 room on down to mono. You have to avoid using any mixing tricks that fail when channels are combined (be careful about inverted phase situations). A good monitor control system that lets you easily audition in different formats is a necessity.’
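One way to keep a meter (as well as an ear) on fold-down behaviour is to sum the channels to mono and compare the result against a simple power sum of the individual channels: a large deficit usually points to material that cancels when combined. The sketch below assumes equal-length mono arrays per channel, and the fold-down gains are just a common starting point for illustration, not a broadcast downmix specification.

```python
import numpy as np

def mono_fold_check(channels, gains=None):
    """channels: dict mapping channel name ('L', 'R', 'C', ...) to a 1-D array.
    Returns the folded mono signal and the dB difference between its level and
    the power sum of the individual (gained) channels. A strongly negative
    difference suggests phase cancellation on fold-down."""
    if gains is None:
        # Illustrative fold-down gains (centre and surrounds roughly -3 dB).
        gains = {'L': 1.0, 'R': 1.0, 'C': 0.707, 'Ls': 0.707, 'Rs': 0.707}
    mono = sum(gains.get(name, 1.0) * sig for name, sig in channels.items())
    power = lambda x: np.mean(np.asarray(x) ** 2) + 1e-12
    folded_db = 10 * np.log10(power(mono))
    summed_db = 10 * np.log10(sum(power(gains.get(n, 1.0) * s)
                                  for n, s in channels.items()))
    return mono, folded_db - summed_db
```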
What is the LFE channel? What should a music mixer know about it?
‘I think that in general, until you’ve gained a lot of experience, you should mix as if LFE isn’t there at all. LFE (Low Frequency Effects) is a special channel that’s used (or overused) almost entirely for dramatic moments in a film. If a playback system doesn’t have a subwoofer for LFE, then that audio is completely discarded. This means that if you treat LFE as a subwoofer channel, your mixes will be incredibly bass-shy on many systems. Any receiver or preamp used for surround playback will have a secondary system called “bass management”. It’s a built-in crossover that takes low frequency information from all channels and automatically sends it to your sub – combined with the LFE channel if one exists. If there’s no sub, your main speakers will have to do the job, although the LFE content will still be discarded. This means that for nearly all music mixing, you should treat every channel as full-range (see the caution in the next paragraph). If the playback system has full range speakers, then the low frequency components will go to those speakers just as they’re panned. If the speakers are smaller, then bass management will send it off to the sub (if there is one) or possibly the fronts.
‘But I’ll refer you back to my remark in the previous paragraph regarding the size of speakers to expect. There’s a very strong economic component to your choice here. If you’re mixing for a young market, they’re not likely to have the discretionary income to be surrounded by speakers that go down to 40Hz. But if it’s an older market – classical music or remixes of classic rock for example – they’ve got a lot more money to throw at it. Sock it to ’em. So you really do have to know a little something about your target market. That may guide some of the choices that go into your mix.’
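To make the bass-management behaviour Carnes describes concrete, here is a minimal sketch of the idea: high-pass the main channels at a crossover frequency, collect everything below it into the subwoofer feed, and add the LFE channel (conventionally reproduced 10dB hotter) when it exists. The 80Hz crossover, the filter order and the function names are assumptions made for the sketch, not a description of any particular receiver.

```python
import numpy as np
from scipy.signal import butter, sosfilt

def bass_manage(channels, lfe=None, fs=48000, crossover_hz=80):
    """channels: dict of full-range main channels (1-D arrays); lfe: optional
    LFE channel. Returns (high-passed main channels, combined subwoofer feed).
    If the playback system has no sub, a real receiver would instead fold the
    low band back into the front speakers."""
    lo = butter(4, crossover_hz, btype='low', fs=fs, output='sos')
    hi = butter(4, crossover_hz, btype='high', fs=fs, output='sos')
    managed = {name: sosfilt(hi, sig) for name, sig in channels.items()}
    sub = sum(sosfilt(lo, sig) for sig in channels.values())
    if lfe is not None:
        sub = sub + (10 ** (10 / 20)) * lfe  # LFE played back 10 dB hot
    return managed, sub
```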
Are there techniques that postproduction mixers use that music mixers should know about?
‘A lot just comes from experience. But one thing that I think would help is standardised levels. A dub stage (film-speak for the big mixing room) operates at a reference level of 85dBA. That doesn’t mean that peaks can’t be louder. But most material will be at this level or lower. For television mixing, reference levels may run in the range of 79-82dBA. There will also be “true peak” levels – the maximum level allowed on a mix, which may run from -2dBFS down to -6dBFS.
‘While these levels originated from scientific studies to determine harmful long-term listening levels, they provide for a consistency in mixing and keep the ears from tiring quite so fast. There’s another practical reason for music mixers to consider adopting such a standard: chances are pretty good that a listener will use the same system for watching movies and for listening to music. There’s no point in sending the listener scurrying for the remote to make a major adjustment in level when the material changes. If the listener wants to crank it, that’s her choice. Don’t do it for her.’
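A quick way to sanity-check the true-peak figures mentioned above is to oversample the mix and look at the largest absolute sample value. This is only a rough estimate made for illustration; real loudness and true-peak meters implement the ITU-R BS.1770 measurement, which this sketch does not.

```python
import numpy as np
from scipy.signal import resample_poly

def true_peak_dbfs(signal, oversample=4):
    """Rough true-peak estimate in dBFS: oversample the signal and take the
    largest absolute value. 'signal' is a 1-D float array scaled so that full
    scale is 1.0; an oversampling factor of 4 is a typical but arbitrary choice."""
    upsampled = resample_poly(signal, oversample, 1)
    peak = np.max(np.abs(upsampled)) + 1e-12
    return 20 * np.log10(peak)
```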
That’s a lot to think about. But let’s say I’ve made the commitment to do some mixing in surround. What are some of the things you do in Exponential Audio reverbs to make those mixes work?
‘All of our multichannel reverbs have the ability to add delay to individual (or pairs of) output channels. You can also adjust their gain. This can be misused – lots of novice surround mixers like to make the surrounds obvious by using a longer reverb time in the rear or by creating a false back wall. This probably isn’t the sort of room you’d want to hear a concert in, so why would you want to mix that way? A surround reverb usually works best if you don’t notice it until it’s gone. But of course it’s your mix, so do what you like.
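As a rough illustration of what per-channel delay and gain trims on a reverb return amount to, here is a small sketch. The dictionary layout and parameter names are invented for the example and do not mirror any plug-in’s actual controls.

```python
import numpy as np

def trim_reverb_outputs(outputs, delays_ms, gains_db, fs=48000):
    """Apply a per-channel delay (ms) and gain trim (dB) to a multichannel
    reverb return, e.g. to pull the surround channels slightly later and
    quieter than the fronts. 'outputs' maps channel names to 1-D arrays."""
    trimmed = {}
    for name, sig in outputs.items():
        delay = int(round(delays_ms.get(name, 0.0) * fs / 1000.0))
        gain = 10 ** (gains_db.get(name, 0.0) / 20.0)
        trimmed[name] = gain * np.concatenate([np.zeros(delay), sig])
    return trimmed
```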
‘Allow me to follow up on our previous conversation about early reflections…
‘Exponential Audio surround and 3D verbs have early reflections that are adjustable in the same way as our stereo reverbs, but they also have additional patterns and the ability to focus the reflections in a zone relative to the source audio. For example, if you have a signal in the left front speaker, you can have reflections only in that same speaker or in that speaker and those immediately adjacent. This isn’t naturally what early reflections do, but it’s surprisingly useful. It was originally added at the request of dialogue mixers who needed to concentrate that energy on the screen. But I’ve found that it’s also very useful in blending spot mics into a mix. Often combining spot mics and room mics can give something of an artificial feel. The spots might be a little harsh or a little too detailed. Panning spots into the stereo or surround sound field to blend with room mics can also be tricky. If the listener moves around, the positions of an instrument in spot and main mics won’t always sound the same. But adding early reflections that are focused into that zone can soften the bite of the mic and anchor the position. This gives the gain advantage that you want from a spot, but helps to maintain the imaging given to you by a main pair. Sounds crazy, but you just have to try it.
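The ‘zone focus’ idea can be illustrated with a toy reflection generator: take a handful of early-reflection taps from the source and weight them across the output channels so that most of the reflected energy stays near the source’s position. The tap times, gains and weighting scheme below are invented for the sketch; they are not the patterns Exponential Audio actually uses.

```python
import numpy as np

def zone_focused_reflections(dry, fs, zone_weights, taps_ms, tap_gains):
    """dry: mono source (1-D array); zone_weights: per-channel weights that
    define the focus zone; taps_ms / tap_gains: early-reflection delays and
    gains. Returns a dict of per-channel reflection signals."""
    pad = int(round(max(taps_ms) * fs / 1000.0)) + 1
    outputs = {}
    for channel, weight in zone_weights.items():
        out = np.zeros(len(dry) + pad)
        for t_ms, g in zip(taps_ms, tap_gains):
            offset = int(round(t_ms * fs / 1000.0))
            out[offset:offset + len(dry)] += weight * g * dry
        outputs[channel] = out
    return outputs

# Example: keep reflections mostly in the left front and its neighbours.
# zone_weights = {'L': 1.0, 'C': 0.4, 'Ls': 0.3, 'R': 0.0, 'Rs': 0.0}
```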
‘I’d also encourage music mixers to listen to film mixes with a fresh set of ears. There’s a huge amount of art and skill that go into giving you a sound experience that’s unforced and natural. The mixer may have worked really hard to pull off an impossible moment that you might never notice. But once you begin mixing in surround, you’ll know that those mixers are way better than you ever imagined. Like me, you’ll be that one person watching the credits after the lights have gone up and the theatre has emptied.’
Michael would like to thank his friend Tom Marks. Tom took time away from his mix of the Sense8 finale (Netflix) to provide invaluable insight and corrections to this interview.