The loudspeaker voicing thread

I thought I would start what hopefully will be a long-running thread that can gather community input about voicing your speaker project. Here are some starter questions for the topic:

  • What does “voicing” a speaker mean and why do you do that?
  • How do you voice your speakers?
  • What is the basis you use for making voicing decisions?

Let ’er rip.


I’ll have to give this one some thought. Great questions!

OK, I will start by offering up this plot and the source where I found it. It is a plot of data collected in a study done by Sean Olive, reported in a comprehensive paper by Toole, only part of which is related to domestic listening spaces (see section 4: SMALL VENUE SOUND SYSTEMS starting on page 526).

I think about this plot a lot, and also use it when voicing my own loudspeaker projects. As I understand it, the data was collected using a long integration time in a room with various loudspeakers playing. These speakers were judged by various types of listeners who are categorized by whether they are “trained listeners” or just casual or naive listeners. The results are averaged into one of the curves mentioned.

You would get data like this if you measured the in-room response of your voiced loudspeaker with a continuous signal like white noise and averaged the LF response across the room. You can do this with ARTA, for example. So it includes BOTH the response of the speaker and the room, but that is exactly what your ears are hearing and so it should be rather useful. If you want the full details on how the data was collected you will need to go back to the source and find it but I think the original Olive paper is not online.
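
For anyone who wants to do the averaging step by hand, here is a minimal Python sketch. It assumes you have exported one dB magnitude curve per microphone position, all on the same frequency grid. Averaging on a power basis (rather than averaging the dB numbers directly) is an assumption on my part, as are the function names; I am not claiming this is what ARTA or the original study does internally.

```python
import math

def power_average_db(levels_db):
    """Average several dB levels on a power basis (assumed averaging method)."""
    mean_power = sum(10 ** (level / 10.0) for level in levels_db) / len(levels_db)
    return 10.0 * math.log10(mean_power)

def spatial_average_db(curves_db):
    """Average several response curves point by point.

    curves_db: one list of dB values per microphone position,
    all sampled at the same frequencies.
    """
    return [power_average_db(list(column)) for column in zip(*curves_db)]

# Hypothetical levels at three frequencies, measured at two positions.
position_a = [82.0, 78.0, 75.0]
position_b = [76.0, 78.0, 77.0]
print([round(v, 1) for v in spatial_average_db([position_a, position_b])])
```

Note that a power average leans toward the louder position, so a deep null at one microphone location pulls the result down less than a plain dB average would.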

There are several features to note in the plot. The bass region below 100Hz is characterized by a rapid rise in level of 3dB or more (it varies by the type of listener). From 100Hz to around 1kHz the level remains approximately constant. Between 1kHz and 2kHz there is a down-step of about 1.5dB. Above 2kHz the response is either relatively flat or trends down with increasing frequency, and the amount again depends on the listener type.
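
To make those features concrete, here is a rough Python sketch of a target curve built from the numbers above. The function name, the 50Hz corner where the bass rise levels off, the way the step is spread evenly over the 1kHz-2kHz octave, and the optional treble tilt are all assumptions for illustration. They are not values read off the plot.

```python
import math

def target_level_db(freq_hz, bass_rise_db=3.0, step_db=1.5, hf_tilt_db_per_oct=0.0):
    """Rough target level in dB, relative to the 100Hz-1kHz plateau."""
    level = 0.0
    if freq_hz < 100.0:
        # Bass rise: ramp up over the octave below 100Hz, then hold.
        # The 50Hz corner is an assumption, not taken from the plot.
        octaves_below = math.log2(100.0 / freq_hz)
        level += bass_rise_db * min(octaves_below, 1.0)
    elif freq_hz > 1000.0:
        # Down-step of step_db spread over the 1kHz-2kHz octave.
        octaves_above = math.log2(freq_hz / 1000.0)
        level -= step_db * min(octaves_above, 1.0)
        if freq_hz > 2000.0:
            # Optional downward tilt above 2kHz (0 means flat).
            level -= hf_tilt_db_per_oct * math.log2(freq_hz / 2000.0)
    return level

for f in (30, 50, 100, 500, 1000, 2000, 8000):
    print(f, round(target_level_db(f, hf_tilt_db_per_oct=0.5), 2))
```

Setting hf_tilt_db_per_oct to 0 gives the "relatively flat" treble case, and a positive value gives the downward-trending case. A curve like this is only a starting point for EQ; the final call is still made by ear.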

How does the room influence these curves? At low frequency with a boxed loudspeaker there can be some room pressurization happening below about 80Hz. So likely the elevated levels down there are reflecting how the room and speaker with flat response interact at low frequency. At high frequencies, I understand that the typical domestic space starts to absorb the off-axis radiation from the loudspeaker (via furniture, carpet, drapes, etc.) so this might influence the down-tilt at high frequencies in the total “in room” sound when the speaker itself is voiced flat.

Then there is the step-down around 1kHz-2kHz. I have talked to some of you in the past and heard that you sometimes do this exact sort of step-down when voicing your projects, so that the speaker does not sound too bright, etc.

My own projects are dipole in nature. Along with no spoon, there is no box. I carefully choose drivers and crossover points so that the speaker radiates as similarly as possible in both directions. Also, I try to have a listening space with plain, reflective walls. Definitely the front wall, but also the floor and side walls. Other surfaces and the rear wall are absorptive to keep the room decay time down. My speaker will have nearly constant power response but cannot pressurize the room at LF. When voiced flat or nearly so it sounds too bright to me. So I use the data in the plot, and some other information I have found, to create a voicing EQ curve that makes the speaker sound more engaging, less fatiguing, and balanced on a wide variety of music sources.

One can always voice by ear using some carefully picked recordings, but it is useful to have a guide as to what to shoot for, and why. These are my current thoughts about voicing, the how and why, etc. But I would like to hear from others about how they get their own projects to sound right to them, and if there is any method that they follow.

Toole - The Measurement and Calibration of Sound Reproducing Systems.pdf (1.7 MB)

  1. Optimizing my designs for perceived accuracy, personal preference.
  2. I combine multiple measurements to get me close and then have a list of about 20 songs I am very familiar with to finish the process. My measuring process has gotten me closer and closer with every iteration so I spend less time “voicing” with music.
  3. See answer to question 1, I think.

I’d have to agree with JR.

  1. Making the speaker sound good to me, in my room, with my musical preferences.
  2. I have always EQ’d some BBC dip in, as it has made all the different types of music I listen to sound good to me.
    From current readings, this may be inherent to my speakers having a “hot” spot in the off-axis midrange. I now plan to pay more attention to the directivity and what’s happening there. I do know flat speakers don’t sound good to me, just too sterile. I’m not afraid to admit that a “loudness” curve can be way more engaging to me overall, so a bit of bass bump and relaxed mids is my norm.
  3. My ear

I feel I fit in the “All Listeners” category, except more flat out to 20K

FWIW, I’m more of a ‘predicted steady-state’ out through about 2K, thinking it carries downward more in line with the ‘trained listeners’ from there through 12-14k-ish, then flat to up a bit to 20k. I’m not a fan of a bump in the low bass (except perhaps for watching movies), and a slight rise from 14k upwards ‘sounds flatter’ to my old ears.

I don’t think my skills quite have the granularity for true voicing yet. I just try to get the drivers to play nice together and let the result have its character, with only mild tweaking if something just doesn’t sound right. Though I do find I like speakers with a bit of a midrange-forward character. Whenever I end up with an on-axis droop somewhere in the ~1-3kHz region, the speakers seem to sound too veiled for the majority of my critical listening music. They can still do well for rock though.

“Voicing” can start at the driver selection and cabinet design stage. Choosing drivers and a cabinet layout that achieve a directivity goal can avoid a lot of “voicing” frustration at the end of a project.

For “lively” rooms with minimal absorption, targeting a more directive speaker with a higher DI slope of 1.2-1.4dB/oct may be preferred. For “dead” rooms with heavy absorption, a less directive speaker with a DI slope of ~1dB/oct may be preferred.
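
As a rough sketch of how a DI slope in dB/oct can be estimated from measured or simulated data, one option is a least-squares line fit of DI against log2 of frequency. The function name and the sample DI values below are hypothetical, and the frequency range you choose to fit over will change the answer.

```python
import math

def slope_db_per_octave(freqs_hz, levels_db):
    """Least-squares slope of a dB curve versus log2(frequency)."""
    octaves = [math.log2(f) for f in freqs_hz]
    n = len(octaves)
    mean_x = sum(octaves) / n
    mean_y = sum(levels_db) / n
    sxx = sum((x - mean_x) ** 2 for x in octaves)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(octaves, levels_db))
    return sxy / sxx

# Hypothetical DI values (dB) at octave-spaced frequencies.
freqs = [250, 500, 1000, 2000, 4000, 8000]
di_db = [2.1, 3.0, 4.6, 5.5, 7.1, 8.0]
print(round(slope_db_per_octave(freqs, di_db), 2), "dB/oct")
```

The same function works on a power response or PIR curve if you want to compare their slopes against the DI.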

Targeting a balanced DI and PIR slope means only minimal room EQ is needed to balance room modes, and makes for a perfectly enjoyable speaker for those living in 1970 without modern EQ technology.


I still have not gotten “the hang” of VituixCAD (jeez how dense can a guy be?), but I’ve come to accept that “correct” off-axis response is critical to what we enjoy listening to. Now I understand what Jeff Bagby meant when he said that fixing the on-axis FR does not fix off-axis problems.

  1. “What does “voicing” a speaker mean and why do you do that?”

My understanding is that “voicing” usually refers to the final stage of the design process, using familiar recordings to judge the quality of what is heard and then adjusting the crossover to match personal preferences. But that only refers to “final” voicing. In general, we are continually voicing every speaker that we listen to, evaluating why it sounds the way it does. I continue to “voice” speakers that I built 10 years ago.

Why is this done? Probably because the balloon response of a finished multi-way loudspeaker is extremely complex. It is very difficult to predict what a given set of drivers, baffle shapes, and driver locations will do to the final balloon response. Most of the time, before making sawdust, all we have to work with are a few speaker building “rules of thumb” and a couple driver spec sheets. We don’t really know if the project will pan out until we build the cabs, mount the drivers, and then take some measurements.

  2. “How do you voice your speakers?”

I have several tracks that I like. I “think” I know how these tracks should sound on a good set of speakers. I also set up other reference speakers and flip back and forth. Recently, I started using Equalizer APO to convolve the music signal with my VituixCAD crossover filters. This allows me to “voice” my speakers before ordering crossover parts. With Equalizer APO, I can quickly swap parts and then listen to how the tonal balance changes in relation to power response and on-axis curve changes.

  3. “What is the basis you use for making voicing decisions?”

a) The sound compared to Wolf’s Nephlia speakers, which I am currently using as one of my references.
b) The sound compared to my Sennheiser HD600 headphones.
c) The tonal balance from mids (1-3k) through mid treble (3-7k) and upper treble (7-20k). Does the speaker sound bright or harsh while, at the same time, sounding somewhat dead and lifeless or lacking in upper frequency “air”? Do cymbal crashes sound right? Do vocals sound right or do they sound sibilant?
d) Does the speaker sound “phasy” as you move your ears a few inches or feet to the right or left of the “sweet spot”?
e) Does the tonal balance change from a seated to standing position?

I would also include some measurements in my basis:

f) The smoothness of the power response and PIR curves, relative to the on-axis curve.
g) The overall slope of the power and PIR curves (to match my room treatment).


I think most of us have a few key things we focus on during the voicing process. Whether it’s bass impact, detail, spaciousness, or a more relaxed sound, everyone has their preferences. It’s like choosing your favorite color or flavor. Because every room is different, we are all biased with a slightly different house curve even before we factor our hearing and preferences into the equation. My room, like most, is flawed in its own unique ways. Add in the fact that my brain can adapt to a certain sound profile over time, and this process becomes somewhat of a moving target.

I’m finally getting to the point where I can usually make the voicing changes I want. That is, unless my measurements or sim were flawed to begin with. Like others have said, I have a playlist with tracks I know well to point out specific things to listen for. I also keep the laptop beside me to see what happens to the response when I change a particular part. It’s a lot of back & forth from sim to changing parts. Like Bill, I’ve started using my Sennheiser headphones as a reference. We’ll see how well that plays out in the future.
