Test Setup: Volume Matching And Testing The Listener
The Listening Environment
All of our tests were conducted in a room with a background noise level of 36.5 dB(A) ±0.2. Of course, we had a PC in the room, and the noise we measured was primarily a result of the system's cooling fans. When my machine dropped to standby, the background noise fell to 32.2 dB(A) ±0.2. In other words, we listened in a very quiet room.
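As a rough illustration of how those two readings relate, the sketch below estimates the PC's own contribution by subtracting the standby reading from the running reading on a power basis. It assumes the noise sources are uncorrelated and that the standby figure represents the room without the fans. The helper name `db_subtract` is ours, not part of any measurement tool.

```python
import math

def db_subtract(total_db, background_db):
    """Level of the source that, added to background_db, yields total_db.

    Assumes uncorrelated sources, so acoustic powers (not decibels) add.
    """
    total_power = 10 ** (total_db / 10)
    background_power = 10 ** (background_db / 10)
    if total_power <= background_power:
        raise ValueError("total level must exceed the background level")
    return 10 * math.log10(total_power - background_power)

pc_on_db = 36.5       # room with the PC running, from the text
pc_standby_db = 32.2  # room with the PC in standby, from the text

pc_contribution_db = db_subtract(pc_on_db, pc_standby_db)
print(f"Estimated PC contribution: {pc_contribution_db:.1f} dB(A)")
```

Under those assumptions, the fans alone come out at about 34.5 dB(A), which is consistent with the PC being the dominant noise source in the room.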
With all of the talk about signal-to-noise ratio (SNR), total harmonic distortion plus noise (THD+N), and dynamic range (DR), it's easy to forget that regular listening environments are inevitably subject to quite a bit of background noise. Beyond a certain threshold, ever-higher SNRs and the "N" component in THD+N become audibly irrelevant, because the noise floor of your environment is meaningfully higher than that of the hardware being tested. That's particularly true for open-back headphones, which, unlike closed-back designs, provide practically no attenuation of ambient noise. Check out the (non-scientific) tests on the conclusions page if you'd like to try some related experiments on your own.
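To put numbers on that argument, here is a minimal sketch that combines the room noise with a device's own noise floor. It makes several assumptions that are not from our measurements: that noise sources are uncorrelated, that a device's rated SNR is referenced to a 94 dB(A) listening level, and that the SNR figures in the loop are hypothetical ratings chosen only for illustration.

```python
import math

def db_sum(*levels_db):
    """Combined level of uncorrelated sources: powers add, decibels do not."""
    return 10 * math.log10(sum(10 ** (level / 10) for level in levels_db))

ROOM_NOISE_DB = 36.5       # measured room noise with the PC running
REFERENCE_LEVEL_DB = 94.0  # assumed listening level the SNR is referenced to

for snr_db in (90, 110, 130):  # hypothetical device SNR ratings
    device_noise_db = REFERENCE_LEVEL_DB - snr_db
    total_db = db_sum(ROOM_NOISE_DB, device_noise_db)
    print(f"SNR {snr_db} dB: device noise {device_noise_db:+.1f} dB, "
          f"noise floor at the ear {total_db:.2f} dB(A)")
```

In this sketch, the combined noise floor barely moves from the room's 36.5 dB(A) for any of the three ratings, because the device noise sits far below the ambient level.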
Imagine trying to listen to your favorite CD on the deck of an aircraft carrier. You can't; the background noise level is so high that you actually need hearing protection. That's an extreme example, of course, but background noise in any environment still affects what we can and cannot hear.
Volume Matching and its Importance
Volume-matching the sources is important in blind listening for two reasons. First, if sources play at different levels, they're easy to tell apart, and the test is no longer blind. Second, we humans tend to prefer louder sources, all other factors being equal. Again, that's something we want to control.
Using Sennheiser's HD 800 and a 1 kHz test tone, we volume-matched the individual devices to the level of the Asus Xonar Essence STX at 100% digital volume and its minimum gain setting (as an add-in sound card, the STX lacks an analog volume control).
We used three test tones from mediacollege.com, at 100 Hz, 1 kHz, and 10 kHz. The 1 kHz reference level is the most important: it is the standard reference frequency for sound-level measurements, and it lies close to the range in which human hearing is most sensitive (roughly 2 to 5 kHz). The devices we're using are rated to be fully linear in the specified range, so calibration values should match across all three tones.
At 1 kHz, the common sound-level weightings, dB(A), dB(C), and dB(Z), all apply 0 dB of gain and therefore give the same reading. At 100 Hz and 10 kHz, they yield different values. We're using the common A-weighting, which approximates the relative loudness humans perceive at different frequencies. It attenuates a 100 Hz tone by about 19 dB and a 10 kHz tone by about 2.5 dB, which goes a long way toward explaining why 100 Hz and, to a lesser extent, 10 kHz measure consistently lower than 1 kHz. The remaining "drop" comes from the HD 800's own frequency response, which is far from linear above 1 kHz.
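For readers who want to see how much of the difference the weighting itself accounts for, the sketch below evaluates the standard analog A-weighting curve at our three calibration frequencies. The pole frequencies and the 2.00 dB normalization offset are the commonly published constants for this curve; the function name `a_weighting_db` is our own.

```python
import math

# Pole frequencies (Hz) of the standard A-weighting curve.
F1, F2, F3, F4 = 20.598997, 107.65265, 737.86223, 12194.217

def a_weighting_db(freq_hz):
    """Relative gain in dB that A-weighting applies at freq_hz."""
    f_sq = freq_hz ** 2
    response = (F4 ** 2 * f_sq ** 2) / (
        (f_sq + F1 ** 2)
        * math.sqrt((f_sq + F2 ** 2) * (f_sq + F3 ** 2))
        * (f_sq + F4 ** 2)
    )
    # The +2.00 dB offset normalizes the curve to 0 dB at 1 kHz.
    return 20 * math.log10(response) + 2.00

for freq in (100, 1_000, 10_000):
    print(f"{freq:>6} Hz: {a_weighting_db(freq):.2f} dB")
```

Set against the table that follows, where the DAC2 HGC reads 36.9 dB lower at 100 Hz than at 1 kHz, the weighting explains roughly half of that gap; at 10 kHz it explains a much smaller share of the 13.4 dB gap.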
Calibration Tone Frequency | Benchmark DAC2 HGC | JDS Labs O2+ODAC | Asus Xonar Essence STX | Realtek ALC889 |
---|---|---|---|---|
100 Hz | 57.0 dB(A) ±0.1 | 57.4 dB(A) ±0.1 | 56.9 dB(A) ±0.1 | 58.3 dB(A) ±0.1 |
1 kHz | 93.9 dB(A) ±0.1 | 94.0 dB(A) ±0.1 | 94.0 dB(A) ±0.1 | 93.6 dB(A) ±0.1 |
10 kHz | 80.5 dB(A) ±0.1 | 81.0 dB(A) ±0.1 | 80.3 dB(A) ±0.1 | 80.2 dB(A) ±0.1 |
As you can see, the calibration is very good, though not perfect. The Benchmark DAC2 HGC is not perfectly aligned because it uses a digital gain control to set the volume of its digital input. At the level we tested, this control moves in steps of roughly 0.5 dB(A), whereas the analog potentiometer in JDS Labs' O2+ODAC is continuously adjustable. Given the DAC2 HGC's higher price tag, I'm giving it a minor handicap by choosing the closest step below the other devices' level. Realtek's codec is slightly softer at 1 kHz and significantly louder (by up to 1.4 dB[A]) at 100 Hz. In this sense, it's simply the least linear, or least transparent, of the devices we're testing.
Audiophiles might argue that a level difference of 0.2 dB is notable and might affect our test results. That may hold true for a small minority of listeners. For us, it does not matter. This isn't just a claim; we'll prove it shortly. Furthermore, 0.2 dB approaches our equipment's margin of error. Realtek's 1.4 dB(A) difference at 100 Hz is the one measurement that might be noticeable.
Of course, listening at >90 dB(A) for extended periods of time can cause hearing loss. You'll be fine for a few minutes at a time, but sustained high volume should be avoided.
The Most Important Instrument to Calibrate: You
Because everyone's ear is morphologically different, we each hear sound uniquely. There are some general truths, though. For example, we become progressively incapable of hearing higher frequencies as we age. The typical human hearing range is conventionally referred to as 20 Hz to 20 kHz (sometimes 22 kHz).
Our tests involve two listeners: a moderate enthusiast, Listener A, accustomed to ~$3000 in audio gear, and a more serious enthusiast, Listener B, used to ~$70,000 in audio gear.
Measurement | Listener A | Listener B |
---|---|---|
Highest Frequency Heard | 17 kHz | 20 kHz |
Lowest Frequency Heard | 12 Hz | 14 Hz |
Volume Sensitivity (95% Confidence) | ±1 dB | ±1 dB |
At the high end, Listener A can hear a 17 kHz tone using the DAC2. Tones at 18 kHz and above are absolutely silent. Listener B, despite being a few years older, can hear up to 20 kHz.
On the other end of the spectrum, Listener A can faintly hear 12 Hz. Anything lower is total silence. Listener B's hearing starts at roughly 14 Hz. This is uncommon; typically, the threshold is around 20 Hz. Some say such low frequencies are felt rather than heard. Another possible explanation is harmonic distortion in the headphones or audio equipment. If that were the case, the tone heard at 12 Hz should sound the same as a 24 Hz tone, only softer. But it doesn't. It sounds far lower than the 24 Hz tone.
Using these calibration settings, a blind A/B test with a ±0.5 dB difference in volume level at 440 Hz yields a score of 5/10 for both listeners, essentially equivalent to random guessing. That means neither participant can tell levels 0.5 dB apart. To reach a 95% confidence level that listeners can tell volume levels apart, we have to move to ±1 dB, where they consistently score 9/10 or 10/10.
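The link between those scores and the 95% confidence level can be checked with a one-sided binomial test. The sketch below is our own illustration, assuming each trial is an independent 50/50 guess under the null hypothesis; the function name `p_value_at_least` is hypothetical.

```python
from math import comb

def p_value_at_least(correct, trials, p_guess=0.5):
    """Probability of scoring `correct` or better out of `trials` by guessing."""
    return sum(
        comb(trials, k) * p_guess ** k * (1 - p_guess) ** (trials - k)
        for k in range(correct, trials + 1)
    )

for score in (5, 8, 9, 10):
    p = p_value_at_least(score, 10)
    verdict = "meets" if p < 0.05 else "does not meet"
    print(f"{score}/10: p = {p:.4f} ({verdict} the 95% level)")
```

With ten trials, 9/10 is the lowest score that clears the 95% threshold under these assumptions (p is about 0.011), while 8/10 narrowly misses it (p is about 0.055) and 5/10 is indistinguishable from guessing.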
Thus, the "calibration range" of your listeners today is 12 Hz to 17 kHz for Listener A and 14 Hz to 20 kHz for Listener B, with a volume sensitivity of 1 dB for both.
Given that the devices we're testing are calibrated well below the level where either listener can hear the volume difference, we consider them accurately volume-matched (except for Realtek's codec at 100 Hz).
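That conclusion can be reproduced from the calibration table with a few lines of code. This is a minimal sketch, not the procedure we used during testing: it takes the table readings as given, treats the 1 dB sensitivity as a hard threshold, and compares the loudest and softest device at each tone. The names `spread_db` and `audible_mismatches` are ours.

```python
# Readings in dB(A), copied from the calibration table.
LEVELS_DBA = {
    "100 Hz": {"Benchmark DAC2 HGC": 57.0, "JDS Labs O2+ODAC": 57.4,
               "Asus Xonar Essence STX": 56.9, "Realtek ALC889": 58.3},
    "1 kHz": {"Benchmark DAC2 HGC": 93.9, "JDS Labs O2+ODAC": 94.0,
              "Asus Xonar Essence STX": 94.0, "Realtek ALC889": 93.6},
    "10 kHz": {"Benchmark DAC2 HGC": 80.5, "JDS Labs O2+ODAC": 81.0,
               "Asus Xonar Essence STX": 80.3, "Realtek ALC889": 80.2},
}
THRESHOLD_DB = 1.0  # smallest level difference either listener could detect

def spread_db(readings):
    """Gap between the loudest and softest device, rounded to 0.1 dB."""
    return round(max(readings.values()) - min(readings.values()), 1)

def audible_mismatches(levels, threshold_db=THRESHOLD_DB):
    """Map each tone whose spread reaches the threshold to (spread, loudest device)."""
    flagged = {}
    for tone, readings in levels.items():
        spread = spread_db(readings)
        if spread >= threshold_db:
            flagged[tone] = (spread, max(readings, key=readings.get))
    return flagged

for tone, readings in LEVELS_DBA.items():
    print(f"{tone}: spread {spread_db(readings):.1f} dB")
print(audible_mismatches(LEVELS_DBA))
```

Only the 100 Hz row is flagged, with the Realtek ALC889 as the loudest device; the 1 kHz and 10 kHz spreads stay below the 1 dB threshold.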
For reference, here is the hardware both listeners use:
Component | Listener A | Listener B |
---|---|---|
Primary source / DAC | Asus Xonar Essence STX ($190) | Burmester 061 CD Player (~€9000) |
Power conditioner | None | Burmester 038 (no longer in production, ~€4000) |
Integrated amplifier | Built into powered speakers | Burmester 032 (~€12,000) |
Secondary power amplifier (for horizontal bi-amping) | None | Burmester 036 (~€7000) |
Speakers | Yamaha HS80M + HS10W ($900) | Ascendo Z-F3 (~€21,000) |
Headphone amplifier | Built into Asus Xonar Essence STX | Lehmann Audio Linear SE (~€1400) |
Headphones | Sennheiser HD 800 ($1500) | Sennheiser HD 800 (~€1500) |
Cables | Budget RCA cables ($5) | Burmester/Ascendo cables (~€4000) |
As you can see, Listener A is accustomed to an audio setup worth around $3000. Listener B is in another category altogether, with a configuration well into five figures. Listener A's system is a 2.1-channel, near-field, active-monitor setup, while Listener B's relies on high-end full-range speakers. Both listeners are well-acquainted with Sennheiser's HD 800 headphones, though, which are what we'll primarily be using for our tests.