DSD Under Fire | The Ear

To gain some insight into the Ayre QB-9DSD DAC I asked company president Charles Hansen a few apparently innocuous questions. I was therefore surprised to receive the following response, which makes fascinating reading, especially for fans of DSD.

JK: Is there any benefit in 2x, 4x or greater upsampling of DSD and if so is this something you will offer as an upgrade?

CH: This whole "upsampling" thing has always been a huge mystery, from when it was first introduced over 15 years ago. The bottom line is that one is simply inserting another digital filter in front of the existing digital filter. This can only result in an "improvement" if the existing digital filter is poorly implemented, as the first filter in the chain dominates the overall response.
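A minimal sketch of the "one filter in front of another" point, assuming both stages can be modelled as simple FIR filters at one sample rate (the rate change itself is left out). The tap values are hypothetical and chosen only for illustration; they are not taken from any real product. It shows that two filters in series behave exactly like one filter whose impulse response is the convolution of the two.

```python
def convolve(a, b):
    """Full linear convolution of two coefficient lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def fir(signal, taps):
    """Apply an FIR filter by convolving the signal with its taps."""
    return convolve(signal, taps)


# Hypothetical tap values, for illustration only.
upsampler_taps = [0.25, 0.5, 0.25]
dac_taps = [0.1, 0.2, 0.4, 0.2, 0.1]

impulse = [1.0, 0.0, 0.0, 0.0]

# Filter the signal twice, once per stage...
cascaded = fir(fir(impulse, upsampler_taps), dac_taps)

# ...or once, with a single combined filter.
combined_taps = convolve(upsampler_taps, dac_taps)
single = fir(impulse, combined_taps)
```

Both paths give the same output, so adding an external "upsampler" amounts to changing the overall filter rather than adding anything the DAC's own filter could not have done.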

We don't use digital filters that are poorly implemented.

I personally spent four solid months of listening to digital filters, everything from "non-oversampling" to apodising to brickwall, and everything in between. We came up with the best digital filter I have heard, but we weren't smart enough to give it a clever name that has a lot of market cachet…

If you put the QB-9 into "Measure" mode with the DIP switches on the rear panel, it is actually a minimum-phase apodising filter, exactly as recommended by Peter Craven in his AES paper. It is definitely an improvement over a standard half-band, linear-phase, brickwall filter, which is what is used 99.9% of the time, both for recording and playback. If you think that sounds better than our filter, by all means go ahead and use it. I have never heard of anybody that does, but there must be somebody, somewhere…

The apodising filters address problems with aliasing and pre-ringing, but they don't address post-ringing, which is heard by the ear/brain as time smear. With single sample-rate material, one just has to compromise. And we are very happy with the filter we've developed for that. At double-, and especially quad-rates, there isn't much need for compromise.

So-called "DSD" is a separate issue entirely:

DSD is a marketing term, not a technical term. It doesn't mean anything except what Sony and Philips decided it would mean. When it was released they said that it was 1-bit sampling at 2.8224 MHz with 7th-order noise shaping.
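For readers unfamiliar with 1-bit conversion, here is a rough sketch of the idea. It is a first-order sigma-delta modulator, not the 7th-order noise shaper mentioned above, and the function name is an assumption made for this example. It only shows how a stream of +1/-1 values at a very high rate can carry a signal level in its running average.

```python
CD_RATE_HZ = 44_100
DSD_RATE_HZ = 64 * CD_RATE_HZ  # 2,822,400 Hz, i.e. 2.8224 MHz


def sigma_delta_1bit(samples):
    """First-order sigma-delta modulator: maps values in [-1, 1] to +1.0/-1.0."""
    integrator = 0.0
    previous = 0.0
    bits = []
    for x in samples:
        # Accumulate the error between the input and the last 1-bit output.
        integrator += x - previous
        previous = 1.0 if integrator >= 0 else -1.0
        bits.append(previous)
    return bits


# A constant input of 0.5 becomes a pattern of +1/-1 whose average is 0.5.
bits = sigma_delta_1bit([0.5] * 4000)
average = sum(bits) / len(bits)
```

A real modulator shapes the quantisation noise far more aggressively, pushing it out of the audio band, which is where the choice of noise-shaping curve comes in.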

Then there were reams and reams of propaganda claiming that it was "magical" because with only 1 bit there could be no digital nonlinearity errors. Everybody fell for this, both in the press and in the general public. It is true that there are no non-linearity errors with 1-bit digital, but I think this had very little to do with the sound quality of the DSD format.

Later Sony admitted that they didn't really mean DSD was 1-bit because there is no way to even mix or change levels with a 1-bit data stream, let alone add EQ, compression, or other effects. They developed an 8-bit PCM console running at 2.8224 MHz called the Sonoma workstation and said that this was DSD.
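The reason a 1-bit stream cannot be mixed or level-adjusted directly can be shown in a few lines. This is a toy illustration with a made-up bit pattern: as soon as the samples are scaled or summed, the results no longer fit in the two values a single bit can represent, so the data has to become multi-bit.

```python
def apply_gain(stream, gain):
    """Scale every sample by a constant gain."""
    return [gain * v for v in stream]


def is_one_bit(stream):
    """True if every sample is one of the two values a 1-bit stream can hold."""
    return all(v in (1.0, -1.0) for v in stream)


# Made-up 1-bit pattern, for illustration only.
dsd_like = [1.0, -1.0, 1.0, 1.0, -1.0, 1.0, -1.0, -1.0]

# A simple level change produces +0.5/-0.5, which needs more than one bit.
quieter = apply_gain(dsd_like, 0.5)

# Mixing two 1-bit streams produces -1, 0 and +1: again more than one bit.
mixed = [(a + b) / 2 for a, b in zip(dsd_like, reversed(dsd_like))]
```

Getting back to a 1-bit stream after such an operation means running the multi-bit result through another modulator.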

So which is it? Is it 1-bit or is it 8-bit PCM?

The last thing I heard on the subject was this, from Bruno Putzeys (Chief Engineer at Philips Digital Systems Labs for many years with a true insider's perspective of the development of DSD) in the fall of 2013. Here is the entire e-mail he sent to a group of AES members and recording engineers.

~~~~~~~~~~

"Actually the Vanderkooy paper (AES convention paper 5395, Why 1-Bit Sigma-Delta Conversion is Unsuitable for High-Quality Applications) is beside the point from a practical perspective. It centers on the fact that 1-bit conversion is not "perfectible" in the deepest sense by showing that certain distortion components are inherent in the process. But the nonlinearities described are minute compared to the electronic ones you find in actual converters. The most linear ADC that I know of is a discrete 1-bit design that has no measurable distortion until you approach 0dBdsd when a 2nd and a 3rd at around -135dB come peeping out of the noise floor. So it's a perfectly valid choice as a conversion format if that choice is based on practical considerations.

It really isn't at the conversion stage that 1-bit is a problem. It's when people start calling it DSD and try to use it as a production and delivery format that it becomes thoroughly nonsensical.

I consider DSD a lossy format. Anyone would if they were familiar with the steps audio is taken through in order to maintain the illusion that it stays 1-bit throughout. Delivering a DSD file to the consumer is only excusable if this happens to be the format that the ADC physically operates in and if the data it spits out is delivered to the consumer as is, with nothing more than the occasional "Philips" style splice. But: no level change, no EQ, nothing. Under those conditions DSD can sound fabulous. Otherwise it is a phenomenally clumsy format with no sonic benefits. This rather limits the usefulness of DSD to direct transfers of analogue tapes and some fanatically meticulous classical recordings. About half of Channel Classics albums are bit-true copies of the ADC output apart from the splices, and yes, they sound incredible. But it's my honest opinion that this record company is just about the only argument in favour of DSD.

The whole business of double and quad speed is just a band-aid. With an awful lot of elbow grease it could be taken to a level where it's almost practical but if this is how audio professionals do business – engineering around limitations imposed by commercially attractive consumer superstitions – I wonder what's the point of being a professional anything. We'll see medical doctors prescribing homeopathy next. Oh wait some are already doing that. Never mind.

I had rather been hoping that with the decline of optical release media DSD would die a painless death so I'm deeply dismayed by its sudden resurgence, in dual and quad speed incarnations at that. So allow me to throw my full weight (such as it is) behind the anti camp. Since quite a few people believe I'm a DSD proponent reactions could be interesting…"

Bruno Putzeys

~~~~~~~~~~

We now have one of the senior design engineers at Philips during DSD's development calling it a "lossy format"… This is not new. He has been saying much the same thing since at least 2005.

When SACD was first released I was eager to hear it. That was a strange experience. The high frequencies of every recording sounded the same. I corresponded with one of the DSD engineers and he said that the specific 7th-order noise shaping curve was arrived at after extensive listening tests. To me that sounds like they chose a specific coloration, which would explain my reaction to the sound.

~~~~~~~~~~

On the other hand, PCM has been very ignorantly implemented in the past. The first company to put forth the idea that brickwall filters were bad was Wadia, way back in the late '80s. With filters, both analog and digital, the steeper they are and the sharper the "knee" at the cutoff frequency, the more they will ring and create time-smear.
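The link between steepness and ringing can be illustrated with a rough sketch. It assumes a Hann-windowed sinc low-pass design, where a larger number of taps gives a narrower transition band (a steeper filter). The function names, tap counts and threshold are choices made for this example, not anything used in a real converter. It measures how many samples the impulse response keeps ringing above a small fraction of its peak.

```python
import math


def windowed_sinc_lowpass(num_taps, cutoff):
    """Hann-windowed sinc low-pass FIR (odd num_taps).

    cutoff is in cycles per sample, between 0 and 0.5.
    """
    half = (num_taps - 1) // 2
    taps = []
    for n in range(-half, half + 1):
        if n == 0:
            ideal = 2 * cutoff
        else:
            ideal = math.sin(2 * math.pi * cutoff * n) / (math.pi * n)
        window = 0.5 * (1 + math.cos(math.pi * n / (half + 1)))
        taps.append(ideal * window)
    return taps


def ringing_span(taps, threshold=1e-3):
    """Number of samples between the first and last tap above threshold * peak."""
    peak = max(abs(t) for t in taps)
    significant = [i for i, t in enumerate(taps) if abs(t) > threshold * peak]
    return significant[-1] - significant[0] + 1


gentle = windowed_sinc_lowpass(31, 0.25)   # wide transition band
steep = windowed_sinc_lowpass(255, 0.25)   # narrow transition band

gentle_ringing = ringing_span(gentle)
steep_ringing = ringing_span(steep)
```

With these settings the steeper filter rings for several times as many samples as the gentle one, both before and after the main peak, because this design is linear-phase.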

That is why I spent so much time working on the digital filters. When we made the Ayre QA-9 A/D converter, we ended up using a moving-average filter, at least for the double- and quad-rates. (Single-rate requires many more compromises.)

The moving-average filter is transient perfect. No overshoot, no pre-ringing, and no post-ringing. It sounds far, far better than any A/D converter I've heard and many pro mastering engineers are using it as well. The biggest obstacle to a broader pro adoption is that it is only a two-channel device with no way to slave it for multi-channel use.
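The claim about overshoot and ringing is easy to check on a step input. This sketch uses a 4-sample moving average, where the width is a hypothetical value picked for illustration and not the one used in the QA-9.

```python
def moving_average(signal, width):
    """Average of the most recent `width` samples (zeros assumed before the start)."""
    out = []
    window_sum = 0.0
    for i, x in enumerate(signal):
        window_sum += x
        if i >= width:
            window_sum -= signal[i - width]
        out.append(window_sum / width)
    return out


# A step from 0 to 1.
step = [0.0] * 8 + [1.0] * 16
response = moving_average(step, 4)
```

The output ramps from 0 to 1 in a straight line over four samples and then stays there: it never exceeds the final value and never oscillates before or after the edge. The trade-off is that a moving average rolls off far more gently than a brickwall filter.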

Of course there are many, many other things about the QA-9 that are totally unique for an A/D converter — fully-discrete, zero-feedback analog circuitry, for one thing. True DC coupling so that there are no colorations from capacitors. An innovative digital filter that removes any DC component from the output of the digital signal, and on and on.

Most people have a hard time thinking outside of the box, it seems.

JK: Why limit the QB-9 to just a USB input?

CH: The only other option is a variant of S/PDIF (TosLink, AES/EBU, ST glass fiber optical, et cetera). S/PDIF is inherently flawed, as it always adds jitter to the input signal. The only thing you can do is to throw money at it – the more expensive you make it, the less jitter it will add.

If we had added another input and tried to get it to sound even close to the quality of the USB input, it would have doubled the cost of the unit. And what for? What signal source do you have that can't be routed through your computer? My computer can play any 5" optical disc.

I suppose that some people have video disc players that are not Ayre products and have low-quality analog outputs. While sending that player's digital audio signal to an external DAC might help, it would still just be a compromise. If you want the best sound from a video disc player, you need a one-box solution, which is what Ayre has always made.

Now with streaming video-on-demand through Amazon and iTunes, just watch it on your computer and listen to the audio with the USB DAC. (Of course the picture quality then becomes problematic, but that is a different story. Streaming video can be "1080p" in exactly the same way that MP3 audio can be "44.1 kHz". There is far more to audio or video quality than just the sampling rate.)

JK: The absence of a power switch seems odd. Is this sonically beneficial?

CH: Absolutely. Digital systems are even more sensitive to warm-up than analog systems. If you look at the phase noise of a crystal oscillator, it can take several days to fully stabilize. Turning a digital product on and off is not a good idea if you care about sound quality.

JK: I got greater bass extension from the balanced output than from the single-ended (SE) one. Is this to be expected?

CH: I don't think so. All kinds of people tell me about differences between the sound of balanced and unbalanced. When I dig deep there are always differences in the setup: different cables are the most common, but there can be many differences in the internal circuitry of the downstream equipment.

On the other hand, if all else is equal balanced is always better. Period. Full stop. End of story.

The reason is simple. A true fully-balanced differential circuit will add another level of power supply rejection. Any imperfections in the power supply are canceled with a true balanced circuit. Every circuit in the world (analog or digital) is simply a modulated power supply. Garbage in, garbage out. Until someone develops an absolutely perfect power supply, balanced will always sound better than unbalanced.

A perfect example of this is the so-called "tube watt". Decades ago many people would make a blanket statement that, "A tubed amplifier will sound just as powerful as a solid-state amp rated at twice the power."

This was really big when solid-state amps went to insane levels of 500 wpc and above. Making a tube amp with that much power is literally like putting a portable space heater in your listening room — completely impractical.

There is some truth to the claim, however. It is because nearly all tube amps have push-pull output stages. And guess what? The output transformer will cancel any sags or imperfection in the power supply! When that big bass drum hit comes and the amp is working hard to deliver a lot of current to the speakers, the power supply voltage will drop.

With an unbalanced circuit, that sag is directly passed on to the output. But with a push-pull circuit it is ignored. The equivalent in solid-state is a bridged output stage — a true differential circuit. Every product that Ayre has ever made (22 years now) has always been fully-balanced true differential from input to output.
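The cancellation described above can be shown with a toy numerical model. It assumes an idealised circuit in which the supply sag appears identically on both halves of a differential stage, which real circuits only approximate. The function names and numbers are made up for this example.

```python
def single_ended(signal, supply_error):
    """Unbalanced stage: the supply error adds directly to the output."""
    return [s + e for s, e in zip(signal, supply_error)]


def differential(signal, supply_error):
    """Balanced stage: two mirror-image halves share the same supply error.

    The output is the difference between the halves, so anything common
    to both is subtracted out.
    """
    positive = [s / 2 + e for s, e in zip(signal, supply_error)]
    negative = [-s / 2 + e for s, e in zip(signal, supply_error)]
    return [p - n for p, n in zip(positive, negative)]


# A signal peak that coincides with a supply sag (illustrative values).
signal = [0.0, 0.5, 1.0, 0.5]
sag = [0.0, -0.125, -0.25, -0.125]

unbalanced_out = single_ended(signal, sag)
balanced_out = differential(signal, sag)
```

In this model the unbalanced output is pulled down by the sag, while the balanced output reproduces the signal exactly. In practice the two halves never match perfectly, so the rejection is large but finite.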

We wouldn't do it if it didn't sound better. For some reason beyond my understanding, most of the UK-based manufacturers have clung to unbalanced circuitry. Perhaps it is simply a budgetary constraint, as a true-differential amp requires virtually twice the circuitry of an unbalanced one. I really don't know, but most of the rest of the world has moved to balanced.

Ayre was essentially the first company outside of those that made equipment for recording studios to use balanced circuitry. When we started up in 1993, the only other company using balanced was Jeff Rowland. That was largely at my urging, as Rowland owned half of my former company, Avalon Acoustics, for about a year. Now balanced operation is very common, especially in the higher price tiers.

JK: Do you have any plans to launch a streaming product in the foreseeable future?

CH: Nothing imminent. I'm personally not a fan of streaming at all. From a listening standpoint, I suppose that one could compare it to the old days of radio, a way to expose yourself to new music. That made sense when there were actual DJs that selected the songs they played.

But then everything became corporate and the DJs had to include specific tracks that the labels had paid the radio stations to play. And now it is even worse! It is just a bunch of computer algorithms… Apple tried to make a big splash with their new streaming service by saying they would have three actual human DJs.

I'm like, "Let me get this straight. These three people are going to take 8 hour shifts, 7 days a week, 365 days a year? They will never get sick, never take a vacation? Who are they kidding? And which of those poor people are going to work the midnight to 8 AM shift seven days a week?" It's just preposterous…

Streaming is fine for background music, but I never listen to music in the background. I only listen to music when I want to listen to it. So there is zero use to me for streaming. I tried a lossless service and their content was too limited for my musical tastes. Additionally, beyond the "gee whiz" factor that the technology actually works, I found no advantage to browsing through music virtually rather than in the real world. I still like the liner notes that are standard with a CD, let alone an LP.

Then there is the sound quality. Ugh. The lossless service was plagued with dropouts, pauses, and glitches, and I have a 40 Mbps download speed at my house. CD quality is only 1.4 Mbps, so I really don't understand what the problem is. I'm sure that these problems will be resolved.

Finally you have these companies trying to "squeeze" high-resolution files down to single-rate speeds, and then "expand" them during playback. All I can say is that anybody who buys that story should take a look at this bridge I have for sale in Brooklyn, NY. There is no free lunch. Period. Full stop. End of story.

One cannot take a quad-rate file, squeeze it to single-rate, and then "upsample" (are we going around in circles here?) to quad-rate without losing something. Some people claim that what you lose is inaudible. Have you ever heard that story before? As in 20 years ago when they foisted MP3 on the unsuspecting populace? None of these compression schemes are lossless. You either have the original file that the artist heard in the studio or you have something else. Period. Full stop. End of story.

And when you look at the actual numbers, they don't add up. If CD quality is 1.4 Mbps and 192/24 is 9.2 Mbps, why compress it? I already have 40 Mbps, and two cities within 20 miles already have fiber optic to the house with 1 Gbps. That will arrive at my house in the next year or two. There simply is no incentive to compress. It's not like we are still using dial-up modems. It seems there is more to the actual story than meets the eye.
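The data rates quoted above follow from simple arithmetic for uncompressed PCM, assuming two channels. The function name is made up for this example.

```python
def pcm_bitrate_bps(sample_rate_hz, bit_depth, channels=2):
    """Raw data rate of uncompressed PCM audio, in bits per second."""
    return sample_rate_hz * bit_depth * channels


cd_bps = pcm_bitrate_bps(44_100, 16)      # 1,411,200 bit/s, about 1.4 Mbit/s
hires_bps = pcm_bitrate_bps(192_000, 24)  # 9,216,000 bit/s, about 9.2 Mbit/s

# Share of a 40 Mbit/s connection used by an uncompressed 192/24 stream.
link_bps = 40_000_000
share_of_link = hires_bps / link_bps      # roughly 0.23
```

On those figures an uncompressed 192 kHz, 24-bit stereo stream uses a little under a quarter of a 40 Mbit/s connection, before any protocol overhead.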

JK: I was thinking of products that stream from a local source, like the Naim NDS or Linn DSM products: network streamers with access to genuine hi-res data rather than online sources.

CH: Next year we will have a multi-input DAC. Many companies are making "streamer" boxes with a digital output that will interface with the new DAC more easily. Many of them have USB outputs for use with the QB-9, but unless they can access your storage on a NAS, it is clumsy to swap cables. The Melco is one interesting product: It attaches to your network, accesses all attached devices and connects to most streaming services. It has a USB output and connects seamlessly to a QB-9. It is controlled by an app on your smart phone or tablet and works really well. I'm not sure that we could make something much better than that. If we decide that we can, we may do so.

Yet one can easily do the same with either a "headless" Mac Mini or a similar Intel-based box from J.River that will sell for $350. Just install J.River and a remote app, disconnect the keyboard, mouse, and display, and you have something far more powerful than a "streamer" for much less money.

It will connect to your network and access both the Internet and your NAS drives. Then you can connect it to anything. It will have multiple USB outputs for connecting to USB DACs. Far less expensive and far more versatile than a dedicated audio solution. Those dedicated audio boxes have computer boards in them running Linux and their own custom apps.

I certainly have no interest in entering the market with what is simply a computer disguised to look like an audio product. The marketing and distribution inefficiencies translate directly into far higher costs with little to no performance advantage. Then the customer expects it to last 20 or 30 years. But in that length of time, every single component inside that "audio" component will be obsolete and irreplaceable. (Had any luck buying 5-1/4" floppy drives lately?) To my thinking, it is simply not a good idea…

The other thing is that everything is in a state of flux. The UK and some of northern Europe seem to have settled on Ethernet for computer audio. This can go long distances and access many devices. The problem is that the playback software is limited and that a network specialist is required to get everything running properly.

In contrast a USB device is dead simple to set up, has a choice of dozens of great player applications but is limited to a 3 or perhaps 5 meter distance from computer to DAC. Great for a desktop/office system, or even a one-room main system, but not so good for the whole house.

My prediction is that someone will combine the advantages of both in the next few years. Already J.River will run as many separate zones as there are soundcards attached to a computer: the internal one plus as many external USB DACs as there are ports. Each device can play a different song simultaneously.

And it turns out that this is nothing new! My son is doing work in theater lighting for a playhouse. He recently told me about a free application he is using to synchronize the lighting cues with the sound cues. All of the lights are controlled by a computerized system, and the audio software can play multiple sounds (both music and effects) simultaneously, sending them to separate USB ports, just as J.River does. This allows a gunshot sound effect to trigger a strobe light to create a flash in the theater while music is playing.

I think that over the next few years computer audio will change radically. "Nature abhors a vacuum", and it seems to me that the existing systems are too limited. USB is great for one room. Ethernet is great for whole-house, but at a significant cost, not only monetarily but also in terms of the required infrastructure and limited user interface choices.