All information in these pages is copyright (c) 1989-2003 by Roger Nichols.
All rights reserved. Permission for personal reference only, and may not
be reproduced by any method without written permission.

The Definition Of High
by Roger Nichols
My New Year's resolution this year was 24 bits. I doubt if I will be able
to attain it. With the release of DVD (Digital Video Disc) on the horizon,
the storage will allow anyone to store anything. Roughly ten times the storage
capacity, room to do whatever you want to do. You can store compressed full-length
feature films, a whole CD box set on one disc, all of your audio samples
for your project studio, and enough X-rated images to last you until you
go blind. What we are short on is 24 bit material to store on the new format.
No Free Lunch
At this point I want to clarify the 24 bit 96kHz sample rate playing field
a little. All CDs are 16 bit. The data stored on them is 16 bit. The data
that comes out of them is 16 bit. The data that goes onto the pre-master
CD or the 1630 at the mastering facility is 16 bit. There are, however,
ways to increase the resolution of the audio stored in this 16 bit format.
One method of increasing the resolution is by dithering or noise shaping
the audio information stored in the 16 bits available. A simplified explanation
of dithering is that it adds specialized noise to the lowest bit (way down
below where your meters read) in a way that allows you to hear information
that is below the threshold of this one bit. Usually you can get about a
bit and a half of extra resolution with this method.
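To make the dithering idea concrete, here is a minimal sketch, not anybody's production algorithm. It assumes triangular (TPDF) dither spanning plus or minus one step, which is one common choice; the text above does not say which kind of noise is used. A tone only a quarter of one step tall vanishes completely with plain rounding, but still shows up, buried in noise, once dither is added first. All of the function names are made up for the illustration.

```python
import math
import random

def quantize(x):
    """Round a sample (measured in 16 bit steps) to the nearest whole step."""
    return math.floor(x + 0.5)

def quantize_with_dither(x, rng):
    """Add triangular noise spanning +/-1 step (assumed TPDF), then round."""
    noise = rng.random() - rng.random()
    return math.floor(x + noise + 0.5)

def correlate(samples, reference):
    """Average product with the reference tone; zero means the tone is gone."""
    return sum(s * r for s, r in zip(samples, reference)) / len(samples)

rng = random.Random(1)      # fixed seed so the run is repeatable
n = 20000
amplitude = 0.25            # a quarter of one step: below the 16 bit floor
tone = [amplitude * math.sin(2 * math.pi * 50 * i / n) for i in range(n)]

plain = [quantize(x) for x in tone]
dithered = [quantize_with_dither(x, rng) for x in tone]

plain_level = correlate(plain, tone)
dithered_level = correlate(dithered, tone)
ideal_level = correlate(tone, tone)   # what a perfect recording would give

print("plain rounding:", plain_level)
print("with dither:   ", dithered_level)
print("ideal:         ", ideal_level)
```

Plain rounding turns every sample into zero, so the tone is simply not there. The dithered version is noisier sample by sample, but averaged over time it tracks the tone that was below the last bit.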
Another method of increasing the resolution is to apply a very sophisticated
form of noise shaping to the digital data. This noise shaping, besides
increasing the resolution of the smallest bits, mathematically moves the
noise to a range of the audio spectrum that is less apparent to the human
ear. Noise shaping processes like Sony Super Bit Map and Apogee UV22 can
make you think that you are listening to 18 bit or 18 1/2 bit recordings
with only 16 bits of data coming off of the CD.
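The commercial processes named above are proprietary and far more elaborate than anything that fits here. As a rough sketch of the underlying idea only, the simplest textbook form of noise shaping is first-order error feedback: subtract the previous sample's quantization error before rounding the next one. The function names, the block-averaging "low band" measure, and the test signal are all assumptions made for the illustration.

```python
import math
import random

def tpdf(rng):
    """Triangular dither spanning +/-1 step."""
    return rng.random() - rng.random()

def plain_quantize(signal, rng):
    """Dither and round each sample independently."""
    return [math.floor(x + tpdf(rng) + 0.5) for x in signal]

def shaped_quantize(signal, rng):
    """First-order error feedback: each sample's error is subtracted from
    the next sample, which pushes the noise toward high frequencies."""
    out = []
    error = 0.0
    for x in signal:
        wanted = x - error
        q = math.floor(wanted + tpdf(rng) + 0.5)
        error = q - wanted
        out.append(q)
    return out

def rms_error(quantized, signal, block):
    """RMS of the error after averaging over blocks of samples.
    block=1 measures all of the noise; a large block keeps only the
    slowly varying (low frequency) part of it."""
    total = 0.0
    count = 0
    for start in range(0, len(signal) - block + 1, block):
        err = sum(quantized[i] - signal[i]
                  for i in range(start, start + block)) / block
        total += err * err
        count += 1
    return math.sqrt(total / count)

rng = random.Random(2)
n = 64 * 200
signal = [1000.0 * math.sin(2 * math.pi * 3 * i / n) for i in range(n)]

plain = plain_quantize(signal, rng)
shaped = shaped_quantize(signal, rng)

plain_low = rms_error(plain, signal, 64)
shaped_low = rms_error(shaped, signal, 64)
plain_all = rms_error(plain, signal, 1)
shaped_all = rms_error(shaped, signal, 1)

print("low band noise:  plain", plain_low, " shaped", shaped_low)
print("total noise:     plain", plain_all, " shaped", shaped_all)
```

The shaped version has more noise in total but much less of it in the low band. That is the trade: the noise is moved, not removed. Real products choose where to move it based on how the ear works rather than with this crude one-tap filter.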
Both of these processes need to be performed on data that is more than 16
bits to start with. That means that if you are recording live performances
to DAT and you are not going to do anything else to the recording, then
you should use a converter that has more than 16 bits of resolution and
perform the dithering or noise shaping before you store the 16 bit data.
Once the data has been stored as 16 bits, you cannot get the extra information
out of the sound to provide what is necessary for correct dithering or noise
shaping.
Most aftermarket A/D converters over 16 bits provide some type of dithering
or noise shaping algorithms for 16 bit data output. If you see a DAT machine
that brags about 20 bit converters with built-in Super Bit Mapping, it will
probably sound better than a DAT machine with straight 16 bit converters,
but the final storage is the same, 16 bits.
There is one semi-exception to the "you can't get more than 16 bits
out of 16 bits" rule that I stated earlier. If you perform some digital
mixing, EQ, limiting, compression, add reverb, or in any way change the
16 bit data, you have generated a signal that contains more than 16 bits
of information. Most digital domain processors provide 24 bit or 32 bit
internal processing to provide for extra precision during math calculations.
The resolution of your data stays at this higher resolution until it comes
out of the process. At this point you usually have a choice of how you want
to get back to the 16 bits required for CD or DAT storage. Using dithering
or noise shaping you can end up with better resolution than just chopping
the data off at 16 bits.
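A small worked example shows where the extra bits come from. This sketch assumes a simple fixed-point gain stage with a Q15 coefficient (the value scaled by 32768); actual processors differ in word length and arithmetic, and the function names here are hypothetical.

```python
def apply_gain_q15(sample, gain_q15):
    """Multiply a 16 bit sample by a Q15 fixed-point gain.
    The product carries 15 fractional bits below the original last bit."""
    return sample * gain_q15

def truncate_to_16(product):
    """Chop the fractional bits off."""
    return product >> 15

def round_to_16(product):
    """Round to the nearest 16 bit value instead of chopping."""
    return (product + (1 << 14)) >> 15

sample = 12345                    # one 16 bit sample
gain = round(0.7071 * 32768)      # roughly a 3 dB cut, as a Q15 coefficient
product = apply_gain_q15(sample, gain)
fraction = product & 0x7FFF       # the part that no longer fits in 16 bits

print("bits needed for the product:", product.bit_length())
print("fractional part (out of 32768):", fraction)
print("chopped:", truncate_to_16(product), " rounded:", round_to_16(product))
```

A single gain change is enough to leave a nonzero fractional part hanging below the 16th bit. Chopping throws that part away every sample; dithering or noise shaping at the output is a way of keeping some of what it represents.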
But How About a Complimentary Offset Binary Dessert?
The only way to get more than 16 bits worth of information into 16 bits
of data is to use some sort of data compression or data encoding where the
extra information is hidden in the 16 bit data stream. The more resolution
you want to store, the more data compression is necessary to fit the information
into the available space. MiniDisc and DCC use data compression to store
audio information. This is a lossy data compression process with a ratio
of 4 or 5:1. Not all of the data comes back during the decoding process.
That is why these formats do not sound as good as the same material played
back from a CD. The perfect data compression should be lossless, where everything
you put in comes back out. Lossless compression of audio is possible for
small compression ratios, but the process must be in real time to become
acceptable to consumers, which can be a very expensive process.
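For a feel of what "everything you put in comes back out" means, here is a toy lossless round trip. It is only a sketch under loud assumptions: it stores sample-to-sample differences and hands them to the general-purpose zlib compressor, which is not how any audio product does it, and it runs on a synthetic signal rather than music. The helper names are invented for the example.

```python
import math
import random
import struct
import zlib

def pack16(samples):
    """Pack integers as little-endian signed 16 bit words."""
    return struct.pack("<%dh" % len(samples), *samples)

def unpack16(data):
    return list(struct.unpack("<%dh" % (len(data) // 2), data))

def wrap16(value):
    """Wrap an integer into the signed 16 bit range."""
    return ((value + 32768) % 65536) - 32768

def encode(samples):
    """Store differences between neighboring samples, then deflate them.
    Differences can need 17 bits, so they are wrapped; decode() undoes it."""
    deltas = [samples[0]] + [b - a for a, b in zip(samples, samples[1:])]
    return zlib.compress(pack16([wrap16(d) for d in deltas]), 9)

def decode(blob):
    out = []
    prev = 0
    for d in unpack16(zlib.decompress(blob)):
        prev = wrap16(prev + d)
        out.append(prev)
    return out

rng = random.Random(3)
n = 8820   # a fifth of a second at 44.1kHz
samples = [
    round(12000 * math.sin(2 * math.pi * 440 * i / 44100)
          + 3000 * math.sin(2 * math.pi * 1000 * i / 44100))
    + rng.randint(-50, 50)          # a little noise so it is not a pure tone
    for i in range(n)
]

blob = encode(samples)
restored = decode(blob)
ratio = len(pack16(samples)) / len(blob)

print("bit-for-bit identical:", restored == samples)
print("compression ratio: %.2f:1" % ratio)
```

The ratio printed depends entirely on the made-up test signal and says nothing about real recordings. For scale, fitting 20 bits into 16 by compression alone would need a ratio of 20/16, or 1.25:1, on every piece of program material.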
Have Your Cake And Eat It Too.
Actually, that phrase should be "Eat Your Cake And Have It Too"
to be logically correct, but who's nitpicking?
HDCD. There, I said it. There is a big roar about HDCD right now. HDCD is
an encoding process. 20 bits of information is encoded into the 16 bits
stored on the CD. You can play back the 16 bit CD on a normal CD player,
but to get the benefit of the added resolution, you need a CD player that
has an HDCD decoder and 20 bit D/A converters. There are some on the market
now and more coming. When you play back the CD on a 16 bit player, you are
listening to the raw compressed data. With a good data compression scheme,
you should be able to listen to the 16 bit player without any objectionable
artifacts tearing your head off. Before listening to HDCD I expected the
encoded CD to not be as good as the straight 16 bit version because of the
extra encoded data. They actually did a very good job of hiding the extra
bits in a way that was not detectable.
If you play the CD back on an HDCD decoder, you get the full 20 bit signal
that was encoded during mastering.
I saw the prototype in a little office in Berkeley about three years ago.
I reserved judgment at the time because the only reference was an analog
tape that was being played through the HDCD encoder/decoder. My choices
were to hear the analog tape raw, or encoded then decoded. I couldn't hear
any difference. They said, "See, it's great, isn't it?" Last year
at Emerald Studios in Nashville there was a new prototype of the system.
It was hooked up for a Tony Brown session to mix through. It sounded much
better with higher resolution material being fed through it. The only problem
then was that you had to use the HDCD digital converters. There was no digital
input for bringing in external 20 bit digital signals.
Pacific Microsonics is now shipping the commercial version of the HDCD encoder.
Besides having its own A/D converters, it has a digital input for encoding
20 bit information that had previously been recorded. An interesting feature
of the HDCD box is that it also has an 88.2kHz output. You can use the Prism
or Rane box to store the 88.2kHz sampled signal to ADAT or DA-88 and run
it back into the HDCD box later when you decide what format you want to
release your record in.
It is going to be an interesting year for digital audio. Besides all of
the DVD and HDCD stuff, there is going to be an ERASABLE CD-R. I see DAT
machines becoming collector's items soon afterwards.
Time Base Accuracy.
Almost everybody these days is connecting studios together using satellites,
ISDN lines or EDNet. Phil Ramone used it to hear a session he was producing
on the other side of the United States. Mark Knopfler used it for guitar
overdubs between Nashville and London.
I wanted to see if two studios could exchange audio that could be locked
together with sample accuracy without using the expensive digital links.
Usually you need to have some digital connection and time code reference
to be able to send audio from one studio to another, record something and
send it back. Not this time, ISDN-breath.
The cornerstone for this little test is the fact that both studios had
a very accurate Atomic Clock providing digital word sync for the recorders.
I called up the destination studio on a regular phone. I played back the
drum tracks from my tape, mixed them to mono and patched them into the telephone.
At the other end, the engineer recorded the telephone signal onto his Pro
Tools. I played the tape again and sent him some guitars and pianos. I played
the tape a third time to send the reference vocals. Each track had a click
two bars before the solo so that the destination studio could line the tracks
up for multitrack playback at his end.
After the Pro Tools tracks were aligned, I listened to an analog phone connection
so I could hear the various solo attempts. Solo number five was the keeper.
I didn't care that much about the quality of the audio I sent him, as long
as it was good enough to overdub to. I did, however, care about the quality
of the solo. The engineer at the other end copied the click over to the
solo track and then sent it from his computer to my computer via modem.
It took 20 minutes to transfer the solo. I imported the solo, lined up the
click, and started mixing. The entire operation took about two hours. No satellite,
no ISDN, no nothing.
The reason we didn't have to SMPTE lock both studios was the Atomic Clock.
In a way, both of us were locked to the Universal word clock, the resonant
frequency of Rubidium. With his machine connected to his clock, and my machine
connected to my clock, the recordings we made would only drift off by one
sample every three months, without being connected.
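The "one sample every three months" figure can be turned into a clock accuracy with a little arithmetic. This sketch assumes a 44.1kHz sample rate and counts three months as 91 days, neither of which is stated above. The crystal figure used for comparison is a hypothetical round number, not a measurement of any particular machine.

```python
SAMPLE_RATE = 44100.0                 # assumed; the session rate is not given
SECONDS_PER_DAY = 86400.0
THREE_MONTHS = 91 * SECONDS_PER_DAY   # "three months" taken as about 91 days

def seconds_until_one_sample(fractional_offset, sample_rate=SAMPLE_RATE):
    """Time for two free-running clocks that differ by the given
    fraction to slip apart by one sample period."""
    return (1.0 / sample_rate) / fractional_offset

# How closely the two clocks must agree to drift one sample in three months.
implied_offset = (1.0 / SAMPLE_RATE) / THREE_MONTHS

# Hypothetical comparison: two ordinary crystals that disagree by 10 parts
# per million (an illustrative figure only).
CRYSTAL_OFFSET = 10e-6
crystal_seconds = seconds_until_one_sample(CRYSTAL_OFFSET)

print("implied clock agreement: %.2e" % implied_offset)
print("10 ppm crystals slip one sample after %.2f seconds" % crystal_seconds)
```

Under those assumptions the two clocks have to agree to within a few parts in a trillion, while a pair of clocks 10 parts per million apart would be a sample off in a couple of seconds. That gap is why the trick needs the atomic clocks and not just two good tape machines.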
This method worked so well that I am going to try to overdub a chain saw
solo played by an Alaskan lumberjack. So, until next month, you'll have
to excuse me. I have to phone Nome.