Adaptive Delta Modulation for Short-range Digital Voice
Short-range digital voice transmission is used extensively in modern consumer electronics.
Cordless telephones, wireless headsets (for mobile and landline telephones), and baby monitors are just a few of the products that use digital techniques to communicate voice information wirelessly. Wireless environments are inherently noisy, so the voice coding scheme chosen for such an application must be robust in the presence of bit errors.
Pulse Code Modulation (PCM) and its derivatives are commonly used in wireless consumer products for their compromise between voice quality and implementation cost, but these schemes are not robust in the presence of bit errors.
Adaptive Delta Modulation (ADM) is a mature voice coding technique that should be considered for these applications because of its bit error robustness and its low implementation cost.
ADM quantizes the difference between the current sample and the predicted value of the next sample.
ADM uses a variable 'step height' to adjust the predicted value of the next sample so that both slowly and rapidly changing input signals can be faithfully reproduced.
One bit (i.e. "1" or "0") is used to represent each sample in ADM.
The one-bit-per-sample ADM data stream requires no data framing, thereby minimizing the workload on the host microcontroller.
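The one-bit-per-sample idea can be sketched in a few lines of C. This is only an illustration of the general technique, not the algorithm used inside any particular codec: the step limits, the three-bit history, and the grow/decay rule (double the step after three equal bits, otherwise shrink it by a quarter) are all assumptions chosen for clarity, and every name below is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative limits only; not taken from any codec datasheet. */
#define ADM_STEP_MIN 16
#define ADM_STEP_MAX 2048

typedef struct {
    int32_t estimate; /* running prediction of the input signal */
    int32_t step;     /* current step height */
    uint8_t history;  /* last three bits, newest in bit 0 */
} adm_state;

void adm_init(adm_state *s)
{
    assert(s != NULL);
    s->estimate = 0;
    s->step = ADM_STEP_MIN;
    s->history = 0x2; /* 010: no run of equal bits at start-up */
}

/* Common to encoder and decoder, so both track the same estimate. */
static void adm_update(adm_state *s, int bit)
{
    bit &= 1;
    s->history = (uint8_t)(((s->history << 1) | bit) & 0x7);

    if (s->history == 0x7 || s->history == 0x0) {
        /* Assumed rule: three equal bits mean the estimate is lagging. */
        s->step *= 2;
        if (s->step > ADM_STEP_MAX) s->step = ADM_STEP_MAX;
    } else {
        /* Mixed bits: the estimate is close, so decay the step. */
        s->step -= s->step / 4;
        if (s->step < ADM_STEP_MIN) s->step = ADM_STEP_MIN;
    }

    s->estimate += bit ? s->step : -s->step;
    if (s->estimate > INT16_MAX) s->estimate = INT16_MAX;
    if (s->estimate < INT16_MIN) s->estimate = INT16_MIN;
}

/* Encoder: emit 1 if the input is at or above the estimate, else 0. */
int adm_encode(adm_state *s, int16_t sample)
{
    int bit = (sample >= s->estimate) ? 1 : 0;
    adm_update(s, bit);
    return bit;
}

/* Decoder: rebuild the estimate from the bit stream alone. */
int16_t adm_decode(adm_state *s, int bit)
{
    adm_update(s, bit);
    return (int16_t)s->estimate;
}
```

Because the decoder applies the same update rule to the same bits, its output follows the encoder's internal estimate without any framing or side information, which is the property the paragraph above relies on.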
Bit errors are present in any digital wireless application. Most voice coding techniques provide good audio quality in an ideal operating environment; the challenge is to deliver good audio quality in an everyday environment, where bit errors are present.
Traditional performance metrics (e.g. SNR) do not accurately measure perceived audio quality for various voice coding methods and input signals. "Mean Opinion Score" (MOS) testing overcomes the limitations of other metrics by successfully quantifying perceived audio quality.
MOS testing uses a scale of 1 to 5 to represent audio quality, with 1 representing very bad speech quality and 5 representing excellent speech quality. A MOS score of 4 or higher represents 'toll quality' speech, which is equivalent to the audio quality obtained during a traditional telephone call.
The following graph illustrates the relationship between MOS scores and bit errors for three of the most common voice coding schemes: CVSD, µ-law PCM, and ADPCM.
(Continuously Variable Slope Delta (CVSD) coding is a member of the ADM family of voice coding schemes.) While the perceived audio quality (i.e. MOS score) of all three schemes degrades as the number of bit errors increases, the graph indicates that ADM (CVSD) sounds better than the other schemes as bit errors increase.
Figure 1: MOS Comparison for Various Voice Coding Methods
Since ADM provides robust performance in the presence of bit errors, error detection and correction typically are not used in an ADM design, and this contributes further to a reduction in host processor workload (allowing a low-cost processor to be used).
The superior noise immunity, coupled with a significantly reduced workload for the host processor, strongly supports consideration of ADM as a voice coding method for wireless applications.
The benefits of ADM for wireless applications are demonstrated in the following example reference design. This small form-factor, low-power design includes all of the building blocks necessary for a complete wireless voice product, including:
ADM voice codec
Power supply including rechargeable battery
Microphone, speaker, amplifiers, etc.
Schematics, board layout files, and microcontroller code written in "C"
This design was implemented on a 2.5cm x 3.3cm four-layer PCB, as shown in Figure 2. Smaller components and tighter spacing would allow an even smaller design if desired.
Figure 2: ADM Reference Design Board
The block diagram for this ADM reference design is illustrated in Figure 3.
Figure 3: ADM Reference Design Block Diagram
The design concept utilizes both a 'master' unit and 'slave' unit that communicate wirelessly in a license-free RF band. Time-division duplexing (TDD) is used to simulate a full-duplex communication link.
Operation is similar on both the master and slave units. The CML Microcircuits CMX649 ADM Voice Codec encodes the microphone input with 27.8kbps ADM voice coding.
The encoded voice data is passed to the Texas Instruments MSP430F1232 microcontroller for 3B4B coding and frame formatting.
The coded data frames are then passed to the Micrel MICRF505 RF transceiver for filtering, up-conversion and transmission in either the USA (902-928MHz) or European (863-865MHz) license-free band.
The transmitted data rate is 100kbps, and the exact channel frequency is selected by software configuration.
After reception, the RF transceiver passes the recovered data frames to the microcontroller for de-formatting.
The microcontroller then sends the raw data stream to the voice codec for signal reconstruction, and the recovered audio can be heard on a speaker.
(Please note that data framing is used in this project for time-division duplexing (TDD), which is required to achieve the perceived full-duplex effect.)
Data is transferred over the RF link in a half-duplex manner, but the perceived effect is full-duplex because the MICRF505's data rate is more than twice that of the CMX649 ADM voice codec.
The TDD scheme is implemented with a firmware-based buffering scheme that manages the difference in data rates between the RF transceiver and the voice codec. The microcontroller's on-chip memory is used to buffer the transmit and receive voice data until it is ready for further processing.
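One common way to bridge a steady codec stream and a bursty radio link is a small ring buffer, filled one byte at a time on the codec side and drained in bursts on the radio side. The sketch below is a generic illustration under that assumption, not the reference design's firmware; the 32-byte size and all names are hypothetical, chosen only to fit comfortably in a small on-chip RAM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TDD_BUF_SIZE 32u /* hypothetical; must be a power of two <= 128 */

typedef struct {
    uint8_t data[TDD_BUF_SIZE];
    uint8_t head; /* free-running write counter */
    uint8_t tail; /* free-running read counter */
} tdd_fifo;

void tdd_fifo_init(tdd_fifo *f)
{
    assert(f != NULL);
    f->head = 0;
    f->tail = 0;
}

/* Number of bytes currently buffered (modulo-256 counter difference). */
size_t tdd_fifo_count(const tdd_fifo *f)
{
    return (uint8_t)(f->head - f->tail);
}

/* Codec side: store one byte; returns false if the buffer is full. */
bool tdd_fifo_push(tdd_fifo *f, uint8_t byte)
{
    if (tdd_fifo_count(f) == TDD_BUF_SIZE) return false;
    f->data[f->head & (TDD_BUF_SIZE - 1u)] = byte;
    f->head++;
    return true;
}

/* Radio side: drain up to max bytes in one burst; returns bytes copied. */
size_t tdd_fifo_pop_burst(tdd_fifo *f, uint8_t *out, size_t max)
{
    size_t n = 0;
    while (n < max && tdd_fifo_count(f) > 0) {
        out[n++] = f->data[f->tail & (TDD_BUF_SIZE - 1u)];
        f->tail++;
    }
    return n;
}
```

In a real design the push and pop would typically run in different interrupt contexts, so access to the counters would need appropriate protection; that is left out here for brevity.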
Figure 4: Data Processing Flow Path
In addition to the TDD scheme, the microcontroller firmware also accomplishes a 'pairing' procedure that causes the master and slave units to communicate together while ignoring signals from other sources.
The CMX649's voice activity detector is used to determine when the circuit is placed into 'sleep mode'. If voice is absent for more than twenty-three seconds, the boards will enter powersave mode (MSP430F1232 "Low-power mode 0"). The master and slave boards, if paired, will wake from powersave mode when voice is presented at the audio input to the master board.
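The silence timeout described above can be expressed as a small counter driven by the voice activity flag. This is a minimal sketch under stated assumptions: the polling rate, the structure, and the function names are hypothetical, and how the flag is read from the codec or how the microcontroller actually enters its low-power mode is not shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define IDLE_TIMEOUT_S 23u /* silence limit taken from the text */

typedef struct {
    uint32_t limit_ticks;  /* timeout expressed in polling ticks */
    uint32_t silent_ticks; /* consecutive ticks without voice */
} idle_timer;

void idle_timer_init(idle_timer *t, uint32_t ticks_per_second)
{
    assert(t != NULL && ticks_per_second > 0);
    t->limit_ticks = IDLE_TIMEOUT_S * ticks_per_second;
    t->silent_ticks = 0;
}

/* Call once per polling tick with the current voice activity flag.
 * Returns true once silence has lasted longer than the timeout. */
bool idle_timer_tick(idle_timer *t, bool voice_active)
{
    if (voice_active) {
        t->silent_ticks = 0;
        return false;
    }
    if (t->silent_ticks <= t->limit_ticks) {
        t->silent_ticks++; /* saturate just past the limit */
    }
    return t->silent_ticks > t->limit_ticks;
}
```

Any detected voice resets the counter, so only uninterrupted silence longer than the limit triggers the request to sleep.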
The CMX649 was selected for this design because of its robust ADM voice coding, ultra-low power consumption, and highly integrated feature set. The CMX649 offers extensive flexibility in its ADM voice coding settings, and the settings chosen for this project were empirically derived to offer optimal voice quality in this application.
In addition to performing ADM voice coding, other CMX649 features used in this design include:
Programmable anti-alias and anti-image filtering
Two interfaces are used between the CMX649 and the MSP430F1232. Control signals are sent to the CMX649 via its 'CBUS' serial/control data interface, while voice data is transferred between the microcontroller and codec over its 'burst mode interface'.
The Texas Instruments MSP430F1232 microcontroller was selected for this project because of its low power consumption and rich feature set. The MSP430F1232 provides 8kB of FLASH and 256B of RAM.
All of the RAM is used for variables and stack space, and less than 4kB of FLASH is used in this project.
The MSP430F1232 is placed in its active mode during voice communications and performs all required system management functions. The microcontroller operates from a 3.3V power supply. The 8MHz crystal sources the microcontroller's 'basic clock module' during normal operation, and the digitally controlled oscillator (DCO) provides the timing signal during sleep mode operation.
Two universal synchronous/asynchronous receiver/transmitters (USARTs, used here as SPI ports) are required for this design:
One USART for the codec-to-microcontroller communications
One USART for transceiver-to-microcontroller communications
The MSP430F1232, with its single SPI port, was selected over other family members for its lower cost. This SPI port services the communications between the transceiver and the microcontroller.
A second SPI port, needed for the communications between the voice codec and the microcontroller, is emulated with a 'bit-banging' routine.
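A bit-banged SPI transfer simply toggles general-purpose pins in software instead of using a hardware shift register. The sketch below shows the general pattern only. It assumes SPI mode 0 with the most significant bit first, which may not match the codec's actual interface timing, and the pin-access callbacks and loopback stubs are hypothetical stand-ins for real port register accesses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pin-access hooks; on real hardware these would
 * write and read the microcontroller's port registers. */
typedef struct {
    void (*set_sclk)(int level);
    void (*set_mosi)(int level);
    int  (*get_miso)(void);
} spi_pins;

/* Exchange one byte, MSB first, clock idle low (assumed mode 0). */
uint8_t spi_bitbang_transfer(const spi_pins *p, uint8_t out)
{
    assert(p != NULL && p->set_sclk && p->set_mosi && p->get_miso);
    uint8_t in = 0;
    for (int i = 7; i >= 0; i--) {
        p->set_mosi((out >> i) & 1); /* present data before the edge */
        p->set_sclk(1);              /* rising edge: peer samples MOSI */
        in = (uint8_t)((in << 1) | (p->get_miso() & 1));
        p->set_sclk(0);              /* falling edge: ready for next bit */
    }
    return in;
}

/* Host-side loopback stubs (MISO tied to MOSI) for trying this on a PC. */
static int sim_mosi_level;
static int sim_sclk_level;
static int sim_rising_edges;

static void sim_set_sclk(int level)
{
    if (level && !sim_sclk_level) sim_rising_edges++;
    sim_sclk_level = level;
}

static void sim_set_mosi(int level) { sim_mosi_level = level; }
static int sim_get_miso(void) { return sim_mosi_level; }
```

With the loopback stubs wired in, each byte sent comes straight back, which makes it easy to check the bit order and the clock count before moving the routine onto real pins.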
The watchdog timer is disabled during normal operation. When the design is placed in sleep mode, the watchdog timer periodically powers up the circuitry to check for audio signals.
The Micrel MICRF505 'zero-IF' single-chip transceiver was selected for this design because of its low external component cost, device flexibility, and its ability to block co-channel interference. The MICRF505 operates from a dedicated 2.5V voltage regulator to optimize noise performance.
The transmit data is applied directly to the MICRF505's VCO in this design. Because PLL-based synthesizers have an inherent high-pass filter characteristic, direct VCO modulation with a data signal can result in undesired attenuation in the signal passband.
To prevent this situation, the transmit data is 3B4B encoded in the microcontroller to ensure that the bandwidth of the data signal is high enough to be passed by the synthesizer. Encoding increases the transmit data rate by one-third, resulting in the 100kbps rate of data transfer both in and out of the MICRF505.
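The rate figures quoted in this design can be checked with a short calculation. The sketch below uses only numbers from the text (27.8kbps voice coding, a 4/3 expansion from 3B4B coding, and a 100kbps half-duplex link); the function names are hypothetical, and the "headroom" is simply the arithmetic difference, since the text does not itemize how the remaining link capacity is divided among framing, synchronization, and transmit/receive turnaround.

```c
#include <assert.h>

/* 3B4B maps every 3 data bits to 4 line bits: a one-third increase. */
double rate_after_3b4b(double data_kbps)
{
    assert(data_kbps >= 0.0);
    return data_kbps * 4.0 / 3.0;
}

/* Link capacity left after carrying coded voice in both directions. */
double tdd_headroom_kbps(double link_kbps, double codec_kbps)
{
    return link_kbps - 2.0 * rate_after_3b4b(codec_kbps);
}
```

With the figures above, `rate_after_3b4b(27.8)` is about 37.1kbps per direction, so both directions together occupy roughly 74.1kbps of the 100kbps link and `tdd_headroom_kbps(100.0, 27.8)` leaves about 25.9kbps for everything else.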
The transceiver uses an internally generated data clock signal to control the exchange of data with the microcontroller.
A 200mAh lithium polymer rechargeable battery from Ultralife Batteries was selected for this design because of its small form-factor and large energy density. The design can also easily accommodate other types of batteries.
The flexibility of both the CMX649 and the MSP430 allows operation with very low power consumption. While the exact current consumption is a function of many variables (e.g. user's voice volume, how often the user speaks into the microphone, etc.), the observed current consumption during normal operation is approximately 19.25mA, which translates to an estimated 'talk time' of 10.4 hours with the 200mAh battery.
The observed current consumption for sleep mode is 90µA, and this translates to a 'standby time' of approximately 92 days. (Note: these values are based on 100% battery capacity being available for use, but the low discharge rate of this design makes this assumption reasonably accurate.)
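The talk-time and standby-time estimates follow from dividing battery capacity by average current. The small sketch below reproduces that arithmetic; the function name is hypothetical, and the sleep current is taken as 90µA (0.09mA), the value consistent with the stated standby time of about 92 days.

```c
#include <assert.h>

/* Ideal run time in hours, assuming the full rated capacity is usable. */
double battery_hours(double capacity_mah, double current_ma)
{
    assert(capacity_mah > 0.0 && current_ma > 0.0);
    return capacity_mah / current_ma;
}
```

For example, `battery_hours(200.0, 19.25)` gives about 10.4 hours of talk time, and `battery_hours(200.0, 0.09) / 24.0` gives about 92.6 days of standby.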
Low-dropout regulators provide 2.5V to the RF transceiver and 3.3V to both the microcontroller and voice codec. The speaker driver is connected directly to the battery to keep current-surge noise from propagating to the rest of the circuitry.
This document has described ADM, explained its benefits, and introduced an ADM reference design that can be used as a 'seed' for wireless voice projects. It is hoped that designers will find this information useful and will consider ADM for their next wireless voice project.
For more information, please visit the following websites:
 R. Steele, Delta Modulation Systems, Pentech Press, London, England, 1975
 N.S. Jayant and P. Noll, Digital Coding of Waveforms; Principles and Applications to Speech and Video, Prentice-Hall, Englewood Cliffs, N.J., 1984
 CML Microcircuits Application Note:
"Using Two-Point Modulation To Reduce Synthesizer Problems When Designing DC-Coupled GMSK