The Evolution of Voice Over the Internet: Why Performance is Still an Issue

Published Nov 10, 2021 by Subspace Team
Digital voice and video communications are as old as the internet, but they have yet to deliver their full potential. The secret to solving jumpy video, voice distortion and dropped calls during a video conference or customer service call lies in the origins of the internet itself and the unbroken chain of improvements designed to mitigate the underlying problem.

The Birth of the Internet

In October 1969, the first message was transmitted through ARPAnet from a computer at UCLA to a computer at the Stanford Research Institute (SRI). This Department of Defense (DoD) research project was the beginning of what we now know as the internet.
The DoD needed a reliable way to share information between computers engaged in defense research scattered across the country. The network had to keep working even if a power failure, natural disaster, or even nuclear attack rendered one or more nodes inoperable. The researchers who developed the internet decided that the most reliable connection would be a mesh of independently operated networks with each node connected to multiple other nodes. There would never be a single point of failure. The challenge was managing communications across a mesh rather than a single line from point to point.
The answer came in 1974, when Vint Cerf and Robert Kahn published the design that became the Transmission Control Protocol/Internet Protocol (TCP/IP). This standard allows packets to be sent across independently operated networks, with the path decided hop by hop along the way. TCP/IP is what makes a network of networks possible. During transmission, a file is divided into many small packets. Each packet crosses the network separately and is reassembled into the file at the receiving end. As each packet crosses the networks, it hops from node to node. Each node tracks whether its neighboring nodes are responding and forwards the packet to the next available node, routing around any that are non-responsive.
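The split-and-reassemble idea can be sketched in a few lines of Python. This is an illustration only, not how TCP/IP is implemented: the function names `packetize` and `reassemble` are hypothetical, and real TCP adds acknowledgments, retransmission, checksums and flow control on top of sequence numbering.

```python
import random


def packetize(data: bytes, size: int) -> list[tuple[int, bytes]]:
    """Split data into (sequence_number, payload) packets of at most `size` bytes."""
    return [
        (seq, data[start:start + size])
        for seq, start in enumerate(range(0, len(data), size))
    ]


def reassemble(packets: list[tuple[int, bytes]]) -> bytes:
    """Rebuild the original data, whatever order the packets arrived in."""
    ordered = sorted(packets, key=lambda packet: packet[0])
    return b"".join(payload for _, payload in ordered)


if __name__ == "__main__":
    message = b"Each packet crosses the network separately."
    packets = packetize(message, size=8)
    random.shuffle(packets)  # simulate packets taking different paths
    print(reassemble(packets) == message)
```

Because every packet carries its own sequence number, the receiver does not care which path a packet took or when it arrived, only that it eventually did.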

Voice Over Internet

DoD planners thought that making voice calls across a super reliable network would be helpful in wartime, so they expanded the requirements for ARPAnet to include voice along with data.
Speech From a Machine
In 1973, Bob McAuley, Ed Hofstetter and Charlie Radar sent the first voice packet over ARPAnet. They encoded the speech with Linear Predictive Coding (LPC), a technique that compresses voice by transmitting a compact model of the sound instead of the raw signal. The packets still had to travel under a network protocol, which is a set of rules that devices connected to the network use to understand each other. As long as the computers on the network use the same set of rules, they can find each other, connect, and communicate.
Speech Across Disparate Networks
From 1974 through 1982, Glen Culler, working with the Lincoln Lab at MIT, expanded the use of LPC for digital voice transmission to show the viability of digital voice communication on various networks. They eventually demonstrated voice communications, including conference calls, across a mobile packet radio net, a cable network, and an interface with the Public Switched Telephone Network (PSTN).
A Gamer Solved the Problem Where the DoD was Struggling
In 1989, Brian C. Wiles developed RASCAL to allow video game players to communicate with each other over dial-up modems while playing. RASCAL was the first VoIP application. Communication was limited to local Ethernet networks because voice was too much data to send across the public internet.

Mitigating the Problems

As the research and application of digital voice communications progressed, two persistent problems hindered development:
  • Voice data is much larger than the text that the internet was designed to carry.
  • There was no reliable way to negotiate a connection between two machines from different manufacturers running different software.
The development of compression codecs tackled the data size issue while signaling protocols helped solve the connection problems.
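A rough calculation shows the scale of the data size problem. The sketch below compares one minute of uncompressed telephone-grade audio (8,000 samples per second at 8 bits each, or 64 kbit/s) with the same minute at 8 kbit/s, the rate of a low-bitrate codec such as G.729, and with the same speech typed out as text. The text figure is an assumption made for illustration: about 150 spoken words per minute at about 6 bytes per word. The names are hypothetical.

```python
def bytes_per_minute(bits_per_second: int) -> int:
    """Convert a constant bitrate into bytes transferred in 60 seconds."""
    return bits_per_second * 60 // 8


PCM_BPS = 8000 * 8    # uncompressed telephone audio: 8 kHz x 8 bits = 64 kbit/s
CODEC_BPS = 8000      # a low-bitrate speech codec running at 8 kbit/s

# Assumption for illustration: ~150 words per minute, ~6 bytes per word.
TEXT_BYTES_PER_MIN = 150 * 6

if __name__ == "__main__":
    pcm = bytes_per_minute(PCM_BPS)
    codec = bytes_per_minute(CODEC_BPS)
    print(f"uncompressed voice: {pcm} bytes/min")
    print(f"compressed voice:   {codec} bytes/min")
    print(f"equivalent text:    {TEXT_BYTES_PER_MIN} bytes/min")
    print(f"voice vs. text:     about {pcm // TEXT_BYTES_PER_MIN}x larger")
```

Under these assumptions, compression cuts the voice stream by a factor of eight, yet it remains far larger than the equivalent text.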
Protocols and Codecs Make Digital Voice Possible—But Not Great
In 1988, the International Telecommunication Union Telecommunication Standardization Sector (ITU-T) approved the G.722 wideband audio codec, which provided markedly improved speech quality compared to LPC. G.722 was rated as “toll quality,” meaning its audio was comparable to the PSTN.
In 1995, VocalTec Communications offered the first commercial VoIP application. The VocalTec Internet Phone operated using the H.323 signaling protocol.
In 1996, the Session Initiation Protocol (SIP) was introduced as a signaling protocol. It scaled better than H.323 and became the protocol of choice for mobile VoIP. The improved G.729 codec was approved the same year.
The advances in signaling and compression led to the explosion of VoIP technology available to individuals and businesses, including:
  • Asterisk, the first IP-PBX, in 1999
  • Vonage, the first IP phone system, in 2001
  • Skype, the breakthrough conferencing system, in 2003
  • Apple FaceTime in 2010
  • Today’s ubiquitous presence of IP phones in homes and offices and our current obsession with video conferencing solutions such as Zoom and WebEx
  • WebRTC, which allows us to host real-time applications in a browser

The Root of the Problem

Even with the latest compression codecs and signaling protocols, we have all experienced the challenges that have plagued voice and video since the early experiments in the 1970s—voice distortion, dropped calls, long wait times for a connection, and choppy voice and video. These problems still occur because ARPAnet, and the internet that followed, were designed to deliver bulk data reliably. It didn’t matter if it took 10 seconds or 10 minutes to deliver a file as long as it eventually arrived.
Voice and video, on the other hand, are sensitive to millisecond delays. All the advances in signaling and compression achieved over the last 40 years can’t compensate for the fact that the internet just wasn’t designed for this type of traffic. What will ultimately solve this problem is a new internet built with different priorities.
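The difference between bulk and real-time delivery can be made concrete with a small sketch. For a file transfer, every packet that eventually arrives is useful. For a live call, a packet that arrives after its playout deadline is as good as lost. The delay values and the function name `late_packets` below are hypothetical, and the 150 ms budget is an assumption based on the commonly cited guideline for one-way delay in conversational voice.

```python
def late_packets(delays_ms: list[int], playout_budget_ms: int) -> list[int]:
    """Return the delays of packets that miss the playout deadline."""
    return [delay for delay in delays_ms if delay > playout_budget_ms]


if __name__ == "__main__":
    # Hypothetical one-way delays for six consecutive voice packets.
    delays_ms = [20, 35, 180, 25, 400, 30]
    missed = late_packets(delays_ms, playout_budget_ms=150)
    print(f"file transfer: {len(delays_ms)} of {len(delays_ms)} packets usable")
    print(f"live call:     {len(delays_ms) - len(missed)} of {len(delays_ms)} packets usable")
```

With these sample numbers, the file transfer loses nothing, while the call has gaps where the two late packets should have been played.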

The Solution is Here

The next generation of real-time applications requires a network that prioritizes delivering data quickly. Subspace is the internet custom-built for today's real-time applications. Subspace virtually eliminates network lag, jitter and packet loss, giving your voice and video customers the clearest, most reliable calls available.
Find out how you can empower your digital communications products on Subspace. Sign up here.
