Simulating a poor VOIP call (dropped frames)

Greetings, Audaciteers!

I am working on a network QoS lesson for my students, and I thought it would be a good demonstration to show the effect of congestion and dropped frames on a VOIP call. I immediately thought of using Audacity on a sound sample to simulate various stages of signal issues – but I don’t really know how to go about this.

Does this group have any suggestions of filters I can use for this purpose? Again, this is meant to be demonstrative, not 100% accurate…


I don’t know about dropped frames. What happens with natural speech is that the system senses a restriction in the data rate and compresses the audio harder to try to make up the difference. That’s when it sounds like talking into a wine glass or milk jug. Then it sometimes snaps back to normal and leaves out part of a word or includes a garbled word. Anybody with a stand-alone microphone, a wine glass and a milk jug can simulate this: record the dialog several times and switch between the recordings.

Don’t try that on one track. Use multiple tracks (top to bottom) and switch between them with clever editing or Time Shift Tool (two sideways black arrows) and Envelope Tool.


Here’s a little script that can be run in the Nyquist Prompt effect.
It should only be used with fairly short mono tracks.

It splits the selected audio into “frames” of “Frame size (samples)” samples each.
Then it drops a random number of samples, between 0 and “Max dropped samples”, from the end of each frame (the first frame is always left whole).
There is also an option for the dropped samples to be either “Deleted”, which makes the overall audio shorter, or “Silenced”, which replaces them with silence.

;version 4
;type process

;control max-frame-size "Frame size (samples)" int "" 400 110 4410
;control max-drop-samples "Max dropped samples" int "" 20 0 100
;control mode "Dropped samples are" choice "Deleted,Silenced" 0

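;; Split SIG into frames, trim a random number of samples from the end
;; of each frame, and mix the frames back together at the right offsets.
(defun drop-samples (sig)
  (let ((out (s-rest 0)))
    ;First frame must be max size
    ;; T0 is the output start position (in samples) of the current frame.
    ;; "Deleted" mode advances by the trimmed length, so the gap closes up;
    ;; "Silenced" mode advances by the full frame, leaving a silent gap.
    (do* ((T0 0 (+ T0 (if (= mode 0) frame-size max-frame-size)))
          (frame-size max-frame-size
                      (- max-frame-size (random max-drop-samples)))
          ;; Read frame-size samples, then step on by a whole frame.
          (buf (snd-fetch-array sig frame-size max-frame-size)
               (snd-fetch-array sig frame-size max-frame-size)))
        ((not buf) out)
      (setf out
        (sim out
             (snd-from-array (/ T0 *sound-srate*) *sound-srate* buf))))))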

(if (soundp *track*)
    (drop-samples *track*)
    "Error.\nMono track only")

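For anyone who wants to see the frame logic outside Audacity, here is a rough Python sketch of the same idea working on a plain list of samples. It only illustrates the description above and is not a port of the Nyquist script: the function name `drop_samples`, its parameters and the `rng` argument are made up for this example, and the handling of a short final frame is an assumption.

```python
import random


def drop_samples(samples, frame_size=400, max_drop=20, silence=False, rng=None):
    """Trim a random number of samples from the end of each frame.

    The first frame is kept whole. Each later frame loses between 0 and
    max_drop - 1 trailing samples (assumption: this matches the range of
    Nyquist's RANDOM function).
    """
    rng = rng or random.Random()
    out = []
    first = True
    for start in range(0, len(samples), frame_size):
        frame = samples[start:start + frame_size]
        # No drop on the first frame, or when max_drop is zero.
        drop = 0 if first or max_drop <= 0 else rng.randrange(max_drop)
        first = False
        # A short final frame may be smaller than the drop; never go negative.
        keep = max(len(frame) - drop, 0)
        out.extend(frame[:keep])
        if silence:
            # "Silenced": pad with zeros so the overall length is unchanged.
            out.extend([0.0] * (len(frame) - keep))
        # "Deleted": nothing is added, so the output gets shorter.
    return out


if __name__ == "__main__":
    tone = [1.0] * 2000  # stand-in for real audio samples
    deleted = drop_samples(tone, silence=False, rng=random.Random(1))
    silenced = drop_samples(tone, silence=True, rng=random.Random(1))
    print(len(tone), len(deleted), len(silenced))
```

With `silence=True` the output always keeps the original length and the trimmed samples become short gaps; with `silence=False` the gaps close up and the result is shorter.
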
And here’s a short demo - first with no effect, then two examples of the effect.

If you install FFmpeg libraries into Audacity, you can use AMR (narrowband) codecs, as heard on mobile phones …

These look like interesting options – thanks for the input!

Now, to work…

Wow - this nailed it completely. With a small frame size, a large “Max dropped samples” value and the dropped samples silenced, it sounds much like a poor-quality VOIP call with dropped frames.

Thanks, Steve. This was very helpful :slight_smile: