I am going to ™ the phrase “Can Of Worms”.
Still mulling over the ideas presented throughout this forum, it seems to me that besides the parameters available in Step 2 of the Noise Reduction pane, the length of the selection for Step 1 might have some bearing on the effect/quality of Step 2.
As a boundary condition I selected the entire track for Step 1 before running Step 2, and to my great surprise, the results sounded OK.
Above is one of my benchmark test files: Me reading the menu items from the Audacity application (File, Edit, Select, View and so on).
I was thinking of starting with a half-second, then a 1-second, 2-second, and 4-second chunk from the leading, or the trailing, five-second silence left vacant on the track as a noise-level benchmark for this recording, and expected that selecting the entire track as a noise profile would empty the track. I had a vague idea that there might be, for an individual track, a sweet spot for length-of-selection; too little and the sample is too small, too big and the sample can never locate noise.
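To make that plan concrete, here is a rough sketch that only lists the selections to try as a noise profile; it does not touch Audacity at all. The 60-second track length, the function name and the labels are assumptions made up for illustration. The five-second silences and the chunk lengths are the ones described above, and the whole-track entry is the boundary condition.

```python
def profile_windows(track_len, silence_len=5.0, lengths=(0.5, 1.0, 2.0, 4.0)):
    """Return (label, start, end) selections, in seconds, to try as noise profiles."""
    windows = []
    for length in lengths:
        if length > silence_len:
            continue  # this chunk would spill out of the silence into speech
        # Chunk taken from the very start of the leading silence.
        windows.append((f"lead-{length}s", 0.0, length))
        # Chunk taken from the very end of the trailing silence.
        windows.append((f"trail-{length}s", track_len - length, track_len))
    # Boundary condition: the entire track used as the noise profile.
    windows.append(("whole-track", 0.0, track_len))
    return windows


# Hypothetical 60-second benchmark track.
for label, start, end in profile_windows(60.0):
    print(f"{label:12s} {start:6.1f} -> {end:6.1f}")
```

Each row is one Step 1 selection; Step 2 would then be run on a fresh copy of the track with the same settings every time, so that only the profile length changes between runs.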
I am not at all sure that I want to know how this all works, now.
Why benchmark tracks, since “Noise” is in the beholder’s ears? Because some sort of randomized blind test ought to be made available for anyone who decides that “your macro doesn’t work”, which statement impugns the character of Audacity, IMHO.
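One possible shape for such a randomized blind test, as a minimal sketch only: for each benchmark clip, the original and the processed version are randomly assigned to "A" and "B", and the answer key is kept away from the listener. The clip names and the `blind_trials` function are hypothetical, not anything Audacity provides.

```python
import random


def blind_trials(clips, seed=None):
    """Build a shuffled A/B answer key: one trial per benchmark clip."""
    rng = random.Random(seed)  # a seed makes the key reproducible
    trials = []
    for clip in clips:
        versions = ["original", "processed"]
        rng.shuffle(versions)  # randomize which version is played as A
        trials.append({"clip": clip, "A": versions[0], "B": versions[1]})
    rng.shuffle(trials)  # randomize the order in which clips are presented
    return trials


# Hypothetical clip names; the listener hears only "A" and "B".
for trial in blind_trials(["menu-items", "leading-silence"], seed=1):
    print(trial)
```

The listener notes which of A or B sounds cleaner for each clip, and the choices are compared with the key afterwards.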
Cheers
Chris