From Matthew Brook O’Donnell, Corpus Linguistics and the Greek of the New Testament, p. 388:
It seems unlikely that by simply counting words it is possible to differentiate between authors. While a particular author may have a core or base vocabulary, as well as an affinity for certain words (or combination/collocation of words), there are many factors, for instance, age, further education, social setting, rhetorical purpose and so on, that restrict or expand this core set of lexical items. In spite of this, New Testament attribution studies and many commentaries (sadly, some rather recent ones at that) have placed considerable weight on counting the number of words found in one letter but not found in a group of letters assumed to be authentic. (O’Donnell, 388)
I can’t tell you how many times I’ve read authorship discussions on the Pastorals in commentaries where the argument boils down to "read P.N. Harrison’s Problem of the Pastoral Epistles, he got it right". This is pawning the argument off on what is essentially a misdirected attempt at stylometry through hapax-legomena counting. Statistics are not easy to understand, and when someone makes a statistical case that sounds good, it is easy to accept it, point to it, and never think about it again. "So-and-so has all sorts of numbers, statistics, math and tables that I don’t fully understand, so it must be right."
I’m not saying that all commentaries, monographs and such that dispute Pauline authorship do this. Some do not, and they are well worth reading because they’re really wrestling with the stylistic issues. But if your reason for discounting Pauline authorship rests solely on comparative proportions of hapax legomena between two different slices of a corpus … well, you’re not standing on firm ground.
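For anyone who has never seen what this kind of counting actually amounts to, here is a minimal Python sketch. It is not Harrison’s or O’Donnell’s procedure, just my own illustration of the basic idea: take a target text, take a reference group, and count the word types in the target that never show up in the reference. The function names, the crude tokenizer and the toy English strings are all assumptions made up for the example; real work would use lemmatized Greek.

```python
import re

def tokenize(text):
    """Lowercase and split on non-letters (a crude stand-in for real lemmatization)."""
    return [w for w in re.split(r"[^a-z]+", text.lower()) if w]

def unique_to_target(target, reference_texts):
    """Word types in `target` that occur in none of the reference texts."""
    reference_vocab = set()
    for text in reference_texts:
        reference_vocab.update(tokenize(text))
    return set(tokenize(target)) - reference_vocab

def unique_proportion(target, reference_texts):
    """Share of the target's word types that are absent from the reference group."""
    target_types = set(tokenize(target))
    if not target_types:
        return 0.0
    return len(unique_to_target(target, reference_texts)) / len(target_types)

# Toy data, invented purely for illustration (not drawn from any real corpus).
reference = ["grace and peace to you", "faith hope and love remain"]
target = "grace and sound teaching"

print(sorted(unique_to_target(target, reference)))
print(unique_proportion(target, reference))
```

Notice that nothing in the calculation knows anything about topic, audience, occasion or the length of the texts being compared. The number changes whenever you change which texts go in the reference group, which is exactly why it is such a shaky foundation for an authorship argument.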