Wobble. You won’t find it in any Arbitron glossary or index, but the word is just as much a part of Arbitron’s lexicon as Cume or AQH. While there is no official definition, most broadcasters would agree that a wobble is an unexpected change in a station’s ratings that cannot be explained by market dynamics or station actions. (Most PDs would add that wobbles are always downward. Good books are never by chance!)
Before PPM, Arbitron refused to officially acknowledge the existence of wobbles. When confronted with wild swings in the numbers from one book to another, Arbitron representatives always attributed them to things like lack of advertising, playing the wrong songs, or lunar eclipses.
That all changed with the roll-out of PPM. Suddenly Arbitron started finding all sorts of flaws in the diary methodology. Suddenly, wobbles were a problem that PPM would fix.
While the quarterly report is a measurement over 12 weeks, each participant fills out a diary for only one week, so each of the 12 weeks is measured by a distinct set of individuals. Because Arbitron doesn’t want to spend any more money than it needs to, it places only as many diaries in a market as it believes necessary to meet its promised sample size.
The company makes an effort to match diary demographics to the population, but until all the diaries come back, there’s no way to know how closely participant demographics match the market’s make-up. (Actually there is, but it involves increased costs that Arbitron is unwilling to shoulder.)
Arbitron tallies the returns by demographic cell and then comes up with weights to match the diaries to the market. A diary’s PPDV, or Persons Per Diary Value, tells us how many people each diary represents. If returns fall short in a cell, the PPDV will be large; if returns exceed what is needed, the PPDV will be smaller.
Wobbles are generally caused by inadequate diary returns. Too few returned diaries in a cell results in a large PPDV, and a large PPDV leads to large swings in the numbers.
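To make the arithmetic concrete, here is a minimal sketch of how a PPDV works; the cells, populations, and return counts are hypothetical placeholders, not actual Arbitron figures:

```python
# Hypothetical illustration of how a Persons Per Diary Value (PPDV) is derived.
# Cell populations and diary counts below are invented numbers, not Arbitron data.

cell_population = {        # metro population in each age-sex cell
    "Men 18-34": 120_000,
    "Women 18-34": 125_000,
    "Men 35-64": 150_000,
}

diaries_returned = {       # usable diaries returned in each cell
    "Men 18-34": 40,       # a short cell: each diary must "speak" for more people
    "Women 18-34": 125,
    "Men 35-64": 150,
}

for cell, population in cell_population.items():
    ppdv = population / diaries_returned[cell]
    print(f"{cell}: PPDV = {ppdv:,.0f} persons per diary")
```

In this made-up example, each Men 18-34 diary stands in for 3,000 people while the other cells’ diaries represent 1,000, so a single atypical diary can move the short cell’s estimate by itself; that is the mechanism behind a wobble.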
PPM was to change all this. Instead of using participants for only one week, participants would become part of a permanent panel. The panel would be carefully recruited to closely match the market’s demographics. Weighting would still be applied, but since the panel was a carefully crafted microcosm of the market, all cells would be equally weighted, and weighted the same each month. This would result in more stable survey-to-survey trends, according to PPM literature.
As with many things related to PPM, the delivery has fallen considerably short of the promise.
Wobbles have not disappeared. In fact, in some respects the wobbles have gotten worse. We are seeing the same kinds of swings in the numbers that we see with the diaries. The problem appears to be the panel.
As we noted earlier, Arbitron is having problems with compliance. Participants are not consistently carrying their meters. As a result, Arbitron is having to employ a dynamic weighting system to compensate for participants who come and go within the active panel.
The weights are even more extreme than with diaries, and because PPM requires more complicated weighting than the diary system did, swings in the numbers are inevitable.
This graph illustrates the impact. We compared the weights employed in New York’s last diary book with those in New York’s latest PPM month. For the diary book, we calculated the difference between the metro population and the unweighted in-tab by demo and averaged the mismatches. We then did the same for PPM.
With the diaries, age-sex cells were off by an average of 21%, with a dispersion of plus or minus 11%. In the latest PPM month, age-sex cells were off by an average of 23%, with a dispersion of plus or minus 18%. Dispersion tells us how far from the average most of the cells fall.
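For readers who want to reproduce the arithmetic, the mismatch and dispersion figures come from a calculation along these lines; the cell shares below are hypothetical placeholders, not the actual New York in-tab:

```python
# Sketch of the mismatch calculation described above.
# pop_share:   each age-sex cell's share of the metro population
# intab_share: that cell's share of the unweighted in-tab (diaries or panelists)
# The shares below are hypothetical placeholders, not Arbitron's New York data.
from statistics import mean, pstdev

pop_share   = {"M18-34": 0.14, "W18-34": 0.15, "M35-54": 0.17,
               "W35-54": 0.18, "M55+": 0.17, "W55+": 0.19}
intab_share = {"M18-34": 0.10, "W18-34": 0.13, "M35-54": 0.20,
               "W35-54": 0.22, "M55+": 0.16, "W55+": 0.19}

# Percent mismatch for each cell: how far the in-tab misses the population.
mismatch = [abs(intab_share[c] - pop_share[c]) / pop_share[c] * 100
            for c in pop_share]

print(f"Average mismatch: {mean(mismatch):.0f}%")    # the 'off by an average of X%' figure
print(f"Dispersion:       {pstdev(mismatch):.0f}%")  # spread of the cells around that average
```

Running the same calculation once on the diary in-tab and once on the PPM in-tab produces the two pairs of numbers compared here.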
The PPM panel weighting error is slightly higher than the diary weighting error, suggesting that the active panel is no more representative of New York than the diary sample was. More importantly, the dispersion is almost twice as high as with diaries. That means some cells are being weighted much more severely now with the panel than they were when Arbitron used one-week diaries.
Put another way, Arbitron did a better job of drawing a representative set of diary keepers than it is currently doing with its New York panel.
Note that this analysis looked at only discrete age and sex cells. It did not look at ethnic weighting. The ethnic issues are well documented. Had we included ethnic weighting, the differences would have been considerably greater.
So here are our questions to Arbitron:
- What has gone wrong with the panels?
- Why are they no more representative than the diary keepers?
- How can the problem be fixed?