YouGov and prediction

YouGov are sending out emails crowing about their successful prediction of the General Election. Let’s not forget that they didn’t exactly ‘big up’ that new model before the election. Although neither did I with my model, so I won’t criticise them on that score here.

This post concerns the nature of their model and how it is likely to work – or not – in future. Let’s start with a problem that is particularly acute in transport economics and in voting:

A respondent who over-promises (ROP).

In transport, the problem is that there are always a number of people who say, in a simplistic stated preference survey (a conventional poll or survey), that they would use public transport, but who, in reality, jump in their car.

In voting, the ROPs say they’ll turn out on election day to vote for party x but then don’t (historically a problem more acute for the Labour Party). This, by definition, affects turnout, and it is usually the excuse pollsters give as the main source of inaccuracy in their predictions – the one they’ll “get right next time”. (It isn’t, and they don’t.)

YouGov, to give them credit, recognise that turnout is influenced by attitudes, amongst other things – the weather is always a bit of an unknown, but we know it has some influence and none of us can do much to model it. Their ‘big data’ model attempts to use respondents’ attitudes to identify more accurately who is a ROP: strongly held attitudes mean you’re more likely to vote (and thus not be a ROP).

Here is the reason their approach will fail at some point. Attitudes cannot and should not be elicited from people using:

  • Category rating scales (“on a scale of 1 to 5, how likely are you to vote?”; “how strongly do you agree that the NHS deserves more funding?”; etc.);
  • Dichotomous (yes/no) questions (“do you agree or disagree with the following statement…?”).

The first method was discredited long ago (the key academic ‘straw that broke the camel’s back’ papers proved this around the turn of the millennium). For starters, people use different parts of the scale: one respondent’s ‘4’ is another respondent’s ‘5’. The second method looks, on the face of it, like a solution to this.
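To make the scale-use problem concrete, here is a minimal sketch with made-up numbers (the mapping function and latent values are illustrative assumptions, not from any real survey): two respondents hold exactly the same underlying attitudes, but one avoids the extremes of the 1–5 scale while the other uses its full range, so their raw ratings disagree.

```python
# Illustration of scale-use heterogeneity on a 1-5 category rating scale.
# Two hypothetical respondents share the same latent attitudes (on a 0-1
# scale) but map them onto the rating categories differently.

latent = [0.1, 0.5, 0.9]  # same underlying attitude strengths for both

def rate(latent_value, low, high):
    """Map a latent attitude onto the rating scale, using only [low, high]."""
    return round(low + latent_value * (high - low))

midpoint_user = [rate(v, 2, 4) for v in latent]    # avoids the extremes
full_range_user = [rate(v, 1, 5) for v in latent]  # uses the whole scale

print(midpoint_user)    # [2, 3, 4]
print(full_range_user)  # [1, 3, 5]
# Identical attitudes, different ratings: comparing raw scores across
# respondents conflates attitude strength with scale-use style.
```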

However, both methods fail ultimately because they give no insights into:

  • How often a person would choose one attitude over another when both cannot be satisfied at once;
  • How strongly they hold each attitude dear to them.

If you understand these, particularly the second, you have a good idea of how likely they REALLY are to turn out to vote. It’s what choice modelling can do – if done correctly, as I have shown, and it is what has made me one of the top three experts in the world. Since the proof of the pudding is in the eating, it’s really rather gratifying that a choice model that wasn’t even fully set up to predict a totally out-of-the-blue general election delivered the goods astonishingly well.

YouGov’s ideas are in the right ballpark, but their predictions will fail. They look at attitudes in the wrong way and rely on past behaviour; choice modelling looks forward. That, ultimately, is why choice modelling worked and will at some point (particularly in this period of electoral re-alignment) out-perform YouGov. Now that I have proof of concept, I’ll be collecting much larger winnings from the bookie when that happens.