I’ve finally got around to doing some analysis on the parliamentary vote data using a model that can handle both binary data (the votes) and missing data (MPs not voting).

Intuitively, it’s quite easy to imagine how it works:

Assume we have a load of MPs and one vote that they all attended. They can only vote yes or no (1 or 0). It is clearly possible to get all of the MPs to stand in a line such that all of them to the left of some point voted no (0) and all to the right voted yes (1). We could then characterise each MP by their position in the line, and from that position work out how they voted. We’d end up with something that looks like this (each digit is an MP):

000000000111111111

Seem pointless? Yep, it would be. But what if there were two votes? If the MPs voted the same in both votes, it would be easy – we’d still only need one line, and one reference point. It would look like this (each MP is now a column, each row a vote):

V1: 000000011111111

V2: 000000011111111

If they voted completely oppositely, it would still be possible:

V1: 000000011111111

V2: 111111100000000

We can still use a single line if they vote a bit differently in two votes but two reference points will be required:

V1: 00000000111111

V2: 00001111111111

For vote 2, the reference point is slightly to the left of that for vote 1.

It’s just as easy to dream up voting patterns for which we can’t do this. For example, we can’t reorder the following MPs such that for each vote there is a point to the left of which they all vote 0 and to the right 1:

V1: 00000001111111

V2: 00011110000011
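
The examples above can be checked mechanically. Here is a minimal sketch, assuming every MP voted in every vote and that a vote may run in either direction along the line (as in the "completely oppositely" case). The function name `fits_on_a_line` is made up for illustration and is not part of the model used later in the post.

```python
def fits_on_a_line(votes):
    """votes: one string of 0s and 1s per vote; column j is MP j."""
    rows = [v.replace(" ", "") for v in votes]
    # Collect each distinct voting record, e.g. ('0', '1') for "no, then yes".
    records = set(zip(*rows))
    for leftmost in records:
        # Relabel every vote so the candidate leftmost MP reads all zeros.
        relabelled = [tuple(int(a != b) for a, b in zip(rec, leftmost))
                      for rec in records]
        relabelled.sort(key=sum)
        # Walking left to right, an MP may only switch from 0 to 1, never back.
        if all(all(x <= y for x, y in zip(p, q))
               for p, q in zip(relabelled, relabelled[1:])):
            return True
    return False

print(fits_on_a_line(["00000000111111", "00001111111111"]))  # True
print(fits_on_a_line(["00000001111111", "00011110000011"]))  # False
```

The last example fails because all four combinations of two votes (00, 01, 10, 11) occur, and two reference points only cut a line into three pieces.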

What we can do is position the MPs in two dimensions – imagine placing them in a room rather than on a line. For each vote, we’ll split the room in two with a straight line such that all MPs on one side of the line vote one way, and all on the other side vote the other way.

For two votes, we could always position the MPs in two dimensions in this way (much like we could with one vote and a line). We might be able to do three votes in two dimensions, but we can’t be sure – it depends on how they voted.
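
To see why two votes always fit in a room, here is a small sketch using the awkward pair of votes from earlier. It assumes the simplest possible layout: each MP stands at the corner given by their two votes, and the two dividing lines are x = 0.5 and y = 0.5. The helper `side` is a hypothetical name, and this is only one of many layouts that would work.

```python
v1 = "00000001111111"
v2 = "00011110000011"

# Stand each MP at the corner of the room given by their two votes.
positions = [(int(a), int(b)) for a, b in zip(v1, v2)]

def side(position, normal, offset):
    """Return 1 if the MP is on the 'yes' side of the line normal . p = offset."""
    return int(sum(n * p for n, p in zip(normal, position)) > offset)

# Vote 1 is split by the vertical line x = 0.5, vote 2 by the line y = 0.5.
predicted_v1 = "".join(str(side(p, (1, 0), 0.5)) for p in positions)
predicted_v2 = "".join(str(side(p, (0, 1), 0.5)) for p in positions)

print(predicted_v1 == v1, predicted_v2 == v2)  # True True
```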

Given data for a full parliament (about 1600 votes), it’s interesting to see how well we can do this with a particular number of dimensions. For example, if we restrict ourselves to two dimensions, is it possible to lay the MPs out and draw a line for each vote such that each MP is on the correct side of each line? If it is not possible, what’s the highest number of MP-vote pairs that we can get right?
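
As a rough illustration of what "getting an MP-vote pair right" means, the sketch below scores a given layout against the votes and skips any pair where the MP did not vote. It only does the counting and is not the fitting procedure itself. The name `fraction_correct`, the `None` convention for missing votes and the toy data are all assumptions made for this example.

```python
def fraction_correct(positions, lines, votes):
    """positions[j]: coordinates of MP j.
    lines[i]: (normal, offset) describing the dividing line for vote i.
    votes[i][j]: 1, 0, or None if MP j did not take part in vote i."""
    right = total = 0
    for (normal, offset), row in zip(lines, votes):
        for position, vote in zip(positions, row):
            if vote is None:  # missing data: leave this pair out of the score
                continue
            predicted = int(sum(n * p for n, p in zip(normal, position)) > offset)
            right += predicted == vote
            total += 1
    return right / total

# Three hypothetical MPs on a line (one dimension), two votes, one absence.
positions = [(0.0,), (1.0,), (2.0,)]
lines = [((1.0,), 0.5), ((1.0,), 1.5)]
votes = [[0, 1, 1], [0, None, 0]]
print(fraction_correct(positions, lines, votes))  # 0.8
```

The same function works unchanged in two or more dimensions, because each dividing line is stored as a normal vector and an offset.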

This might tell us something about the voting patterns of the parliament. For example, in the UK we have a three-party system (or at least we had until recently). Let’s assume that for each vote in, say, the 1997 parliament, Labour MPs voted one way, Conservative MPs the other, and the Lib Dems sided with one of the other two. If this were the case, we’d only need one dimension:

V1: 0000 1111 1111

V2: 0000 0000 1111

V3: 1111 1111 0000

V4: 1111 0000 0000

Each column is an MP, each block (set of four columns) a party (Labour, Lib Dem, Conservative, in that order). If we find for real data that we need only one dimension, it suggests this (or something similar) is happening.

Alternatively, if we assume that sometimes Labour and Conservatives vote together and the Lib Dems differently, we would need a second dimension.
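
A quick brute-force check of the three-party story, assuming each party votes as a single block. It tries every left-to-right ordering of the parties and asks whether each vote changes at most once along the line. The function names and the extra fifth vote are made up for illustration.

```python
from itertools import permutations

PARTIES = ("Labour", "Lib Dem", "Conservative")

def one_cut_per_vote(order, votes):
    """True if, walking along the line, every vote changes at most once."""
    return all(sum(v[a] != v[b] for a, b in zip(order, order[1:])) <= 1
               for v in votes)

def one_dimension_is_enough(votes):
    return any(one_cut_per_vote(order, votes) for order in permutations(PARTIES))

def vote(lab, lib, con):
    return {"Labour": lab, "Lib Dem": lib, "Conservative": con}

# The four block votes from the example above.
four_votes = [vote(0, 1, 1), vote(0, 0, 1), vote(1, 1, 0), vote(1, 0, 0)]
print(one_dimension_is_enough(four_votes))  # True

# Hypothetical extra vote: Labour and Conservatives together, Lib Dems apart.
print(one_dimension_is_enough(four_votes + [vote(0, 1, 0)]))  # False
```

With the extra vote, each of the three parties has to sit at an end of the line for some vote, and a line only has two ends.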

The following plot shows the percentage of MP-vote pairs we can get right for different parliaments as we increase the number of dimensions:

(The line for 2010 should be treated with some caution – there have only been 100 votes or so to date.)

To put the y-axis into perspective, in a parliament of 1600 votes and 600 MPs, an increase of 1% corresponds to getting approx 9000 additional votes correct – about 10 MPs’ worth if the MPs vote about 50% of the time.
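
Spelling that arithmetic out, under the assumption that the percentage is taken over every possible MP-vote pair (the variable names are just for illustration). The exact products come out slightly above the rounded figures quoted above.

```python
votes_in_parliament = 1600
mps = 600
attendance = 0.5  # assumed fraction of votes a typical MP takes part in

mp_vote_pairs = votes_in_parliament * mps        # 960,000 possible pairs
one_percent = mp_vote_pairs // 100               # 9,600 pairs
votes_per_mp = votes_in_parliament * attendance  # 800 votes cast per MP
mps_worth = one_percent / votes_per_mp           # 12.0 MPs' worth of votes

print(one_percent, mps_worth)  # 9600 12.0
```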

The results suggest two things to me. Firstly, the voting patterns in each parliament are pretty simple – we can get a lot of the votes right with two dimensions. This is not surprising – most MPs will vote along party lines and we have three (main) parties. Secondly, it looks like the three successive Labour parliaments have been getting slightly more complex over time – 1997 seems to have the simplest structure, but not by much. Perhaps over time MPs became a bit more rebellious?

I suggest using Vapnik-Chervonenkis dimension.

Hmm, not sure how?

The VC-dimension of each individual ‘classifier’ is 3 in two dimensions (d + 1 in d dimensions) – they’re all linear.

Do you mean computing the VC-dimension required for the data?

Probably best not to put a spanner in the works and point out that in Parliament, MPs can vote twice. See page 4 of http://www.parliament.uk/documents/commons-information-office/p09.pdf

(I don’t think it happens too often these days – though possibly more after a long lunch…)