Archive for January, 2012

Non-random samples

It is a well-worn teaching practice to hold up results from large, non-random samples and contrast them with the greater accuracy obtained from much smaller random ones.

The standard, if dated, example is the Literary Digest poll for the 1936 US Presidential Election, which wrongly forecast a landslide for Landon from a sample of 2.4 million voters. Gallup’s poll, based on a much smaller random sample of 50,000, not only predicted the right election result (although it underestimated Roosevelt’s share of the vote, at 56% instead of 62%) but also predicted the result that the Digest’s methods would produce, before the Digest had produced it! The event hastened the demise of the Literary Digest and made George Gallup’s reputation.
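For teaching, the contrast can be reproduced with a short simulation. The sketch below is illustrative only: the true share of 62%, the response probabilities and the sample sizes are all made-up assumptions, not a reconstruction of either the Digest or the Gallup poll. It simply shows that a large sample with differential response can miss badly while a small random sample lands close to the truth.

```python
import random

TRUE_SHARE = 0.62  # assumed true support for candidate A (illustrative only)


def biased_poll(n_contacted, rng, p_respond_a=0.15, p_respond_b=0.35):
    """Mail-out poll in which A's supporters are less likely to reply.

    The two response probabilities are hypothetical values chosen
    to make the bias visible.
    """
    replies_a = replies_b = 0
    for _ in range(n_contacted):
        supports_a = rng.random() < TRUE_SHARE
        p_respond = p_respond_a if supports_a else p_respond_b
        if rng.random() < p_respond:
            if supports_a:
                replies_a += 1
            else:
                replies_b += 1
    return replies_a, replies_a + replies_b


def random_poll(n, rng):
    """Simple random sample in which everyone selected answers."""
    supporters = sum(rng.random() < TRUE_SHARE for _ in range(n))
    return supporters, n


rng = random.Random(1936)  # fixed seed so the run is repeatable
big_a, big_n = biased_poll(200_000, rng)
small_a, small_n = random_poll(2_000, rng)

big_estimate = big_a / big_n
small_estimate = small_a / small_n

print(f"true share:  {TRUE_SHARE:.3f}")
print(f"biased poll: n={big_n}, estimate={big_estimate:.3f}")
print(f"random poll: n={small_n}, estimate={small_estimate:.3f}")
```

Increasing the number of people contacted in the biased poll makes its estimate more stable, but it stabilises around the wrong answer; only changing how respondents are selected removes the bias.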

Info on the Literary Digest poll is at



The best account is in Freedman, Pisani & Purves, Statistics (various editions).

US elections from 80 years ago don’t grab the imagination, however. Agresti & Finlay, Statistical Methods for the Social Sciences (Pearson, 3rd ed., 1997), p. 7, draw attention to Shere Hite’s claim in Women and Love that 70% of women married at least 5 years had had an extramarital affair, based on a sample of 4,500 women.

However, 100,000 questionnaires were issued, so only 4.5% were returned, and it is highly unlikely that the women who chose to reply were a random sample.
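One way to show students how little a 4.5% return pins down is to compute worst-case bounds. The sketch below uses only the figures quoted above (100,000 issued, 4,500 returned, 70% among respondents); the function name and the bounding approach are my own illustration, not something taken from Agresti & Finlay. It asks what the rate among everyone who was sent a questionnaire could be if none, or all, of the non-respondents had had an affair.

```python
def nonresponse_bounds(issued, returned, share_among_respondents):
    """Worst-case bounds on a proportion when most people do not reply.

    Lower bound: no non-respondent has the characteristic.
    Upper bound: every non-respondent has it.
    """
    response_rate = returned / issued
    lower = response_rate * share_among_respondents
    upper = lower + (1 - response_rate)
    return response_rate, lower, upper


rate, low, high = nonresponse_bounds(100_000, 4_500, 0.70)

print(f"response rate: {rate:.1%}")
print(f"rate among all women sent a questionnaire: "
      f"between {low:.1%} and {high:.1%}")
```

The bounds span almost the whole range from 0 to 100%, which is the point: without an assumption about who replied, the returned questionnaires say very little about the women who did not reply.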

Agresti & Finlay take the matter no further, but it would be good to find some data based on a random sample. Enter David Atkins and colleagues, who use US General Social Survey data (collected through face-to-face interviews). Their article in the Journal of Family Psychology (2001, v. 15, no. 4: 735-749), ‘Understanding Infidelity: Correlates in a National Random Sample’, estimates the rate at nearer 5%!

The latter article is also a very readable report of a logistic regression, useful for teaching that procedure.



The chart for the US raises the obvious question: has the same thing happened in the UK? The chart below suggests that it has not. The chart is not strictly accurate, as the median household income data come from the IFS and are for Great Britain only (England, Wales and Scotland), while the GDP data are for the UK (including Northern Ireland). Since NI is such a small part of the UK economy, this shouldn’t matter too much.

The trends diverge in the early 1990s, but the gap is nothing like that for the US: by 2010 median family income was about 60% up on 1973, compared to around 20% in the US.


This chart comes from a post by Lane Kenworthy (http://lanekenworthy.net/2008/09/03/slow-income-growth-for-middle-america/).

It shows the growth of GDP per capita and median family income in the US from 1947 to 2007.

By the latter year median family income was approximately $61,000.

Had it kept pace with per capita GDP, as it did until the mid-1970s, it would have been approximately $91,000.
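The size of the gap is easy to work out from the two approximate figures. The snippet below is just that arithmetic; the variable names are mine, and the inputs are the rounded values read off the chart rather than exact data.

```python
actual_income = 61_000          # approximate median family income, 2007
counterfactual_income = 91_000  # approximate level had it tracked per capita GDP

shortfall = counterfactual_income - actual_income
shortfall_share = shortfall / counterfactual_income

print(f"shortfall: ${shortfall:,}")
print(f"actual income is {shortfall_share:.0%} below the counterfactual level")
```

On these figures the typical family was about $30,000 a year, or roughly a third, short of where it would have been had median income kept tracking per capita GDP.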

A great example of how a simple graph can tell a compelling story.


QM Teaching Blog

This blog is under construction. It will shortly have a list of links to resources for teaching social science QM (social statistics), mostly at university undergraduate and Masters level, but also of relevance to schools or to those looking to revise or upskill their quantitative skills after graduation.


Enquiries to john.macinnes@ed.ac.uk

