There are a number of challenges. As recent debates show (see Tom McKenzie’s briefing paper), differences in definitions, including the precise wording of questions, and in the methods used to capture information, can lead to wide variations in estimates of, for example, the proportion of the populace engaged in charitable giving and the amount of money they give.
Similar difficulties beset attempts to measure the economic weight of the sector: how many organisations are there, what is their overall expenditure, what sources of income do they draw on, and so on? Systems of national accounts classify information by sector of activity: primary production (e.g. agriculture, mining), secondary activity (e.g. manufacturing) and tertiary activity (delivery of services). They do not distinguish according to the legal form (government; for-profit; not-for-profit) of the organisations carrying out the activity. And even if we can adequately identify nonprofit entities, we then face the challenge of categorising their sources of income. This was the focus of a joint project between NCVO and TSRC to capture data on a sample of registered charities in England and Wales.
We wanted to establish a panel of organisations which was broadly representative of the sector, by size of organisation, region, and subsector. We drew a sample of around 10,000 organisations, including several thousand with incomes below £500,000 (that is, organisations which are not required to report to the Charity Commission in as much detail as those above that threshold). We set up a data entry operation to capture information from the notes to the accounts contained in their annual reports. This gives us more granular data than is available from the categories of income that charities are required to report (depending on size) to the Charity Commission. For example, notes to the accounts will often indicate that income received in pursuit of the charitable activities of an organisation has come from a particular funding source (e.g. central government, local government, NHS). We entered around 500,000 items of data relating to these 10,000 charities. The most complex part of the task was then classifying this information. A number of techniques were used: some involved manual classification, but most of the data was classified through automated keyword searches, including probabilistic methods which generate a series of probabilities for each word found in a line of text; these probabilities are then applied to previously unclassified items. As we build on this data set, and add extra years of information, we believe that the classification process will improve.
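To give a flavour of how word-level probabilities can be applied to unclassified items, the following is a minimal sketch of one common approach (a naive Bayes classifier with Laplace smoothing). It is an illustration only, not a description of the method actually used in the project: the function names, the example lines of text and the three category labels are all invented for the purpose.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical, hand-labelled lines of the kind found in notes to accounts.
EXAMPLE_LINES = [
    ("Grant from county council for youth services", "local government"),
    ("Borough council service level agreement", "local government"),
    ("Department of Health grant", "central government"),
    ("Home Office funding for advice work", "central government"),
    ("NHS primary care trust contract", "NHS"),
    ("NHS trust funding for counselling", "NHS"),
]


def tokenize(text):
    """Lower-case a line of text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def train(labelled_lines):
    """Count how often each word appears under each income category."""
    word_counts = defaultdict(Counter)  # category -> word -> count
    category_counts = Counter()         # category -> number of lines
    for text, category in labelled_lines:
        category_counts[category] += 1
        word_counts[category].update(tokenize(text))
    return word_counts, category_counts


def classify(text, word_counts, category_counts):
    """Return the most probable category and the probability of each."""
    vocabulary = {w for counts in word_counts.values() for w in counts}
    total_lines = sum(category_counts.values())
    log_scores = {}
    for category, n_lines in category_counts.items():
        counts = word_counts[category]
        total_words = sum(counts.values())
        # Start from the share of labelled lines in this category.
        score = math.log(n_lines / total_lines)
        for word in tokenize(text):
            # Laplace smoothing: unseen words do not zero out the score.
            score += math.log(
                (counts[word] + 1) / (total_words + len(vocabulary))
            )
        log_scores[category] = score
    # Convert log scores into probabilities that sum to one.
    peak = max(log_scores.values())
    weights = {c: math.exp(s - peak) for c, s in log_scores.items()}
    norm = sum(weights.values())
    probabilities = {c: w / norm for c, w in weights.items()}
    best = max(probabilities, key=probabilities.get)
    return best, probabilities


word_counts, category_counts = train(EXAMPLE_LINES)
best, probabilities = classify(
    "City council grant", word_counts, category_counts
)
print(best, probabilities)
```

In a sketch like this, items whose highest probability falls below some chosen cut-off could be set aside for manual classification, and each newly classified year of data could be added to the labelled set before retraining.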
As well as offering improved classification of income sources, a crucial advantage of this source is that it will allow tracking of organisations over time – we will pick up the effects on particular funding streams of recessionary conditions and also changes in the amount and distribution of public funding after 2010. Some discussions of research on funding the voluntary sector have argued strongly for real-time information on both donors and organisations. That has prompted numerous open access surveys of organisations which cannot be considered representative: for example, surveys of financial cuts to the sector might exaggerate the true position (if they were only completed by those with axes to grind) or they might underestimate it (if the organisations most severely affected by funding reductions were no longer active). We cannot tell the difference from such surveys and it is difficult to work out what action might be taken as a result of them.
In the case of our work, it could be argued that a focus on organisational reports imposes a time delay. On the other hand, we are analysing relatively robust data which has been through the process of the production of accounts according to agreed standards. In that regard, and seen in a longer term perspective, this data will provide an authoritative basis for future work on the changing funding bases of charitable organisations. For examples of what might be done with that kind of data see our existing work (TSRC working papers 38 and 39) on the differential growth of charities, and forthcoming papers on differential survival. We will seek to extend that level of analysis to particular income streams.
A final point concerns access to the data, so that others can use it for research purposes. Debates on the evidence base for the third sector often call for new evidence when, in practice, there’s a lot of data out there which, if shared among those with the expertise to analyse it, could be of considerable value, as Richard Harrison of CAF points out in a recent interview. We share his view, so we will make the data we are gathering publicly available to other researchers.