Just before Christmas the Council of Mortgage Lenders published some interesting new data on outstanding residential mortgage lending by postcode sector. Thank you and well done to CML and the participating lenders.
As an enthusiast for data, and maps, I wanted to explore whether and how this data could be usefully linked to related third-party sources – and therefore tell us more about the scale and nature of mortgage investment where you and I live.
For example, the good folk at OpenStreetMap have developed a useful set of boundaries for postcode areas, derived using Ordnance Survey’s Code-Point Open data set and some very clever maths. My first question was: can I link outstanding mortgage data to this dataset, and therefore display it on a map?
This then got me thinking: can I contextualise the outstanding mortgage data by blending it with third-party sources available at postcode level? For instance, there is a wealth of data available via the excellent NOMIS service, in particular information from Census 2011 on households and housing.
The map on the left shows the total outstanding mortgage value for individual postcode districts. Clicking on an area opens a pop-up box showing outstanding mortgage values for each of the seven main lenders releasing data via CML. Note that I have made various assumptions when calculating totals at postcode district level – see explanation below. This is not based on robust research, but on my own musings late in the evenings over the Christmas period.
The map on the right is my estimate of outstanding loans per household for each lender in individual postcode districts. Again, I’ve made some fairly unscientific assumptions here (explained below) which boil down to:
1. Taking the total outstanding mortgage value, and
2. Dividing it by the total number of households (from Census 2011) which reported that their accommodation was “owned with a mortgage or loan” – data for postcodes and other geographies are available in this table.
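As a rough illustration, the two-step calculation above can be sketched in Python. The function name and the figures in the example are my own hypothetical choices for illustration; they are not taken from the CML or Census data.

```python
def loans_per_household(total_outstanding, mortgaged_households):
    """Outstanding mortgage value per mortgaged household.

    Returns None when there are no mortgaged households, rather than
    dividing by zero.
    """
    if mortgaged_households <= 0:
        return None
    return total_outstanding / mortgaged_households


# Hypothetical figures for one district, for illustration only.
example = loans_per_household(500_000_000, 8_000)
print(f"£{example:,.0f} per household")
```

Returning None for a district with no mortgaged households is a design choice: it keeps those districts visible as "no estimate" instead of silently mapping them as zero.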
Preparing the maps – what I did, and how I did it
Step 1 – build the postcode boundary file
I started with outputs from the good folk at OpenStreetMap – published here, with re-usable (KML format) boundaries available via Wikipedia. Note that these boundaries have been modelled and interpreted from Ordnance Survey’s Code-Point Open product for purely illustrative and analytical purposes: they do not represent the official, definitive postcode boundaries and should not be treated as such.
My basic approach was to “scrape” individual KML files from Wikipedia, then combine the results into a single file – i.e. one KML file of all postcode districts. I then used Quantum GIS to create a shapefile from the KML, ready for attaching additional data: i.e. the outstanding mortgage lending data, and information from Census 2011, prepared in step 2.
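The combining step might look something like the sketch below, which uses only Python's standard library. It assumes each downloaded file is a KML 2.2 document whose boundaries are stored as Placemark elements; the function name merge_kml is hypothetical, and the downloading itself is left out.

```python
import xml.etree.ElementTree as ET

# Assumption: the source files use the standard KML 2.2 namespace.
KML_NS = "http://www.opengis.net/kml/2.2"
ET.register_namespace("", KML_NS)


def merge_kml(kml_documents):
    """Combine the Placemark elements from several KML strings into one KML document."""
    root = ET.Element(f"{{{KML_NS}}}kml")
    document = ET.SubElement(root, f"{{{KML_NS}}}Document")
    for text in kml_documents:
        tree = ET.fromstring(text)
        # Copy every boundary (Placemark) across, wherever it sits in the file.
        for placemark in tree.iter(f"{{{KML_NS}}}Placemark"):
            document.append(placemark)
    return ET.tostring(root, encoding="unicode")
```

The merged string can then be written to a single .kml file and opened in a desktop GIS for conversion.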
Step 2 – calculate outstanding mortgage data, and associated contextual information, for postcode districts
In order to calculate outstanding mortgages at postcode district level I started with data from individual participating lenders, published on the Council of Mortgage Lenders’ website.
CML’s press release describes some important caveats and limitations on interpreting the results. For me, the most important ones were:
- The data “initially only covers lenders accounting for around three quarters of the overall mortgage market”. So, the picture is incomplete, but the data is an excellent start – a round of applause please for CML and the participating lenders.
- Data are not published at the postcode district level I need to map results, but are instead available for the finer-grained postcode sectors. For example, I wanted to map results for postcode district “ME10”, but the source data reports totals for postcode sectors “ME10 1”, “ME10 2” and so on. I therefore need a reliable way of adding up sector-level data to give me the results for postcode districts.
- Related to the previous point, CML and the lenders have (quite rightly) suppressed results for some postcode sectors. As indicated in their press release, this is to “provide for the maximum transparency possible without compromising data privacy”. The result is data comprising a mixture of totals for individual postcode sectors (e.g. “ME10 1”), and totals for the entire postcode area (e.g. “ME”). Here’s an example using data from Barclays, for the ME area, plus ME1 and ME10 districts – note the £7.75m reported against “ME other”.
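The adding-up of sectors into districts described in the list above could be sketched along these lines. It assumes each row is a label and a value, with labels shaped like “ME10 1” for a sector and “ME other” for an area-level remainder; the function name and the sample values in the usage note are hypothetical.

```python
from collections import defaultdict


def roll_up_sectors(rows):
    """Sum sector-level values to districts; keep area-level "other" rows apart.

    rows: iterable of (label, value) pairs, e.g. ("ME10 1", 1000.0)
    or ("ME other", 500.0).
    Returns (district_totals, area_other) as two plain dicts.
    """
    district_totals = defaultdict(float)
    area_other = defaultdict(float)
    for label, value in rows:
        # "ME10 1" -> outward part "ME10", inward part "1"
        outward, _, inward = label.strip().partition(" ")
        if inward.lower() == "other":
            area_other[outward] += value
        else:
            district_totals[outward] += value
    return dict(district_totals), dict(area_other)
```

For example, roll_up_sectors([("ME10 1", 10.0), ("ME10 2", 5.0), ("ME other", 7.75)]) gives a district total for “ME10” and a separate remainder for “ME” that still has to be shared out.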
The obvious question for me was: how to apportion these whole area totals across the constituent postcode districts? In other words, how should I “allocate” the £7.75m reported for “ME” to “ME1”, “ME10” and all other districts in this area?
I chose to do this based on the proportions (percentages) of households in each postcode district which reported in Census 2011 that their property was “owned with a mortgage or loan”. For example:
- number of households in postcode area “ME” = 94,510
- number of households in postcode district “ME10” = 7,789
- Percentage of total households in ME10 = 8.2% (7,789/94,510)
- Therefore, postcode district ME10 is allocated 8.2% of the total £7.75m outstanding mortgage value for the ME postcode area
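The apportionment above can be written as a small Python sketch. The household counts and the £7.75m come from the worked example; the function name and the “rest of ME” key, which lumps together every other district in the area, are hypothetical stand-ins.

```python
def apportion(area_total, households_by_district):
    """Split an area-level total across districts in proportion to household counts."""
    area_households = sum(households_by_district.values())
    if area_households == 0:
        return {district: 0.0 for district in households_by_district}
    return {
        district: area_total * count / area_households
        for district, count in households_by_district.items()
    }


# ME10 count from the example; all other ME districts lumped under a
# hypothetical "rest of ME" key so that the counts sum to 94,510.
households = {"ME10": 7_789, "rest of ME": 94_510 - 7_789}
shares = apportion(7_750_000, households)
print(f"ME10 share: £{shares['ME10']:,.0f}")
```

Because the sketch uses the unrounded proportion, the ME10 share comes out slightly higher than applying the rounded 8.2% would give. The allocated share would then be added to whatever was reported directly for the district's own sectors.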
I cannot and will not claim this as a fully robust and scientific methodology. It raises obvious questions: for instance, is it right to apportion values based solely on households with a loan or mortgage, when some rented properties are likely to be mortgaged too (via the landlord)? Nonetheless, I hope that my work helps to illustrate the potential of this important new dataset, and gets your brains whirring on opportunities for further data linking and more contextual analysis.
On that, here are some things that I’m looking at trying next:
- Update the postcode maps – utilising more recent versions of Ordnance Survey’s Code-Point Open product, and/or ONS’s National Statistics Postcode Lookup file, available for download here. I’ve made a start by installing the open source PostGIS database on my trusty Mac.
- Extend the range of contextual data. I’m especially interested in blending outstanding mortgage information with DCLG’s Indices of Deprivation, available via our OpenDataCommunities site (with a supporting mapping tool here).
Thanks very much – especially for reading to the end of this post – and please get in touch with any comments, ideas etc.