BBC-Data-Unit/rail-fare-increases-2019

Rail fare increases: Charts explain passengers' frustration

In January 2019, as rail fares increased, we published an analysis of official data which showed that rail users were paying more for worsening delays, staff shortages and, in some areas, an ageing fleet of carriages.

The analysis included scripts in both R (analysis, visualisation) and Python (scraping).

This is the fifth story the data unit has done on rail fare rises. In August 2018 we reported "Commuters 'pay fifth of salary' on season ticket", and 12 months before that we reported "Commuters to pay £100 more in 2018". In January 2017 we published "Rail fares: Who are the season ticket winners and losers?" and in September 2016 we published "Rail season tickets cost 10% of net pay".

Get the data

The full tweets data is not included here because it is too large for GitHub. However, the filtered file of tweets from November 20 onwards is included.

Quotes and interviews

  • Stewart Frank, commuter
  • James Vasey, Bradford Rail Users Group
  • Spokesperson, The Office of Rail and Road (ORR)
  • Darren Shirley, Campaign for Better Transport (CBT)
  • Paul Plummer, chief executive, the Rail Delivery Group

Visualisation

  • Tree map: Rail delays by cause and responsibility
  • Grouped bar chart: Train delays due to staff shortages, 2017 vs 2018
  • Bar chart: Percentage of tweets saying 'sorry', 'apologies' or 'apologise' between November 20 and December 19 by train operator
  • Column chart: Compensation claims made by Northern Rail passengers during 2018, by period
  • Line chart: Age of rolling stock by operator, 2008-2018
  • Table: Rise in monthly rail season ticket fares, by route
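
The tweet bar chart above depends on a per-operator percentage. As a rough illustration of that calculation, here is a minimal Python sketch. It is not the code in this repo (the tweet analysis here is done in R), and the function name, operator labels and sample tweets are all hypothetical. It assumes each tweet is available as an (operator, text) pair and counts a tweet as apologetic if it contains any of the three words as a whole word, in any letter case.

```python
import re

# Whole-word, case-insensitive match on the three words named in the chart title.
# Note: the \b boundaries mean variants such as "apologised" are NOT counted.
APOLOGY_PATTERN = re.compile(r"\b(sorry|apologies|apologise)\b", re.IGNORECASE)


def apology_share(tweets):
    """Return {operator: percentage of that operator's tweets containing an apology word}."""
    counts = {}  # operator -> (total tweets, apologetic tweets)
    for operator, text in tweets:
        total, apologetic = counts.get(operator, (0, 0))
        hit = 1 if APOLOGY_PATTERN.search(text) else 0
        counts[operator] = (total + 1, apologetic + hit)
    return {op: 100.0 * a / t for op, (t, a) in counts.items()}


# Made-up tweets for illustration only; not taken from the repo's data.
sample_tweets = [
    ("OperatorA", "Sorry for the delay to your journey."),
    ("OperatorA", "The 08:15 is running on time."),
    ("OperatorB", "Apologies, this service is cancelled."),
    ("OperatorB", "We apologise for the disruption."),
]
shares = apology_share(sample_tweets)
```

Matching on whole words keeps the count conservative; whether the original analysis also counted variants such as "apologised" is not stated here.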

Scripts and code

  • The notebook traindelays details the process of analysing ORR data on train delays.
  • The R notebook 7periodcomparison takes the periodic data published by the ORR and totals it over 7 periods, so that delays in the year to date can be compared with the same 7 periods in previous years.
  • Python script to scrape Twitter accounts
  • The R markdown file traintweetsrmdonly details the process of analysing tweets by train company accounts. This is not saved as a notebook because the resulting HTML file is over 40MB!
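
The 7-period comparison described above can be sketched roughly as follows. This is a minimal Python illustration, not the R notebook's actual code. It assumes each row of data carries a year label, a period number and a delay figure; the function name, the row layout and the sample numbers are hypothetical.

```python
from collections import defaultdict


def totals_for_first_periods(rows, n_periods=7):
    """Sum delay figures over periods 1..n_periods for each year.

    rows is an iterable of (year_label, period_number, delay_value) tuples.
    """
    totals = defaultdict(float)
    for year, period, delay in rows:
        if 1 <= period <= n_periods:  # ignore periods after the comparison window
            totals[year] += delay
    return dict(totals)


# Made-up numbers for illustration only; not ORR data.
sample_rows = [
    ("2017-18", 1, 100.0),
    ("2017-18", 7, 50.0),
    ("2017-18", 8, 999.0),  # outside the 7-period window, so excluded
    ("2018-19", 1, 120.0),
    ("2018-19", 7, 60.0),
]
period_totals = totals_for_first_periods(sample_rows)
```

Restricting every year to the same period window is what makes a partial current year comparable with earlier, complete years.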

Related repos