Data Quality Score
Passengers rely on high-quality information being provided to the Bus Open Data Service. The data quality score has been defined by the Department for Transport as an indicator of timetable data quality.
Data quality observations are divided into critical and advisory observations. Operators should aim to have zero critical observations in their data. Advisory observations should be investigated and addressed. Alternatively, if an observation is the result of intended behaviour, an operator can suppress it.
Methods for measurement
Only critical observations are included in the data quality measurement. For each type of critical observation, the percentage of tests passed is calculated. These percentages are then combined into a weighted average, using the following weightings:
|Weighting|Critical observation|
|12%|Backward date range|
|12%|Incorrect NOC code|
|12%|Missing block number|
|12%|Stop(s) are not found in NaPTAN|
|10%|Fast timing between timing points|
|10%|First stop is found to be set down only|
|10%|Incorrect stop type|
|10%|Last stop is found to be pick up only|
For example, if you have two services in a data set and one of them has a backward date range, then only 50% of the backward date range tests have been passed. As backward date range can contribute a maximum of 12% of the score, you would get only 50% of that 12%, which is a 6% contribution. These contributions are summed to give the final data quality score.
The service expects operators to meet the Green data quality standard for all local bus service data. The data quality report provided to operators helps them identify issues in their data. Operators should aim to have zero critical data quality observations, which results in a score of 100%.
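The arithmetic in the example above can be sketched in a few lines of Python. This is only an illustration, not the service's own implementation: the function name `contributions`, the `results` structure, and the treatment of an observation type with no tests as fully passed are all assumptions. The weightings are copied from the table above; as listed they total 88%, so the sketch takes them as an input rather than assuming the list is complete.

```python
# Weightings in percentage points, copied from the table above.
WEIGHTS = {
    "Backward date range": 12,
    "Incorrect NOC code": 12,
    "Missing block number": 12,
    "Stop(s) are not found in NaPTAN": 12,
    "Fast timing between timing points": 10,
    "First stop is found to be set down only": 10,
    "Incorrect stop type": 10,
    "Last stop is found to be pick up only": 10,
}


def contributions(results, weights=WEIGHTS):
    """Return each observation type's contribution to the score.

    `results` (hypothetical structure) maps an observation name to a
    (tests_passed, tests_total) pair.
    """
    parts = {}
    for name, weight in weights.items():
        passed, total = results[name]
        # Assumption: an observation type with no tests counts as fully passed.
        pass_rate = passed / total if total else 1.0
        parts[name] = weight * pass_rate
    return parts


# Two services, one of which has a backward date range; all other tests pass.
results = {name: (2, 2) for name in WEIGHTS}
results["Backward date range"] = (1, 2)

parts = contributions(results)
score = sum(parts.values())
print(parts["Backward date range"])  # 6.0, i.e. 50% of the 12% weighting
print(score)                         # sum of the contributions for the listed weightings
```

Keeping the per-observation contributions separate from the final sum makes it easy to see which observation type is costing the most.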
|RED ≤ 90%|
|AMBER > 90% and < 100%|
|GREEN = 100%|
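As a minimal sketch, the thresholds above could be applied to a score as follows. The function name `rag_band` is hypothetical, and the sketch assumes the score is a percentage between 0 and 100.

```python
def rag_band(score):
    """Map a data quality score (percentage, 0 to 100) to its band."""
    if score >= 100:
        return "GREEN"  # zero critical observations
    if score > 90:
        return "AMBER"  # above 90% but below 100%
    return "RED"        # 90% or below


for value in (100.0, 94.0, 90.0):
    print(value, rag_band(value))
```

Note that a score of exactly 90% falls in the Red band, because Amber starts strictly above 90%.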