Data Quality Score

Data Quality

Passengers rely on high-quality information being provided to the Bus Open Data Service. The data quality score has been defined by the Department for Transport as an indicator of timetable data quality.

Critical observations

Data quality observations are divided into critical and advisory observations. Operators should aim to have zero critical observations in their data. Advisory observations should be investigated and addressed. Alternatively, if an observation is the result of intended behaviour, an operator can suppress it.

Methods for measurement

Only critical observations are included in the data quality measurement. For each type of critical observation, the percentage of tests passed is calculated. These are then used to find the weighted average percentage, using the following weightings:

% Observation
12% Backward date range
12% Backwards timing
12% Incorrect NOC code
12% Missing block number
12% Stop(s) are not found in NaPTAN
10% Fast timing between timing points
10% First stop is found to be set down only
10% Incorrect stop type
10% Last stop is found to be pick up only

Example

If you have two services in a data set and one of them has a backward date range, then only 50% of the backward date range tests have been passed. Because backward date range can contribute a maximum of 12% of the score, you would receive only 50% of that 12%, which is a 6% contribution. The contributions from all observation types are summed to give the final data quality score. If every other critical test passed, the final score in this example would be 94%.
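
The weighted average described above can be sketched in Python as follows. This is an illustrative sketch only, not the service's actual implementation: the function name `data_quality_score`, the `pass_rates` input shape, and the choice to treat an unlisted observation type as fully passed are all assumptions made for the example. The weightings are taken from the table above.

```python
# Weightings from the table above, in percentage points of the final score.
WEIGHTS = {
    "Backward date range": 12,
    "Backwards timing": 12,
    "Incorrect NOC code": 12,
    "Missing block number": 12,
    "Stop(s) are not found in NaPTAN": 12,
    "Fast timing between timing points": 10,
    "First stop is found to be set down only": 10,
    "Incorrect stop type": 10,
    "Last stop is found to be pick up only": 10,
}


def data_quality_score(pass_rates):
    """Return the weighted data quality score as a percentage (0 to 100).

    pass_rates maps an observation name to the fraction of tests passed
    (0.0 to 1.0). Assumption of this sketch: an observation type that is
    missing from pass_rates is treated as having passed every test.
    """
    total = 0.0
    for observation, weight in WEIGHTS.items():
        rate = pass_rates.get(observation, 1.0)
        if not 0.0 <= rate <= 1.0:
            raise ValueError(f"pass rate for {observation!r} must be between 0 and 1")
        # Each observation type contributes its weight scaled by its pass rate.
        total += weight * rate
    return round(total, 2)


# The example above: one of two services has a backward date range,
# so half of the backward date range tests pass and everything else passes.
score = data_quality_score({"Backward date range": 0.5})
print(score)  # 94.0
```

Holding the weightings as whole percentage points keeps the arithmetic easy to check by hand: the nine weightings sum to 100, so a data set with no critical observations scores 100%.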

Thresholds

The service expects operators to meet the Green data quality standard for all local bus service data. The data quality report provided to operators will help them identify issues in their data. Operators should aim to have zero critical data quality observations, which results in a score of 100%.

RED ≤ 90%
AMBER > 90% and < 100%
GREEN = 100%
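
As a rough illustration, the thresholds can be expressed as a small Python function. The function name `rag_status` is hypothetical, and the sketch assumes that the AMBER band covers scores above 90% and below 100%, since a score of exactly 100% is GREEN.

```python
def rag_status(score):
    """Map a data quality score (percentage, 0 to 100) to a RAG rating.

    Assumption of this sketch: AMBER means above 90 and below 100,
    because a score of exactly 100 is rated GREEN.
    """
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score == 100:
        return "GREEN"
    if score > 90:
        return "AMBER"
    return "RED"


print(rag_status(100))  # GREEN
print(rag_status(94))   # AMBER
print(rag_status(90))   # RED
```

A score of exactly 90% falls in the RED band, because the RED threshold is inclusive.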