
Bus operator requirements

What you need to know to get started. Find guidance and support material tailored to your needs.


Data quality

Data required | Data format required | Method
Timetable | TransXChange Version 2.4 profile v1.1 | Validation against PTI 1.1 profile; Data Quality report
Bus location | DfT BODS SIRI-VM profile | Validation against DfT BODS SIRI-VM profile
Basic fares | UK NeTEx 1.10 | Validation against schema
Complex fares | UK NeTEx 1.10 | Validation against schema

Data quality checks are run on the data supplied to the service to help operators identify and understand issues within their data. These issues may prevent a data consumer from using the data and sharing it with passengers. High data quality is expected for all data published on the service. It reduces the barriers to entry for innovators and consumers using bus open data, and it helps build trust between passengers and the public transport network.

Timetables data

TransXChange data undergoes two sets of checks. In the first validation stage, the data is checked against the TxC 2.4 schema and the PTI profile v1.1. The TxC 2.4 schema is the basic data standard mandated by DfT; the PTI profile v1.1 is an additional mandate that operators' TransXChange data is expected to meet. The profile tightens the standard further, giving the industry a common, unambiguous data standard. More information on the differences between the TxC 2.4 schema and the PTI profile v1.1 can be found here.

From Friday 1st October 2021, files that do not comply with the PTI profile 1.1 are rejected upon submission.
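
A quick local pre-check can catch files that are plainly not TxC 2.4 before they are uploaded. The sketch below is not the BODS validator and performs no schema or PTI profile validation; it only inspects the root element. It assumes the TransXChange namespace `http://www.transxchange.org.uk/` and a `SchemaVersion` attribute on the root element, and `precheck_txc` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Assumption: the namespace used by TransXChange documents.
TXC_NS = "http://www.transxchange.org.uk/"


def precheck_txc(xml_text: str) -> list[str]:
    """Return problems found on the root element (empty list if none).

    This is only a sanity check on the root element, not schema validation.
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return [f"not well-formed XML: {exc}"]

    problems = []
    if root.tag != f"{{{TXC_NS}}}TransXChange":
        problems.append(f"unexpected root element: {root.tag}")
    version = root.get("SchemaVersion")
    if version != "2.4":
        problems.append(f"SchemaVersion is {version!r}, expected '2.4'")
    return problems
```

An empty result only means the file claims to be TxC 2.4; the checks described in this section still apply on submission.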

Feedback from the first-stage validation check is provided to the user at upload, and should be shared with their software supplier so that the supplier can provide robust data that fits the profile. In the second stage, a further data quality check produces a report for operators. The report provides observations about the operator's data, highlighting common errors. Some observations are critical, meaning there is definitely an error in the data and the operator is expected to rectify it. Other observations are advisory, as they may be false positives resulting from the data structure. Operators should treat these reports as suggested improvements to their timetables data.

Bus location data

SIRI-VM data is taken into a central AVL system, where it is harmonised to produce a consistent SIRI-VM 2.0 output of bus location data for open data consumers.

We have introduced a SIRI-VM validator to BODS to ensure the highest data standards are provided to consumers. The validator has two parts: the first checks the feed against the schema, and the second checks for the mandatory fields specified in the DfT BODS profile. A feed that fails the schema check is put in an ‘inactive’ status. The validator runs 1 randomised check per day (excluding buses running from 12am-5am) and checks 1000 packets or 10 minutes from a feed each day (this number is configurable until deemed sensible).
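
The daily sampling rule can be sketched as follows. This is an illustration under stated assumptions, not the BODS implementation: it assumes the check time is drawn uniformly between 05:00 and midnight, and that the sample stops at whichever limit (1000 packets or 10 minutes) is reached first. The names `pick_check_time` and `take_sample` are hypothetical.

```python
import random
from datetime import date, datetime, time, timedelta


def pick_check_time(day: date, rng=random) -> datetime:
    """Pick a random moment on `day`, avoiding the 12am-5am window."""
    # Assumption: uniform choice over the seconds from 05:00:00 to 23:59:59.
    offset = rng.randrange(5 * 3600, 24 * 3600)
    return datetime.combine(day, time.min) + timedelta(seconds=offset)


def take_sample(packets, max_packets=1000, max_window=timedelta(minutes=10)):
    """Collect packets until either limit is hit (assumed: whichever first).

    `packets` is an iterable of (timestamp, packet) pairs in time order.
    """
    sample = []
    first_seen = None
    for timestamp, packet in packets:
        if first_seen is None:
            first_seen = timestamp
        if len(sample) >= max_packets or timestamp - first_seen > max_window:
            break
        sample.append(packet)
    return sample
```

Both limits are parameters because the text notes the sample size is configurable.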

Given the level of industry readiness in terms of providing consistent SIRI-VM data, feeds will not be blocked as long as they are valid SIRI (that is, they do not fail the schema). However, BODS compliance tags will be attached to show whether they are 'compliant', 'non-compliant' or 'partially compliant', using a 7-day rolling average. The validator will look at the last 7 days' worth of aggregate SIRI-VM data and assign a compliance status accordingly.

A SIRI-VM feed will be deemed 'compliant' if all of the following fields are present more than 70% of the time over the last 7 days.

  • Bearing
  • LineRef
  • OperatorRef
  • RecordedAtTime
  • ResponseTimestamp
  • VehicleJourneyRef
  • VehicleLocation (Lat, Long)
  • ProducerRef
  • DirectionRef
  • BlockRef
  • PublishedLineName
  • ValidUntilTime
  • DestinationRef
  • OriginName
  • OriginRef
  • VehicleRef
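
To make "present more than 70% of the time" concrete, the sketch below computes how often a field is populated across the `VehicleActivity` elements in a SIRI-VM sample. It is a minimal illustration, not the BODS validator: it assumes the SIRI namespace `http://www.siri.org.uk/siri`, it only covers fields nested under `VehicleActivity` (feed-level fields such as `ResponseTimestamp` and `ProducerRef` would need separate handling), and `population_rates` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Assumption: the namespace used by SIRI documents.
NS = {"siri": "http://www.siri.org.uk/siri"}


def _is_populated(activity, field: str) -> bool:
    """True if `field` exists under the activity with text or child elements."""
    element = activity.find(f".//siri:{field}", NS)
    if element is None:
        return False
    return bool((element.text or "").strip()) or len(element) > 0


def population_rates(xml_text: str, fields: list[str]) -> dict[str, float]:
    """Return, per field, the fraction of VehicleActivity elements populating it."""
    root = ET.fromstring(xml_text)
    activities = root.findall(".//siri:VehicleActivity", NS)
    if not activities:
        return {field: 0.0 for field in fields}
    return {
        field: sum(_is_populated(a, field) for a in activities) / len(activities)
        for field in fields
    }
```

A field with child elements, such as `VehicleLocation`, counts as populated when it has at least one child.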

A SIRI-VM feed will be deemed 'partially compliant' if all other mandatory fields are present and only the following fields are missing 70% of the time in the last 7 days.

  • BlockRef
  • PublishedLineName
  • DestinationRef
  • OriginName
  • OriginRef

A SIRI-VM feed will be deemed 'non-compliant' if any of the fields below is not present more than 70% of the time over the last 7 days. It can also be assigned non-compliant status directly if any one of the fields below falls under 45% population at the daily validation check, because this counts as a gross error in the data and is highlighted to the publisher right away.

  • Bearing
  • LineRef
  • OperatorRef
  • RecordedAtTime
  • ResponseTimestamp
  • VehicleJourneyRef
  • VehicleLocation (Lat, Long)
  • ProducerRef
  • DirectionRef
  • VehicleRef
  • ValidUntilTime
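
The three statuses can be read as one decision over the 7-day population rates. The sketch below is one interpretation of the rules above, not the BODS implementation. It assumes a field "fails" when its rate is 70% or lower, and that a feed is 'partially compliant' only when every failing field is on the partial-compliance list. `classify` and the constant names are hypothetical.

```python
# Fields whose absence makes a feed non-compliant (the list above).
CRITICAL_FIELDS = {
    "Bearing", "LineRef", "OperatorRef", "RecordedAtTime", "ResponseTimestamp",
    "VehicleJourneyRef", "VehicleLocation", "ProducerRef", "DirectionRef",
    "VehicleRef", "ValidUntilTime",
}
# Fields whose absence alone only makes a feed partially compliant.
NON_CRITICAL_FIELDS = {
    "BlockRef", "PublishedLineName", "DestinationRef", "OriginName", "OriginRef",
}
THRESHOLD = 0.70  # "present more than 70% of the time"


def classify(rates: dict[str, float]) -> str:
    """Map 7-day population rates (0.0 to 1.0 per field) to a compliance status.

    A field missing from `rates` is treated as never populated.
    """
    all_fields = CRITICAL_FIELDS | NON_CRITICAL_FIELDS
    failing = {f for f in all_fields if rates.get(f, 0.0) <= THRESHOLD}
    if not failing:
        return "compliant"
    if failing <= NON_CRITICAL_FIELDS:
        return "partially compliant"
    return "non-compliant"
```

For example, a feed that populates every field 90% of the time except `BlockRef` at 50% would be classified as 'partially compliant' under these assumptions.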

Other compliance statuses:

  • Undergoing validation: This status will be used for all newly added feeds in the first 24 hours until initial checks are completed. It will also be used for all compliant feeds for the first 7 days until the 'automated flow' rolling validation logic becomes active.
  • Awaiting publisher review: This status will be used for all feeds in the first 7 days after publishing if one or more critical or non-critical fields have not been provided by more than 70% of vehicles in a daily check.
  • Unavailable due to dormant feed: This status will be used for all feeds which have not had any vehicles running for 7 consecutive days and so could not be validated.
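
These interim statuses can be sketched as a small decision function. This is one reading of the three descriptions above, with assumptions called out: the dormant check takes precedence, feed age is measured in days since the feed was added, and `None` means the regular 7-day rolling logic applies. `interim_status` and its parameters are hypothetical names.

```python
from typing import Optional


def interim_status(
    feed_age_days: float,
    daily_check_passed: bool,
    days_without_vehicles: int,
) -> Optional[str]:
    """Return an interim status, or None when the rolling validation applies.

    daily_check_passed: True if every field was provided by more than 70%
    of vehicles in the latest daily check.
    """
    # Assumption: a dormant feed is reported as such regardless of its age.
    if days_without_vehicles >= 7:
        return "Unavailable due to dormant feed"
    if feed_age_days < 1:
        # First 24 hours: initial checks have not completed yet.
        return "Undergoing validation"
    if feed_age_days < 7:
        if daily_check_passed:
            return "Undergoing validation"
        return "Awaiting publisher review"
    return None
```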

New feed validation process:

When a new feed is added to BODS it will be validated in the following way:

  1. 24 hours after a new SIRI feed is added, the validator will check it against the mandatory fields and, if necessary, send an error report to the operator.
  2. Over the subsequent 6 days, while data is flowing through, it will continue to run randomised daily checks.
  3. After Day 7, a fresh automated validation check will run each day and a compliance status will be assigned on a 7-day rolling average.

Automated feed validation process:

  1. The validator will run 1 randomised check per day (excluding buses running from 12am-5am).
  2. The validator will check 1000 packets or 10 minutes from a feed each day (this number is configurable until deemed sensible).
  3. 70% of vehicles on the feed need to populate the mandatory fields to avoid the feed moving into a non-compliant or partially compliant status (e.g. 'Bearing' should be present in 70% of the last 7 days' worth of data; if not, the feed will move to a non-compliant status).
  4. If the daily check finds any of the fields on the non-compliant list less than 45% populated, the feed will automatically move to 'non-compliant' status, as this is a gross error.
  5. If the daily check finds the fields on the non-compliant list more than 45% populated, the rolling average check will kick in and assign a compliance status based on the last 7 days.
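
Steps 3 to 5 combine a same-day gross-error check with a 7-day rolling average. The sketch below shows one way to model that flow; it is an interpretation rather than the BODS implementation. It assumes the rolling figure is the plain mean of the daily population rates, that a field fails the rolling check at 70% or lower, and that the 45% rule applies only to the fields on the non-compliant list. `RollingCompliance` is a hypothetical class.

```python
from collections import deque

CRITICAL = {
    "Bearing", "LineRef", "OperatorRef", "RecordedAtTime", "ResponseTimestamp",
    "VehicleJourneyRef", "VehicleLocation", "ProducerRef", "DirectionRef",
    "VehicleRef", "ValidUntilTime",
}
NON_CRITICAL = {
    "BlockRef", "PublishedLineName", "DestinationRef", "OriginName", "OriginRef",
}
ROLLING_THRESHOLD = 0.70
GROSS_ERROR_THRESHOLD = 0.45


class RollingCompliance:
    """Keeps the most recent daily checks and derives a compliance status."""

    def __init__(self, window_days: int = 7):
        # deque(maxlen=...) drops the oldest day automatically.
        self.days = deque(maxlen=window_days)

    def add_daily_check(self, rates: dict[str, float]) -> str:
        """Record one day's population rates and return the resulting status."""
        self.days.append(rates)

        # Step 4: a critical field under 45% today is a gross error.
        if any(rates.get(f, 0.0) < GROSS_ERROR_THRESHOLD for f in CRITICAL):
            return "non-compliant"

        # Step 5: otherwise judge each field on its mean over the window.
        def average(field: str) -> float:
            return sum(d.get(field, 0.0) for d in self.days) / len(self.days)

        failing = {f for f in CRITICAL | NON_CRITICAL if average(f) <= ROLLING_THRESHOLD}
        if not failing:
            return "compliant"
        if failing <= NON_CRITICAL:
            return "partially compliant"
        return "non-compliant"
```

Under these assumptions, a single day with `Bearing` at 50% after six days at 90% leaves the feed 'compliant', because the 7-day mean stays above 70%, whereas a single day at 40% moves it straight to 'non-compliant'.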

Fares data

NeTEx data is validated against its schema to check that it is in the expected format. As this format is new to the UK, more data quality checks may be enabled over time.
