Bus operator requirements
What you need to know to get started. Find guidance and support material tailored to your needs.
|Data required||Data format required||Method|
|Timetable||TransXChange Version 2.4 profile v1.1a||Validation against TxC-PTI 1.1a profile; Data Quality report|
|Bus location||DfT BODS SIRI-VM profile||Validation against DfT BODS SIRI-VM profile|
|Basic fares||UK NeTEx 1.10||Validation against schema|
|Complex fares||UK NeTEx 1.10||Validation against schema|
|Matching bus location to timetables data||DfT BODS SIRI-VM profile and its corresponding TransXChange Version 2.4 TxC-PTI 1.1a data||Validation against SIRI-VM PTI and data matching v1.1|
Data quality checks run on the data supplied to the service and give operators feedback that helps them identify and understand issues in their data. The issues identified may prevent a data consumer from using the data and sharing it with passengers. High data quality is expected for all data published on the service: it lowers the barriers to entry for innovators and consumers using bus open data, and it builds trust between passengers and the public transport network.
TransXChange data undergoes two sets of checks. In the first stage, validation, the data is checked against the TxC 2.4 schema and the PTI profile v1.1. The TxC 2.4 schema is the basic data standard mandated by DfT, and the PTI profile v1.1 is an additional mandate for the TransXChange data expected from operators. The PTI profile v1.1 clarifies the standard further, giving the industry a common, unambiguous data standard. More information on the differences between the TxC 2.4 schema and the PTI profile v1.1 can be found here.
From Friday 1st October 2021, files that do not comply with the PTI profile 1.1 will be rejected upon submission.
Feedback from the validation check in the first step of upload is provided to the user, who should share it with their software supplier so that the supplier can provide robust data that fits the profile. In the second step, review, a further data quality check produces a report for operators. The report provides observations about operators' data, highlighting common errors. Some observations are critical, meaning there is definitely an error in the data and the operator is expected to rectify it. Other observations are advisory, as they may be false positives caused by the data structure. Operators should treat these reports as suggested improvements to their timetables data.
Bus location data
SIRI-VM data is taken into a central AVL system, where it is harmonised to produce a consistent SIRI-VM 2.0 output of bus location data for open data consumers.
We have introduced a SIRI-VM validator to BODS to ensure the highest data standards are provided to consumers. The validator has two parts: the first checks the feed against the schema, and the second checks for the mandatory fields specified within the DfT BODS profile. A feed that fails the schema check is put into an ‘inactive’ status. The validator runs one randomised check per day (excluding buses running between 12am and 5am) and checks 1000 packets or 10 minutes of data from a feed each day (this number is configurable until deemed sensible).
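The daily sampling described above could be sketched as follows. This is only an illustration, not the BODS implementation: the names `take_sample` and `in_quiet_hours` are hypothetical, packets are modelled as `(timestamp, payload)` pairs, and the sketch assumes the sample stops at whichever limit (1000 packets or 10 minutes) is reached first, which the text does not state explicitly.

```python
from datetime import datetime, time, timedelta

# Limits taken from the text; both are described as configurable.
MAX_PACKETS = 1000
MAX_WINDOW = timedelta(minutes=10)

# Buses running between 12am and 5am are excluded from the check.
QUIET_START = time(0, 0)
QUIET_END = time(5, 0)


def in_quiet_hours(ts):
    """Return True if the timestamp falls in the excluded 12am-5am period."""
    return QUIET_START <= ts.time() < QUIET_END


def take_sample(packets, start):
    """Collect packets from `start` until either limit is reached.

    `packets` is an iterable of (timestamp, payload) pairs sorted by
    timestamp. Assumption: the first limit reached ends the sample.
    """
    sample = []
    for ts, payload in packets:
        if ts < start or in_quiet_hours(ts):
            continue
        if ts - start >= MAX_WINDOW or len(sample) >= MAX_PACKETS:
            break
        sample.append((ts, payload))
    return sample
```

With one packet per second the 10-minute window is the binding limit (600 packets); with two packets per second the 1000-packet cap is reached first.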
Given the level of industry readiness in terms of providing consistent SIRI-VM data, feeds will not be blocked as long as they are valid SIRI (that is, they do not fail the schema check). However, BODS compliance tags will be attached to show whether a feed is 'compliant', 'non-compliant' or 'partially compliant', using a 7-day rolling average. The validator will look at the last 7 days' worth of aggregate SIRI-VM data and assign a compliance status accordingly.
A SIRI-VM feed will be deemed 'compliant' if all of the following fields are present more than 70% of the time over the last 7 days.
- VehicleLocation (Lat, Long)
A SIRI-VM feed will be deemed 'partially compliant' if all other mandatory fields are present and only the following fields are missing 70% of the time over the last 7 days.
A SIRI-VM feed will be deemed 'non-compliant' if the fields below are not present more than 70% of the time over the last 7 days. It can also be assigned non-compliant status directly if any one of the fields below falls under 45% population at the time of the daily validation check. A drop of this size counts as a gross error in the data and is highlighted to the publisher right away.
- VehicleLocation (Lat, Long)
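The three compliance statuses can be read as a small decision procedure. The sketch below is one possible interpretation and not the BODS validator itself. It assumes that each day's check yields, per field, the fraction of packets in which the field was populated, that the 7-day figure is a plain mean of those daily fractions, and that a single critical field at or below 70% is enough for 'non-compliant'. The function name `classify_feed` is hypothetical, and the field sets are passed in as parameters because the full field lists are not reproduced here.

```python
from statistics import mean

COMPLIANT_THRESHOLD = 0.70    # fields must be present more than 70% of the time
GROSS_ERROR_THRESHOLD = 0.45  # below this in a daily check is a gross error


def classify_feed(daily_rates, critical, noncritical):
    """Assign a compliance status from up to 7 days of field population rates.

    `daily_rates` is a list of dicts (oldest first, latest last), each
    mapping a field name to the fraction of packets that populated it.
    `critical` and `noncritical` are sets of field names (assumed split).
    """
    window = daily_rates[-7:]
    latest = window[-1]

    # Direct non-compliant status: a critical field under 45% in the daily check.
    if any(latest.get(f, 0.0) < GROSS_ERROR_THRESHOLD for f in critical):
        return "non-compliant"

    # 7-day rolling average per field; a field absent from a day counts as 0.
    fields = set(critical) | set(noncritical)
    avg = {f: mean(day.get(f, 0.0) for day in window) for f in fields}

    if any(avg[f] <= COMPLIANT_THRESHOLD for f in critical):
        return "non-compliant"
    if any(avg[f] <= COMPLIANT_THRESHOLD for f in noncritical):
        return "partially compliant"
    return "compliant"
```

For example, calling it with `critical={"VehicleLocation"}` and a placeholder non-critical field such as `"ExampleOptionalField"` (a made-up name, not a profile field) returns 'partially compliant' when only the placeholder field averages 70% or less.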
Other compliance statuses:
- Undergoing validation: This status will be used for all newly added feeds in the first 24 hours until initial checks are completed. It will also be used for all compliant feeds for the first 7 days until the 'automated flow' rolling validation logic becomes active.
- Awaiting publisher review: This status will be used for all feeds in the first 7 days after publishing if a critical or non-critical field(s) has not been provided by >70% of vehicles in a daily check.
- Unavailable due to dormant feed: This status will be used for all feeds which don’t have any vehicles running for 7 consecutive days and have therefore repeatedly evaded validation.
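One way to picture how these interim statuses relate to each other is the sketch below. It is a hedged reading of the three bullets, not the service's actual logic: the function `interim_status` and its parameters are hypothetical, and the order in which the rules are applied (dormancy first, then feed age) is an assumption.

```python
from datetime import timedelta


def interim_status(age, idle_days, daily_check_passed):
    """Return an interim status, or None once the 7-day rolling logic applies.

    `age` is the time since the feed was added (a timedelta), `idle_days`
    is the number of consecutive days with no vehicles running, and
    `daily_check_passed` says whether the latest daily check found all
    fields provided by more than 70% of vehicles.
    """
    if idle_days >= 7:
        return "Unavailable due to dormant feed"
    if age < timedelta(hours=24):
        # Newly added feed: initial checks are not yet complete.
        return "Undergoing validation"
    if age < timedelta(days=7):
        if daily_check_passed:
            return "Undergoing validation"
        return "Awaiting publisher review"
    # After 7 days the rolling compliance status takes over.
    return None
```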
New feed validation process:
When a new feed is added to BODS it will be validated in the following way:
Automated feed validation process:
NeTEx data is validated against its schema to check that it is in the expected format. As this format is new to the UK, more data quality checks may be enabled over time.