Vaccine Safety Analysis

In support of the national vaccination service, Aire Logic has embedded a team of Data Analysts to investigate data quality issues and provide insights into the vaccination data.


Aire Logic has been supporting NHS England (NHSE) with its implementation and deployment of a national vaccination service, from the booking to the capturing and onward dissemination of vaccination events. 

Vaccination records are recorded in a Point of Care (PoC) system, and this data is then captured by NHSE. The vaccination events are then disseminated to UKHSA, to the citizen's GPIT supplier and on to Foundry, the national reporting tool for vaccination uptake and related measures.

In support of the vaccination service, Aire Logic has embedded a team of Data Analysts with the main aim of investigating Data Quality issues and providing insights into the vaccination data.

A recent key piece of analysis was the investigation into a potential vaccine safety issue and the extent to which the error had persisted.


Problem Statement

During the Autumn/Winter 2022 Covid vaccination campaign, it became apparent that a number of vaccination sites were entering incorrect information onto a vaccination record. 

The expectation of the programme was that the “Spikevax Zero Omicron Bivalent” vaccination would be administered as a booster dose from September 2022.

However, it was found that booster doses recorded on patients' vaccination records since September 2022 named a decommissioned vaccine, “Spikevax Original”.

Initial information suggested that a large proportion of the doses recorded as “Spikevax Original” since September 2022 carried a batch number belonging to the “Spikevax Zero Omicron Bivalent” vaccination.

~170,000 patient records were recorded as “Spikevax Original” while carrying a “Spikevax Zero Omicron Bivalent” batch number, all dated after the September 2022 operational date from which the “Spikevax Zero Omicron Bivalent” vaccination was expected to be administered.


The data insights team was tasked with analysing the situation and answering the following questions:

  1. How many vaccination records have a vaccination product term that is inconsistent with the batch number?
  2. Which suppliers (PoC) have this error, and is the issue occurring with one supplier or several?
  3. If we were to fix the records, what would the impact be on GPs? (Updates to vaccination records are not automatically ingested.)
  4. How long would it take to update the citizens' records if we bulk updated them in batches of 50 per GP practice per week?


Initial analysis identified that records carrying these batch numbers were incorrectly linked to the “Spikevax Original” vaccination instead of “Spikevax Zero Omicron Bivalent”.

Data Cleansing

Aire Logic received a list of batch numbers known to be “Spikevax Zero Omicron Bivalent”. The batch numbers were meant to follow a consistent entry format (for example, ABC123), but we saw a variety of formats: additional letters or numbers, entries in lowercase, and so on. We developed a function in PySpark that captured these variations, and from there we were able to assess which batch numbers we needed to include in our final searches.
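The kind of matching described above can be illustrated in plain Python. This is only a minimal sketch and not the PySpark function the team wrote: the batch numbers, the function names and the "contains a known batch number" rule are all assumptions made for illustration.

```python
import re

# Hypothetical reference list; the real batch numbers are not given in this article.
KNOWN_BIVALENT_BATCHES = {"ABC123", "DEF456"}


def normalise_batch(raw):
    """Upper-case the entry and drop anything that is not a letter or digit."""
    return re.sub(r"[^A-Z0-9]", "", (raw or "").upper())


def match_known_batch(raw, known=KNOWN_BIVALENT_BATCHES):
    """Return the known batch number this entry corresponds to, or None.

    Assumption: an entry matches when its normalised form contains a known
    batch number, which catches extra leading or trailing characters.
    """
    cleaned = normalise_batch(raw)
    for batch in sorted(known):
        if batch in cleaned:
            return batch
    return None


print(match_known_batch(" abc-123 "))  # ABC123
print(match_known_batch("abc123x"))    # ABC123
print(match_known_batch("XYZ999"))     # None
```

A substring rule like this can over-match, so any variants it surfaces would still need a manual review before being added to the final search list.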

The vaccinations stored in our databases are based on logical updates: each new record we receive is labelled ‘New’, an update to an existing record is labelled ‘Update’, and a deleted record is labelled ‘Deleted’, which means the history of each vaccination record is retained. As a result, we needed to deduplicate the data and select the most recent version of each vaccination event. To do this we wrote a window and partition function, which then allowed us to use filtering to identify the records with the relevant dates.
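The "latest version per event" step can be sketched in plain Python as follows. The field names (event_id, action, received_at) and the decision to drop events whose latest version is ‘Deleted’ are assumptions for illustration. In PySpark the same pattern is commonly expressed with a window partitioned by the event identifier, ordered by timestamp descending, keeping the row where row_number() equals 1.

```python
from datetime import datetime


def latest_versions(records):
    """Keep only the most recent version of each vaccination event."""
    latest = {}
    for rec in records:
        key = rec["event_id"]  # hypothetical unique event identifier
        if key not in latest or rec["received_at"] > latest[key]["received_at"]:
            latest[key] = rec
    return list(latest.values())


def current_records(records):
    """Latest versions, excluding events whose latest state is 'Deleted' (an assumption)."""
    return [r for r in latest_versions(records) if r["action"] != "Deleted"]


# Illustrative history: E1 was updated, E2 was deleted, E3 is unchanged.
history = [
    {"event_id": "E1", "action": "New", "received_at": datetime(2022, 9, 10)},
    {"event_id": "E1", "action": "Update", "received_at": datetime(2022, 10, 1)},
    {"event_id": "E2", "action": "New", "received_at": datetime(2022, 9, 12)},
    {"event_id": "E2", "action": "Deleted", "received_at": datetime(2022, 9, 15)},
    {"event_id": "E3", "action": "New", "received_at": datetime(2022, 9, 20)},
]

for rec in current_records(history):
    print(rec["event_id"], rec["action"])
```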


Once we had cleaned the data, we were able to begin the full analysis.

Our analysis confirmed the initial hypothesis: of the 18 million vaccinations delivered in the campaign, ~180,000 Covid vaccinations were recorded with the “Spikevax Original” vaccine product code while carrying a batch number belonging to the “Spikevax Zero Omicron Bivalent” vaccine.

Of the 7 PoC suppliers, PoC B had proportionately the highest number of incorrectly entered records, at ~90%.
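The mismatch count and the supplier breakdown can be sketched as below. The record fields, supplier labels and batch numbers are illustrative assumptions, and the sketch computes each supplier's share of the mismatched records, which is only one possible reading of the ~90% figure.

```python
from collections import Counter

ORIGINAL = "Spikevax Original"


def mismatches(records, bivalent_batches):
    """Records coded as the original product but carrying a bivalent batch number."""
    return [
        r for r in records
        if r["product"] == ORIGINAL and r["batch"] in bivalent_batches
    ]


def supplier_share(records, bivalent_batches):
    """Share of the mismatched records attributable to each supplier."""
    counts = Counter(r["supplier"] for r in mismatches(records, bivalent_batches))
    total = sum(counts.values())
    return {s: n / total for s, n in counts.items()} if total else {}


# Hypothetical, already-normalised sample data.
bivalent = {"ABC123"}
sample = [
    {"supplier": "PoC A", "product": ORIGINAL, "batch": "ABC123"},
    {"supplier": "PoC B", "product": ORIGINAL, "batch": "ABC123"},
    {"supplier": "PoC B", "product": ORIGINAL, "batch": "ABC123"},
    {"supplier": "PoC B", "product": ORIGINAL, "batch": "ABC123"},
    {"supplier": "PoC B", "product": "Spikevax Zero Omicron Bivalent", "batch": "ABC123"},
    {"supplier": "PoC A", "product": ORIGINAL, "batch": "OLD001"},
]

print(len(mismatches(sample, bivalent)))  # 4
print(supplier_share(sample, bivalent))   # {'PoC A': 0.25, 'PoC B': 0.75}
```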

On average it takes a GP ~5 minutes to review a vaccination record. If we were to correct and reflow the vaccination events, it would take a total of ~15,000 hours to review and ingest the updates into the patients' clinical records, which would be time not spent seeing patients.
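The ~15,000 hour figure follows directly from the approximate record count and review time quoted in this article, as this short calculation shows (the variable names are ours):

```python
MISMATCHED_RECORDS = 180_000  # approximate figure from the analysis
MINUTES_PER_REVIEW = 5        # approximate GP review time per record

total_minutes = MISMATCHED_RECORDS * MINUTES_PER_REVIEW
total_hours = total_minutes / 60

print(total_hours)  # 15000.0
```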

We identified that ~5,000 practices would receive updates, of which:

  • ~4,600 would receive updates to 50 records or fewer.
    • A one-off update capped at:
      • 50 would fix the records in ~87% of practices
      • 100 would fix the records in ~92% of practices
      • 200 would fix the records in ~96% of practices
  • ~400 practices had more than 50 vaccination records to receive updates
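Coverage figures of this kind can be derived from a per-practice count of affected records. The sketch below assumes a simple list of counts, one per practice, and the numbers are invented for illustration rather than taken from the analysis.

```python
def share_within_cap(records_per_practice, cap):
    """Fraction of practices whose affected records fit in one update of size `cap`."""
    if not records_per_practice:
        return 0.0
    covered = sum(1 for n in records_per_practice if n <= cap)
    return covered / len(records_per_practice)


# Hypothetical counts of affected records for ten practices.
counts = [10, 40, 50, 75, 120, 300, 20, 5, 60, 200]

for cap in (50, 100, 200):
    print(cap, share_within_cap(counts, cap))
```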

The analysis showed that it would take around 43 weeks to correct the vaccination records if the updates were limited to 50 per week per GP practice. We highlighted to the stakeholders that updating the records would need to be a process triggered manually on a weekly basis.
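With a fixed weekly cap per practice, the overall duration is driven by the practice with the most affected records. A minimal sketch of that calculation follows; the per-practice counts are hypothetical, with the largest chosen only so that the result matches the 43 weeks quoted above.

```python
import math


def weeks_to_clear(records_per_practice, weekly_cap):
    """Weeks needed until every practice has received all of its updates."""
    largest = max(records_per_practice, default=0)
    return math.ceil(largest / weekly_cap)


# Hypothetical counts; 2,150 is illustrative, not a figure from the analysis.
print(weeks_to_clear([30, 120, 2150], weekly_cap=50))  # 43
```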

Aire Logic recommended that the updates to the GPIT systems should be blocked, as reflowing all the data would take more than 43 weeks and use ~15,000 hours of GP time. We highlighted that the events would still flow to UKHSA and Foundry, so the records would be correct nationally. The risk was minimal compared with the burden on a GP's limited team and resources.


As a result of the analysis, the clinical team deemed it acceptable to update the records but block the updates to GP practices, as the time needed to update all the records would be too burdensome. The records will flow on to Foundry and UKHSA, so the vaccination records will be correct nationally and for reporting purposes.

By not flowing the data on to GPs for review, there was an estimated saving of £540,000, assuming average GP pay of £36 per hour. It also leaves 15,000 hours in which GPs can still see patients.
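The saving is simply the estimated review hours multiplied by the assumed hourly pay, both taken from the figures in this article:

```python
GP_HOURS_AVOIDED = 15_000  # estimated review time avoided, in hours
GP_PAY_PER_HOUR = 36       # assumed average GP pay in pounds per hour

saving = GP_HOURS_AVOIDED * GP_PAY_PER_HOUR

print(f"£{saving:,}")  # £540,000
```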

We recommended that the PoC B entry tool be examined because of the high number of errors coming from PoC B. Our hypothesis was that the error was caused by incorrect data entry due to issues with the PoC B tool.

On examination, it was found that the wording of “Spikevax Original” and “Spikevax Zero Omicron Bivalent” appeared similar when the operator selected the vaccination, and that “Spikevax Original” had not been removed as an option. This meant that the incorrect vaccine data was sent, confirming our hypothesis.
