Kevin

Data Scientist and Software Engineer

Role

Data scientist, data engineer and software engineer. Particle physics PhD. Former research scientist on the LHCb experiment at the European Organization for Nuclear Research (CERN).

Background

Kevin’s background is in high-energy physics, where he measured differences between matter and antimatter and nanosecond particle lifetimes. The large volume of data collected, together with the lifetime-dependent efficiencies introduced by the detector apparatus, made the analysis very challenging: each of the 200 million events had its own efficiency, which had to be included in a maximum likelihood estimation.

Kevin went on to apply natural language processing (NLP) to health triage data using recurrent neural networks. The models used vector embeddings, which are multidimensional representations of words constructed using unsupervised learning.

During the COVID-19 pandemic, Kevin was a key worker for the NHS, building the data warehouse and analysis platform used to land, load, analyse and disseminate test results and vaccination records. He was also a data engineer tasked with creating a single asset of linked healthcare data, to be used by government and researchers to answer questions about COVID-19. In addition, he worked on an NLP project that mapped textual medicine names to integer codes (from the standard dictionary of medicines) to improve data quality and clinical safety.

Technologies

  • AWS
  • Python
  • Spark
  • Terraform
  • Docker

Methods

  • GBDTs
  • Maximum likelihood estimation