Deriving an actionable patient phenome from healthcare data
Funded by MRC, Feb 18 - Feb 21, £315,181
The proposed research will devise a semantic electronic health record toolkit that derives a consistent and comprehensive patient phenome from structured and unstructured electronic health records, and provides semantic computation over it to support decision making for tailored care, trial recruitment and research.
Graph-Based Data Federation for Healthcare Data Science
Funded by MRC, Mar 19 - Nov 19, £260,057
We know that many in-depth healthcare questions can only be explored if we can look across data for the whole UK. However, we manage data locally and describe it in different ways to suit local communities. To get a global view from local data we need a map that tells us precisely where to look for data and how to interpret it when we find it. With this sort of map we can link data between localities in a way that makes access more predictable and rapid, while also allowing the people managing different data sets to retain control of how the data in their charge is shared. We can also treat the map itself as data that can be shared to give insights into potential uses of data linkage, and to encourage as wide a variety of innovators as possible to build tools that can be used across the data landscape: enriching the data, revealing new knowledge and extending the map.
The challenge we face is that most of the information held within medical records is in written form – sometimes referred to as unstructured text – which is difficult to use in research: for example, ‘the patient feels very tired and breathless, is losing weight, and says her heart is beating very fast’. We need to develop special computerised tools to process these words to ensure we have a full picture of all patient symptoms, experiences and diagnoses to use in research for patient benefit.
We will establish a natural language processing (NLP) research community that will address the complexity of clinical text through the development of shared tools and standards with inbuilt patient confidence and engagement, supporting joint working across industry, academia and the NHS. The community will be open and inclusive, and will develop capability for UK-wide NLP research at scale whilst providing clear ‘quick wins’ through exemplar projects and shared material and datasets for training and implementation, with the ultimate aim of integrating with other health data analytics. The project will lay the foundations for a sustainable model for collaborative working, thus attracting funding for the next 4 years and beyond.