Getting FHIR’ed up with a Graph Database (Neo4j)

Ruchika Kharwar
4 min read · Jan 16, 2022

My world recently evolved into having deeper and deeper conversations with folks solving problems in the healthcare domain. I love thinking about problems in different ways and finding any opportunity to tinker with new technology. I often work with FHIR data and the Healthcare API in Google Cloud Platform. While I had gained enough practice traversing the data in the Google Healthcare API, I wanted to explore other ways to understand how a patient is represented there. This blog post is my first rendezvous with graph technology as a way to study patient data. In it I walk through all the steps, from data creation to visualizing the data with graph technology from Neo4j.

My journey with healthcare, and specifically healthcare data, has been quite organic. While studying the FHIR specification here, one of the first things that stuck out was the nature of the specification: it’s easy to see the “linked” or “graph” nature of the data. For example, looking at the Resource definitions here, an Encounter has a unique identifier. Observations made during an Encounter all carry a reference to the Encounter they are part of. The same pattern shows up in MedicationRequests, Conditions, and Claims: along with a unique id for the resource itself, there are references to the Encounter, the Patient, and other associated resources.
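To make the “graph” shape concrete, here is a minimal Python sketch that pulls reference edges out of a FHIR bundle. It is not the loader from the repository. The function name and the tiny sample bundle are made up for illustration, and it only looks at top-level reference fields such as subject and encounter, not nested ones.

```python
def extract_reference_edges(bundle):
    """Return (source, field, target) triples for top-level references in a bundle."""
    edges = []
    for entry in bundle.get("entry", []):
        resource = entry.get("resource", {})
        source = f'{resource.get("resourceType")}/{resource.get("id")}'
        for field, value in resource.items():
            # A FHIR Reference is an object carrying a "reference" string;
            # some fields hold a list of such objects.
            candidates = value if isinstance(value, list) else [value]
            for item in candidates:
                if isinstance(item, dict) and isinstance(item.get("reference"), str):
                    edges.append((source, field, item["reference"]))
    return edges


# Hypothetical, heavily trimmed bundle (real bundles carry many more fields).
sample_bundle = {
    "resourceType": "Bundle",
    "entry": [
        {"resource": {"resourceType": "Patient", "id": "pat-1"}},
        {"resource": {"resourceType": "Encounter", "id": "enc-1",
                      "subject": {"reference": "urn:uuid:pat-1"}}},
        {"resource": {"resourceType": "Observation", "id": "obs-1",
                      "subject": {"reference": "urn:uuid:pat-1"},
                      "encounter": {"reference": "urn:uuid:enc-1"}}},
    ],
}

edges = extract_reference_edges(sample_bundle)
for source, field, target in edges:
    print(f"{source} -[{field}]-> {target}")
```

Each triple is a candidate relationship in the graph: the resource is the start node, the field name suggests the relationship type, and the reference points at the end node.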

Create your synthetic FHIR Data

However mundane, no good data journey begins without some sample data. For this purpose, I use the Synthea patient data generator. This link here has the steps to set it up. If all goes well, you’ll be able to generate data in the form of FHIR bundles like so (tbd).

Set up Neo4j

Set up your Neo4j instance from the GCP Marketplace. It’s the easiest path to a graph database with compute resources on GCP, where you can also set up a web application connected to your Neo4j instance.

Load data

I load only a few parts of the patient bundle into the graph database, the ones I especially wanted to explore. These relate directly to resourceType in FHIR. The script used to load the data is in the GitHub repository linked in the references. The parts I load are:

  • Patient
  • Provider
  • Encounter
  • Observation
  • MedicationRequest
  • Procedure

Some noteworthy points: I started by creating constraints in Neo4j to ensure I had unique entities for a given resourceType. I also chose to create a linked list of a Patient’s Encounters, in order of occurrence, to get a temporal view of the data.

If all works well, you should be able to browse the data in the Neo4j Browser.
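As a rough illustration of those two points, the Python sketch below builds one uniqueness constraint statement per label and orders a patient’s encounters into consecutive pairs that could become linked-list relationships. This is not the script from the repository: the `id` property, the constraint names, and the sample encounters are assumptions, and the `FOR ... REQUIRE` form is the Neo4j 4.4+ syntax (older versions use `ON ... ASSERT`).

```python
from datetime import datetime

# Labels taken from the list of loaded resource types above.
LABELS = ["Patient", "Provider", "Encounter", "Observation",
          "MedicationRequest", "Procedure"]


def constraint_statements(labels):
    """Build one uniqueness constraint per label (assumes an `id` property)."""
    return [
        f"CREATE CONSTRAINT {label.lower()}_id IF NOT EXISTS "
        f"FOR (n:{label}) REQUIRE n.id IS UNIQUE"
        for label in labels
    ]


def encounter_chain(encounters):
    """Sort encounters by period.start and pair each one with its successor."""
    ordered = sorted(
        encounters,
        key=lambda e: datetime.fromisoformat(e["period"]["start"]),
    )
    ids = [e["id"] for e in ordered]
    # Each (earlier, later) pair is a candidate "next encounter" relationship.
    return list(zip(ids, ids[1:]))


# Hypothetical encounters, deliberately out of order.
sample_encounters = [
    {"id": "enc-b", "period": {"start": "2015-03-01T09:00:00-05:00"}},
    {"id": "enc-a", "period": {"start": "2012-06-10T14:30:00-04:00"}},
    {"id": "enc-c", "period": {"start": "2020-11-20T08:15:00-05:00"}},
]

statements = constraint_statements(LABELS)
chain = encounter_chain(sample_encounters)
print(statements[0])
print(chain)
```

With the chain in place, the most recent encounter is simply the one with no successor, which makes “last visit” questions cheap to answer.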

Encounters and Observations
Medical Conditions

Asking questions and exploration

This was my favorite part of this exercise. I wanted to experience patient data with a graph tool, so for this section I used Neo4j Bloom and Bloom perspectives.

The first question I asked was “Show me patient <fname>, <lname>”. Right-clicking on the node made it easy to explore the patient details.

The next question was “What conditions does patient Clara183 Carbajal274 have?”

I was able to write up Bloom search queries quite easily.

I wanted to explore additional details and quickly dig into the next question: “What was patient Clara183 Carbajal274’s last visit about?”

In this next graph, it’s interesting to see the details of the last encounter and its related observations.

Last Encounter details
Observation details

The final question I explored was “What is the current medication list for patient Clara183 Carbajal274?”
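For readers who want to try similar questions outside Bloom, here is a hedged sketch of how the four questions above might look as parameterized Cypher, wrapped in Python. The labels, relationship types (`HAS_CONDITION`, `HAS_ENCOUNTER`, `NEXT`, `PART_OF`, `HAS_MEDICATION_REQUEST`), and property names are all hypothetical and may not match the schema in the repository. Treating `status = 'active'` as “current” is also an assumption.

```python
# Hypothetical schema: every label, relationship type, and property here is
# an assumption for illustration, not the schema used in the repository.
PATIENT = "(p:Patient {firstName: $fname, lastName: $lname})"

QUERIES = {
    "patient": f"MATCH {PATIENT} RETURN p",
    "conditions": (
        f"MATCH {PATIENT}-[:HAS_CONDITION]->(c:Condition) RETURN p, c"
    ),
    # The last encounter is the one with no successor in the linked list.
    "last_encounter": (
        f"MATCH {PATIENT}-[:HAS_ENCOUNTER]->(e:Encounter) "
        "WHERE NOT (e)-[:NEXT]->(:Encounter) "
        "OPTIONAL MATCH (e)<-[:PART_OF]-(o:Observation) "
        "RETURN p, e, o"
    ),
    "current_medications": (
        f"MATCH {PATIENT}-[:HAS_MEDICATION_REQUEST]->(m:MedicationRequest) "
        "WHERE m.status = 'active' RETURN p, m"
    ),
}


def run_query(uri, auth, name, **params):
    """Run one of the sketches above; needs the `neo4j` Python driver."""
    from neo4j import GraphDatabase  # imported lazily so the module loads without it

    with GraphDatabase.driver(uri, auth=auth) as driver:
        with driver.session() as session:
            result = session.run(QUERIES[name], **params)
            return [record.data() for record in result]


for name, query in QUERIES.items():
    print(f"{name}: {query}")
```

A call such as `run_query("bolt://localhost:7687", ("neo4j", "password"), "conditions", fname="Clara183", lname="Carbajal274")` would then return the matching records, assuming a reachable instance with that schema. The URI and credentials shown are placeholders.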

Next Steps

This little exercise is the tip of the iceberg. There are several other resources that need to be modeled here, such as the Claims resource, with the goal of exploring aspects more interesting to the payor community.

The exploration above was done using Neo4j Bloom with Cypher queries. An interesting integration I hope to work on is using Google Cloud’s Vertex AI to build a natural language entity extraction model that acts as a “Questions Parser”. I envision this augmenting the Bloom query function, using Google machine learning to feed queries to Neo4j.

Disclaimer

All thoughts in this blog post are my own, and this exploration was done during the pandemic, which forced long hours indoors.

References

GitHub repository. It contains:

  • 1 sample patient file
  • Script for node and relationship creation

Ruchika Kharwar

I work for Google helping customers achieve their data modernization with Google Cloud Platform.