Evidation Health Project Update 1


Our project sponsor is Evidation Health, a Santa Barbara-based company that aims to improve health through data. Evidation connects people and companies that want to contribute to health-related data projects. Organizations in areas such as “biopharma, med-tech, big tech, academic institutions, professional societies, and government” employ Evidation to generate and analyze data for specific projects.1 The subject of our project is the Stress and Recovery of frontline healthcare workers. Our project uses wearable, survey, and biomedical data collected from healthcare workers during the pandemic to analyze changes in sleep patterns, cognition, and more. The wellbeing of frontline healthcare workers is extremely important in the middle of a worldwide pandemic. Improving the observation and prediction of changes in this population’s wellbeing through low-burden wearable data aligns with Evidation’s goal of improving the overall health of our communities.

Project Goals

Participants were continuously surveyed on their stress levels, tested on cognition, and wore devices that tracked heart rate, body temperature, etc. We therefore hope to detect and then model changes in participants’ stress and cognition levels from wearable variables such as heart rate and body temperature. Some participants wore multiple wearable devices and participated in a “stress reduction intervention (physical exercise or meditation)”.2 These additional aspects of the study will hopefully allow us to assess which wearable data streams best detect changes and to identify factors that contribute to increases or decreases in these two outcomes.


Unfortunately, we do not yet have access to the dataset from Evidation Health. Since we are dealing with real patient data, we need approval from UCSB and must be certified to handle patient data through Synapse. In the meantime, we have been working with two sets of Kaggle data that we believe closely resemble the actual data we will be working with.3 Kaggle, a subsidiary of Google, provides a platform for data scientists to publish and obtain datasets. The two datasets we obtained from Kaggle come from individuals who published their own wearable data. The first dataset contains Garmin data, which tracks sleep and resting heart rate. The second dataset contains Oura ring data, which tracks a user’s sleep, readiness, and activity scores.

Some background on the devices that we will be working with: the Garmin is a multisport fitness watch that offers features such as heart rate monitoring, running and cycling statistics, and navigation.4 The Oura ring is a wearable ring that offers 24/7 heart rate monitoring, sleep analysis, seven temperature sensors, and daily calculated scores that provide the user with personalized health insights.5

In the Kaggle dataset, the Garmin sleep data includes variables such as calendarDate, sleepEndTime, sleepStartTime, sleepWindowConfirmationType, and total_sleep_hours. Other Garmin data include activeKilocalories, allDayStress, burnedKilocalories, etc. For now, we are working with the sleep data, but the Garmin watch offers a breadth of other variables, such as kilocalories and heart rate, as shown in the figures below. The Garmin watch tracks variables such as heart rate by directing light from a light-emitting diode (LED) onto the user’s skin. The reflected light is received by a photodiode, which sends a light intensity signal to the processor.6

The Oura Ring works similarly to the Garmin watch, but it provides a more concise report on the user’s bodily health. The ring tracks resting heart rate, heart rate variability, body temperature, and respiratory rate. These metrics help the Oura ring create a “readiness_score”, which is shown in the figure below. The company does not disclose exactly how it calculates its scores, but they take into account the user’s overall sleep, heart rate levels, and activity. The scores are reported to the user daily, and each score relies on different sensing technology.

The activity score uses a 3D accelerometer to track step count, training frequency, and training volume.7 For the sleep score, the Oura measures resting heart rate, body temperature, movement, and time spent in specific sleep stages, including light, deep, and REM. Oura applies proprietary algorithms to these measurements to generate a summarized picture of sleep patterns.8 Lastly, the Oura measures daytime heart rate every 5 minutes using green LEDs embedded in the ring. The green LEDs enable the ring to take measurements using photoplethysmography (PPG) technology at a frequency of 50Hz. PPG technology works by shining a light onto the surface of the skin.9

The Evidation Health research paper also discusses stress-related data collected outside of the Garmin and Oura Ring. The paper uses a dataset collected by 4YouandMe,10 which contains the results of continuous stress surveys, cognition tests, etc. One particularly interesting aspect of the dataset is the cortisol levels measured from participants’ hair samples. Cortisol is “your body’s main stress hormone”. Your brain releases extra cortisol when a “pressure or danger” is perceived.11 Since cortisol is a biomedical indicator of stress, it will be interesting to see how cortisol levels relate to participants’ survey-reported stress levels.

Initial Findings

Oura Ring Data

The dataset comes from an anonymous user on Kaggle who was generous enough to share their Oura Ring data. The data includes average resting heart rate, sleep score, activity score, and readiness score. Our goal in working with this dataset was to familiarize ourselves with wearable data and to create functions and methods that could later be used on the real data.

We wanted to compare each Oura Ring variable across stress values in order to evaluate its relevance for stress prediction. However, our Oura Ring dataset did not come with low/high stress ground truth labels, so we created a pseudo-random column of 0’s and 1’s to denote low and high stress, respectively. We handled missing values by excluding them from the analysis. We did not impute them because the variables behave erratically, and because imputation done incorrectly or inappropriately may lead to erroneous analysis and predictive models. Our next step was to subset the data by stress value, plot a histogram for each group, and look at the summary statistics. Of course, since the stress values were simulated, we did not expect a meaningful analysis, and for this reason we did not look too closely at the relationships between each variable and the stress values. As previously mentioned, though, going through the process will help immensely when we receive access to the real data. Below are two histograms of sleep scores that, coincidentally, conform to what our intuition suggests: that a better sleep score is associated with lower stress. The histograms are included because they illustrate how we would go about handling wearable data and determining its pertinence for stress prediction.
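The workflow described above can be sketched roughly as follows. This is a minimal illustration rather than our exact code: the column names (sleep_score, readiness_score, stress), the toy values, and the random seed are assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the Oura export; column names and values are assumptions.
oura = pd.DataFrame({
    "sleep_score": [82, 75, np.nan, 90, 68, 71, 88, np.nan],
    "readiness_score": [80, 70, 65, np.nan, 72, 77, 91, 60],
})

# Simulated labels: 0 = low stress, 1 = high stress (seeded so the run is repeatable).
rng = np.random.default_rng(seed=0)
oura["stress"] = rng.integers(0, 2, size=len(oura))

# Exclude rows with missing values instead of imputing them.
complete = oura.dropna()

# Summary statistics of the sleep score within each simulated stress group.
summary = complete.groupby("stress")["sleep_score"].describe()
print(summary)
```

A histogram for each group could then be drawn from the corresponding subset, for example complete[complete["stress"] == 0]["sleep_score"]. Because the labels are random, any difference between the groups in this sketch is noise.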

Garmin Watch Data

The dataset comes from Kaggle user K Scott Mader, who was kind enough to share his Garmin Connect data. To review, the dataset includes extensive sleep data such as Deep Sleep Seconds, Light Sleep Seconds, Sleep Start Time, and Sleep End Time. The same user also shared a dataset of activity data from the Garmin Watch, which includes categories such as Burned Kilocalories and Total Distance Covered in Meters. Our work thus far has focused on the sleep data. By working with it, we aimed to familiarize ourselves with the way the Garmin Watch collects sleep data and to practice exploratory data analysis methods on raw data that could later be applied to the real dataset once it becomes available.

Our exploratory data analysis of the raw data started with understanding the features of the dataset. During our analysis we discovered that the Total Sleep Hours provided were calculated by subtracting the Sleep Start Time from the Sleep End Time. We also discovered that, using the datetime functions from pandas in Python, we could derive the columns Year, DayName, and DayofWeek from the CalendarDate column alone. We observed that features such as NapList and SleepResultType are almost entirely NaN (missing data), with few exceptions. We then created multiple plots of the data, ranging from heatmaps to scatter plots and seaborn lmplots (regression plots), which helped us see whether certain variables, such as Total Sleep Hours and Deep Sleep Seconds, were correlated.
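As a rough sketch of the derivations above, the snippet below builds the calendar columns and recomputes the sleep duration with pandas. The two rows and the timestamp string format are made up for illustration; the real export may store times differently (for example, as epoch values), in which case the parsing step would need to change.

```python
import pandas as pd

# Hypothetical rows; column names follow the Garmin export described above,
# but the values and the timestamp format are assumptions.
sleep = pd.DataFrame({
    "calendarDate": ["2017-03-05", "2017-03-06"],
    "sleepStartTime": ["2017-03-04 23:30:00", "2017-03-05 22:45:00"],
    "sleepEndTime": ["2017-03-05 06:00:00", "2017-03-06 06:15:00"],
})

# Parse the text columns into pandas datetimes.
for col in ["calendarDate", "sleepStartTime", "sleepEndTime"]:
    sleep[col] = pd.to_datetime(sleep[col])

# Calendar features derived from calendarDate alone.
sleep["Year"] = sleep["calendarDate"].dt.year
sleep["DayName"] = sleep["calendarDate"].dt.day_name()
sleep["DayofWeek"] = sleep["calendarDate"].dt.dayofweek  # Monday = 0, Sunday = 6

# Total sleep hours = sleep end time minus sleep start time.
elapsed = sleep["sleepEndTime"] - sleep["sleepStartTime"]
sleep["total_sleep_hours"] = elapsed.dt.total_seconds() / 3600

print(sleep[["calendarDate", "DayName", "DayofWeek", "total_sleep_hours"]])
```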

A topic that stood out to us was how the Total Sleep Hours were distributed by day of the week (above). To explore this further, we sorted the dataset by the date the data was collected. We then dropped the data from 2014 because, apart from a few outliers, the Total Sleep Hours in 2014 all had the same value, so the distribution plot was a straight line that gave little insight. Next, we split the data by year (2015-2018). Any missing values in Total Sleep Hours were filled in using our earlier finding, by subtracting the Sleep Start Time from the Sleep End Time. We then used seaborn in Python to draw boxplots for every year, showing the distribution of Total Sleep Hours by day of the week. Since our aim was to practice exploration methods, we did not draw conclusions, but we did notice trends and anomalies. One such anomaly is that in 2017 the median Total Sleep Hours on Sunday was much lower than in the other years.
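The steps in this paragraph could look roughly like the sketch below. The rows are invented, the column names are assumed to match the Garmin export, and a grouped median stands in for the boxplots so that the example has no plotting dependency.

```python
import numpy as np
import pandas as pd

# Hypothetical rows; values are invented and column names are assumptions.
sleep = pd.DataFrame({
    "calendarDate": pd.to_datetime(
        ["2014-12-28", "2015-01-04", "2015-01-05", "2016-01-03", "2016-01-04"]),
    "sleepStartTime": pd.to_datetime(
        ["2014-12-27 23:00", "2015-01-03 23:00", "2015-01-04 23:00",
         "2016-01-02 23:30", "2016-01-03 22:30"]),
    "sleepEndTime": pd.to_datetime(
        ["2014-12-28 07:00", "2015-01-04 07:00", "2015-01-05 06:00",
         "2016-01-03 05:30", "2016-01-04 05:00"]),
    "total_sleep_hours": [8.0, np.nan, 7.0, 6.0, np.nan],
})

# Sort by date and drop the uninformative 2014 records.
sleep = sleep.sort_values("calendarDate")
sleep = sleep[sleep["calendarDate"].dt.year >= 2015].copy()

# Fill missing totals with sleep end time minus sleep start time, in hours.
elapsed = sleep["sleepEndTime"] - sleep["sleepStartTime"]
sleep["total_sleep_hours"] = sleep["total_sleep_hours"].fillna(
    elapsed.dt.total_seconds() / 3600)

# Median sleep per weekday within each year (the center line of each box).
sleep["Year"] = sleep["calendarDate"].dt.year
sleep["DayName"] = sleep["calendarDate"].dt.day_name()
medians = sleep.groupby(["Year", "DayName"])["total_sleep_hours"].median()
print(medians)
```

For the plots themselves, each year's subset could be passed to seaborn, for instance sns.boxplot(data=one_year, x="DayName", y="total_sleep_hours"), where one_year is a hypothetical name for a single year's rows.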

Future Work

One idea that we may pursue in the future is generating fake outcome data for the Garmin Watch while building upon the fake outcome data that was generated for the Oura Ring. We would also like to identify different methods for predicting stress levels and apply those methods to our real dataset. It will be important to determine which features to include in our model, and possibly to write a function that plots our data so we can visually identify when stress is present. Another important discussion that we would like to pursue further is how to define “stress” specifically and identify its physical manifestations. Stress is a broad word that is used in our everyday lives, but because we are tasked with data analysis involving stress, we need to formulate a reasonably concrete definition of it. There is also an array of questions that we would like to explore and better understand: Is the Garmin Watch or the Oura ring a better wearable device? How should missing values be imputed when dealing with data from individual humans? Where, if anywhere, does wearable technology fall short in gathering stress data? Why is there missing data? For example, is sleep data missing because the wearable device was not worn, or because of other activities? How do we decide whether missing data should be imputed or dropped completely? There are a lot of questions that we aim to answer as a team, and we are extremely excited for what comes in the future!