How to get your Whoop data

May 08, 2022

There are currently only two ways to get your Whoop data:

  • Use the unofficial Whoop REST API.
  • Request your data directly from Whoop, referring to GDPR or HIPAA.

This post is about using the Whoop API to read your sleeps, workouts, heart rates and much more. Unfortunately, Whoop does not yet provide an official API, so this one is not suitable for professional projects. The API was reverse engineered by pelo-tech, and you can find a technical description of the API here.

I will cover the newer version of the API (v2), as it includes the additional data that comes with the new Whoop hardware (Whoop Strap v4), e.g. skin temperature.

As mentioned, the API is unofficial, but it can still be used to read your biomarkers and get insights for yourself. I will provide more information on how to get insights from your Whoop data in another post.

API Overview

This post will cover how to get data about:

  • Sleep (core sleeps and naps)
  • Sleep stages
  • Recovery
  • Activities like workouts or meditations
  • Activity surveys
  • Heart Rate series

Whoop exposes some interesting things through this API; even your probability of having Covid is included. 😷

Getting started

I’m using Python and pandas to read the data. The first step is to get a long-lived refresh token, which expires only after some months. Request it from the OAuth endpoint like this:

curl -X "POST" "https://api-7.whoop.com/oauth/token" \
     -H 'Content-Type: application/json; charset=utf-8' \
     -d $'{
  "username": "your@email.com",
  "password": "yourPassword",
  "issueRefresh": true,
  "grant_type": "password"
}'

Now we will use the refresh token to get a short-lived access token. This token will be used for the actual endpoint requests.

import json
import requests

# refresh_token holds the long-lived token returned by the request above
response = requests.post(
    url="https://api-7.whoop.com/oauth/token",
    headers={
        "Content-Type": "application/json; charset=utf-8",
    },
    data=json.dumps({
        "grant_type": "refresh_token",
        "refresh_token": refresh_token
    })
)
access_token = response.json()['access_token']

The access token is sent in the Authorization header. I’m using a requests session to apply it to all following requests:

s = requests.Session()
s.headers.update({
    'Content-Type': 'application/json; charset=utf-8',
    'Authorization': f'Bearer {access_token}',
})

Check that everything works by querying your user profile data:

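# api_url is the API base URL (presumably 'https://api-7.whoop.com', as in the
# OAuth request above) and user_id is your Whoop user ID
response = s.get(f'{api_url}/users/{user_id}')
response.json()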

Activity types

/activities-service/v1/sports

You can use this endpoint to get all the Whoop activity types possible.

import pandas as pd

response = s.get(f'{api_url}/activities-service/v1/sports')
sports = pd.DataFrame(response.json()).set_index('id')

The activities fall into different categories:

sports['category'].value_counts()

# returns
cardiovascular        55
non-cardiovascular    25
restorative            3
muscular               2

This endpoint is useful if you want to map the activity IDs in your data to concrete names later.
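A minimal sketch of such a lookup with plain dictionaries. Only `id` and `category` appear in the response shown above; the `name` field on the sports records, the `sport_id` field on the workouts, and all sample values are assumptions made for illustration, so check them against your own responses.

```python
# Records shaped like the sports endpoint response.
# 'name' is an assumed field; the values are illustrative only.
sports_records = [
    {"id": 0, "name": "Sport A", "category": "cardiovascular"},
    {"id": 1, "name": "Sport B", "category": "restorative"},
]

# Hypothetical workouts; 'sport_id' is an assumed field name.
workout_records = [
    {"id": 101, "sport_id": 1},
    {"id": 102, "sport_id": 99},  # an ID that is missing from the sports list
]


def build_sport_lookup(sports):
    """Map each sport ID to its name."""
    return {sport["id"]: sport["name"] for sport in sports}


def label_workouts(workouts, lookup):
    """Return copies of the workouts with a 'sport_name' entry added."""
    return [
        {**workout, "sport_name": lookup.get(workout["sport_id"], "unknown")}
        for workout in workouts
    ]


labeled = label_workouts(workout_records, build_sport_lookup(sports_records))
```

Falling back to "unknown" keeps the join from failing if a workout refers to an ID that the sports list does not contain.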

Meditations

If you have HealthKit connected, Whoop also imports your meditation sessions from third-party apps like Headspace, or breathing sessions from your Apple Watch.

Cycles

/activities-service/v1/cycles/aggregate/range/{userId}

A cycle represents a day that starts and ends when you go to bed, e.g. 22:15 - 21:45 (next day). The cycles endpoint provides summarized data for each cycle about:

  • recovery
  • sleeps
  • activities

To get all your data, set startTime to when you started wearing your Whoop and endTime to the current time. You have to paginate through the results using offset. The maximum limit is 50.

import sys

offset = 0
total_count = sys.maxsize  # unknown until the first response arrives
records = []
while offset < total_count:
    response = s.get(
        url=f'{api_url}/activities-service/v1/cycles/aggregate/range/{user_id}',
        params={
            'startTime': '2021-01-01T01:00:00.000Z',
            'endTime': '2022-04-19T01:00:00.000Z',
            'limit': 50,
            'offset': offset
        })
    if response.status_code != 200:
        break

    payload = response.json()  # parse the body only once
    records.extend(payload['records'])
    offset = payload['offset']
    total_count = payload['total_count']
    print(f'got {offset} of {total_count} items', end='\r')
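The same pagination logic can be wrapped in a reusable generator. This is a sketch, not part of the API: `iter_pages` and `fake_fetch` are hypothetical helpers, and it assumes, as the loop above does, that the `offset` in each response is the offset to use for the next request.

```python
def iter_pages(fetch_page, limit=50):
    """Yield records page by page until total_count is reached.

    fetch_page(offset, limit) must return a dict with the keys
    'records', 'offset' and 'total_count', as in the loop above.
    """
    offset = 0
    while True:
        page = fetch_page(offset, limit)
        yield from page["records"]
        next_offset = page["offset"]
        # Stop when everything was read, or when the offset did not
        # advance (which would otherwise loop forever).
        if next_offset >= page["total_count"] or next_offset <= offset:
            break
        offset = next_offset


def fake_fetch(offset, limit):
    """Stand-in for the real request, serving 120 fake records."""
    data = list(range(120))
    chunk = data[offset:offset + limit]
    return {
        "records": chunk,
        "offset": offset + len(chunk),
        "total_count": len(data),
    }


all_records = list(iter_pages(fake_fetch))
```

For real use, `fetch_page` would wrap the `s.get(...)` call and return `response.json()`. Passing the fetch function in keeps the loop testable without network access.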

Now I’m using the records objects to read the top-level cycle attributes:

cycles = pd.json_normalize(records)
# drop sleep, activities and recovery, we'll read it later
cycles = cycles.drop(columns=['sleeps', 'workouts', 'v2_activities', 'recovery'])
# parse the first part of the date tuple
cycles.index = pd.to_datetime(cycles['cycle.days'].str.slice(2, 12))
cycles = cycles.sort_index()
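The slicing above keeps only the first date of `cycle.days`. If you want both dates, a small parser can help. This sketch assumes the value is a range string such as `['2022-04-18','2022-04-19')`, which matches the character positions used in the slice but is not confirmed by any documentation; `parse_cycle_days` is a hypothetical helper.

```python
import re
from datetime import date

# Matches ISO dates like 2022-04-18 anywhere in the string.
DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")


def parse_cycle_days(days):
    """Return all dates found in a cycle.days string as date objects."""
    return [date.fromisoformat(match) for match in DATE_PATTERN.findall(days)]


# Assumed example value, for illustration only.
start, end = parse_cycle_days("['2022-04-18','2022-04-19')")
```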

Next I’m reading sleeps and activities. There may be several sleeps and activities per cycle.

sleeps = pd.json_normalize(records, record_path=['sleeps'], meta=[['cycle', 'days']])
sleeps.index = pd.to_datetime(sleeps['cycle.days'].str.slice(2, 12))
sleeps = sleeps.sort_index()

workouts = pd.json_normalize(records, record_path=['workouts'], meta=[['cycle', 'days']])
workouts.index = pd.to_datetime(workouts['cycle.days'].str.slice(2, 12))
workouts = workouts.sort_index()

Cycles attributes

The summarized cycle data contains a lot of interesting information. Unfortunately, the survey data from your daily journals is NOT included. It would be extremely valuable, as you could run your own analysis on your journal entries. Here is a summary of the interesting cycle attributes:

time

  • cycle.days: the days of the cycle (the cycle begins and ends when going to bed)
  • cycle.during: not so interesting - exact time when cycle started and ended
  • cycle.timezone_offset: your timezone offset

performance

  • cycle.day_avg_heart_rate: your average heart rate during the day

  • cycle.day_kilojoules: kilojoules burned during the day

  • cycle.day_max_heart_rate: your max HR during the day

  • cycle.day_strain: the raw strain during the day (similar to TSS)

  • cycle.scaled_strain: the strain score you see in the app from 0 - 21.

recovery

  • recovery.calibrating: set to True when the algorithms calibrate based on your initial data

  • recovery.during: exact start and end time of your sleep

  • recovery.sleep_id: can be used to fetch sleep cycle data from the sleep events endpoint

  • recovery.recovery_score: your recovery score

  • recovery.resting_heart_rate: RHR during recovery

  • recovery.hrv_rmssd: HRV in seconds

  • recovery.skin_temp_celsius: your skin temperature during the night

  • recovery.spo2: your SpO2 during sleep

Covid

  • recovery.prob_covid: probability of having Covid

How does Whoop calculate the Covid probability? I think they trained a regression model. In my data, prob_covid correlates most strongly with:

  1. resting_heart_rate
  2. skin_temp_celsius
  3. hrv_rmssd
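A correlation like this can be checked with a plain Pearson coefficient. The sketch below uses only the standard library; `pearson` is a hypothetical helper, and the numbers are made up for illustration and are not real Whoop data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Made-up values, for illustration only.
resting_heart_rate = [52, 54, 53, 58, 61]
prob_covid = [0.01, 0.02, 0.01, 0.05, 0.09]

r = pearson(resting_heart_rate, prob_covid)
```

A value of `r` close to 1 or -1 indicates a strong linear relationship. It says nothing about how Whoop actually computes the probability.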

Survey data is not returned by the API! 😭

Unfortunately the data for all my journal entries is not provided. I would love to use it to find out about the impact of my habits.

  • recovery.responded: whether you responded to the survey; always False
  • recovery.survey_response_id: always None

Sleep

  • activity_id: sleep_id for sleep events

  • is_nap: core sleep (False) or nap (True)

  • cycles_count: how many full sleep cycles Whoop detected during sleep

  • disturbances: how often you woke up during the night

  • during: start and end time of your sleep

  • timezone_offset: your local time offset from UTC

  • optimal_sleep_times: Whoop's recommended times to fall asleep and wake up

  • sleep_consistency: how consistently you go to bed and wake up

sleep times

  • latency: how long it took you to fall asleep in milliseconds
  • time_in_bed: time you spent in bed in milliseconds
  • no_data_duration: time no data was recorded (e.g. empty battery) in milliseconds
  • quality_duration: total time you have been asleep in milliseconds
  • light_sleep_duration: light sleep in milliseconds
  • rem_sleep_duration: REM sleep in milliseconds
  • slow_wave_sleep_duration: SWS in milliseconds
  • wake_duration: total time you spent awake in bed in milliseconds
  • arousal_time: how long you spent in bed after waking up in milliseconds
  • percent_recorded: 1.0 if no_data_duration is 0
  • projected_sleep: the extrapolated sleep duration when the complete night was not recorded
  • projected_score: the extrapolated sleep score when the complete night was not recorded
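Since most of these durations are given in milliseconds, a small formatter makes them easier to read. This is a sketch; `format_ms` is a hypothetical helper and the sample value is illustrative.

```python
from datetime import timedelta


def format_ms(milliseconds):
    """Format a millisecond duration as hours and minutes, e.g. '7h 30m'."""
    total_seconds = timedelta(milliseconds=milliseconds).total_seconds()
    total_minutes = int(total_seconds // 60)  # drop leftover seconds
    hours, minutes = divmod(total_minutes, 60)
    return f"{hours}h {minutes:02d}m"


# Illustrative value: 7 hours 30 minutes expressed in milliseconds.
example = format_ms(27_000_000)
```

When working with a whole DataFrame column, `pd.to_timedelta(column, unit='ms')` does the same conversion in pandas.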

biomarkers

  • respiratory_rate: breaths per minute

Sleep Efficiency

  • sleep_need: how much sleep you need (based on your personal baseline need, sleep debt, naps from the day before and strain)
  • in_sleep_efficiency: the share of your time in bed that you spent asleep

How Whoop calculates it: $in\_sleep\_efficiency = \frac{quality\_duration}{time\_in\_bed}$

Sleep Need

  • credit_from_naps: based on the naps from the day before
  • habitual_sleep_need: Whoop estimates how much sleep you need (without taking strain, sleep debt etc. into account)
  • debt_post: your new sleep debt after sleeping
  • debt_pre: your sleep debt before the sleep
  • need_from_strain: how much additional sleep you need because of your daily accumulated strain

How Whoop calculates it:

sleep_need =
+ habitual_sleep_need
+ debt_pre
+ need_from_strain
- credit_from_naps
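As a worked example of the formula above, here is a small sketch. It assumes all four inputs are durations in milliseconds, like the other sleep durations; the sample numbers are made up.

```python
def sleep_need(habitual_sleep_need, debt_pre, need_from_strain, credit_from_naps):
    """Combine the components as in the formula above (same unit for all)."""
    return habitual_sleep_need + debt_pre + need_from_strain - credit_from_naps


HOUR_MS = 3_600_000  # one hour in milliseconds

# Illustrative values: 8 h baseline, 30 min debt, 15 min strain, 20 min nap credit.
need = sleep_need(
    habitual_sleep_need=8 * HOUR_MS,
    debt_pre=HOUR_MS // 2,
    need_from_strain=HOUR_MS // 4,
    credit_from_naps=HOUR_MS // 3,
)
```

With these inputs the result is 30,300,000 ms, which is 8 hours and 25 minutes.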

Sleep Score

  • score: sleep score

How Whoop calculates it: $score = \frac{quality\_duration}{sleep\_need}$
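Both ratios in this section, in_sleep_efficiency and score, can be recomputed from the duration fields. The sketch below returns plain fractions; whether the API reports these values on a 0 to 1 or a 0 to 100 scale is not stated here, so treat the scale as an assumption. The sample durations are made up.

```python
def ratio(numerator, denominator):
    """Return numerator / denominator, or None if the denominator is zero."""
    if not denominator:
        return None
    return numerator / denominator


def in_sleep_efficiency(quality_duration, time_in_bed):
    """Share of the time in bed that was spent asleep."""
    return ratio(quality_duration, time_in_bed)


def sleep_score(quality_duration, sleep_need):
    """Share of the needed sleep that was actually achieved."""
    return ratio(quality_duration, sleep_need)


HOUR_MS = 3_600_000  # one hour in milliseconds

# Illustrative values: 7 h asleep, 8 h in bed, 8.75 h needed.
efficiency = in_sleep_efficiency(7 * HOUR_MS, 8 * HOUR_MS)
score = sleep_score(7 * HOUR_MS, 8.75 * HOUR_MS)
```

Returning None for a zero denominator avoids a ZeroDivisionError on nights where no data was recorded.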

Sleep stages

[Figure: Whoop Sleep Stages]

Workout

Workouts are included directly in the response from the Cycles endpoint. To query a single workout, there is the Workout endpoint:

/activities-service/v1/workouts/{workoutId}

But here we don’t need it as we already have the data from the cycles.

time

  • during: contains the start and end time (UTC) of your workout. You also need it to query your heart rate during the workout from the Heart Rate endpoint.
  • timezone_offset: used to convert to local time

spatial

If you have GPS enabled with the Strain coach, you will get the following spatial information about your workouts. You will have to carry a GPS enabled phone. I don’t use this feature, so this is empty in my data.

  • gps_enabled
  • altitude_change
  • altitude_gain
  • distance

performance

  • average_heart_rate: the average heart rate during the workout
  • max_heart_rate: the maximum heart rate measured in the workout
  • zone_durations: how much time you spent in the different heart rate zones during a workout
  • kilojoules: energy burned during workout

score

  • raw_intensity_score: the raw intensity (similar to TSS)
  • intensity_score: the strain score from this workout
  • cumulative_workout_intensity: the cumulative strain score for the day from several workouts

survey

  • responded: whether the user filled in the survey
  • survey_response_id: ID to use for fetching the survey (unlike for sleep, it is available for workouts)
  • rpe: Rating of Perceived Exertion

Workout Survey Response

Fortunately, the surveys for workouts can be fetched.

GET /activities-service/v0/workouts/{workoutId}/survey/response

The response looks like this:

[
  {
    "questionId": 4,
    "label": "Perceived Exertion",
    "answer": 10
  },
  {
    "questionId": 5,
    "label": "Performance Level",
    "answer": 3
  },
  {
    "questionId": 9,
    "label": "Incomplete Workout",
    "answer": null
  },
  {
    "questionId": 10,
    "label": "Tired",
    "answer": null
  },
  {
    "questionId": 11,
    "label": "Injured",
    "answer": null
  }
]
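To work with such a response, it helps to reduce it to the questions that were actually answered. A minimal sketch using a trimmed copy of the response above; `answered` is a hypothetical helper.

```python
import json

# Trimmed copy of the survey response shown above.
raw = """
[
  {"questionId": 4, "label": "Perceived Exertion", "answer": 10},
  {"questionId": 5, "label": "Performance Level", "answer": 3},
  {"questionId": 9, "label": "Incomplete Workout", "answer": null}
]
"""


def answered(survey):
    """Map label to answer, skipping questions without an answer."""
    return {
        item["label"]: item["answer"]
        for item in survey
        if item["answer"] is not None
    }


answers = answered(json.loads(raw))
```

With a requests response object, `response.json()` takes the place of `json.loads(raw)`.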

Heart Rate

A very nice thing about Whoop is that you can get pretty accurate heart rate measurements 24/7. Another very nice thing is that you can query it using the Heart Rate endpoint:

GET /users/{user_id}/metrics/heart_rate
Parameters
  • start: 2021-02-12T11:00:16.528Z
  • end: 2021-02-12T12:18:54.599Z
  • step: 6

start and end define the time period. This could be a workout, for example. The step parameter defines the sampling interval. Possible values are 6, 60 and 600 (one HR value every 6, 60 or 600 seconds).

This code snippet loads the data into a pandas time series DataFrame.

response = s.get(
    url=f'{api_url}/users/{user_id}/metrics/heart_rate',
    params={
        "start": "2021-02-12T11:00:16.528Z",
        "end": "2021-02-12T12:18:54.599Z",
        "step": "6",  # sampling interval in seconds: 6, 60 or 600
    }
)
hr = pd.DataFrame.from_dict(response.json()['values'])
hr['time'] = pd.to_datetime(hr['time'], unit='ms', utc=True)
hr = hr.rename(columns={'data': 'bpm'})
hr = hr.set_index('time')
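If you only need a quick summary and no DataFrame, the same response can be processed with the standard library. This sketch relies on the response shape used above (a `values` list whose items have `time` in milliseconds and `data` in bpm); `summarize_heart_rate` is a hypothetical helper and the sample readings are made up.

```python
from datetime import datetime, timezone

# Trimmed sample in the response shape used above; the bpm values are made up.
payload = {
    "values": [
        {"time": 1613127616528, "data": 92},
        {"time": 1613127622528, "data": 96},
        {"time": 1613127628528, "data": 101},
    ]
}


def to_utc(milliseconds):
    """Convert a millisecond epoch timestamp to an aware UTC datetime."""
    return datetime.fromtimestamp(milliseconds / 1000, tz=timezone.utc)


def summarize_heart_rate(values):
    """Return start, end, average and maximum bpm, or None for no data."""
    if not values:
        return None
    bpm = [value["data"] for value in values]
    return {
        "start": to_utc(values[0]["time"]),
        "end": to_utc(values[-1]["time"]),
        "avg_bpm": sum(bpm) / len(bpm),
        "max_bpm": max(bpm),
    }


summary = summarize_heart_rate(payload["values"])
```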

Conclusion

Most of your data can be fetched using the Whoop API. I showed you what you get and what you don’t get. Unfortunately, some important things are missing, for example journal surveys and intraday HRV. Whoop does not offer an official API or an export feature.

Another way to get all your personal data is to request it manually from the Whoop support team. This is what I will do next. I’ll keep you posted about the process and what to expect from the data in another post.


© 2022, Marcus Lehmann