Taking Charge with Fitbit

1 Fitbit Time-Series Data
- 1.1 Minute-Level Data
- 1.2 Daily-Level Data
2 How does getting fitter reflect in the data?
3 Useful Resources

*insert introductory section here –> What is this? What motivated me?*
This document grew out of a combination of events in my life and a renewed interest in something that has fascinated me for a long time.

Something about my interest in data:
- Gathering data on sports, physical state, games
- Bachelor Thesis about quantified self

I’m fascinated by all the cool things you can do with data, but I never really took the time to do something like this myself.
*make sure to come up with better titles for everything*

1 Fitbit Time-Series Data

*describe how the data were collected, stored, and what they look like*
All the data that I’m using can be retrieved by calling the Fitbit API through the getActivitiesResourceByDatePeriod, getHeartByDateIntraday, and getWeightByDate methods. The functions I wrote to perform these GET requests can be found in this script.
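
As a rough illustration of what such a GET request can look like, here is a minimal sketch for the intraday heart-rate endpoint. The helper names `intraday_hr_url` and `get_intraday_hr` are hypothetical and are not taken from the script mentioned above; the URL pattern and the accepted detail levels are my assumptions based on the public Fitbit Web API, and the access token is assumed to have been obtained separately through OAuth 2.0.

```r
# Sketch only: helper names are hypothetical, not the author's functions.
# The fetch function needs the httr (and jsonlite) packages when it is called.

# Build the URL for one day of intraday heart-rate data.
# `detail` is the sampling granularity; "1sec" and "1min" are assumed valid.
intraday_hr_url <- function(date, detail = "1min") {
  stopifnot(detail %in% c("1sec", "1min"))
  sprintf(
    "https://api.fitbit.com/1/user/-/activities/heart/date/%s/1d/%s.json",
    format(as.Date(date), "%Y-%m-%d"),
    detail
  )
}

# Send the GET request with a bearer token and return the parsed JSON body.
get_intraday_hr <- function(date, token, detail = "1min") {
  response <- httr::GET(
    intraday_hr_url(date, detail),
    httr::add_headers(Authorization = paste("Bearer", token))
  )
  httr::stop_for_status(response)  # raise an R error on HTTP 4xx/5xx
  httr::content(response, as = "parsed", type = "application/json")
}

# Building the URL does not touch the network
intraday_hr_url("2019-01-01")
```

Keeping the URL construction separate from the request makes it possible to check the request target without a valid token.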

I decided to structure the data in a tidy format. The code to collect new data from the Fitbit servers and tidy it can be found in this script. First I split the data into minute-level and daily-level time series. Then I created separate tibbles per type of value:

Minute-level
- Numerical: calories, distance, elevation, floors, heartrate, steps
- Ordinal: activity intensity
Daily-level
- Numerical: weight, BMI, activity calories, basal metabolic rate (BMR) calories, resting heart rate [backlog: minutes and calories out per heart-rate zone, in a separate table]

str(fitbit_data)

## List of 3
##  $ dl_num:Classes 'tbl_df', 'tbl' and 'data.frame':  611 obs. of  5 variables:
##   ..$ date : Date[1:611], format: "2019-01-01" ...
##   ..$ type : Factor w/ 5 levels "bmi","weight",..: 3 4 5 3 4 5 3 4 5 3 ...
##   ..$ value: num [1:611] 1885 708 59 1885 1389 ...
##   ..$ day  : Ord.factor w/ 7 levels "Monday"<"Tuesday"<..: 2 2 2 3 3 3 4 4 4 5 ...
##   ..$ week : Ord.factor w/ 21 levels "1"<"2"<"3"<"4"<..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ ml_ord:Classes 'tbl_df', 'tbl' and 'data.frame':  202980 obs. of  5 variables:
##   ..$ datetime: POSIXct[1:202980], format: "2019-01-01 00:00:00" ...
##   ..$ type    : Factor w/ 1 level "intensity": 1 1 1 1 1 1 1 1 1 1 ...
##   ..$ value   : Ord.factor w/ 4 levels "sedentary"<"light"<..: 1 2 1 1 1 1 1 2 2 2 ...
##   ..$ day     : Ord.factor w/ 7 levels "Monday"<"Tuesday"<..: 2 2 2 2 2 2 2 2 2 2 ...
##   ..$ week    : Ord.factor w/ 21 levels "1"<"2"<"3"<"4"<..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ ml_num:Classes 'tbl_df', 'tbl' and 'data.frame':  1193695 obs. of  5 variables:
##   ..$ datetime: POSIXct[1:1193695], format: "2019-01-01 00:00:00" ...
##   ..$ type    : Factor w/ 6 levels "calories","distance",..: 1 2 3 4 5 6 1 2 3 4 ...
##   ..$ value   : num [1:1193695] 1.7 0 0 0 80 ...
##   ..$ day     : Ord.factor w/ 7 levels "Monday"<"Tuesday"<..: 2 2 2 2 2 2 2 2 2 2 ...
##   ..$ week    : Ord.factor w/ 21 levels "1"<"2"<"3"<"4"<..: 1 1 1 1 1 1 1 1 1 1 ...
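
The long layout in the `str()` output above (one row per timestamp and measurement type, plus `day` and `week` columns) could be produced along the following lines. This is a sketch on made-up values, not the tidying script itself. The wide input table and the use of ISO week numbers are assumptions; the original script may number weeks differently.

```r
library(dplyr)
library(tidyr)

# Toy wide data: one row per minute, one column per measurement (values made up)
wide <- tibble(
  datetime  = as.POSIXct("2019-01-01 00:00:00", tz = "Europe/Amsterdam") + 60 * 0:2,
  calories  = c(1.7, 1.7, 1.9),
  steps     = c(0, 0, 12),
  heartrate = c(80, 79, 85)
)

weekday_levels <- c("Monday", "Tuesday", "Wednesday", "Thursday",
                    "Friday", "Saturday", "Sunday")

ml_num <- wide %>%
  # One row per (datetime, type) pair instead of one column per type
  pivot_longer(-datetime, names_to = "type", values_to = "value") %>%
  mutate(
    type = factor(type),
    # %u is the ISO weekday number (1 = Monday); avoids locale-dependent names
    day  = factor(weekday_levels[as.integer(format(datetime, "%u"))],
                  levels = weekday_levels, ordered = TRUE),
    # Assumption: ISO week of the year
    week = factor(as.integer(format(datetime, "%V")), ordered = TRUE)
  )

str(ml_num)
```

With this layout, adding a new measurement type adds rows rather than columns, so the same `group_by(type)` summaries work unchanged.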

1.1 Minute-Level Data

1.1.1 Data Validation

*Show summary statistics for all numerical variables and table for ordinal. Show histogram and density plot for heartrate. Other plots are so heavily skewed that they are not very informative other than that they show that the values are 0 most of the time.*

fitbit_data$ml_num %>%
  group_by(type) %>%
  summarise(
    minimum = min(value),
    pctl_25 = quantile(value, probs = 0.25, names = FALSE),
    median  = median(value),
    pctl_75 = quantile(value, probs = 0.75, names = FALSE),
    maximum = max(value),
    mean    = mean(value),
    st_dev  = sd(value)
  ) %>%
  knitr::kable(
    digits = 2,
    caption = "",
    col.names = c("", "*P~0~*", "*P~25~*", "*P~50~*", "*P~75~*", "*P~100~*",
                  "*$\\mu$*", "*$\\sigma$~X~*")
  )

	P₀	P₂₅	P₅₀	P₇₅	P₁₀₀	μ	σ_X
calories	1.22	1.26	1.36	1.66	17.94	2.35	2.35
distance	0.00	0.00	0.00	0.00	249.47	9.22	26.72
elevation	0.00	0.00	0.00	0.00	21.34	0.07	0.61
floors	0.00	0.00	0.00	0.00	7.00	0.02	0.20
heartrate	35.00	50.00	58.00	69.00	182.00	62.69	19.94
steps	0.00	0.00	0.00	0.00	185.00	11.31	30.20
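
The note at the start of this subsection also calls for a table of the ordinal variable. A possible sketch, using a toy stand-in for `fitbit_data$ml_ord`, is shown here. Only the level names "sedentary" and "light" appear in the `str()` output; the names "moderate" and "very" are placeholders I assumed for the two remaining levels.

```r
library(dplyr)

# Toy stand-in for fitbit_data$ml_ord; the two highest level names are assumed
intensity_levels <- c("sedentary", "light", "moderate", "very")
ml_ord <- tibble(
  value = factor(
    c("sedentary", "light", "sedentary", "sedentary", "very", "light"),
    levels = intensity_levels, ordered = TRUE
  )
)

intensity_table <- ml_ord %>%
  count(value, .drop = FALSE) %>%  # keep levels that never occur
  mutate(share = n / sum(n))       # proportion of minutes per intensity level

intensity_table
```

Using `.drop = FALSE` keeps intensity levels with zero minutes in the table, which matters for validation because a missing level would otherwise go unnoticed.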

*Some text here.*

# Lowest and highest observed heart rate, used as the outer axis breaks below
min_hr <- fitbit_data$ml_num %>%
  filter(type == "heartrate") %>%
  pull(value) %>%
  min()

max_hr <- fitbit_data$ml_num %>%
  filter(type == "heartrate") %>%
  pull(value) %>%
  max()

fitbit_data$ml_num %>%
  filter(type == "heartrate") %>%
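  # after_stat() needs ggplot2 >= 3.3.0 and replaces the deprecated stat();
  # the fill mapping is dropped because both geoms set a constant fill
  ggplot(aes(x = value, y = after_stat(density))) +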
  geom_histogram(binwidth = 1, color = "black", fill = "#2A211C") +
  geom_density(alpha = 0.8, color = "black", fill = "#9F2042") +
  theme_classic(base_size = 12, base_line_size = 1) +
  theme(axis.title.x = element_text(color = "#000000", face = "italic", margin = margin(7.5, 0, 0, 0)),
        axis.title.y = element_text(color = "#000000", face = "italic", margin = margin(0, 7.5, 0, 0)),
        axis.text.x = element_text(color = "#000000", face = "bold", size = 9),
        axis.text.y = element_text(color = "#000000", face = "bold", size = 9),
        panel.background = element_rect(fill = "#FCFCFC"),
        plot.background = element_rect(color = "#000000", fill = "#FCFCFC")) +
  labs(x = "Heart rate", y = "Density") +
  scale_x_continuous(breaks = c(min_hr, 50, 75, 100, 125, 150, 175, max_hr))

[Figure: histogram and density plot of minute-level heart rate]

*Some text here.*

1.2 Daily-Level Data

# act_tidy %>% 
#   filter(!is.na(datetime)) %>%
#   group_by(as.Date(datetime, tz = "Europe/Amsterdam"), type) %>%
#   summarise(
#     daily_total = sum(value)
#   ) %>%
#   group_by(type) %>%
#   summarise(
#     total = sum(daily_total),
#     minimum = min(daily_total),
#     mean = mean(daily_total),
#     median = median(daily_total),
#     maximum = max(daily_total)
#   ) %>% 
#   knitr::kable(digits = 2, caption = "Summary of activity data")

# act_tidy %>% 
#   filter(!is.na(datetime)) %>%
#   group_by(week, type) %>%
#   summarise(
#     average_day = sum(value) / n_distinct(day)
#   ) %>%
#   ggplot(aes(x = week, y = average_day)) +
#   geom_point() +
#   geom_line() +
#   facet_wrap(~ type, nrow = 2, ncol = 3, scales = "free_y")
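
The first commented-out draft above aggregates minute-level values into daily totals. A runnable sketch of that step on made-up data follows; the table `act_toy` is a hypothetical stand-in for `act_tidy`, and the time zone is the one used in the draft.

```r
library(dplyr)

tz <- "Europe/Amsterdam"

# Toy minute-level step counts spanning local midnight (values made up)
act_toy <- tibble(
  datetime = as.POSIXct(
    c("2019-01-01 23:58:00", "2019-01-01 23:59:00",
      "2019-01-02 00:00:00", "2019-01-02 00:01:00"),
    tz = tz
  ),
  type  = "steps",
  value = c(10, 20, 30, 40)
)

daily_totals <- act_toy %>%
  filter(!is.na(datetime)) %>%
  # Pass tz explicitly so each minute is assigned to its local calendar day
  mutate(date = as.Date(datetime, tz = tz)) %>%
  group_by(date, type) %>%
  summarise(daily_total = sum(value), .groups = "drop")

daily_totals
```

Passing `tz` explicitly matters here: depending on the R version, `as.Date()` on a POSIXct value may default to UTC, in which case the minutes just after local midnight would be counted towards the previous day.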

2 How does getting fitter reflect in the data?

*obviously do a solid attempt to answer this question here*

2.1 Rest Heart Rate Over Time

# hr_summary %>%
#   ggplot(aes(x = date, y = rest_hr)) +
#   geom_point() +
#   geom_smooth(se = FALSE, span = 0.275) +
#   labs(title = "Rest Heart Rate over Time",
#        subtitle = "*this is how you make a subtitle, Luc*")

# hr_intraday %>%  
#   group_by(week) %>%
#   summarise(
#     minimum = min(hr),
#     mean = mean(hr),
#     median = median(hr),
#     maximum = max(hr)
#   ) %>% 
#   gather(key = "type", value = "value", -week) %>%
#   ggplot(aes(x = week, y = value)) +
#   geom_point() +
#   geom_line() +
#   facet_wrap(~ type, nrow = 2, ncol = 2, scales = "free_y")

2.2 Weekly heart-rate distribution

*show ggridges weekly heart rate distributions here*

# hr_intraday %>%
#   mutate(week = fct_rev(as.factor(week))) %>%
#   ggplot(aes(x= hr, y = week)) +
#   geom_density_ridges(scale = 2.5, alpha = 0.7) +
#   xlim(30, 194) +
#   theme_ridges()

2.3 Link steps/minute to bpm

*The idea here is that, at a fixed number of steps per minute, heart rate should be lower when I am fitter than when I am less fit.*
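
One way this comparison could be set up is sketched here. The data are synthetic and generated only for illustration; they are not Fitbit measurements. The linear model of heart rate on steps per minute, the split into an "early" and a "late" period, and the reference cadence of 100 steps/min are all my assumptions, not a method taken from this document.

```r
# Synthetic data for illustration only; not Fitbit measurements
set.seed(1)
n <- 200
toy <- data.frame(
  period = rep(c("early", "late"), each = n),
  steps  = rep(sample(0:120, n, replace = TRUE), times = 2)
)

# Assumed relationship: same slope, lower baseline in the later (fitter) period
baseline <- ifelse(toy$period == "early", 70, 62)
toy$hr <- baseline + 0.5 * toy$steps + rnorm(2 * n, sd = 5)

# Fit heart rate as a linear function of steps/min, separately per period
fits <- lapply(split(toy, toy$period), function(d) lm(hr ~ steps, data = d))

# Predicted heart rate at a fixed cadence of 100 steps/min in each period
hr_at_100 <- vapply(
  fits,
  function(m) unname(predict(m, newdata = data.frame(steps = 100))),
  numeric(1)
)

hr_at_100
```

If fitness improved between the two periods, the predicted heart rate at the same cadence would be lower in the later period. With real data, the minute-level `steps` and `heartrate` rows would first have to be joined on `datetime`.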

3 Useful Resources