This is a full reproduction of the results for the derivation cohort (MIMICdb), using MIMIC-IV v2.2 and MIMIC Code v2.4.0.

Citation

Hasidim, A.A., Klein, M.A., Ben Shitrit, I. et al. Toward the standardization of big datasets of urine output for AKI analysis: a multicenter validation study. Sci Rep 15, 20009 (2025). https://doi.org/10.1038/s41598-025-95535-4

Population and Study Sample

Total ICU stays in MIMIC-IV:

dbGetQuery(con, "SELECT count(*) FROM `physionet-data.mimiciv_icu.icustays`")

## # A tibble: 1 × 1
##     f0_
##   <int>
## 1 73181

ICU stays with UO records (eligible):

n_distinct(raw_uo$STAY_ID)

## [1] 70364

Hospital admissions with UO records:

n_distinct(raw_uo$HADM_ID)

## [1] 64110

Patients with UO records:

n_distinct(raw_uo$SUBJECT_ID)

## [1] 49950

Count all UO records (before exclusions):

all_rows_count <- nrow(raw_uo)
all_rows_count

## [1] 3335985

Figure for Raw UO Records Data

Frequency of urine output charting by source

S3a <- raw_uo %>% 
  mutate(
    LABEL = case_when(
      LABEL == "GU Irrigant/Urine Volume Out" ~ "GU Irrig. Out",
      LABEL == "GU Irrigant Volume In" ~ "GU Irrig. In",
      .default = LABEL
    )
  ) %>% count(LABEL, sort = TRUE) %>%
   ggplot(aes(x=reorder(LABEL, -n), y=n)) +
   geom_bar(stat="identity", fill="steelblue") +
   xlab("") +
   ylab("") +
   geom_text(aes(label=n), vjust=-0.6, color="black", size=3) +
   theme_classic() +
   theme(axis.text.y=element_blank()) +
   theme(axis.text.x = element_text(angle = 45, hjust = 1))

S3a

Temporal Trends Analysis

The MIMIC-IV dataset contains data collected between 2005 and 2022. Each patient in the dataset has an “anchor year” that is grouped into three-year periods.

Frequency of urine output charting by year:

For each hospital admission, we adjust the beginning of the anchor year based on the years elapsed since the patient’s first hospital admission.
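The adjustment described above can be sketched roughly as follows. This is a minimal illustration on invented rows, not the authors' query: the column names `ADMIT_YEAR` and `ANCHOR_YEAR_GROUP`, and the rule "group start plus whole years elapsed since the first admission", are assumptions made for the example.

```r
# Toy admissions table; the rows are invented and the column names are
# assumptions chosen to resemble MIMIC-IV.
admissions <- data.frame(
  SUBJECT_ID = c(1, 1, 2),
  HADM_ID = c(101, 102, 201),
  ADMIT_YEAR = c(2150, 2153, 2120),  # de-identified (shifted) admission years
  ANCHOR_YEAR_GROUP = c("2008 - 2010", "2008 - 2010", "2014 - 2016"),
  stringsAsFactors = FALSE
)

# First calendar year of each patient's three-year anchor group.
group_start <- as.integer(substr(admissions$ANCHOR_YEAR_GROUP, 1, 4))

# Year of each patient's first admission, repeated on every row of that patient.
first_year <- ave(admissions$ADMIT_YEAR, admissions$SUBJECT_ID, FUN = min)

# Shift the group start by the years elapsed since the first admission.
admissions$ANCHOR_START <- group_start + (admissions$ADMIT_YEAR - first_year)

admissions[, c("HADM_ID", "ANCHOR_START")]
```

In this toy example the second admission of patient 1 occurs three years after the first, so its anchor start moves from 2008 to 2011.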

S4_b <- raw_uo %>%
  left_join(anchor_year, by = "HADM_ID") %>%
  count(ANCHOR_START, sort = TRUE) %>%
   ggplot(aes(x=as_factor(ANCHOR_START), y=n)) +
   geom_bar(stat="identity", fill="steelblue") +
   xlab("Anchor year starts") +
   ylab("") +
   geom_text(aes(label=n), vjust=-0.6, color="black", size=3) +
   theme_classic() +
   theme(axis.text.y=element_blank()) +
   theme(axis.text.x = element_text(angle = 45, hjust = 1))

S4_b

Regrouping:

anchor_year_regrouped <- anchor_year %>%
  mutate(ANCHOR_GROUP = case_when(ANCHOR_START < 2011 ~ "2008-2010",
                                       ANCHOR_START >= 2011 & ANCHOR_START < 2014 ~ "2011-2013",
                                       ANCHOR_START >= 2014 & ANCHOR_START < 2017 ~ "2014-2016",
                                       .default = "2017+"))

Frequency of urine output charting by grouped anchor years:

positions <- c("2008-2010", "2011-2013", "2014-2016", "2017+")

S4_c <- raw_uo %>%
  left_join(anchor_year_regrouped, by = "HADM_ID") %>%
  count(ANCHOR_GROUP, sort = TRUE) %>%
   ggplot(aes(x=as_factor(ANCHOR_GROUP), y=n)) +
   scale_x_discrete(limits = positions) +
   geom_bar(stat="identity", fill="steelblue") +
   xlab("Grouped anchor years") +
   ylab("") +
   geom_text(aes(label=n), vjust=-0.6, color="black", size=3) +
   theme_classic() +
   theme(axis.text.y=element_blank()) +
   theme(axis.text.x = element_text(angle = 45, hjust = 1))

S4_c

Proportion of urine output charting by source within each anchor-year group:

positions <- c("2008-2010", "2011-2013", "2014-2016", "2017+")

S4_d <- raw_uo %>%
  left_join(anchor_year_regrouped, by = "HADM_ID") %>%
  mutate(
    LABEL = case_when(
      LABEL == "GU Irrigant/Urine Volume Out" ~ "GU Irrig. Out",
      LABEL == "GU Irrigant Volume In" ~ "GU Irrig. In",
      .default = LABEL
    )
  ) %>%
  group_by(ANCHOR_GROUP) %>%
  count(LABEL, sort = TRUE) %>%
  ggplot(aes(
    fill = LABEL,
    y = n,
    x = as_factor(ANCHOR_GROUP)
  )) +
  scale_x_discrete(limits = positions) +
  geom_bar(position = "fill", stat = "identity") +
  xlab("Grouped anchor years") +
  ylab("Proportions") +
  scale_fill_discrete(name = "Source") +
  theme_classic() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

S4_d

Mean urine output volume by source:

S4_e <- uo_rate_including_null_collection_period %>%
  left_join(anchor_year_regrouped, by = "HADM_ID") %>%
  group_by(ANCHOR_GROUP, SOURCE) %>%
  summarise(Mean = mean(VALUE, na.rm = TRUE)) %>%
  ggplot(aes(
    fill = as_factor(ANCHOR_GROUP),
    y = Mean,
    x = SOURCE
  )) +
  geom_bar(position = "dodge", stat = "identity") +
  xlab("Source") +
  ylab("mL") +
  scale_fill_discrete(name = "Grouped anchor years") +
  theme_classic() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

S4_e

Mean collection time by source:

S4_f <- uo_rate %>%
  left_join(anchor_year_regrouped, by = "HADM_ID") %>%
  group_by(ANCHOR_GROUP, SOURCE) %>%
  summarise(Mean = mean(TIME_INTERVAL, na.rm=TRUE)) %>%
  ggplot(aes(fill=as_factor(ANCHOR_GROUP), y=Mean / 60, x=SOURCE)) + 
    geom_bar(position="dodge", stat="identity") +
  xlab("Source") +
  ylab("Hr") +
  scale_fill_discrete(name = "Grouped anchor years") +
  theme_classic() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

S4_f

Check for Duplications

Distinctiveness

First, we check whether the rows in the raw UO data are distinct.

Count distinct raw rows:

distinct_time_item_patient_rows_count <- raw_uo %>% 
  select(-VALUE, 
         -SERVICE, 
         -LABEL) %>% 
  n_distinct()

distinct_time_item_patient_rows_count

## [1] 3335985

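Conclusion: the original raw query does not contain duplicates. The distinct count equals the total row count (3,335,985) even after dropping VALUE, SERVICE, and LABEL, so no two records share the same remaining identifier, time, and item columns.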
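The same kind of check can be illustrated on a small invented table. This is a sketch only; the key columns `STAY_ID`, `CHARTTIME`, and `ITEMID` and the item IDs are assumptions for the example, and base R is used so that it runs without extra packages.

```r
# Invented records: two sources charted at the same time for stay 1.
raw <- data.frame(
  STAY_ID = c(1, 1, 2),
  CHARTTIME = c("2150-01-01 08:00", "2150-01-01 08:00", "2150-01-01 09:00"),
  ITEMID = c(1001, 1002, 1001),  # illustrative item IDs, not real codes
  VALUE = c(100, 50, 100),
  stringsAsFactors = FALSE
)

# Columns that should jointly identify one record.
keys <- c("STAY_ID", "CHARTTIME", "ITEMID")

# Number of distinct key combinations versus number of rows.
n_distinct_keys <- nrow(unique(raw[keys]))
no_duplicates <- n_distinct_keys == nrow(raw)

no_duplicates
```

Rows 1 and 2 share a stay and a time but differ in item, so they count as simultaneous charting rather than as duplicates.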

Simultaneous Charting

raw_uo_excluions_duplicates$same_value <- as.factor(raw_uo_excluions_duplicates$same_value)
raw_uo_excluions_duplicates$label <- as.factor(raw_uo_excluions_duplicates$label)
raw_uo_excluions_duplicates$label <- factor(raw_uo_excluions_duplicates$label, levels = as.factor(names(sort(table(raw_uo_excluions_duplicates$label),
                                  decreasing = TRUE))))

S4_a <- raw_uo_excluions_duplicates %>%
  select(same_value, label) %>%
  tbl_summary(by = same_value)

S4_a

Characteristic	Different volume N = 12,051¹	Equal volume N = 518¹
label
GU Irrigant Volume In,GU Irrigant/Urine Volume Out	4,189 (35%)	8 (1.5%)
Foley,L Nephrostomy	1,127 (9.4%)	69 (13%)
Foley,GU Irrigant Volume In,GU Irrigant/Urine Volume Out	1,158 (9.6%)	3 (0.6%)
Foley,R Nephrostomy	1,077 (8.9%)	60 (12%)
R Nephrostomy,L Nephrostomy	988 (8.2%)	125 (24%)
Foley,Void	703 (5.8%)	61 (12%)
Foley,Suprapubic	557 (4.6%)	55 (11%)
R Nephrostomy,Ileoconduit	346 (2.9%)	24 (4.6%)
Foley,R Nephrostomy,L Nephrostomy	322 (2.7%)	4 (0.8%)
Foley,GU Irrigant Volume In	186 (1.5%)	1 (0.2%)
Void,Straight Cath	161 (1.3%)	4 (0.8%)
Suprapubic,R Nephrostomy	151 (1.3%)	11 (2.1%)
Foley,Condom Cath	113 (0.9%)	18 (3.5%)
R Ureteral Stent,Foley	122 (1.0%)	8 (1.5%)
Suprapubic,L Nephrostomy	120 (1.0%)	9 (1.7%)
Void,Condom Cath	78 (0.6%)	14 (2.7%)
L Nephrostomy,Ileoconduit	90 (0.7%)	1 (0.2%)
R Ureteral Stent,L Ureteral Stent	51 (0.4%)	7 (1.4%)
Void,L Nephrostomy	48 (0.4%)	7 (1.4%)
Condom Cath,Straight Cath	47 (0.4%)	2 (0.4%)
Foley,Ileoconduit	46 (0.4%)	2 (0.4%)
R Ureteral Stent,L Nephrostomy	45 (0.4%)	2 (0.4%)
Condom Cath,R Nephrostomy	29 (0.2%)	3 (0.6%)
R Nephrostomy,L Nephrostomy,Ileoconduit	29 (0.2%)	0 (0%)
Void,R Nephrostomy	23 (0.2%)	3 (0.6%)
Condom Cath,Suprapubic	21 (0.2%)	2 (0.4%)
Foley,Straight Cath	17 (0.1%)	4 (0.8%)
Foley,GU Irrigant/Urine Volume Out	19 (0.2%)	0 (0%)
R Ureteral Stent,L Ureteral Stent,L Nephrostomy	19 (0.2%)	0 (0%)
R Ureteral Stent,R Nephrostomy,Ileoconduit	16 (0.1%)	0 (0%)
L Ureteral Stent,Foley	14 (0.1%)	1 (0.2%)
R Ureteral Stent,L Ureteral Stent,Foley	14 (0.1%)	0 (0%)
R Ureteral Stent,R Nephrostomy	9 (<0.1%)	4 (0.8%)
Suprapubic,R Nephrostomy,L Nephrostomy	12 (<0.1%)	1 (0.2%)
Condom Cath,R Nephrostomy,L Nephrostomy	11 (<0.1%)	0 (0%)
Condom Cath,L Nephrostomy	8 (<0.1%)	2 (0.4%)
Foley,Suprapubic,R Nephrostomy	10 (<0.1%)	0 (0%)
Foley,L Nephrostomy,GU Irrigant Volume In	7 (<0.1%)	0 (0%)
Foley,L Nephrostomy,GU Irrigant Volume In,GU Irrigant/Urine Volume Out	6 (<0.1%)	0 (0%)
R Nephrostomy,L Nephrostomy,GU Irrigant Volume In,GU Irrigant/Urine Volume Out	6 (<0.1%)	0 (0%)
L Nephrostomy,GU Irrigant Volume In,GU Irrigant/Urine Volume Out	5 (<0.1%)	0 (0%)
L Ureteral Stent,R Nephrostomy,L Nephrostomy	5 (<0.1%)	0 (0%)
Void,R Nephrostomy,L Nephrostomy	5 (<0.1%)	0 (0%)
R Ureteral Stent,Ileoconduit	4 (<0.1%)	0 (0%)
Void,Suprapubic	4 (<0.1%)	0 (0%)
Foley,L Nephrostomy,Ileoconduit	3 (<0.1%)	0 (0%)
Foley,Void,Condom Cath	1 (<0.1%)	1 (0.2%)
Ileoconduit,GU Irrigant Volume In,GU Irrigant/Urine Volume Out	2 (<0.1%)	0 (0%)
R Nephrostomy,L Nephrostomy,GU Irrigant Volume In	2 (<0.1%)	0 (0%)
R Ureteral Stent,Foley,R Nephrostomy	2 (<0.1%)	0 (0%)
R Ureteral Stent,Void	2 (<0.1%)	0 (0%)
Suprapubic,GU Irrigant Volume In,GU Irrigant/Urine Volume Out	2 (<0.1%)	0 (0%)
Suprapubic,Ileoconduit	1 (<0.1%)	1 (0.2%)
Void,Ileoconduit	2 (<0.1%)	0 (0%)
Foley,R Nephrostomy,GU Irrigant Volume In	1 (<0.1%)	0 (0%)
Foley,R Nephrostomy,Ileoconduit	1 (<0.1%)	0 (0%)
L Nephrostomy,Straight Cath	1 (<0.1%)	0 (0%)
L Ureteral Stent,Foley,L Nephrostomy	1 (<0.1%)	0 (0%)
L Ureteral Stent,L Nephrostomy	1 (<0.1%)	0 (0%)
L Ureteral Stent,Suprapubic	1 (<0.1%)	0 (0%)
R Nephrostomy,GU Irrigant Volume In,GU Irrigant/Urine Volume Out	1 (<0.1%)	0 (0%)
R Nephrostomy,GU Irrigant/Urine Volume Out	1 (<0.1%)	0 (0%)
R Nephrostomy,L Nephrostomy,GU Irrigant/Urine Volume Out	1 (<0.1%)	0 (0%)
R Nephrostomy,L Nephrostomy,Straight Cath	1 (<0.1%)	0 (0%)
R Ureteral Stent,L Ureteral Stent,Foley,Suprapubic	1 (<0.1%)	0 (0%)
R Ureteral Stent,Straight Cath	1 (<0.1%)	0 (0%)
Suprapubic,GU Irrigant Volume In	1 (<0.1%)	0 (0%)
Suprapubic,GU Irrigant/Urine Volume Out	0 (0%)	1 (0.2%)
Void,Condom Cath,Straight Cath	1 (<0.1%)	0 (0%)
Void,GU Irrigant Volume In	1 (<0.1%)	0 (0%)
Void,GU Irrigant Volume In,GU Irrigant/Urine Volume Out	1 (<0.1%)	0 (0%)
¹ n (%)

Full SQL query:

WITH aa AS (
        SELECT
            STAY_ID,
            CHARTTIME
        FROM
            `mimic_uo_and_aki.a_urine_output_raw` uo
        GROUP BY
            STAY_ID,
            CHARTTIME
    ),
    ab AS (
        SELECT
            a.*,
            b.ITEMID,
            b.VALUE,
            c.label
        FROM
            aa a
            LEFT JOIN `mimic_uo_and_aki.a_urine_output_raw` b ON b.STAY_ID = a.STAY_ID
            AND b.CHARTTIME = a.CHARTTIME
            LEFT JOIN `physionet-data.mimiciv_icu.d_items` c ON c.itemid = b.itemid
        ORDER BY
            STAY_ID,
            CHARTTIME,
            ITEMID
    ),
    ac AS (
        SELECT
            STAY_ID,
            CHARTTIME,
            STRING_AGG(label) label,
            COUNT(STAY_ID) COUNT,
            IF(
                MIN(VALUE) = MAX(VALUE),
                "Equal volume",
                "Different volume"
            ) AS same_value
        FROM
            ab
        GROUP BY
            STAY_ID,
            CHARTTIME
    ),
    ad AS (
        SELECT
            COUNT(CHARTTIME) COUNT,
            label,
            same_value
        FROM
            ac
        WHERE
            COUNT > 1
        GROUP BY
            label,
            same_value
    )
SELECT
    *
FROM
    ac
WHERE
    COUNT > 1

In conclusion, most simultaneous records (12,051 of 12,569) have different volumes, so they are unlikely to be duplicate entries caused by human charting error.
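The grouping step of the query above (the `ac` step) can be mirrored in R on invented data. This is a hedged sketch of the idea, not a replacement for the SQL: records are grouped by stay and chart time, labels are concatenated, and the group is flagged as equal or different volume by comparing its minimum and maximum.

```r
# Invented records; stays 1 and 2 each have two sources charted at 08:00.
uo <- data.frame(
  STAY_ID = c(1, 1, 1, 2, 2, 3),
  CHARTTIME = c("08:00", "08:00", "09:00", "08:00", "08:00", "08:00"),
  LABEL = c("Foley", "Void", "Foley", "R Nephrostomy", "L Nephrostomy", "Foley"),
  VALUE = c(100, 40, 80, 30, 30, 200),
  stringsAsFactors = FALSE
)

# One group per (stay, chart time) combination that actually occurs.
groups <- split(uo, list(uo$STAY_ID, uo$CHARTTIME), drop = TRUE)

# Summarise each group: labels, record count, and equal/different volume.
summarise_group <- function(g) {
  data.frame(
    STAY_ID = g$STAY_ID[1],
    CHARTTIME = g$CHARTTIME[1],
    label = paste(g$LABEL, collapse = ","),
    COUNT = nrow(g),
    same_value = if (min(g$VALUE) == max(g$VALUE)) "Equal volume" else "Different volume",
    stringsAsFactors = FALSE
  )
}
per_time <- do.call(rbind, lapply(groups, summarise_group))

# Keep only time points with more than one record (simultaneous charting).
simultaneous <- per_time[per_time$COUNT > 1, ]
simultaneous
```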

Exclusion

ICU type exclusion:

dbGetQuery(con, statement = read_file('sql/service_type_exclusion.sql'))

## # A tibble: 1 × 2
##   icu_stays UO_records
##       <int>      <int>
## 1       238       6223

Full SQL query:

WITH stays_services AS (
    -- Adding ICU type by looking into services
    SELECT
      a.STAY_ID,
      ARRAY_AGG(
        c.curr_service
        ORDER BY
          c.transfertime DESC
        LIMIT
          1
      ) [OFFSET(0)] AS SERVICE
    FROM
      `mimic_uo_and_aki.a_urine_output_raw` a
      LEFT JOIN `physionet-data.mimiciv_hosp.services` c ON c.hadm_id = a.HADM_ID
      AND c.transfertime < DATETIME_ADD(a.CHARTTIME, INTERVAL 1 HOUR)
    GROUP BY
      a.STAY_ID
  )
SELECT
  COUNT(DISTINCT STAY_ID) icu_stays,
  COUNT(STAY_ID) UO_records,
FROM
  `mimic_uo_and_aki.a_urine_output_raw`
WHERE
  STAY_ID NOT IN (
    SELECT
      STAY_ID
    FROM
      stays_services
    WHERE
      SERVICE IN (
        'MED',
        'TSURG',
        'CSURG',
        'CMED',
        'NMED',
        'OMED',
        'TRAUM',
        'SURG',
        'NSURG',
        'ORTHO',
        'VSURG',
        'ENT',
        'PSURG',
        'GU'
      )
  )

Ureteral stent exclusion:

dbGetQuery(con, statement = read_file('sql/ure_stent_exclusion.sql'))

## # A tibble: 1 × 2
##   icu_stays UO_records
##       <int>      <int>
## 1        45       3201

Full SQL query:

WITH stays_services AS (
        -- Adding ICU type by looking into services
        SELECT
            a.STAY_ID,
            ARRAY_AGG(
                c.curr_service
                ORDER BY
                    c.transfertime DESC
                LIMIT
                    1
            ) [OFFSET(0)] AS SERVICE
        FROM
            `mimic_uo_and_aki.a_urine_output_raw` a
            LEFT JOIN `physionet-data.mimiciv_hosp.services` c ON c.hadm_id = a.HADM_ID
            AND c.transfertime < DATETIME_ADD(a.CHARTTIME, INTERVAL 1 HOUR)
        GROUP BY
            a.STAY_ID
    )
SELECT
    COUNT(DISTINCT STAY_ID) icu_stays,
    COUNT(STAY_ID) UO_records
FROM
    `mimic_uo_and_aki.a_urine_output_raw`
WHERE
    STAY_ID IN (
        SELECT
            STAY_ID
        FROM
            `physionet-data.mimiciv_icu.outputevents`
        WHERE
            ITEMID IN (226558, 226557) -- Ureteral stent
        GROUP BY
            STAY_ID
    )
    AND STAY_ID IN (
        SELECT
            STAY_ID
        FROM
            stays_services
        WHERE
            SERVICE IN (
                'MED',
                'TSURG',
                'CSURG',
                'CMED',
                'NMED',
                'OMED',
                'TRAUM',
                'SURG',
                'NSURG',
                'ORTHO',
                'VSURG',
                'ENT',
                'PSURG',
                'GU'
            )
    )

GU irrigation exclusion:

dbGetQuery(con, statement = read_file('sql/gu_irig_exclusion.sql'))

## # A tibble: 1 × 2
##   icu_stays UO_records
##       <int>      <int>
## 1       639      85286

Full SQL query:

WITH stays_services AS (
    -- Adding ICU type by looking into services
    SELECT
      a.STAY_ID,
      ARRAY_AGG(
        c.curr_service
        ORDER BY
          c.transfertime DESC
        LIMIT
          1
      ) [OFFSET(0)] AS SERVICE
    FROM
      `mimic_uo_and_aki.a_urine_output_raw` a
      LEFT JOIN `physionet-data.mimiciv_hosp.services` c ON c.hadm_id = a.HADM_ID
      AND c.transfertime < DATETIME_ADD(a.CHARTTIME, INTERVAL 1 HOUR)
    GROUP BY
      a.STAY_ID
  )
SELECT
  COUNT(DISTINCT STAY_ID) icu_stays,
  COUNT(STAY_ID) UO_records
FROM
  `mimic_uo_and_aki.a_urine_output_raw`
WHERE
  STAY_ID IN (
    SELECT
      STAY_ID
    FROM
      `physionet-data.mimiciv_icu.outputevents`
    WHERE
      ITEMID IN (227488, 227489) --GU irrigation
    GROUP BY
      STAY_ID
  )
  AND STAY_ID NOT IN (
        SELECT
            STAY_ID
        FROM
            `physionet-data.mimiciv_icu.outputevents`
        WHERE
            ITEMID IN (226558, 226557) -- Ureteral stent
        GROUP BY
            STAY_ID
    )
  AND STAY_ID IN (
    SELECT
      STAY_ID
    FROM
      stays_services
    WHERE
      SERVICE IN (
        'MED',
        'TSURG',
        'CSURG',
        'CMED',
        'NMED',
        'OMED',
        'TRAUM',
        'SURG',
        'NSURG',
        'ORTHO',
        'VSURG',
        'ENT',
        'PSURG',
        'GU'
      )
  )

UO records failing the sanity check (volume above 5,000 mL or negative):

dbGetQuery(con, statement = read_file('sql/sanity.sql'))

## # A tibble: 1 × 1
##   UO_records
##        <int>
## 1          9

Full SQL query:

WITH stays_services AS (
    -- Adding ICU type by looking into services
    SELECT
      a.STAY_ID,
      ARRAY_AGG(
        c.curr_service
        ORDER BY
          c.transfertime DESC
        LIMIT
          1
      ) [OFFSET(0)] AS SERVICE
    FROM
      `mimic_uo_and_aki.a_urine_output_raw` a
      LEFT JOIN `physionet-data.mimiciv_hosp.services` c ON c.hadm_id = a.HADM_ID
      AND c.transfertime < DATETIME_ADD(a.CHARTTIME, INTERVAL 1 HOUR)
    GROUP BY
      a.STAY_ID
  )
SELECT
  COUNT(STAY_ID) UO_records
FROM
  `mimic_uo_and_aki.a_urine_output_raw`
WHERE
  STAY_ID NOT IN (
    SELECT
      STAY_ID
    FROM
      `physionet-data.mimiciv_icu.outputevents`
    WHERE
      ITEMID IN (226558, 226557, 227488, 227489)
    GROUP BY
      STAY_ID
  )
  AND STAY_ID IN (
    SELECT
      STAY_ID
    FROM
      stays_services
    WHERE
      SERVICE IN (
        'MED',
        'TSURG',
        'CSURG',
        'CMED',
        'NMED',
        'OMED',
        'TRAUM',
        'SURG',
        'NSURG',
        'ORTHO',
        'VSURG',
        'ENT',
        'PSURG',
        'GU'
      )
  )
  AND (
    VALUE > 5000
    OR VALUE < 0
  )
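The final condition of the query above can be checked on a few invented volumes. This sketch only illustrates the boundary behaviour of the rule as written: values of exactly 0 and exactly 5000 pass, because the comparisons are strict.

```r
# Invented volumes around the two thresholds.
values <- c(-5, 0, 250, 5000, 5001)

# Same rule as in the query: VALUE > 5000 OR VALUE < 0.
fails_sanity <- values > 5000 | values < 0

data.frame(VALUE = values, fails_sanity = fails_sanity)
```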

Total raw urine output after exclusion (“included records, before dropping records without collection times”):

nrow(raw_uo_eligible)

## [1] 3241266

Full SQL query:

WITH stays_services AS (
    -- Adding ICU type by looking into services
    SELECT
      a.STAY_ID,
      ARRAY_AGG(
        c.curr_service
        ORDER BY
          c.transfertime DESC
        LIMIT
          1
      ) [OFFSET(0)] AS SERVICE
    FROM
      `mimic_uo_and_aki.a_urine_output_raw` a
      LEFT JOIN `physionet-data.mimiciv_hosp.services` c ON c.hadm_id = a.HADM_ID
      AND c.transfertime < DATETIME_ADD(a.CHARTTIME, INTERVAL 1 HOUR)
    GROUP BY
      a.STAY_ID
  )
SELECT
  *
FROM
  `mimic_uo_and_aki.a_urine_output_raw`
WHERE
  STAY_ID NOT IN (
    SELECT
      STAY_ID
    FROM
      `physionet-data.mimiciv_icu.outputevents`
    WHERE
      ITEMID IN (226558, 226557, 227488, 227489)
    GROUP BY
      STAY_ID
  )
  AND STAY_ID IN (
    SELECT
      STAY_ID
    FROM
      stays_services
    WHERE
      SERVICE IN (
        'MED',
        'TSURG',
        'CSURG',
        'CMED',
        'NMED',
        'OMED',
        'TRAUM',
        'SURG',
        'NSURG',
        'ORTHO',
        'VSURG',
        'ENT',
        'PSURG',
        'GU'
      )
  )
  AND NOT (
    VALUE > 5000
    OR VALUE < 0
  )

Total ICU stays after exclusion (“included records, before dropping records without collection times”):

n_distinct(raw_uo_eligible$STAY_ID)

## [1] 69442

Exclusion of first volume in each compartment per ICU stay:

uo_rate_including_null_collection_period %>%
  filter(STAY_ID %in% raw_uo_eligible$STAY_ID) %>%
  filter(is.na(TIME_INTERVAL)) %>%
  nrow()

## [1] 70051
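The first record in each compartment has no earlier record to measure from, which is why it has no collection time. A minimal sketch on invented data, assuming `TIME_INTERVAL` is the gap in minutes to the previous record of the same ICU stay and source (the actual derivation is done upstream and is not shown here):

```r
# Invented records for two stays; stay 1 has two compartments.
uo <- data.frame(
  STAY_ID = c(1, 1, 1, 1, 2),
  SOURCE = c("Foley", "Foley", "R Nephrostomy", "Foley", "Void"),
  CHARTTIME = as.POSIXct(
    c("2150-01-01 08:00", "2150-01-01 10:00", "2150-01-01 10:00",
      "2150-01-01 11:30", "2150-01-02 09:00"),
    tz = "UTC"
  ),
  VALUE = c(100, 120, 30, 90, 200),
  stringsAsFactors = FALSE
)

# Sort so that each record follows its predecessor within stay and source.
uo <- uo[order(uo$STAY_ID, uo$SOURCE, uo$CHARTTIME), ]

# Minutes since the previous record in the same stay and source;
# the first record of each group gets NA.
group_key <- paste(uo$STAY_ID, uo$SOURCE)
uo$TIME_INTERVAL <- ave(
  as.numeric(uo$CHARTTIME), group_key,
  FUN = function(t) c(NA, diff(t)) / 60
)

uo[, c("STAY_ID", "SOURCE", "VALUE", "TIME_INTERVAL")]
```

Each stay-source combination yields exactly one record without a collection time, so the number of such records can exceed the number of ICU stays.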

UO records with time intervals (“Included records”):

uo_rate_including_null_collection_period %>%
  filter(STAY_ID %in% raw_uo_eligible$STAY_ID) %>%
  drop_na(TIME_INTERVAL) %>%
  nrow()

## [1] 3171215

Count UO records by anatomical compartment:

uo_rate %>% 
  mutate(agg_group = case_when(SOURCE == "Foley" |
                                 SOURCE == "Condom Cath" |
                                 SOURCE == "Straight Cath" |
                                 SOURCE == "Suprapubic" |
                                 SOURCE == "Void" ~ "Urinary bladder",
                               TRUE ~ SOURCE)
  ) %>%
           group_by(agg_group) %>%
   dplyr::summarise(N = n()
  )

## # A tibble: 4 × 2
##   agg_group             N
##   <chr>             <int>
## 1 Ileoconduit        6022
## 2 L Nephrostomy      3206
## 3 R Nephrostomy      3465
## 4 Urinary bladder 3158522

ICU stays with UO records with time intervals:

print("ICU stays after exclusion criteria:")

## [1] "ICU stays after exclusion criteria:"

uo_rate_including_null_collection_period %>%
  {n_distinct(.$STAY_ID)}

## [1] 69442

print("Included ICU stays (has time intervals):")

## [1] "Included ICU stays (has time intervals):"

uo_rate_including_null_collection_period %>%
                     drop_na(TIME_INTERVAL) %>%
  {n_distinct(.$STAY_ID)}

## [1] 67642

print("ICU stays with UO records that do not have a time interval (no previous UO record in the same compartment):")

## [1] "ICU stays with UO records that do not have a time interval (no previous UO record in the same compartment):"

uo_rate_including_null_collection_period %>%
                     filter(is.na(TIME_INTERVAL)) %>%
  {n_distinct(.$STAY_ID)}

## [1] 69438

print("ICU stays dropped due to no UO records with time intervals (no previous UO record in the same compartment):")

## [1] "ICU stays dropped due to no UO records with time intervals (no previous UO record in the same compartment):"

(uo_rate_including_null_collection_period %>%
                     filter(is.na(TIME_INTERVAL))) %>%
  filter(!(STAY_ID %in% (uo_rate_including_null_collection_period %>%
                     drop_na(TIME_INTERVAL))$STAY_ID)) %>%
  {n_distinct(.$STAY_ID)}

## [1] 1800

Count total ICU days of UO monitoring for ICU stays with time intervals:

hourly_uo %>%
  filter(STAY_ID %in%
           (uo_rate_including_null_collection_period %>%
                     drop_na(TIME_INTERVAL))$STAY_ID) %>%
  nrow() / 24

## [1] 218388.2

Hours of UO monitoring:

print("Hours of UO monitoring in all included ICU stays:")

## [1] "Hours of UO monitoring in all included ICU stays:"

hourly_uo %>%
  filter(STAY_ID %in%
           (
             uo_rate_including_null_collection_period %>%
               drop_na(TIME_INTERVAL)
           )$STAY_ID) %>%
  nrow()

## [1] 5241317

print("Valid hourly-adjusted UO monitoring hours:")

## [1] "Valid hourly-adjusted UO monitoring hours:"

hourly_uo %>%
  filter(STAY_ID %in%
           (
             uo_rate_including_null_collection_period %>%
               drop_na(TIME_INTERVAL)
           )$STAY_ID) %>%
  drop_na(HOURLY_WEIGHTED_MEAN_RATE) %>%
  nrow()

## [1] 5211377

print("ICU stays with valid hourly-adjusted UO monitoring hours:")

## [1] "ICU stays with valid hourly-adjusted UO monitoring hours:"

hourly_uo %>%
  filter(STAY_ID %in%
           (
             uo_rate_including_null_collection_period %>%
               drop_na(TIME_INTERVAL)
           )$STAY_ID) %>%
  drop_na(HOURLY_WEIGHTED_MEAN_RATE) %>%
  {n_distinct(.$STAY_ID)}

## [1] 67602

Proportion of valid hours out of all included hours of UO monitoring:

hourly_uo %>%
  filter(STAY_ID %in%
           (
             uo_rate_including_null_collection_period %>%
               drop_na(TIME_INTERVAL)
           )$STAY_ID) %>%
  drop_na(HOURLY_WEIGHTED_MEAN_RATE) %>%
  nrow() /
hourly_uo %>%
  filter(STAY_ID %in%
           (
             uo_rate_including_null_collection_period %>%
               drop_na(TIME_INTERVAL)
           )$STAY_ID) %>%
  nrow()

## [1] 0.9942877

Hourly-adjusted UO with admission weights:

print("ICU stays with weight at admission and valid hourly-adjusted UO:")

## [1] "ICU stays with weight at admission and valid hourly-adjusted UO:"

hourly_uo %>%
  filter(STAY_ID %in%
           (
             uo_rate_including_null_collection_period %>%
               drop_na(TIME_INTERVAL)
           )$STAY_ID) %>%
  drop_na(HOURLY_WEIGHTED_MEAN_RATE) %>%
  drop_na(WEIGHT_ADMIT) %>%
  {n_distinct(.$STAY_ID)}

## [1] 65717

print("Valid hourly-adjusted UO monitoring hours with weight at admission:")

## [1] "Valid hourly-adjusted UO monitoring hours with weight at admission:"

hourly_uo %>%
  filter(STAY_ID %in%
           (
             uo_rate_including_null_collection_period %>%
               drop_na(TIME_INTERVAL)
           )$STAY_ID) %>%
  drop_na(HOURLY_WEIGHTED_MEAN_RATE) %>%
  drop_na(WEIGHT_ADMIT) %>%
  nrow()

## [1] 5127472

print("ICU stays with valid weight (25 <= kg <=300) and valid hourly-adjusted UO:")

## [1] "ICU stays with valid weight (25 <= kg <=300) and valid hourly-adjusted UO:"

hourly_uo %>%
  filter(STAY_ID %in%
           (
             uo_rate_including_null_collection_period %>%
               drop_na(TIME_INTERVAL)
           )$STAY_ID) %>%
  drop_na(HOURLY_WEIGHTED_MEAN_RATE) %>%
  filter(WEIGHT_ADMIT <= 300,
         WEIGHT_ADMIT >= 25) %>%
  {n_distinct(.$STAY_ID)}

## [1] 65595

print("Valid hourly-adjusted UO monitoring hours with valid weights:")

## [1] "Valid hourly-adjusted UO monitoring hours with valid weights:"

hourly_uo %>%
  filter(STAY_ID %in%
           (
             uo_rate_including_null_collection_period %>%
               drop_na(TIME_INTERVAL)
           )$STAY_ID) %>%
  drop_na(HOURLY_WEIGHTED_MEAN_RATE) %>%
  filter(WEIGHT_ADMIT <= 300,
         WEIGHT_ADMIT >= 25) %>%
  nrow()

## [1] 5119874

ICU stays with calculated KDIGO-UO staging (at least six consecutive hours with valid charting of hourly-adjusted UO)

print("number of ICU stays with valid KDIGO-UO staging")

## [1] "number of ICU stays with valid KDIGO-UO staging"

kdigo_uo_stage %>% 
  filter(stay_id %in%
           (
             uo_rate_including_null_collection_period %>%
               drop_na(TIME_INTERVAL)
           )$STAY_ID) %>%
  filter(weight_admit <= 300,
         weight_admit >= 25) %>%
  {n_distinct(.$stay_id)}

## [1] 64044

print("number of included hours with valid KDIGO-UO staging")

## [1] "number of included hours with valid KDIGO-UO staging"

kdigo_uo_stage %>% 
  filter(stay_id %in%
           (
             uo_rate_including_null_collection_period %>%
               drop_na(TIME_INTERVAL)
           )$STAY_ID) %>%
  filter(weight_admit <= 300,
         weight_admit >= 25) %>%
  nrow()

## [1] 4794111
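The requirement of at least six consecutive hours with valid hourly-adjusted UO can be illustrated with a run-length check. This is a sketch under stated assumptions, not the staging code used for the results: `has_valid_run` is a hypothetical helper, and a missing hourly rate is taken to mean an hour without valid charting.

```r
# Hypothetical helper: TRUE if the hourly rates contain at least
# `min_hours` consecutive non-missing values.
has_valid_run <- function(rate, min_hours = 6) {
  runs <- rle(!is.na(rate))  # lengths of consecutive valid / missing stretches
  any(runs$values & runs$lengths >= min_hours)
}

# Six valid hours follow the gap, so staging would be possible.
stay_a <- c(50, 60, NA, 40, 45, 50, 55, 60, 65)

# The longest valid stretch is four hours, so it would not.
stay_b <- c(50, 60, 70, NA, 40, 45, 50, 55, NA)

has_valid_run(stay_a)
has_valid_run(stay_b)
```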

Eligible first ICU admission of each patient for AKI analysis

first_icu_stay <- dbGetQuery(con, "SELECT subject_id,
            ARRAY_AGG(
                STAY_ID
                ORDER BY
                    intime ASC
                LIMIT
                    1
            ) [OFFSET(0)] FIRST_STAY_ID_IN_PATIENT
        FROM
            `physionet-data.mimiciv_icu.icustays`
        GROUP BY
            subject_id")
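For readers less familiar with `ARRAY_AGG(... ORDER BY ... LIMIT 1)`, the query above keeps the earliest ICU stay of each patient. A roughly equivalent selection in base R, shown on invented rows as an illustration only:

```r
# Invented ICU stays; patient 1 has two stays listed out of time order.
icustays <- data.frame(
  subject_id = c(1, 1, 2),
  stay_id = c(11, 12, 21),
  intime = as.POSIXct(
    c("2150-03-01 10:00", "2150-01-15 08:00", "2160-06-01 12:00"),
    tz = "UTC"
  )
)

# Sort by patient and admission time, then keep the first row per patient.
ordered_stays <- icustays[order(icustays$subject_id, icustays$intime), ]
first_stays <- ordered_stays[!duplicated(ordered_stays$subject_id),
                             c("subject_id", "stay_id")]
names(first_stays)[2] <- "FIRST_STAY_ID_IN_PATIENT"

first_stays
```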

print("number of first ICU stays with valid KDIGO-UO staging")

## [1] "number of first ICU stays with valid KDIGO-UO staging"

kdigo_uo_stage %>% 
  filter(stay_id %in% first_icu_stay$FIRST_STAY_ID_IN_PATIENT
         & stay_id %in% raw_uo_eligible$STAY_ID) %>%
  filter(weight_admit <= 300,
         weight_admit >= 25) %>%
  {n_distinct(.$stay_id)}

## [1] 46348

print("number of valid hourly KDIGO-UO staging in first ICU stays")

## [1] "number of valid hourly KDIGO-UO staging in first ICU stays"

kdigo_uo_stage %>% 
  filter(stay_id %in% first_icu_stay$FIRST_STAY_ID_IN_PATIENT
         & stay_id %in% raw_uo_eligible$STAY_ID) %>%
  filter(weight_admit <= 300,
         weight_admit >= 25) %>%
  nrow()

## [1] 3304774

Included ICU admissions for AKI analysis

print("number of first ICU stays (after exclusion criteria) with valid KDIGO-UO staging for the first 24 hours in ICU stay")

## [1] "number of first ICU stays (after exclusion criteria) with valid KDIGO-UO staging for the first 24 hours in ICU stay"

aki_epi <- akis_all_long %>%
  filter(group == "newcons") %>%
  drop_na(prevalnce_admit) %>%
  transmute(STAY_ID,
           first_kdigo_uo = first_stage,
         max_uo_stage = max_stage)
  

kdigo_uo_stage %>% 
  filter(stay_id %in% aki_epi$STAY_ID) %>%
  filter(weight_admit <= 300,
         weight_admit >= 25) %>%
  {n_distinct(.$stay_id)}

## [1] 46344

print("number of valid hourly KDIGO-UO for those stays")

## [1] "number of valid hourly KDIGO-UO for those stays"

kdigo_uo_stage %>% 
  filter(stay_id %in% aki_epi$STAY_ID) %>%
  filter(weight_admit <= 300,
         weight_admit >= 25) %>%
  nrow()

## [1] 3304351

Inclusion/Exclusion Flowchart

knitr::include_graphics("flow chart.png")
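The counts reported in the sections above can be reconciled arithmetically. The sketch below uses only numbers printed earlier in this document; the one assumption is that the sanity check removes individual records without removing whole ICU stays, which is consistent with the reported totals.

```r
# Records: raw total minus the four record-level exclusions.
raw_records <- 3335985
excluded_records <- c(
  icu_type = 6223,
  ureteral_stent = 3201,
  gu_irrigation = 85286,
  sanity_check = 9
)
eligible_records <- raw_records - sum(excluded_records)

# Dropping the first record of each compartment (no collection time).
included_records <- eligible_records - 70051

# Included records should also equal the sum over anatomical compartments.
by_compartment <- c(
  ileoconduit = 6022,
  l_nephrostomy = 3206,
  r_nephrostomy = 3465,
  urinary_bladder = 3158522
)

# ICU stays: raw total minus the three stay-level exclusions,
# then minus stays left without any record that has a time interval.
raw_stays <- 70364
excluded_stays <- c(icu_type = 238, ureteral_stent = 45, gu_irrigation = 639)
eligible_stays <- raw_stays - sum(excluded_stays)
included_stays <- eligible_stays - 1800

c(eligible_records = eligible_records, included_records = included_records,
  eligible_stays = eligible_stays, included_stays = included_stays)
```

The results match the reported 3,241,266 eligible records, 3,171,215 included records, 69,442 eligible ICU stays, and 67,642 included ICU stays.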

Table 1 - Patients’ characteristics

table_1$SERVICE <- as.factor(table_1$SERVICE)
table_1$admission_age <- as.numeric(table_1$admission_age)
table_1$weight_admit <- as.numeric(table_1$weight_admit)
table_1$height_first <- as.numeric(table_1$height_first)
table_1$creat_first <- as.numeric(table_1$creat_first)
table_1$creat_peak_72 <- as.numeric(table_1$creat_peak_72)
table_1$creat_last <- as.numeric(table_1$creat_last)

table_1 <- table_1 %>%
  mutate(
    race =
      case_when(
        grepl("asian", race, ignore.case = TRUE) ~ "Asian",
        grepl("black", race, ignore.case = TRUE) ~ "African American",
        grepl("white", race, ignore.case = TRUE) ~ "Caucasian",
        grepl("hispanic", race, ignore.case = TRUE) ~ "Hispanic",
        grepl("other", race, ignore.case = TRUE) ~ "Other",
        grepl("native", race, ignore.case = TRUE) ~ "Other",
        grepl("MULTIPLE", race, ignore.case = TRUE) ~ "Other",
        grepl("PORTUGUESE", race, ignore.case = TRUE) ~ "Other",
        grepl("SOUTH AMERICAN", race, ignore.case = TRUE) ~ "Other",
        TRUE ~ as.character(NA)
      )
  )

uo_for_table1 <- uo_rate_including_null_collection_period %>%
  drop_na(TIME_INTERVAL) %>%
  group_by(STAY_ID) %>%
  summarise(
    count = n(),
    volumes = mean(VALUE, na.rm = TRUE),
    collection_times = mean(TIME_INTERVAL, na.rm = TRUE),
    rates = mean(HOURLY_RATE, na.rm = TRUE),
    ml_kg_hr = mean(HOURLY_RATE / WEIGHT_ADMIT, na.rm = TRUE)
  )

t1a <- table_1 %>%
  select(
    admission_age,
    weight_admit,
    gender,
    race,
    charlson_comorbidity_index,
    ckd,
    dm,
    sofa_first_day,
    sapsii,
    apsiii,
    creat_first,
    creat_peak_72,
    creat_last,
    kdigo_cr_max,
    hospital_days,
    icu_days,
    rrt_binary,
    hospital_expire_flag
  ) %>%
  tbl_summary(
    type = list(
      c(hospital_expire_flag, ckd, dm, rrt_binary) ~ "dichotomous",
      c(
        admission_age,
        weight_admit,
        creat_first,
        creat_peak_72,
        creat_last
      ) ~ "continuous"
    ),
    statistic = c(
      admission_age,
      weight_admit,
      creat_first,
      creat_peak_72,
      creat_last
    ) ~ "{mean} ({sd})",
    missing = "no",
    label = list(
      admission_age ~ "Age at Hospital Admission, years",
      gender ~ "Gender",
      weight_admit ~ "Weight at ICU Admission, kg",
      charlson_comorbidity_index ~ "CCI Score",
      sofa_first_day ~ "SOFA Score at ICU Admission",
      ckd ~ "CKD, Stage 1-4",
      apsiii ~ "APS-III Score at ICU Admission",
      creat_first ~ "First Creatinine in ICU, mg/dL",
      creat_peak_72 ~ "Peak Creatinine at first days, mg/dL",
      creat_last ~ "ICU Discharge Creatinine, mg/dL",
      kdigo_cr_max ~ "Peak KDIGO-Cr at first days",
      race ~ "Ethnicity",
      icu_days ~ "Time in ICU, days",
      hospital_days ~ "Time in hospital, days",
      rrt_binary ~ "Renal replacement therapy",
      hospital_expire_flag ~ "Hospital Mortality",
      dm ~ "Diabetes Mellitus",
      sapsii ~ "SAPS-II at ICU Admission"
    )
  ) %>%
  add_n() 

display_prec <- function(x)
  mean(x) * 100

t1b <- uo_for_table1 %>%
  select(count,
         volumes,
         collection_times,
         rates,
         ml_kg_hr) %>%
  tbl_summary(
    type = list(
      c(volumes,
        collection_times,
        rates,
        ml_kg_hr) ~ "continuous"
    ),
    statistic = list(
      all_continuous() ~ "{mean} ({sd})"
    ),
    missing = "no",
    label = list(
      count ~ "Number of Measurements",
      volumes ~ "Average Volumes, mL",
      collection_times ~ "Average Collection Times, minutes",
      rates ~ "Average Rates, mL/hr",
      ml_kg_hr ~ "Average Rate to Weight, mL/hr/kg"
    )
  ) %>%
  add_n() 

tbl_stack(
  list(t1a, t1b),
  group_header = c("ICU Stay", "UO Charting Across ICU Stay")
) %>%
  as_gt() %>%
  tab_source_note(source_note = "The variables age at hospital admission, gender, CCI, CKD, ethnicity, time in hospital, and mortality are measured for each hospital admission and might be counted for more than one ICU stay. All other variables are measured individually for each ICU stay. The variables average collection times, average rates, and average rate to weight are only presented for ICU stays with at least two UO measurements for the same compartment, and the latter also required weight at admission. AKI variables were presented for ICU stays with at least one hour with a non-null KDIGO-UO stage. The variables average AKI duration and total time in AKI were specifically reported for ICU stays with at least one AKI event.") %>%
  tab_source_note(source_note = "AKI: Acute Kidney Injury; CCI: Charlson Comorbidity Index; CKD Stage 1-4: Chronic Kidney Disease excluding end-stage-renal-disease; ICU: Intensive Care Unit; KDIGO: Kidney Disease: Improving Global Outcomes; SOFA: Sequential Organ Failure Assessment; UO: Urine Output.")

Characteristic	N	N = 67,642¹
ICU Stay
Age at Hospital Admission, years	67,642	65 (17)
Weight at ICU Admission, kg	65,751	81 (34)
Gender	67,642
F		29,686 (44%)
M		37,956 (56%)
Ethnicity	60,412
African American		6,836 (11%)
Asian		1,956 (3.2%)
Caucasian		46,323 (77%)
Hispanic		2,484 (4.1%)
Other		2,813 (4.7%)
CCI Score	67,642	5 (3, 7)
CKD, Stage 1-4	67,624	13,162 (19%)
Diabetes Mellitus	67,624	15,896 (24%)
SOFA Score at ICU Admission	67,642	4 (2, 6)
SAPS-II at ICU Admission	67,161	33 (25, 42)
APS-III Score at ICU Admission	67,642	39 (29, 52)
First Creatinine in ICU, mg/dL	67,287	1.35 (1.37)
Peak Creatinine at first days, mg/dL	67,263	1.52 (1.55)
ICU Discharge Creatinine, mg/dL	67,287	1.25 (1.21)
Peak KDIGO-Cr at first days	66,255
0		48,817 (74%)
1		12,222 (18%)
2		2,764 (4.2%)
3		2,452 (3.7%)
Time in hospital, days	67,642	7 (4, 13)
Time in ICU, days	67,642	2.0 (1.1, 3.8)
Renal replacement therapy	67,642	4,009 (5.9%)
Hospital Mortality	67,642	7,198 (11%)
UO Charting Across ICU Stay
Number of Measurements	67,642	47 (75)
Average Volumes, mL	67,642	182 (149)
Average Collection Times, minutes	67,642	157 (230)
Average Rates, mL/hr	67,642	115 (259)
Average Rate to Weight, mL/hr/kg	65,751	1.62 (6.16)
The variables age at hospital admission, gender, CCI, CKD, ethnicity, time in hospital, and mortality are measured for each hospital admission and might be counted for more than one ICU stay. All other variables are measured individually for each ICU stay. The variables average collection times, average rates, and average rate to weight are only presented for ICU stays with at least two UO measurements for the same compartment, and the latter also required weight at admission. AKI variables were presented for ICU stays with at least one hour with a non-null KDIGO-UO stage. The variables average AKI duration and total time in AKI were specifically reported for ICU stays with at least one AKI event.
AKI: Acute Kidney Injury; CCI: Charlson Comorbidity Index; CKD Stage 1-4: Chronic Kidney Disease excluding end-stage-renal-disease; ICU: Intensive Care Unit; KDIGO: Kidney Disease: Improving Global Outcomes; SOFA: Sequential Organ Failure Assessment; UO: Urine Output.
¹ Mean (SD); n (%); Median (Q1, Q3)

Age

mimic_Sage_a <- table_1 %>%
  dplyr::summarise(N = n(),
                   Mean = round(mean(admission_age),2),
                   SD = round(sd(admission_age),2),
                   '5th' = round(quantile(admission_age, 0.05),2),
                   '10th' = round(quantile(admission_age, 0.1),2),
                   '25th' = round(quantile(admission_age, 0.25),2),
                   '50th' = round(quantile(admission_age, 0.50),2),
                   '75th' = round(quantile(admission_age, 0.75),2),
                   '95th' = round(quantile(admission_age, 0.95),2),
                   Min = round(min(admission_age),2),
                   Max = round(max(admission_age),2)
  ) %>% gt() %>%
  fmt_number(use_seps = TRUE, decimals = 2)
mimic_Sage_a

N	Mean	SD	5th	10th	25th	50th	75th	95th	Min	Max
67,642.00	64.79	16.79	32.00	41.00	55.00	66.00	77.00	89.00	18.00	102.00
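
The same summary (N, mean, SD, selected percentiles, minimum and maximum) is recomputed for several variables in this document. A minimal sketch of a reusable helper is given below. The function name `summarise_distribution()`, its `digits` argument and the toy data frame are hypothetical and are not part of the original analysis.

```r
library(dplyr)

# Hypothetical helper (not in the original code): one-row distribution
# summary for a single numeric column, mirroring the columns used here.
summarise_distribution <- function(data, var, digits = 2) {
  data %>%
    summarise(
      N      = n(),
      Mean   = round(mean({{ var }}), digits),
      SD     = round(sd({{ var }}), digits),
      `5th`  = round(unname(quantile({{ var }}, 0.05)), digits),
      `10th` = round(unname(quantile({{ var }}, 0.10)), digits),
      `25th` = round(unname(quantile({{ var }}, 0.25)), digits),
      `50th` = round(unname(quantile({{ var }}, 0.50)), digits),
      `75th` = round(unname(quantile({{ var }}, 0.75)), digits),
      `95th` = round(unname(quantile({{ var }}, 0.95)), digits),
      Min    = round(min({{ var }}), digits),
      Max    = round(max({{ var }}), digits)
    )
}

# Invented toy data, used only to show the call
toy <- data.frame(admission_age = c(18, 32, 55, 66, 77, 89, 102))
age_summary <- summarise_distribution(toy, admission_age)
age_summary
```

Because the helper relies on `summarise()`, it should also work on grouped data, for example after `group_by(SOURCE)`.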

mimic_Sage_b <- ggplot() + 
  geom_histogram(aes(x = admission_age
                     ), data=table_1, binwidth = 1) + 
  labs(
        x = "Age (years)",
        y = "Frequency"
      )

mimic_Sage_b

Weight

mimic_Sweight_a <- table_1 %>%
  drop_na(weight_admit) %>%
  dplyr::summarise(N = n(),
                   Mean = round(mean(weight_admit),2),
                   SD = round(sd(weight_admit),2),
                   '5th' = round(quantile(weight_admit, 0.05),2),
                   '10th' = round(quantile(weight_admit, 0.1),2),
                   '25th' = round(quantile(weight_admit, 0.25),2),
                   '50th' = round(quantile(weight_admit, 0.50),2),
                   '75th' = round(quantile(weight_admit, 0.75),2),
                   '95th' = round(quantile(weight_admit, 0.95),2),
                   Min = round(min(weight_admit),2),
                   Max = round(max(weight_admit),2)
  ) %>% gt() %>%
  fmt_number(use_seps = TRUE, decimals = 2)
mimic_Sweight_a

N	Mean	SD	5th	10th	25th	50th	75th	95th	Min	Max
65,751.00	81.44	34.35	50.00	55.70	65.60	78.10	93.00	122.00	1.00	5,864.00

mimic_Sweight_b <- ggplot() + 
  geom_histogram(aes(x = weight_admit
                     ), data=table_1, binwidth = 5) + 
  labs(
        # title = "Weight at ICU Admission",
        x = "Weight (kg)",
        y = "Frequency"
      ) +
  xlim(0, 300)

mimic_Sweight_b

Table 2 - UO records characteristics

uo_rate$SOURCE <- as.factor(uo_rate$SOURCE)
uo_rate$SERVICE <- as.factor(uo_rate$SERVICE)

uo_rate %>%
  select(VALUE, TIME_INTERVAL, SOURCE, SERVICE) %>%
  tbl_summary(by=SERVICE) %>%
  add_overall()

Characteristic	Overall N = 3,171,215¹	CMED N = 285,374¹	PSURG N = 5,896¹	SURG N = 376,125¹	TRAUM N = 139,716¹	TSURG N = 88,269¹	VSURG N = 83,399¹	CSURG N = 510,735¹	ENT N = 5,914¹	GU N = 6,562¹	MED N = 1,123,098¹	NMED N = 180,286¹	NSURG N = 307,162¹	OMED N = 20,881¹	ORTHO N = 37,798¹
VALUE	100 (45, 175)	100 (50, 200)	100 (50, 200)	75 (40, 140)	80 (45, 150)	80 (45, 150)	70 (35, 125)	75 (40, 145)	125 (65, 240)	100 (50, 160)	100 (45, 190)	100 (50, 200)	125 (70, 225)	125 (60, 250)	80 (45, 150)
TIME_INTERVAL	60 (60, 120)	60 (60, 120)	60 (60, 120)	60 (60, 81)	60 (60, 60)	60 (60, 71)	60 (60, 60)	60 (60, 60)	60 (60, 120)	60 (60, 120)	60 (60, 120)	60 (60, 120)	60 (60, 120)	90 (60, 120)	60 (60, 82)
SOURCE
Condom Cath	37,890 (1.2%)	3,977 (1.4%)	22 (0.4%)	2,954 (0.8%)	1,535 (1.1%)	932 (1.1%)	370 (0.4%)	2,292 (0.4%)	31 (0.5%)	0 (0%)	10,733 (1.0%)	7,422 (4.1%)	6,957 (2.3%)	598 (2.9%)	67 (0.2%)
Foley	2,851,891 (90%)	234,442 (82%)	5,430 (92%)	355,674 (95%)	132,961 (95%)	82,596 (94%)	79,095 (95%)	491,557 (96%)	4,787 (81%)	4,941 (75%)	993,097 (88%)	151,583 (84%)	263,074 (86%)	16,042 (77%)	36,612 (97%)
Ileoconduit	6,022 (0.2%)	93 (<0.1%)	0 (0%)	1,037 (0.3%)	53 (<0.1%)	1 (<0.1%)	165 (0.2%)	149 (<0.1%)	0 (0%)	953 (15%)	3,184 (0.3%)	242 (0.1%)	21 (<0.1%)	23 (0.1%)	101 (0.3%)
L Nephrostomy	3,206 (0.1%)	41 (<0.1%)	0 (0%)	500 (0.1%)	33 (<0.1%)	4 (<0.1%)	0 (0%)	0 (0%)	0 (0%)	46 (0.7%)	2,377 (0.2%)	48 (<0.1%)	30 (<0.1%)	105 (0.5%)	22 (<0.1%)
R Nephrostomy	3,465 (0.1%)	193 (<0.1%)	0 (0%)	328 (<0.1%)	7 (<0.1%)	4 (<0.1%)	12 (<0.1%)	74 (<0.1%)	0 (0%)	65 (1.0%)	2,642 (0.2%)	60 (<0.1%)	8 (<0.1%)	72 (0.3%)	0 (0%)
Straight Cath	10,207 (0.3%)	280 (<0.1%)	10 (0.2%)	858 (0.2%)	488 (0.3%)	301 (0.3%)	261 (0.3%)	741 (0.1%)	41 (0.7%)	8 (0.1%)	2,458 (0.2%)	2,179 (1.2%)	2,336 (0.8%)	100 (0.5%)	146 (0.4%)
Suprapubic	9,672 (0.3%)	383 (0.1%)	19 (0.3%)	1,182 (0.3%)	168 (0.1%)	10 (<0.1%)	25 (<0.1%)	16 (<0.1%)	0 (0%)	368 (5.6%)	6,965 (0.6%)	203 (0.1%)	329 (0.1%)	4 (<0.1%)	0 (0%)
Void	248,862 (7.8%)	45,965 (16%)	415 (7.0%)	13,592 (3.6%)	4,471 (3.2%)	4,421 (5.0%)	3,471 (4.2%)	15,906 (3.1%)	1,055 (18%)	181 (2.8%)	101,642 (9.1%)	18,549 (10%)	34,407 (11%)	3,937 (19%)	850 (2.2%)
¹ Median (Q1, Q3); n (%)

Data for single patient example

Data used for the single-patient example.

Raw UO records:

raw_uo_as_character <- raw_uo %>%
  filter(STAY_ID == 36871275)
raw_uo_as_character[] <- lapply(raw_uo_as_character, as.character)

S2_a <- raw_uo_as_character %>%
  select(-SUBJECT_ID, -HADM_ID, -STAY_ID, -SERVICE) %>%
  arrange(., CHARTTIME) %>%
  slice_head(n=15) %>%
  gt() %>%
  fmt_number(use_seps = TRUE, decimals = 0)

S2_a

CHARTTIME	VALUE	ITEMID	LABEL
2144-05-19 17:00:00	50	226559	Foley
2144-05-19 18:00:00	20	226559	Foley
2144-05-19 19:00:00	20	226559	Foley
2144-05-19 19:46:00	150	226564	R Nephrostomy
2144-05-19 20:00:00	20	226559	Foley
2144-05-19 21:00:00	35	226559	Foley
2144-05-19 22:00:00	45	226564	R Nephrostomy
2144-05-19 22:00:00	35	226559	Foley
2144-05-20	23	226559	Foley
2144-05-20 01:00:00	40	226559	Foley
2144-05-20 01:00:00	35	226564	R Nephrostomy
2144-05-20 02:00:00	17	226559	Foley
2144-05-20 03:00:00	22	226559	Foley
2144-05-20 04:00:00	20	226559	Foley
2144-05-20 05:00:00	50	226559	Foley

UO Rates

uo_rate %>%
  filter(STAY_ID == 36871275) %>%
  select(-HADM_ID, -STAY_ID, -WEIGHT_ADMIT, -SERVICE) %>%
  arrange(., CHARTTIME) %>%
  slice_head(n=20) %>%
  gt() %>%
  fmt_number(use_seps = TRUE, decimals = 0)

SOURCE	VALUE	CHARTTIME	LAST_CHARTTIME	TIME_INTERVAL	HOURLY_RATE
Foley	20	2144-05-19 18:00:00	2144-05-19 17:00:00	60	20
Foley	20	2144-05-19 19:00:00	2144-05-19 18:00:00	60	20
Foley	20	2144-05-19 20:00:00	2144-05-19 19:00:00	60	20
Foley	35	2144-05-19 21:00:00	2144-05-19 20:00:00	60	35
Foley	35	2144-05-19 22:00:00	2144-05-19 21:00:00	60	35
R Nephrostomy	45	2144-05-19 22:00:00	2144-05-19 19:46:00	134	20
Foley	23	2144-05-20	2144-05-19 22:00:00	120	12
Foley	40	2144-05-20 01:00:00	2144-05-20	60	40
R Nephrostomy	35	2144-05-20 01:00:00	2144-05-19 22:00:00	180	12
Foley	17	2144-05-20 02:00:00	2144-05-20 01:00:00	60	17
Foley	22	2144-05-20 03:00:00	2144-05-20 02:00:00	60	22
Foley	20	2144-05-20 04:00:00	2144-05-20 03:00:00	60	20
Foley	50	2144-05-20 05:00:00	2144-05-20 04:00:00	60	50
Foley	50	2144-05-20 06:00:00	2144-05-20 05:00:00	60	50
Foley	135	2144-05-20 08:00:00	2144-05-20 06:00:00	120	68
Foley	65	2144-05-20 10:00:00	2144-05-20 08:00:00	120	32
Foley	60	2144-05-20 12:00:00	2144-05-20 10:00:00	120	30
R Nephrostomy	100	2144-05-20 12:00:00	2144-05-20 01:00:00	660	9
Foley	15	2144-05-20 13:00:00	2144-05-20 12:00:00	60	15
Foley	40	2144-05-20 14:00:00	2144-05-20 13:00:00	60	40
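
The `uo_rate` table itself is built upstream and its derivation is not shown in this section. The sketch below illustrates how the `LAST_CHARTTIME`, `TIME_INTERVAL` and `HOURLY_RATE` columns appear to relate to the raw records. It assumes that each record is paired with the previous record from the same source and that the rate is the volume divided by the elapsed minutes, scaled to one hour. The object names `raw_example` and `rate_example` are hypothetical; the values are copied from the first raw rows above.

```r
library(dplyr)

# First raw records of the example stay, copied from the table above
raw_example <- data.frame(
  CHARTTIME = as.POSIXct(
    c("2144-05-19 17:00:00", "2144-05-19 18:00:00", "2144-05-19 19:00:00",
      "2144-05-19 19:46:00", "2144-05-19 20:00:00", "2144-05-19 21:00:00",
      "2144-05-19 22:00:00", "2144-05-19 22:00:00"),
    tz = "UTC"
  ),
  VALUE = c(50, 20, 20, 150, 20, 35, 45, 35),
  LABEL = c("Foley", "Foley", "Foley", "R Nephrostomy",
            "Foley", "Foley", "R Nephrostomy", "Foley")
)

rate_example <- raw_example %>%
  group_by(LABEL) %>%
  arrange(CHARTTIME, .by_group = TRUE) %>%
  mutate(
    # Previous record of the same source marks the start of the collection period
    LAST_CHARTTIME = lag(CHARTTIME),
    TIME_INTERVAL  = as.numeric(difftime(CHARTTIME, LAST_CHARTTIME, units = "mins")),
    # Assumed definition: volume spread evenly over the period, in ml/hr
    HOURLY_RATE    = VALUE / TIME_INTERVAL * 60
  ) %>%
  ungroup() %>%
  # The first record of each source has no preceding record and therefore no rate
  filter(!is.na(TIME_INTERVAL))

rate_example
```

Under these assumptions the R Nephrostomy record charted at 22:00 gets a 134-minute collection period and a rate of about 20 ml/hr, in line with the corresponding row of the table above.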

Hourly-Adjusted UO

hourly_uo %>%
  filter(STAY_ID == 36871275) %>%
  select(-STAY_ID, -WEIGHT_ADMIT) %>%
  arrange(., T_PLUS) %>%
  slice_head(n=20) %>%
  gt() %>%
  fmt_number(use_seps = TRUE, decimals = 0)

T_PLUS	TIME_INTERVAL_STARTS	TIME_INTERVAL_FINISH	HOURLY_WEIGHTED_MEAN_RATE	SIMPLE_SUM
1	2144-05-19 17:00:00	2144-05-19 18:00:00	20	20
2	2144-05-19 18:00:00	2144-05-19 19:00:00	20	20
3	2144-05-19 19:00:00	2144-05-19 20:00:00	25	170
4	2144-05-19 20:00:00	2144-05-19 21:00:00	55	35
5	2144-05-19 21:00:00	2144-05-19 22:00:00	55	80
6	2144-05-19 22:00:00	2144-05-19 23:00:00	23	0
7	2144-05-19 23:00:00	2144-05-20	23	23
8	2144-05-20	2144-05-20 01:00:00	52	75
9	2144-05-20 01:00:00	2144-05-20 02:00:00	26	17
10	2144-05-20 02:00:00	2144-05-20 03:00:00	31	22
11	2144-05-20 03:00:00	2144-05-20 04:00:00	29	20
12	2144-05-20 04:00:00	2144-05-20 05:00:00	59	50
13	2144-05-20 05:00:00	2144-05-20 06:00:00	59	50
14	2144-05-20 06:00:00	2144-05-20 07:00:00	77	0
15	2144-05-20 07:00:00	2144-05-20 08:00:00	77	135
16	2144-05-20 08:00:00	2144-05-20 09:00:00	42	0
17	2144-05-20 09:00:00	2144-05-20 10:00:00	42	65
18	2144-05-20 10:00:00	2144-05-20 11:00:00	39	0
19	2144-05-20 11:00:00	2144-05-20 12:00:00	39	160
20	2144-05-20 12:00:00	2144-05-20 13:00:00	33	15
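
The `hourly_uo` table is also built upstream. As a hedged illustration of how `HOURLY_WEIGHTED_MEAN_RATE` can be obtained from the rates above, the sketch below spreads each record's volume uniformly over its collection period and sums, across sources, the part that overlaps a given hour. The helper names `to_sec()` and `hourly_adjusted_uo()` are hypothetical, and the uniform-spread assumption is ours.

```r
# Convert a timestamp string to seconds since the epoch (UTC)
to_sec <- function(x) as.numeric(as.POSIXct(x, tz = "UTC"))

# Three collection periods copied from the UO Rates table above
intervals <- data.frame(
  SOURCE = c("Foley", "Foley", "R Nephrostomy"),
  start  = to_sec(c("2144-05-19 19:00:00", "2144-05-19 20:00:00", "2144-05-19 19:46:00")),
  end    = to_sec(c("2144-05-19 20:00:00", "2144-05-19 21:00:00", "2144-05-19 22:00:00")),
  VALUE  = c(20, 35, 45)
)
# Assumption: urine is produced at a constant rate within each collection period
intervals$ml_per_sec <- intervals$VALUE / (intervals$end - intervals$start)

# Hypothetical helper: volume attributed to the hour starting at `hour_start`
hourly_adjusted_uo <- function(hour_start, intervals) {
  hour_start <- to_sec(hour_start)
  hour_end <- hour_start + 3600
  overlap_sec <- pmax(0, pmin(intervals$end, hour_end) - pmax(intervals$start, hour_start))
  sum(intervals$ml_per_sec * overlap_sec)
}

hour_3 <- hourly_adjusted_uo("2144-05-19 19:00:00", intervals)
hour_4 <- hourly_adjusted_uo("2144-05-19 20:00:00", intervals)
round(c(hour_3, hour_4))
```

The two results round to 25 and 55 ml, matching the `T_PLUS` 3 and 4 rows above. The simple sum for the same hours (170 and 35 ml) instead attributes each charted volume entirely to the hour in which it was charted.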

Raw data analysis

Collection Periods

S6_a <- uo_rate %>% group_by(SOURCE) %>%
   dplyr::summarise(N = n(),
                   Mean = round(mean(TIME_INTERVAL),0),
                   SD = round(sd(TIME_INTERVAL),0),
                   '5th' = round(quantile(TIME_INTERVAL, 0.05),0),
                   '10th' = round(quantile(TIME_INTERVAL, 0.1),0),
                   '25th' = round(quantile(TIME_INTERVAL, 0.25),0),
                   '50th' = round(quantile(TIME_INTERVAL, 0.50),0),
                   '75th' = round(quantile(TIME_INTERVAL, 0.75),0),
                   '95th' = round(quantile(TIME_INTERVAL, 0.95),0),
                   Min = round(min(TIME_INTERVAL),0),
                   Max = round(max(TIME_INTERVAL),0)
  ) %>% 
  arrange(desc(N)) %>% 
  gt() %>%
  fmt_number(use_seps = TRUE, decimals = 0)

S6_a

SOURCE	N	Mean	SD	5th	10th	25th	50th	75th	95th	Min	Max
Foley	2,851,891	83	97	60	60	60	60	105	180	1	43,235
Void	248,862	232	375	60	60	119	180	287	585	1	51,846
Condom Cath	37,890	222	304	60	60	90	131	240	600	1	8,904
Straight Cath	10,207	732	1,486	60	122	300	409	617	2,547	1	35,375
Suprapubic	9,672	115	131	60	60	60	60	120	240	1	6,616
Ileoconduit	6,022	129	213	60	60	60	68	120	300	1	8,040
R Nephrostomy	3,465	223	243	60	60	120	180	240	635	2	7,740
L Nephrostomy	3,206	251	298	60	60	120	180	300	660	2	7,560

S6_b <- ggplot(data = uo_rate, aes(x = TIME_INTERVAL / 60)) +
  geom_histogram(binwidth = 1) +
  facet_wrap(~factor(SOURCE, levels=c('Foley', 'Suprapubic', 'Ileoconduit',
                         'Void', 'Condom Cath', 'Straight Cath',
                         'R Nephrostomy', 'L Nephrostomy')), scales = "free") +
  xlim(-1, 20) +
  labs(
          x = "Time interval (hr)",
          y = "Frequency"
        ) 

S6_b

Volumes and Collection Periods

S7_a <- uo_rate %>% group_by(SOURCE) %>%
  dplyr::summarise(N = n(),
                   Mean = round(mean(VALUE),0),
                   SD = round(sd(VALUE),0),
                   '5th' = round(quantile(VALUE, 0.05),0),
                   '10th' = round(quantile(VALUE, 0.1),0),
                   '25th' = round(quantile(VALUE, 0.25),0),
                   '50th' = round(quantile(VALUE, 0.50),0),
                   '75th' = round(quantile(VALUE, 0.75),0),
                   '95th' = round(quantile(VALUE, 0.95),0),
                   Min = round(min(VALUE),0),
                   Max = round(max(VALUE),0)
  ) %>% 
  arrange(desc(N)) %>% 
  gt() %>%
  fmt_number(use_seps = TRUE, decimals = 0)

S7_a

SOURCE	N	Mean	SD	5th	10th	25th	50th	75th	95th	Max
Foley	2,851,891	118	124	15	25	40	80	150	350	4,385
Void	248,862	299	195	50	100	160	250	400	700	4,500
Condom Cath	37,890	246	205	30	50	100	200	320	625	2,900
Straight Cath	10,207	497	271	50	150	320	500	650	1,000	2,550
Suprapubic	9,672	126	140	12	25	45	90	150	350	2,050
Ileoconduit	6,022	148	149	15	30	50	100	200	400	2,500
R Nephrostomy	3,465	157	144	10	20	50	100	220	450	1,200
L Nephrostomy	3,206	167	145	10	25	50	125	250	450	1,150

S7_b <- ggplot(data = uo_rate, aes(x = VALUE)) +
  facet_wrap(~factor(SOURCE, levels=c('Foley', 'Suprapubic', 'Ileoconduit',
                         'Void', 'Condom Cath', 'Straight Cath',
                         'R Nephrostomy', 'L Nephrostomy')), scales = "free") +
  geom_histogram(binwidth = 50) +
  xlim(-25, 1100) +
  labs(
        # title = "Volumes",
        x = "Volume (ml)",
        y = "Frequency"
      ) +
      theme(
        plot.title = element_text(color = "#0099F8", size = 16, face = "bold"),
        plot.subtitle = element_text(size = 10, face = "bold"),
        plot.caption = element_text(face = "italic")
      )

S7_b

Records of zero volume

Proportion of zero-value UO measurements:

uo_rate_count <- uo_rate %>% 
  count(SOURCE, sort = TRUE)

uo_rate_0_count <- uo_rate %>% 
  filter(VALUE == 0) %>% 
  count(SOURCE, sort = TRUE)

count_uo_zero_vs_all <- left_join(uo_rate_count, 
                                  uo_rate_0_count, by = "SOURCE") %>% 
  mutate(PROPORTION = n.y / n.x) %>%
  pivot_longer(cols = n.y:n.x, names_to = "type")
  
S7_c <- count_uo_zero_vs_all %>%
  ggplot(aes(x=reorder(SOURCE, -value), y=value, fill=type)) +
    geom_bar(position="fill", stat="identity") +
    xlab("") +
    ylab("") +
    scale_fill_brewer(palette="Paired") +  
    geom_text(aes(label=ifelse(type == "n.y", paste0((round(PROPORTION, 3) * 100), "%"), "")), 
          color="black", 
          size=3, 
          vjust=-1,
          position="fill") +
    theme_minimal() +
    theme(axis.text.y=element_blank()) +
    theme(legend.position="none") +
    theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
     labs(
         # title = "Proportion of zero value raw output count",
       )
S7_c

uo_rate_0 <- uo_rate %>% filter(VALUE == 0) 
S7_d <- uo_rate_0 %>% group_by(SOURCE) %>%
  dplyr::summarise(N = n(),
                   Mean = round(mean(TIME_INTERVAL),0),
                   SD = round(sd(TIME_INTERVAL),0),
                   '5th' = round(quantile(TIME_INTERVAL, 0.05),0),
                   '10th' = round(quantile(TIME_INTERVAL, 0.1),0),
                   '25th' = round(quantile(TIME_INTERVAL, 0.25),0),
                   '50th' = round(quantile(TIME_INTERVAL, 0.50),0),
                   '75th' = round(quantile(TIME_INTERVAL, 0.75),0),
                   '95th' = round(quantile(TIME_INTERVAL, 0.95),0),
                   Min = round(min(TIME_INTERVAL),0),
                   Max = round(max(TIME_INTERVAL),0)
  ) %>% 
  arrange(desc(N)) %>% 
  gt() %>%
  fmt_number(use_seps = TRUE, decimals = 0)

S7_d

SOURCE	N	Mean	SD	5th	10th	25th	50th	75th	95th	Min	Max
Foley	31,574	103	232	45	60	60	60	104	240	1	15,007
Void	2,609	412	1,078	39	60	60	180	300	1,440	1	18,747
Condom Cath	651	194	388	51	60	60	120	240	516	1	5,952
Suprapubic	213	222	230	33	60	60	210	240	528	1	2,189
Straight Cath	101	1,013	1,859	60	120	180	303	858	3,508	13	14,427
Ileoconduit	89	216	292	60	60	60	120	240	699	15	1,680
R Nephrostomy	85	275	873	60	60	60	120	180	516	13	7,740
L Nephrostomy	70	258	892	60	60	60	120	220	360	60	7,560

Adjusting for hourly UO

UO Rate

S8_a <- uo_rate %>% group_by(SOURCE) %>%
    dplyr::summarise(N = n(),
                   Mean = round(mean(HOURLY_RATE),0),
                   SD = round(sd(HOURLY_RATE),0),
                   '5th' = round(quantile(HOURLY_RATE, 0.05),0),
                   '10th' = round(quantile(HOURLY_RATE, 0.1),0),
                   '25th' = round(quantile(HOURLY_RATE, 0.25),0),
                   '50th' = round(quantile(HOURLY_RATE, 0.50),0),
                   '75th' = round(quantile(HOURLY_RATE, 0.75),0),
                   '95th' = round(quantile(HOURLY_RATE, 0.95),0),
                   Min = round(min(HOURLY_RATE),0),
                   Max = round(max(HOURLY_RATE),0)
  ) %>% 
  arrange(desc(N)) %>% 
  gt() %>%
  fmt_number(use_seps = TRUE, decimals = 0)

S8_a

SOURCE	N	Mean	SD	5th	10th	25th	50th	75th	95th	Max
Foley	2,851,891	105	276	11	20	35	62	120	300	108,000
Void	248,862	159	639	17	25	50	90	160	400	57,000
Condom Cath	37,890	117	309	10	20	40	75	129	325	24,000
Straight Cath	10,207	195	1,786	2	11	40	69	114	378	114,000
Suprapubic	9,672	89	151	5	15	32	60	100	250	6,000
Ileoconduit	6,022	99	179	7	17	38	67	120	262	9,000
R Nephrostomy	3,465	65	138	3	7	17	40	75	200	6,000
L Nephrostomy	3,206	66	300	3	7	18	38	75	200	16,500

S8_b <- ggplot(data = uo_rate, aes(x = HOURLY_RATE)) +
  geom_histogram(binwidth = 20) +
  facet_wrap(~factor(SOURCE, levels=c('Foley', 'Suprapubic', 'Ileoconduit',
                         'Void', 'Condom Cath', 'Straight Cath',
                         'R Nephrostomy', 'L Nephrostomy')), scales = "free") +  xlim(-10, 500) +
  labs(
        # title = "UO Rates",
        # subtitle = "by source",
        x = "Rate (ml/hr)",
        y = "Frequency"
      ) +
      theme(
        plot.title = element_text(color = "#0099F8", size = 16, face = "bold"),
        plot.subtitle = element_text(size = 10, face = "bold"),
        plot.caption = element_text(face = "italic")
      )

S8_b

Low UO Rate Analysis

Association between UO rates and collection periods for Foley catheter records, shown as smoothed conditional means.

S8_d <- uo_rate %>%
  filter(SOURCE == "Foley",
         HOURLY_RATE < 500) %>%
  # slice_head(n = 50000) %>%
  ggplot(aes(x = HOURLY_RATE, y = TIME_INTERVAL)) +
  geom_smooth(se = TRUE, alpha = 0.1, linewidth = 1) +
  # geom_smooth(
  #   se = FALSE,
  #   method = "lm",
  #   linetype = "dashed",
  #   color = "red", 
  #   linewidth = 0.3
  # ) +
  geom_hline(yintercept = 60,
             linewidth = 0.3,
             color = "#cccccc") +
  geom_vline(
    xintercept = 20,
    linewidth = 0.3,
    color = "black",
    linetype = "dotdash"
  ) +
  scale_x_continuous(breaks = c(0, 20, 50, 100, 200, 300, 400, 500)) +
  scale_y_continuous(breaks = c(0, 60, 100, 200)) +
  coord_cartesian(xlim = c(0, 500), ylim = c(0, 200)) +
  labs(x = "Urine output rate (ml/hr)", y = "Collection periods (min)") +
  theme_classic()

S8_d

Quantile analysis for collection periods as a function of rates.

uo_rate_qreg <- uo_rate %>%
  left_join(anchor_year, by = "HADM_ID") %>%
  filter(SOURCE == "Foley",
         HOURLY_RATE < 500,
         ANCHOR_START > 2016) %>%
  slice_head(n = 500000)

#### Quantile
quantile_reg <- rq(TIME_INTERVAL ~
                     HOURLY_RATE,
                   seq(0.10, 0.90, by = 0.10),
                   # c(.05, .1, .25, .5, .75, .90, .95),
                   data = uo_rate_qreg)

# summary(quantile_reg, se = "iid") %>% 
#   plot()

### OLS
lm <- lm(data=uo_rate_qreg,
         formula =  TIME_INTERVAL ~
           HOURLY_RATE)

ols <- as.data.frame(coef(lm))
ols.ci <- as.data.frame(confint(lm, level = 0.95))
ols2 <- cbind(ols, ols.ci)
ols2 <- tibble::rownames_to_column(ols2, var="term")



#### Quantile
S8_e <- quantile_reg %>%
  tidy(se.type = "iid", conf.int = TRUE, conf.level = 0.95) %>%
  filter(!grepl("factor", term)) %>%
  ggplot(aes(x=tau,y=estimate)) +
  theme_classic() +
  theme(
    strip.background = element_blank(),
    #strip.text.x = element_blank()
  ) +
  scale_y_continuous(limits = symmetric_limits) +
  scale_x_continuous(breaks = scales::pretty_breaks(n = 12)) +
  ##### quantile results
  geom_point(color="#27408b", size = 0.3)+ 
  geom_line(color="black", linetype = "dotdash", linewidth = 0.3)+ 
  geom_ribbon(aes(ymin=conf.low,ymax=conf.high),alpha=0.25, fill="#555555")+
  facet_wrap(~term, scales="free", ncol=1)+
  ##### OLS results
  geom_hline(data = ols2, aes(yintercept= `coef(lm)`), lty=1, color="red", linewidth=0.3)+
  geom_hline(data = ols2, aes(yintercept= `2.5 %`), lty=2, color="red", linewidth=0.3)+
  geom_hline(data = ols2, aes(yintercept= `97.5 %`), lty=2, color="red", linewidth=0.3)+
  #### Lines
   geom_hline(yintercept = 0, linewidth=0.3) 

S8_e

# Visualization for Quantile Regression with some tau values: 
intercept_slope <- quantile_reg %>% 
  coef() %>% 
  t() %>% 
  data.frame() %>% 
  rename(intercept = X.Intercept., slope = HOURLY_RATE) %>% 
  mutate(quantile = row.names(.))
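
The `rename(intercept = X.Intercept., ...)` step relies on how `data.frame()` sanitises column names. A small base R sketch of that behaviour follows; the coefficient values are invented, and the `tau= ...` labels are only assumed to resemble what `coef()` returns for a multi-tau `rq` fit.

```r
# Invented coefficient matrix shaped like a multi-tau coefficient table:
# one row per term, one column per tau
coef_matrix <- matrix(
  c(60, -0.05,
    75, -0.02),
  nrow = 2,
  dimnames = list(c("(Intercept)", "HOURLY_RATE"), c("tau= 0.1", "tau= 0.5"))
)

# Transposing gives one row per tau; data.frame() then applies make.names()
# to the column names, so "(Intercept)" becomes "X.Intercept."
coef_df <- data.frame(t(coef_matrix))
names(coef_df)
rownames(coef_df)
```

The row names are left untouched, which is why `row.names(.)` can be used afterwards to label each line by its quantile.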


S8_f <-
  ggplot() +
  geom_jitter(data = uo_rate_qreg, aes(HOURLY_RATE, TIME_INTERVAL),
    alpha = 0.2,
    size = 0.5,
    stroke = 0.5,
    width = 2,
    height = 2
  ) +
  geom_abline(data = intercept_slope, aes(
    intercept = intercept,
    slope = slope,
    color = quantile
  ),
  linewidth=1) +
  theme_minimal() +
  labs(x = "Urine output rate (ml/hr)", y = "Collection periods (min)") +
  coord_cartesian(xlim = c(0, 500), ylim = c(0, 500))

S8_f

uo_rate_qreg <- uo_rate %>%
  filter(SOURCE == "Foley") %>%
  arrange(STAY_ID) %>%
  slice_head(n = 500000)

percentile <- ecdf(uo_rate_qreg$TIME_INTERVAL)

#### Quantile
quantile_reg2 <- rq(TIME_INTERVAL ~ HOURLY_RATE, 
                    # seq(0.20, 0.80, by = 0.10), 
                    c(percentile(30),
                      percentile(60) - ((1-percentile(60))/10),
                      percentile(90),
                      percentile(120) - ((1-percentile(120))/10),
                      percentile(150),
                      percentile(180) - ((1-percentile(180))/10),
                      percentile(210),
                      percentile(240) - ((1-percentile(240))/10)),
                    data=uo_rate_qreg)
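
In the call above, each tau is read off the empirical distribution of collection periods rather than chosen on a fixed grid. The base R sketch below shows what `ecdf()` returns and how the subtracted term shifts a tau slightly downward. The toy vector is invented, and the reason for the shift is not stated in the source; it presumably keeps the tau just below the large mass of records at round collection times.

```r
# Invented collection periods in minutes, with a cluster at 60
toy_intervals <- c(30, 60, 60, 60, 60, 90, 120, 120, 180, 240)

# ecdf() returns a function giving the share of values <= its argument
percentile <- ecdf(toy_intervals)

tau_30     <- percentile(30)                      # share of periods <= 30 min
tau_60_raw <- percentile(60)                      # share of periods <= 60 min
tau_60     <- tau_60_raw - (1 - tau_60_raw) / 10  # same adjustment as in the call above

c(tau_30 = tau_30, tau_60_raw = tau_60_raw, tau_60 = tau_60)
```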

# Visualization for Quantile Regression with some tau values: 
intercept_slope <- quantile_reg2 %>% 
  coef() %>% 
  t() %>% 
  data.frame() %>% 
  rename(intercept = X.Intercept., slope = HOURLY_RATE) %>% 
  mutate(quantile = row.names(.))


ggplot() + 
  geom_point(data = uo_rate_qreg, aes(HOURLY_RATE, TIME_INTERVAL), 
             alpha = 0.5) + 
  geom_abline(data = intercept_slope, aes(intercept = intercept, slope = slope, color = quantile)) + 
  theme_minimal() + 
  labs(x = "HOURLY_RATE", y = "TIME_INTERVAL", 
       title = "Quantile regression at tau values derived from the ECDF of collection periods", 
       caption = "Data source: MIMIC-IV Foley records (first 500,000 rows by STAY_ID)") +
  coord_cartesian(xlim = c(0, 1000), ylim = c(0, 300))

Collection periods for UO rates of 20 ml/hr or below

uo_rate %>% 
  filter(HOURLY_RATE <= 20) %>%
  group_by(SOURCE) %>%
   dplyr::summarise(N = n(),
                   Mean = round(mean(TIME_INTERVAL),0),
                   SD = round(sd(TIME_INTERVAL),0),
                   '5th' = round(quantile(TIME_INTERVAL, 0.05),0),
                   '10th' = round(quantile(TIME_INTERVAL, 0.1),0),
                   '25th' = round(quantile(TIME_INTERVAL, 0.25),0),
                   '50th' = round(quantile(TIME_INTERVAL, 0.50),0),
                   '75th' = round(quantile(TIME_INTERVAL, 0.75),0),
                   '95th' = round(quantile(TIME_INTERVAL, 0.95),0),
                   Min = round(min(TIME_INTERVAL),0),
                   Max = round(max(TIME_INTERVAL),0)
  ) %>% 
  arrange(desc(N)) %>% 
  gt() %>%
  fmt_number(use_seps = TRUE, decimals = 0)

SOURCE	N	Mean	SD	5th	10th	25th	50th	75th	95th	Min	Max
Foley	316,355	120	255	60	60	60	60	120	300	1	43,235
Void	16,971	635	1,247	60	120	226	381	700	1,748	1	51,846
Condom Cath	4,191	492	736	60	60	125	240	536	1,620	1	8,904
Straight Cath	1,380	2,761	3,343	240	390	900	1,750	3,191	7,964	13	35,375
Suprapubic	1,355	216	291	60	60	60	120	240	695	1	6,616
R Nephrostomy	1,055	321	377	60	60	120	240	361	850	13	7,740
L Nephrostomy	939	370	489	60	120	120	240	420	923	34	7,560
Ileoconduit	759	280	537	60	60	60	120	240	948	15	8,040

uo_rate %>% 
  filter(HOURLY_RATE <= 20) %>%
ggplot(aes(x = TIME_INTERVAL / 60)) +
  geom_histogram(binwidth = 1) +
  facet_wrap(~factor(SOURCE, levels=c('Foley', 'Suprapubic', 'Ileoconduit',
                         'Void', 'Condom Cath', 'Straight Cath',
                         'R Nephrostomy', 'L Nephrostomy')), scales = "free") +  xlim(-1, 20) +
  labs(
          title = "Collection periods for UO rate 20ml/hr or below",
          subtitle = "by source",
          x = "Time interval (hr)",
          y = "Frequency"
        ) +
        theme(
          plot.title = element_text(color = "#0099F8", size = 16, face = "bold"),
          plot.subtitle = element_text(size = 10, face = "bold"),
          plot.caption = element_text(face = "italic")
        )

Mean Rate

Mean UO rate weighted by time and grouped by source:

S8_c <- uo_rate %>% 
  group_by(SOURCE) %>%
  summarise(weighted_mean_rate = weighted.mean(HOURLY_RATE, TIME_INTERVAL)) %>%
  gt() %>%
  fmt_number(use_seps = TRUE, decimals = 1) %>%
  cols_label(
    SOURCE = "Source",
    weighted_mean_rate = "Weighted mean rate (ml/hr)"
  )

S8_c

Source	Weighted mean rate (ml/hr)
Condom Cath	66.3
Foley	85.1
Ileoconduit	68.8
L Nephrostomy	39.9
R Nephrostomy	42.1
Straight Cath	40.8
Suprapubic	66.0
Void	77.3
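
Weighting each rate by its collection period is equivalent to dividing the total volume by the total collection time. The base R sketch below checks this on the three R Nephrostomy records from the single-patient example, assuming `HOURLY_RATE` is volume divided by minutes and scaled to one hour; the variable names are hypothetical.

```r
# Volumes (ml) and collection periods (min) of three R Nephrostomy records
volume_ml    <- c(45, 35, 100)
interval_min <- c(134, 180, 660)

# Assumed definition of the per-record rate, in ml/hr
hourly_rate <- volume_ml / interval_min * 60

time_weighted <- weighted.mean(hourly_rate, interval_min)   # weighted by time
pooled        <- sum(volume_ml) / sum(interval_min) * 60    # total volume / total time
unweighted    <- mean(hourly_rate)                          # every record counts equally

c(time_weighted = time_weighted, pooled = pooled, unweighted = unweighted)
```

The unweighted mean gives short collection periods the same influence as long ones, which is why it differs from the time-weighted value.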

Hourly-adjusted UO

S9_a <- hourly_uo %>% drop_na(HOURLY_WEIGHTED_MEAN_RATE) %>%
    dplyr::summarise(N = n(),
                   Mean = round(mean(HOURLY_WEIGHTED_MEAN_RATE),0),
                   SD = round(sd(HOURLY_WEIGHTED_MEAN_RATE),0),
                   '5th' = round(quantile(HOURLY_WEIGHTED_MEAN_RATE, 0.05),0),
                   '10th' = round(quantile(HOURLY_WEIGHTED_MEAN_RATE, 0.1),0),
                   '25th' = round(quantile(HOURLY_WEIGHTED_MEAN_RATE, 0.25),0),
                   '50th' = round(quantile(HOURLY_WEIGHTED_MEAN_RATE, 0.50),0),
                   '75th' = round(quantile(HOURLY_WEIGHTED_MEAN_RATE, 0.75),0),
                   '95th' = round(quantile(HOURLY_WEIGHTED_MEAN_RATE, 0.95),0),
                   Min = round(min(HOURLY_WEIGHTED_MEAN_RATE),0),
                   Max = round(max(HOURLY_WEIGHTED_MEAN_RATE),0)
  ) %>% 
  arrange(desc(N)) %>% 
  gt() %>%
  fmt_number(use_seps = TRUE, decimals = 0)

S9_a

N	Mean	SD	5th	10th	25th	50th	75th	95th	Min	Max
5,211,377	82	92	3	10	30	55	100	250	0	4,098

S9_b <- hourly_uo %>% drop_na(HOURLY_WEIGHTED_MEAN_RATE) %>%
ggplot(aes(x = HOURLY_WEIGHTED_MEAN_RATE)) +
  geom_histogram(binwidth = 25) +
  xlim(-10, 500) + 
  labs(
        # title = "Hourly-Adjusted UO",
        x = "Hourly UO (ml)",
        y = "Frequency"
      )

S9_b

Simple Sum Comparison

Proportion of hours in which the hourly-adjusted UO differs from the simple hourly sum by less than a given cut-off (e.g., 100 ml):

adj_uo_diff <- hourly_uo %>%
  select(HOURLY_WEIGHTED_MEAN_RATE, SIMPLE_SUM) %>%
  filter(!is.na(HOURLY_WEIGHTED_MEAN_RATE)) %>%
  mutate(hourly_diff = abs(HOURLY_WEIGHTED_MEAN_RATE - SIMPLE_SUM)) %>%
  mutate(
    cutoff_10 = if_else(hourly_diff < 10, 1, 0),
    cutoff_50 = if_else(hourly_diff < 50, 1, 0),
    cutoff_100 = if_else(hourly_diff < 100, 1, 0),
    cutoff_150 = if_else(hourly_diff < 150, 1, 0),
    cutoff_200 = if_else(hourly_diff < 200, 1, 0)
  )

my_order <- c("<10", "<50", "<100", "<150", "<200")

S11 <- adj_uo_diff %>%
  select(cutoff_10,
         cutoff_50,
         cutoff_100,
         cutoff_150,
         cutoff_200) %>%
  pivot_longer(cols = contains("cutoff")) %>%
  transmute(name = case_when(
    name == "cutoff_10" ~ "<10",
    name == "cutoff_50" ~ "<50",
    name == "cutoff_100" ~ "<100",
    name == "cutoff_150" ~ "<150",
    name == "cutoff_200" ~ "<200"
  ),
  value) %>%
  group_by(name) %>%
  summarise(agreement = paste0(round(mean(value) * 100, 1), "%"),
            non_agreement = paste0(round((1 - mean(
              value
            )) * 100, 1), "%")) %>%
  arrange(match(name, my_order)) %>%
  gt() %>%
  # tab_header(
  #   title = md("**Comparison of Hourly-Adjusted UO and Simple Summation**"),
  # ) %>%
  cols_label(
    name = "Cut-off (ml)",
    agreement = "Proportion of Agreement",
    non_agreement = "Proportion of Disagreement"
  ) %>%
  cols_align(
    align = "center"
  ) %>%
  tab_source_note(source_note = "The table demonstrates the significance of hourly adjustment for accuracy by presenting the variance between the adjusted values and the simple hourly summation. Cut-off values are based on the absolute difference between the hourly-adjusted UO and a simple hourly summation of UO. Measurements charted on the hour were included with the previous time interval. ")

adj_uo_diff <- hourly_uo %>%
  select(HOURLY_WEIGHTED_MEAN_RATE, SIMPLE_SUM) %>%
  filter(!is.na(HOURLY_WEIGHTED_MEAN_RATE)) %>%
  mutate(no_diff = 
           ifelse((is.na(HOURLY_WEIGHTED_MEAN_RATE) &
                  is.na(SIMPLE_SUM)) |
             (!is.na(HOURLY_WEIGHTED_MEAN_RATE) &
                  !is.na(SIMPLE_SUM) &
                    abs(HOURLY_WEIGHTED_MEAN_RATE-SIMPLE_SUM) < 100), 
                  1, 
                  0),
         .keep = "none")

S11

Cut-off (ml)	Proportion of Agreement	Proportion of Disagreement
<10	45.4%	54.6%
<50	66.6%	33.4%
<100	84.2%	15.8%
<150	91.7%	8.3%
<200	95.2%	4.8%
The table demonstrates the significance of hourly adjustment for accuracy by presenting the variance between the adjusted values and the simple hourly summation. Cut-off values are based on the absolute difference between the hourly-adjusted UO and a simple hourly summation of UO. Measurements charted on the hour were included with the previous time interval.

mean(adj_uo_diff$no_diff)

## [1] 0.842083
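
The agreement proportions can also be computed in a single pass over the cut-offs. The following base R sketch uses the first eight hours of the single-patient example (values copied from the Hourly-Adjusted UO table) rather than the full cohort, so its results are illustrative only; the variable names are hypothetical.

```r
# First eight hours of the example stay: hourly-adjusted UO vs simple sum (ml)
weighted_rate <- c(20, 20, 25, 55, 55, 23, 23, 52)
simple_sum    <- c(20, 20, 170, 35, 80, 0, 23, 75)

hourly_diff <- abs(weighted_rate - simple_sum)
cutoffs <- c(10, 50, 100, 150, 200)

# Share of hours whose absolute difference is below each cut-off
agreement <- sapply(cutoffs, function(cutoff) mean(hourly_diff < cutoff))
names(agreement) <- paste0("<", cutoffs)

round(agreement * 100, 1)
```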

Hourly UO Per Kilogram

S9_c <- uo_ml_kg_hr %>%
  filter(WEIGHT_ADMIT <= 300,
         WEIGHT_ADMIT >= 25) %>%
  dplyr::summarise(N = n(),
                   Mean = round(mean(ML_KG_HR),2),
                   SD = round(sd(ML_KG_HR),2),
                   '5th' = round(quantile(ML_KG_HR, 0.05),2),
                   '10th' = round(quantile(ML_KG_HR, 0.1),2),
                   '25th' = round(quantile(ML_KG_HR, 0.25),2),
                   '50th' = round(quantile(ML_KG_HR, 0.50),2),
                   '75th' = round(quantile(ML_KG_HR, 0.75),2),
                   '95th' = round(quantile(ML_KG_HR, 0.95),2),
                   Min = round(min(ML_KG_HR),2),
                   Max = round(max(ML_KG_HR),2)
  ) %>% 
  gt() %>%
  fmt_number(use_seps = TRUE, decimals = 2)

S9_c

N	Mean	SD	5th	10th	25th	50th	75th	95th	Min	Max
5,119,874.00	1.05	1.21	0.04	0.13	0.37	0.70	1.30	3.20	0.00	94.86
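
The `uo_ml_kg_hr` table is built upstream. As a sketch of the step summarised above, the example below applies the same 25 to 300 kg weight filter and assumes that `ML_KG_HR` is the hourly-adjusted UO divided by admission weight. The stay identifiers and hourly values are invented; the weights 5,864 kg and 1 kg are the extremes reported in the weight table earlier in this document.

```r
# Invented hourly rows; two of them carry the implausible weights seen in the cohort
toy_hourly <- data.frame(
  STAY_ID = c(1, 1, 2, 3),
  HOURLY_WEIGHTED_MEAN_RATE = c(20, 55, 40, 60),
  WEIGHT_ADMIT = c(80, 80, 5864, 1)
)

# Keep only hours from stays with a plausible admission weight
plausible <- subset(toy_hourly, WEIGHT_ADMIT >= 25 & WEIGHT_ADMIT <= 300)

# Assumed definition of the weight-adjusted rate, in ml/hr/kg
plausible$ML_KG_HR <- plausible$HOURLY_WEIGHTED_MEAN_RATE / plausible$WEIGHT_ADMIT

plausible
```

Without the filter, the 1 kg record would yield 60 ml/hr/kg, which illustrates how a few erroneous weights can dominate the upper tail.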

mean_log <- log(mean(uo_ml_kg_hr$ML_KG_HR))
sd_log <- log(sd(uo_ml_kg_hr$ML_KG_HR))
S9_d <- ggplot() + 
  xlim(0, 2) + 
  geom_histogram(aes(x = ML_KG_HR
                     # , y =..density..
                     ), data=(uo_ml_kg_hr %>%
                                filter(WEIGHT_ADMIT <= 300,
                                       WEIGHT_ADMIT >= 25)), 
                 binwidth = 0.02) + 
  # stat_function(fun = dlnorm, args = list(meanlog = mean_log, sdlog = sd_log, log = FALSE), size=1, color='gray') +
  labs(
        # title = "Hourly-Adjusted UO per Kilogram",
        x = "Hourly volume to kg (ml/hr/kg)",
        y = "Frequency"
      )

S9_d

# save(all_rows_count, 
#      distinct_time_item_patient_rows_count, 
#      S2_a,
#      S3a,
#      S4_a, S4_b, S4_c, S4_d, S4_e, S4_f,
#      S6_a, S6_b,
#      S7_a, S7_b, S7_c, S7_d, 
#      S8_a, S8_b, S8_c, S8_d, S8_e, S8_f,
#      S9_a, S9_b, S9_c, S9_d, 
#      file = "s_data.Rda")

KDIGO Criteria: Comparison of Average-UO, Consecutive-UO and Old (official MIMIC repo derivation)

mimic_kdigo_inter_aki_table <- akis_all_long %>%
  drop_na(prevalnce_admit) %>%
  transmute(
    group = case_when(
      group == "newcons" ~ 'UO-Consecutive',
      group == "newmean" ~ 'UO-Average',
      group == "old" ~ 'Block summation'
    ),
    aki_binary = if_else(max_stage > 0, 1, 0),
    max_stage = if_else(max_stage == 0, NA, max_stage),
    prevalnce_admit
  ) %>%
  tbl_summary(
    by = "group",
    missing = "no",
    digits = everything() ~ c(0, 1),
    label = list(
      aki_binary ~ "Oliguric-AKI on the first days",
      prevalnce_admit ~ "Prevalence at admission",
      max_stage ~ "Maximum KDIGO staging"
    )
  )  %>%
  modify_column_indent(columns = label, rows = c(FALSE, TRUE)) %>%
  modify_column_indent(
    columns = label,
    rows = c(FALSE, FALSE, TRUE, TRUE, TRUE),
    double_indent = TRUE
  ) %>%
  add_p()

mimic_kdigo_inter_aki_table

Characteristic	Block summation N = 46,115¹	UO-Average N = 46,344¹	UO-Consecutive N = 46,344¹	p-value²
Oliguric-AKI on the first days	27,188 (59.0%)	29,385 (63.4%)	22,372 (48.3%)	<0.001
Maximum KDIGO staging				<0.001
1	8,248 (30.3%)	9,204 (31.3%)	11,262 (50.3%)
2	14,830 (54.5%)	15,255 (51.9%)	8,991 (40.2%)
3	4,110 (15.1%)	4,926 (16.8%)	2,119 (9.47%)
Prevalence at admission	7,321 (15.9%)	10,511 (22.7%)	6,388 (13.8%)	<0.001
¹ n (%)
² Pearson’s Chi-squared test

mimic_kdigo_inter_cons_mean <- akis_all_long %>%
  transmute(STAY_ID,
            group,
            first_stage,
            max_stage,
            aki_above_2 = max_stage > 1) %>%
  filter(aki_above_2 == TRUE,
         (group != "old")) %>%
  mutate(group = case_when(
    group == "newcons" ~ 'UO-Consecutive',
    group == "newmean" ~ 'UO-Average',
    group == "old" ~ 'Block summation'
  )) %>%
  left_join(table_1, by = "STAY_ID") %>%
  select(
    group,
    admission_age,
    weight_admit,
    gender,
    race,
    charlson_comorbidity_index,
    ckd,
    dm,
    sofa_first_day,
    sapsii,
    apsiii,
    creat_first,
    creat_peak_72,
    creat_last,
    kdigo_cr_max,
  # scr_baseline,
    hospital_days,
    icu_days,
    rrt_binary,
    hospital_expire_flag
  ) %>%
  tbl_summary(
    by = group,
    type = list(
      c(hospital_expire_flag,
        ckd,
        dm,
        rrt_binary) ~ "dichotomous",
      c(admission_age,
        weight_admit,
        creat_first,
        creat_peak_72,
        creat_last) ~ "continuous"
    ),
    statistic = c(admission_age,
                  weight_admit,
                  creat_first,
                  creat_peak_72,
                  creat_last) ~ "{mean} ({sd})",
    missing = "no",
    label = list(
      admission_age ~ "Age at Hospital Admission, years",
      gender ~ "Gender",
      weight_admit ~ "Weight at ICU Admission, kg",
      charlson_comorbidity_index ~ "CCI Score",
      sofa_first_day ~ "SOFA Score at ICU Admission",
      ckd ~ "CKD, Stage 1-4",
      apsiii ~ "APS-III Score at ICU Admission",
      creat_first ~ "First Creatinine in ICU, mg/dL",
      creat_peak_72 ~ "Peak Creatinine at first days, mg/dL",
      creat_last ~ "ICU Discharge Creatinine, mg/dL",
      kdigo_cr_max ~ "Peak KDIGO-Cr at first days",
      race ~ "Ethnicity",
      icu_days ~ "Time in ICU, days",
      hospital_days ~ "Time in hospital, days",
      rrt_binary ~ "Renal replacement therapy",
      hospital_expire_flag ~ "Hospital Mortality",
      dm ~ "Diabetes Mellitus",
      sapsii ~ "SAPS-II at ICU Admission"
    )
  ) %>%
  add_p() %>%
  add_stat(
    fns = everything() ~ add_by_n
  ) %>%
  modify_header(starts_with("add_n_stat") ~ "**N**") %>%
  modify_table_body(
    ~ .x %>%
      dplyr::relocate(add_n_stat_1, .before = stat_1) %>%
      dplyr::relocate(add_n_stat_2, .before = stat_2)
  )

mimic_kdigo_inter_mean_old <- akis_all_long %>%
  transmute(STAY_ID,
            group,
            first_stage,
            max_stage,
            aki_above_2 = max_stage > 1) %>%
  filter(aki_above_2 == TRUE,
         (group != "newcons")) %>%
  mutate(group = case_when(
    group == "newcons" ~ 'UO-Consecutive',
    group == "newmean" ~ 'UO-Average',
    group == "old" ~ 'Block summation'
  )) %>%
  left_join(table_1, by = "STAY_ID") %>%
  select(
    group,
    admission_age,
    weight_admit,
    gender,
    race,
    charlson_comorbidity_index,
    ckd,
    dm,
    sofa_first_day,
    sapsii,
    apsiii,
    creat_first,
    creat_peak_72,
    creat_last,
    kdigo_cr_max,
  # scr_baseline,
    hospital_days,
    icu_days,
    rrt_binary,
    hospital_expire_flag
  ) %>%
  tbl_summary(
    by = group,
    type = list(
      c(hospital_expire_flag,
        ckd,
        dm,
        rrt_binary) ~ "dichotomous",
      c(admission_age,
        weight_admit,
        creat_first,
        creat_peak_72,
        creat_last) ~ "continuous"
    ),
    statistic = c(admission_age,
                  weight_admit,
                  creat_first,
                  creat_peak_72,
                  creat_last) ~ "{mean} ({sd})",
    missing = "no",
    label = list(
      admission_age ~ "Age at Hospital Admission, years",
      gender ~ "Gender",
      weight_admit ~ "Weight at ICU Admission, kg",
      charlson_comorbidity_index ~ "CCI Score",
      sofa_first_day ~ "SOFA Score at ICU Admission",
      ckd ~ "CKD, Stage 1-4",
      apsiii ~ "APS-III Score at ICU Admission",
      creat_first ~ "First Creatinine in ICU, mg/dL",
      creat_peak_72 ~ "Peak Creatinine at first days, mg/dL",
      creat_last ~ "ICU Discharge Creatinine, mg/dL",
      kdigo_cr_max ~ "Peak KDIGO-Cr at first days",
      race ~ "Ethnicity",
      icu_days ~ "Time in ICU, days",
      hospital_days ~ "Time in hospital, days",
      rrt_binary ~ "Renal replacement therapy",
      hospital_expire_flag ~ "Hospital Mortality",
      dm ~ "Diabetes Mellitus",
      sapsii ~ "SAPS-II at ICU Admission"
    )
  ) %>%
  add_p() %>%
  add_stat(
    fns = everything() ~ add_by_n
  ) %>%
  modify_header(starts_with("add_n_stat") ~ "**N**") %>%
  modify_table_body(
    ~ .x %>%
      dplyr::relocate(add_n_stat_1, .before = stat_1) %>%
      dplyr::relocate(add_n_stat_2, .before = stat_2)
  )

mimic_kdigo_inter_cons_old <- akis_all_long %>%
  transmute(STAY_ID,
            group,
            first_stage,
            max_stage,
            aki_above_2 = max_stage > 1) %>%
  filter(aki_above_2 == TRUE,
         (group != "newmean")) %>%
  mutate(group = case_when(
    group == "newcons" ~ 'UO-Consecutive',
    group == "newmean" ~ 'UO-Average',
    group == "old" ~ 'Block summation'
  )) %>%
  left_join(table_1, by = "STAY_ID") %>%
  select(
    group,
    admission_age,
    weight_admit,
    gender,
    race,
    charlson_comorbidity_index,
    ckd,
    dm,
    sofa_first_day,
    sapsii,
    apsiii,
    creat_first,
    creat_peak_72,
    creat_last,
    kdigo_cr_max,
  # scr_baseline,
    hospital_days,
    icu_days,
    rrt_binary,
    hospital_expire_flag
  ) %>%
  tbl_summary(
    by = group,
    type = list(
      c(hospital_expire_flag,
        ckd,
        dm,
        rrt_binary) ~ "dichotomous",
      c(admission_age,
        weight_admit,
        creat_first,
        creat_peak_72,
        creat_last) ~ "continuous"
    ),
    statistic = c(admission_age,
                  weight_admit,
                  creat_first,
                  creat_peak_72,
                  creat_last) ~ "{mean} ({sd})",
    missing = "no",
    label = list(
      admission_age ~ "Age at Hospital Admission, years",
      gender ~ "Gender",
      weight_admit ~ "Weight at ICU Admission, kg",
      charlson_comorbidity_index ~ "CCI Score",
      sofa_first_day ~ "SOFA Score at ICU Admission",
      ckd ~ "CKD, Stage 1-4",
      apsiii ~ "APS-III Score at ICU Admission",
      creat_first ~ "First Creatinine in ICU, mg/dL",
      creat_peak_72 ~ "Peak Creatinine at first days, mg/dL",
      creat_last ~ "ICU Discharge Creatinine, mg/dL",
      kdigo_cr_max ~ "Peak KDIGO-Cr at first days",
      race ~ "Ethnicity",
      icu_days ~ "Time in ICU, days",
      hospital_days ~ "Time in hospital, days",
      rrt_binary ~ "Renal replacement therapy",
      hospital_expire_flag ~ "Hospital Mortality",
      dm ~ "Diabetes Mellitus",
      sapsii ~ "SAPS-II at ICU Admission"
    )
  ) %>%
  add_p() %>%
  add_stat(
    fns = everything() ~ add_by_n
  ) %>%
  modify_header(starts_with("add_n_stat") ~ "**N**") %>%
  modify_table_body(
    ~ .x %>%
      dplyr::relocate(add_n_stat_1, .before = stat_1) %>%
      dplyr::relocate(add_n_stat_2, .before = stat_2)
  )

mimic_kdigo_inter_cons_mean

Characteristic	N	UO-Average N = 20,181¹	N	UO-Consecutive N = 11,110¹	p-value²
Age at Hospital Admission, years	20,181	68 (16)	11,110	67 (16)	0.4
Weight at ICU Admission, kg	20,181	87 (25)	11,110	88 (27)	<0.001
Gender	20,181		11,110		0.028
F		8,731 (43%)		4,950 (45%)
M		11,450 (57%)		6,160 (55%)
Ethnicity	17,294		9,526		0.009
African American		1,734 (10%)		1,084 (11%)
Asian		373 (2.2%)		205 (2.2%)
Caucasian		13,878 (80%)		7,496 (79%)
Hispanic		524 (3.0%)		310 (3.3%)
Other		785 (4.5%)		431 (4.5%)
CCI Score	20,181	5 (3, 7)	11,110	5 (3, 7)	<0.001
CKD, Stage 1-4	20,175	4,100 (20%)	11,106	2,678 (24%)	<0.001
Diabetes Mellitus	20,175	4,950 (25%)	11,106	2,709 (24%)	0.8
SOFA Score at ICU Admission	20,181	4 (2, 7)	11,110	5 (2, 8)	<0.001
SAPS-II at ICU Admission	20,075	37 (28, 46)	11,040	39 (29, 50)	<0.001
APS-III Score at ICU Admission	20,181	42 (32, 58)	11,110	46 (34, 63)	<0.001
First Creatinine in ICU, mg/dL	20,128	1.46 (1.57)	11,073	1.70 (1.92)	<0.001
Peak Creatinine at first days, mg/dL	20,114	1.75 (1.80)	11,064	2.10 (2.19)	<0.001
ICU Discharge Creatinine, mg/dL	20,128	1.42 (1.45)	11,073	1.69 (1.74)	<0.001
Peak KDIGO-Cr at first days	20,027		11,014		<0.001
0		12,966 (65%)		6,357 (58%)
1		4,794 (24%)		2,886 (26%)
2		1,126 (5.6%)		773 (7.0%)
3		1,141 (5.7%)		998 (9.1%)
Time in hospital, days	20,181	8 (5, 13)	11,110	8 (5, 14)	0.057
Time in ICU, days	20,181	2.9 (1.8, 5.1)	11,110	3.1 (1.9, 5.7)	<0.001
Renal replacement therapy	20,181	1,771 (8.8%)	11,110	1,574 (14%)	<0.001
Hospital Mortality	20,181	2,912 (14%)	11,110	2,111 (19%)	<0.001
¹ Mean (SD); n (%); Median (Q1, Q3)
² Wilcoxon rank sum test; Pearson’s Chi-squared test

mimic_kdigo_inter_cons_old

Characteristic	N	Block summation N = 18,940¹	N	UO-Consecutive N = 11,110¹	p-value²
Age at Hospital Admission, years	18,940	68 (16)	11,110	67 (16)	0.002
Weight at ICU Admission, kg	18,940	88 (25)	11,110	88 (27)	0.2
Gender	18,940		11,110		0.005
F		8,125 (43%)		4,950 (45%)
M		10,815 (57%)		6,160 (55%)
Ethnicity	16,197		9,526		<0.001
African American		1,580 (9.8%)		1,084 (11%)
Asian		333 (2.1%)		205 (2.2%)
Caucasian		13,081 (81%)		7,496 (79%)
Hispanic		485 (3.0%)		310 (3.3%)
Other		718 (4.4%)		431 (4.5%)
CCI Score	18,940	5 (3, 7)	11,110	5 (3, 7)	<0.001
CKD, Stage 1-4	18,934	3,910 (21%)	11,106	2,678 (24%)	<0.001
Diabetes Mellitus	18,934	4,736 (25%)	11,106	2,709 (24%)	0.2
SOFA Score at ICU Admission	18,940	5 (2, 7)	11,110	5 (2, 8)	0.021
SAPS-II at ICU Admission	18,865	37 (29, 47)	11,040	39 (29, 50)	<0.001
APS-III Score at ICU Admission	18,940	43 (32, 59)	11,110	46 (34, 63)	<0.001
First Creatinine in ICU, mg/dL	18,897	1.46 (1.51)	11,073	1.70 (1.92)	<0.001
Peak Creatinine at first days, mg/dL	18,889	1.76 (1.75)	11,064	2.10 (2.19)	<0.001
ICU Discharge Creatinine, mg/dL	18,897	1.44 (1.44)	11,073	1.69 (1.74)	<0.001
Peak KDIGO-Cr at first days	18,811		11,014		<0.001
0		11,860 (63%)		6,357 (58%)
1		4,713 (25%)		2,886 (26%)
2		1,132 (6.0%)		773 (7.0%)
3		1,106 (5.9%)		998 (9.1%)
Time in hospital, days	18,940	8 (5, 14)	11,110	8 (5, 14)	0.3
Time in ICU, days	18,940	2.9 (1.8, 5.2)	11,110	3.1 (1.9, 5.7)	<0.001
Renal replacement therapy	18,940	1,655 (8.7%)	11,110	1,574 (14%)	<0.001
Hospital Mortality	18,940	2,859 (15%)	11,110	2,111 (19%)	<0.001
¹ Mean (SD); n (%); Median (Q1, Q3)
² Wilcoxon rank sum test; Pearson’s Chi-squared test

mimic_kdigo_inter_mean_old

Characteristic	N	Block summation N = 18,940¹	N	UO-Average N = 20,181¹	p-value²
Age at Hospital Admission, years	18,940	68 (16)	20,181	68 (16)	0.006
Weight at ICU Admission, kg	18,940	88 (25)	20,181	87 (25)	0.011
Gender	18,940		20,181		0.5
F		8,125 (43%)		8,731 (43%)
M		10,815 (57%)		11,450 (57%)
Ethnicity	16,197		17,294		0.8
African American		1,580 (9.8%)		1,734 (10%)
Asian		333 (2.1%)		373 (2.2%)
Caucasian		13,081 (81%)		13,878 (80%)
Hispanic		485 (3.0%)		524 (3.0%)
Other		718 (4.4%)		785 (4.5%)
CCI Score	18,940	5 (3, 7)	20,181	5 (3, 7)	0.11
CKD, Stage 1-4	18,934	3,910 (21%)	20,175	4,100 (20%)	0.4
Diabetes Mellitus	18,934	4,736 (25%)	20,175	4,950 (25%)	0.3
SOFA Score at ICU Admission	18,940	5 (2, 7)	20,181	4 (2, 7)	<0.001
SAPS-II at ICU Admission	18,865	37 (29, 47)	20,075	37 (28, 46)	<0.001
APS-III Score at ICU Admission	18,940	43 (32, 59)	20,181	42 (32, 58)	<0.001
First Creatinine in ICU, mg/dL	18,897	1.46 (1.51)	20,128	1.46 (1.57)	0.034
Peak Creatinine at first days, mg/dL	18,889	1.76 (1.75)	20,114	1.75 (1.80)	<0.001
ICU Discharge Creatinine, mg/dL	18,897	1.44 (1.44)	20,128	1.42 (1.45)	0.009
Peak KDIGO-Cr at first days	18,811		20,027		0.006
0		11,860 (63%)		12,966 (65%)
1		4,713 (25%)		4,794 (24%)
2		1,132 (6.0%)		1,126 (5.6%)
3		1,106 (5.9%)		1,141 (5.7%)
Time in hospital, days	18,940	8 (5, 14)	20,181	8 (5, 13)	<0.001
Time in ICU, days	18,940	2.9 (1.8, 5.2)	20,181	2.9 (1.8, 5.1)	0.13
Renal replacement therapy	18,940	1,655 (8.7%)	20,181	1,771 (8.8%)	0.9
Hospital Mortality	18,940	2,859 (15%)	20,181	2,912 (14%)	0.064
¹ Mean (SD); n (%); Median (Q1, Q3)
² Wilcoxon rank sum test; Pearson’s Chi-squared test

akis_all_long %>%
  group_by(group, max_stage) %>%
  summarise(Propo = sum(mortality_7, na.rm = TRUE) / n()) %>%
  ggplot(aes(max_stage, Propo, fill = group)) + geom_col(position = 'dodge')

mimic_cdplod_old <- cdplot(
  as.factor(mortality_7) ~ max_stage,
  (akis_all_long %>%
     filter(group == "old")),
  col = c("lightgoldenrod", "lightcyan"),
  ylab = "7 days mortality",
  xlab = "Max KDIGO-UO stage",
  main = "CD Plot for Block Summation"
)

mimic_cdplod_old

## $`1`
## function (v) 
## .approxfun(x, y, v, method, yleft, yright, f, na.rm)
## <bytecode: 0x1558ca4e0>
## <environment: 0x1558cdac0>

mimic_cdplod_mean <- cdplot(
  as.factor(mortality_7) ~ max_stage,
  (akis_all_long %>%
     filter(group == "newmean")),
  col = c("lightgoldenrod", "lightcyan"),
  ylab = "7 days mortality",
  xlab = "Max KDIGO-UO stage",
  main = "CD Plot for UOmean"
)

mimic_cdplod_mean

## $`1`
## function (v) 
## .approxfun(x, y, v, method, yleft, yright, f, na.rm)
## <bytecode: 0x1558ca4e0>
## <environment: 0x1698d4b30>

mimic_cdplod_cons <- cdplot(
  as.factor(mortality_7) ~ max_stage,
  (akis_all_long %>%
     filter(group == "newcons")),
  col = c("lightgoldenrod", "lightcyan"),
  ylab = "7 days mortality",
  xlab = "Max KDIGO-UO stage",
  main = "CD Plot for UOcons"
)

mimic_cdplod_cons

## $`1`
## function (v) 
## .approxfun(x, y, v, method, yleft, yright, f, na.rm)
## <bytecode: 0x1558ca4e0>
## <environment: 0x333346fa8>

model_block_summation <- glm(
  mortality_7 ~ MAX_STAGE_OLD + FIRST_STAGE_OLD,
  data = akis_all_wide,
  family = binomial
)

model_mean <- glm(
  mortality_7 ~ MAX_STAGE_NEW_MEAN + FIRST_STAGE_NEW_MEAN,
  data = akis_all_wide,
  family = binomial
)

model_cons <- glm(
  mortality_7 ~ MAX_STAGE_NEW_CONS + FIRST_STAGE_NEW_CONS,
  data = akis_all_wide,
  family = binomial
)

# anova(model_old, model_new, test = 'Chisq')
mimic_kdigo_inter_bic <- BIC(model_block_summation, model_mean, model_cons)

mimic_kdigo_inter_bic <- cbind(Model = rownames(mimic_kdigo_inter_bic), mimic_kdigo_inter_bic) %>%
  gt()

mimic_kdigo_inter_bic

Model	df	BIC
model_block_summation	3	20948.37
model_mean	3	20837.52
model_cons	3	20802.08
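
For reference, the value returned by `BIC()` for a fitted `glm` is -2 times the log-likelihood plus the number of estimated parameters times log(n); lower values indicate a better trade-off between fit and complexity. Here df = 3 corresponds to the intercept plus the two stage coefficients. The sketch below checks the formula on the built-in `mtcars` data, which is used purely for illustration and is unrelated to the study cohort.

```r
# Illustration on the built-in mtcars data, not on the study cohort.
fit <- glm(am ~ wt, data = mtcars, family = binomial)

ll <- logLik(fit)
k  <- attr(ll, "df")   # number of estimated parameters
n  <- nobs(fit)        # number of observations used in the fit

# BIC = -2 * log-likelihood + k * log(n)
bic_manual <- -2 * as.numeric(ll) + k * log(n)

all.equal(bic_manual, BIC(fit))
```

Because the penalty depends on n, BIC values are strictly comparable only between models fitted to the same observations.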

Check for mortality overdispersion:

model_binom <- glm(
  mortality_30 ~ stage_newcons,
  family = binomial,
  data = (
    akis_all_long %>%
      filter(group == "newcons") %>%
      mutate(stage_newcons = as.factor(max_stage))
  )
)

model_overdispersed <- glm(
  mortality_30 ~ stage_newcons,
  family = quasibinomial,
  data = (
    akis_all_long %>%
      filter(group == "newcons") %>%
      mutate(stage_newcons = as.factor(max_stage))
  )
)

pchisq(summary(model_overdispersed)$dispersion * model_binom$df.residual, 
       model_binom$df.residual, lower.tail = FALSE)

## [1] 0.4937964
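
The quantity tested above is the Pearson chi-square statistic: the quasibinomial dispersion estimate is that statistic divided by the residual degrees of freedom, so multiplying it back recovers the statistic. A minimal sketch on simulated data (the variables `stage` and `death` are hypothetical and not drawn from the cohort) makes the relationship explicit:

```r
set.seed(1)
# Simulated binary outcome (hypothetical data, not the study cohort).
stage <- factor(sample(0:3, 500, replace = TRUE))
p     <- plogis(-2 + 0.5 * as.integer(as.character(stage)))
death <- rbinom(500, 1, p)

fit <- glm(death ~ stage, family = binomial)

# Pearson chi-square statistic and the dispersion estimate that
# summary() reports for the quasibinomial family.
pearson_chisq <- sum(residuals(fit, type = "pearson")^2)
dispersion    <- pearson_chisq / fit$df.residual

# Upper-tail p-value; a large value gives no evidence of overdispersion.
p_value <- pchisq(pearson_chisq, fit$df.residual, lower.tail = FALSE)

c(dispersion = dispersion, p_value = p_value)
```

Note that for ungrouped 0/1 outcomes this statistic carries little information about overdispersion, so a dispersion estimate close to 1 is expected regardless of the data.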

akis_all_long_complete <- akis_all_long %>% drop_na(max_stage)

# descriptive
descriptive_tbl <- akis_all_long_complete %>%
  group_by(group, max_stage) %>%
  summarise(
    n = n(),
    dead = sum(mortality_30),
    mortality_prop = sum(mortality_30) / n()
  ) %>%
  # drop_na() %>%
  group_by(group) %>%
  transmute(
    max_stage,
    "Patients, No. (%)" = paste0(n, " (", round((n / sum(
      n
    )), 2), ")"),
    "Mortality, No. (%)" = paste0(dead, " (", round(mortality_prop, 2), ")")
  ) %>%
  gt()

# glm
m1 <- glm(
  mortality_30 ~ stage_newcons,
  family = binomial,
  data = (
    akis_all_long_complete %>%
      filter(group == "newcons") %>%
      mutate(stage_newcons = as.factor(max_stage))
  )
)

m2 <- glm(
  mortality_30 ~ stage_newmean,
  family = binomial,
  data = (
    akis_all_long_complete %>%
      filter(group == "newmean") %>%
      mutate(stage_newmean = as.factor(max_stage))
  )
)

m3 <- glm(
  mortality_30 ~ stage_old,
  family = binomial,
  data = (
    akis_all_long_complete %>%
      filter(group == "old") %>%
      mutate(stage_old = as.factor(max_stage))
  )
)

glm_tbl <- tbl_stack(list(
  tbl_regression(m1, exponentiate = TRUE),
  tbl_regression(m2, exponentiate = TRUE),
  tbl_regression(m3, exponentiate = TRUE)
)) %>%
  as_gt()

# join tables
glm_tbl_data <- glm_tbl$`_data` %>%
  filter(!is.na(term)) %>%
  transmute(
    group = case_when(
      variable == "stage_newmean" ~ "newmean",
      variable == "stage_newcons" ~ "newcons",
      variable == "stage_old" ~ "old"
    ),
    max_stage = label,
    "OR (95% CI)" = if_else(label == "0", "1 [Reference]", paste0(round(estimate, 2), " (", ci, ")")),
    "P value" = case_when(
      is.na(p.value) ~ "NA",
      p.value < 0.001 ~ "<.001",
      .default = as.character(round(p.value, 2))
    )
  )

descriptive_tbl_data <- descriptive_tbl$`_data` %>%
  mutate(max_stage = as.character(max_stage))

rr_table_data <- left_join(descriptive_tbl_data, glm_tbl_data, by = c("group", "max_stage"))

mimic_kdigo_inter_survival_table <- rr_table_data %>%
  mutate(
    group = case_when(
      group == "newcons" ~ 'UO-Consecutive',
      group == "newmean" ~ 'UO-Average',
      group == "old" ~ 'Block summation'
    )
  ) %>%
  gt(
    rowname_col = "max_stage",
    groupname_col = "group",
    row_group_as_column = TRUE
  ) %>%
  tab_stubhead(label = "Criteria / Stage") %>%
  tab_spanner(label = "Unadjusted OR", columns = c("OR (95% CI)", "P value"))

mimic_kdigo_inter_survival_table

Criteria / Stage		Patients, No. (%)	Mortality, No. (%)	Unadjusted OR
Criteria / Stage		Patients, No. (%)	Mortality, No. (%)	OR (95% CI)	P value
UO-Consecutive	0	23972 (0.52)	1715 (0.07)	1 [Reference]	NA
	1	11262 (0.24)	1336 (0.12)	1.75 (1.62, 1.88)	<.001
	2	8991 (0.19)	1826 (0.2)	3.31 (3.08, 3.55)	<.001
	3	2119 (0.05)	775 (0.37)	7.48 (6.76, 8.28)	<.001
UO-Average	0	16959 (0.37)	1044 (0.06)	1 [Reference]	NA
	1	9204 (0.2)	924 (0.1)	1.7 (1.55, 1.87)	<.001
	2	15255 (0.33)	2111 (0.14)	2.45 (2.27, 2.65)	<.001
	3	4926 (0.11)	1573 (0.32)	7.15 (6.56, 7.80)	<.001
Block summation	0	18927 (0.41)	1231 (0.07)	1 [Reference]	NA
	1	8248 (0.18)	825 (0.1)	1.6 (1.46, 1.75)	<.001
	2	14830 (0.32)	2143 (0.14)	2.43 (2.26, 2.61)	<.001
	3	4110 (0.09)	1435 (0.35)	7.71 (7.07, 8.41)	<.001
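
Because each unadjusted model has the stage factor as its only predictor, the odds ratios in the table should match the crude odds ratios computed directly from the counts. As a rough check, the sketch below recomputes them for the UO-Consecutive rows using only the patient and death counts printed above (the object names `n`, `dead`, `odds` and `or` are introduced here for illustration).

```r
# Counts copied from the UO-Consecutive rows of the table above.
n    <- c(`0` = 23972, `1` = 11262, `2` = 8991, `3` = 2119)
dead <- c(`0` = 1715,  `1` = 1336,  `2` = 1826, `3` = 775)

# Odds of 30-day death within each stage.
odds <- dead / (n - dead)

# Odds ratio of each stage relative to stage 0.
or <- odds[-1] / odds["0"]
round(or, 2)
```

The rounded values for stages 1 to 3 are 1.75, 3.31 and 7.48, in line with the OR column.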

Adjusted models:

m1_adj1 <- glm(
  mortality_30 ~ stage_newcons * (
    admission_age + gender + weight_admit + first_stage
  ),
  family = binomial,
  data = (
    akis_all_long_complete %>%
      filter(group == "newcons") %>%
      mutate(stage_newcons = as.factor(max_stage))
  )
)

mrg_efct_m1_adj1 <- avg_comparisons(
  m1_adj1,
  variables = "stage_newcons",
  comparison = "lnoravg",
  transform = exp
)

rr_table_data_m1_adj1 <- mrg_efct_m1_adj1 %>%
    transmute(
      group = "newcons",
      max_stage = as.factor(dplyr::row_number()),
      estimate = paste0(round(estimate, 2), " (",round(conf.low, 2), "-", round(conf.high, 2), ")"),
      p.value
    ) %>% add_row(group = "newcons", max_stage = "0", .before = 0) %>%
    transmute(
    group,
    max_stage,
    or.adj1 = if_else(is.na(estimate), "1 [Reference]", estimate),
    p.adj1 = case_when(
      is.na(p.value) ~ "NA",
      p.value < 0.001 ~ "<.001",
      .default = as.character(round(p.value, 2))
    )
  )

m2_adj1 <- glm(
  mortality_30 ~ stage_newmean * (
    admission_age + weight_admit + gender + first_stage
  ),
  family = binomial,
  data = (
    akis_all_long_complete %>%
      filter(group == "newmean") %>%
      mutate(stage_newmean = as.factor(max_stage))
  )
)

mrg_efct_m2_adj1 <- avg_comparisons(
  m2_adj1,
  variables = "stage_newmean",
  comparison = "lnoravg",
  transform = exp
)

rr_table_data_m2_adj1 <- mrg_efct_m2_adj1 %>%
    transmute(
      group = "newmean",
      max_stage = as.factor(dplyr::row_number()),
      estimate = paste0(round(estimate, 2), " (",round(conf.low, 2), "-", round(conf.high, 2), ")"),
      p.value
    ) %>% add_row(group = "newmean", max_stage = "0", .before = 0) %>%
    transmute(
    group,
    max_stage,
    or.adj1 = if_else(is.na(estimate), "1 [Reference]", estimate),
    p.adj1 = case_when(
      is.na(p.value) ~ "NA",
      p.value < 0.001 ~ "<.001",
      .default = as.character(round(p.value, 2))
    )
  )

m3_adj1 <- glm(
  mortality_30 ~ stage_old * (
    admission_age + weight_admit + gender + first_stage
  ),
  family = binomial,
  data = (
    akis_all_long_complete %>%
      filter(group == "old") %>%
      mutate(stage_old = as.factor(max_stage))
  )
)

mrg_efct_m3_adj1 <- avg_comparisons(
  m3_adj1,
  variables = "stage_old",
  comparison = "lnoravg",
  transform = exp
)

rr_table_data_m3_adj1 <- mrg_efct_m3_adj1 %>%
    transmute(
      group = "old",
      max_stage = as.factor(dplyr::row_number()),
      estimate = paste0(round(estimate, 2), " (",round(conf.low, 2), "-", round(conf.high, 2), ")"),
      p.value
    ) %>% add_row(group = "old", max_stage = "0", .before = 0) %>%
    transmute(
    group,
    max_stage,
    or.adj1 = if_else(is.na(estimate), "1 [Reference]", estimate),
    p.adj1 = case_when(
      is.na(p.value) ~ "NA",
      p.value < 0.001 ~ "<.001",
      .default = as.character(round(p.value, 2))
    )
  )

mimic_kdigo_inter_survival_table_adj <- rr_table_data %>% 
  left_join(bind_rows(rr_table_data_m1_adj1, rr_table_data_m2_adj1, rr_table_data_m3_adj1)) %>%
  mutate(
    group = case_when(
      group == "newcons" ~ 'UO-Consecutive',
      group == "newmean" ~ 'UO-Average',
      group == "old" ~ 'Block summation'
    )
  ) %>%
  gt(
    rowname_col = "max_stage",
    groupname_col = "group",
    row_group_as_column = TRUE
  ) %>%
  cols_label(
    or.adj1 = "OR (95% CI)",
    p.adj1 = "P value",
  ) %>%
  tab_stubhead(label = "Criteria / Stage") %>%
  tab_spanner(label = "Unadjusted OR", columns = c("OR (95% CI)", "P value")) %>%
  tab_spanner(label = "Adjusted Model", columns = c(or.adj1, p.adj1), id = "adj1") %>%
  tab_footnote(
    footnote = "Model include age, weight, gender and whether diagnosed on admission",
    locations = cells_column_spanners(spanners = "adj1")
  ) %>% tab_source_note(source_note = md(
    "All covariates in the adjusted model were significant except for diagnosis at admission for block summation model."
  ))

mimic_kdigo_inter_survival_table_adj

Criteria / Stage		Patients, No. (%)	Mortality, No. (%)	Unadjusted OR		Adjusted Model¹
Criteria / Stage		Patients, No. (%)	Mortality, No. (%)	OR (95% CI)	P value	OR (95% CI)	P value
UO-Consecutive	0	23972 (0.52)	1715 (0.07)	1 [Reference]	NA	1 [Reference]	NA
	1	11262 (0.24)	1336 (0.12)	1.75 (1.62, 1.88)	<.001	1.58 (1.46-1.72)	<.001
	2	8991 (0.19)	1826 (0.2)	3.31 (3.08, 3.55)	<.001	2.94 (2.7-3.19)	<.001
	3	2119 (0.05)	775 (0.37)	7.48 (6.76, 8.28)	<.001	5.24 (4.42-6.2)	<.001
UO-Average	0	16959 (0.37)	1044 (0.06)	1 [Reference]	NA	1 [Reference]	NA
	1	9204 (0.2)	924 (0.1)	1.7 (1.55, 1.87)	<.001	1.48 (1.34-1.63)	<.001
	2	15255 (0.33)	2111 (0.14)	2.45 (2.27, 2.65)	<.001	2.11 (1.93-2.31)	<.001
	3	4926 (0.11)	1573 (0.32)	7.15 (6.56, 7.80)	<.001	5.59 (4.92-6.36)	<.001
Block summation	0	18927 (0.41)	1231 (0.07)	1 [Reference]	NA	1 [Reference]	NA
	1	8248 (0.18)	825 (0.1)	1.6 (1.46, 1.75)	<.001	1.54 (1.4-1.69)	<.001
	2	14830 (0.32)	2143 (0.14)	2.43 (2.26, 2.61)	<.001	2.38 (2.2-2.57)	<.001
	3	4110 (0.09)	1435 (0.35)	7.71 (7.07, 8.41)	<.001	8.11 (7.29-9.03)	<.001
All covariates in the adjusted model were significant except for diagnosis at admission for block summation model.
¹ Model include age, weight, gender and whether diagnosed on admission

Summaries of the adjusted models:

summary(m1_adj1)

## 
## Call:
## glm(formula = mortality_30 ~ stage_newcons * (admission_age + 
##     gender + weight_admit + first_stage), family = binomial, 
##     data = (akis_all_long_complete %>% filter(group == "newcons") %>% 
##         mutate(stage_newcons = as.factor(max_stage))))
## 
## Coefficients: (1 not defined because of singularities)
##                               Estimate Std. Error z value Pr(>|z|)    
## (Intercept)                  -3.202164   0.179984 -17.791  < 2e-16 ***
## stage_newcons1                0.967897   0.290305   3.334 0.000856 ***
## stage_newcons2                1.493116   0.269739   5.535 3.11e-08 ***
## stage_newcons3                0.774332   0.354520   2.184 0.028950 *  
## admission_age                 0.032089   0.001698  18.896  < 2e-16 ***
## genderM                       0.242934   0.055858   4.349 1.37e-05 ***
## weight_admit                 -0.022223   0.001703 -13.052  < 2e-16 ***
## first_stage                   0.514027   0.101331   5.073 3.92e-07 ***
## stage_newcons1:admission_age -0.008540   0.002735  -3.122 0.001796 ** 
## stage_newcons2:admission_age -0.009419   0.002571  -3.664 0.000248 ***
## stage_newcons3:admission_age -0.007443   0.003570  -2.085 0.037097 *  
## stage_newcons1:genderM       -0.175030   0.084691  -2.067 0.038763 *  
## stage_newcons2:genderM       -0.162809   0.079392  -2.051 0.040297 *  
## stage_newcons3:genderM       -0.226078   0.110594  -2.044 0.040932 *  
## stage_newcons1:weight_admit   0.002444   0.002439   1.002 0.316316    
## stage_newcons2:weight_admit   0.005524   0.002191   2.521 0.011687 *  
## stage_newcons3:weight_admit   0.020724   0.002500   8.291  < 2e-16 ***
## stage_newcons1:first_stage    0.172714   0.123130   1.403 0.160706    
## stage_newcons2:first_stage   -0.158296   0.115697  -1.368 0.171254    
## stage_newcons3:first_stage          NA         NA      NA       NA    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 34369  on 46343  degrees of freedom
## Residual deviance: 30601  on 46325  degrees of freedom
## AIC: 30639
## 
## Number of Fisher Scoring iterations: 6

summary(m2_adj1)

## 
## Call:
## glm(formula = mortality_30 ~ stage_newmean * (admission_age + 
##     weight_admit + gender + first_stage), family = binomial, 
##     data = (akis_all_long_complete %>% filter(group == "newmean") %>% 
##         mutate(stage_newmean = as.factor(max_stage))))
## 
## Coefficients: (1 not defined because of singularities)
##                                Estimate Std. Error z value Pr(>|z|)    
## (Intercept)                  -3.520e+00  2.260e-01 -15.573  < 2e-16 ***
## stage_newmean1                9.096e-01  3.511e-01   2.591  0.00958 ** 
## stage_newmean2                1.716e+00  2.918e-01   5.880 4.10e-09 ***
## stage_newmean3                2.080e+00  3.151e-01   6.603 4.04e-11 ***
## admission_age                 3.372e-02  2.110e-03  15.980  < 2e-16 ***
## weight_admit                 -2.127e-02  2.231e-03  -9.533  < 2e-16 ***
## genderM                       2.270e-01  7.112e-02   3.192  0.00141 ** 
## first_stage                   3.498e-01  6.812e-02   5.136 2.81e-07 ***
## stage_newmean1:admission_age -8.734e-03  3.300e-03  -2.647  0.00813 ** 
## stage_newmean2:admission_age -1.273e-02  2.733e-03  -4.658 3.19e-06 ***
## stage_newmean3:admission_age -1.497e-02  3.036e-03  -4.931 8.18e-07 ***
## stage_newmean1:weight_admit  -1.866e-05  3.140e-03  -0.006  0.99526    
## stage_newmean2:weight_admit  -1.034e-03  2.627e-03  -0.394  0.69383    
## stage_newmean3:weight_admit   1.143e-02  2.607e-03   4.384 1.17e-05 ***
## stage_newmean1:genderM       -9.116e-02  1.046e-01  -0.871  0.38353    
## stage_newmean2:genderM       -4.251e-02  8.781e-02  -0.484  0.62831    
## stage_newmean3:genderM       -2.000e-01  9.605e-02  -2.083  0.03728 *  
## stage_newmean1:first_stage    5.619e-01  1.008e-01   5.572 2.51e-08 ***
## stage_newmean2:first_stage    2.542e-01  8.411e-02   3.022  0.00251 ** 
## stage_newmean3:first_stage           NA         NA      NA       NA    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 34369  on 46343  degrees of freedom
## Residual deviance: 30390  on 46325  degrees of freedom
## AIC: 30428
## 
## Number of Fisher Scoring iterations: 6

summary(m3_adj1)

## 
## Call:
## glm(formula = mortality_30 ~ stage_old * (admission_age + weight_admit + 
##     gender + first_stage), family = binomial, data = (akis_all_long_complete %>% 
##     filter(group == "old") %>% mutate(stage_old = as.factor(max_stage))))
## 
## Coefficients: (1 not defined because of singularities)
##                           Estimate Std. Error z value Pr(>|z|)    
## (Intercept)              -3.441839   0.210510 -16.350  < 2e-16 ***
## stage_old1                0.661224   0.354127   1.867 0.061874 .  
## stage_old2                1.806640   0.281955   6.408 1.48e-10 ***
## stage_old3                2.881287   0.315823   9.123  < 2e-16 ***
## admission_age             0.034303   0.001960  17.498  < 2e-16 ***
## weight_admit             -0.022277   0.002060 -10.816  < 2e-16 ***
## genderM                   0.241431   0.065597   3.681 0.000233 ***
## first_stage              -0.014451   0.046307  -0.312 0.754995    
## stage_old1:admission_age -0.008355   0.003318  -2.518 0.011809 *  
## stage_old2:admission_age -0.013912   0.002636  -5.278 1.31e-07 ***
## stage_old3:admission_age -0.019748   0.003061  -6.450 1.12e-10 ***
## stage_old1:weight_admit   0.003504   0.003099   1.131 0.258169    
## stage_old2:weight_admit   0.001462   0.002474   0.591 0.554449    
## stage_old3:weight_admit   0.009982   0.002507   3.981 6.86e-05 ***
## stage_old1:genderM       -0.101982   0.103987  -0.981 0.326729    
## stage_old2:genderM       -0.137826   0.083048  -1.660 0.096996 .  
## stage_old3:genderM       -0.161356   0.095257  -1.694 0.090284 .  
## stage_old1:first_stage    0.797323   0.094909   8.401  < 2e-16 ***
## stage_old2:first_stage    0.269415   0.064578   4.172 3.02e-05 ***
## stage_old3:first_stage          NA         NA      NA       NA    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 34235  on 46113  degrees of freedom
## Residual deviance: 30322  on 46095  degrees of freedom
##   (1 observation deleted due to missingness)
## AIC: 30360
## 
## Number of Fisher Scoring iterations: 6

km_fit_newcons <-
  survfit(Surv(FOLLOWUP_DAYS, DEATH_FLAG) ~ MAX_STAGE_NEW_CONS,
          akis_all_wide)

km_fit_newmean <-
  survfit(Surv(FOLLOWUP_DAYS, DEATH_FLAG) ~ MAX_STAGE_NEW_MEAN,
          akis_all_wide)

km_fit_old <-
  survfit(Surv(FOLLOWUP_DAYS, DEATH_FLAG) ~ MAX_STAGE_OLD,
          akis_all_wide)

mimic_survival_figure_newcons <- km_fit_newcons %>%
ggsurvfit(linewidth = 1) +
  add_confidence_interval() +
  add_quantile() +
  scale_ggsurvfit(x_scales = list(breaks = seq(0, 30, by = 5))) +
  coord_cartesian(xlim = c(0, 30)) +
  theme_classic() +
  scale_fill_discrete(labels=c('0', '1', '2', '3')) +
  scale_color_discrete(labels=c('0', '1', '2', '3')) +
  labs(x="Days", y = "Survival", title = "UOcons",
       color='Maximum KDIGO-UO stage', fill='Maximum KDIGO-UO stage') +
  theme(legend.position = "bottom")

mimic_survival_figure_newmean <- km_fit_newmean %>%
ggsurvfit(linewidth = 1) +
  add_confidence_interval() +
  add_quantile() +
  scale_ggsurvfit(x_scales = list(breaks = seq(0, 30, by = 5))) +
  coord_cartesian(xlim = c(0, 30)) +
  theme_classic() +
  scale_fill_discrete(labels=c('0', '1', '2', '3')) +
  scale_color_discrete(labels=c('0', '1', '2', '3')) +
  labs(x="Days", y = "Survival", title = "UOmean",
       color='Maximum KDIGO-UO stage', fill='Maximum KDIGO-UO stage') +
  theme(legend.position = "bottom")

mimic_survival_figure_old <- km_fit_old %>%
ggsurvfit(linewidth = 1) +
  add_confidence_interval() +
  add_quantile() +
  scale_ggsurvfit(x_scales = list(breaks = seq(0, 30, by = 5))) +
  coord_cartesian(xlim = c(0, 30)) +
  theme_classic() +
  scale_fill_discrete(labels=c('0', '1', '2', '3')) +
  scale_color_discrete(labels=c('0', '1', '2', '3')) +
  labs(x="Days", y = "Survival",title = "Block Summation",
       color='Maximum KDIGO-UO stage', fill='Maximum KDIGO-UO stage') +
  theme(legend.position = "bottom")

mimic_kdigo_inter_survival_figure <- ggarrange(
  mimic_survival_figure_old,
  mimic_survival_figure_newmean,
  mimic_survival_figure_newcons,
  ncol = 1,
  nrow = 3,
  legend = "bottom",
  common.legend = TRUE
) %>% annotate_figure(top = text_grob("MIMICdb", face = "bold", size = 14))

mimic_kdigo_inter_survival_figure

Sensitivity Analysis: Validity Threshold for Durations of Collection

validity_threshold_long <- validity_threshold_wide %>%
  mutate(
         first_stage.cons = FIRST_STAGE_NEW_CONS,
         first_stage.9520 = FIRST_STAGE_NEW_CONS_95_20,
         first_stage.9920 = FIRST_STAGE_NEW_CONS_99_20,
         max_stage.cons = MAX_STAGE_NEW_CONS,
         max_stage.9520 = MAX_STAGE_NEW_CONS_95_20,
         max_stage.9920 = MAX_STAGE_NEW_CONS_99_20,
         .keep = "unused") %>%
  pivot_longer(
    !c(STAY_ID, FOLLOWUP_DAYS, DEATH_FLAG),
    names_sep = "\\.",
    names_to = c(".value", "group")
  ) %>%
  mutate(
    mortality_90 = if_else(FOLLOWUP_DAYS < 91 &
                                 DEATH_FLAG == 1, 1, 0),
    prevalnce_admit = if_else(first_stage > 0, 1, 0),
    Incidence_first_72hr = case_when(first_stage > 0 ~ NA, max_stage == 0 ~ 0, max_stage > 0 ~ 1),
    Incidence_first_72hr_with_stage = ifelse(first_stage == 0 &
                                              max_stage > 0, max_stage, NA),
    .keep = "all"
  ) 
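
The flags derived in the mutate() call above can be illustrated on a small toy data frame. The following is a minimal sketch in base R; the toy values and the column names prevalence_admit and incidence are assumptions made for illustration and are not study data. It shows that a stay already at stage > 0 on admission counts toward prevalence and is set to NA for incidence.

```r
# Toy data (hypothetical values, for illustration only)
toy <- data.frame(
  STAY_ID       = 1:4,
  first_stage   = c(0, 0, 2, 0),
  max_stage     = c(0, 2, 3, 1),
  FOLLOWUP_DAYS = c(400, 30, 10, 95),
  DEATH_FLAG    = c(0, 1, 1, 1)
)

# Death within 90 days of follow-up
toy$mortality_90 <- as.integer(toy$FOLLOWUP_DAYS < 91 & toy$DEATH_FLAG == 1)

# AKI already present at admission
toy$prevalence_admit <- as.integer(toy$first_stage > 0)

# New AKI among stays without AKI at admission; NA when already prevalent
toy$incidence <- ifelse(toy$first_stage > 0, NA_integer_,
                        as.integer(toy$max_stage > 0))

toy
```

Stay 3 is the prevalent case here: it contributes to prevalence_admit and is excluded (NA) from the incidence denominator.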

validity_threshold_long %>%
  drop_na(prevalnce_admit) %>%
  transmute(
    group,
    aki_binary = if_else(max_stage > 0, 1, 0),
    max_stage = if_else(max_stage == 0, NA, max_stage),
    prevalnce_admit
  ) %>%
  mutate(group =
           case_when(group == "cons" ~ "No exclusion",
                     group == "9520" ~ "95th percentile for rate below 20th percentile",
                     group == "9920" ~ "99th percentile for rate below 20th percentile")) %>%
  tbl_summary(
    by = "group",
    missing = "no",
    digits = everything() ~ c(0, 1),
    label = list(
      aki_binary ~ "Oliguric-AKI on the first days",
      prevalnce_admit ~ "Prevalence at admission",
      max_stage ~ "Maximum KDIGO staging"
    )
  )  %>%
  modify_column_indent(columns = label, rows = c(FALSE, TRUE)) %>%
  modify_column_indent(
    columns = label,
    rows = c(FALSE, FALSE, TRUE, TRUE, TRUE),
    double_indent = TRUE
  ) %>%
  add_p()

Characteristic	95th percentile for rate below 20th percentile N = 45,804¹	99th percentile for rate below 20th percentile N = 46,278¹	No exclusion N = 46,347¹	p-value²
Oliguric-AKI on the first days	20,837 (45.5%)	22,105 (47.8%)	22,373 (48.3%)	<0.001
Maximum KDIGO staging				<0.001
1	11,034 (53.0%)	11,447 (51.8%)	11,262 (50.3%)
2	8,528 (40.9%)	8,792 (39.8%)	8,992 (40.2%)
3	1,275 (6.12%)	1,866 (8.44%)	2,119 (9.47%)
Prevalence at admission	5,941 (13.0%)	6,250 (13.5%)	6,388 (13.8%)	0.001
¹ n (%)
² Pearson’s Chi-squared test
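
The p-value in the first row of the table above can be checked from the printed counts alone. The sketch below is an assumption about how the test applies to these three columns, not the authors' code: it rebuilds the 3 x 2 contingency table (AKI vs. no AKI per threshold) and runs a Pearson chi-squared test with base R. The short names p95, p99 and none are labels introduced here for illustration.

```r
# Counts copied from the first row of the table above
aki   <- c(p95 = 20837, p99 = 22105, none = 22373)
total <- c(p95 = 45804, p99 = 46278, none = 46347)

# 3 x 2 contingency table: AKI vs. no AKI for each threshold
tab <- cbind(aki = aki, no_aki = total - aki)

res <- chisq.test(tab)

round(100 * aki / total, 1)  # percentages, as printed in the table
res$parameter                # degrees of freedom
res$p.value < 0.001          # consistent with the reported "<0.001"
```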

mimic_exclusion_threshold <- validity_threshold_long %>%
  transmute(STAY_ID,
            group,
            first_stage,
            max_stage,
            aki_above_2 = max_stage > 1) %>%
  filter(aki_above_2 == TRUE,
         (group == "cons" | group == "9520" | group == "9920")) %>%
  left_join(table_1, by = "STAY_ID") %>%
    mutate(group =
           case_when(group == "cons" ~ "No exclusion",
                     group == "9520" ~ "95th percentile for rate below 20th percentile",
                     group == "9920" ~ "99th percentile for rate below 20th percentile")) %>%
  select(
    group,
    admission_age,
    weight_admit,
    gender,
    race,
    charlson_comorbidity_index,
    ckd,
    dm,
    sofa_first_day,
    sapsii,
    apsiii,
    creat_first,
    creat_peak_72,
    creat_last,
    kdigo_cr_max,
  # scr_baseline,
    hospital_days,
    icu_days,
    rrt_binary,
    hospital_expire_flag
  ) %>%
  tbl_summary(
    by = group,
    type = list(
      c(hospital_expire_flag,
        ckd,
        dm,
        rrt_binary) ~ "dichotomous",
      c(admission_age,
        weight_admit,
        creat_first,
        creat_peak_72,
        creat_last) ~ "continuous"
    ),
    statistic = c(admission_age,
                  weight_admit,
                  creat_first,
                  creat_peak_72,
                  creat_last) ~ "{mean} ({sd})",
    missing = "no",
    label = list(
      admission_age ~ "Age at Hospital Admission, years",
      gender ~ "Gender",
      weight_admit ~ "Weight at ICU Admission, kg",
      charlson_comorbidity_index ~ "CCI Score",
      sofa_first_day ~ "SOFA Score at ICU Admission",
      ckd ~ "CKD, Stage 1-4",
      apsiii ~ "APS-III Score at ICU Admission",
      creat_first ~ "First Creatinine in ICU, mg/dL",
      creat_peak_72 ~ "Peak Creatinine at first days, mg/dL",
      creat_last ~ "ICU Discharge Creatinine, mg/dL",
      kdigo_cr_max ~ "Peak KDIGO-Cr at first days",
      race ~ "Ethnicity",
      icu_days ~ "Time in ICU, days",
      hospital_days ~ "Time in hospital, days",
      rrt_binary ~ "Renal replacement therapy",
      hospital_expire_flag ~ "Hospital Mortality",
      dm ~ "Diabetes Mellitus",
      sapsii ~ "SAPS-II at ICU Admission"
    )
  ) %>%
  add_p() %>%
  add_stat(
    fns = everything() ~ add_by_n
  ) %>%
  modify_header(starts_with("add_n_stat") ~ "**N**") %>%
  modify_table_body(
    ~ .x %>%
      dplyr::relocate(add_n_stat_1, .before = stat_1) %>%
      dplyr::relocate(add_n_stat_2, .before = stat_2) %>%
      dplyr::relocate(add_n_stat_3, .before = stat_3)
  )

mimic_exclusion_threshold

Characteristic	N	95th percentile for rate below 20th percentile N = 9,803¹	N	99th percentile for rate below 20th percentile N = 10,658¹	N	No exclusion N = 11,111¹	p-value²
Age at Hospital Admission, years	9,803	67 (16)	10,658	67 (16)	11,111	67 (16)	0.9
Weight at ICU Admission, kg	9,803	89 (27)	10,658	89 (27)	11,111	88 (27)	0.036
Gender	9,803		10,658		11,111		>0.9
F		4,362 (44%)		4,745 (45%)		4,950 (45%)
M		5,441 (56%)		5,913 (55%)		6,161 (55%)
Ethnicity	8,427		9,142		9,527		>0.9
African American		967 (11%)		1,048 (11%)		1,084 (11%)
Asian		169 (2.0%)		193 (2.1%)		205 (2.2%)
Caucasian		6,632 (79%)		7,186 (79%)		7,497 (79%)
Hispanic		279 (3.3%)		299 (3.3%)		310 (3.3%)
Other		380 (4.5%)		416 (4.6%)		431 (4.5%)
CCI Score	9,803	5 (3, 7)	10,658	5 (3, 7)	11,111	5 (3, 7)	0.8
CKD, Stage 1-4	9,799	2,332 (24%)	10,654	2,582 (24%)	11,107	2,678 (24%)	0.8
Diabetes Mellitus	9,799	2,438 (25%)	10,654	2,622 (25%)	11,107	2,710 (24%)	0.7
SOFA Score at ICU Admission	9,803	5 (2, 8)	10,658	5 (2, 8)	11,111	5 (2, 8)	0.6
SAPS-II at ICU Admission	9,743	39 (30, 50)	10,590	39 (30, 50)	11,041	39 (29, 50)	0.6
APS-III Score at ICU Admission	9,803	46 (34, 63)	10,658	46 (34, 64)	11,111	46 (34, 63)	0.6
First Creatinine in ICU, mg/dL	9,775	1.66 (1.83)	10,621	1.70 (1.91)	11,074	1.70 (1.91)	0.8
Peak Creatinine at first days, mg/dL	9,771	2.07 (2.12)	10,617	2.11 (2.19)	11,065	2.10 (2.19)	0.6
ICU Discharge Creatinine, mg/dL	9,775	1.68 (1.71)	10,621	1.70 (1.75)	11,074	1.69 (1.74)	0.7
Peak KDIGO-Cr at first days	9,737		10,577		11,015		>0.9
0		5,578 (57%)		6,039 (57%)		6,357 (58%)
1		2,557 (26%)		2,805 (27%)		2,887 (26%)
2		715 (7.3%)		759 (7.2%)		773 (7.0%)
3		887 (9.1%)		974 (9.2%)		998 (9.1%)
Time in hospital, days	9,803	8 (5, 14)	10,658	8 (5, 14)	11,111	8 (5, 14)	0.5
Time in ICU, days	9,803	3.1 (1.9, 5.7)	10,658	3.1 (1.9, 5.7)	11,111	3.1 (1.9, 5.7)	0.9
Renal replacement therapy	9,803	1,345 (14%)	10,658	1,519 (14%)	11,111	1,574 (14%)	0.5
Hospital Mortality	9,803	1,805 (18%)	10,658	2,031 (19%)	11,111	2,111 (19%)	0.4
¹ Mean (SD); n (%); Median (Q1, Q3)
² Kruskal-Wallis rank sum test; Pearson’s Chi-squared test

Clinical Outcomes

Prevalence of oliguric AKI at admission and its incidence during the first days in the ICU:

akis_all_long %>%
  filter(group == "newcons") %>%
  select(prevalnce_admit,
         Incidence_first_72hr,
         max_stage) %>%
  drop_na(prevalnce_admit) %>%
  transmute(
    aki_binary = if_else(max_stage > 0, 1, 0),
    max_stage = if_else(max_stage == 0, NA, max_stage),
    prevalnce_admit
  ) %>%
  tbl_summary(
    missing = "no",
    digits = everything() ~ c(0, 1),
    label = list(
      aki_binary ~ "Oliguric-AKI on the first days",
      prevalnce_admit ~ "Prevalence at admission",
      max_stage ~ "Maximum KDIGO staging"
    )
  ) %>%
  modify_column_indent(columns = label, 
                       rows = c(FALSE, TRUE)) %>%
  modify_column_indent(columns = label, 
                       rows = c(FALSE, FALSE, TRUE, TRUE, TRUE),
                       double_indent = TRUE)

Characteristic	N = 46,344¹
Oliguric-AKI on the first days	22,372 (48.3%)
Maximum KDIGO staging
1	11,262 (50.3%)
2	8,991 (40.2%)
3	2,119 (9.47%)
Prevalence at admission	6,388 (13.8%)
¹ n (%)

aki_uo_analysis <- left_join(akis_all_wide, uo_ml_kg_hr, by = "STAY_ID") %>%
  drop_na(FIRST_POSITIVE_STAGE_UO_CONS_TIME, 
          TIME_INTERVAL_FINISH, 
          MAX_STAGE_NEW_CONS) %>%
  transmute(STAY_ID,
         MAX_STAGE_NEW_CONS = as.character(MAX_STAGE_NEW_CONS),
         FIRST_POSITIVE_STAGE_UO_CONS_TIME,
         TIME_INTERVAL_FINISH,
         UO_KG = ML_KG_HR
         ) %>%
  mutate(TIME = as.double(difftime(TIME_INTERVAL_FINISH, 
                                   FIRST_POSITIVE_STAGE_UO_CONS_TIME, 
                                   units = "hours"))) %>%
  filter(TIME >= -48 & TIME <= 48)

aki_creat_analysis <- left_join(akis_all_wide, creat_diff %>% select(-STAY_ID), by = "HADM_ID") %>%
  select(STAY_ID,
         MAX_STAGE_NEW_CONS,
         FIRST_STAGE_NEW_CONS,
         FIRST_POSITIVE_STAGE_UO_CONS_TIME,
         CHARTTIME,
         CREAT,
         SCR_BASELINE,
         CREAT_BASLINE_DIFF,
         CREAT_BASLINE_RATIO,
         CREAT_LOWEST7_DIFF,
         CREAT_LOWEST7_RATIO
         ) %>%
  mutate(AKI_TO_CREAT = as.double(difftime(CHARTTIME, 
                                   FIRST_POSITIVE_STAGE_UO_CONS_TIME, 
                                   units = c("mins"))) / 60) %>%
  filter(AKI_TO_CREAT >= -72 & AKI_TO_CREAT <= 72)
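
Both data frames above align each measurement to the time of the first positive UOcons stage and then keep a symmetric window around it. A minimal sketch of that alignment in base R follows; the timestamps and the names onset, chart and hours_from_onset are hypothetical and chosen only for illustration.

```r
# Hypothetical onset time and measurement times (offsets in hours)
onset <- as.POSIXct("2150-01-01 12:00:00", tz = "UTC")
chart <- onset + c(-50, -6, 0, 12, 60) * 3600

# Signed time from onset in hours: negative = before onset
hours_from_onset <- as.double(difftime(chart, onset, units = "hours"))

# Keep measurements within 48 hours before or after onset
keep <- hours_from_onset >= -48 & hours_from_onset <= 48
hours_from_onset[keep]
```

Requesting units = "hours" directly avoids a separate division step; dividing minutes by 60, as in the creatinine block, gives the same quantity.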

First Oliguric-AKI Events

Table 1 for ICU stays with oliguric AKI identified in the first 72 hours of admission, stratified by maximum KDIGO-UO stage (ICU stays with AKI at admission were excluded):

table1_akis <- akis_all_wide  %>%
  filter(MAX_STAGE_NEW_CONS >= 0) %>%
  select(STAY_ID, MAX_STAGE_NEW_CONS) %>%
  left_join(table_1, by = "STAY_ID")

table_1_staging <- table1_akis %>%
  select(
    MAX_STAGE_NEW_CONS,
    admission_age,
    weight_admit,
    gender,
    race,
    charlson_comorbidity_index,
    ckd,
    dm,
    sofa_first_day,
    sapsii,
    apsiii,
    creat_first,
    creat_peak_72,
    creat_last,
    kdigo_cr_max,
    hospital_days,
    icu_days,
    rrt_binary,
    hospital_expire_flag
  ) %>%
  mutate(
    staging = case_when(
      MAX_STAGE_NEW_CONS == 0 ~ "No AKI",
      MAX_STAGE_NEW_CONS == 1 ~ "Stage 1",
      MAX_STAGE_NEW_CONS == 2 ~ "Stage 2",
      MAX_STAGE_NEW_CONS == 3 ~ "Stage 3",
      .default = NA
    ),
    .keep = "unused"
  ) %>%
  tbl_summary(
    by = staging,
    type = list(
      c(hospital_expire_flag, ckd, dm, rrt_binary) ~ "dichotomous",
      c(
        admission_age,
        weight_admit,
        creat_first,
        creat_peak_72,
        creat_last
      ) ~ "continuous"
    ),
    statistic = c(
      admission_age,
      weight_admit,
      creat_first,
      creat_peak_72,
      creat_last
    ) ~ "{mean} ({sd})",
    missing = "no",
    missing_text = "-",
    digits = list(hospital_days ~ c(1, 1)),
    label = list(
      admission_age ~ "Age at Hospital Admission, years",
      gender ~ "Gender",
      weight_admit ~ "Weight at ICU Admission, kg",
      charlson_comorbidity_index ~ "CCI Score",
      sofa_first_day ~ "SOFA Score at ICU Admission",
      ckd ~ "CKD, Stage 1-4",
      apsiii ~ "APS-III Score at ICU Admission",
      creat_first ~ "First Creatinine in ICU, mg/dL",
      creat_peak_72 ~ "Peak Creatinine at first days, mg/dL",
      creat_last ~ "ICU Discharge Creatinine, mg/dL",
      kdigo_cr_max ~ "Peak KDIGO-Cr at first days",
      race ~ "Ethnicity",
      icu_days ~ "Time in ICU, days",
      hospital_days ~ "Time in hospital, days",
      rrt_binary ~ "Renal replacement therapy",
      hospital_expire_flag ~ "Hospital Mortality",
      dm ~ "Diabetes Mellitus",
      sapsii ~ "SAPS-II at ICU Admission"
    )
  ) %>%
  add_p(c(
    admission_age,
    weight_admit,
    creat_first,
    creat_peak_72,
    creat_last
  ) ~ "aov")

table_1_staging

Characteristic	No AKI N = 23,972¹	Stage 1 N = 11,262¹	Stage 2 N = 8,991¹	Stage 3 N = 2,119¹	p-value²
Age at Hospital Admission, years	62 (18)	67 (16)	68 (16)	66 (16)	<0.001
Weight at ICU Admission, kg	77 (20)	84 (23)	89 (27)	88 (28)	<0.001
Gender					<0.001
F	10,590 (44%)	4,710 (42%)	3,987 (44%)	963 (45%)
M	13,382 (56%)	6,552 (58%)	5,004 (56%)	1,156 (55%)
Ethnicity					<0.001
African American	2,051 (9.8%)	910 (9.3%)	840 (11%)	244 (14%)
Asian	895 (4.3%)	232 (2.4%)	146 (1.9%)	59 (3.4%)
Caucasian	16,023 (76%)	7,795 (80%)	6,228 (80%)	1,268 (73%)
Hispanic	935 (4.5%)	325 (3.3%)	243 (3.1%)	67 (3.9%)
Other	1,053 (5.0%)	471 (4.8%)	336 (4.3%)	95 (5.5%)
CCI Score	4 (2, 6)	5 (3, 7)	5 (3, 7)	6 (4, 8)	<0.001
CKD, Stage 1-4	3,101 (13%)	1,802 (16%)	1,879 (21%)	799 (38%)	<0.001
Diabetes Mellitus	4,933 (21%)	2,662 (24%)	2,201 (24%)	508 (24%)	<0.001
SOFA Score at ICU Admission	3 (1, 5)	4 (2, 6)	4 (2, 7)	8 (4, 12)	<0.001
SAPS-II at ICU Admission	30 (22, 38)	33 (26, 42)	37 (29, 47)	49 (37, 60)	<0.001
APS-III Score at ICU Admission	34 (26, 44)	38 (29, 50)	43 (33, 58)	64 (45, 83)	<0.001
First Creatinine in ICU, mg/dL	1.14 (1.04)	1.19 (1.02)	1.42 (1.51)	2.87 (2.81)	<0.001
Peak Creatinine at first days, mg/dL	1.21 (1.07)	1.33 (1.10)	1.70 (1.70)	3.79 (3.07)	<0.001
ICU Discharge Creatinine, mg/dL	1.00 (0.79)	1.12 (0.95)	1.40 (1.38)	2.92 (2.46)	<0.001
Peak KDIGO-Cr at first days					<0.001
0	20,132 (86%)	8,387 (75%)	5,646 (63%)	711 (34%)
1	2,708 (12%)	2,166 (19%)	2,247 (25%)	639 (30%)
2	440 (1.9%)	369 (3.3%)	569 (6.4%)	204 (9.7%)
3	175 (0.7%)	198 (1.8%)	454 (5.1%)	544 (26%)
Time in hospital, days	6.0 (3.0, 10.0)	7.0 (4.0, 12.0)	8.0 (5.0, 13.0)	10.0 (5.0, 18.0)	<0.001
Time in ICU, days	1.4 (1.0, 2.5)	2.2 (1.3, 4.0)	2.9 (1.8, 5.1)	4.4 (2.8, 8.2)	<0.001
Renal replacement therapy	281 (1.2%)	257 (2.3%)	604 (6.7%)	970 (46%)	<0.001
Hospital Mortality	1,163 (4.9%)	1,005 (8.9%)	1,440 (16%)	671 (32%)	<0.001
¹ Mean (SD); n (%); Median (Q1, Q3)
² One-way analysis of means; Pearson’s Chi-squared test; Kruskal-Wallis rank sum test

Urine output around the onset of the first UOcons-defined AKI event:

mimic_uo_cons_figure <- aki_uo_analysis %>% 
ggplot(aes(TIME, UO_KG, color=MAX_STAGE_NEW_CONS, fill=MAX_STAGE_NEW_CONS))  + 
           # linetype=MAX_STAGE_NEW_CONS))  + 
  geom_hline(yintercept=0.3, linewidth = 0.3, color = "#cccccc") +
  geom_hline(yintercept=0.5, linewidth = 0.3, color = "#cccccc") +
  geom_vline(xintercept=0, linewidth = 0.3, color = "black", linetype = "dashed") +
  stat_summary(fun = median, geom="line") +
  scale_x_continuous(breaks = seq(-24, 48, by=6)) +
  scale_y_continuous(breaks = c(0, 0.3, 0.5)) +
  coord_cartesian(xlim = c(-12, 24), ylim = c(0, 1.7)) +
  # xlim(-24, 48) +
  stat_summary(fun.min = function(z) { quantile(z,0.25) },
               fun.max = function(z) { quantile(z,0.75) },
               geom="ribbon", colour = NA, alpha=0.2) +
  labs(x="Time around AKI onset (hour)", y = "Urine output (ml/kg/hr)", 
       color="Maximum KDIGO-UO stage", fill="Maximum KDIGO-UO stage") + 
  theme_classic() + # remove panel background and gridlines
  scale_color_manual(values = pal_jama("default")(4)[2:4]) +
  scale_fill_manual(values = pal_jama("default")(4)[2:4]) +
  theme(
    legend.position = "none"
  )

mimic_uo_cons_figure
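
In the figure above, the line is the median urine output at each time point and the ribbon spans the 25th to 75th percentiles, both computed by stat_summary(). The same summary can be sketched without plotting, as below; the toy values and the name summ are assumptions for illustration only.

```r
# Toy urine output values (ml/kg/hr) at two time points (hypothetical)
toy <- data.frame(
  TIME  = rep(c(0, 1), each = 3),
  UO_KG = c(0.2, 0.4, 0.6, 0.1, 0.3, 0.5)
)

# Median and interquartile range per time point,
# mirroring fun = median, fun.min = Q1 and fun.max = Q3
summ <- do.call(rbind, lapply(split(toy, toy$TIME), function(d) {
  data.frame(
    TIME   = d$TIME[1],
    median = median(d$UO_KG),
    q25    = unname(quantile(d$UO_KG, 0.25)),
    q75    = unname(quantile(d$UO_KG, 0.75))
  )
}))

summ
```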

Serum Creatinine Analysis

aki_creat_analysis %>%
ggplot(aes(x=CREAT_BASLINE_DIFF)) + 
    xlim(0, 5) + 
    geom_histogram(binwidth = 0.1)

mimic_creat_a <- aki_creat_analysis %>% 
  filter(MAX_STAGE_NEW_CONS == 1,
         AKI_TO_CREAT >= -24,
         AKI_TO_CREAT <= 48) %>%
  mutate(AKI_TO_CREAT_BIN = cut(AKI_TO_CREAT, 
                                breaks=12, 
                                ordered_result = TRUE,
                                # labels=FALSE
                                )) %>%
ggplot(aes(factor(AKI_TO_CREAT_BIN), CREAT_LOWEST7_DIFF)) +
    geom_boxplot(linetype = "dashed", outlier.shape = NA, color="brown") +
    stat_boxplot(aes(ymin = after_stat(lower), ymax = after_stat(upper)), outlier.shape = NA, color="brown", fill="orange") +
    stat_boxplot(geom = "errorbar", aes(ymin = after_stat(ymax)), color="brown") +
    stat_boxplot(geom = "errorbar", aes(ymax = after_stat(ymin)), color="brown") +
    stat_summary(fun = mean, colour="darkred", geom="point", shape=18, size=2, show.legend = FALSE) +
    # scale_x_discrete(labels=c('-24      ','-18      ','-12      ','-6      ','0      ','6      ','12      ','18      ','24      ','30      ','36      ','42      ','48      ')) +
    labs(x=" ", y = "Difference from baseline (mg/dL)") +
    coord_cartesian(ylim = c(-0.1, 3)) +
    theme_classic() + # remove panel background and gridlines
    theme(legend.position = "bottom",
          plot.title = element_text(color = "#0099F8", size = 16, face = "bold"),
          plot.subtitle = element_text(size = 10, face = "bold"),
          # axis.text.x = element_text(margin = margin(t = 2),
          #                            hjust="1")
          ) +
  scale_x_discrete(breaks = function(x){x[c(TRUE, FALSE)]})

mimic_creat_b <- aki_creat_analysis %>% 
  filter(MAX_STAGE_NEW_CONS == 2,
         AKI_TO_CREAT >= -24,
         AKI_TO_CREAT <= 48) %>%
  mutate(AKI_TO_CREAT_BIN = cut(AKI_TO_CREAT, 
                                breaks=12, 
                                ordered_result = TRUE,
                                # labels=FALSE
                                )) %>%
ggplot(aes(factor(AKI_TO_CREAT_BIN), CREAT_LOWEST7_DIFF)) +
    geom_boxplot(linetype = "dashed", outlier.shape = NA, color="brown") +
    stat_boxplot(aes(ymin = after_stat(lower), ymax = after_stat(upper)), outlier.shape = NA, color="brown", fill="orange") +
    stat_boxplot(geom = "errorbar", aes(ymin = after_stat(ymax)), color="brown") +
    stat_boxplot(geom = "errorbar", aes(ymax = after_stat(ymin)), color="brown") +
    stat_summary(fun = mean, colour="darkred", geom="point", shape=18, size=2, show.legend = FALSE) +
    # scale_x_discrete(labels=c('-24      ','-18      ','-12      ','-6      ','0      ','6      ','12      ','18      ','24      ','30      ','36      ','42      ','48      ')) +
    labs(x=" ", y = " ") +
    coord_cartesian(ylim = c(-0.1, 3)) +
    theme_classic() + # remove panel background and gridlines
    theme(legend.position = "bottom",
          plot.title = element_text(color = "#0099F8", size = 16, face = "bold"),
          plot.subtitle = element_text(size = 10, face = "bold"),
          # axis.text.x = element_text(margin = margin(t = 2),
          #                            hjust="1")
          ) +
  scale_x_discrete(breaks = function(x){x[c(TRUE, FALSE)]})

mimic_creat_c <- aki_creat_analysis %>% 
  filter(MAX_STAGE_NEW_CONS == 3,
         AKI_TO_CREAT >= -24,
         AKI_TO_CREAT <= 48) %>%
  mutate(AKI_TO_CREAT_BIN = cut(AKI_TO_CREAT, 
                                breaks=12, 
                                ordered_result = TRUE,
                                # labels=FALSE
                                )) %>%
ggplot(aes(factor(AKI_TO_CREAT_BIN), CREAT_LOWEST7_DIFF)) +
    geom_boxplot(linetype = "dashed", outlier.shape = NA, color="brown") +
    stat_boxplot(aes(ymin = after_stat(lower), ymax = after_stat(upper)), outlier.shape = NA, color="brown", fill="orange") +
    stat_boxplot(geom = "errorbar", aes(ymin = after_stat(ymax)), color="brown") +
    stat_boxplot(geom = "errorbar", aes(ymax = after_stat(ymin)), color="brown") +
    stat_summary(fun = mean, colour="darkred", geom="point", shape=18, size=2, show.legend = FALSE) +
    # scale_x_discrete(labels=c('-24      ','-18      ','-12      ','-6      ','0      ','6      ','12      ','18      ','24      ','30      ','36      ','42      ','48      ')) +
    labs(x=" ", y = " ") +
    coord_cartesian(ylim = c(-0.1, 3)) +
    theme_classic() + # remove panel background and gridlines
    theme(legend.position = "bottom",
          plot.title = element_text(color = "#0099F8", size = 16, face = "bold"),
          plot.subtitle = element_text(size = 10, face = "bold"),
          # axis.text.x = element_text(margin = margin(t = 2),
          #                            hjust="1")
          ) +
  scale_x_discrete(breaks = function(x){x[c(TRUE, FALSE)]})

mimic_creat_d <- aki_creat_analysis %>% 
  filter(MAX_STAGE_NEW_CONS == 1,
         AKI_TO_CREAT >= -24,
         AKI_TO_CREAT <= 48) %>%
  mutate(AKI_TO_CREAT_BIN = cut(AKI_TO_CREAT, 
                                breaks=12, 
                                ordered_result = TRUE)) %>%
ggplot(aes(factor(AKI_TO_CREAT_BIN), CREAT_LOWEST7_RATIO)) +
    geom_boxplot(linetype = "dashed", outlier.shape = NA, color="brown") +
    stat_boxplot(aes(ymin = after_stat(lower), ymax = after_stat(upper)), outlier.shape = NA, color="brown", fill="orange") +
    stat_boxplot(geom = "errorbar", aes(ymin = after_stat(ymax)), color="brown") +
    stat_boxplot(geom = "errorbar", aes(ymax = after_stat(ymin)), color="brown") +
    stat_summary(fun = mean, colour="darkred", geom="point", shape=18, size=2, show.legend = FALSE) +
    labs(x=" ", y = "Relative sCr change") +
    coord_cartesian(ylim = c(1.01, 3)) +
    theme_classic() + # remove panel background and gridlines
    theme(legend.position = "bottom",
          plot.title = element_text(color = "#0099F8", size = 16, face = "bold"),
          plot.subtitle = element_text(size = 10, face = "bold")) +
  scale_x_discrete(breaks = function(x){x[c(TRUE, FALSE)]})

mimic_creat_e <- aki_creat_analysis %>% 
  filter(MAX_STAGE_NEW_CONS == 2,
         AKI_TO_CREAT >= -24,
         AKI_TO_CREAT <= 48) %>%
  mutate(AKI_TO_CREAT_BIN = cut(AKI_TO_CREAT, 
                                breaks=12, 
                                ordered_result = TRUE)) %>%
ggplot(aes(factor(AKI_TO_CREAT_BIN), CREAT_LOWEST7_RATIO)) +
    geom_boxplot(linetype = "dashed", outlier.shape = NA, color="brown") +
    stat_boxplot(aes(ymin = after_stat(lower), ymax = after_stat(upper)), outlier.shape = NA, color="brown", fill="orange") +
    stat_boxplot(geom = "errorbar", aes(ymin = after_stat(ymax)), color="brown") +
    stat_boxplot(geom = "errorbar", aes(ymax = after_stat(ymin)), color="brown") +
    stat_summary(fun = mean, colour="darkred", geom="point", shape=18, size=2, show.legend = FALSE) +
    labs(x="Time to AKI start (hours)", y = " ") +
    coord_cartesian(ylim = c(1.01, 3)) +
    theme_classic() + # remove panel background and gridlines
    theme(legend.position = "bottom",
          plot.title = element_text(color = "#0099F8", size = 16, face = "bold"),
          plot.subtitle = element_text(size = 10, face = "bold")) +
  scale_x_discrete(breaks = function(x){x[c(TRUE, FALSE)]})

mimic_creat_f <- aki_creat_analysis %>% 
  filter(MAX_STAGE_NEW_CONS == 3,
         AKI_TO_CREAT >= -24,
         AKI_TO_CREAT <= 48) %>%
  mutate(AKI_TO_CREAT_BIN = cut(AKI_TO_CREAT, 
                                breaks=12, 
                                ordered_result = TRUE)) %>%
ggplot(aes(factor(AKI_TO_CREAT_BIN), CREAT_LOWEST7_RATIO)) +
    geom_boxplot(linetype = "dashed", outlier.shape = NA, color="brown") +
    stat_boxplot(aes(ymin = after_stat(lower), ymax = after_stat(upper)), outlier.shape = NA, color="brown", fill="orange") +
    stat_boxplot(geom = "errorbar", aes(ymin = after_stat(ymax)), color="brown") +
    stat_boxplot(geom = "errorbar", aes(ymax = after_stat(ymin)), color="brown") +
    stat_summary(fun = mean, colour="darkred", geom="point", shape=18, size=2, show.legend = FALSE) +
    labs(x=" ", y = " ") +
    coord_cartesian(ylim = c(1.01, 3)) +
    theme_classic() + # remove panel background and gridlines
    theme(legend.position = "bottom",
          plot.title = element_text(color = "#0099F8", size = 16, face = "bold"),
          plot.subtitle = element_text(size = 10, face = "bold")) +
  scale_x_discrete(breaks = function(x){x[c(TRUE, FALSE)]})

ggarrange(
  mimic_creat_a,
  mimic_creat_b,
  mimic_creat_c,
  mimic_creat_d,
  mimic_creat_e,
  mimic_creat_f,
  labels = c("a", "b", "c", "d", "e", "f"),
  ncol = 3,
  nrow = 2,
  heights = c(1,1),
  legend = "bottom",
  common.legend = TRUE
)
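
The six panels above bin AKI_TO_CREAT with cut(..., breaks = 12). Given a single number, cut() divides the observed range of the data into that many equal-width intervals, so the bins are 6 hours wide only when the values in a stratum span the full -24 to 48 hour window. The sketch below compares that behaviour with explicit break points on a toy vector; the explicit breaks are an illustrative alternative, not the approach used in this analysis.

```r
# Toy time offsets covering the full window (hypothetical)
x <- seq(-24, 48, by = 6)

# Number of breaks: 12 equal-width bins over the observed range of x
auto_bins <- cut(x, breaks = 12, ordered_result = TRUE)
nlevels(auto_bins)

# Explicit break points: bin edges fixed at 6-hour steps regardless of the data
fixed_bins <- cut(x,
                  breaks = seq(-24, 48, by = 6),
                  include.lowest = TRUE,
                  ordered_result = TRUE)
table(fixed_bins)
```

With explicit breaks the bin edges stay comparable across the three KDIGO-UO stages even if one stage has no measurements near the ends of the window.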

Survival Analysis

km_fit <- survfit2(Surv(FOLLOWUP_DAYS, DEATH_FLAG) ~ MAX_STAGE_NEW_CONS, data = akis_all_wide)

mimic_survival_cons_figure <- km_fit %>%
ggsurvfit(linewidth = 1) +
  scale_ggsurvfit(x_scales = list(breaks = seq(0, 30, by = 5))) +
  coord_cartesian(xlim = c(-1, 31)) +
  theme_classic() +
  labs(x="Days", y = "Survival", 
       color='Maximum KDIGO-UO stage', fill='Maximum KDIGO-UO stage') +
  scale_color_jama() +
  scale_fill_jama() +
  theme(legend.position = "bottom",
        plot.title = element_text(color = "#0099F8", size = 16, face = "bold"),
        plot.subtitle = element_text(size = 10, face = "bold")) +
  add_risktable() +
  add_pvalue(caption = "Log-rank {p.value}")

mimic_survival_cons_figure

Table of survival probabilities:

mimic_survival_table <-
  km_fit %>% tbl_survfit(times = c(7, 30, 90, 365),
                         label = "Maximum KDIGO-UO stage",
                         label_header = "**Day {time}**")
mimic_survival_table

Characteristic	Day 7	Day 30	Day 90	Day 365
Maximum KDIGO-UO stage
0	97% (96%, 97%)	93% (93%, 93%)	89% (89%, 89%)	83% (82%, 83%)
1	94% (94%, 94%)	88% (88%, 89%)	84% (83%, 84%)	77% (76%, 78%)
2	89% (88%, 89%)	80% (79%, 81%)	74% (73%, 75%)	66% (65%, 67%)
3	78% (76%, 80%)	63% (61%, 66%)	57% (55%, 60%)	49% (47%, 51%)
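
Each cell in the table above is a Kaplan-Meier survival estimate at a fixed day. A minimal sketch of how such estimates are read off a fitted curve with the survival package follows; the toy data and object names are hypothetical and unrelated to the study cohort.

```r
library(survival)

# Toy follow-up data for two stages (hypothetical values)
toy <- data.frame(
  days  = c(10, 50, 200, 400,   3, 8, 20, 60),
  death = c( 1,  0,   1,   0,   1, 1,  1,  0),
  stage = rep(c(0, 3), each = 4)
)

fit <- survfit(Surv(days, death) ~ stage, data = toy)

# Survival probability at day 7 and day 30, per stage
est <- summary(fit, times = c(7, 30))
data.frame(stage = est$strata, day = est$time, survival = est$surv)
```

For stage 3 in this toy example, one of four patients has died by day 7 (survival 0.75) and three of four by day 30 (survival 0.25).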

Log-rank tests, overall and for each pair of adjacent stages:

survdiff(Surv(FOLLOWUP_DAYS, mortality_30) ~ MAX_STAGE_NEW_CONS, data = akis_all_wide)

## Call:
## survdiff(formula = Surv(FOLLOWUP_DAYS, mortality_30) ~ MAX_STAGE_NEW_CONS, 
##     data = akis_all_wide)
## 
## n=46344, 2189 observations deleted due to missingness.
## 
##                          N Observed Expected (O-E)^2/E (O-E)^2/V
## MAX_STAGE_NEW_CONS=0 23972     1715     3005    553.64   1190.38
## MAX_STAGE_NEW_CONS=1 11262     1336     1375      1.08      1.44
## MAX_STAGE_NEW_CONS=2  8991     1826     1049    575.26    711.31
## MAX_STAGE_NEW_CONS=3  2119      775      224   1360.29   1427.01
## 
##  Chisq= 2510  on 3 degrees of freedom, p= <2e-16

akis_all_wide_non_01 <- akis_all_wide %>%
  filter(MAX_STAGE_NEW_CONS == 0 | MAX_STAGE_NEW_CONS == 1)
survdiff(Surv(FOLLOWUP_DAYS, mortality_30) ~ MAX_STAGE_NEW_CONS, data = akis_all_wide_non_01)

## Call:
## survdiff(formula = Surv(FOLLOWUP_DAYS, mortality_30) ~ MAX_STAGE_NEW_CONS, 
##     data = akis_all_wide_non_01)
## 
##                          N Observed Expected (O-E)^2/E (O-E)^2/V
## MAX_STAGE_NEW_CONS=0 23972     1715     2094      68.5       219
## MAX_STAGE_NEW_CONS=1 11262     1336      957     149.7       219
## 
##  Chisq= 219  on 1 degrees of freedom, p= <2e-16

akis_all_wide_non_12 <- akis_all_wide %>%
  filter(MAX_STAGE_NEW_CONS == 1 | MAX_STAGE_NEW_CONS == 2)
survdiff(Surv(FOLLOWUP_DAYS, mortality_30) ~ MAX_STAGE_NEW_CONS, data = akis_all_wide_non_12)

## Call:
## survdiff(formula = Surv(FOLLOWUP_DAYS, mortality_30) ~ MAX_STAGE_NEW_CONS, 
##     data = akis_all_wide_non_12)
## 
##                          N Observed Expected (O-E)^2/E (O-E)^2/V
## MAX_STAGE_NEW_CONS=1 11262     1336     1793       116       271
## MAX_STAGE_NEW_CONS=2  8991     1826     1369       153       271
## 
##  Chisq= 272  on 1 degrees of freedom, p= <2e-16

akis_all_wide_non_23 <- akis_all_wide %>%
  filter(MAX_STAGE_NEW_CONS == 2 | MAX_STAGE_NEW_CONS == 3)
survdiff(Surv(FOLLOWUP_DAYS, mortality_30) ~ MAX_STAGE_NEW_CONS, data = akis_all_wide_non_23)

## Call:
## survdiff(formula = Surv(FOLLOWUP_DAYS, mortality_30) ~ MAX_STAGE_NEW_CONS, 
##     data = akis_all_wide_non_23)
## 
##                         N Observed Expected (O-E)^2/E (O-E)^2/V
## MAX_STAGE_NEW_CONS=2 8991     1826     2144      47.1       272
## MAX_STAGE_NEW_CONS=3 2119      775      457     220.9       272
## 
##  Chisq= 272  on 1 degrees of freedom, p= <2e-16
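
The three pairwise comparisons above repeat the same filter-then-survdiff() pattern. As a sketch of how that pattern could be wrapped in one function, the example below runs on simulated data; pairwise_logrank is a hypothetical helper introduced here, and the simulated survival times are an assumption for illustration only.

```r
library(survival)

# Simulated data (hypothetical): higher stage, shorter survival
set.seed(1)
toy <- data.frame(stage = rep(0:3, each = 40))
toy$days  <- rexp(nrow(toy), rate = 0.01 * (toy$stage + 1))
toy$death <- as.integer(toy$days <= 30)  # event within 30 days
toy$days  <- pmin(toy$days, 30)          # censor follow-up at day 30

# Hypothetical helper: log-rank test between two stages
pairwise_logrank <- function(data, a, b) {
  pair_data <- data[data$stage %in% c(a, b), ]
  fit <- survdiff(Surv(days, death) ~ stage, data = pair_data)
  data.frame(
    pair  = paste(a, "vs", b),
    chisq = fit$chisq,
    p     = pchisq(fit$chisq, df = 1, lower.tail = FALSE)
  )
}

# Adjacent stages: 0 vs 1, 1 vs 2, 2 vs 3
results <- do.call(rbind, Map(function(a, b) pairwise_logrank(toy, a, b),
                              0:2, 1:3))
results
```

The p-value is derived from the chi-squared statistic with one degree of freedom, which matches the two-group comparisons printed above.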

mimic_t1a <- t1a
mimic_t1b <- t1b
mimic_table_1 <- table_1
mimic_table_1_akis <- table1_akis
mimic_uo_rate <- uo_rate
mimic_aki_epi <- akis_all_long %>%
  filter(group == "newcons")
mimic_akis_all_wide <- akis_all_wide
mimic_table_1_staging <- table_1_staging
mimic_rr_table_data <- rr_table_data
mimic_rr_table_data_adj <- rr_table_data %>% 
  left_join(bind_rows(rr_table_data_m1_adj1, rr_table_data_m2_adj1, rr_table_data_m3_adj1))

save(mimic_t1a,
     mimic_t1b,
     mimic_table_1,
     mimic_table_1_akis,
     mimic_uo_rate,
     mimic_uo_cons_figure,
     mimic_survival_cons_figure,
     mimic_survival_table,
     mimic_aki_epi,
     mimic_akis_all_wide,
     mimic_table_1_staging,
     mimic_rr_table_data,
     mimic_rr_table_data_adj,
     file = "paper_mimic.Rda")

mimic_uo_rate <- uo_rate
mimic_hourly_uo <- hourly_uo
mimic_raw_uo_eligible <- raw_uo_eligible

save(
  all_rows_count,
  distinct_time_item_patient_rows_count,
  S2_a,
  S3a,
  S4_a,
  S4_b,
  S4_c,
  S4_d,
  S4_e,
  S4_f,
  S6_a,
  S6_b,
  # S7_a,
  # S7_b,
  S7_c,
  S7_d,
  # S8_a,
  # S8_b,
  S8_c,
  S8_d,
  S8_e,
  # S8_f,
  S9_a,
  S9_b,
  S9_c,
  S9_d,
  S11,
  # mimic_uo_rate,
  # mimic_hourly_uo,
  # mimic_raw_uo_eligible,
  mimic_exclusion_threshold,
  mimic_kdigo_inter_aki_table,
  mimic_kdigo_inter_cons_mean,
  mimic_kdigo_inter_cons_old,
  mimic_kdigo_inter_mean_old,
  mimic_kdigo_inter_bic,
  mimic_kdigo_inter_survival_table,
  mimic_kdigo_inter_survival_table_adj,
  mimic_kdigo_inter_survival_figure,
  mimic_Sage_a,
  mimic_Sage_b,
  mimic_Sweight_a,
  mimic_Sweight_b,
  file = "s_data.Rda"
)

Technical Details

R Session Info:

sessionInfo()

## R version 4.4.1 (2024-06-14)
## Platform: aarch64-apple-darwin20
## Running under: macOS Sonoma 14.5
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRblas.0.dylib 
## LAPACK: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.0
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## time zone: Asia/Jerusalem
## tzcode source: internal
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] marginaleffects_0.21.0 rms_6.8-1              Hmisc_5.1-3           
##  [4] gt_0.11.0              ggsurvfit_1.1.0        ggsci_3.2.0           
##  [7] gtsummary_2.0.0        nortest_1.0-4          survminer_0.4.9       
## [10] ggpubr_0.6.0           survival_3.7-0         rmdformats_1.0.4      
## [13] kableExtra_1.4.0       broom_1.0.6            quantreg_5.98         
## [16] SparseM_1.84-2         rlang_1.1.4            ggforce_0.4.2         
## [19] ggpmisc_0.6.0          ggpp_0.5.8-1           scales_1.3.0          
## [22] ggbreak_0.1.2          psych_2.4.6.26         finalfit_1.0.8        
## [25] reshape2_1.4.4         lubridate_1.9.3        forcats_1.0.0         
## [28] stringr_1.5.1          dplyr_1.1.4            purrr_1.0.2           
## [31] readr_2.1.5            tidyr_1.3.1            tibble_3.2.1          
## [34] ggplot2_3.5.1          tidyverse_2.0.0        bigrquery_1.5.1       
## [37] DBI_1.2.3              pacman_0.5.1          
## 
## loaded via a namespace (and not attached):
##   [1] splines_4.4.1        polspline_1.1.25     ggplotify_0.1.2     
##   [4] polyclip_1.10-7      rpart_4.1.23         lifecycle_1.0.4     
##   [7] rstatix_0.7.2        lattice_0.22-6       MASS_7.3-61         
##  [10] insight_0.20.2       backports_1.5.0      magrittr_2.0.3      
##  [13] sass_0.4.9           rmarkdown_2.27       jquerylib_0.1.4     
##  [16] yaml_2.3.10          askpass_1.2.0        cowplot_1.1.3       
##  [19] RColorBrewer_1.1-3   minqa_1.2.7          multcomp_1.4-26     
##  [22] abind_1.4-5          clock_0.7.1          yulab.utils_0.1.5   
##  [25] nnet_7.3-19          TH.data_1.1-2        tweenr_2.0.3        
##  [28] rappdirs_0.3.3       sandwich_3.1-0       labelled_2.13.0     
##  [31] KMsurv_0.1-5         cards_0.2.0          MatrixModels_0.5-3  
##  [34] cardx_0.2.0          svglite_2.1.3        commonmark_1.9.1    
##  [37] codetools_0.2-20     xml2_1.3.6           tidyselect_1.2.1    
##  [40] shape_1.4.6.1        aplot_0.2.3          farver_2.1.2        
##  [43] lme4_1.1-35.5        base64enc_0.1-3      broom.helpers_1.15.0
##  [46] jsonlite_1.8.8       mitml_0.4-5          Formula_1.2-5       
##  [49] iterators_1.0.14     systemfonts_1.1.0    foreach_1.5.2       
##  [52] tools_4.4.1          Rcpp_1.0.13          glue_1.7.0          
##  [55] mnormt_2.1.1         gridExtra_2.3        pan_1.9             
##  [58] mgcv_1.9-1           xfun_0.46            withr_3.0.0         
##  [61] fastmap_1.2.0        boot_1.3-30          fansi_1.0.6         
##  [64] openssl_2.2.0        digest_0.6.36        timechange_0.3.0    
##  [67] R6_2.5.1             gridGraphics_0.5-1   mice_3.16.0         
##  [70] colorspace_2.1-1     markdown_1.13        utf8_1.2.4          
##  [73] generics_0.1.3       data.table_1.15.4    httr_1.4.7          
##  [76] htmlwidgets_1.6.4    pkgconfig_2.0.3      gtable_0.3.5        
##  [79] survMisc_0.5.6       brio_1.1.5           htmltools_0.5.8.1   
##  [82] carData_3.0-5        bookdown_0.40        png_0.1-8           
##  [85] ggfun_0.1.5          knitr_1.48           km.ci_0.5-6         
##  [88] rstudioapi_0.16.0    tzdb_0.4.0           checkmate_2.3.1     
##  [91] nlme_3.1-165         curl_5.2.1           nloptr_2.1.1        
##  [94] cachem_1.1.0         zoo_1.8-12           parallel_4.4.1      
##  [97] foreign_0.8-87       pillar_1.9.0         grid_4.4.1          
## [100] vctrs_0.6.5          car_3.1-2            jomo_2.7-6          
## [103] xtable_1.8-4         cluster_2.1.6        htmlTable_2.4.3     
## [106] evaluate_0.24.0      mvtnorm_1.2-5        cli_3.6.3           
## [109] compiler_4.4.1       ggsignif_0.6.4       labeling_0.4.3      
## [112] plyr_1.8.9           fs_1.6.4             stringi_1.8.4       
## [115] viridisLite_0.4.2    munsell_0.5.1        glmnet_4.1-8        
## [118] Matrix_1.7-0         hms_1.1.3            patchwork_1.2.0     
## [121] bit64_4.0.5          haven_2.5.4          highr_0.11          
## [124] gargle_1.5.2         memoise_2.0.1        bslib_0.7.0         
## [127] bit_4.0.5            polynom_1.4-1
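
When rerunning the analysis on another machine, it can help to compare the locally installed package versions with those listed above. The sketch below is one possible check, offered as an illustration only: the function name `check_versions` is hypothetical, and the four packages shown are an arbitrary subset of the session info with versions copied from it.

```r
# Versions copied from the session info above (subset chosen for illustration)
expected_versions <- c(
  survival  = "3.7-0",
  dplyr     = "1.1.4",
  ggplot2   = "3.5.1",
  bigrquery = "1.5.1"
)

# Hypothetical helper: TRUE if the installed version equals the expected one,
# FALSE if it differs, NA if the package is not installed.
check_versions <- function(expected) {
  vapply(names(expected), function(pkg) {
    # system.file() returns "" when the package is not installed
    if (!nzchar(system.file(package = pkg))) return(NA)
    # Compare as version objects so that "3.7-0" and "3.7.0" are treated alike
    packageVersion(pkg) == package_version(expected[[pkg]])
  }, logical(1))
}

version_status <- check_versions(expected_versions)
version_status
```

Comparing `package_version` objects rather than raw strings is deliberate, because R normalizes the separators in version numbers when it formats them.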

Toward the standardization of big datasets of urine output for AKI analysis: A multicenter validation study

Ariel Avraham Hasidim, Matthew Adam Klein, Itamar Ben Shitrit, Sharon Einav, Karny Ilan, Lior Fuchs

February 24th, 2025
