This is a full reproduction of the results for the derivation cohort (MIMICdb), using MIMIC-IV v2.2 and MIMIC Code v2.4.0.

 

Citation

Hasidim, A.A., Klein, M.A., Ben Shitrit, I. et al. Toward the standardization of big datasets of urine output for AKI analysis: a multicenter validation study. Sci Rep 15, 20009 (2025). https://doi.org/10.1038/s41598-025-95535-4


Population and Study Sample

Total ICU stays in MIMIC:

dbGetQuery(con, "SELECT count(*) FROM `physionet-data.mimiciv_icu.icustays`")
## # A tibble: 1 × 1
##     f0_
##   <int>
## 1 73181

 

ICU stays with UO records (eligible):

n_distinct(raw_uo$STAY_ID)
## [1] 70364


Hospital admissions with UO records:

n_distinct(raw_uo$HADM_ID)
## [1] 64110

 

Patients with UO records:

n_distinct(raw_uo$SUBJECT_ID)
## [1] 49950

 

Count all UO records (before exclusions):

all_rows_count <- nrow(raw_uo)
all_rows_count
## [1] 3335985

 

Figure for Raw UO Records Data

Frequency of urine output charting by source

S3a <- raw_uo %>% 
  mutate(
    LABEL = case_when(
      LABEL == "GU Irrigant/Urine Volume Out" ~ "GU Irrig. Out",
      LABEL == "GU Irrigant Volume In" ~ "GU Irrig. In",
      .default = LABEL
    )
  ) %>% count(LABEL, sort = TRUE) %>%
   ggplot(aes(x=reorder(LABEL, -n), y=n)) +
   # A single bar layer is enough; the unfilled duplicate layer was redundant
   geom_bar(stat="identity", fill="steelblue") +
   xlab("") +
   ylab("") +
   geom_text(aes(label=n), vjust=-0.6, color="black", size=3) +
   theme_classic() +
   theme(axis.text.y=element_blank()) +
   theme(axis.text.x = element_text(angle = 45, hjust = 1))

S3a

 

Check for Duplications

Distinctiveness

First, we check whether the rows in the raw UO data are distinct.

Count distinct raw rows:

distinct_time_item_patient_rows_count <- raw_uo %>% 
  select(-VALUE, 
         -SERVICE, 
         -LABEL) %>% 
  n_distinct()

distinct_time_item_patient_rows_count
## [1] 3335985

Conclusion: the original raw query does not contain duplicates. The number of distinct rows equals the total row count (3,335,985) even when VALUE, SERVICE, and LABEL are ignored.
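The same kind of check can be sketched in base R on a toy data frame. This is only an illustration: the values and the item identifiers (101, 102) are made up, and the choice of STAY_ID, CHARTTIME, and ITEMID as key columns is an assumption about what remains after VALUE, SERVICE, and LABEL are dropped.

```r
# Toy stand-in for raw_uo; every value here is hypothetical.
toy_uo <- data.frame(
  STAY_ID   = c(1, 1, 1, 2),
  CHARTTIME = as.POSIXct(c("2020-01-01 08:00", "2020-01-01 08:00",
                           "2020-01-01 10:00", "2020-01-01 08:00"),
                         tz = "UTC"),
  ITEMID    = c(101, 102, 101, 101),  # placeholder item ids
  VALUE     = c(100, 50, 80, 100)
)

# Keep only the key columns, then count the distinct combinations.
keys <- toy_uo[, c("STAY_ID", "CHARTTIME", "ITEMID")]
n_distinct_keys <- nrow(unique(keys))

# TRUE when no stay has two rows for the same item at the same chart time.
no_duplicates <- n_distinct_keys == nrow(toy_uo)
no_duplicates
```

If the comparison were FALSE, the difference between the two counts would give the number of surplus rows to inspect.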

 

Simultaneous Charting

raw_uo_excluions_duplicates$same_value <- as.factor(raw_uo_excluions_duplicates$same_value)
raw_uo_excluions_duplicates$label <- as.factor(raw_uo_excluions_duplicates$label)
raw_uo_excluions_duplicates$label <- factor(raw_uo_excluions_duplicates$label, levels = as.factor(names(sort(table(raw_uo_excluions_duplicates$label),
                                  decreasing = TRUE))))

S4_a <- raw_uo_excluions_duplicates %>%
  select(same_value, label) %>%
  tbl_summary(by = same_value)

S4_a
Characteristic  Different volume, N = 12,051¹  Equal volume, N = 518¹
label

    GU Irrigant Volume In,GU Irrigant/Urine Volume Out 4,189 (35%) 8 (1.5%)
    Foley,L Nephrostomy 1,127 (9.4%) 69 (13%)
    Foley,GU Irrigant Volume In,GU Irrigant/Urine Volume Out 1,158 (9.6%) 3 (0.6%)
    Foley,R Nephrostomy 1,077 (8.9%) 60 (12%)
    R Nephrostomy,L Nephrostomy 988 (8.2%) 125 (24%)
    Foley,Void 703 (5.8%) 61 (12%)
    Foley,Suprapubic 557 (4.6%) 55 (11%)
    R Nephrostomy,Ileoconduit 346 (2.9%) 24 (4.6%)
    Foley,R Nephrostomy,L Nephrostomy 322 (2.7%) 4 (0.8%)
    Foley,GU Irrigant Volume In 186 (1.5%) 1 (0.2%)
    Void,Straight Cath 161 (1.3%) 4 (0.8%)
    Suprapubic,R Nephrostomy 151 (1.3%) 11 (2.1%)
    Foley,Condom Cath 113 (0.9%) 18 (3.5%)
    R Ureteral Stent,Foley 122 (1.0%) 8 (1.5%)
    Suprapubic,L Nephrostomy 120 (1.0%) 9 (1.7%)
    Void,Condom Cath 78 (0.6%) 14 (2.7%)
    L Nephrostomy,Ileoconduit 90 (0.7%) 1 (0.2%)
    R Ureteral Stent,L Ureteral Stent 51 (0.4%) 7 (1.4%)
    Void,L Nephrostomy 48 (0.4%) 7 (1.4%)
    Condom Cath,Straight Cath 47 (0.4%) 2 (0.4%)
    Foley,Ileoconduit 46 (0.4%) 2 (0.4%)
    R Ureteral Stent,L Nephrostomy 45 (0.4%) 2 (0.4%)
    Condom Cath,R Nephrostomy 29 (0.2%) 3 (0.6%)
    R Nephrostomy,L Nephrostomy,Ileoconduit 29 (0.2%) 0 (0%)
    Void,R Nephrostomy 23 (0.2%) 3 (0.6%)
    Condom Cath,Suprapubic 21 (0.2%) 2 (0.4%)
    Foley,Straight Cath 17 (0.1%) 4 (0.8%)
    Foley,GU Irrigant/Urine Volume Out 19 (0.2%) 0 (0%)
    R Ureteral Stent,L Ureteral Stent,L Nephrostomy 19 (0.2%) 0 (0%)
    R Ureteral Stent,R Nephrostomy,Ileoconduit 16 (0.1%) 0 (0%)
    L Ureteral Stent,Foley 14 (0.1%) 1 (0.2%)
    R Ureteral Stent,L Ureteral Stent,Foley 14 (0.1%) 0 (0%)
    R Ureteral Stent,R Nephrostomy 9 (<0.1%) 4 (0.8%)
    Suprapubic,R Nephrostomy,L Nephrostomy 12 (<0.1%) 1 (0.2%)
    Condom Cath,R Nephrostomy,L Nephrostomy 11 (<0.1%) 0 (0%)
    Condom Cath,L Nephrostomy 8 (<0.1%) 2 (0.4%)
    Foley,Suprapubic,R Nephrostomy 10 (<0.1%) 0 (0%)
    Foley,L Nephrostomy,GU Irrigant Volume In 7 (<0.1%) 0 (0%)
    Foley,L Nephrostomy,GU Irrigant Volume In,GU Irrigant/Urine Volume Out 6 (<0.1%) 0 (0%)
    R Nephrostomy,L Nephrostomy,GU Irrigant Volume In,GU Irrigant/Urine Volume Out 6 (<0.1%) 0 (0%)
    L Nephrostomy,GU Irrigant Volume In,GU Irrigant/Urine Volume Out 5 (<0.1%) 0 (0%)
    L Ureteral Stent,R Nephrostomy,L Nephrostomy 5 (<0.1%) 0 (0%)
    Void,R Nephrostomy,L Nephrostomy 5 (<0.1%) 0 (0%)
    R Ureteral Stent,Ileoconduit 4 (<0.1%) 0 (0%)
    Void,Suprapubic 4 (<0.1%) 0 (0%)
    Foley,L Nephrostomy,Ileoconduit 3 (<0.1%) 0 (0%)
    Foley,Void,Condom Cath 1 (<0.1%) 1 (0.2%)
    Ileoconduit,GU Irrigant Volume In,GU Irrigant/Urine Volume Out 2 (<0.1%) 0 (0%)
    R Nephrostomy,L Nephrostomy,GU Irrigant Volume In 2 (<0.1%) 0 (0%)
    R Ureteral Stent,Foley,R Nephrostomy 2 (<0.1%) 0 (0%)
    R Ureteral Stent,Void 2 (<0.1%) 0 (0%)
    Suprapubic,GU Irrigant Volume In,GU Irrigant/Urine Volume Out 2 (<0.1%) 0 (0%)
    Suprapubic,Ileoconduit 1 (<0.1%) 1 (0.2%)
    Void,Ileoconduit 2 (<0.1%) 0 (0%)
    Foley,R Nephrostomy,GU Irrigant Volume In 1 (<0.1%) 0 (0%)
    Foley,R Nephrostomy,Ileoconduit 1 (<0.1%) 0 (0%)
    L Nephrostomy,Straight Cath 1 (<0.1%) 0 (0%)
    L Ureteral Stent,Foley,L Nephrostomy 1 (<0.1%) 0 (0%)
    L Ureteral Stent,L Nephrostomy 1 (<0.1%) 0 (0%)
    L Ureteral Stent,Suprapubic 1 (<0.1%) 0 (0%)
    R Nephrostomy,GU Irrigant Volume In,GU Irrigant/Urine Volume Out 1 (<0.1%) 0 (0%)
    R Nephrostomy,GU Irrigant/Urine Volume Out 1 (<0.1%) 0 (0%)
    R Nephrostomy,L Nephrostomy,GU Irrigant/Urine Volume Out 1 (<0.1%) 0 (0%)
    R Nephrostomy,L Nephrostomy,Straight Cath 1 (<0.1%) 0 (0%)
    R Ureteral Stent,L Ureteral Stent,Foley,Suprapubic 1 (<0.1%) 0 (0%)
    R Ureteral Stent,Straight Cath 1 (<0.1%) 0 (0%)
    Suprapubic,GU Irrigant Volume In 1 (<0.1%) 0 (0%)
    Suprapubic,GU Irrigant/Urine Volume Out 0 (0%) 1 (0.2%)
    Void,Condom Cath,Straight Cath 1 (<0.1%) 0 (0%)
    Void,GU Irrigant Volume In 1 (<0.1%) 0 (0%)
    Void,GU Irrigant Volume In,GU Irrigant/Urine Volume Out 1 (<0.1%) 0 (0%)
¹ n (%)
Full SQL query:
WITH aa AS (
        SELECT
            STAY_ID,
            CHARTTIME
        FROM
            `mimic_uo_and_aki.a_urine_output_raw` uo
        GROUP BY
            STAY_ID,
            CHARTTIME
    ),
    ab AS (
        SELECT
            a.*,
            b.ITEMID,
            b.VALUE,
            c.label
        FROM
            aa a
            LEFT JOIN `mimic_uo_and_aki.a_urine_output_raw` b ON b.STAY_ID = a.STAY_ID
            AND b.CHARTTIME = a.CHARTTIME
            LEFT JOIN `physionet-data.mimiciv_icu.d_items` c ON c.itemid = b.itemid
        ORDER BY
            STAY_ID,
            CHARTTIME,
            ITEMID
    ),
    ac AS (
        SELECT
            STAY_ID,
            CHARTTIME,
            STRING_AGG(label) label,
            COUNT(STAY_ID) COUNT,
            IF(
                MIN(VALUE) = MAX(VALUE),
                "Equal volume",
                "Different volume"
            ) AS same_value
        FROM
            ab
        GROUP BY
            STAY_ID,
            CHARTTIME
    ),
    ad AS (
        SELECT
            COUNT(CHARTTIME) COUNT,
            label,
            same_value
        FROM
            ac
        WHERE
            COUNT > 1
        GROUP BY
            label,
            same_value
    )
SELECT
    *
FROM
    ac
WHERE
    COUNT > 1

In conclusion, most simultaneous records (12,051 of 12,569, about 96%) report different volumes across sources, so they are unlikely to be accidental duplicate entries of the same measurement.
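The grouping logic of the query can be sketched in base R as follows. The toy records are hypothetical, and the sketch mirrors only the `ac` step (group by stay and chart time, concatenate labels, compare minimum and maximum volume); it is not the authors' pipeline.

```r
# Hypothetical records: stay 1 has two sources charted at 08:00 with
# different volumes, stay 2 has two sources at 09:00 with equal volumes.
toy_uo <- data.frame(
  STAY_ID   = c(1, 1, 1, 2, 2, 3),
  CHARTTIME = c("08:00", "08:00", "10:00", "09:00", "09:00", "07:00"),
  LABEL     = c("Foley", "Void", "Foley",
                "R Nephrostomy", "L Nephrostomy", "Foley"),
  VALUE     = c(100, 40, 80, 30, 30, 120),
  stringsAsFactors = FALSE
)

# One group per (stay, chart time), like the GROUP BY in the query.
groups <- split(toy_uo, paste(toy_uo$STAY_ID, toy_uo$CHARTTIME))

summarise_group <- function(g) {
  data.frame(
    STAY_ID    = g$STAY_ID[1],
    CHARTTIME  = g$CHARTTIME[1],
    label      = paste(g$LABEL, collapse = ","),
    count      = nrow(g),
    same_value = if (min(g$VALUE) == max(g$VALUE)) "Equal volume" else "Different volume",
    stringsAsFactors = FALSE
  )
}

simultaneous <- do.call(rbind, lapply(groups, summarise_group))
# Keep only chart times with more than one record.
simultaneous <- simultaneous[simultaneous$count > 1, ]
table(simultaneous$same_value)

# Share of "Different volume" groups in the table reported above.
share_different <- 12051 / (12051 + 518)
```

Unlike `STRING_AGG` without an `ORDER BY`, `paste()` here keeps the labels in input row order, so the concatenated label is deterministic in this sketch.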

 

Exclusion

ICU type exclusion:

dbGetQuery(con, statement = read_file('sql/service_type_exclusion.sql'))
## # A tibble: 1 × 2
##   icu_stays UO_records
##       <int>      <int>
## 1       238       6223
Full SQL query:
WITH stays_services AS (
    -- Adding ICU type by looking into services
    SELECT
      a.STAY_ID,
      ARRAY_AGG(
        c.curr_service
        ORDER BY
          c.transfertime DESC
        LIMIT
          1
      ) [OFFSET(0)] AS SERVICE
    FROM
      `mimic_uo_and_aki.a_urine_output_raw` a
      LEFT JOIN `physionet-data.mimiciv_hosp.services` c ON c.hadm_id = a.HADM_ID
      AND c.transfertime < DATETIME_ADD(a.CHARTTIME, INTERVAL 1 HOUR)
    GROUP BY
      a.STAY_ID
  )
SELECT
  COUNT(DISTINCT STAY_ID) icu_stays,
  COUNT(STAY_ID) UO_records
FROM
  `mimic_uo_and_aki.a_urine_output_raw`
WHERE
  STAY_ID NOT IN (
    SELECT
      STAY_ID
    FROM
      stays_services
    WHERE
      SERVICE IN (
        'MED',
        'TSURG',
        'CSURG',
        'CMED',
        'NMED',
        'OMED',
        'TRAUM',
        'SURG',
        'NSURG',
        'ORTHO',
        'VSURG',
        'ENT',
        'PSURG',
        'GU'
      )
  )
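One reading of the `stays_services` step is that each ICU stay is assigned the most recent hospital service whose transfer time precedes the stay's latest UO chart time plus one hour. A minimal base R sketch of that reading for a single stay, using hypothetical transfer and chart times:

```r
# Hypothetical service transfers for one hospital admission.
services <- data.frame(
  hadm_id      = c(10, 10, 10),
  transfertime = as.POSIXct(c("2020-01-01 06:00", "2020-01-02 12:00",
                              "2020-01-05 09:00"), tz = "UTC"),
  curr_service = c("MED", "CSURG", "ORTHO"),
  stringsAsFactors = FALSE
)

# Hypothetical UO records of one ICU stay within that admission.
uo <- data.frame(
  STAY_ID   = 1,
  HADM_ID   = 10,
  CHARTTIME = as.POSIXct(c("2020-01-01 08:00", "2020-01-03 08:00"), tz = "UTC")
)

# A transfer qualifies if it happened before some UO chart time + 1 hour,
# which is equivalent to being before the latest chart time + 1 hour.
cutoff <- max(uo$CHARTTIME) + 3600
candidates <- services[services$hadm_id == uo$HADM_ID[1] &
                         services$transfertime < cutoff, ]

# Latest qualifying transfer wins (ORDER BY transfertime DESC LIMIT 1).
latest <- order(candidates$transfertime, decreasing = TRUE)[1]
stay_service <- candidates$curr_service[latest]
stay_service
```

In this toy case the ORTHO transfer happens after the last UO record, so the stay is attributed to CSURG.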

 

Ureteral stent exclusion:

dbGetQuery(con, statement = read_file('sql/ure_stent_exclusion.sql'))
## # A tibble: 1 × 2
##   icu_stays UO_records
##       <int>      <int>
## 1        45       3201
Full SQL query:
WITH stays_services AS (
        -- Adding ICU type by looking into services
        SELECT
            a.STAY_ID,
            ARRAY_AGG(
                c.curr_service
                ORDER BY
                    c.transfertime DESC
                LIMIT
                    1
            ) [OFFSET(0)] AS SERVICE
        FROM
            `mimic_uo_and_aki.a_urine_output_raw` a
            LEFT JOIN `physionet-data.mimiciv_hosp.services` c ON c.hadm_id = a.HADM_ID
            AND c.transfertime < DATETIME_ADD(a.CHARTTIME, INTERVAL 1 HOUR)
        GROUP BY
            a.STAY_ID
    )
SELECT
    COUNT(DISTINCT STAY_ID) icu_stays,
    COUNT(STAY_ID) UO_records
FROM
    `mimic_uo_and_aki.a_urine_output_raw`
WHERE
    STAY_ID IN (
        SELECT
            STAY_ID
        FROM
            `physionet-data.mimiciv_icu.outputevents`
        WHERE
            ITEMID IN (226558, 226557) -- Ureteral stent
        GROUP BY
            STAY_ID
    )
    AND STAY_ID IN (
        SELECT
            STAY_ID
        FROM
            stays_services
        WHERE
            SERVICE IN (
                'MED',
                'TSURG',
                'CSURG',
                'CMED',
                'NMED',
                'OMED',
                'TRAUM',
                'SURG',
                'NSURG',
                'ORTHO',
                'VSURG',
                'ENT',
                'PSURG',
                'GU'
            )
    )

 

GU irrigation exclusion:

dbGetQuery(con, statement = read_file('sql/gu_irig_exclusion.sql'))
## # A tibble: 1 × 2
##   icu_stays UO_records
##       <int>      <int>
## 1       639      85286
Full SQL query:
WITH stays_services AS (
    -- Adding ICU type by looking into services
    SELECT
      a.STAY_ID,
      ARRAY_AGG(
        c.curr_service
        ORDER BY
          c.transfertime DESC
        LIMIT
          1
      ) [OFFSET(0)] AS SERVICE
    FROM
      `mimic_uo_and_aki.a_urine_output_raw` a
      LEFT JOIN `physionet-data.mimiciv_hosp.services` c ON c.hadm_id = a.HADM_ID
      AND c.transfertime < DATETIME_ADD(a.CHARTTIME, INTERVAL 1 HOUR)
    GROUP BY
      a.STAY_ID
  )
SELECT
  COUNT(DISTINCT STAY_ID) icu_stays,
  COUNT(STAY_ID) UO_records
FROM
  `mimic_uo_and_aki.a_urine_output_raw`
WHERE
  STAY_ID IN (
    SELECT
      STAY_ID
    FROM
      `physionet-data.mimiciv_icu.outputevents`
    WHERE
      ITEMID IN (227488, 227489) -- GU irrigation
    GROUP BY
      STAY_ID
  )
  AND STAY_ID NOT IN (
        SELECT
            STAY_ID
        FROM
            `physionet-data.mimiciv_icu.outputevents`
        WHERE
            ITEMID IN (226558, 226557) -- Ureteral stent
        GROUP BY
            STAY_ID
    )
  AND STAY_ID IN (
    SELECT
      STAY_ID
    FROM
      stays_services
    WHERE
      SERVICE IN (
        'MED',
        'TSURG',
        'CSURG',
        'CMED',
        'NMED',
        'OMED',
        'TRAUM',
        'SURG',
        'NSURG',
        'ORTHO',
        'VSURG',
        'ENT',
        'PSURG',
        'GU'
      )
  )

 

UO records failing the sanity check (volume above 5,000 or below 0):

dbGetQuery(con, statement = read_file('sql/sanity.sql'))
## # A tibble: 1 × 1
##   UO_records
##        <int>
## 1          9
Full SQL query:
WITH stays_services AS (
    -- Adding ICU type by looking into services
    SELECT
      a.STAY_ID,
      ARRAY_AGG(
        c.curr_service
        ORDER BY
          c.transfertime DESC
        LIMIT
          1
      ) [OFFSET(0)] AS SERVICE
    FROM
      `mimic_uo_and_aki.a_urine_output_raw` a
      LEFT JOIN `physionet-data.mimiciv_hosp.services` c ON c.hadm_id = a.HADM_ID
      AND c.transfertime < DATETIME_ADD(a.CHARTTIME, INTERVAL 1 HOUR)
    GROUP BY
      a.STAY_ID
  )
SELECT
  COUNT(STAY_ID) UO_records
FROM
  `mimic_uo_and_aki.a_urine_output_raw`
WHERE
  STAY_ID NOT IN (
    SELECT
      STAY_ID
    FROM
      `physionet-data.mimiciv_icu.outputevents`
    WHERE
      ITEMID IN (226558, 226557, 227488, 227489)
    GROUP BY
      STAY_ID
  )
  AND STAY_ID IN (
    SELECT
      STAY_ID
    FROM
      stays_services
    WHERE
      SERVICE IN (
        'MED',
        'TSURG',
        'CSURG',
        'CMED',
        'NMED',
        'OMED',
        'TRAUM',
        'SURG',
        'NSURG',
        'ORTHO',
        'VSURG',
        'ENT',
        'PSURG',
        'GU'
      )
  )
  AND (
    VALUE > 5000
    OR VALUE < 0
  )

 

Total raw UO records after exclusions (“included records, before dropping records without collection times”):

nrow(raw_uo_eligible)
## [1] 3241266
Full SQL query:
WITH stays_services AS (
    -- Adding ICU type by looking into services
    SELECT
      a.STAY_ID,
      ARRAY_AGG(
        c.curr_service
        ORDER BY
          c.transfertime DESC
        LIMIT
          1
      ) [OFFSET(0)] AS SERVICE
    FROM
      `mimic_uo_and_aki.a_urine_output_raw` a
      LEFT JOIN `physionet-data.mimiciv_hosp.services` c ON c.hadm_id = a.HADM_ID
      AND c.transfertime < DATETIME_ADD(a.CHARTTIME, INTERVAL 1 HOUR)
    GROUP BY
      a.STAY_ID
  )
SELECT
  *
FROM
  `mimic_uo_and_aki.a_urine_output_raw`
WHERE
  STAY_ID NOT IN (
    SELECT
      STAY_ID
    FROM
      `physionet-data.mimiciv_icu.outputevents`
    WHERE
      ITEMID IN (226558, 226557, 227488, 227489)
    GROUP BY
      STAY_ID
  )
  AND STAY_ID IN (
    SELECT
      STAY_ID
    FROM
      stays_services
    WHERE
      SERVICE IN (
        'MED',
        'TSURG',
        'CSURG',
        'CMED',
        'NMED',
        'OMED',
        'TRAUM',
        'SURG',
        'NSURG',
        'ORTHO',
        'VSURG',
        'ENT',
        'PSURG',
        'GU'
      )
  )
  AND NOT (
    VALUE > 5000
    OR VALUE < 0
  )

 

Total ICU stays after exclusions (“included records, before dropping records without collection times”):

n_distinct(raw_uo_eligible$STAY_ID)
## [1] 69442
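As a quick consistency check, the exclusion counts reported above can be subtracted from the raw totals. The sketch below only re-uses numbers printed in this section; the variable names are illustrative.

```r
# Raw totals reported above.
all_records <- 3335985
all_stays   <- 70364

# Records and stays removed by each exclusion step, as reported above.
excluded_records <- c(icu_type = 6223, ureteral_stent = 3201,
                      gu_irrigation = 85286, sanity_check = 9)
excluded_stays   <- c(icu_type = 238, ureteral_stent = 45,
                      gu_irrigation = 639)

remaining_records <- all_records - sum(excluded_records)
remaining_stays   <- all_stays - sum(excluded_stays)

remaining_records == 3241266  # matches nrow(raw_uo_eligible)
remaining_stays == 69442      # matches n_distinct(raw_uo_eligible$STAY_ID)
```

The stay count balances without a term for the sanity check, which is consistent with that step removing individual records rather than whole ICU stays.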

 

Exclusion of first volume in each compartment per ICU stay:

uo_rate_including_null_collection_period %>%
  filter(STAY_ID %in% raw_uo_eligible$STAY_ID) %>%
  filter(is.na(TIME_INTERVAL)) %>%
  nrow()
## [1] 70051

 

UO records with time intervals (“Included records”):

uo_rate_including_null_collection_period %>%
  filter(STAY_ID %in% raw_uo_eligible$STAY_ID) %>%
  drop_na(TIME_INTERVAL) %>%
  nrow()
## [1] 3171215
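The first record in each compartment has no preceding record, so its collection interval is undefined. A minimal base R sketch of that idea, assuming the interval is the time in minutes since the previous record of the same stay and source (the toy data and the exact definition are assumptions, not the authors' query):

```r
# Hypothetical UO records.
toy_uo <- data.frame(
  STAY_ID   = c(1, 1, 1, 1, 2),
  SOURCE    = c("Foley", "Foley", "R Nephrostomy", "Foley", "Void"),
  CHARTTIME = as.POSIXct(c("2020-01-01 08:00", "2020-01-01 09:30",
                           "2020-01-01 09:30", "2020-01-01 12:00",
                           "2020-01-01 08:00"), tz = "UTC"),
  VALUE     = c(100, 90, 20, 150, 200),
  stringsAsFactors = FALSE
)

# Sort so that consecutive rows of a group are consecutive in time.
toy_uo <- toy_uo[order(toy_uo$STAY_ID, toy_uo$SOURCE, toy_uo$CHARTTIME), ]

# Minutes since the previous record in the same stay and source;
# the first record of each group gets NA.
group_key <- paste(toy_uo$STAY_ID, toy_uo$SOURCE)
toy_uo$TIME_INTERVAL <- ave(
  as.numeric(toy_uo$CHARTTIME), group_key,
  FUN = function(t) c(NA, diff(t)) / 60
)

n_first_records <- sum(is.na(toy_uo$TIME_INTERVAL))
n_with_interval <- sum(!is.na(toy_uo$TIME_INTERVAL))

# The reported counts balance the same way.
3241266 - 70051 == 3171215
```

Here three groups (two compartments in stay 1, one in stay 2) each contribute one record without an interval.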

 

Count UO records by anatomical compartment:

uo_rate %>% 
  mutate(agg_group = case_when(SOURCE == "Foley" |
                                 SOURCE == "Condom Cath" |
                                 SOURCE == "Straight Cath" |
                                 SOURCE == "Suprapubic" |
                                 SOURCE == "Void" ~ "Urinary bladder",
                               TRUE ~ SOURCE)
  ) %>%
           group_by(agg_group) %>%
   dplyr::summarise(N = n()
  )
## # A tibble: 4 × 2
##   agg_group             N
##   <chr>             <int>
## 1 Ileoconduit        6022
## 2 L Nephrostomy      3206
## 3 R Nephrostomy      3465
## 4 Urinary bladder 3158522
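The same source-to-compartment mapping can be written more compactly with `%in%`. This is an equivalent sketch in base R rather than the code used for the table; `to_compartment` is a hypothetical helper name.

```r
# Sources that drain the urinary bladder, taken from the case_when() above.
bladder_sources <- c("Foley", "Condom Cath", "Straight Cath",
                     "Suprapubic", "Void")

# Map a vector of sources to anatomical compartments.
to_compartment <- function(source) {
  source <- as.character(source)  # guard against factor input
  ifelse(source %in% bladder_sources, "Urinary bladder", source)
}

mapped <- to_compartment(c("Foley", "Void", "R Nephrostomy", "Ileoconduit"))
mapped

# The compartment counts reported above add up to the included records.
compartment_counts <- c(6022, 3206, 3465, 3158522)
sum(compartment_counts) == 3171215
```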

 

ICU stays with UO records with time intervals:

print("ICU stays after exclusion criteria:")
## [1] "ICU stays after exclusion criteria:"
uo_rate_including_null_collection_period %>%
  {n_distinct(.$STAY_ID)}
## [1] 69442
print("Included ICU stays (has time intervals):")
## [1] "Included ICU stays (has time intervals):"
uo_rate_including_null_collection_period %>%
                     drop_na(TIME_INTERVAL) %>%
  {n_distinct(.$STAY_ID)}
## [1] 67642
print("ICU stays with UO records that do not have a time interval (no previous UO record in the same compartment):")
## [1] "ICU stays with UO records that do not have a time interval (no previous UO record in the same compartment):"
uo_rate_including_null_collection_period %>%
                     filter(is.na(TIME_INTERVAL)) %>%
  {n_distinct(.$STAY_ID)}
## [1] 69438
print("ICU stays dropped due to no UO records with time intervals (no previous UO record in the same compartment):")
## [1] "ICU stays dropped due to no UO records with time intervals (no previous UO record in the same compartment):"
(uo_rate_including_null_collection_period %>%
                     filter(is.na(TIME_INTERVAL))) %>%
  filter(!(STAY_ID %in% (uo_rate_including_null_collection_period %>%
                     drop_na(TIME_INTERVAL))$STAY_ID)) %>%
  {n_distinct(.$STAY_ID)}
## [1] 1800
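The dropped stays are those whose records all lack a time interval. On a toy table with hypothetical values, the same set logic can be expressed with `setdiff()`:

```r
# Hypothetical records: stay 2 has a single record, hence no interval at all.
toy <- data.frame(
  STAY_ID       = c(1, 1, 2, 3, 3, 3),
  TIME_INTERVAL = c(NA, 60, NA, NA, 30, 45)
)

all_stays           <- unique(toy$STAY_ID)
stays_with_interval <- unique(toy$STAY_ID[!is.na(toy$TIME_INTERVAL)])

# Stays that never have a record with a time interval.
dropped_stays <- setdiff(all_stays, stays_with_interval)
dropped_stays

# The reported counts follow the same relation.
69442 - 67642 == 1800
```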


Count total ICU days of UO monitoring for ICU stays with time intervals:

hourly_uo %>%
  filter(STAY_ID %in%
           (uo_rate_including_null_collection_period %>%
                     drop_na(TIME_INTERVAL))$STAY_ID) %>%
  nrow() / 24
## [1] 218388.2

 

Hours of UO monitoring:

print("Hours of UO monitoring in all included ICU stays:")
## [1] "Hours of UO monitoring in all included ICU stays:"
hourly_uo %>%
  filter(STAY_ID %in%
           (
             uo_rate_including_null_collection_period %>%
               drop_na(TIME_INTERVAL)
           )$STAY_ID) %>%
  nrow()
## [1] 5241317
print("Valid hourly-adjusted UO monitoring hours:")
## [1] "Valid hourly-adjusted UO monitoring hours:"
hourly_uo %>%
  filter(STAY_ID %in%
           (
             uo_rate_including_null_collection_period %>%
               drop_na(TIME_INTERVAL)
           )$STAY_ID) %>%
  drop_na(HOURLY_WEIGHTED_MEAN_RATE) %>%
  nrow()
## [1] 5211377
print("ICU stays with valid hourly-adjusted UO monitoring hours:")
## [1] "ICU stays with valid hourly-adjusted UO monitoring hours:"
hourly_uo %>%
  filter(STAY_ID %in%
           (
             uo_rate_including_null_collection_period %>%
               drop_na(TIME_INTERVAL)
           )$STAY_ID) %>%
  drop_na(HOURLY_WEIGHTED_MEAN_RATE) %>%
  {n_distinct(.$STAY_ID)}
## [1] 67602


Proportion of valid hours out of all included hours of UO monitoring:

hourly_uo %>%
  filter(STAY_ID %in%
           (
             uo_rate_including_null_collection_period %>%
               drop_na(TIME_INTERVAL)
           )$STAY_ID) %>%
  drop_na(HOURLY_WEIGHTED_MEAN_RATE) %>%
  nrow() /
hourly_uo %>%
  filter(STAY_ID %in%
           (
             uo_rate_including_null_collection_period %>%
               drop_na(TIME_INTERVAL)
           )$STAY_ID) %>%
  nrow()
## [1] 0.9942877
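Because the same stay filter is repeated in the numerator and the denominator, the proportion can also be computed from a single filtered table. A base R sketch on hypothetical data (the column names follow the ones used above, the values do not):

```r
# Hypothetical hourly table; NA marks an hour without a valid rate.
toy_hourly <- data.frame(
  STAY_ID                   = c(1, 1, 1, 2, 2, 3),
  HOURLY_WEIGHTED_MEAN_RATE = c(50, NA, 70, 20, 30, 10)
)
included_ids <- c(1, 2)  # stand-in for the stays with time intervals

# Filter once, then take the share of non-missing hours.
included_hours <- toy_hourly[toy_hourly$STAY_ID %in% included_ids, ]
prop_valid <- mean(!is.na(included_hours$HOURLY_WEIGHTED_MEAN_RATE))
prop_valid

# The same ratio from the counts reported above.
reported_prop <- 5211377 / 5241317
```

Taking the mean of a logical vector gives the proportion of TRUE values, which avoids building the filtered table twice.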

 

Hourly-adjusted UO with admission weights:

print("ICU stays with weight at admission and valid hourly-adjusted UO:")
## [1] "ICU stays with weight at admission and valid hourly-adjusted UO:"
hourly_uo %>%
  filter(STAY_ID %in%
           (
             uo_rate_including_null_collection_period %>%
               drop_na(TIME_INTERVAL)
           )$STAY_ID) %>%
  drop_na(HOURLY_WEIGHTED_MEAN_RATE) %>%
  drop_na(WEIGHT_ADMIT) %>%
  {n_distinct(.$STAY_ID)}
## [1] 65717
print("Valid hourly-adjusted UO monitoring hours with weight at admission:")
## [1] "Valid hourly-adjusted UO monitoring hours with weight at admission:"
hourly_uo %>%
  filter(STAY_ID %in%
           (
             uo_rate_including_null_collection_period %>%
               drop_na(TIME_INTERVAL)
           )$STAY_ID) %>%
  drop_na(HOURLY_WEIGHTED_MEAN_RATE) %>%
  drop_na(WEIGHT_ADMIT) %>%
  nrow()
## [1] 5127472
print("ICU stays with valid weight (25 <= kg <=300) and valid hourly-adjusted UO:")
## [1] "ICU stays with valid weight (25 <= kg <=300) and valid hourly-adjusted UO:"
hourly_uo %>%
  filter(STAY_ID %in%
           (
             uo_rate_including_null_collection_period %>%
               drop_na(TIME_INTERVAL)
           )$STAY_ID) %>%
  drop_na(HOURLY_WEIGHTED_MEAN_RATE) %>%
  filter(WEIGHT_ADMIT <= 300,
         WEIGHT_ADMIT >= 25) %>%
  {n_distinct(.$STAY_ID)}
## [1] 65595
print("Valid hourly-adjusted UO monitoring hours with valid weights:")
## [1] "Valid hourly-adjusted UO monitoring hours with valid weights:"
hourly_uo %>%
  filter(STAY_ID %in%
           (
             uo_rate_including_null_collection_period %>%
               drop_na(TIME_INTERVAL)
           )$STAY_ID) %>%
  drop_na(HOURLY_WEIGHTED_MEAN_RATE) %>%
  filter(WEIGHT_ADMIT <= 300,
         WEIGHT_ADMIT >= 25) %>%
  nrow()
## [1] 5119874
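Note that `filter()` drops rows where the condition evaluates to NA, so the weight-range filter also removes stays without an admission weight. The base R sketch below makes that explicit; `is_valid_weight` is a hypothetical helper, and the example vector is made up apart from the 25 to 300 kg bounds.

```r
# TRUE only for non-missing weights within the 25-300 kg range.
is_valid_weight <- function(w, lower = 25, upper = 300) {
  !is.na(w) & w >= lower & w <= upper
}

weights <- c(70, NA, 1, 5864, 25, 300)
valid <- is_valid_weight(weights)
valid

# Stays that have a weight recorded but fall outside the range,
# from the counts reported above.
stays_out_of_range <- 65717 - 65595
stays_out_of_range
```

Both bounds are inclusive, matching the `<=` and `>=` comparisons in the code above.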


ICU stays with calculated KDIGO-UO staging (at least six consecutive hours with valid charting of hourly-adjusted UO)

print("number of ICU stays with valid KDIGO-UO staging")
## [1] "number of ICU stays with valid KDIGO-UO staging"
kdigo_uo_stage %>% 
  filter(stay_id %in%
           (
             uo_rate_including_null_collection_period %>%
               drop_na(TIME_INTERVAL)
           )$STAY_ID) %>%
  filter(weight_admit <= 300,
         weight_admit >= 25) %>%
  {n_distinct(.$stay_id)}
## [1] 64044
print("number of included hours with valid KDIGO-UO staging")
## [1] "number of included hours with valid KDIGO-UO staging"
kdigo_uo_stage %>% 
  filter(stay_id %in%
           (
             uo_rate_including_null_collection_period %>%
               drop_na(TIME_INTERVAL)
           )$STAY_ID) %>%
  filter(weight_admit <= 300,
         weight_admit >= 25) %>%
  nrow()
## [1] 4794111
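The requirement of at least six consecutive hours with valid charting can be illustrated with run-length encoding. This is a sketch of the eligibility idea only, assuming a logical vector with one entry per hour of a stay; it is not the staging query itself, and `has_consecutive_valid` is a hypothetical helper.

```r
# TRUE if `valid` contains a run of at least `min_run` consecutive TRUEs.
has_consecutive_valid <- function(valid, min_run = 6) {
  runs <- rle(valid)
  any(runs$values & runs$lengths >= min_run)
}

# Three valid hours, one gap, then six valid hours: eligible.
stay_a <- c(TRUE, TRUE, TRUE, FALSE, rep(TRUE, 6))
# Two runs of five valid hours separated by a gap: not eligible.
stay_b <- c(rep(TRUE, 5), FALSE, rep(TRUE, 5))

eligible_a <- has_consecutive_valid(stay_a)
eligible_b <- has_consecutive_valid(stay_b)
```

Ten valid hours in total are not enough for `stay_b`, because the gap splits them into two runs shorter than six.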


Eligible first ICU admission of each patient for AKI analysis

first_icu_stay <- dbGetQuery(con, "SELECT subject_id,
            ARRAY_AGG(
                STAY_ID
                ORDER BY
                    intime ASC
                LIMIT
                    1
            ) [OFFSET(0)] FIRST_STAY_ID_IN_PATIENT
        FROM
            `physionet-data.mimiciv_icu.icustays`
        GROUP BY
            subject_id")

print("number of first ICU stays with valid KDIGO-UO staging")
## [1] "number of first ICU stays with valid KDIGO-UO staging"
kdigo_uo_stage %>% 
  filter(stay_id %in% first_icu_stay$FIRST_STAY_ID_IN_PATIENT
         & stay_id %in% raw_uo_eligible$STAY_ID) %>%
  filter(weight_admit <= 300,
         weight_admit >= 25) %>%
  {n_distinct(.$stay_id)}
## [1] 46348
print("number of valid hourly KDIGO-UO staging in first ICU stays")
## [1] "number of valid hourly KDIGO-UO staging in first ICU stays"
kdigo_uo_stage %>% 
  filter(stay_id %in% first_icu_stay$FIRST_STAY_ID_IN_PATIENT
         & stay_id %in% raw_uo_eligible$STAY_ID) %>%
  filter(weight_admit <= 300,
         weight_admit >= 25) %>%
  nrow()
## [1] 3304774
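The `ARRAY_AGG(... ORDER BY intime ASC LIMIT 1)` pattern picks the earliest ICU stay per patient. For readers who prefer to do this step locally, a base R sketch on a hypothetical `icustays`-like table:

```r
# Hypothetical ICU stays; patient 1's earlier stay is listed second.
toy_stays <- data.frame(
  subject_id = c(1, 1, 2, 3, 3),
  stay_id    = c(11, 12, 21, 32, 31),
  intime     = as.POSIXct(c("2020-03-01", "2020-01-15", "2020-02-01",
                            "2020-05-01", "2020-06-01"), tz = "UTC")
)

# Sort by patient and admission time, then keep the first row per patient.
ordered_stays <- toy_stays[order(toy_stays$subject_id, toy_stays$intime), ]
first_stays <- ordered_stays[!duplicated(ordered_stays$subject_id),
                             c("subject_id", "stay_id")]
first_stays
```

`duplicated()` flags every row after the first occurrence of a `subject_id`, so negating it keeps exactly one stay per patient.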

Included ICU admissions for AKI analysis

print("number of first ICU stays (after exclusion criteria) with valid KDIGO-UO staging for the first 24 hours of the ICU stay")
## [1] "number of first ICU stays (after exclusion criteria) with valid KDIGO-UO staging for the first 24 hours of the ICU stay"
aki_epi <- akis_all_long %>%
  filter(group == "newcons") %>%
  drop_na(prevalnce_admit) %>%
  transmute(STAY_ID,
           first_kdigo_uo = first_stage,
         max_uo_stage = max_stage)
  

kdigo_uo_stage %>% 
  filter(stay_id %in% aki_epi$STAY_ID) %>%
  filter(weight_admit <= 300,
         weight_admit >= 25) %>%
  {n_distinct(.$stay_id)}
## [1] 46344
print("number of valid hourly KDIGO-UO for those stays")
## [1] "number of valid hourly KDIGO-UO for those stays"
kdigo_uo_stage %>% 
  filter(stay_id %in% aki_epi$STAY_ID) %>%
  filter(weight_admit <= 300,
         weight_admit >= 25) %>%
  nrow()
## [1] 3304351

Inclusion/Exclusion Flowchart

knitr::include_graphics("flow chart.png")

Table 1 - Patients’ characteristics

table_1$SERVICE <- as.factor(table_1$SERVICE)
table_1$admission_age <- as.numeric(table_1$admission_age)
table_1$weight_admit <- as.numeric(table_1$weight_admit)
table_1$height_first <- as.numeric(table_1$height_first)
table_1$creat_first <- as.numeric(table_1$creat_first)
table_1$creat_peak_72 <- as.numeric(table_1$creat_peak_72)
table_1$creat_last <- as.numeric(table_1$creat_last)

table_1 <- table_1 %>%
  mutate(
    race =
      case_when(
        grepl("asian", race, ignore.case = TRUE) ~ "Asian",
        grepl("black", race, ignore.case = TRUE) ~ "African American",
        grepl("white", race, ignore.case = TRUE) ~ "Caucasian",
        grepl("hispanic", race, ignore.case = TRUE) ~ "Hispanic",
        grepl("other", race, ignore.case = TRUE) ~ "Other",
        grepl("native", race, ignore.case = TRUE) ~ "Other",
        grepl("MULTIPLE", race, ignore.case = TRUE) ~ "Other",
        grepl("PORTUGUESE", race, ignore.case = TRUE) ~ "Other",
        grepl("SOUTH AMERICAN", race, ignore.case = TRUE) ~ "Other",
        TRUE ~ as.character(NA)
      )
  )

uo_for_table1 <- uo_rate_including_null_collection_period %>%
  drop_na(TIME_INTERVAL) %>%
  group_by(STAY_ID) %>%
  summarise(
    count = n(),
    volumes = mean(VALUE, na.rm = TRUE),
    collection_times = mean(TIME_INTERVAL, na.rm = TRUE),
    rates = mean(HOURLY_RATE, na.rm = TRUE),
    ml_kg_hr = mean(HOURLY_RATE / WEIGHT_ADMIT, na.rm = TRUE)
  )

t1a <- table_1 %>%
  select(
    admission_age,
    weight_admit,
    gender,
    race,
    charlson_comorbidity_index,
    ckd,
    dm,
    sofa_first_day,
    sapsii,
    apsiii,
    creat_first,
    creat_peak_72,
    creat_last,
    kdigo_cr_max,
    hospital_days,
    icu_days,
    rrt_binary,
    hospital_expire_flag
  ) %>%
  tbl_summary(
    type = list(
      c(hospital_expire_flag, ckd, dm, rrt_binary) ~ "dichotomous",
      c(
        admission_age,
        weight_admit,
        creat_first,
        creat_peak_72,
        creat_last
      ) ~ "continuous"
    ),
    statistic = c(
      admission_age,
      weight_admit,
      creat_first,
      creat_peak_72,
      creat_last
    ) ~ "{mean} ({sd})",
    missing = "no",
    label = list(
      admission_age ~ "Age at Hospital Admission, years",
      gender ~ "Gender",
      weight_admit ~ "Weight at ICU Admission, kg",
      charlson_comorbidity_index ~ "CCI Score",
      sofa_first_day ~ "SOFA Score at ICU Admission",
      ckd ~ "CKD, Stage 1-4",
      apsiii ~ "APS-III Score at ICU Admission",
      creat_first ~ "First Creatinine in ICU, mg/dL",
      creat_peak_72 ~ "Peak Creatinine at first days, mg/dL",
      creat_last ~ "ICU Discharge Creatinine, mg/dL",
      kdigo_cr_max ~ "Peak KDIGO-Cr at first days",
      race ~ "Ethnicity",
      icu_days ~ "Time in ICU, days",
      hospital_days ~ "Time in hospital, days",
      rrt_binary ~ "Renal replacement therapy",
      hospital_expire_flag ~ "Hospital Mortality",
      dm ~ "Diabetes Mellitus",
      sapsii ~ "SAPS-II at ICU Admission"
    )
  ) %>%
  add_n() 

display_prec <- function(x)
  mean(x) * 100

t1b <- uo_for_table1 %>%
  select(count,
         volumes,
         collection_times,
         rates,
         ml_kg_hr) %>%
  tbl_summary(
    type = list(
      c(volumes,
        collection_times,
        rates,
        ml_kg_hr) ~ "continuous"
    ),
    statistic = list(
      all_continuous() ~ "{mean} ({sd})"
    ),
    missing = "no",
    label = list(
      count ~ "Number of Measurements",
      volumes ~ "Average Volumes, mL",
      collection_times ~ "Average Collection Times, minutes",
      rates ~ "Average Rates, mL/hr",
      ml_kg_hr ~ "Average Rate to Weight, mL/hr/kg"
    )
  ) %>%
  add_n() 

tbl_stack(
  list(t1a, t1b),
  group_header = c("ICU Stay", "UO Charting Across ICU Stay")
) %>%
  as_gt() %>%
  tab_source_note(source_note = "The variables age at hospital admission, gender, CCI, CKD, ethnicity, time in hospital, and mortality are measured for each hospital admission and might be counted for more than one ICU stay. All other variables are measured individually for each ICU stay. The variables average collection times, average rates, and average rate to weight are only presented for ICU stays with at least two UO measurements for the same compartment, and the latter also required weight at admission. AKI variables were presented for ICU stays with at least one hour with a non-null KDIGO-UO stage. The variables average AKI duration and total time in AKI were specifically reported for ICU stays with at least one AKI event.") %>%
  tab_source_note(source_note = "AKI: Acute Kidney Injury; CCI: Charlson Comorbidity Index; CKD Stage 1-4: Chronic Kidney Disease excluding end-stage-renal-disease; ICU: Intensive Care Unit; KDIGO: Kidney Disease: Improving Global Outcomes; SOFA: Sequential Organ Failure Assessment; UO: Urine Output.")
Characteristic N N = 67,642¹
ICU Stay
Age at Hospital Admission, years 67,642 65 (17)
Weight at ICU Admission, kg 65,751 81 (34)
Gender 67,642
    F 29,686 (44%)
    M 37,956 (56%)
Ethnicity 60,412
    African American 6,836 (11%)
    Asian 1,956 (3.2%)
    Caucasian 46,323 (77%)
    Hispanic 2,484 (4.1%)
    Other 2,813 (4.7%)
CCI Score 67,642 5 (3, 7)
CKD, Stage 1-4 67,624 13,162 (19%)
Diabetes Mellitus 67,624 15,896 (24%)
SOFA Score at ICU Admission 67,642 4 (2, 6)
SAPS-II at ICU Admission 67,161 33 (25, 42)
APS-III Score at ICU Admission 67,642 39 (29, 52)
First Creatinine in ICU, mg/dL 67,287 1.35 (1.37)
Peak Creatinine at first days, mg/dL 67,263 1.52 (1.55)
ICU Discharge Creatinine, mg/dL 67,287 1.25 (1.21)
Peak KDIGO-Cr at first days 66,255
    0 48,817 (74%)
    1 12,222 (18%)
    2 2,764 (4.2%)
    3 2,452 (3.7%)
Time in hospital, days 67,642 7 (4, 13)
Time in ICU, days 67,642 2.0 (1.1, 3.8)
Renal replacement therapy 67,642 4,009 (5.9%)
Hospital Mortality 67,642 7,198 (11%)
UO Charting Across ICU Stay
Number of Measurements 67,642 47 (75)
Average Volumes, mL 67,642 182 (149)
Average Collection Times, minutes 67,642 157 (230)
Average Rates, mL/hr 67,642 115 (259)
Average Rate to Weight, mL/hr/kg 65,751 1.62 (6.16)
The variables age at hospital admission, gender, CCI, CKD, ethnicity, time in hospital, and mortality are measured for each hospital admission and might be counted for more than one ICU stay. All other variables are measured individually for each ICU stay. The variables average collection times, average rates, and average rate to weight are only presented for ICU stays with at least two UO measurements for the same compartment, and the latter also required weight at admission. AKI variables were presented for ICU stays with at least one hour with a non-null KDIGO-UO stage. The variables average AKI duration and total time in AKI were specifically reported for ICU stays with at least one AKI event.
AKI: Acute Kidney Injury; CCI: Charlson Comorbidity Index; CKD Stage 1-4: Chronic Kidney Disease excluding end-stage-renal-disease; ICU: Intensive Care Unit; KDIGO: Kidney Disease: Improving Global Outcomes; SOFA: Sequential Organ Failure Assessment; UO: Urine Output.
¹ Mean (SD); n (%); Median (Q1, Q3)

 

Age

mimic_Sage_a <- table_1 %>%
  dplyr::summarise(N = n(),
                   Mean = round(mean(admission_age),2),
                   SD = round(sd(admission_age),2),
                   '5th' = round(quantile(admission_age, 0.05),2),
                   '10th' = round(quantile(admission_age, 0.1),2),
                   '25th' = round(quantile(admission_age, 0.25),2),
                   '50th' = round(quantile(admission_age, 0.50),2),
                   '75th' = round(quantile(admission_age, 0.75),2),
                   '95th' = round(quantile(admission_age, 0.95),2),
                   Min = round(min(admission_age),2),
                   Max = round(max(admission_age),2)
  ) %>% gt() %>%
  fmt_number(use_seps = TRUE, decimals = 2)
mimic_Sage_a
N Mean SD 5th 10th 25th 50th 75th 95th Min Max
67,642.00 64.79 16.79 32.00 41.00 55.00 66.00 77.00 89.00 18.00 102.00
mimic_Sage_b <- ggplot() + 
  geom_histogram(aes(x = admission_age
                     ), data=table_1, binwidth = 1) + 
  labs(
        x = "Age (years)",
        y = "Frequency"
      )

mimic_Sage_b
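The same set of summary statistics (N, mean, SD, selected percentiles, minimum, maximum) is computed repeatedly in this report. As a sketch only, the pattern could be wrapped in a small base-R helper. The name `describe_numeric` and the toy ages are hypothetical and not part of the original analysis.

```r
# Hypothetical helper (not in the original code): N, mean, SD,
# selected percentiles, minimum and maximum of a numeric vector.
describe_numeric <- function(x, digits = 2) {
  x <- x[!is.na(x)]  # drop missing values first
  probs <- c(0.05, 0.10, 0.25, 0.50, 0.75, 0.95)
  q <- quantile(x, probs = probs, names = FALSE)
  out <- data.frame(
    N = length(x),
    Mean = round(mean(x), digits),
    SD = round(sd(x), digits)
  )
  # Add one column per percentile, named as in the report's tables
  out[c("5th", "10th", "25th", "50th", "75th", "95th")] <- as.list(round(q, digits))
  out$Min <- round(min(x), digits)
  out$Max <- round(max(x), digits)
  out
}

# Toy ages for illustration only (not study data)
toy_age <- c(18, 32, 41, 55, 66, 66, 77, 89, NA)
age_summary <- describe_numeric(toy_age)
age_summary
```

The resulting one-row data frame could be passed to `gt()` in the same way as the `summarise()` output above.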

 

Weight

mimic_Sweight_a <- table_1 %>%
  drop_na(weight_admit) %>%
  dplyr::summarise(N = n(),
                   Mean = round(mean(weight_admit),2),
                   SD = round(sd(weight_admit),2),
                   '5th' = round(quantile(weight_admit, 0.05),2),
                   '10th' = round(quantile(weight_admit, 0.1),2),
                   '25th' = round(quantile(weight_admit, 0.25),2),
                   '50th' = round(quantile(weight_admit, 0.50),2),
                   '75th' = round(quantile(weight_admit, 0.75),2),
                   '95th' = round(quantile(weight_admit, 0.95),2),
                   Min = round(min(weight_admit),2),
                   Max = round(max(weight_admit),2)
  ) %>% gt() %>%
  fmt_number(use_seps = TRUE, decimals = 2)
mimic_Sweight_a
N Mean SD 5th 10th 25th 50th 75th 95th Min Max
65,751.00 81.44 34.35 50.00 55.70 65.60 78.10 93.00 122.00 1.00 5,864.00
mimic_Sweight_b <- ggplot() + 
  geom_histogram(aes(x = weight_admit
                     ), data=table_1, binwidth = 5) + 
  labs(
        x = "Weight (kg)",
        y = "Frequency"
      ) +
  xlim(0, 300)

mimic_Sweight_b

 

Table 2 - UO records characteristics

uo_rate$SOURCE <- as.factor(uo_rate$SOURCE)
uo_rate$SERVICE <- as.factor(uo_rate$SERVICE)

uo_rate %>%
  select(VALUE, TIME_INTERVAL, SOURCE, SERVICE) %>%
  tbl_summary(by=SERVICE) %>%
  add_overall()
Characteristic Overall (N = 3,171,215)¹ CMED (N = 285,374)¹ PSURG (N = 5,896)¹ SURG (N = 376,125)¹ TRAUM (N = 139,716)¹ TSURG (N = 88,269)¹ VSURG (N = 83,399)¹ CSURG (N = 510,735)¹ ENT (N = 5,914)¹ GU (N = 6,562)¹ MED (N = 1,123,098)¹ NMED (N = 180,286)¹ NSURG (N = 307,162)¹ OMED (N = 20,881)¹ ORTHO (N = 37,798)¹
VALUE 100 (45, 175) 100 (50, 200) 100 (50, 200) 75 (40, 140) 80 (45, 150) 80 (45, 150) 70 (35, 125) 75 (40, 145) 125 (65, 240) 100 (50, 160) 100 (45, 190) 100 (50, 200) 125 (70, 225) 125 (60, 250) 80 (45, 150)
TIME_INTERVAL 60 (60, 120) 60 (60, 120) 60 (60, 120) 60 (60, 81) 60 (60, 60) 60 (60, 71) 60 (60, 60) 60 (60, 60) 60 (60, 120) 60 (60, 120) 60 (60, 120) 60 (60, 120) 60 (60, 120) 90 (60, 120) 60 (60, 82)
SOURCE
    Condom Cath 37,890 (1.2%) 3,977 (1.4%) 22 (0.4%) 2,954 (0.8%) 1,535 (1.1%) 932 (1.1%) 370 (0.4%) 2,292 (0.4%) 31 (0.5%) 0 (0%) 10,733 (1.0%) 7,422 (4.1%) 6,957 (2.3%) 598 (2.9%) 67 (0.2%)
    Foley 2,851,891 (90%) 234,442 (82%) 5,430 (92%) 355,674 (95%) 132,961 (95%) 82,596 (94%) 79,095 (95%) 491,557 (96%) 4,787 (81%) 4,941 (75%) 993,097 (88%) 151,583 (84%) 263,074 (86%) 16,042 (77%) 36,612 (97%)
    Ileoconduit 6,022 (0.2%) 93 (<0.1%) 0 (0%) 1,037 (0.3%) 53 (<0.1%) 1 (<0.1%) 165 (0.2%) 149 (<0.1%) 0 (0%) 953 (15%) 3,184 (0.3%) 242 (0.1%) 21 (<0.1%) 23 (0.1%) 101 (0.3%)
    L Nephrostomy 3,206 (0.1%) 41 (<0.1%) 0 (0%) 500 (0.1%) 33 (<0.1%) 4 (<0.1%) 0 (0%) 0 (0%) 0 (0%) 46 (0.7%) 2,377 (0.2%) 48 (<0.1%) 30 (<0.1%) 105 (0.5%) 22 (<0.1%)
    R Nephrostomy 3,465 (0.1%) 193 (<0.1%) 0 (0%) 328 (<0.1%) 7 (<0.1%) 4 (<0.1%) 12 (<0.1%) 74 (<0.1%) 0 (0%) 65 (1.0%) 2,642 (0.2%) 60 (<0.1%) 8 (<0.1%) 72 (0.3%) 0 (0%)
    Straight Cath 10,207 (0.3%) 280 (<0.1%) 10 (0.2%) 858 (0.2%) 488 (0.3%) 301 (0.3%) 261 (0.3%) 741 (0.1%) 41 (0.7%) 8 (0.1%) 2,458 (0.2%) 2,179 (1.2%) 2,336 (0.8%) 100 (0.5%) 146 (0.4%)
    Suprapubic 9,672 (0.3%) 383 (0.1%) 19 (0.3%) 1,182 (0.3%) 168 (0.1%) 10 (<0.1%) 25 (<0.1%) 16 (<0.1%) 0 (0%) 368 (5.6%) 6,965 (0.6%) 203 (0.1%) 329 (0.1%) 4 (<0.1%) 0 (0%)
    Void 248,862 (7.8%) 45,965 (16%) 415 (7.0%) 13,592 (3.6%) 4,471 (3.2%) 4,421 (5.0%) 3,471 (4.2%) 15,906 (3.1%) 1,055 (18%) 181 (2.8%) 101,642 (9.1%) 18,549 (10%) 34,407 (11%) 3,937 (19%) 850 (2.2%)
1 Median (Q1, Q3); n (%)

 

Data for single patient example

The data used for the single-patient example.

Raw UO records:

raw_uo_as_character <- raw_uo %>%
  filter(STAY_ID == 36871275)
raw_uo_as_character[] <- lapply(raw_uo_as_character, as.character)

S2_a <- raw_uo_as_character %>%
  select(-SUBJECT_ID, -HADM_ID, -STAY_ID, -SERVICE) %>%
  arrange(., CHARTTIME) %>%
  slice_head(n=15) %>%
  gt() %>%
  fmt_number(use_seps = TRUE, decimals = 0)

S2_a
CHARTTIME VALUE ITEMID LABEL
2144-05-19 17:00:00 50 226559 Foley
2144-05-19 18:00:00 20 226559 Foley
2144-05-19 19:00:00 20 226559 Foley
2144-05-19 19:46:00 150 226564 R Nephrostomy
2144-05-19 20:00:00 20 226559 Foley
2144-05-19 21:00:00 35 226559 Foley
2144-05-19 22:00:00 45 226564 R Nephrostomy
2144-05-19 22:00:00 35 226559 Foley
2144-05-20 23 226559 Foley
2144-05-20 01:00:00 40 226559 Foley
2144-05-20 01:00:00 35 226564 R Nephrostomy
2144-05-20 02:00:00 17 226559 Foley
2144-05-20 03:00:00 22 226559 Foley
2144-05-20 04:00:00 20 226559 Foley
2144-05-20 05:00:00 50 226559 Foley

 

UO Rates

uo_rate %>%
  filter(STAY_ID == 36871275) %>%
  select(-HADM_ID, -STAY_ID, -WEIGHT_ADMIT, -SERVICE) %>%
  arrange(., CHARTTIME) %>%
  slice_head(n=20) %>%
  gt() %>%
  fmt_number(use_seps = TRUE, decimals = 0)
SOURCE VALUE CHARTTIME LAST_CHARTTIME TIME_INTERVAL HOURLY_RATE
Foley 20 2144-05-19 18:00:00 2144-05-19 17:00:00 60 20
Foley 20 2144-05-19 19:00:00 2144-05-19 18:00:00 60 20
Foley 20 2144-05-19 20:00:00 2144-05-19 19:00:00 60 20
Foley 35 2144-05-19 21:00:00 2144-05-19 20:00:00 60 35
Foley 35 2144-05-19 22:00:00 2144-05-19 21:00:00 60 35
R Nephrostomy 45 2144-05-19 22:00:00 2144-05-19 19:46:00 134 20
Foley 23 2144-05-20 2144-05-19 22:00:00 120 12
Foley 40 2144-05-20 01:00:00 2144-05-20 60 40
R Nephrostomy 35 2144-05-20 01:00:00 2144-05-19 22:00:00 180 12
Foley 17 2144-05-20 02:00:00 2144-05-20 01:00:00 60 17
Foley 22 2144-05-20 03:00:00 2144-05-20 02:00:00 60 22
Foley 20 2144-05-20 04:00:00 2144-05-20 03:00:00 60 20
Foley 50 2144-05-20 05:00:00 2144-05-20 04:00:00 60 50
Foley 50 2144-05-20 06:00:00 2144-05-20 05:00:00 60 50
Foley 135 2144-05-20 08:00:00 2144-05-20 06:00:00 120 68
Foley 65 2144-05-20 10:00:00 2144-05-20 08:00:00 120 32
Foley 60 2144-05-20 12:00:00 2144-05-20 10:00:00 120 30
R Nephrostomy 100 2144-05-20 12:00:00 2144-05-20 01:00:00 660 9
Foley 15 2144-05-20 13:00:00 2144-05-20 12:00:00 60 15
Foley 40 2144-05-20 14:00:00 2144-05-20 13:00:00 60 40
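The rows above are consistent with a collection period measured from the previous record of the same source, and an hourly rate equal to the volume divided by that period in hours. The base-R sketch below reproduces a few of the displayed rows under that assumption; the actual derivation of `uo_rate` is not shown in this section, so this is an illustration rather than the original query.

```r
# Six records copied from the raw table of the single-patient example
uo <- data.frame(
  SOURCE = c("Foley", "Foley", "Foley",
             "R Nephrostomy", "R Nephrostomy", "R Nephrostomy"),
  CHARTTIME = as.POSIXct(
    c("2144-05-19 21:00:00", "2144-05-19 22:00:00", "2144-05-20 00:00:00",
      "2144-05-19 19:46:00", "2144-05-19 22:00:00", "2144-05-20 01:00:00"),
    tz = "UTC"
  ),
  VALUE = c(35, 35, 23, 150, 45, 35)
)

# Order within source, then take the previous chart time of the same source
uo <- uo[order(uo$SOURCE, uo$CHARTTIME), ]
t_min <- as.numeric(uo$CHARTTIME) / 60  # minutes since epoch
last_min <- ave(t_min, uo$SOURCE, FUN = function(x) c(NA, head(x, -1)))

# Assumed definitions: interval in minutes, rate in ml/hr
uo$TIME_INTERVAL <- t_min - last_min
uo$HOURLY_RATE <- uo$VALUE / (uo$TIME_INTERVAL / 60)

# The first record of each source has no preceding record, so no rate
uo <- uo[!is.na(uo$TIME_INTERVAL), ]
uo[, c("SOURCE", "VALUE", "TIME_INTERVAL", "HOURLY_RATE")]
```

For the R Nephrostomy record charted at 22:00, this gives 45 ml over 134 minutes, about 20 ml/hr, matching the table after rounding.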

 

Hourly-Adjusted UO

hourly_uo %>%
  filter(STAY_ID == 36871275) %>%
  select(-STAY_ID, -WEIGHT_ADMIT) %>%
  arrange(., T_PLUS) %>%
  slice_head(n=20) %>%
  gt() %>%
  fmt_number(use_seps = TRUE, decimals = 0)
T_PLUS TIME_INTERVAL_STARTS TIME_INTERVAL_FINISH HOURLY_WEIGHTED_MEAN_RATE SIMPLE_SUM
1 2144-05-19 17:00:00 2144-05-19 18:00:00 20 20
2 2144-05-19 18:00:00 2144-05-19 19:00:00 20 20
3 2144-05-19 19:00:00 2144-05-19 20:00:00 25 170
4 2144-05-19 20:00:00 2144-05-19 21:00:00 55 35
5 2144-05-19 21:00:00 2144-05-19 22:00:00 55 80
6 2144-05-19 22:00:00 2144-05-19 23:00:00 23 0
7 2144-05-19 23:00:00 2144-05-20 23 23
8 2144-05-20 2144-05-20 01:00:00 52 75
9 2144-05-20 01:00:00 2144-05-20 02:00:00 26 17
10 2144-05-20 02:00:00 2144-05-20 03:00:00 31 22
11 2144-05-20 03:00:00 2144-05-20 04:00:00 29 20
12 2144-05-20 04:00:00 2144-05-20 05:00:00 59 50
13 2144-05-20 05:00:00 2144-05-20 06:00:00 59 50
14 2144-05-20 06:00:00 2144-05-20 07:00:00 77 0
15 2144-05-20 07:00:00 2144-05-20 08:00:00 77 135
16 2144-05-20 08:00:00 2144-05-20 09:00:00 42 0
17 2144-05-20 09:00:00 2144-05-20 10:00:00 42 65
18 2144-05-20 10:00:00 2144-05-20 11:00:00 39 0
19 2144-05-20 11:00:00 2144-05-20 12:00:00 39 160
20 2144-05-20 12:00:00 2144-05-20 13:00:00 33 15
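The displayed values of HOURLY_WEIGHTED_MEAN_RATE are consistent with summing, for each clock hour, every record's rate multiplied by the fraction of that hour its collection period overlaps. The sketch below checks this reading against hours 3 and 4 of the table. The function `hourly_adjusted` and the minutes-after-midnight encoding are hypothetical simplifications, not the original implementation.

```r
# Records overlapping 19:00-21:00 on 2144-05-19, in minutes after midnight
records <- data.frame(
  SOURCE = c("Foley", "Foley", "R Nephrostomy"),
  START  = c(19 * 60, 20 * 60, 19 * 60 + 46),
  END    = c(20 * 60, 21 * 60, 22 * 60),
  VALUE  = c(20, 35, 45)
)
records$RATE <- records$VALUE / ((records$END - records$START) / 60)  # ml/hr

# Hypothetical helper: time-weighted rate for one clock hour
hourly_adjusted <- function(records, hour_start, hour_end = hour_start + 60) {
  # Minutes of each collection period that fall inside the hour
  overlap <- pmax(0, pmin(records$END, hour_end) - pmax(records$START, hour_start))
  sum(records$RATE * overlap / 60)
}

hour_3 <- hourly_adjusted(records, 19 * 60)  # 19:00-20:00
hour_4 <- hourly_adjusted(records, 20 * 60)  # 20:00-21:00
c(hour_3 = hour_3, hour_4 = hour_4)
```

Under this reading, hour 3 receives the full Foley rate plus 14 minutes of the nephrostomy rate, whereas SIMPLE_SUM for the same hour adds the 150 ml nephrostomy volume charted at 19:46 in full.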

Raw data analysis

Collection Periods

S6_a <- uo_rate %>% group_by(SOURCE) %>%
   dplyr::summarise(N = n(),
                   Mean = round(mean(TIME_INTERVAL),0),
                   SD = round(sd(TIME_INTERVAL),0),
                   '5th' = round(quantile(TIME_INTERVAL, 0.05),0),
                   '10th' = round(quantile(TIME_INTERVAL, 0.1),0),
                   '25th' = round(quantile(TIME_INTERVAL, 0.25),0),
                   '50th' = round(quantile(TIME_INTERVAL, 0.50),0),
                   '75th' = round(quantile(TIME_INTERVAL, 0.75),0),
                   '95th' = round(quantile(TIME_INTERVAL, 0.95),0),
                   Min = round(min(TIME_INTERVAL),0),
                   Max = round(max(TIME_INTERVAL),0)
  ) %>% 
  arrange(desc(N)) %>% 
  gt() %>%
  fmt_number(use_seps = TRUE, decimals = 0)

S6_a
SOURCE N Mean SD 5th 10th 25th 50th 75th 95th Min Max
Foley 2,851,891 83 97 60 60 60 60 105 180 1 43,235
Void 248,862 232 375 60 60 119 180 287 585 1 51,846
Condom Cath 37,890 222 304 60 60 90 131 240 600 1 8,904
Straight Cath 10,207 732 1,486 60 122 300 409 617 2,547 1 35,375
Suprapubic 9,672 115 131 60 60 60 60 120 240 1 6,616
Ileoconduit 6,022 129 213 60 60 60 68 120 300 1 8,040
R Nephrostomy 3,465 223 243 60 60 120 180 240 635 2 7,740
L Nephrostomy 3,206 251 298 60 60 120 180 300 660 2 7,560
S6_b <- ggplot(data = uo_rate, aes(x = TIME_INTERVAL / 60)) +
  geom_histogram(binwidth = 1) +
  facet_wrap(~factor(SOURCE, levels=c('Foley', 'Suprapubic', 'Ileoconduit',
                         'Void', 'Condom Cath', 'Straight Cath',
                         'R Nephrostomy', 'L Nephrostomy')), scales = "free") +
  xlim(-1, 20) +
  labs(
          x = "Time interval (hr)",
          y = "Frequency"
        ) 

S6_b

 

Volumes and Collection Periods

S7_a <- uo_rate %>% group_by(SOURCE) %>%
  dplyr::summarise(N = n(),
                   Mean = round(mean(VALUE),0),
                   SD = round(sd(VALUE),0),
                   '5th' = round(quantile(VALUE, 0.05),0),
                   '10th' = round(quantile(VALUE, 0.1),0),
                   '25th' = round(quantile(VALUE, 0.25),0),
                   '50th' = round(quantile(VALUE, 0.50),0),
                   '75th' = round(quantile(VALUE, 0.75),0),
                   '95th' = round(quantile(VALUE, 0.95),0),
                   Min = round(min(VALUE),0),
                   Max = round(max(VALUE),0)
  ) %>% 
  arrange(desc(N)) %>% 
  gt() %>%
  fmt_number(use_seps = TRUE, decimals = 0)

S7_a
SOURCE N Mean SD 5th 10th 25th 50th 75th 95th Min Max
Foley 2,851,891 118 124 15 25 40 80 150 350 0 4,385
Void 248,862 299 195 50 100 160 250 400 700 0 4,500
Condom Cath 37,890 246 205 30 50 100 200 320 625 0 2,900
Straight Cath 10,207 497 271 50 150 320 500 650 1,000 0 2,550
Suprapubic 9,672 126 140 12 25 45 90 150 350 0 2,050
Ileoconduit 6,022 148 149 15 30 50 100 200 400 0 2,500
R Nephrostomy 3,465 157 144 10 20 50 100 220 450 0 1,200
L Nephrostomy 3,206 167 145 10 25 50 125 250 450 0 1,150
S7_b <- ggplot(data = uo_rate, aes(x = VALUE)) +
  facet_wrap(~factor(SOURCE, levels=c('Foley', 'Suprapubic', 'Ileoconduit',
                         'Void', 'Condom Cath', 'Straight Cath',
                         'R Nephrostomy', 'L Nephrostomy')), scales = "free") +
  geom_histogram(binwidth = 50) +
  xlim(-25, 1100) +
  labs(
        # title = "Volumes",
        x = "Volume (ml)",
        y = "Frequency"
      ) +
      theme(
        plot.title = element_text(color = "#0099F8", size = 16, face = "bold"),
        plot.subtitle = element_text(size = 10, face = "bold"),
        plot.caption = element_text(face = "italic")
      )

S7_b

 

Records of zero volume

Proportion of zero-volume UO measurements by source:

uo_rate_count <- uo_rate %>% 
  count(SOURCE, sort = TRUE)

uo_rate_0_count <- uo_rate %>% 
  filter(VALUE == 0) %>% 
  count(SOURCE, sort = TRUE)

count_uo_zero_vs_all <- left_join(uo_rate_count, 
                                  uo_rate_0_count, by = "SOURCE") %>% 
  mutate(PROPORTION = n.y / n.x) %>%
  pivot_longer(cols = n.y:n.x, names_to = "type")
  
S7_c <- count_uo_zero_vs_all %>%
  ggplot(aes(x=reorder(SOURCE, -value), y=value, fill=type)) +
    geom_bar(position="fill", stat="identity") +
    xlab("") +
    ylab("") +
    scale_fill_brewer(palette="Paired") +  
    geom_text(aes(label=ifelse(type == "n.y", paste0((round(PROPORTION, 3) * 100), "%"), "")), 
          color="black", 
          size=3, 
          vjust=-1,
          position="fill") +
    theme_minimal() +
    theme(axis.text.y = element_blank(),
          legend.position = "none",
          axis.text.x = element_text(angle = 45, hjust = 1))
S7_c

uo_rate_0 <- uo_rate %>% filter(VALUE == 0) 
S7_d <- uo_rate_0 %>% group_by(SOURCE) %>%
  dplyr::summarise(N = n(),
                   Mean = round(mean(TIME_INTERVAL),0),
                   SD = round(sd(TIME_INTERVAL),0),
                   '5th' = round(quantile(TIME_INTERVAL, 0.05),0),
                   '10th' = round(quantile(TIME_INTERVAL, 0.1),0),
                   '25th' = round(quantile(TIME_INTERVAL, 0.25),0),
                   '50th' = round(quantile(TIME_INTERVAL, 0.50),0),
                   '75th' = round(quantile(TIME_INTERVAL, 0.75),0),
                   '95th' = round(quantile(TIME_INTERVAL, 0.95),0),
                   Min = round(min(TIME_INTERVAL),0),
                   Max = round(max(TIME_INTERVAL),0)
  ) %>% 
  arrange(desc(N)) %>% 
  gt() %>%
  fmt_number(use_seps = TRUE, decimals = 0)

S7_d
SOURCE N Mean SD 5th 10th 25th 50th 75th 95th Min Max
Foley 31,574 103 232 45 60 60 60 104 240 1 15,007
Void 2,609 412 1,078 39 60 60 180 300 1,440 1 18,747
Condom Cath 651 194 388 51 60 60 120 240 516 1 5,952
Suprapubic 213 222 230 33 60 60 210 240 528 1 2,189
Straight Cath 101 1,013 1,859 60 120 180 303 858 3,508 13 14,427
Ileoconduit 89 216 292 60 60 60 120 240 699 15 1,680
R Nephrostomy 85 275 873 60 60 60 120 180 516 13 7,740
L Nephrostomy 70 258 892 60 60 60 120 220 360 60 7,560

Adjusting for hourly UO

UO Rate

S8_a <- uo_rate %>% group_by(SOURCE) %>%
    dplyr::summarise(N = n(),
                   Mean = round(mean(HOURLY_RATE),0),
                   SD = round(sd(HOURLY_RATE),0),
                   '5th' = round(quantile(HOURLY_RATE, 0.05),0),
                   '10th' = round(quantile(HOURLY_RATE, 0.1),0),
                   '25th' = round(quantile(HOURLY_RATE, 0.25),0),
                   '50th' = round(quantile(HOURLY_RATE, 0.50),0),
                   '75th' = round(quantile(HOURLY_RATE, 0.75),0),
                   '95th' = round(quantile(HOURLY_RATE, 0.95),0),
                   Min = round(min(HOURLY_RATE),0),
                   Max = round(max(HOURLY_RATE),0)
  ) %>% 
  arrange(desc(N)) %>% 
  gt() %>%
  fmt_number(use_seps = TRUE, decimals = 0)

S8_a
SOURCE N Mean SD 5th 10th 25th 50th 75th 95th Min Max
Foley 2,851,891 105 276 11 20 35 62 120 300 0 108,000
Void 248,862 159 639 17 25 50 90 160 400 0 57,000
Condom Cath 37,890 117 309 10 20 40 75 129 325 0 24,000
Straight Cath 10,207 195 1,786 2 11 40 69 114 378 0 114,000
Suprapubic 9,672 89 151 5 15 32 60 100 250 0 6,000
Ileoconduit 6,022 99 179 7 17 38 67 120 262 0 9,000
R Nephrostomy 3,465 65 138 3 7 17 40 75 200 0 6,000
L Nephrostomy 3,206 66 300 3 7 18 38 75 200 0 16,500
S8_b <- ggplot(data = uo_rate, aes(x = HOURLY_RATE)) +
  geom_histogram(binwidth = 20) +
  facet_wrap(~factor(SOURCE, levels=c('Foley', 'Suprapubic', 'Ileoconduit',
                         'Void', 'Condom Cath', 'Straight Cath',
                         'R Nephrostomy', 'L Nephrostomy')), scales = "free") +  xlim(-10, 500) +
  labs(
        # title = "UO Rates",
        # subtitle = "by source",
        x = "Rate (ml/hr)",
        y = "Frequency"
      ) +
      theme(
        plot.title = element_text(color = "#0099F8", size = 16, face = "bold"),
        plot.subtitle = element_text(size = 10, face = "bold"),
        plot.caption = element_text(face = "italic")
      )

S8_b

 

Low UO Rate Analysis

The association between UO rates and collection periods, smoothed conditional means for records of Foley catheter.

S8_d <- uo_rate %>%
  filter(SOURCE == "Foley",
         HOURLY_RATE < 500) %>%
  # slice_head(n = 50000) %>%
  ggplot(aes(x = HOURLY_RATE, y = TIME_INTERVAL)) +
  geom_smooth(se = TRUE, alpha = 0.1, linewidth = 1) +
  # geom_smooth(
  #   se = FALSE,
  #   method = "lm",
  #   linetype = "dashed",
  #   color = "red", 
  #   linewidth = 0.3
  # ) +
  geom_hline(yintercept = 60,
             linewidth = 0.3,
             color = "#cccccc") +
  geom_vline(
    xintercept = 20,
    linewidth = 0.3,
    color = "black",
    linetype = "dotdash"
  ) +
  scale_x_continuous(breaks = c(0, 20, 50, 100, 200, 300, 400, 500)) +
  scale_y_continuous(breaks = c(0, 60, 100, 200)) +
  coord_cartesian(xlim = c(0, 500), ylim = c(0, 200)) +
  labs(x = "Urine output rate (ml/hr)", y = "Collection periods (min)") +
  theme_classic()

S8_d

Quantile analysis for collection periods as a function of rates.

uo_rate_qreg <- uo_rate %>%
  left_join(anchor_year, by = "HADM_ID") %>%
  filter(SOURCE == "Foley",
         HOURLY_RATE < 500,
         ANCHOR_START > 2016) %>%
  slice_head(n = 500000)

#### Quantile
quantile_reg <- rq(TIME_INTERVAL ~
                     HOURLY_RATE,
                   seq(0.10, 0.90, by = 0.10),
                   # c(.05, .1, .25, .5, .75, .90, .95),
                   data = uo_rate_qreg)

# summary(quantile_reg, se = "iid") %>% 
#   plot()
### OLS
lm <- lm(data=uo_rate_qreg,
         formula =  TIME_INTERVAL ~
           HOURLY_RATE)

ols <- as.data.frame(coef(lm))
ols.ci <- as.data.frame(confint(lm, level = 0.95))
ols2 <- cbind(ols, ols.ci)
ols2 <- tibble::rownames_to_column(ols2, var="term")



#### Quantile
S8_e <- quantile_reg %>%
  tidy(se.type = "iid", conf.int = TRUE, conf.level = 0.95) %>%
  filter(!grepl("factor", term)) %>%
  ggplot(aes(x=tau,y=estimate)) +
  theme_classic() +
  theme(
    strip.background = element_blank(),
    #strip.text.x = element_blank()
  ) +
  scale_y_continuous(limits = symmetric_limits) +
  scale_x_continuous(breaks = scales::pretty_breaks(n = 12)) +
  ##### quantile results
  geom_point(color="#27408b", size = 0.3)+ 
  geom_line(color="black", linetype = "dotdash", size = 0.3)+ 
  geom_ribbon(aes(ymin=conf.low,ymax=conf.high),alpha=0.25, fill="#555555")+
  facet_wrap(~term, scales="free", ncol=1)+
  ##### OLS results
  geom_hline(data = ols2, aes(yintercept= `coef(lm)`), lty=1, color="red", size=0.3)+
  geom_hline(data = ols2, aes(yintercept= `2.5 %`), lty=2, color="red", size=0.3)+
  geom_hline(data = ols2, aes(yintercept= `97.5 %`), lty=2, color="red", size=0.3)+
  #### Lines
   geom_hline(yintercept = 0, size=0.3) 

S8_e

# Visualization for Quantile Regression with some tau values: 
intercept_slope <- quantile_reg %>% 
  coef() %>% 
  t() %>% 
  data.frame() %>% 
  rename(intercept = X.Intercept., slope = HOURLY_RATE) %>% 
  mutate(quantile = row.names(.))


S8_f <-
  ggplot() +
  geom_jitter(data = uo_rate_qreg, aes(HOURLY_RATE, TIME_INTERVAL),
    alpha = 0.2,
    size = 0.5,
    stroke = 0.5,
    width = 2,
    height = 2
  ) +
  geom_abline(data = intercept_slope, aes(
    intercept = intercept,
    slope = slope,
    color = quantile
  ),
  linewidth=1) +
  theme_minimal() +
  labs(x = "Urine output rate (ml/hr)", y = "Collection periods (min)") +
  coord_cartesian(xlim = c(0, 500), ylim = c(0, 500))

S8_f
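As background for the quantile regression above: `rq()` estimates its coefficients by minimizing an asymmetric absolute ("check") loss instead of squared error. The base-R sketch below illustrates this for the intercept-only case, where the minimizer is the sample quantile. The collection times are made-up values and the helper names are hypothetical.

```r
# Check (pinball) loss of a candidate value q at quantile level tau
check_loss <- function(q, y, tau) {
  u <- y - q
  sum(u * (tau - (u < 0)))  # tau * u if u >= 0, (1 - tau) * |u| otherwise
}

# Toy collection periods in minutes (illustration only, not study data)
toy_interval <- c(30, 45, 60, 60, 90, 120, 180, 240, 300)

# Hypothetical helper: numerically minimize the check loss over q
fit_quantile <- function(y, tau) {
  optimize(check_loss, interval = range(y), y = y, tau = tau)$minimum
}

estimates <- sapply(c(0.25, 0.50, 0.75), fit_quantile, y = toy_interval)
round(estimates, 1)
```

With a covariate such as HOURLY_RATE, the same loss is minimized over an intercept and a slope, which yields one fitted line per tau.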

uo_rate_qreg <- uo_rate %>%
  filter(SOURCE == "Foley") %>%
  arrange(STAY_ID) %>%
  slice_head(n = 500000)

percentile <- ecdf(uo_rate_qreg$TIME_INTERVAL)

#### Quantile
quantile_reg2 <- rq(TIME_INTERVAL ~ HOURLY_RATE, 
                    # seq(0.20, 0.80, by = 0.10), 
                    c(percentile(30),
                      percentile(60) - ((1-percentile(60))/10),
                      percentile(90),
                      percentile(120) - ((1-percentile(120))/10),
                      percentile(150),
                      percentile(180) - ((1-percentile(180))/10),
                      percentile(210),
                      percentile(240) - ((1-percentile(240))/10)),
                    data=uo_rate_qreg)

# Visualization for Quantile Regression with some tau values: 
intercept_slope <- quantile_reg2 %>% 
  coef() %>% 
  t() %>% 
  data.frame() %>% 
  rename(intercept = X.Intercept., slope = HOURLY_RATE) %>% 
  mutate(quantile = row.names(.))


ggplot() + 
  geom_point(data = uo_rate_qreg, aes(HOURLY_RATE, TIME_INTERVAL), 
             alpha = 0.5) + 
  geom_abline(data = intercept_slope, aes(intercept = intercept, slope = slope, color = quantile)) + 
  theme_minimal() + 
  labs(x = "HOURLY_RATE", y = "TIME_INTERVAL", 
       title = "Quantile regression at tau values taken from the ECDF of TIME_INTERVAL", 
       caption = "Data source: MIMIC-IV, Foley records") +
  coord_cartesian(xlim = c(0, 1000), ylim = c(0, 300))

 

Collection periods for UO rate 20ml/hr or below

uo_rate %>% 
  filter(HOURLY_RATE <= 20) %>%
  group_by(SOURCE) %>%
   dplyr::summarise(N = n(),
                   Mean = round(mean(TIME_INTERVAL),0),
                   SD = round(sd(TIME_INTERVAL),0),
                   '5th' = round(quantile(TIME_INTERVAL, 0.05),0),
                   '10th' = round(quantile(TIME_INTERVAL, 0.1),0),
                   '25th' = round(quantile(TIME_INTERVAL, 0.25),0),
                   '50th' = round(quantile(TIME_INTERVAL, 0.50),0),
                   '75th' = round(quantile(TIME_INTERVAL, 0.75),0),
                   '95th' = round(quantile(TIME_INTERVAL, 0.95),0),
                   Min = round(min(TIME_INTERVAL),0),
                   Max = round(max(TIME_INTERVAL),0)
  ) %>% 
  arrange(desc(N)) %>% 
  gt() %>%
  fmt_number(use_seps = TRUE, decimals = 0)
SOURCE N Mean SD 5th 10th 25th 50th 75th 95th Min Max
Foley 316,355 120 255 60 60 60 60 120 300 1 43,235
Void 16,971 635 1,247 60 120 226 381 700 1,748 1 51,846
Condom Cath 4,191 492 736 60 60 125 240 536 1,620 1 8,904
Straight Cath 1,380 2,761 3,343 240 390 900 1,750 3,191 7,964 13 35,375
Suprapubic 1,355 216 291 60 60 60 120 240 695 1 6,616
R Nephrostomy 1,055 321 377 60 60 120 240 361 850 13 7,740
L Nephrostomy 939 370 489 60 120 120 240 420 923 34 7,560
Ileoconduit 759 280 537 60 60 60 120 240 948 15 8,040
uo_rate %>% 
  filter(HOURLY_RATE <= 20) %>%
ggplot(aes(x = TIME_INTERVAL / 60)) +
  geom_histogram(binwidth = 1) +
  facet_wrap(~factor(SOURCE, levels=c('Foley', 'Suprapubic', 'Ileoconduit',
                         'Void', 'Condom Cath', 'Straight Cath',
                         'R Nephrostomy', 'L Nephrostomy')), scales = "free") +  xlim(-1, 20) +
  labs(
          title = "Collection periods for UO rate 20ml/hr or below",
          subtitle = "by source",
          x = "Time interval (hr)",
          y = "Frequency"
        ) +
        theme(
          plot.title = element_text(color = "#0099F8", size = 16, face = "bold"),
          plot.subtitle = element_text(size = 10, face = "bold"),
          plot.caption = element_text(face = "italic")
        )

 

Mean Rate

Mean UO rate weighted by time and grouped by source:

S8_c <- uo_rate %>% 
  group_by(SOURCE) %>%
  summarise(weighted_mean_rate = weighted.mean(HOURLY_RATE, TIME_INTERVAL)) %>%
  gt() %>%
  fmt_number(use_seps = TRUE, decimals = 1) %>%
  cols_label(
    SOURCE = "Source",
    weighted_mean_rate = "Weighted mean rate (ml/hr)"
  )

S8_c
Source Weighted mean rate (ml/hr)
Condom Cath 66.3
Foley 85.1
Ileoconduit 68.8
L Nephrostomy 39.9
R Nephrostomy 42.1
Straight Cath 40.8
Suprapubic 66.0
Void 77.3
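Weighting each rate by its collection period makes the result equal to total volume divided by total time, so long collections are not under-represented. The sketch below demonstrates this on three records from the single-patient example, pooled across sources for illustration only, and assuming HOURLY_RATE is the volume divided by the collection period in hours.

```r
# Three records from the single-patient example (volume in ml, period in minutes)
toy <- data.frame(VALUE = c(20, 45, 100), TIME_INTERVAL = c(60, 134, 660))
toy$HOURLY_RATE <- toy$VALUE / (toy$TIME_INTERVAL / 60)  # assumed definition

# Time-weighted mean rate, as in the summarise() call above
weighted_rate <- weighted.mean(toy$HOURLY_RATE, toy$TIME_INTERVAL)

# Total volume over total time in hours gives the same number
pooled_rate <- sum(toy$VALUE) / (sum(toy$TIME_INTERVAL) / 60)

# The unweighted mean treats the 60-minute and 660-minute records equally
unweighted_rate <- mean(toy$HOURLY_RATE)

c(weighted = weighted_rate, pooled = pooled_rate, unweighted = unweighted_rate)
```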

 

Hourly-adjusted UO

S9_a <- hourly_uo %>% drop_na(HOURLY_WEIGHTED_MEAN_RATE) %>%
    dplyr::summarise(N = n(),
                   Mean = round(mean(HOURLY_WEIGHTED_MEAN_RATE),0),
                   SD = round(sd(HOURLY_WEIGHTED_MEAN_RATE),0),
                   '5th' = round(quantile(HOURLY_WEIGHTED_MEAN_RATE, 0.05),0),
                   '10th' = round(quantile(HOURLY_WEIGHTED_MEAN_RATE, 0.1),0),
                   '25th' = round(quantile(HOURLY_WEIGHTED_MEAN_RATE, 0.25),0),
                   '50th' = round(quantile(HOURLY_WEIGHTED_MEAN_RATE, 0.50),0),
                   '75th' = round(quantile(HOURLY_WEIGHTED_MEAN_RATE, 0.75),0),
                   '95th' = round(quantile(HOURLY_WEIGHTED_MEAN_RATE, 0.95),0),
                   Min = round(min(HOURLY_WEIGHTED_MEAN_RATE),0),
                   Max = round(max(HOURLY_WEIGHTED_MEAN_RATE),0)
  ) %>% 
  arrange(desc(N)) %>% 
  gt() %>%
  fmt_number(use_seps = TRUE, decimals = 0)

S9_a
N Mean SD 5th 10th 25th 50th 75th 95th Min Max
5,211,377 82 92 3 10 30 55 100 250 0 4,098
S9_b <- hourly_uo %>% drop_na(HOURLY_WEIGHTED_MEAN_RATE) %>%
ggplot(aes(x = HOURLY_WEIGHTED_MEAN_RATE)) +
  geom_histogram(binwidth = 25) +
  xlim(-10, 500) + 
  labs(
        # title = "Hourly-Adjusted UO",
        x = "Hourly UO (ml)",
        y = "Frequency"
      )

S9_b

 

Simple Sum Comparison

Proportion of hours in which the hourly-adjusted UO and the simple sum differ by less than a given cut-off (for example, 100 ml):

adj_uo_diff <- hourly_uo %>%
  select(HOURLY_WEIGHTED_MEAN_RATE, SIMPLE_SUM) %>%
  filter(!is.na(HOURLY_WEIGHTED_MEAN_RATE)) %>%
  mutate(hourly_diff = abs(HOURLY_WEIGHTED_MEAN_RATE - SIMPLE_SUM)) %>%
  mutate(
    cutoff_10 = if_else(hourly_diff < 10, 1, 0),
    cutoff_50 = if_else(hourly_diff < 50, 1, 0),
    cutoff_100 = if_else(hourly_diff < 100, 1, 0),
    cutoff_150 = if_else(hourly_diff < 150, 1, 0),
    cutoff_200 = if_else(hourly_diff < 200, 1, 0)
  )

my_order <- c("<10", "<50", "<100", "<150", "<200")

S11 <- adj_uo_diff %>%
  select(cutoff_10,
         cutoff_50,
         cutoff_100,
         cutoff_150,
         cutoff_200) %>%
  pivot_longer(cols = contains("cutoff")) %>%
  transmute(name = case_when(
    name == "cutoff_10" ~ "<10",
    name == "cutoff_50" ~ "<50",
    name == "cutoff_100" ~ "<100",
    name == "cutoff_150" ~ "<150",
    name == "cutoff_200" ~ "<200"
  ),
  value) %>%
  group_by(name) %>%
  summarise(agreement = paste0(round(mean(value) * 100, 1), "%"),
            non_agreement = paste0(round((1 - mean(
              value
            )) * 100, 1), "%")) %>%
  arrange(match(name, my_order)) %>%
  gt() %>%
  # tab_header(
  #   title = md("**Comparison of Hourly-Adjusted UO and Simple Summation**"),
  # ) %>%
  cols_label(
    name = "Cut-off (ml)",
    agreement = "Proportion  of Agreement",
    non_agreement = "Proportion  of Disagreement"
  ) %>%
  cols_align(
    align = "center"
  ) %>%
  tab_source_note(source_note = "The table demonstrates the significance of hourly adjustment for accuracy by presenting the variance between the adjusted values and the simple hourly summation. Cut-off values are based on the absolute difference between the hourly-adjusted UO and a simple hourly summation of UO. Measurements charted on the hour were included with the previous time interval. ")

adj_uo_diff <- hourly_uo %>%
  select(HOURLY_WEIGHTED_MEAN_RATE, SIMPLE_SUM) %>%
  filter(!is.na(HOURLY_WEIGHTED_MEAN_RATE)) %>%
  mutate(no_diff = 
           ifelse((is.na(HOURLY_WEIGHTED_MEAN_RATE) &
                  is.na(SIMPLE_SUM)) |
             (!is.na(HOURLY_WEIGHTED_MEAN_RATE) &
                  !is.na(SIMPLE_SUM) &
                    abs(HOURLY_WEIGHTED_MEAN_RATE-SIMPLE_SUM) < 100), 
                  1, 
                  0),
         .keep = "none")

S11
Cut-off (ml) Proportion of Agreement Proportion of Disagreement
<10 45.4% 54.6%
<50 66.6% 33.4%
<100 84.2% 15.8%
<150 91.7% 8.3%
<200 95.2% 4.8%
The table demonstrates the significance of hourly adjustment for accuracy by presenting the variance between the adjusted values and the simple hourly summation. Cut-off values are based on the absolute difference between the hourly-adjusted UO and a simple hourly summation of UO. Measurements charted on the hour were included with the previous time interval.
mean(adj_uo_diff$no_diff)
## [1] 0.842083
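The agreement proportions can be illustrated on a few hours. The sketch below uses the first eight rows of the single-patient hourly table (values as displayed, that is, already rounded) and computes the share of hours whose absolute difference falls below each cut-off.

```r
# First eight hours of the single-patient example, as displayed above
hourly <- data.frame(
  HOURLY_WEIGHTED_MEAN_RATE = c(20, 20, 25, 55, 55, 23, 23, 52),
  SIMPLE_SUM                = c(20, 20, 170, 35, 80, 0, 23, 75)
)

# Absolute difference between the two approaches for each hour
abs_diff <- abs(hourly$HOURLY_WEIGHTED_MEAN_RATE - hourly$SIMPLE_SUM)

# Share of hours below each cut-off (ml)
cutoffs <- c(10, 50, 100, 150, 200)
agreement <- sapply(cutoffs, function(k) mean(abs_diff < k))
names(agreement) <- paste0("<", cutoffs)
agreement
```

In this small sample only hour 3, where a 150 ml nephrostomy volume is assigned to a single hour by the simple sum, differs by more than 100 ml.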

 

Hourly UO Per Kilogram

S9_c <- uo_ml_kg_hr %>%
  filter(WEIGHT_ADMIT <= 300,
         WEIGHT_ADMIT >= 25) %>%
  dplyr::summarise(N = n(),
                   Mean = round(mean(ML_KG_HR),2),
                   SD = round(sd(ML_KG_HR),2),
                   '5th' = round(quantile(ML_KG_HR, 0.05),2),
                   '10th' = round(quantile(ML_KG_HR, 0.1),2),
                   '25th' = round(quantile(ML_KG_HR, 0.25),2),
                   '50th' = round(quantile(ML_KG_HR, 0.50),2),
                   '75th' = round(quantile(ML_KG_HR, 0.75),2),
                   '95th' = round(quantile(ML_KG_HR, 0.95),2),
                   Min = round(min(ML_KG_HR),2),
                   Max = round(max(ML_KG_HR),2)
  ) %>% 
  gt() %>%
  fmt_number(use_seps = TRUE, decimals = 2)

S9_c
N Mean SD 5th 10th 25th 50th 75th 95th Min Max
5,119,874.00 1.05 1.21 0.04 0.13 0.37 0.70 1.30 3.20 0.00 94.86
mean_log <- log(mean(uo_ml_kg_hr$ML_KG_HR))
sd_log <- log(sd(uo_ml_kg_hr$ML_KG_HR))
S9_d <- ggplot() + 
  xlim(0, 2) + 
  geom_histogram(aes(x = ML_KG_HR
                     # , y =..density..
                     ), data=(uo_ml_kg_hr %>%
                                filter(WEIGHT_ADMIT <= 300,
                                       WEIGHT_ADMIT >= 25)), 
                 binwidth = 0.02) + 
  # stat_function(fun = dlnorm, args = list(meanlog = mean_log, sdlog = sd_log, log = FALSE), size=1, color='gray') +
  labs(
        # title = "Hourly-Adjusted UO per Kilogram",
        x = "Hourly volume to kg (ml/hr/kg)",
        y = "Frequency"
      )

S9_d
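The weight summary earlier in this report shows a minimum of 1 kg and a maximum of 5,864 kg, which is why the per-kilogram analysis is restricted to admission weights between 25 and 300 kg. The sketch below shows the effect of that filter on toy rows, assuming ML_KG_HR is the hourly-adjusted UO divided by admission weight (the construction of `uo_ml_kg_hr` is not shown in this section).

```r
# Toy rows: two plausible weights and two implausible ones
toy <- data.frame(
  HOURLY_WEIGHTED_MEAN_RATE = c(20, 55, 82, 30),  # ml/hr
  WEIGHT_ADMIT              = c(80, 65.6, 5864, 1)  # kg
)

# Assumed definition of the per-kilogram rate (ml/hr/kg)
toy$ML_KG_HR <- toy$HOURLY_WEIGHTED_MEAN_RATE / toy$WEIGHT_ADMIT

# Same plausibility filter as in the code above
plausible <- toy[toy$WEIGHT_ADMIT >= 25 & toy$WEIGHT_ADMIT <= 300, ]
plausible
```

Without the filter, the 1 kg row would contribute a rate of 30 ml/hr/kg and the 5,864 kg row a rate close to zero.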

# save(all_rows_count, 
#      distinct_time_item_patient_rows_count, 
#      S2_a,
#      S3a,
#      S4_a, S4_b, S4_c, S4_d, S4_e, S4_f,
#      S6_a, S6_b,
#      S7_a, S7_b, S7_c, S7_d, 
#      S8_a, S8_b, S8_c, S8_d, S8_e, S8_f,
#      S9_a, S9_b, S9_c, S9_d, 
#      file = "s_data.Rda")

KDIGO Criteria: Comparison of Average-UO, Consecutive-UO, and Old (official MIMIC repository derivation)

mimic_kdigo_inter_aki_table <- akis_all_long %>%
  drop_na(prevalnce_admit) %>%
  transmute(
    group = case_when(
      group == "newcons" ~ 'UO-Consecutive',
      group == "newmean" ~ 'UO-Average',
      group == "old" ~ 'Block summation'
    ),
    aki_binary = if_else(max_stage > 0, 1, 0),
    max_stage = if_else(max_stage == 0, NA, max_stage),
    prevalnce_admit
  ) %>%
  tbl_summary(
    by = "group",
    missing = "no",
    digits = everything() ~ c(0, 1),
    label = list(
      aki_binary ~ "Oliguric-AKI on the first days",
      prevalnce_admit ~ "Prevalence at admission",
      max_stage ~ "Maximum KDIGO staging"
    )
  )  %>%
  modify_column_indent(columns = label, rows = c(FALSE, TRUE)) %>%
  modify_column_indent(
    columns = label,
    rows = c(FALSE, FALSE, TRUE, TRUE, TRUE),
    double_indent = TRUE
  ) %>%
  add_p()

mimic_kdigo_inter_aki_table
Characteristic Block summation (N = 46,115)¹ UO-Average (N = 46,344)¹ UO-Consecutive (N = 46,344)¹ p-value²
Oliguric-AKI on the first days 27,188 (59.0%) 29,385 (63.4%) 22,372 (48.3%) <0.001
    Maximum KDIGO staging <0.001
        1 8,248 (30.3%) 9,204 (31.3%) 11,262 (50.3%)
        2 14,830 (54.5%) 15,255 (51.9%) 8,991 (40.2%)
        3 4,110 (15.1%) 4,926 (16.8%) 2,119 (9.47%)
Prevalence at admission 7,321 (15.9%) 10,511 (22.7%) 6,388 (13.8%) <0.001
1 n (%)
2 Pearson’s Chi-squared test
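The p-value in the first row can be checked from the printed counts alone. The sketch below assumes the reported Pearson's chi-squared test corresponds to base R's `chisq.test()` applied to the 2 × 3 table of AKI versus no AKI by method; it is a check on the displayed numbers, not the internal `add_p()` call.

```r
# Counts copied from the table above
aki <- c(27188, 29385, 22372)  # oliguric AKI on the first days
n   <- c(46115, 46344, 46344)  # ICU stays per method

# 2 x 3 contingency table: AKI / no AKI by method
counts <- rbind(AKI = aki, `No AKI` = n - aki)
colnames(counts) <- c("Block summation", "UO-Average", "UO-Consecutive")

# Pearson's chi-squared test of independence
test <- chisq.test(counts, correct = FALSE)
test$p.value < 0.001
```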
mimic_kdigo_inter_cons_mean <- akis_all_long %>%
  transmute(STAY_ID,
            group,
            first_stage,
            max_stage,
            aki_above_2 = max_stage > 1) %>%
  filter(aki_above_2 == TRUE,
         (group != "old")) %>%
  mutate(group = case_when(
    group == "newcons" ~ 'UO-Consecutive',
    group == "newmean" ~ 'UO-Average',
    group == "old" ~ 'Block summation'
  )) %>%
  left_join(table_1, by = "STAY_ID") %>%
  select(
    group,
    admission_age,
    weight_admit,
    gender,
    race,
    charlson_comorbidity_index,
    ckd,
    dm,
    sofa_first_day,
    sapsii,
    apsiii,
    creat_first,
    creat_peak_72,
    creat_last,
    kdigo_cr_max,
  # scr_baseline,
    hospital_days,
    icu_days,
    rrt_binary,
    hospital_expire_flag
  ) %>%
  tbl_summary(
    by = group,
    type = list(
      c(hospital_expire_flag,
        ckd,
        dm,
        rrt_binary) ~ "dichotomous",
      c(admission_age,
        weight_admit,
        creat_first,
        creat_peak_72,
        creat_last) ~ "continuous"
    ),
    statistic = c(admission_age,
                  weight_admit,
                  creat_first,
                  creat_peak_72,
                  creat_last) ~ "{mean} ({sd})",
    missing = "no",
    label = list(
      admission_age ~ "Age at Hospital Admission, years",
      gender ~ "Gender",
      weight_admit ~ "Weight at ICU Admission, kg",
      charlson_comorbidity_index ~ "CCI Score",
      sofa_first_day ~ "SOFA Score at ICU Admission",
      ckd ~ "CKD, Stage 1-4",
      apsiii ~ "APS-III Score at ICU Admission",
      creat_first ~ "First Creatinine in ICU, mg/dL",
      creat_peak_72 ~ "Peak Creatinine at first days, mg/dL",
      creat_last ~ "ICU Discharge Creatinine, mg/dL",
      kdigo_cr_max ~ "Peak KDIGO-Cr at first days",
      race ~ "Ethnicity",
      icu_days ~ "Time in ICU, days",
      hospital_days ~ "Time in hospital, days",
      rrt_binary ~ "Renal replacement therapy",
      hospital_expire_flag ~ "Hospital Mortality",
      dm ~ "Diabetes Mellitus",
      sapsii ~ "SAPS-II at ICU Admission"
    )
  ) %>%
  add_p() %>%
  add_stat(
    fns = everything() ~ add_by_n
  ) %>%
  modify_header(starts_with("add_n_stat") ~ "**N**") %>%
  modify_table_body(
    ~ .x %>%
      dplyr::relocate(add_n_stat_1, .before = stat_1) %>%
      dplyr::relocate(add_n_stat_2, .before = stat_2)
  )

mimic_kdigo_inter_mean_old <- akis_all_long %>%
  transmute(STAY_ID,
            group,
            first_stage,
            max_stage,
            aki_above_2 = max_stage > 1) %>%
  filter(aki_above_2 == TRUE,
         (group != "newcons")) %>%
  mutate(group = case_when(
    group == "newcons" ~ 'UO-Consecutive',
    group == "newmean" ~ 'UO-Average',
    group == "old" ~ 'Block summation'
  )) %>%
  left_join(table_1, by = "STAY_ID") %>%
  select(
    group,
    admission_age,
    weight_admit,
    gender,
    race,
    charlson_comorbidity_index,
    ckd,
    dm,
    sofa_first_day,
    sapsii,
    apsiii,
    creat_first,
    creat_peak_72,
    creat_last,
    kdigo_cr_max,
  # scr_baseline,
    hospital_days,
    icu_days,
    rrt_binary,
    hospital_expire_flag
  ) %>%
  tbl_summary(
    by = group,
    type = list(
      c(hospital_expire_flag,
        ckd,
        dm,
        rrt_binary) ~ "dichotomous",
      c(admission_age,
        weight_admit,
        creat_first,
        creat_peak_72,
        creat_last) ~ "continuous"
    ),
    statistic = c(admission_age,
                  weight_admit,
                  creat_first,
                  creat_peak_72,
                  creat_last) ~ "{mean} ({sd})",
    missing = "no",
    label = list(
      admission_age ~ "Age at Hospital Admission, years",
      gender ~ "Gender",
      weight_admit ~ "Weight at ICU Admission, kg",
      charlson_comorbidity_index ~ "CCI Score",
      sofa_first_day ~ "SOFA Score at ICU Admission",
      ckd ~ "CKD, Stage 1-4",
      apsiii ~ "APS-III Score at ICU Admission",
      creat_first ~ "First Creatinine in ICU, mg/dL",
      creat_peak_72 ~ "Peak Creatinine at first days, mg/dL",
      creat_last ~ "ICU Discharge Creatinine, mg/dL",
      kdigo_cr_max ~ "Peak KDIGO-Cr at first days",
      race ~ "Ethnicity",
      icu_days ~ "Time in ICU, days",
      hospital_days ~ "Time in hospital, days",
      rrt_binary ~ "Renal replacement therapy",
      hospital_expire_flag ~ "Hospital Mortality",
      dm ~ "Diabetes Mellitus",
      sapsii ~ "SAPS-II at ICU Admission"
    )
  ) %>%
  add_p() %>%
  add_stat(
    fns = everything() ~ add_by_n
  ) %>%
  modify_header(starts_with("add_n_stat") ~ "**N**") %>%
  modify_table_body(
    ~ .x %>%
      dplyr::relocate(add_n_stat_1, .before = stat_1) %>%
      dplyr::relocate(add_n_stat_2, .before = stat_2)
  )

mimic_kdigo_inter_cons_old <- akis_all_long %>%
  transmute(STAY_ID,
            group,
            first_stage,
            max_stage,
            aki_above_2 = max_stage > 1) %>%
  filter(aki_above_2 == TRUE,
         (group != "newmean")) %>%
  mutate(group = case_when(
    group == "newcons" ~ 'UO-Consecutive',
    group == "newmean" ~ 'UO-Average',
    group == "old" ~ 'Block summation'
  )) %>%
  left_join(table_1, by = "STAY_ID") %>%
  select(
    group,
    admission_age,
    weight_admit,
    gender,
    race,
    charlson_comorbidity_index,
    ckd,
    dm,
    sofa_first_day,
    sapsii,
    apsiii,
    creat_first,
    creat_peak_72,
    creat_last,
    kdigo_cr_max,
  # scr_baseline,
    hospital_days,
    icu_days,
    rrt_binary,
    hospital_expire_flag
  ) %>%
  tbl_summary(
    by = group,
    type = list(
      c(hospital_expire_flag,
        ckd,
        dm,
        rrt_binary) ~ "dichotomous",
      c(admission_age,
        weight_admit,
        creat_first,
        creat_peak_72,
        creat_last) ~ "continuous"
    ),
    statistic = c(admission_age,
                  weight_admit,
                  creat_first,
                  creat_peak_72,
                  creat_last) ~ "{mean} ({sd})",
    missing = "no",
    label = list(
      admission_age ~ "Age at Hospital Admission, years",
      gender ~ "Gender",
      weight_admit ~ "Weight at ICU Admission, kg",
      charlson_comorbidity_index ~ "CCI Score",
      sofa_first_day ~ "SOFA Score at ICU Admission",
      ckd ~ "CKD, Stage 1-4",
      apsiii ~ "APS-III Score at ICU Admission",
      creat_first ~ "First Creatinine in ICU, mg/dL",
      creat_peak_72 ~ "Peak Creatinine at first days, mg/dL",
      creat_last ~ "ICU Discharge Creatinine, mg/dL",
      kdigo_cr_max ~ "Peak KDIGO-Cr at first days",
      race ~ "Ethnicity",
      icu_days ~ "Time in ICU, days",
      hospital_days ~ "Time in hospital, days",
      rrt_binary ~ "Renal replacement therapy",
      hospital_expire_flag ~ "Hospital Mortality",
      dm ~ "Diabetes Mellitus",
      sapsii ~ "SAPS-II at ICU Admission"
    )
  ) %>%
  add_p() %>%
  add_stat(
    fns = everything() ~ add_by_n
  ) %>%
  modify_header(starts_with("add_n_stat") ~ "**N**") %>%
  modify_table_body(
    ~ .x %>%
      dplyr::relocate(add_n_stat_1, .before = stat_1) %>%
      dplyr::relocate(add_n_stat_2, .before = stat_2)
  )

mimic_kdigo_inter_cons_mean
| Characteristic | N | UO-Average, N = 20,181¹ | N | UO-Consecutive, N = 11,110¹ | p-value² |
|---|---|---|---|---|---|
| Age at Hospital Admission, years | 20,181 | 68 (16) | 11,110 | 67 (16) | 0.4 |
| Weight at ICU Admission, kg | 20,181 | 87 (25) | 11,110 | 88 (27) | <0.001 |
| Gender | 20,181 | | 11,110 | | 0.028 |
|     F | | 8,731 (43%) | | 4,950 (45%) | |
|     M | | 11,450 (57%) | | 6,160 (55%) | |
| Ethnicity | 17,294 | | 9,526 | | 0.009 |
|     African American | | 1,734 (10%) | | 1,084 (11%) | |
|     Asian | | 373 (2.2%) | | 205 (2.2%) | |
|     Caucasian | | 13,878 (80%) | | 7,496 (79%) | |
|     Hispanic | | 524 (3.0%) | | 310 (3.3%) | |
|     Other | | 785 (4.5%) | | 431 (4.5%) | |
| CCI Score | 20,181 | 5 (3, 7) | 11,110 | 5 (3, 7) | <0.001 |
| CKD, Stage 1-4 | 20,175 | 4,100 (20%) | 11,106 | 2,678 (24%) | <0.001 |
| Diabetes Mellitus | 20,175 | 4,950 (25%) | 11,106 | 2,709 (24%) | 0.8 |
| SOFA Score at ICU Admission | 20,181 | 4 (2, 7) | 11,110 | 5 (2, 8) | <0.001 |
| SAPS-II at ICU Admission | 20,075 | 37 (28, 46) | 11,040 | 39 (29, 50) | <0.001 |
| APS-III Score at ICU Admission | 20,181 | 42 (32, 58) | 11,110 | 46 (34, 63) | <0.001 |
| First Creatinine in ICU, mg/dL | 20,128 | 1.46 (1.57) | 11,073 | 1.70 (1.92) | <0.001 |
| Peak Creatinine at first days, mg/dL | 20,114 | 1.75 (1.80) | 11,064 | 2.10 (2.19) | <0.001 |
| ICU Discharge Creatinine, mg/dL | 20,128 | 1.42 (1.45) | 11,073 | 1.69 (1.74) | <0.001 |
| Peak KDIGO-Cr at first days | 20,027 | | 11,014 | | <0.001 |
|     0 | | 12,966 (65%) | | 6,357 (58%) | |
|     1 | | 4,794 (24%) | | 2,886 (26%) | |
|     2 | | 1,126 (5.6%) | | 773 (7.0%) | |
|     3 | | 1,141 (5.7%) | | 998 (9.1%) | |
| Time in hospital, days | 20,181 | 8 (5, 13) | 11,110 | 8 (5, 14) | 0.057 |
| Time in ICU, days | 20,181 | 2.9 (1.8, 5.1) | 11,110 | 3.1 (1.9, 5.7) | <0.001 |
| Renal replacement therapy | 20,181 | 1,771 (8.8%) | 11,110 | 1,574 (14%) | <0.001 |
| Hospital Mortality | 20,181 | 2,912 (14%) | 11,110 | 2,111 (19%) | <0.001 |

¹ Mean (SD); n (%); Median (Q1, Q3)
² Wilcoxon rank sum test; Pearson’s Chi-squared test

mimic_kdigo_inter_cons_old
| Characteristic | N | Block summation, N = 18,940¹ | N | UO-Consecutive, N = 11,110¹ | p-value² |
|---|---|---|---|---|---|
| Age at Hospital Admission, years | 18,940 | 68 (16) | 11,110 | 67 (16) | 0.002 |
| Weight at ICU Admission, kg | 18,940 | 88 (25) | 11,110 | 88 (27) | 0.2 |
| Gender | 18,940 | | 11,110 | | 0.005 |
|     F | | 8,125 (43%) | | 4,950 (45%) | |
|     M | | 10,815 (57%) | | 6,160 (55%) | |
| Ethnicity | 16,197 | | 9,526 | | <0.001 |
|     African American | | 1,580 (9.8%) | | 1,084 (11%) | |
|     Asian | | 333 (2.1%) | | 205 (2.2%) | |
|     Caucasian | | 13,081 (81%) | | 7,496 (79%) | |
|     Hispanic | | 485 (3.0%) | | 310 (3.3%) | |
|     Other | | 718 (4.4%) | | 431 (4.5%) | |
| CCI Score | 18,940 | 5 (3, 7) | 11,110 | 5 (3, 7) | <0.001 |
| CKD, Stage 1-4 | 18,934 | 3,910 (21%) | 11,106 | 2,678 (24%) | <0.001 |
| Diabetes Mellitus | 18,934 | 4,736 (25%) | 11,106 | 2,709 (24%) | 0.2 |
| SOFA Score at ICU Admission | 18,940 | 5 (2, 7) | 11,110 | 5 (2, 8) | 0.021 |
| SAPS-II at ICU Admission | 18,865 | 37 (29, 47) | 11,040 | 39 (29, 50) | <0.001 |
| APS-III Score at ICU Admission | 18,940 | 43 (32, 59) | 11,110 | 46 (34, 63) | <0.001 |
| First Creatinine in ICU, mg/dL | 18,897 | 1.46 (1.51) | 11,073 | 1.70 (1.92) | <0.001 |
| Peak Creatinine at first days, mg/dL | 18,889 | 1.76 (1.75) | 11,064 | 2.10 (2.19) | <0.001 |
| ICU Discharge Creatinine, mg/dL | 18,897 | 1.44 (1.44) | 11,073 | 1.69 (1.74) | <0.001 |
| Peak KDIGO-Cr at first days | 18,811 | | 11,014 | | <0.001 |
|     0 | | 11,860 (63%) | | 6,357 (58%) | |
|     1 | | 4,713 (25%) | | 2,886 (26%) | |
|     2 | | 1,132 (6.0%) | | 773 (7.0%) | |
|     3 | | 1,106 (5.9%) | | 998 (9.1%) | |
| Time in hospital, days | 18,940 | 8 (5, 14) | 11,110 | 8 (5, 14) | 0.3 |
| Time in ICU, days | 18,940 | 2.9 (1.8, 5.2) | 11,110 | 3.1 (1.9, 5.7) | <0.001 |
| Renal replacement therapy | 18,940 | 1,655 (8.7%) | 11,110 | 1,574 (14%) | <0.001 |
| Hospital Mortality | 18,940 | 2,859 (15%) | 11,110 | 2,111 (19%) | <0.001 |

¹ Mean (SD); n (%); Median (Q1, Q3)
² Wilcoxon rank sum test; Pearson’s Chi-squared test

mimic_kdigo_inter_mean_old
| Characteristic | N | Block summation, N = 18,940¹ | N | UO-Average, N = 20,181¹ | p-value² |
|---|---|---|---|---|---|
| Age at Hospital Admission, years | 18,940 | 68 (16) | 20,181 | 68 (16) | 0.006 |
| Weight at ICU Admission, kg | 18,940 | 88 (25) | 20,181 | 87 (25) | 0.011 |
| Gender | 18,940 | | 20,181 | | 0.5 |
|     F | | 8,125 (43%) | | 8,731 (43%) | |
|     M | | 10,815 (57%) | | 11,450 (57%) | |
| Ethnicity | 16,197 | | 17,294 | | 0.8 |
|     African American | | 1,580 (9.8%) | | 1,734 (10%) | |
|     Asian | | 333 (2.1%) | | 373 (2.2%) | |
|     Caucasian | | 13,081 (81%) | | 13,878 (80%) | |
|     Hispanic | | 485 (3.0%) | | 524 (3.0%) | |
|     Other | | 718 (4.4%) | | 785 (4.5%) | |
| CCI Score | 18,940 | 5 (3, 7) | 20,181 | 5 (3, 7) | 0.11 |
| CKD, Stage 1-4 | 18,934 | 3,910 (21%) | 20,175 | 4,100 (20%) | 0.4 |
| Diabetes Mellitus | 18,934 | 4,736 (25%) | 20,175 | 4,950 (25%) | 0.3 |
| SOFA Score at ICU Admission | 18,940 | 5 (2, 7) | 20,181 | 4 (2, 7) | <0.001 |
| SAPS-II at ICU Admission | 18,865 | 37 (29, 47) | 20,075 | 37 (28, 46) | <0.001 |
| APS-III Score at ICU Admission | 18,940 | 43 (32, 59) | 20,181 | 42 (32, 58) | <0.001 |
| First Creatinine in ICU, mg/dL | 18,897 | 1.46 (1.51) | 20,128 | 1.46 (1.57) | 0.034 |
| Peak Creatinine at first days, mg/dL | 18,889 | 1.76 (1.75) | 20,114 | 1.75 (1.80) | <0.001 |
| ICU Discharge Creatinine, mg/dL | 18,897 | 1.44 (1.44) | 20,128 | 1.42 (1.45) | 0.009 |
| Peak KDIGO-Cr at first days | 18,811 | | 20,027 | | 0.006 |
|     0 | | 11,860 (63%) | | 12,966 (65%) | |
|     1 | | 4,713 (25%) | | 4,794 (24%) | |
|     2 | | 1,132 (6.0%) | | 1,126 (5.6%) | |
|     3 | | 1,106 (5.9%) | | 1,141 (5.7%) | |
| Time in hospital, days | 18,940 | 8 (5, 14) | 20,181 | 8 (5, 13) | <0.001 |
| Time in ICU, days | 18,940 | 2.9 (1.8, 5.2) | 20,181 | 2.9 (1.8, 5.1) | 0.13 |
| Renal replacement therapy | 18,940 | 1,655 (8.7%) | 20,181 | 1,771 (8.8%) | 0.9 |
| Hospital Mortality | 18,940 | 2,859 (15%) | 20,181 | 2,912 (14%) | 0.064 |

¹ Mean (SD); n (%); Median (Q1, Q3)
² Wilcoxon rank sum test; Pearson’s Chi-squared test

akis_all_long %>%
  group_by(group, max_stage) %>%
  summarise(Propo = sum(mortality_7, na.rm = T) / n()) %>%
  ggplot(aes(max_stage, Propo, fill = group)) + geom_col(position = 'dodge')

mimic_cdplod_old <- cdplot(
  as.factor(mortality_7) ~ max_stage,
  (akis_all_long %>%
     filter(group == "old")),
  col = c("lightgoldenrod", "lightcyan"),
  ylab = "7 days mortality",
  xlab = "Max KDIGO-UO stage",
  main = "CD Plot for Block Summation"
)

mimic_cdplod_old
## $`1`
## function (v) 
## .approxfun(x, y, v, method, yleft, yright, f, na.rm)
## <bytecode: 0x1558ca4e0>
## <environment: 0x1558cdac0>
mimic_cdplod_mean <- cdplot(
  as.factor(mortality_7) ~ max_stage,
  (akis_all_long %>%
     filter(group == "newmean")),
  col = c("lightgoldenrod", "lightcyan"),
  ylab = "7 days mortality",
  xlab = "Max KDIGO-UO stage",
  main = "CD Plot for UOmean"
)

mimic_cdplod_mean
## $`1`
## function (v) 
## .approxfun(x, y, v, method, yleft, yright, f, na.rm)
## <bytecode: 0x1558ca4e0>
## <environment: 0x1698d4b30>
mimic_cdplod_cons <- cdplot(
  as.factor(mortality_7) ~ max_stage,
  (akis_all_long %>%
     filter(group == "newcons")),
  col = c("lightgoldenrod", "lightcyan"),
  ylab = "7 days mortality",
  xlab = "Max KDIGO-UO stage",
  main = "CD Plot for UOcons"
)

mimic_cdplod_cons
## $`1`
## function (v) 
## .approxfun(x, y, v, method, yleft, yright, f, na.rm)
## <bytecode: 0x1558ca4e0>
## <environment: 0x333346fa8>
model_block_summation <- glm(
  mortality_7 ~ MAX_STAGE_OLD + FIRST_STAGE_OLD,
  data = akis_all_wide,
  family = binomial
)

model_mean <- glm(
  mortality_7 ~ MAX_STAGE_NEW_MEAN + FIRST_STAGE_NEW_MEAN,
  data = akis_all_wide,
  family = binomial
)

model_cons <- glm(
  mortality_7 ~ MAX_STAGE_NEW_CONS + FIRST_STAGE_NEW_CONS,
  data = akis_all_wide,
  family = binomial
)

# anova(model_old, model_new, test = 'Chisq')
mimic_kdigo_inter_bic <- BIC(model_block_summation, model_mean, model_cons)

mimic_kdigo_inter_bic <- cbind(Model = rownames(mimic_kdigo_inter_bic), mimic_kdigo_inter_bic) %>%
  gt()

mimic_kdigo_inter_bic
| Model | df | BIC |
|---|---|---|
| model_block_summation | 3 | 20948.37 |
| model_mean | 3 | 20837.52 |
| model_cons | 3 | 20802.08 |
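
As a reading aid for the BIC comparison above, the following minimal sketch recomputes BIC by hand for a logistic model and checks it against `BIC()`. The data frame `toy`, the predictors `x1` and `x2`, and the helper `bic_by_hand()` are hypothetical and simulated here; they are not part of the study data.

```r
# Minimal sketch on simulated data (all names here are illustrative assumptions).
set.seed(7)
n <- 1000
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
toy$y <- rbinom(n, 1, plogis(-1 + 0.8 * toy$x1))

# Two non-nested logistic models fitted to the same rows
fit_a <- glm(y ~ x1, family = binomial, data = toy)
fit_b <- glm(y ~ x2, family = binomial, data = toy)

# BIC = -2 * log-likelihood + (number of parameters) * log(number of observations)
bic_by_hand <- function(fit) {
  ll <- logLik(fit)
  -2 * as.numeric(ll) + attr(ll, "df") * log(nobs(fit))
}

c(by_hand = bic_by_hand(fit_a), builtin = BIC(fit_a))
BIC(fit_a, fit_b)  # lower BIC indicates the preferred model
```

Because the penalty term depends on the number of observations, BIC values are most directly comparable when the models are fitted to the same rows.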

Check for mortality overdispersion:

model_binom <- glm(
  mortality_30 ~ stage_newcons,
  family = binomial,
  data = (
    akis_all_long %>%
      filter(group == "newcons") %>%
      mutate(stage_newcons = as.factor(max_stage))
  )
)

model_overdispersed <- glm(
  mortality_30 ~ stage_newcons,
  family = quasibinomial,
  data = (
    akis_all_long %>%
      filter(group == "newcons") %>%
      mutate(stage_newcons = as.factor(max_stage))
  )
)

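# Pearson chi-squared statistic (dispersion * residual df) against its reference distribution
pchisq(summary(model_overdispersed)$dispersion * model_binom$df.residual, 
       model_binom$df.residual, lower.tail = FALSE)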
## [1] 0.4937964
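
To make the check above easier to follow, here is a small sketch on simulated data showing that the quasibinomial dispersion estimate is the Pearson chi-squared statistic divided by the residual degrees of freedom. The objects `toy`, `fit_binom`, and `fit_quasi` are hypothetical stand-ins, not the study models.

```r
# Minimal sketch on simulated data (names are illustrative assumptions).
set.seed(42)
n <- 5000
toy <- data.frame(stage = factor(sample(0:3, n, replace = TRUE)))
toy$died <- rbinom(n, 1, plogis(-2.5 + 0.5 * as.integer(as.character(toy$stage))))

fit_binom <- glm(died ~ stage, family = binomial, data = toy)
fit_quasi <- glm(died ~ stage, family = quasibinomial, data = toy)

# Pearson chi-squared statistic and the dispersion estimate derived from it
pearson_chisq <- sum(residuals(fit_binom, type = "pearson")^2)
dispersion <- pearson_chisq / fit_binom$df.residual

# Same value as the dispersion reported by the quasibinomial fit
c(by_hand = dispersion, quasibinomial = summary(fit_quasi)$dispersion)

# Upper-tail p-value, as in the check above
p_value <- pchisq(pearson_chisq, df = fit_binom$df.residual, lower.tail = FALSE)
p_value

# With ungrouped 0/1 outcomes and a single categorical predictor, the Pearson
# statistic equals the number of observations (up to fitting tolerance)
c(pearson_chisq = pearson_chisq, n = n)
```

One consequence worth keeping in mind: in this ungrouped setting the dispersion estimate is close to 1 by construction, so a p-value near 0.5 is expected and is only weak evidence against overdispersion.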
akis_all_long_complete <- akis_all_long %>% drop_na(max_stage)

# descriptive
descriptive_tbl <- akis_all_long_complete %>%
  group_by(group, max_stage) %>%
  summarise(
    n = n(),
    dead = sum(mortality_30),
    mortality_prop = sum(mortality_30) / n()
  ) %>%
  # drop_na() %>%
  group_by(group) %>%
  transmute(
    max_stage,
    "Patients, No. (%)" = paste0(n, " (", round((n / sum(
      n
    )), 2), ")"),
    "Mortality, No. (%)" = paste0(dead, " (", round(mortality_prop, 2), ")")
  ) %>%
  gt()

# glm
m1 <- glm(
  mortality_30 ~ stage_newcons,
  family = binomial,
  data = (
    akis_all_long_complete %>%
      filter(group == "newcons") %>%
      mutate(stage_newcons = as.factor(max_stage))
  )
)

m2 <- glm(
  mortality_30 ~ stage_newmean,
  family = binomial,
  data = (
    akis_all_long_complete %>%
      filter(group == "newmean") %>%
      mutate(stage_newmean = as.factor(max_stage))
  )
)

m3 <- glm(
  mortality_30 ~ stage_old,
  family = binomial,
  data = (
    akis_all_long_complete %>%
      filter(group == "old") %>%
      mutate(stage_old = as.factor(max_stage))
  )
)

glm_tbl <- tbl_stack(list(
  tbl_regression(m1, exponentiate = TRUE),
  tbl_regression(m2, exponentiate = TRUE),
  tbl_regression(m3, exponentiate = TRUE)
)) %>%
  as_gt()

# join tables
glm_tbl_data <- glm_tbl$`_data` %>%
  filter(!is.na(term)) %>%
  transmute(
    group = case_when(
      variable == "stage_newmean" ~ "newmean",
      variable == "stage_newcons" ~ "newcons",
      variable == "stage_old" ~ "old"
    ),
    max_stage = label,
    "OR (95% CI)" = if_else(label == "0", "1 [Reference]", paste0(round(estimate, 2), " (", ci, ")")),
    "P value" = case_when(
      is.na(p.value) ~ "NA",
      p.value < 0.001 ~ "<.001",
      .default = as.character(round(p.value, 2))
    )
  )

descriptive_tbl_data <- descriptive_tbl$`_data` %>%
  mutate(max_stage = as.character(max_stage))

rr_table_data <- left_join(descriptive_tbl_data, glm_tbl_data, by = c("group", "max_stage"))

mimic_kdigo_inter_survival_table <- rr_table_data %>%
  mutate(
    group = case_when(
      group == "newcons" ~ 'UO-Consecutive',
      group == "newmean" ~ 'UO-Average',
      group == "old" ~ 'Block summation'
    )
  ) %>%
  gt(
    rowname_col = "max_stage",
    groupname_col = "group",
    row_group_as_column = TRUE
  ) %>%
  tab_stubhead(label = "Criteria / Stage") %>%
  tab_spanner(label = "Unadjusted OR", columns = c("OR (95% CI)", "P value"))

mimic_kdigo_inter_survival_table
| Criteria | Stage | Patients, No. (%) | Mortality, No. (%) | Unadjusted OR (95% CI) | Unadjusted P value |
|---|---|---|---|---|---|
| UO-Consecutive | 0 | 23972 (0.52) | 1715 (0.07) | 1 [Reference] | NA |
| | 1 | 11262 (0.24) | 1336 (0.12) | 1.75 (1.62, 1.88) | <.001 |
| | 2 | 8991 (0.19) | 1826 (0.2) | 3.31 (3.08, 3.55) | <.001 |
| | 3 | 2119 (0.05) | 775 (0.37) | 7.48 (6.76, 8.28) | <.001 |
| UO-Average | 0 | 16959 (0.37) | 1044 (0.06) | 1 [Reference] | NA |
| | 1 | 9204 (0.2) | 924 (0.1) | 1.7 (1.55, 1.87) | <.001 |
| | 2 | 15255 (0.33) | 2111 (0.14) | 2.45 (2.27, 2.65) | <.001 |
| | 3 | 4926 (0.11) | 1573 (0.32) | 7.15 (6.56, 7.80) | <.001 |
| Block summation | 0 | 18927 (0.41) | 1231 (0.07) | 1 [Reference] | NA |
| | 1 | 8248 (0.18) | 825 (0.1) | 1.6 (1.46, 1.75) | <.001 |
| | 2 | 14830 (0.32) | 2143 (0.14) | 2.43 (2.26, 2.61) | <.001 |
| | 3 | 4110 (0.09) | 1435 (0.35) | 7.71 (7.07, 8.41) | <.001 |
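
Because each unadjusted model has a single categorical predictor, its exponentiated coefficients coincide with the sample odds ratios, which can be recomputed from the counts in the table. The sketch below does this for the UO-Consecutive rows; the vector names are illustrative.

```r
# Counts copied from the UO-Consecutive rows of the table above
n_stage    <- c(`0` = 23972, `1` = 11262, `2` = 8991, `3` = 2119)
dead_stage <- c(`0` = 1715,  `1` = 1336,  `2` = 1826, `3` = 775)

# Odds of 30-day death within each stage: deaths / survivors
odds <- dead_stage / (n_stage - dead_stage)

# Odds ratio of each stage against stage 0 (the reference level)
or_vs_stage0 <- odds[-1] / odds["0"]
round(or_vs_stage0, 2)
#>    1    2    3
#> 1.75 3.31 7.48
```

These values match the unadjusted OR column for UO-Consecutive.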

Adjusted models:

m1_adj1 <- glm(
  mortality_30 ~ stage_newcons * (
    admission_age + gender + weight_admit + first_stage
  ),
  family = binomial,
  data = (
    akis_all_long_complete %>%
      filter(group == "newcons") %>%
      mutate(stage_newcons = as.factor(max_stage))
  )
)

mrg_efct_m1_adj1 <- avg_comparisons(
  m1_adj1,
  variables = "stage_newcons",
  comparison = "lnoravg",
  transform = exp
)

rr_table_data_m1_adj1 <- mrg_efct_m1_adj1 %>%
    transmute(
      group = "newcons",
      max_stage = as.factor(dplyr::row_number()),
      estimate = paste0(round(estimate, 2), " (",round(conf.low, 2), "-", round(conf.high, 2), ")"),
      p.value
    ) %>% add_row(group = "newcons", max_stage = "0", .before = 0) %>%
    transmute(
    group,
    max_stage,
    or.adj1 = if_else(is.na(estimate), "1 [Reference]", estimate),
    p.adj1 = case_when(
      is.na(p.value) ~ "NA",
      p.value < 0.001 ~ "<.001",
      .default = as.character(round(p.value, 2))
    )
  )

m2_adj1 <- glm(
  mortality_30 ~ stage_newmean * (
    admission_age + weight_admit + gender + first_stage
  ),
  family = binomial,
  data = (
    akis_all_long_complete %>%
      filter(group == "newmean") %>%
      mutate(stage_newmean = as.factor(max_stage))
  )
)

mrg_efct_m2_adj1 <- avg_comparisons(
  m2_adj1,
  variables = "stage_newmean",
  comparison = "lnoravg",
  transform = exp
)

rr_table_data_m2_adj1 <- mrg_efct_m2_adj1 %>%
    transmute(
      group = "newmean",
      max_stage = as.factor(dplyr::row_number()),
      estimate = paste0(round(estimate, 2), " (",round(conf.low, 2), "-", round(conf.high, 2), ")"),
      p.value
    ) %>% add_row(group = "newmean", max_stage = "0", .before = 0) %>%
    transmute(
    group,
    max_stage,
    or.adj1 = if_else(is.na(estimate), "1 [Reference]", estimate),
    p.adj1 = case_when(
      is.na(p.value) ~ "NA",
      p.value < 0.001 ~ "<.001",
      .default = as.character(round(p.value, 2))
    )
  )

m3_adj1 <- glm(
  mortality_30 ~ stage_old * (
    admission_age + weight_admit + gender + first_stage
  ),
  family = binomial,
  data = (
    akis_all_long_complete %>%
      filter(group == "old") %>%
      mutate(stage_old = as.factor(max_stage))
  )
)

mrg_efct_m3_adj1 <- avg_comparisons(
  m3_adj1,
  variables = "stage_old",
  comparison = "lnoravg",
  transform = exp
)

rr_table_data_m3_adj1 <- mrg_efct_m3_adj1 %>%
    transmute(
      group = "old",
      max_stage = as.factor(dplyr::row_number()),
      estimate = paste0(round(estimate, 2), " (",round(conf.low, 2), "-", round(conf.high, 2), ")"),
      p.value
    ) %>% add_row(group = "old", max_stage = "0", .before = 0) %>%
    transmute(
    group,
    max_stage,
    or.adj1 = if_else(is.na(estimate), "1 [Reference]", estimate),
    p.adj1 = case_when(
      is.na(p.value) ~ "NA",
      p.value < 0.001 ~ "<.001",
      .default = as.character(round(p.value, 2))
    )
  )

mimic_kdigo_inter_survival_table_adj <- rr_table_data %>% 
  left_join(bind_rows(rr_table_data_m1_adj1, rr_table_data_m2_adj1, rr_table_data_m3_adj1)) %>%
  mutate(
    group = case_when(
      group == "newcons" ~ 'UO-Consecutive',
      group == "newmean" ~ 'UO-Average',
      group == "old" ~ 'Block summation'
    )
  ) %>%
  gt(
    rowname_col = "max_stage",
    groupname_col = "group",
    row_group_as_column = TRUE
  ) %>%
  cols_label(
    or.adj1 = "OR (95% CI)",
    p.adj1 = "P value",
  ) %>%
  tab_stubhead(label = "Criteria / Stage") %>%
  tab_spanner(label = "Unadjusted OR", columns = c("OR (95% CI)", "P value")) %>%
  tab_spanner(label = "Adjusted Model", columns = c(or.adj1, p.adj1), id = "adj1") %>%
  tab_footnote(
    footnote = "Model includes age, weight, gender, and whether AKI was diagnosed on admission",
    locations = cells_column_spanners(spanners = "adj1")
  ) %>% tab_source_note(source_note = md(
    "All covariates in the adjusted model were significant except for diagnosis at admission in the block summation model."
  ))

mimic_kdigo_inter_survival_table_adj

| Criteria | Stage | Patients, No. (%) | Mortality, No. (%) | Unadjusted OR (95% CI) | Unadjusted P value | Adjusted OR (95% CI)¹ | Adjusted P value¹ |
|---|---|---|---|---|---|---|---|
| UO-Consecutive | 0 | 23972 (0.52) | 1715 (0.07) | 1 [Reference] | NA | 1 [Reference] | NA |
| | 1 | 11262 (0.24) | 1336 (0.12) | 1.75 (1.62, 1.88) | <.001 | 1.58 (1.46-1.72) | <.001 |
| | 2 | 8991 (0.19) | 1826 (0.2) | 3.31 (3.08, 3.55) | <.001 | 2.94 (2.7-3.19) | <.001 |
| | 3 | 2119 (0.05) | 775 (0.37) | 7.48 (6.76, 8.28) | <.001 | 5.24 (4.42-6.2) | <.001 |
| UO-Average | 0 | 16959 (0.37) | 1044 (0.06) | 1 [Reference] | NA | 1 [Reference] | NA |
| | 1 | 9204 (0.2) | 924 (0.1) | 1.7 (1.55, 1.87) | <.001 | 1.48 (1.34-1.63) | <.001 |
| | 2 | 15255 (0.33) | 2111 (0.14) | 2.45 (2.27, 2.65) | <.001 | 2.11 (1.93-2.31) | <.001 |
| | 3 | 4926 (0.11) | 1573 (0.32) | 7.15 (6.56, 7.80) | <.001 | 5.59 (4.92-6.36) | <.001 |
| Block summation | 0 | 18927 (0.41) | 1231 (0.07) | 1 [Reference] | NA | 1 [Reference] | NA |
| | 1 | 8248 (0.18) | 825 (0.1) | 1.6 (1.46, 1.75) | <.001 | 1.54 (1.4-1.69) | <.001 |
| | 2 | 14830 (0.32) | 2143 (0.14) | 2.43 (2.26, 2.61) | <.001 | 2.38 (2.2-2.57) | <.001 |
| | 3 | 4110 (0.09) | 1435 (0.35) | 7.71 (7.07, 8.41) | <.001 | 8.11 (7.29-9.03) | <.001 |

All covariates in the adjusted model were significant except for diagnosis at admission in the block summation model.
¹ Model includes age, weight, gender, and whether AKI was diagnosed on admission
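
The adjusted odds ratios come from `avg_comparisons()` with `comparison = "lnoravg"`, which, as far as we understand it, contrasts the odds of the average predicted risk between stages. The sketch below reproduces that idea by hand in base R on simulated data: every row is set to each stage in turn, the predicted risks are averaged, and the resulting odds are compared with stage 0. The data frame `toy`, its columns, and the fitted model are hypothetical; this is not the package's own code.

```r
# Minimal base-R sketch on simulated data (names are illustrative assumptions).
set.seed(1)
n <- 2000
toy <- data.frame(
  stage = factor(sample(0:3, n, replace = TRUE)),
  age   = rnorm(n, mean = 65, sd = 15)
)
lp <- -3 + 0.4 * as.integer(as.character(toy$stage)) + 0.02 * toy$age
toy$died <- rbinom(n, 1, plogis(lp))

# Logistic model with a stage-by-covariate interaction, as in the adjusted models
fit <- glm(died ~ stage * age, family = binomial, data = toy)

# Average predicted risk if every patient were assigned to a given stage
avg_risk <- sapply(levels(toy$stage), function(s) {
  counterfactual <- toy
  counterfactual$stage <- factor(rep(s, nrow(toy)), levels = levels(toy$stage))
  mean(predict(fit, newdata = counterfactual, type = "response"))
})

# Odds of the averaged risks, then the ratio against stage 0
odds <- avg_risk / (1 - avg_risk)
or_of_averages <- odds[-1] / odds[1]
round(or_of_averages, 2)
```

Averaging over the observed covariate distribution yields one odds ratio per stage even though the model includes interactions, which is why the adjusted column can sit next to the unadjusted one.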

Summaries for all adjusted models:

summary(m1_adj1)
## 
## Call:
## glm(formula = mortality_30 ~ stage_newcons * (admission_age + 
##     gender + weight_admit + first_stage), family = binomial, 
##     data = (akis_all_long_complete %>% filter(group == "newcons") %>% 
##         mutate(stage_newcons = as.factor(max_stage))))
## 
## Coefficients: (1 not defined because of singularities)
##                               Estimate Std. Error z value Pr(>|z|)    
## (Intercept)                  -3.202164   0.179984 -17.791  < 2e-16 ***
## stage_newcons1                0.967897   0.290305   3.334 0.000856 ***
## stage_newcons2                1.493116   0.269739   5.535 3.11e-08 ***
## stage_newcons3                0.774332   0.354520   2.184 0.028950 *  
## admission_age                 0.032089   0.001698  18.896  < 2e-16 ***
## genderM                       0.242934   0.055858   4.349 1.37e-05 ***
## weight_admit                 -0.022223   0.001703 -13.052  < 2e-16 ***
## first_stage                   0.514027   0.101331   5.073 3.92e-07 ***
## stage_newcons1:admission_age -0.008540   0.002735  -3.122 0.001796 ** 
## stage_newcons2:admission_age -0.009419   0.002571  -3.664 0.000248 ***
## stage_newcons3:admission_age -0.007443   0.003570  -2.085 0.037097 *  
## stage_newcons1:genderM       -0.175030   0.084691  -2.067 0.038763 *  
## stage_newcons2:genderM       -0.162809   0.079392  -2.051 0.040297 *  
## stage_newcons3:genderM       -0.226078   0.110594  -2.044 0.040932 *  
## stage_newcons1:weight_admit   0.002444   0.002439   1.002 0.316316    
## stage_newcons2:weight_admit   0.005524   0.002191   2.521 0.011687 *  
## stage_newcons3:weight_admit   0.020724   0.002500   8.291  < 2e-16 ***
## stage_newcons1:first_stage    0.172714   0.123130   1.403 0.160706    
## stage_newcons2:first_stage   -0.158296   0.115697  -1.368 0.171254    
## stage_newcons3:first_stage          NA         NA      NA       NA    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 34369  on 46343  degrees of freedom
## Residual deviance: 30601  on 46325  degrees of freedom
## AIC: 30639
## 
## Number of Fisher Scoring iterations: 6
summary(m2_adj1)
## 
## Call:
## glm(formula = mortality_30 ~ stage_newmean * (admission_age + 
##     weight_admit + gender + first_stage), family = binomial, 
##     data = (akis_all_long_complete %>% filter(group == "newmean") %>% 
##         mutate(stage_newmean = as.factor(max_stage))))
## 
## Coefficients: (1 not defined because of singularities)
##                                Estimate Std. Error z value Pr(>|z|)    
## (Intercept)                  -3.520e+00  2.260e-01 -15.573  < 2e-16 ***
## stage_newmean1                9.096e-01  3.511e-01   2.591  0.00958 ** 
## stage_newmean2                1.716e+00  2.918e-01   5.880 4.10e-09 ***
## stage_newmean3                2.080e+00  3.151e-01   6.603 4.04e-11 ***
## admission_age                 3.372e-02  2.110e-03  15.980  < 2e-16 ***
## weight_admit                 -2.127e-02  2.231e-03  -9.533  < 2e-16 ***
## genderM                       2.270e-01  7.112e-02   3.192  0.00141 ** 
## first_stage                   3.498e-01  6.812e-02   5.136 2.81e-07 ***
## stage_newmean1:admission_age -8.734e-03  3.300e-03  -2.647  0.00813 ** 
## stage_newmean2:admission_age -1.273e-02  2.733e-03  -4.658 3.19e-06 ***
## stage_newmean3:admission_age -1.497e-02  3.036e-03  -4.931 8.18e-07 ***
## stage_newmean1:weight_admit  -1.866e-05  3.140e-03  -0.006  0.99526    
## stage_newmean2:weight_admit  -1.034e-03  2.627e-03  -0.394  0.69383    
## stage_newmean3:weight_admit   1.143e-02  2.607e-03   4.384 1.17e-05 ***
## stage_newmean1:genderM       -9.116e-02  1.046e-01  -0.871  0.38353    
## stage_newmean2:genderM       -4.251e-02  8.781e-02  -0.484  0.62831    
## stage_newmean3:genderM       -2.000e-01  9.605e-02  -2.083  0.03728 *  
## stage_newmean1:first_stage    5.619e-01  1.008e-01   5.572 2.51e-08 ***
## stage_newmean2:first_stage    2.542e-01  8.411e-02   3.022  0.00251 ** 
## stage_newmean3:first_stage           NA         NA      NA       NA    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 34369  on 46343  degrees of freedom
## Residual deviance: 30390  on 46325  degrees of freedom
## AIC: 30428
## 
## Number of Fisher Scoring iterations: 6
summary(m3_adj1)
## 
## Call:
## glm(formula = mortality_30 ~ stage_old * (admission_age + weight_admit + 
##     gender + first_stage), family = binomial, data = (akis_all_long_complete %>% 
##     filter(group == "old") %>% mutate(stage_old = as.factor(max_stage))))
## 
## Coefficients: (1 not defined because of singularities)
##                           Estimate Std. Error z value Pr(>|z|)    
## (Intercept)              -3.441839   0.210510 -16.350  < 2e-16 ***
## stage_old1                0.661224   0.354127   1.867 0.061874 .  
## stage_old2                1.806640   0.281955   6.408 1.48e-10 ***
## stage_old3                2.881287   0.315823   9.123  < 2e-16 ***
## admission_age             0.034303   0.001960  17.498  < 2e-16 ***
## weight_admit             -0.022277   0.002060 -10.816  < 2e-16 ***
## genderM                   0.241431   0.065597   3.681 0.000233 ***
## first_stage              -0.014451   0.046307  -0.312 0.754995    
## stage_old1:admission_age -0.008355   0.003318  -2.518 0.011809 *  
## stage_old2:admission_age -0.013912   0.002636  -5.278 1.31e-07 ***
## stage_old3:admission_age -0.019748   0.003061  -6.450 1.12e-10 ***
## stage_old1:weight_admit   0.003504   0.003099   1.131 0.258169    
## stage_old2:weight_admit   0.001462   0.002474   0.591 0.554449    
## stage_old3:weight_admit   0.009982   0.002507   3.981 6.86e-05 ***
## stage_old1:genderM       -0.101982   0.103987  -0.981 0.326729    
## stage_old2:genderM       -0.137826   0.083048  -1.660 0.096996 .  
## stage_old3:genderM       -0.161356   0.095257  -1.694 0.090284 .  
## stage_old1:first_stage    0.797323   0.094909   8.401  < 2e-16 ***
## stage_old2:first_stage    0.269415   0.064578   4.172 3.02e-05 ***
## stage_old3:first_stage          NA         NA      NA       NA    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 34235  on 46113  degrees of freedom
## Residual deviance: 30322  on 46095  degrees of freedom
##   (1 observation deleted due to missingness)
## AIC: 30360
## 
## Number of Fisher Scoring iterations: 6
km_fit_newcons <-
  survfit(Surv(FOLLOWUP_DAYS, DEATH_FLAG) ~ MAX_STAGE_NEW_CONS,
          akis_all_wide)

km_fit_newmean <-
  survfit(Surv(FOLLOWUP_DAYS, DEATH_FLAG) ~ MAX_STAGE_NEW_MEAN,
          akis_all_wide)

km_fit_old <-
  survfit(Surv(FOLLOWUP_DAYS, DEATH_FLAG) ~ MAX_STAGE_OLD,
          akis_all_wide)

mimic_survival_figure_newcons <- km_fit_newcons %>%
ggsurvfit(linewidth = 1) +
  add_confidence_interval() +
  add_quantile() +
  scale_ggsurvfit(x_scales = list(breaks = seq(0, 30, by = 5))) +
  coord_cartesian(xlim = c(0, 30)) +
  theme_classic() +
  scale_fill_discrete(labels=c('0', '1', '2', '3')) +
  scale_color_discrete(labels=c('0', '1', '2', '3')) +
  labs(x="Days", y = "Survival", title = "UOcons",
       color='Maximum KDIGO-UO stage', fill='Maximum KDIGO-UO stage') +
  theme(legend.position = "bottom")

mimic_survival_figure_newmean <- km_fit_newmean %>%
ggsurvfit(linewidth = 1) +
  add_confidence_interval() +
  add_quantile() +
  scale_ggsurvfit(x_scales = list(breaks = seq(0, 30, by = 5))) +
  coord_cartesian(xlim = c(0, 30)) +
  theme_classic() +
  scale_fill_discrete(labels=c('0', '1', '2', '3')) +
  scale_color_discrete(labels=c('0', '1', '2', '3')) +
  labs(x="Days", y = "Survival", title = "UOmean",
       color='Maximum KDIGO-UO stage', fill='Maximum KDIGO-UO stage') +
  theme(legend.position = "bottom")

mimic_survival_figure_old <- km_fit_old %>%
ggsurvfit(linewidth = 1) +
  add_confidence_interval() +
  add_quantile() +
  scale_ggsurvfit(x_scales = list(breaks = seq(0, 30, by = 5))) +
  coord_cartesian(xlim = c(0, 30)) +
  theme_classic() +
  scale_fill_discrete(labels=c('0', '1', '2', '3')) +
  scale_color_discrete(labels=c('0', '1', '2', '3')) +
  labs(x="Days", y = "Survival",title = "Block Summation",
       color='Maximum KDIGO-UO stage', fill='Maximum KDIGO-UO stage') +
  theme(legend.position = "bottom")

mimic_kdigo_inter_survival_figure <- ggarrange(
  mimic_survival_figure_old,
  mimic_survival_figure_newmean,
  mimic_survival_figure_newcons,
  ncol = 1,
  nrow = 3,
  legend = "bottom",
  common.legend = TRUE
) %>% annotate_figure(top = text_grob("MIMICdb", face = "bold", size = 14))

mimic_kdigo_inter_survival_figure

Sensitivity Analysis: Validity Threshold for Durations of Collection

validity_threshold_long <- validity_threshold_wide %>%
  mutate(
         first_stage.cons = FIRST_STAGE_NEW_CONS,
         first_stage.9520 = FIRST_STAGE_NEW_CONS_95_20,
         first_stage.9920 = FIRST_STAGE_NEW_CONS_99_20,
         max_stage.cons = MAX_STAGE_NEW_CONS,
         max_stage.9520 = MAX_STAGE_NEW_CONS_95_20,
         max_stage.9920 = MAX_STAGE_NEW_CONS_99_20,
         .keep = "unused") %>%
  pivot_longer(
    !c(STAY_ID, FOLLOWUP_DAYS, DEATH_FLAG),
    names_sep = "\\.",
    names_to = c(".value", "group")
  ) %>%
  mutate(
    mortality_90 = if_else(FOLLOWUP_DAYS < 91 &
                                 DEATH_FLAG == 1, 1, 0),
    prevalnce_admit = if_else(first_stage > 0, 1, 0),
    Incidence_first_72hr = case_when(first_stage > 0 ~ NA, max_stage == 0 ~ 0, max_stage > 0 ~ 1),
    Incidence_first_72hr_with_stage = ifelse(first_stage == 0 &
                                              max_stage > 0, max_stage, NA),
    .keep = "all"
  ) 
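
The derived flags above can be illustrated on a few synthetic stays. This is a minimal base-R sketch of the same rules (the report itself uses `if_else()` and `case_when()`); the `toy` data frame and its values are hypothetical and are not taken from MIMIC-IV.

```r
# Synthetic stays (hypothetical values, not study data)
toy <- data.frame(
  STAY_ID       = 1:4,
  first_stage   = c(0, 0, 2, 1),
  max_stage     = c(0, 2, 3, 1),
  FOLLOWUP_DAYS = c(400, 45, 10, 120),
  DEATH_FLAG    = c(0, 1, 1, 1)
)

# Death recorded within 90 days of follow-up
toy$mortality_90 <- ifelse(toy$FOLLOWUP_DAYS < 91 & toy$DEATH_FLAG == 1, 1, 0)

# AKI already present at the first staging ("prevalence at admission")
toy$prevalence_admit <- ifelse(toy$first_stage > 0, 1, 0)

# Incident AKI: undefined (NA) when AKI was already present at admission,
# otherwise 1 if any positive stage was reached and 0 if not
toy$incidence_first_72hr <- ifelse(toy$first_stage > 0, NA,
                                   ifelse(toy$max_stage > 0, 1, 0))

toy
```

Stays 3 and 4 start with a positive stage, so they count toward prevalence and are set to `NA` for incidence rather than 0.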

validity_threshold_long %>%
  drop_na(prevalnce_admit) %>%
  transmute(
    group,
    aki_binary = if_else(max_stage > 0, 1, 0),
    max_stage = if_else(max_stage == 0, NA, max_stage),
    prevalnce_admit
  ) %>%
  mutate(group =
           case_when(group == "cons" ~ "No exclusion",
                     group == "9520" ~ "95th percentile for rate below 20th percentile",
                     group == "9920" ~ "99th percentile for rate below 20th percentile")) %>%
  tbl_summary(
    by = "group",
    missing = "no",
    digits = everything() ~ c(0, 1),
    label = list(
      aki_binary ~ "Oliguric-AKI on the first days",
      prevalnce_admit ~ "Prevalence at admission",
      max_stage ~ "Maximum KDIGO staging"
    )
  )  %>%
  modify_column_indent(columns = label, rows = c(FALSE, TRUE)) %>%
  modify_column_indent(
    columns = label,
    rows = c(FALSE, FALSE, TRUE, TRUE, TRUE),
    double_indent = TRUE
  ) %>%
  add_p()
Characteristic                     95th percentile   99th percentile   No exclusion      p-value²
                                   for rate below    for rate below
                                   20th percentile   20th percentile
                                   N = 45,804¹       N = 46,278¹       N = 46,347¹
Oliguric-AKI on the first days     20,837 (45.5%)    22,105 (47.8%)    22,373 (48.3%)    <0.001
    Maximum KDIGO staging                                                                <0.001
        1                          11,034 (53.0%)    11,447 (51.8%)    11,262 (50.3%)
        2                          8,528 (40.9%)     8,792 (39.8%)     8,992 (40.2%)
        3                          1,275 (6.12%)     1,866 (8.44%)     2,119 (9.47%)
Prevalence at admission            5,941 (13.0%)     6,250 (13.5%)     6,388 (13.8%)     0.001
¹ n (%)
² Pearson’s Chi-squared test
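
The p-value in the first row of the table can be checked directly from the printed counts. A minimal sketch, assuming (as the footnote states) a Pearson chi-squared test on the 3 × 2 table of AKI versus no AKI by threshold group, and assuming no missing values in that row; the object names are illustrative only.

```r
# Counts copied from the "Oliguric-AKI on the first days" row above
aki   <- c(p95_20 = 20837, p99_20 = 22105, no_exclusion = 22373)
total <- c(p95_20 = 45804, p99_20 = 46278, no_exclusion = 46347)

# 2 x 3 contingency table: AKI vs. no AKI in each threshold group
counts <- rbind(aki = aki, no_aki = total - aki)

# Percentages as printed in the table
pct <- round(100 * aki / total, 1)
pct

# Pearson chi-squared test of independence
test_result <- chisq.test(counts)
test_result$p.value
```

The same stays appear in all three groups, so this only reproduces the arithmetic behind the printed p-value; it does not treat the groups as paired.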
mimic_exclusion_threshold <- validity_threshold_long %>%
  transmute(STAY_ID,
            group,
            first_stage,
            max_stage,
            aki_above_2 = max_stage > 1) %>%
  filter(aki_above_2 == TRUE,
         (group == "cons" | group == "9520" | group == "9920")) %>%
  left_join(table_1, by = "STAY_ID") %>%
    mutate(group =
           case_when(group == "cons" ~ "No exclusion",
                     group == "9520" ~ "95th percentile for rate below 20th percentile",
                     group == "9920" ~ "99th percentile for rate below 20th percentile")) %>%
  select(
    group,
    admission_age,
    weight_admit,
    gender,
    race,
    charlson_comorbidity_index,
    ckd,
    dm,
    sofa_first_day,
    sapsii,
    apsiii,
    creat_first,
    creat_peak_72,
    creat_last,
    kdigo_cr_max,
  # scr_baseline,
    hospital_days,
    icu_days,
    rrt_binary,
    hospital_expire_flag
  ) %>%
  tbl_summary(
    by = group,
    type = list(
      c(hospital_expire_flag,
        ckd,
        dm,
        rrt_binary) ~ "dichotomous",
      c(admission_age,
        weight_admit,
        creat_first,
        creat_peak_72,
        creat_last) ~ "continuous"
    ),
    statistic = c(admission_age,
                  weight_admit,
                  creat_first,
                  creat_peak_72,
                  creat_last) ~ "{mean} ({sd})",
    missing = "no",
    label = list(
      admission_age ~ "Age at Hospital Admission, years",
      gender ~ "Gender",
      weight_admit ~ "Weight at ICU Admission, kg",
      charlson_comorbidity_index ~ "CCI Score",
      sofa_first_day ~ "SOFA Score at ICU Admission",
      ckd ~ "CKD, Stage 1-4",
      apsiii ~ "APS-III Score at ICU Admission",
      creat_first ~ "First Creatinine in ICU, mg/dL",
      creat_peak_72 ~ "Peak Creatinine at first days, mg/dL",
      creat_last ~ "ICU Discharge Creatinine, mg/dL",
      kdigo_cr_max ~ "Peak KDIGO-Cr at first days",
      race ~ "Ethnicity",
      icu_days ~ "Time in ICU, days",
      hospital_days ~ "Time in hospital, days",
      rrt_binary ~ "Renal replacement therapy",
      hospital_expire_flag ~ "Hospital Mortality",
      dm ~ "Diabetes Mellitus",
      sapsii ~ "SAPS-II at ICU Admission"
    )
  ) %>%
  add_p() %>%
  add_stat(
    fns = everything() ~ add_by_n
  ) %>%
  modify_header(starts_with("add_n_stat") ~ "**N**") %>%
  modify_table_body(
    ~ .x %>%
      dplyr::relocate(add_n_stat_1, .before = stat_1) %>%
      dplyr::relocate(add_n_stat_2, .before = stat_2) %>%
      dplyr::relocate(add_n_stat_3, .before = stat_3)
  )

mimic_exclusion_threshold
Characteristic                        N       95th percentile   N       99th percentile   N       No exclusion      p-value²
                                              for rate below            for rate below
                                              20th percentile           20th percentile
                                              N = 9,803¹                N = 10,658¹               N = 11,111¹
Age at Hospital Admission, years      9,803   67 (16)           10,658  67 (16)           11,111  67 (16)           0.9
Weight at ICU Admission, kg           9,803   89 (27)           10,658  89 (27)           11,111  88 (27)           0.036
Gender                                9,803                     10,658                    11,111                    >0.9
    F                                         4,362 (44%)               4,745 (45%)               4,950 (45%)
    M                                         5,441 (56%)               5,913 (55%)               6,161 (55%)
Ethnicity                             8,427                     9,142                     9,527                     >0.9
    African American                          967 (11%)                 1,048 (11%)               1,084 (11%)
    Asian                                     169 (2.0%)                193 (2.1%)                205 (2.2%)
    Caucasian                                 6,632 (79%)               7,186 (79%)               7,497 (79%)
    Hispanic                                  279 (3.3%)                299 (3.3%)                310 (3.3%)
    Other                                     380 (4.5%)                416 (4.6%)                431 (4.5%)
CCI Score                             9,803   5 (3, 7)          10,658  5 (3, 7)          11,111  5 (3, 7)          0.8
CKD, Stage 1-4                        9,799   2,332 (24%)       10,654  2,582 (24%)       11,107  2,678 (24%)       0.8
Diabetes Mellitus                     9,799   2,438 (25%)       10,654  2,622 (25%)       11,107  2,710 (24%)       0.7
SOFA Score at ICU Admission           9,803   5 (2, 8)          10,658  5 (2, 8)          11,111  5 (2, 8)          0.6
SAPS-II at ICU Admission              9,743   39 (30, 50)       10,590  39 (30, 50)       11,041  39 (29, 50)       0.6
APS-III Score at ICU Admission        9,803   46 (34, 63)       10,658  46 (34, 64)       11,111  46 (34, 63)       0.6
First Creatinine in ICU, mg/dL        9,775   1.66 (1.83)       10,621  1.70 (1.91)       11,074  1.70 (1.91)       0.8
Peak Creatinine at first days, mg/dL  9,771   2.07 (2.12)       10,617  2.11 (2.19)       11,065  2.10 (2.19)       0.6
ICU Discharge Creatinine, mg/dL       9,775   1.68 (1.71)       10,621  1.70 (1.75)       11,074  1.69 (1.74)       0.7
Peak KDIGO-Cr at first days           9,737                     10,577                    11,015                    >0.9
    0                                         5,578 (57%)               6,039 (57%)               6,357 (58%)
    1                                         2,557 (26%)               2,805 (27%)               2,887 (26%)
    2                                         715 (7.3%)                759 (7.2%)                773 (7.0%)
    3                                         887 (9.1%)                974 (9.2%)                998 (9.1%)
Time in hospital, days                9,803   8 (5, 14)         10,658  8 (5, 14)         11,111  8 (5, 14)         0.5
Time in ICU, days                     9,803   3.1 (1.9, 5.7)    10,658  3.1 (1.9, 5.7)    11,111  3.1 (1.9, 5.7)    0.9
Renal replacement therapy             9,803   1,345 (14%)       10,658  1,519 (14%)       11,111  1,574 (14%)       0.5
Hospital Mortality                    9,803   1,805 (18%)       10,658  2,031 (19%)       11,111  2,111 (19%)       0.4
¹ Mean (SD); n (%); Median (Q1, Q3)
² Kruskal-Wallis rank sum test; Pearson’s Chi-squared test

Clinical Outcomes

Prevalence of oliguric AKI at ICU admission and its incidence during the first ICU days:

akis_all_long %>%
  filter(group == "newcons") %>%
  select(prevalnce_admit,
         Incidence_first_72hr,
         max_stage) %>%
  drop_na(prevalnce_admit) %>%
  transmute(
    aki_binary = if_else(max_stage > 0, 1, 0),
    max_stage = if_else(max_stage == 0, NA, max_stage),
    prevalnce_admit
  ) %>%
  tbl_summary(
    missing = "no",
    digits = everything() ~ c(0, 1),
    label = list(
      aki_binary ~ "Oliguric-AKI on the first days",
      prevalnce_admit ~ "Prevalence at admission",
      max_stage ~ "Maximum KDIGO staging"
    )
  ) %>%
  modify_column_indent(columns = label, 
                       rows = c(FALSE, TRUE)) %>%
  modify_column_indent(columns = label, 
                       rows = c(FALSE, FALSE, TRUE, TRUE, TRUE),
                       double_indent = TRUE)
Characteristic                     N = 46,344¹
Oliguric-AKI on the first days     22,372 (48.3%)
    Maximum KDIGO staging
        1                          11,262 (50.3%)
        2                          8,991 (40.2%)
        3                          2,119 (9.47%)
Prevalence at admission            6,388 (13.8%)
¹ n (%)
aki_uo_analysis <- left_join(akis_all_wide, uo_ml_kg_hr, by = "STAY_ID") %>%
  drop_na(FIRST_POSITIVE_STAGE_UO_CONS_TIME, 
          TIME_INTERVAL_FINISH, 
          MAX_STAGE_NEW_CONS) %>%
  transmute(STAY_ID,
         MAX_STAGE_NEW_CONS = as.character(MAX_STAGE_NEW_CONS),
         FIRST_POSITIVE_STAGE_UO_CONS_TIME,
         TIME_INTERVAL_FINISH,
         UO_KG = ML_KG_HR
         ) %>%
  mutate(TIME = as.double(difftime(TIME_INTERVAL_FINISH, 
                                   FIRST_POSITIVE_STAGE_UO_CONS_TIME, 
                                   units = c("hour")))) %>%
  filter(TIME >= -48 & TIME <= 48)

aki_creat_analysis <- left_join(akis_all_wide, creat_diff %>% select(-STAY_ID), by = "HADM_ID") %>%
  select(STAY_ID,
         MAX_STAGE_NEW_CONS,
         FIRST_STAGE_NEW_CONS,
         FIRST_POSITIVE_STAGE_UO_CONS_TIME,
         CHARTTIME,
         CREAT,
         SCR_BASELINE,
         CREAT_BASLINE_DIFF,
         CREAT_BASLINE_RATIO,
         CREAT_LOWEST7_DIFF,
         CREAT_LOWEST7_RATIO
         ) %>%
  mutate(AKI_TO_CREAT = as.double(difftime(CHARTTIME, 
                                   FIRST_POSITIVE_STAGE_UO_CONS_TIME, 
                                   units = c("mins"))) / 60) %>%
  filter(AKI_TO_CREAT >= -72 & AKI_TO_CREAT <= 72)
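
Both data frames above place each measurement on a time axis centred on the first positive UO stage and keep only a window around it. A minimal sketch of that step, using hypothetical timestamps (`onset` and `charted` are made-up values, not study data):

```r
# Hypothetical AKI onset and three charted measurement times
onset   <- as.POSIXct("2150-01-02 12:00:00", tz = "UTC")
charted <- as.POSIXct(c("2150-01-01 06:00:00",
                        "2150-01-02 18:30:00",
                        "2150-01-06 12:00:00"), tz = "UTC")

# Signed offset in hours: negative = before onset, positive = after onset
hours_from_onset <- as.double(difftime(charted, onset, units = "hours"))
hours_from_onset

# Keep measurements within 48 hours on either side of onset
in_window <- hours_from_onset >= -48 & hours_from_onset <= 48
in_window
```

Here the offsets are -30, 6.5 and 96 hours, so the third measurement falls outside a ±48-hour window.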

First Oliguric-AKI Events

Table 1 for ICU stays, stratified by maximum KDIGO-UO stage within the first 72 hours of admission (ICU stays with AKI at admission were excluded):

table1_akis <- akis_all_wide  %>%
  filter(MAX_STAGE_NEW_CONS >= 0) %>%
  select(STAY_ID, MAX_STAGE_NEW_CONS) %>%
  left_join(table_1, by = "STAY_ID")

table_1_staging <- table1_akis %>%
  select(
    MAX_STAGE_NEW_CONS,
    admission_age,
    weight_admit,
    gender,
    race,
    charlson_comorbidity_index,
    ckd,
    dm,
    sofa_first_day,
    sapsii,
    apsiii,
    creat_first,
    creat_peak_72,
    creat_last,
    kdigo_cr_max,
    hospital_days,
    icu_days,
    rrt_binary,
    hospital_expire_flag
  ) %>%
  mutate(
    staging = case_when(
      MAX_STAGE_NEW_CONS == 0 ~ "No AKI",
      MAX_STAGE_NEW_CONS == 1 ~ "Stage 1",
      MAX_STAGE_NEW_CONS == 2 ~ "Stage 2",
      MAX_STAGE_NEW_CONS == 3 ~ "Stage 3",
      .default = NA
    ),
    .keep = "unused"
  ) %>%
  tbl_summary(
    by = staging,
    type = list(
      c(hospital_expire_flag, ckd, dm, rrt_binary) ~ "dichotomous",
      c(
        admission_age,
        weight_admit,
        creat_first,
        creat_peak_72,
        creat_last
      ) ~ "continuous"
    ),
    statistic = c(
      admission_age,
      weight_admit,
      creat_first,
      creat_peak_72,
      creat_last
    ) ~ "{mean} ({sd})",
    missing = "no",
    missing_text = "-",
    digits = list(hospital_days ~ c(1, 1)),
    label = list(
      admission_age ~ "Age at Hospital Admission, years",
      gender ~ "Gender",
      weight_admit ~ "Weight at ICU Admission, kg",
      charlson_comorbidity_index ~ "CCI Score",
      sofa_first_day ~ "SOFA Score at ICU Admission",
      ckd ~ "CKD, Stage 1-4",
      apsiii ~ "APS-III Score at ICU Admission",
      creat_first ~ "First Creatinine in ICU, mg/dL",
      creat_peak_72 ~ "Peak Creatinine at first days, mg/dL",
      creat_last ~ "ICU Discharge Creatinine, mg/dL",
      kdigo_cr_max ~ "Peak KDIGO-Cr at first days",
      race ~ "Ethnicity",
      icu_days ~ "Time in ICU, days",
      hospital_days ~ "Time in hospital, days",
      rrt_binary ~ "Renal replacement therapy",
      hospital_expire_flag ~ "Hospital Mortality",
      dm ~ "Diabetes Mellitus",
      sapsii ~ "SAPS-II at ICU Admission"
    )
  ) %>%
  add_p(c(
    admission_age,
    weight_admit,
    creat_first,
    creat_peak_72,
    creat_last
  ) ~ "aov")

table_1_staging 
Characteristic                        No AKI            Stage 1           Stage 2           Stage 3           p-value²
                                      N = 23,972¹       N = 11,262¹       N = 8,991¹        N = 2,119¹
Age at Hospital Admission, years      62 (18)           67 (16)           68 (16)           66 (16)           <0.001
Weight at ICU Admission, kg           77 (20)           84 (23)           89 (27)           88 (28)           <0.001
Gender                                                                                                        <0.001
    F                                 10,590 (44%)      4,710 (42%)       3,987 (44%)       963 (45%)
    M                                 13,382 (56%)      6,552 (58%)       5,004 (56%)       1,156 (55%)
Ethnicity                                                                                                     <0.001
    African American                  2,051 (9.8%)      910 (9.3%)        840 (11%)         244 (14%)
    Asian                             895 (4.3%)        232 (2.4%)        146 (1.9%)        59 (3.4%)
    Caucasian                         16,023 (76%)      7,795 (80%)       6,228 (80%)       1,268 (73%)
    Hispanic                          935 (4.5%)        325 (3.3%)        243 (3.1%)        67 (3.9%)
    Other                             1,053 (5.0%)      471 (4.8%)        336 (4.3%)        95 (5.5%)
CCI Score                             4 (2, 6)          5 (3, 7)          5 (3, 7)          6 (4, 8)          <0.001
CKD, Stage 1-4                        3,101 (13%)       1,802 (16%)       1,879 (21%)       799 (38%)         <0.001
Diabetes Mellitus                     4,933 (21%)       2,662 (24%)       2,201 (24%)       508 (24%)         <0.001
SOFA Score at ICU Admission           3 (1, 5)          4 (2, 6)          4 (2, 7)          8 (4, 12)         <0.001
SAPS-II at ICU Admission              30 (22, 38)       33 (26, 42)       37 (29, 47)       49 (37, 60)       <0.001
APS-III Score at ICU Admission        34 (26, 44)       38 (29, 50)       43 (33, 58)       64 (45, 83)       <0.001
First Creatinine in ICU, mg/dL        1.14 (1.04)       1.19 (1.02)       1.42 (1.51)       2.87 (2.81)       <0.001
Peak Creatinine at first days, mg/dL  1.21 (1.07)       1.33 (1.10)       1.70 (1.70)       3.79 (3.07)       <0.001
ICU Discharge Creatinine, mg/dL       1.00 (0.79)       1.12 (0.95)       1.40 (1.38)       2.92 (2.46)       <0.001
Peak KDIGO-Cr at first days                                                                                   <0.001
    0                                 20,132 (86%)      8,387 (75%)       5,646 (63%)       711 (34%)
    1                                 2,708 (12%)       2,166 (19%)       2,247 (25%)       639 (30%)
    2                                 440 (1.9%)        369 (3.3%)        569 (6.4%)        204 (9.7%)
    3                                 175 (0.7%)        198 (1.8%)        454 (5.1%)        544 (26%)
Time in hospital, days                6.0 (3.0, 10.0)   7.0 (4.0, 12.0)   8.0 (5.0, 13.0)   10.0 (5.0, 18.0)  <0.001
Time in ICU, days                     1.4 (1.0, 2.5)    2.2 (1.3, 4.0)    2.9 (1.8, 5.1)    4.4 (2.8, 8.2)    <0.001
Renal replacement therapy             281 (1.2%)        257 (2.3%)        604 (6.7%)        970 (46%)         <0.001
Hospital Mortality                    1,163 (4.9%)      1,005 (8.9%)      1,440 (16%)       671 (32%)         <0.001
¹ Mean (SD); n (%); Median (Q1, Q3)
² One-way analysis of means; Pearson’s Chi-squared test; Kruskal-Wallis rank sum test

Urine output around the onset of the first UOcons AKI event:

mimic_uo_cons_figure <- aki_uo_analysis %>% 
ggplot(aes(TIME, UO_KG, color=MAX_STAGE_NEW_CONS, fill=MAX_STAGE_NEW_CONS))  + 
           # linetype=MAX_STAGE_NEW_CONS))  + 
  geom_hline(yintercept=0.3, linewidth = 0.3, color = "#cccccc") +
  geom_hline(yintercept=0.5, linewidth = 0.3, color = "#cccccc") +
  geom_vline(xintercept=0, linewidth = 0.3, color = "black", linetype = "dashed") +
  stat_summary(fun = median, geom="line") +
  scale_x_continuous(breaks = seq(-24, 48, by=6)) +
  scale_y_continuous(breaks = c(0, 0.3, 0.5)) +
  coord_cartesian(xlim = c(-12, 24), ylim = c(0, 1.7)) +
  # xlim(-24, 48) +
  stat_summary(fun.min = function(z) { quantile(z,0.25) },
               fun.max = function(z) { quantile(z,0.75) },
               geom="ribbon", colour = NA, alpha=0.2) +
  labs(x="Time around AKI onset (hour)", y = "Urine output (ml/kg/hr)", 
       color="Maximum KDIGO-UO stage", fill="Maximum KDIGO-UO stage") + 
  theme_classic() + # remove panel background and gridlines
  scale_color_manual(values = pal_jama("default")(4)[2:4]) +
  scale_fill_manual(values = pal_jama("default")(4)[2:4]) +
  theme(
    legend.position = "none"
  )

mimic_uo_cons_figure

Serum Creatinine Analysis

aki_creat_analysis %>%
ggplot(aes(x=CREAT_BASLINE_DIFF)) + 
    xlim(0, 5) + 
    geom_histogram(binwidth = 0.1)

mimic_creat_a <- aki_creat_analysis %>% 
  filter(MAX_STAGE_NEW_CONS == 1,
         AKI_TO_CREAT >= -24,
         AKI_TO_CREAT <= 48) %>%
  mutate(AKI_TO_CREAT_BIN = cut(AKI_TO_CREAT, 
                                breaks=12, 
                                ordered_result = TRUE,
                                # labels=FALSE
                                )) %>%
ggplot(aes(factor(AKI_TO_CREAT_BIN), CREAT_LOWEST7_DIFF)) +
    geom_boxplot(linetype = "dashed", outlier.shape = NA, color="brown") +
    stat_boxplot(aes(ymin = after_stat(lower), ymax = after_stat(upper)), outlier.shape = NA, color="brown", fill="orange") +
    stat_boxplot(geom = "errorbar", aes(ymin = after_stat(ymax)), color="brown") +
    stat_boxplot(geom = "errorbar", aes(ymax = after_stat(ymin)), color="brown") +
    stat_summary(fun = mean, colour = "darkred", geom = "point", shape = 18, size = 2, show.legend = FALSE) +
    # scale_x_discrete(labels=c('-24      ','-18      ','-12      ','-6      ','0      ','6      ','12      ','18      ','24      ','30      ','36      ','42      ','48      ')) +
    labs(x=" ", y = "Difference from baseline (mg/dL)") +
    coord_cartesian(ylim = c(-0.1, 3)) +
    theme_classic() + # remove panel background and gridlines
    theme(legend.position = "bottom",
          plot.title = element_text(color = "#0099F8", size = 16, face = "bold"),
          plot.subtitle = element_text(size = 10, face = "bold"),
          # axis.text.x = element_text(margin = margin(t = 2),
          #                            hjust="1")
          ) +
  scale_x_discrete(breaks = function(x){x[c(TRUE, FALSE)]})
mimic_creat_b <- aki_creat_analysis %>% 
  filter(MAX_STAGE_NEW_CONS == 2,
         AKI_TO_CREAT >= -24,
         AKI_TO_CREAT <= 48) %>%
  mutate(AKI_TO_CREAT_BIN = cut(AKI_TO_CREAT, 
                                breaks=12, 
                                ordered_result = TRUE,
                                # labels=FALSE
                                )) %>%
ggplot(aes(factor(AKI_TO_CREAT_BIN), CREAT_LOWEST7_DIFF)) +
    geom_boxplot(linetype = "dashed", outlier.shape = NA, color="brown") +
    stat_boxplot(aes(ymin = after_stat(lower), ymax = after_stat(upper)), outlier.shape = NA, color="brown", fill="orange") +
    stat_boxplot(geom = "errorbar", aes(ymin = after_stat(ymax)), color="brown") +
    stat_boxplot(geom = "errorbar", aes(ymax = after_stat(ymin)), color="brown") +
    stat_summary(fun = mean, colour = "darkred", geom = "point", shape = 18, size = 2, show.legend = FALSE) +
    # scale_x_discrete(labels=c('-24      ','-18      ','-12      ','-6      ','0      ','6      ','12      ','18      ','24      ','30      ','36      ','42      ','48      ')) +
    labs(x=" ", y = " ") +
    coord_cartesian(ylim = c(-0.1, 3)) +
    theme_classic() + # remove panel background and gridlines
    theme(legend.position = "bottom",
          plot.title = element_text(color = "#0099F8", size = 16, face = "bold"),
          plot.subtitle = element_text(size = 10, face = "bold"),
          # axis.text.x = element_text(margin = margin(t = 2),
          #                            hjust="1")
          ) +
  scale_x_discrete(breaks = function(x){x[c(TRUE, FALSE)]})
mimic_creat_c <- aki_creat_analysis %>% 
  filter(MAX_STAGE_NEW_CONS == 3,
         AKI_TO_CREAT >= -24,
         AKI_TO_CREAT <= 48) %>%
  mutate(AKI_TO_CREAT_BIN = cut(AKI_TO_CREAT, 
                                breaks=12, 
                                ordered_result = TRUE,
                                # labels=FALSE
                                )) %>%
ggplot(aes(factor(AKI_TO_CREAT_BIN), CREAT_LOWEST7_DIFF)) +
    geom_boxplot(linetype = "dashed", outlier.shape = NA, color="brown") +
    stat_boxplot(aes(ymin = after_stat(lower), ymax = after_stat(upper)), outlier.shape = NA, color="brown", fill="orange") +
    stat_boxplot(geom = "errorbar", aes(ymin = after_stat(ymax)), color="brown") +
    stat_boxplot(geom = "errorbar", aes(ymax = after_stat(ymin)), color="brown") +
    stat_summary(fun = mean, colour = "darkred", geom = "point", shape = 18, size = 2, show.legend = FALSE) +
    # scale_x_discrete(labels=c('-24      ','-18      ','-12      ','-6      ','0      ','6      ','12      ','18      ','24      ','30      ','36      ','42      ','48      ')) +
    labs(x=" ", y = " ") +
    coord_cartesian(ylim = c(-0.1, 3)) +
    theme_classic() + # remove panel background and gridlines
    theme(legend.position = "bottom",
          plot.title = element_text(color = "#0099F8", size = 16, face = "bold"),
          plot.subtitle = element_text(size = 10, face = "bold"),
          # axis.text.x = element_text(margin = margin(t = 2),
          #                            hjust="1")
          ) +
  scale_x_discrete(breaks = function(x){x[c(TRUE, FALSE)]})
mimic_creat_d <- aki_creat_analysis %>% 
  filter(MAX_STAGE_NEW_CONS == 1,
         AKI_TO_CREAT >= -24,
         AKI_TO_CREAT <= 48) %>%
  mutate(AKI_TO_CREAT_BIN = cut(AKI_TO_CREAT, 
                                breaks=12, 
                                ordered_result = TRUE)) %>%
ggplot(aes(factor(AKI_TO_CREAT_BIN), CREAT_LOWEST7_RATIO)) +
    geom_boxplot(linetype = "dashed", outlier.shape = NA, color="brown") +
    stat_boxplot(aes(ymin = after_stat(lower), ymax = after_stat(upper)), outlier.shape = NA, color="brown", fill="orange") +
    stat_boxplot(geom = "errorbar", aes(ymin = after_stat(ymax)), color="brown") +
    stat_boxplot(geom = "errorbar", aes(ymax = after_stat(ymin)), color="brown") +
    stat_summary(fun = mean, colour = "darkred", geom = "point", shape = 18, size = 2, show.legend = FALSE) +
    labs(x=" ", y = "Relative sCr change") +
    coord_cartesian(ylim = c(1.01, 3)) +
    theme_classic() + # remove panel background and gridlines
    theme(legend.position = "bottom",
          plot.title = element_text(color = "#0099F8", size = 16, face = "bold"),
          plot.subtitle = element_text(size = 10, face = "bold")) +
  scale_x_discrete(breaks = function(x){x[c(TRUE, FALSE)]})
mimic_creat_e <- aki_creat_analysis %>% 
  filter(MAX_STAGE_NEW_CONS == 2,
         AKI_TO_CREAT >= -24,
         AKI_TO_CREAT <= 48) %>%
  mutate(AKI_TO_CREAT_BIN = cut(AKI_TO_CREAT, 
                                breaks=12, 
                                ordered_result = TRUE)) %>%
ggplot(aes(factor(AKI_TO_CREAT_BIN), CREAT_LOWEST7_RATIO)) +
    geom_boxplot(linetype = "dashed", outlier.shape = NA, color="brown") +
    stat_boxplot(aes(ymin = after_stat(lower), ymax = after_stat(upper)), outlier.shape = NA, color="brown", fill="orange") +
    stat_boxplot(geom = "errorbar", aes(ymin = after_stat(ymax)), color="brown") +
    stat_boxplot(geom = "errorbar", aes(ymax = after_stat(ymin)), color="brown") +
    stat_summary(fun = mean, colour = "darkred", geom = "point", shape = 18, size = 2, show.legend = FALSE) +
    labs(x="Time to AKI start (hours)", y = " ") +
    coord_cartesian(ylim = c(1.01, 3)) +
    theme_classic() + # remove panel background and gridlines
    theme(legend.position = "bottom",
          plot.title = element_text(color = "#0099F8", size = 16, face = "bold"),
          plot.subtitle = element_text(size = 10, face = "bold")) +
  scale_x_discrete(breaks = function(x){x[c(TRUE, FALSE)]})
mimic_creat_f <- aki_creat_analysis %>% 
  filter(MAX_STAGE_NEW_CONS == 3,
         AKI_TO_CREAT >= -24,
         AKI_TO_CREAT <= 48) %>%
  mutate(AKI_TO_CREAT_BIN = cut(AKI_TO_CREAT, 
                                breaks=12, 
                                ordered_result = TRUE)) %>%
ggplot(aes(factor(AKI_TO_CREAT_BIN), CREAT_LOWEST7_RATIO)) +
    geom_boxplot(linetype = "dashed", outlier.shape = NA, color="brown") +
    stat_boxplot(aes(ymin = after_stat(lower), ymax = after_stat(upper)), outlier.shape = NA, color="brown", fill="orange") +
    stat_boxplot(geom = "errorbar", aes(ymin = after_stat(ymax)), color="brown") +
    stat_boxplot(geom = "errorbar", aes(ymax = after_stat(ymin)), color="brown") +
    stat_summary(fun = mean, colour = "darkred", geom = "point", shape = 18, size = 2, show.legend = FALSE) +
    labs(x=" ", y = " ") +
    coord_cartesian(ylim = c(1.01, 3)) +
    theme_classic() + # remove panel background and gridlines
    theme(legend.position = "bottom",
          plot.title = element_text(color = "#0099F8", size = 16, face = "bold"),
          plot.subtitle = element_text(size = 10, face = "bold")) +
  scale_x_discrete(breaks = function(x){x[c(TRUE, FALSE)]})
ggarrange(
  mimic_creat_a,
  mimic_creat_b,
  mimic_creat_c,
  mimic_creat_d,
  mimic_creat_e,
  mimic_creat_f,
  labels = c("a", "b", "c", "d", "e", "f"),
  ncol = 3,
  nrow = 2,
  heights = c(1,1),
  legend = "bottom",
  common.legend = TRUE
)
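
The six panels above bin the time axis with `cut(AKI_TO_CREAT, breaks = 12)`. With a single number for `breaks`, `cut()` splits the observed range of the data into that many equal-width intervals, so the bin edges follow the data rather than fixed clock hours. The sketch below uses synthetic offsets and compares this with explicit 6-hour breaks; `fixed_bins` is a hypothetical alternative shown for illustration, not the binning used in the report.

```r
set.seed(1)
# Synthetic offsets (hours) spanning the plotted window; not study data
hours <- c(-24, runif(50, min = -24, max = 48), 48)

# As in the report: 12 equal-width bins over the observed range
auto_bins <- cut(hours, breaks = 12, ordered_result = TRUE)

# Hypothetical alternative: explicit 6-hour bins from -24 to 48
fixed_bins <- cut(hours,
                  breaks = seq(-24, 48, by = 6),
                  include.lowest = TRUE,
                  ordered_result = TRUE)

nlevels(auto_bins)
nlevels(fixed_bins)
levels(fixed_bins)[1:3]
```

Both versions give 12 bins here because the synthetic data touch -24 and 48 exactly. If the observed minimum or maximum differs between stages, the automatic edges shift with it, which matters when panels are compared side by side.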

Survival Analysis

km_fit <- survfit2(Surv(FOLLOWUP_DAYS, DEATH_FLAG) ~ MAX_STAGE_NEW_CONS, data = akis_all_wide)

mimic_survival_cons_figure <- km_fit %>%
ggsurvfit(linewidth = 1) +
  scale_ggsurvfit(x_scales = list(breaks = seq(0, 30, by = 5))) +
  coord_cartesian(xlim = c(-1, 31)) +
  theme_classic() +
  labs(x="Days", y = "Survival", 
       color='Maximum KDIGO-UO stage', fill='Maximum KDIGO-UO stage') +
  scale_color_jama() +
  scale_fill_jama() +
  theme(legend.position = "bottom",
        plot.title = element_text(color = "#0099F8", size = 16, face = "bold"),
        plot.subtitle = element_text(size = 10, face = "bold")) +
  add_risktable() +
  add_pvalue(caption = "Log-rank {p.value}")

mimic_survival_cons_figure

Table of survival probabilities:

mimic_survival_table <- km_fit %>%
  tbl_survfit(times = c(7, 30, 90, 365),
              label = "Maximum KDIGO-UO stage",
              label_header = "**Day {time}**")
mimic_survival_table
Characteristic            Day 7             Day 30            Day 90            Day 365
Maximum KDIGO-UO stage
    0                     97% (96%, 97%)    93% (93%, 93%)    89% (89%, 89%)    83% (82%, 83%)
    1                     94% (94%, 94%)    88% (88%, 89%)    84% (83%, 84%)    77% (76%, 78%)
    2                     89% (88%, 89%)    80% (79%, 81%)    74% (73%, 75%)    66% (65%, 67%)
    3                     78% (76%, 80%)    63% (61%, 66%)    57% (55%, 60%)    49% (47%, 51%)
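
The probabilities in this table are Kaplan-Meier estimates: at each observed death time, the running survival is multiplied by the fraction of at-risk stays that survive that time. A minimal base-R sketch of that product on synthetic follow-up data (the `time` and `event` vectors are made up and unrelated to the cohort):

```r
# Synthetic follow-up (days) and death indicator (1 = died, 0 = censored)
time  <- c(2, 5, 5, 8, 12, 20, 30, 30)
event <- c(1, 1, 0, 1, 0, 1, 0, 0)

# Distinct times at which at least one death occurred
event_times <- sort(unique(time[event == 1]))

# Number still under observation just before each death time
at_risk <- sapply(event_times, function(t) sum(time >= t))

# Deaths occurring exactly at each death time
deaths <- sapply(event_times, function(t) sum(time == t & event == 1))

# Kaplan-Meier estimate: cumulative product of conditional survival
surv <- cumprod(1 - deaths / at_risk)

data.frame(day = event_times, at_risk, deaths, survival = surv)
```

Censored stays leave the risk set without counting as deaths, which is why the denominator drops from 7 to 5 between day 5 and day 8 in this toy example.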

Log-rank tests for 30-day mortality, first across all stages and then for each pair of adjacent stages:

survdiff(Surv(FOLLOWUP_DAYS, mortality_30) ~ MAX_STAGE_NEW_CONS, data = akis_all_wide)
## Call:
## survdiff(formula = Surv(FOLLOWUP_DAYS, mortality_30) ~ MAX_STAGE_NEW_CONS, 
##     data = akis_all_wide)
## 
## n=46344, 2189 observations deleted due to missingness.
## 
##                          N Observed Expected (O-E)^2/E (O-E)^2/V
## MAX_STAGE_NEW_CONS=0 23972     1715     3005    553.64   1190.38
## MAX_STAGE_NEW_CONS=1 11262     1336     1375      1.08      1.44
## MAX_STAGE_NEW_CONS=2  8991     1826     1049    575.26    711.31
## MAX_STAGE_NEW_CONS=3  2119      775      224   1360.29   1427.01
## 
##  Chisq= 2510  on 3 degrees of freedom, p= <2e-16
akis_all_wide_non_01 <- akis_all_wide %>%
  filter(MAX_STAGE_NEW_CONS == 0 | MAX_STAGE_NEW_CONS == 1)
survdiff(Surv(FOLLOWUP_DAYS, mortality_30) ~ MAX_STAGE_NEW_CONS, data = akis_all_wide_non_01)
## Call:
## survdiff(formula = Surv(FOLLOWUP_DAYS, mortality_30) ~ MAX_STAGE_NEW_CONS, 
##     data = akis_all_wide_non_01)
## 
##                          N Observed Expected (O-E)^2/E (O-E)^2/V
## MAX_STAGE_NEW_CONS=0 23972     1715     2094      68.5       219
## MAX_STAGE_NEW_CONS=1 11262     1336      957     149.7       219
## 
##  Chisq= 219  on 1 degrees of freedom, p= <2e-16
akis_all_wide_non_12 <- akis_all_wide %>%
  filter(MAX_STAGE_NEW_CONS == 1 | MAX_STAGE_NEW_CONS == 2)
survdiff(Surv(FOLLOWUP_DAYS, mortality_30) ~ MAX_STAGE_NEW_CONS, data = akis_all_wide_non_12)
## Call:
## survdiff(formula = Surv(FOLLOWUP_DAYS, mortality_30) ~ MAX_STAGE_NEW_CONS, 
##     data = akis_all_wide_non_12)
## 
##                          N Observed Expected (O-E)^2/E (O-E)^2/V
## MAX_STAGE_NEW_CONS=1 11262     1336     1793       116       271
## MAX_STAGE_NEW_CONS=2  8991     1826     1369       153       271
## 
##  Chisq= 272  on 1 degrees of freedom, p= <2e-16
akis_all_wide_non_23 <- akis_all_wide %>%
  filter(MAX_STAGE_NEW_CONS == 2 | MAX_STAGE_NEW_CONS == 3)
survdiff(Surv(FOLLOWUP_DAYS, mortality_30) ~ MAX_STAGE_NEW_CONS, data = akis_all_wide_non_23)
## Call:
## survdiff(formula = Surv(FOLLOWUP_DAYS, mortality_30) ~ MAX_STAGE_NEW_CONS, 
##     data = akis_all_wide_non_23)
## 
##                         N Observed Expected (O-E)^2/E (O-E)^2/V
## MAX_STAGE_NEW_CONS=2 8991     1826     2144      47.1       272
## MAX_STAGE_NEW_CONS=3 2119      775      457     220.9       272
## 
##  Chisq= 272  on 1 degrees of freedom, p= <2e-16
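
The three pairwise comparisons above repeat the same filter-then-`survdiff()` pattern. A minimal sketch of that pattern as a loop, run on synthetic data (the `toy` data frame, its hazard rates and the 30-day censoring are assumptions made for illustration); it relies only on `survival::survdiff()` and its `chisq` component.

```r
library(survival)

set.seed(42)
n <- 300
# Synthetic cohort: three stages, higher stage -> higher hazard (assumed)
toy <- data.frame(stage = rep(0:2, each = n / 3))
raw_time  <- rexp(n, rate = 0.02 * (1 + toy$stage))
toy$event <- as.integer(raw_time <= 30)   # death observed within 30 days
toy$time  <- pmin(raw_time, 30)           # administrative censoring at day 30

# Adjacent stage pairs to compare
adjacent_pairs <- list(c(0, 1), c(1, 2))

pairwise <- do.call(rbind, lapply(adjacent_pairs, function(pair) {
  subset_data <- toy[toy$stage %in% pair, ]
  fit <- survdiff(Surv(time, event) ~ stage, data = subset_data)
  data.frame(
    comparison = paste(pair, collapse = " vs "),
    chisq      = unname(fit$chisq),
    p_value    = pchisq(fit$chisq, df = 1, lower.tail = FALSE)
  )
}))

pairwise
```

If several pairs are tested, `p.adjust(pairwise$p_value, method = "holm")` is one way to account for multiple comparisons; the report itself shows unadjusted p-values.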
mimic_t1a <- t1a
mimic_t1b <- t1b
mimic_table_1 <- table_1
mimic_table_1_akis <- table1_akis
mimic_uo_rate <- uo_rate
mimic_aki_epi <- akis_all_long %>%
  filter(group == "newcons")
mimic_akis_all_wide <- akis_all_wide
mimic_table_1_staging <- table_1_staging
mimic_rr_table_data <- rr_table_data
mimic_rr_table_data_adj <- rr_table_data %>% 
  left_join(bind_rows(rr_table_data_m1_adj1, rr_table_data_m2_adj1, rr_table_data_m3_adj1))

save(mimic_t1a,
     mimic_t1b,
     mimic_table_1,
     mimic_table_1_akis,
     mimic_uo_rate,
     mimic_uo_cons_figure,
     mimic_survival_cons_figure,
     mimic_survival_table,
     mimic_aki_epi,
     mimic_akis_all_wide,
     mimic_table_1_staging,
     mimic_rr_table_data,
     mimic_rr_table_data_adj,
     file = "paper_mimic.Rda")
mimic_uo_rate <- uo_rate
mimic_hourly_uo <- hourly_uo
mimic_raw_uo_eligible <- raw_uo_eligible

save(
  all_rows_count,
  distinct_time_item_patient_rows_count,
  S2_a,
  S3a,
  S4_a,
  S4_b,
  S4_c,
  S4_d,
  S4_e,
  S4_f,
  S6_a,
  S6_b,
  # S7_a,
  # S7_b,
  S7_c,
  S7_d,
  # S8_a,
  # S8_b,
  S8_c,
  S8_d,
  S8_e,
  # S8_f,
  S9_a,
  S9_b,
  S9_c,
  S9_d,
  S11,
  # mimic_uo_rate,
  # mimic_hourly_uo,
  # mimic_raw_uo_eligible,
  mimic_exclusion_threshold,
  mimic_kdigo_inter_aki_table,
  mimic_kdigo_inter_cons_mean,
  mimic_kdigo_inter_cons_old,
  mimic_kdigo_inter_mean_old,
  mimic_kdigo_inter_bic,
  mimic_kdigo_inter_survival_table,
  mimic_kdigo_inter_survival_table_adj,
  mimic_kdigo_inter_survival_figure,
  mimic_Sage_a,
  mimic_Sage_b,
  mimic_Sweight_a,
  mimic_Sweight_b,
  file = "s_data.Rda"
)

Technical Details

R Session Info:

sessionInfo()
## R version 4.4.1 (2024-06-14)
## Platform: aarch64-apple-darwin20
## Running under: macOS Sonoma 14.5
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRblas.0.dylib 
## LAPACK: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.0
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## time zone: Asia/Jerusalem
## tzcode source: internal
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] marginaleffects_0.21.0 rms_6.8-1              Hmisc_5.1-3           
##  [4] gt_0.11.0              ggsurvfit_1.1.0        ggsci_3.2.0           
##  [7] gtsummary_2.0.0        nortest_1.0-4          survminer_0.4.9       
## [10] ggpubr_0.6.0           survival_3.7-0         rmdformats_1.0.4      
## [13] kableExtra_1.4.0       broom_1.0.6            quantreg_5.98         
## [16] SparseM_1.84-2         rlang_1.1.4            ggforce_0.4.2         
## [19] ggpmisc_0.6.0          ggpp_0.5.8-1           scales_1.3.0          
## [22] ggbreak_0.1.2          psych_2.4.6.26         finalfit_1.0.8        
## [25] reshape2_1.4.4         lubridate_1.9.3        forcats_1.0.0         
## [28] stringr_1.5.1          dplyr_1.1.4            purrr_1.0.2           
## [31] readr_2.1.5            tidyr_1.3.1            tibble_3.2.1          
## [34] ggplot2_3.5.1          tidyverse_2.0.0        bigrquery_1.5.1       
## [37] DBI_1.2.3              pacman_0.5.1          
## 
## loaded via a namespace (and not attached):
##   [1] splines_4.4.1        polspline_1.1.25     ggplotify_0.1.2     
##   [4] polyclip_1.10-7      rpart_4.1.23         lifecycle_1.0.4     
##   [7] rstatix_0.7.2        lattice_0.22-6       MASS_7.3-61         
##  [10] insight_0.20.2       backports_1.5.0      magrittr_2.0.3      
##  [13] sass_0.4.9           rmarkdown_2.27       jquerylib_0.1.4     
##  [16] yaml_2.3.10          askpass_1.2.0        cowplot_1.1.3       
##  [19] RColorBrewer_1.1-3   minqa_1.2.7          multcomp_1.4-26     
##  [22] abind_1.4-5          clock_0.7.1          yulab.utils_0.1.5   
##  [25] nnet_7.3-19          TH.data_1.1-2        tweenr_2.0.3        
##  [28] rappdirs_0.3.3       sandwich_3.1-0       labelled_2.13.0     
##  [31] KMsurv_0.1-5         cards_0.2.0          MatrixModels_0.5-3  
##  [34] cardx_0.2.0          svglite_2.1.3        commonmark_1.9.1    
##  [37] codetools_0.2-20     xml2_1.3.6           tidyselect_1.2.1    
##  [40] shape_1.4.6.1        aplot_0.2.3          farver_2.1.2        
##  [43] lme4_1.1-35.5        base64enc_0.1-3      broom.helpers_1.15.0
##  [46] jsonlite_1.8.8       mitml_0.4-5          Formula_1.2-5       
##  [49] iterators_1.0.14     systemfonts_1.1.0    foreach_1.5.2       
##  [52] tools_4.4.1          Rcpp_1.0.13          glue_1.7.0          
##  [55] mnormt_2.1.1         gridExtra_2.3        pan_1.9             
##  [58] mgcv_1.9-1           xfun_0.46            withr_3.0.0         
##  [61] fastmap_1.2.0        boot_1.3-30          fansi_1.0.6         
##  [64] openssl_2.2.0        digest_0.6.36        timechange_0.3.0    
##  [67] R6_2.5.1             gridGraphics_0.5-1   mice_3.16.0         
##  [70] colorspace_2.1-1     markdown_1.13        utf8_1.2.4          
##  [73] generics_0.1.3       data.table_1.15.4    httr_1.4.7          
##  [76] htmlwidgets_1.6.4    pkgconfig_2.0.3      gtable_0.3.5        
##  [79] survMisc_0.5.6       brio_1.1.5           htmltools_0.5.8.1   
##  [82] carData_3.0-5        bookdown_0.40        png_0.1-8           
##  [85] ggfun_0.1.5          knitr_1.48           km.ci_0.5-6         
##  [88] rstudioapi_0.16.0    tzdb_0.4.0           checkmate_2.3.1     
##  [91] nlme_3.1-165         curl_5.2.1           nloptr_2.1.1        
##  [94] cachem_1.1.0         zoo_1.8-12           parallel_4.4.1      
##  [97] foreign_0.8-87       pillar_1.9.0         grid_4.4.1          
## [100] vctrs_0.6.5          car_3.1-2            jomo_2.7-6          
## [103] xtable_1.8-4         cluster_2.1.6        htmlTable_2.4.3     
## [106] evaluate_0.24.0      mvtnorm_1.2-5        cli_3.6.3           
## [109] compiler_4.4.1       ggsignif_0.6.4       labeling_0.4.3      
## [112] plyr_1.8.9           fs_1.6.4             stringi_1.8.4       
## [115] viridisLite_0.4.2    munsell_0.5.1        glmnet_4.1-8        
## [118] Matrix_1.7-0         hms_1.1.3            patchwork_1.2.0     
## [121] bit64_4.0.5          haven_2.5.4          highr_0.11          
## [124] gargle_1.5.2         memoise_2.0.1        bslib_0.7.0         
## [127] bit_4.0.5            polynom_1.4-1