HRD Review - Biomarker Heatmap
github.com/insilica/bawto...fig3-heatmap.R
In the Homologous Recombination Deficiency (HRD) Biomarker review, biomarkers were extracted from prostate cancer clinical trials. sysrev.com/p/81395 provides an open access repository of every review decision. This vignette shows how to build the heatmap generated for the publication.
Get the data
get_answers(81395)
gets raw data from project 81395, list_answers
organizes that into a list of tables.
tbls <- rsr::get_answers(81395) |> rsr::list_answers()
#> list(basic=<tbl>,biomarker=<tbl>,...)
The basic table links articles (eg. aid
1781738) to basic label data (logical, string, category). In this project, a basic label describes article trial identifiers (eg NCT01682772):
tbls$basic |>
mutate(nct=unlist(`NCT Trial ID`)) |>
select(aid,nct)
#> aid nct
#> 1 11781750 NCT01682772
#> 2 11781738 NCT02854436
The other named values in get_answer_list are group labels, which are themselves tables. Here, the biomarker table describes which trials evaluated which genes.
tbls$biomarker |>
mutate(biomarker.name=unlist(biomarker.name)) |>
select(aid, biomarker.name)
#> aid biomarker.name
#> 1 11781738 BRCA1
#> 2 11781738 BRCA2
Extracted tables can be joined by article aid
. Here, joined basic/biomarker tables link trials and biomarkers. ggplot2::geom_tile
can then quickly visualize which trials use which biomarkers.
join.tb = tbls$basic |>
inner_join(tbls$biomarker,by="aid")
# ggplot(join.tb,
# aes(x=bmkr, y=study, fill=elig)) +
# geom_tile() + …
Build the heatmap
A bit more processing of the above tables and ggplot can generate a heatmap. Unfold the code to see how it is done.
joint = tbls$basic |>
inner_join(tbls$biomarker,by="aid") |>
select(aid, study=short_name, bmkr=biomarker.name, eli=eligibility) |>
purrr::modify(~ unlist(.,recursive = T))
top10 = joint |> group_by(bmkr) |> summarize(s = n_distinct(study)) |>
slice_max(n=13,order_by=s,with_ties = F)
ptb = joint |>
inner_join(top10,by="bmkr") |>
mutate(bmkr = fct_rev(fct_infreq(bmkr))) |>
mutate(study = fct_infreq(study)) |>
complete(bmkr, study, fill=list(eli="none")) |>
mutate(eli = ifelse(eli=="sufficient","measured",eli)) |>
mutate(eli = factor(eli,levels=c("none","required_negative","measured"))) |>
mutate(eli = fct_recode(eli,
exclude ="required_negative",
`measure/include`="measured"))
ggplot(ptb,aes(x=study, y=bmkr, fill=eli)) +
geom_tile(col="white",size=0.5) +
scale_fill_manual(values=c("#161616","#4C9605","#CC2C11")) +
theme(text = element_text(size=12),
panel.background = element_blank(),
axis.text.x = element_text(angle=90,hjust=1),
legend.title = element_blank(),
legend.position = "top") +
ylab("") + xlab("")
This is 90% of the way there, visit github.com/insilica/bawto to learn the rest.