BioBricks & ToxRefDB
The ToxRefDB brick is a SQLite asset containing mammalian toxicity data that can aid chemical risk assessments. The data is pulled from the ToxRefDB Clowder Archive.
To use the ToxRefDB brick, install biobricks and install the brick:
pip install biobricks
biobricks configure # follow the prompts
biobricks install toxrefdb # installs ~400 MB toxrefdb.sqlite
The brick is installed under the brickpath you set during the configuration step. It has a single asset:
biobricks assets toxrefdb
# toxrefdb_sqlite: [brickpath]/brick/toxrefdb.sqlite
To load the database in R:
library(biobricks)
library(RSQLite)
toxref_assets <- biobricks::bbassets("toxrefdb")
toxref <- dbConnect(RSQLite::SQLite(), toxref_assets$toxrefdb_sqlite)
RSQLite::dbListTables(toxref)
# [1] "pod" "chemical" "study" "guideline" "endpoint" ...
There are 26 tables in this database, many of which have useful information. For this post we’ll focus on the pod, or “point of departure”, table and some related tables.
In toxicology, point of departure metrics describe a dose at which a chemical begins to have some observable or measurable effect. The NOAEL, or No Observed Adverse Effect Level, is the highest tested dose at which no adverse effect was observed.
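To make the definition concrete, here is a minimal sketch in base R that derives a NOAEL from the dose groups of a single study. The dose values and the `adverse_effect` flags are made up for illustration and are not taken from ToxRefDB; the final line applies the same "NOAEL below the maximum tested dose" rule this post uses on the pod table.

```r
# Hypothetical dose groups (mg/kg/day) from one made-up study, with a flag
# for whether an adverse effect was observed in each group.
doses <- data.frame(
  dose_level     = c(0, 10, 50, 250),
  adverse_effect = c(FALSE, FALSE, TRUE, TRUE)
)

# LOAEL: the lowest tested dose at which an adverse effect was observed.
loael <- min(doses$dose_level[doses$adverse_effect])

# NOAEL: the highest tested dose below the LOAEL.
noael <- max(doses$dose_level[doses$dose_level < loael])

# Highest dose tested in the study.
max_dose_level <- max(doses$dose_level)

# The chemical counts as having an adverse effect in this study
# when its NOAEL sits below the maximum tested dose.
adverse <- noael < max_dose_level
```

If no dose group showed an adverse effect, the NOAEL would simply be the maximum tested dose, and `adverse` would be `FALSE` under this rule.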
The point of departure table records 58,447 NOAELs for 735 chemicals from 3,633 studies. We can collect all that information into a single table:
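As a rough sketch of how counts like these could be checked, the query below tallies NOAEL rows, distinct chemicals, and distinct studies. It runs against a tiny made-up in-memory table rather than the real brick; only the column names `pod_type`, `chemical_id`, and `study_id` come from the pod table used in this post.

```r
library(DBI)
library(RSQLite)

# Toy stand-in for the pod table (made-up rows, not ToxRefDB data).
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "pod", data.frame(
  pod_type    = c("noael", "noael", "loael", "noael"),
  chemical_id = c(1, 1, 1, 2),
  study_id    = c(10, 11, 11, 12)
))

# Count NOAEL records and the distinct chemicals and studies behind them.
counts <- dbGetQuery(con, "
  SELECT COUNT(*)                    AS n_noael,
         COUNT(DISTINCT chemical_id) AS n_chemicals,
         COUNT(DISTINCT study_id)    AS n_studies
  FROM pod
  WHERE pod_type = 'noael'
")

dbDisconnect(con)
```

Pointing the same query at the `toxref` connection opened earlier should give the corresponding totals for whichever version of the brick is installed.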
library(tidyverse)
# get the basic noael data
# a chemical has an adverse effect when the noael `dose_level` < `max_dose_level`
adverse_pod <- tbl(toxref, "pod") |>
  filter(pod_type == "noael") |>
  mutate(adverse = dose_level < max_dose_level) |>
  select(chemical_id, study_id, adverse) |>
  collect()
# get the chemical name
chemical <- tbl(toxref, "chemical") |>
  rename(chemical_name = preferred_name) |>
  collect()
# get guideline information - each study has a single guideline
study_guideline <- tbl(toxref, "study") |>
  inner_join(tbl(toxref, "guideline"), by = "guideline_id") |>
  filter(!is.na(guideline_number)) |> # ignore rows w/out guideline
  select(study_id, guideline_number, guideline_name = name) |>
  collect()
# put it all together
pod <- adverse_pod |>
  inner_join(chemical, by = "chemical_id") |>
  inner_join(study_guideline, by = "study_id") |>
  select(chemical_name, guideline_name, adverse)
# chemical_name     guideline_name                         adverse
# Diquat dibromide  Prenatal Developmental Toxicity Study  1
# Fludioxonil       90-day Oral Toxicity in Rodents        1
# Difenoconazole    90-day Oral Toxicity in Nonrodents     0
# Clomazone         Reproduction and Fertility Effects     0
# Tepraloxydim      Chronic Toxicity                       1
We can do a simple count of the number of hazardous and non-hazardous compounds for each guideline. Here we define a compound as hazardous when its NOAEL is less than the maximum tested dose. In other words, ‘hazardous’ compounds are those with an adverse effect observed at some tested dose.
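One way such a count could be written is sketched below. To stay self-contained it rebuilds only the five sample rows printed above as `pod_sample`, a hypothetical stand-in for the full `pod` table, and the output column names `hazardous` and `non_hazardous` are likewise chosen here for illustration.

```r
library(dplyr)

# The five sample rows shown above, standing in for the full pod table.
pod_sample <- tibble(
  chemical_name = c("Diquat dibromide", "Fludioxonil", "Difenoconazole",
                    "Clomazone", "Tepraloxydim"),
  guideline_name = c("Prenatal Developmental Toxicity Study",
                     "90-day Oral Toxicity in Rodents",
                     "90-day Oral Toxicity in Nonrodents",
                     "Reproduction and Fertility Effects",
                     "Chronic Toxicity"),
  adverse = c(1, 1, 0, 0, 1)
)

# Per guideline, count rows flagged adverse (hazardous) and not flagged.
hazard_counts <- pod_sample |>
  group_by(guideline_name) |>
  summarise(
    hazardous     = sum(adverse == 1),
    non_hazardous = sum(adverse == 0)
  )
```

Note that this counts rows, so a chemical with several NOAEL records under one guideline would be counted more than once; deduplicating by chemical before summarising is one possible refinement.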