Predicting Subchronic and Chronic Animal Toxicity from In Vitro High Content Imaging Data Using PBTK and Machine Learning
Author Block: T. Antonijevic¹ and I. Shah². ¹ToxStrategies Inc., Katy, TX; ²US EPA, Durham, NC.
A major challenge in toxicity testing in the 21st century is to predict animal toxicity from in vitro studies. Here, we used physiologically based toxicokinetic (PBTK) modeling and machine learning (ML) to predict mouse, rat, and dog liver toxicity from high content imaging (HCI) data obtained by measuring HepG2 cell responses to 967 chemical treatments across 10 endpoints and 3 time points (1h, 24h, and 72h). First, the HCI data were normalized to generate z-scores for p53, c-Jun, H2A.X, PH3, α-tubulin, mitochondrial membrane potential, mitochondrial mass, cell cycle arrest, nuclear size, and cell number. Second, lowest-observed-adverse-effect level (LOAEL) values for chemicals from subchronic/chronic studies in mouse (75/154), rat (161/160), and dog (69/113) were obtained from the ToxRef database. Third, LOAEL values were converted to average venous concentrations using PBTK to match the in vitro treatment protocol. Fourth, each in vitro treatment was assigned a toxicity class: nontoxic if the venous concentration corresponding to the LOAEL was greater than the in vitro concentration, and toxic otherwise. Fifth, five ML algorithms (k-nearest neighbors (kNN), random forest (RF), support vector machine (SVM), decision tree (DT), and naïve Bayes (NB)) were used to evaluate the accuracy of toxicity predictions for each study type and species at each in vitro time point. Finally, we created balanced datasets by using B-splines to interpolate the HCI concentration-response data at untested concentrations. For the imbalanced data, the mean area under the receiver operating characteristic curve (AUC) was 0.7 (SD 0.01) for chronic and 0.72 (SD 0.01) for subchronic toxicity. Predictive performance was higher for the balanced datasets, with mean AUCs of 0.73 (SD 0.01) and 0.79 (SD 0.01) for chronic and subchronic toxicity, respectively.
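The class-assignment rule in the fourth step can be illustrated with a minimal Python sketch. It assumes the PBTK-derived venous concentration at the LOAEL is already available for each chemical. The names `Treatment` and `toxicity_class`, the chemical identifier, the unit (µM), and all numeric values are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class Treatment:
    chemical: str
    in_vitro_conc_um: float  # tested HepG2 concentration (unit assumed: µM)


def toxicity_class(loael_venous_conc_um: float, in_vitro_conc_um: float) -> str:
    """Apply the rule stated in the abstract: a treatment is 'nontoxic' if the
    venous concentration corresponding to the LOAEL is greater than the
    in vitro concentration, and 'toxic' otherwise."""
    if loael_venous_conc_um > in_vitro_conc_um:
        return "nontoxic"
    return "toxic"


# Hypothetical PBTK output: average venous concentration at the LOAEL.
venous_conc_at_loael = {"chem_A": 10.0}

# Hypothetical concentration series for one chemical.
treatments = [
    Treatment("chem_A", 1.0),
    Treatment("chem_A", 10.0),
    Treatment("chem_A", 100.0),
]

labels = [
    toxicity_class(venous_conc_at_loael[t.chemical], t.in_vitro_conc_um)
    for t in treatments
]
print(labels)
```

Because the label depends on where the tested concentrations fall relative to a single threshold per chemical, the two classes need not be equally represented, which is consistent with the abstract's distinction between imbalanced and balanced datasets.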
RF was the best-performing algorithm for predicting subchronic liver toxicity, with AUCs of 0.96 (SD 0.005) for mouse, 0.9 (SD 0.004) for rat, and 0.84 (SD 0.01) for dog. For chronic studies, the most accurate classifiers were RF for mouse (AUC 0.87, SD 0.01) and kNN for rat (AUC 0.82, SD 0.005) and dog (AUC 0.79, SD 0.01). The best predictions of mouse liver toxicity were obtained from the 1h HepG2 data for subchronic studies and the 24h data for chronic studies, whereas for rat and dog the highest scores were obtained at 24h for subchronic and 1h for chronic studies. Our findings suggest the utility of a new approach for linking in vitro data to in vivo outcomes using machine learning. This abstract does not reflect EPA policy.
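For readers unfamiliar with the reported metric, the following minimal Python sketch computes an AUC from class labels and classifier scores, using the pairwise (Mann-Whitney) formulation. It is an illustration only, not the authors' evaluation pipeline. The function name `roc_auc` and the example labels and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (toxic, nontoxic) pairs in which the toxic
    sample receives the higher score; tied scores count as half a win.

    labels: 1 for toxic, 0 for nontoxic.
    scores: classifier scores, higher meaning more likely toxic.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical held-out labels and classifier scores.
y_true = [1, 1, 0, 0, 1, 0]
y_score = [0.9, 0.6, 0.7, 0.2, 0.8, 0.1]

auc = roc_auc(y_true, y_score)
print(round(auc, 3))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of toxic from nontoxic treatments.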