This open source dataset contains over 290K structure-activity relationship (SAR) data points directed at COVID-19 protein, virus, and disease targets, as reported in the literature, patents, and other public sources, for over 9K of the substances included in the CAS COVID-19 Antiviral Candidate Compounds Dataset.
As a specialist in scientific information solutions, CAS is partnering with research organizations around the globe to tackle the complex and rapidly evolving challenge of COVID-19. Aligned with our mission as a division of the American Chemical Society, CAS is making a wide range of assets, expertise, and resources available to support this fight.
As part of this effort, CAS has built an open source dataset of structure-activity relationship (SAR) data directed at COVID-19 protein, virus and disease targets reported in the literature, patents, and other public sources. This dataset is a companion to the CAS COVID-19 Antiviral Candidate Compounds Dataset. The dataset license terms support use for applications including research, data mining, machine learning, and analytics at no charge.
This dataset and its readme file can be downloaded by filling in the form below.
About the Dataset
The dataset is provided as a comma-separated values (.csv) file and contains over 290,000 structure-bioactivity data points, including IC50, EC50, and additional potency metrics. CSV files can be imported directly into spreadsheet tools such as Microsoft Excel. Data are organized under the column headings cas_rn, protein, therapeutic use, activity type, activity value, activity unit, and source. For work in the Python programming language, the file can be loaded into a pandas DataFrame and then passed to free machine learning software such as scikit-learn.
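As a rough illustration of the pandas route, the sketch below reads a CSV with the column headings listed above and filters it to one activity type and unit. It makes several assumptions: the exact header spelling in the real file (spaces versus underscores) may differ, the helper names `load_sar` and `select_activity` are hypothetical, and the sample rows are made-up placeholders rather than values from the dataset. Consult the readme file for the authoritative column definitions.

```python
import io

import pandas as pd

# Placeholder rows for illustration only; these are NOT real dataset values.
# The header spelling is assumed from the column list in the description.
SAMPLE = """cas_rn,protein,therapeutic use,activity type,activity value,activity unit,source
PLACEHOLDER-1,Example protein A,Antiviral,IC50,12.5,nM,example source
PLACEHOLDER-2,Example protein A,Antiviral,IC50,300,nM,example source
PLACEHOLDER-3,Example protein B,Antiviral,EC50,1.2,uM,example source
PLACEHOLDER-4,Example protein B,Antiviral,IC50,not reported,nM,example source
"""


def load_sar(csv_source):
    """Read the SAR CSV (path or file-like object) into a DataFrame."""
    # Read everything as text first so identifiers are not reinterpreted.
    df = pd.read_csv(csv_source, dtype=str)
    # Convert activity values to numbers; unparseable entries become NaN.
    df["activity value"] = pd.to_numeric(df["activity value"], errors="coerce")
    return df


def select_activity(df, activity_type, unit):
    """Keep rows of one activity type and unit that have a numeric value."""
    mask = (df["activity type"] == activity_type) & (df["activity unit"] == unit)
    return df.loc[mask].dropna(subset=["activity value"])


# With the downloaded file, pass its path instead of the in-memory sample.
df = load_sar(io.StringIO(SAMPLE))
ic50_nm = select_activity(df, "IC50", "nM")
print(ic50_nm[["cas_rn", "protein", "activity value"]])
```

Filtering to a single activity type and unit before any modeling keeps the numeric column comparable across rows. The resulting DataFrame can then be joined to structure descriptors from the companion compounds dataset, if desired, before being handed to a machine learning library.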
Contact the CAS Customer Center with any questions or if you would like to request additional data or services to support research or analysis related to COVID-19.
Additional CAS COVID-19 Resources
- CAS COVID-19 Resource Hub
- CAS COVID-19 Antiviral Candidate Compounds Dataset
- Research and Development on Therapeutic Agents and Vaccines for COVID-19 and Related Human Coronavirus Diseases (review article in ACS Central Science)