Big Data in Predictive Toxicology
eBook - ePub

Daniel Neagu, Andrea-Nicole Richarz

  1. 394 pages
  2. English
  3. ePUB (mobile friendly)
  4. Available on iOS and Android
About This Book

The rate at which toxicological data are generated is continually increasing, and the volume of data produced is growing dramatically. This is due in part to advances in software solutions and cheminformatics approaches, which increase the availability of open data from chemical, biological, toxicological and high throughput screening resources. However, the amplified pace and capacity of data generation achieved by these novel techniques present challenges for organising and analysing the data output.

Big Data in Predictive Toxicology discusses these challenges as well as the opportunities of new techniques encountered in data science. It addresses the nature of toxicological big data, their storage, analysis and interpretation. It also details how these data can be applied in toxicity prediction, modelling and risk assessment.

This title is of particular relevance to researchers and postgraduates working and studying in the fields of computational methods, applied and physical chemistry, cheminformatics, biological sciences, predictive toxicology and safety and hazard assessment.


Information

Year
2019
ISBN
9781839160820
Edition
1
Subtopic
Toxicology
CHAPTER 1
Big Data in Predictive Toxicology: Challenges, Opportunities and Perspectives
Andrea-Nicole Richarz*
European Commission, Joint Research Centre (JRC), Ispra, Italy,
*E-mail: [email protected]

Predictive toxicology and model development rely heavily on data to draw upon and have historically suffered from a paucity of available, good quality datasets. The situation has now changed dramatically, from a lack of data hampering model development to "data overload". With high throughput/content screening methodologies being used systematically to understand the mechanistic basis of adverse effects, and with increasing use of omics technologies and consideration of (bio)monitoring data, the volume of data is continuously increasing. Big data in predictive toxicology may not yet have reached the dimensions of other areas, such as real-time generated data in the health sector, but they encompass similar characteristics and related challenges. Pertinent questions are whether this new plethora of data is adequate for use in predictive toxicology and whether it addresses the area's most urgent problems. This overview chapter looks at the definition and characteristics of big data in the context of predictive toxicology, as well as the challenges and opportunities big data present in this field.

1.1 Introduction

Predictive toxicology and model development rely heavily on data to draw upon and have historically suffered from a paucity of available, good quality datasets. As recently as 20 years ago, obtaining even relatively limited datasets of chemical structures and experimental bioactivities involved the laborious manual collection, curation and compilation of data. Data were produced manually, and thus slowly, by measuring physico-chemical properties and activities experimentally in the laboratory. As a result, models were repeatedly built on the same training sets, and data were available for only a limited number of properties and endpoints.
With advances in laboratory automation and the paradigm change towards 21st Century Toxicology, focussing on adverse effect pathways and elucidating mechanisms of action, the availability of data has changed and, overall, improved. For example, initiatives such as ToxCast and Tox21 have produced publicly accessible bioactivity data for a large number of chemicals across many endpoints using high throughput/high content assays. Furthermore, omics technologies for screening changes in genomes, proteomes, metabolomes etc. have generated large amounts of data, and biomonitoring and epidemiological data are increasingly available.
Thus, in a short time period, the situation for predictive toxicology has changed dramatically from a lack of data hampering model development to one of "data overload". It can be argued that big data in predictive toxicology have not yet reached the dimensions of other sectors, such as the large-scale real-world data generated in the health sector.1,2 However, these data encompass similar characteristics and related challenges, and represent big data for this specific field.
This introductory chapter looks at the definition of big data in the context of predictive toxicology and the challenges and opportunities big data present in this field.

1.2 Big Data in the Area of Predictive Toxicology

Big data are a "buzzword" in many areas of science and technology and have without doubt changed everyday life in many areas of society, for example through the "internet of things", by offering new opportunities to integrate and use information.
Given the ubiquity of the term, what are "big data" in relation to predictive toxicology? Compared to other fields, the data taken into account for predictive toxicology are not yet generated in real time for instant analysis. In the closely related health care sector, for example, round-the-clock data produced by monitoring patients are already a reality. Many initiatives in the areas of drug discovery, medicine and public health address the leverage of big data. For example, the European Union Innovative Medicines Initiative (IMI) programme Big Data for Better Outcomes (BD4BO)3 supports the use of big data to improve health care by developing platforms for integrating and analysing diverse big data sets. In the area of drug discovery, possibilities to accelerate medicinal chemistry by using big data and artificial intelligence are being investigated.4 The European Lead Factory,5 a collaborative public–private partnership, has established the first European Compound Library and the first European Screening Centre, comprising up to 500 000 novel compounds. The EU Integrated Training Network BIGCHEM6 project provides education in large chemical data analysis for drug discovery, to enable the growing amount of biomedical data in chemistry and the life sciences to be processed and used.7 The EU ExCAPE project8 applies the power of supercomputers and machine learning to accelerate drug discovery.
Compared to the use of big data in other fields (e.g., finance and economics), the data in toxicology are still relatively small scale, but with regard to what went before, they have broken the mould and can be considered big data. Typical examples of big data in the context of toxicology are high throughput/high content screening data and data generated with omics technologies, gene arrays etc. Beyond "absolute" or "static" toxicity data and dose–response curves, the resolution of responses over time is becoming a major area of investigation, including toxicokinetic measurements, or models, that take into account, for example, the rate of absorption or metabolism. As a consequence, more data are being measured, particularly in the area of high throughput kinetics. Furthermore, as discussed in the 2017 US National Academies of Sciences, Engineering, and Medicine (NAS) report "Using 21st Century Science to Improve Risk-Related Evaluations",9 epidemiological data will play an important role in chemical risk assessment in the future. Monitoring data, whether environmental monitoring or human biomonitoring, will also contribute to risk assessment and predictive toxicology, and form an important part of the growing big data in the area.
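The dose–response and toxicokinetic measurements mentioned above are typically reduced to curve-level summaries such as an AC50. As a minimal illustrative sketch (not taken from the chapter; the model choice, data and parameter values are synthetic assumptions), the following fits a four-parameter Hill model to a simulated concentration–response series with SciPy:

```python
# Illustrative sketch only: the data below are simulated, not real
# screening results; parameter values are assumptions for demonstration.
import numpy as np
from scipy.optimize import curve_fit

def hill(conc, bottom, top, ac50, slope):
    """Four-parameter Hill concentration-response model."""
    return bottom + (top - bottom) / (1.0 + (ac50 / conc) ** slope)

# Simulated responses for one hypothetical chemical (true AC50 = 1.0 uM)
conc = np.logspace(-3, 2, 12)          # 1 nM to 100 uM
rng = np.random.default_rng(42)
resp = hill(conc, 0.0, 100.0, 1.0, 1.2) + rng.normal(0.0, 2.0, conc.size)

# Fit the curve; bounds keep the optimiser in a plausible region
popt, _ = curve_fit(
    hill, conc, resp,
    p0=[0.0, 100.0, 1.0, 1.0],
    bounds=([-20.0, 50.0, 1e-4, 0.1], [20.0, 150.0, 1e3, 10.0]),
)
bottom, top, ac50, slope = popt
print(f"estimated AC50 ~ {ac50:.2f} uM")
```

Automating such fits across thousands of chemicals and assays is exactly where the volume and velocity of screening data become a processing challenge.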
Similar challenges in terms of compiling, storing, processing, analysing and integrating these big data apply as for other areas and types of big data. Their quality, adequacy and appropriateness for specific purposes need to be investigated. They do, however, offer opportunities to tackle problems which were previously out of reach, and make completely new applications possible, to support toxicology evaluation and chemical safety assessment.

1.3 The Big Vs of Predictive Toxicology Data

The characteristic attributes of big data, the "Big Vs", have been discussed frequently in the literature,10 starting from the three major V attributes that define big data: Volume, Velocity and Variety. Currently, seven to ten or more Vs are described for big data,11,14 pointing to further attributes and issues that need to be taken into account when dealing with big data; these vary slightly according to situation and use. The additional Vs include Veracity, Variability, Validity, Visibility, Visualisation, Volatility, Vulnerability and Value (see Table 1.1). This section addresses how these characteristics apply to the field of (predictive) toxicology; the 11th V, Vulnerability, is not considered in this context.
Table 1.1 Ten Vs of big data and their applicability in predictive toxicology; related opportunities and challenges

Volume
  Applicability: High number of data generated, e.g., high content screening (HCS), omics test read-outs, epidemiological and (bio)monitoring data.
  Opportunities: Broader data basis for modelling and elucidation of modes of action and pathways.
  Challenges: Storing and processing large amounts of data; finding the relevant data in the flood of information; limits of capacities for data curation.

Velocity
  Applicability: Speed of data generation increased, e.g., high throughput screening (HTS).
  Opportunities: More rapid generation of data to fill gaps; ability to generate time-dependent data.
  Challenges: Speed of data generation overtaking storage, analysis and processing capacities.

Variety
  Applicability: Many different types of data, e.g., chemical structures, results from a variety of assays, omics, time-dependent kinetics etc.
  Opportunities: Different types of information that can be combined to get the full picture.
  Challenges: Integration of different types of data; representation and informatics processing of chemical structures, especially if 3D transformations of structures have to be computed for many chemicals; comparability of data from varied sources might not always be given.

Veracity
  Applicability: Data quality, accuracy and reliability; uncertainties requiring data curation and evaluation of data quality.
  Opportunities: Large amounts of data might statistically compensate for inaccuracy in individual data when integrated.
  Challenges: Data curation and evaluation of data quality for large amounts of data.

Variability
  Applicability: Intrinsic variability of biological data, e.g., inter- and intra-individual, genetic and population variations.
  Opportunities: Availability of a large amount of data might enable variations of parameters to be taken into account in the models built, enabling better prediction of population variations.
  Challenges: Processing large amounts of variable data.

Validity
  Applicability: Validity of the data for a specific application, e.g., for prediction of toxicity (for a specific endpoint).
  Opportunities: Generation of many (types of) data possible in a targeted way, tailored to the specific prediction and toxicity assessment goal.
  Challenges: Finding/choosing the relevant data for the specific application.

Visibility
  Applicability: Data sharing; access to data sources leading to centralised databases and repositories.
  Opportunities: Large data sets are more visible than small disparate data sets; preferred storage in centralised repositories or linked via hubs/portals.
  Challenges: Making the data available and visible in an appropriate way.

Visualisation
  Applicability: Representation of the data content.
  Opportunities: Supports and facilitates making sense of varied and complex data sets, also supporting the organisation of the data.
  Challenges: Visualisation of complex data is difficult; methods to improve representations of the data content in a clear way are needed.

Volatility
  Applicability: Data access might cease, or repositories disappear, which affects the sustainability of data resources.
  Opportunities: N/A.
  Challenges: Appropriate storage and sustainability concept necessary.

Value
  Applicability: Adequacy and usefulness for predictive toxicology and hazard/risk assessment, depending on the specific risk assessment goal.
  Opportunities: Availability of many (types of) data as a broad data basis to understand mechanisms and pathways, in order to build informed predictive models.
  Challenges: Extracting and distilling knowledge from the large amount of data.
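Before several of the Vs in Table 1.1 (Volume, Variety, Veracity) can be tackled with standard modelling tools, heterogeneous screening read-outs typically have to be reshaped into a chemical × assay matrix. A minimal sketch, assuming a long-format export with hypothetical chemical and assay identifiers:

```python
# Illustrative sketch only: the chemical and assay names below are
# hypothetical, standing in for rows of a raw HTS export.
import pandas as pd

# Long format: one row per (chemical, assay) measurement
long_results = pd.DataFrame({
    "chemical": ["CHEM-1", "CHEM-1", "CHEM-2", "CHEM-2", "CHEM-3"],
    "assay":    ["A_ER",   "A_AR",   "A_ER",   "A_AR",   "A_ER"],
    "ac50_uM":  [1.2,      15.0,     0.4,      22.0,     8.8],
})

# Wide format: rows = chemicals, columns = assays. Combinations that
# were never run (CHEM-3 in A_AR) surface explicitly as NaN, so data
# gaps stay visible instead of being silently dropped.
matrix = long_results.pivot(index="chemical", columns="assay", values="ac50_uM")
print(matrix)
```

Once the data are in this shape, per-assay quality checks (Veracity) and missing-data handling (Volume, Validity) reduce to ordinary dataframe operations.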
Volume: one of the three major defining characteristics of big data is the amount of data generated. In the field of toxicology, the large datasets currently produced are high throughput/high content screening (HTS/HCS) assay data, high throughput kinetics, and omics technology read-outs; it is highly probable that the use of large epidemiological and (bio)monitoring data sets will increase in the future. Examples of large data sets and databases making these data available are ToxCast/Tox2115,18 for high throughput screening data – screening thousands of chemicals in about 70 assays covering over 125 processes in the organism, with over 120 million data points produced so far19 (see Chapter 8) – and ChEMBL20 from the European Bioinformatics Institute (EMB...
