Tutorials in Chemoinformatics
eBook - ePub

Alexandre Varnek

About the book

30 tutorials and more than 100 exercises in chemoinformatics, supported by online software and data sets

Chemoinformatics is widely used in both academic and industrial chemical and biochemical research worldwide. Yet, until this unique guide, there were no books offering practical exercises in chemoinformatics methods. Tutorials in Chemoinformatics contains more than 100 exercises in 30 tutorials exploring key topics and methods in the field. It takes an applied approach to the subject with a strong emphasis on problem-solving and computational methodologies.

Each tutorial is self-contained and contains exercises for students to work through using a variety of software packages. The majority of the tutorials are divided into three sections devoted to theoretical background, algorithm description and software applications, respectively, with the latter section providing step-by-step software instructions. Throughout, three types of software tools are used: in-house programs developed by the authors, open-source programs and commercial programs which are available for free or at a modest cost to academics. The in-house software and data sets are available on a dedicated companion website.

Key topics and methods covered in Tutorials in Chemoinformatics include:

  • Data curation and standardization
  • Development and use of chemical databases
  • Structure encoding by molecular descriptors, text strings and binary fingerprints
  • The design of diverse and focused libraries
  • Chemical data analysis and visualization
  • Structure-property/activity modeling (QSAR/QSPR)
  • Ensemble modeling approaches, including bagging, boosting, stacking and random subspaces
  • 3D pharmacophore modeling and pharmacological profiling using shape analysis
  • Protein-ligand docking
  • Implementation of algorithms in a high-level programming language

Tutorials in Chemoinformatics is an ideal supplementary text for advanced undergraduate and graduate courses in chemoinformatics, bioinformatics, computational chemistry, computational biology, medicinal chemistry and biochemistry. It is also a valuable working resource for medicinal chemists, academic researchers and industrial chemists looking to enhance their chemoinformatics skills.

Information

Publisher: Wiley
Year: 2017
ISBN: 9781119137986
Edition: 1

Part 1: Chemical Databases

1 Data Curation

Gilles Marcou and Alexandre Varnek
Goal: Identify and curate problematic chemical information from a data collection. The raw dataset is processed so that it will be ready to feed a relational database dedicated to the organoleptic properties of small organic molecules. Information is interpreted and re‐encoded as categories or bit vectors when relevant.
Software: KNIME 3.0, ChemAxon
Data: The following files are provided in the tutorial:
  • thegoodscent_dup.csv – The raw data, extracted from the web site of The Good Scents Company and formatted as a semicolon‐separated file. The data has been prepared, and the most visible errors and discrepancies are already corrected.
  • thegoodscent_dup.raw – The raw data without any processing related to the tutorial.
  • MissingOdorTypes.csv – Manually curated Odor Types provided for some difficult cases.
  • StructureCuration.csv – File containing the curation rules for some deficient SMILES of the input.
  • TutoDataCuration.zip – The final KNIME workflow. Unzip the archive in the KNIME workspace and it will appear in your LOCAL workflows.
  • Slurp.pl – A Perl script exploring the website of The Good Scents Company in search of some chemical information.
The Good Scents Company is an online shop providing cosmetic, flavor, and fragrance ingredients. It has provided information for the flavor, food, and fragrance industries since 1994 and has sold ingredients since 1980.
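As a rough illustration of the re‐encoding step named in the Goal, the following Python sketch parses semicolon‐separated records and turns a free‐text odor field into a bit vector. It is not the KNIME workflow used in the tutorial. The column names (`name`, `odor`), the sample rows, and the descriptor vocabulary are all invented for illustration and are not taken from thegoodscent_dup.csv.

```python
import csv
import io

# Hypothetical sample in a semicolon-separated layout; the columns and
# contents of the real tutorial file are not reproduced here.
SAMPLE = """name;odor
compound A;fruity, sweet
compound B;woody
compound C;sweet, floral, fruity
"""

# Assumed, fixed vocabulary of odor categories (one bit per category).
VOCABULARY = ["floral", "fruity", "sweet", "woody"]


def read_records(text):
    """Parse semicolon-separated text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text), delimiter=";"))


def odor_bits(odor_field, vocabulary=VOCABULARY):
    """Return a 0/1 list with one position per vocabulary term."""
    terms = {t.strip().lower() for t in odor_field.split(",")}
    return [1 if word in terms else 0 for word in vocabulary]


records = read_records(SAMPLE)
encoded = {row["name"]: odor_bits(row["odor"]) for row in records}
for name, bits in encoded.items():
    print(name, bits)
```

A fixed vocabulary keeps every vector the same length, so records can be compared position by position. Terms outside the vocabulary are silently ignored in this sketch; a real curation step would more likely log them for manual review.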

Theoretical Background

Chemical datasets can be collected from literature, compendiums, web sites, lab‐books, databases, and so on. Aggregation and automatic treatment of data represent additional sources of errors. Therefore, verification of quality and accuracy of chemical information is a crucial step of data valorization.[1]
The problem of the quality of publicly available chemical data can be illustrated by searching the Web for the chemical structure of the antibacterial compound Vancomycine, for which stereochemistry information is essential. One can suggest two possible queries using InChIKey notations:[2,3]
  • Query 1: “MYPYJXKWCTUITO” “Vancomycine”
  • Query 2: “MYPYJXKWCTUITO‐LYRMYLQWSA‐N” “Vancomycine”
Query 1 contains only the first block of the InChIKey of Vancomycine, which encodes elemental composition and atom connectivity, whereas Query 2 also includes detailed stereochemistry information.
A search on Google (29/01/2016) retrieved 82 and 71 entries for Queries 1 and 2, respectively. Entries found with Query 2 correspond to the correct chemical structure of Vancomycine, whereas all 11 additional entries retrieved with Query 1 refer to its different enantiomers (see the example in Scheme 1.1).
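The share of mismatched hits implied by these counts can be checked with a few lines of Python. This is only a back‐of‐the‐envelope sketch: the variable names are ours, and the counts are the Google results quoted above.

```python
# Hit counts quoted in the text (Google search, 29/01/2016).
hits_query_1 = 82  # connectivity block of the InChIKey only
hits_query_2 = 71  # full InChIKey, including stereochemistry

# Entries found only when stereochemistry is ignored.
extra_hits = hits_query_1 - hits_query_2

# Fraction of Query 1 hits that do not match the full key.
error_fraction = extra_hits / hits_query_1

print(f"{extra_hits} of {hits_query_1} entries ({error_fraction:.1%})")
# prints "11 of 82 entries (13.4%)"
```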
Scheme 1.1 Chemical structures of Vancomycine from PubChem. (a) PubChem CID 441141, InChIKey: MYPYJXKWCTUITO‐UTHKAUQRSA‐N. (b) PubChem CID 14969, InChIKey: MYPYJXKWCTUITO‐LYRMYLQWSA‐N. Note that Vancomycine corresponds to structure (b), whereas structure (a) is, in fact, its enantiomer.
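The relationship between the two records in Scheme 1.1 can be made explicit by splitting each InChIKey at its hyphens. The sketch below is plain Python with no chemistry toolkit, and the helper name `split_inchikey` is ours. It assumes the standard InChIKey layout: a 14‐character first block (connectivity), a 10‐character second block (which covers stereochemistry, among other layers), and a single final character.

```python
def split_inchikey(key):
    """Split a standard InChIKey into its three hyphen-separated blocks."""
    skeleton, details, protonation = key.split("-")
    if (len(skeleton), len(details), len(protonation)) != (14, 10, 1):
        raise ValueError(f"unexpected InChIKey layout: {key!r}")
    return skeleton, details, protonation


# InChIKeys quoted in Scheme 1.1 (typed here with plain ASCII hyphens).
key_a = "MYPYJXKWCTUITO-UTHKAUQRSA-N"  # PubChem CID 441141
key_b = "MYPYJXKWCTUITO-LYRMYLQWSA-N"  # PubChem CID 14969

skeleton_a, details_a, _ = split_inchikey(key_a)
skeleton_b, details_b, _ = split_inchikey(key_b)

same_connectivity = skeleton_a == skeleton_b
same_full_structure = key_a == key_b

print("same connectivity block:", same_connectivity)
print("identical full key:", same_full_structure)
```

Both keys share the first block and differ in the second, which is why a search restricted to the first block cannot tell the two structures apart.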
From this example, one can estimate that about 13% of the data (11 of the 82 entries) associate Vancomycine with the wrong chemical structure. Analysis of some 6800 publications in drug discovery[4] shows that the average error rate of reported chemical structures is about 8% and, it seems, nothing has changed so far. Numerous examples and alerts about data curation problems, espec...
