eBook - ePub

Python for Bioinformatics

Name: Python for Bioinformatics
Author: Sebastian Bassi

424 pages
English

About This Book

In today's data-driven biology, programming knowledge is essential for turning ideas into testable hypotheses. Based on the author's extensive experience, Python for Bioinformatics, Second Edition helps biologists get to grips with the basics of software development. Requiring no prior knowledge of programming-related concepts, the book focuses on the easy-to-use, yet powerful, Python computer language.

This new edition is updated throughout to Python 3 and is designed not just to help scientists master the basics, but to do more in less time and in a reproducible way. New developments added in this edition include NoSQL databases, the Anaconda Python distribution, graphical libraries like Bokeh, and the use of GitHub for collaborative development.

Information

Publisher: Chapman and Hall/CRC
Year: 2017
ISBN: 9781351976954
Edition: Second
Topic: Mathematics
Subtopic: Probability and statistics

I

Programming

CHAPTER 1 Introduction

1.1 Who Should Read this Book

1.1.1 What the Reader Should Already Know

1.2 Using this Book

1.2.1 Typographical Conventions

1.2.2 Python Versions

1.2.3 Code Style

1.2.4 Get the Most from This Book without Reading It All

1.2.5 Online Resources Related to This Book

1.3 Why Learn to Program?

1.4 Basic Programming Concepts

1.4.1 What Is a Program?

1.5 Why Python?

1.5.1 Main Features of Python

1.5.2 Comparing Python with Other Languages

Readability

Speed

1.5.3 How Is It Used?

1.5.4 Who Uses Python?

1.5.5 Flavors of Python

1.5.6 Special Python Distributions

1.6 Additional Resources

The most effective way to do it, is to do it.

Amelia Earhart

1.1 Who Should Read this Book

This book is for the life science researcher who wants to learn how to program. The reader may have previous exposure to computer programming, but this is not necessary to understand this book (although it surely helps).

This book is designed to be useful to several separate but related audiences: students, graduates, postdocs, and staff scientists, since all of them can benefit from knowing how to program.

Exposing students to programming at early stages in their careers helps to boost their creativity and logical thinking, and both skills can be applied in research. To ease the learning process for students, all subjects are introduced with minimal prerequisites. There are also questions at the end of each chapter, which can be used to self-assess how much you've learned. The answers are available to teachers in a separate guide.

Graduates and staff scientists with actual programming needs should find the book's many real-world examples and abundant reference material extremely valuable.

1.1.1 What the Reader Should Already Know

Since this book is called Python for Bioinformatics, it has been written with the following assumptions in mind:

• No programming knowledge is assumed, but the reader needs enough computer proficiency to use a text editor and handle basic tasks in the operating system (OS). Since Python is multi-platform, most instructions in this book apply to the most common operating systems (Windows, macOS, and Linux); when a command or procedure applies only to a specific OS, it will be clearly noted.

• The reader should be working (or at least planning to work) with bioinformatics tools. Experience with even small, manual tasks, such as using NCBI BLAST to identify a sequence, aligning proteins, searching for primers, or estimating a phylogenetic tree, will help in following the examples. The more familiar the reader is with bioinformatics, the better they will be able to apply the concepts learned in this book.
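Because some instructions depend on the operating system, it can help to know how Python itself reports the platform. The following is a minimal sketch, not taken from the book; the helper describe_os and the OS_LABELS table are hypothetical names introduced here for illustration.

```python
import platform

# Hypothetical lookup table: maps the names returned by platform.system()
# to the operating system names used in the text above.
OS_LABELS = {"Windows": "Windows", "Darwin": "macOS", "Linux": "Linux"}


def describe_os(system_name):
    """Return a reader-friendly OS label, or 'other' if unrecognized."""
    return OS_LABELS.get(system_name, "other")


# Report the operating system this script is running on.
print("Running on:", describe_os(platform.system()))
```

Note that macOS is reported by Python as "Darwin", which is why a small lookup table is used here.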

1.2 Using this Book

1.2.1 Typographical Conventions

There are some typographical conventions I have tried to use in a uniform way throughout the book. They should aid readability and were chosen to tell apart user-made names (or variables) from language keywords. This comes in handy when learning a new computer language.

Bold: Objects provided by Python and by third-party modules. With this notation it should be clear that round is part of the language and not a user-defined name. Bold is also used to highlight parts of the text. There is no way to confuse one bold usage with the other.

Mono-spaced font: User-declared variables, code, and filenames. For example: sequence = 'MRVLLVALALLALAASATS'.

Italics: In commands, italics denote a variable that can take different values. For example, in len(iterable), "iterable" can take different values. Used in text, italics mark a new word or concept. For example, "One such fundamental data structure is a dictionary."

The content of lines starting with $ (dollar sign) is meant to be typed in your operating system console (also called command prompt in Windows or terminal in macOS).

↲: Line break. Some lines are longer than the available space on a printed page, so this symbol is inserted to mean that what is on the next line of the page belongs to the same line on the computer screen. Inside code, the symbol used is <=.
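The distinction these conventions draw, between names provided by Python and names declared by the user, can also be checked from within Python. The sketch below is an illustration rather than something from the book; the helper provided_by_python is a hypothetical name, and it only covers keywords and built-ins, not third-party modules.

```python
import builtins
import keyword

# User-declared variable (the example value from the text above).
sequence = 'MRVLLVALALLALAASATS'


def provided_by_python(name):
    """True if name is a Python keyword or a built-in object."""
    return keyword.iskeyword(name) or hasattr(builtins, name)


print(provided_by_python('round'))     # True
print(provided_by_python('len'))       # True
print(provided_by_python('sequence'))  # False
print(len(sequence))                   # 19
```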

1.2.2 Python Versions

The current version of Python at the time of writing is 3.6.1. Version 2.7.12 is still maintained¹ because a sizable number of applications in production use the 2.7 branch. Versions 3.x and 2.x differ enough to be incompatible. Python 3 is more efficient than Python 2 in many respects: large websites such as Instagram migrated from Python 2.7 to Python 3.6 and reduced CPU and memory consumption by up to 30%. This book uses Python 3.6.
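To confirm which version is running before trying the examples, the interpreter can be queried directly. This is a minimal sketch under the assumption that 3.6 is the minimum of interest, since it is the version the book uses; MINIMUM_VERSION and is_supported are hypothetical names. The last lines show two well-known behaviors that differ between the 2.x and 3.x branches.

```python
import sys

# Assumption for illustration: 3.6 is treated as the minimum version
# because it is the version the book targets.
MINIMUM_VERSION = (3, 6)


def is_supported(version_info=sys.version_info):
    """True if the given version tuple is at least MINIMUM_VERSION."""
    return tuple(version_info[:2]) >= MINIMUM_VERSION


print(sys.version.split()[0])  # version of the running interpreter
print(is_supported())

# Two differences that break Python 2 code under Python 3:
# print is a function, and / between integers is true division.
print(7 / 2)   # 3.5 in Python 3 (Python 2 gave 3)
print(7 // 2)  # 3, floor division in both branches
```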

The only scenario where you may need to use Python 2.7, apart from maintenance of old code, is when a specific library is not available for Python 3. In this case, before starting a project in Python 2.7, try to search for a replacement library. For example, suppose you want to connect to a MySQL database and are told to use MySQLdb. Since this package is not Python 3 compatible, instead of falling back to Python 2.7, use mysqlclient or mysql-connector-python; both work with Python 3.
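One way to see which replacement library is installed, without importing it fully, is to ask the import system. The sketch below is an illustration and not the book's code; first_available and MYSQL_DRIVERS are hypothetical names. It relies on the fact that mysqlclient is imported under the module name MySQLdb, while mysql-connector-python is imported as mysql.connector.

```python
import importlib.util


def first_available(candidates):
    """Return the first importable module name in candidates, or None."""
    for name in candidates:
        try:
            if importlib.util.find_spec(name) is not None:
                return name
        except ModuleNotFoundError:
            # Raised for dotted names when the parent package is missing.
            continue
    return None


# Import names for the two Python 3 drivers mentioned in the text:
# mysqlclient installs as MySQLdb, mysql-connector-python as mysql.connector.
MYSQL_DRIVERS = ["MySQLdb", "mysql.connector"]

# Prints the name of an installed driver, or None if neither is present.
print(first_available(MYSQL_DRIVERS))
```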

1.2.3 Code Style

Python source code that appears in this book is presented as listings. Each line of these listings is numbered. These numbers are not intended to be typed; they are used to reference each line in the text. You don’t need to copy the code from the book, since it can be downloaded from the GitHub repository at https://github.com/Serulab/Py4Bio.

Code can be formatted in several ways and still be valid to the Python interpreter. The following code is syntactically correct:

 def GetAverage(X):
     avG=sum(X)/len(X)
     " Calculate the average "
     return avG

Also this one:

 def get_average(items):
     """ Calculate the average """
     average = sum(items) / len(items)
     return average
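Both definitions above return the same value; a quick check makes this concrete. This is a sketch for illustration, and the list values is hypothetical sample data. It also shows one practical consequence of the layout: a string counts as a docstring only when it is the first statement in the function body.

```python
def GetAverage(X):
    avG=sum(X)/len(X)
    " Calculate the average "
    return avG


def get_average(items):
    """ Calculate the average """
    average = sum(items) / len(items)
    return average


values = [2, 4, 9]  # hypothetical sample data

print(GetAverage(values) == get_average(values))  # True
print(get_average(values))                        # 5.0

# Only a string placed first in the body becomes the docstring.
print(GetAverage.__doc__)   # None
print(get_average.__doc__)
```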

The second sample follows the most widely accepted coding style for Python,² and most code in this book is formatted that way. Some code in the book will not follow accepted coding styles, for the following reasons:

• There are some instances where the most didactic way to show a particular piece of code conflicts with the style guide. On those few occasions, I choose to deviate from the style guide in favor of clarity.

• Due to size limitations in a printed book, some names were shortened and other minor deviations from the coding style have been introduced.

• To show th...
