Molecular Data Analysis Using R
eBook - ePub

Molecular Data Analysis Using R

Csaba Ortutay, Zsuzsanna Ortutay

  1. English
  2. ePUB (mobile friendly)
  3. Available on iOS & Android
eBook - ePub

Molecular Data Analysis Using R

Csaba Ortutay, Zsuzsanna Ortutay

Book details
Book preview
Table of contents
Citations

About This Book

This book addresses the difficulties experienced by wet lab researchers with the statistical analysis of molecular biology related data. The authors explain how to use R and Bioconductor for the analysis of experimental data in the field of molecular biology. The content is based upon two university courses for bioinformatics and experimental biology students (Biological Data Analysis with R and High-throughput Data Analysis with R). The material is divided into chapters based upon the experimental methods used in the laboratories. Key features include:
•Broad appeal--the authors target their material to researchers in several levels, ensuring that the basics are always covered.
•First book to explain how to use R and Bioconductor for the analysis of several types of experimental data in the field of molecular biology.
•Focuses on R and Bioconductor, which are widely used for data analysis. One great benefit of R and Bioconductor is that there is a vast user community and very active discussion in place, in addition to the practice of sharing codes. Further, R is the platform for implementing new analysis approaches, therefore novel methods are available early for R users.

Frequently asked questions

How do I cancel my subscription?
Simply head over to the account section in settings and click on “Cancel Subscription” - it’s as simple as that. After you cancel, your membership will stay active for the remainder of the time you’ve paid for. Learn more here.
Can/how do I download books?
At the moment all of our mobile-responsive ePub books are available to download via the app. Most of our PDFs are also available to download and we're working on making the final remaining ones downloadable now. Learn more here.
What is the difference between the pricing plans?
Both plans give you full access to the library and all of Perlego’s features. The only differences are the price and subscription period: With the annual plan you’ll save around 30% compared to 12 months on the monthly plan.
What is Perlego?
We are an online textbook subscription service, where you can get access to an entire online library for less than the price of a single book per month. With over 1 million books across 1000+ topics, we’ve got you covered! Learn more here.
Do you support text-to-speech?
Look out for the read-aloud symbol on your next book to see if you can listen to it. The read-aloud tool reads text aloud for you, highlighting the text as it is being read. You can pause it, speed it up and slow it down. Learn more here.
Is Molecular Data Analysis Using R an online PDF/ePUB?
Yes, you can access Molecular Data Analysis Using R by Csaba Ortutay, Zsuzsanna Ortutay in PDF and/or ePUB format, as well as other popular books in Medicine & Biostatistics. We have over one million books available in our catalogue for you to explore.

Information

Year
2016
ISBN
9781119165040
Edition
1

CHAPTER 1
Introduction to R statistical environment

Why R?

If you work in the field of biodata analysis, or if you are interested in getting a bioinformatics job, you can see a large number of related job advertisements targeting young professionals. There is one common topic coming back in those ads: they demand “a high degree of familiarity with R/Bioconductor.” (Here, I am quoting an actual recent ad from Monster.com.)
Besides, when we have to create and analyze a large amount of data during our bio‐researcher career, sooner or later we realize that simple approaches using spread sheets (aka the Excel part of MS Office) are not flexible anymore to fulfill the needs of our projects. In these situations, we start to look for dedicated statistical software tools, and soon we encounter the countless alternatives from which we can choose. The R statistical environment is one among the possibilities.
With the exponential spread of high‐throughput experimental methods, including microarray and next‐generation sequencing (NGS)-based experiments, the skills related to large‐scale analysis of data from biological experiments have higher and higher value. R and Bioconductor offer a free and flexible tool‐set for these types of analyses; therefore, many research groups and companies select it as their data analysis platform.
R is an open‐source software licensed under the GNU General Public License (GPL). This has an advantage that you can install R for free on your desktop computer, regardless of whether you use Windows, Mac OS X, or a Linux distribution.
Introducing all the features of R thoroughly at a general level exceeds the scope and purpose of this book, which is to focus on molecular biology‐specific applications. For those who are interested in a deeper introduction into R itself, it is suggested reading the book R for Beginners by Emmanuel Paradis as a reference guide. It is an excellent general guide, which can be found online (Paradis 2005). In the course, we use more biology‐oriented examples to illustrate the most important topics. The other recommended book for this chapter is R in a Nutshell by Joseph Adler (2012).

Installing R

The first task of analyzing data with R is to install R on the computer. There is a nice discussion on the bioinformatics blogs about why people so seldom use their knowledge acquired on short bioinformatics courses. One of the main considerations points out that it is because the greatest challenge is to install the software in question.
There are plenty of available information on the web about how to install R, but the most authentic source is the website of the R project itself. In this page, the official documentation, installer, and other related links from the developers of R themselves are collected. The first step is to navigate to the download section of the page and find the mirror pages closest to the location of the user.
However, there are some differences in the installation process depending on the operating system of the computer in use. Windows users should find the Windows installer to their system from the download pages. It is useful to check for the base installer, not the contributed libraries. In the case of a Linux distribution, R can be installed via the package manager. Several Linux distributions provide R (and many R libraries) as a part of their repositories. This way, the package manager can take care of the updates. Mac OS X users and Apple fans can find the pkg file containing the R framework, 64‐bit graphical user interface (GUI) (R.app) and Tcl/Tk 8.6.0 X11 libraries for installing the R base systems on their computer. Brave users of other UNIX systems (i.e., FreeBSD or OpenBSD) can use R, but they should compile it from the source. This is not a beginner topic. In the case of a computer owned by a company, university, or library, the installation of R (just like many other programs) requires most often superuser rights.

Interacting with R

The interface of R is somewhat different from other software used for statistics, such as SPSS, S‐plus, Prism, or MS Excel (which is not a statistical software tool!). There are neither icons nor sophisticated menus to perform analyses. Instead, commands should be typed in the appropriate place of R called the “command prompt”. It is marked with >. In this book, the commands for typing into the prompt are marked by fixed‐width (monospaced) fonts:
> citation() 
After typing in a command (and hitting Enter), the results turn up either under the command or, in case of graphics, in a separate window. If the result of a command is nothing, the string NULL appears as a result. Mistyping or making an error in the parameters of a command leads to an error message with some information about what was wrong.
> c() NULL > a * 5 Error: object 'a' not found 
From now on, we will omit the > prompt character from the code samples so you can just copy/paste the commands. Leaving R happens with the quit() function.
quit(save='no') q() 

Graphical interfaces and integrated development environment (IDE) integration

A command‐line interface is enough for performing the practices. However, some prefer to have GUI. There are multiple choices depending on the operating system in use. The Windows and Mac versions of R starts with a very simple GUI, while Linux/UNIX versions start only with a command‐line interface. The Java GUI for R is available for any platform capable of running Java, and it sports simple, functional menus to perform the most basic tasks related to an analysis (Helbig, Urbanek, and Fellows 2013).
For a more advanced GUI, one can experiment with RStudio or R Commander (Fox 2005). There are several plugins to integrate R into the best coding production tools, such as Emacs (with the Emacs Speaks Statistics add‐on), Eclipse (by StatET for R), and many others.

Scripting and sourcing

Doing data analysis in R means typing in commands and experimenting with parameters suitable for the given set of data. At a later stage, the procedure will be repeated either on the same data with slight modifications in the course o...

Table of contents

Citation styles for Molecular Data Analysis Using R

APA 6 Citation

Ortutay, C., & Ortutay, Z. (2016). Molecular Data Analysis Using R (1st ed.). Wiley. Retrieved from https://www.perlego.com/book/995558/molecular-data-analysis-using-r-pdf (Original work published 2016)

Chicago Citation

Ortutay, Csaba, and Zsuzsanna Ortutay. (2016) 2016. Molecular Data Analysis Using R. 1st ed. Wiley. https://www.perlego.com/book/995558/molecular-data-analysis-using-r-pdf.

Harvard Citation

Ortutay, C. and Ortutay, Z. (2016) Molecular Data Analysis Using R. 1st edn. Wiley. Available at: https://www.perlego.com/book/995558/molecular-data-analysis-using-r-pdf (Accessed: 14 October 2022).

MLA 7 Citation

Ortutay, Csaba, and Zsuzsanna Ortutay. Molecular Data Analysis Using R. 1st ed. Wiley, 2016. Web. 14 Oct. 2022.