eBook - ePub

Dynamic Documents with R and knitr

Name: Dynamic Documents with R and knitr
ISBN: 9781315360706

Yihui Xie,

294 pages
English
ePUB (mobile friendly)
Available on iOS & Android

eBook - ePub

Dynamic Documents with R and knitr

Yihui Xie,

About this book

Quickly and Easily Write Dynamic Documents

Suitable for both beginners and advanced users, Dynamic Documents with R and knitr, Second Edition makes writing statistical reports easier by integrating computing directly with reporting. Reports range from homework, projects, exams, books, blogs, and web pages to virtually any documents related to statistical graphics, computing, and data analysis. The book covers basic applications for beginners while guiding power users in understanding the extensibility of the knitr package.

New to the Second Edition

A new chapter that introduces R Markdown v2
Changes that reflect improvements in the knitr package
New sections on generating tables, defining custom printing methods for objects in code chunks, the C/Fortran engines, the Stan engine, running engines in a persistent session, and starting a local server to serve dynamic documents

Boost Your Productivity in Statistical Report Writing and Make Your Scientific Computing with R Reproducible

Like its highly praised predecessor, this edition shows you how to improve your efficiency in writing reports. The book takes you from program output to publication-quality reports, helping you fine-tune every aspect of your report.

Tools to learn more effectively

Saving Books

Keyword Search

Annotating Text

Listen to it instead

Information

Publisher

Year

Print ISBN

eBook ISBN

Topic

Subtopic

Statistiques pour les entreprises et l'économie

Introduction

The basic idea behind dynamic documents stems from literate programming, a programming paradigm conceived by Donald Knuth (Knuth, 1984). The original idea was mainly for writing software: mix the source code and documentation together; we can either extract the source code out (called tangle) or execute the code to get the compiled results (called weave). A dynamic document is not entirely different from a computer program: for a dynamic document, we need to run software packages to compile our ideas (often implemented as source code) into numeric or graphical output, and insert the output into our literal writings (like documentation).

We explain the idea with a trivial example: suppose we need to write the value of 2π into a report; of course, we can directly write the number 6.2832. Now, if I change my mind and I want 6π instead, I may have to find a calculator, erase the previous value, and write the new answer. Since it is extremely easy for the computer to calculate 6π, why not leave this job to the computer completely and free oneself from this kind of manual work? What we need to do is to leave the source code in the document instead of a hard-coded value, and tell the computer how to find and execute the source code. Usually we use special markers for computer code in the source report; e.g., we can write

The correct answer is {{ 6 * pi }}.

in which {{ and }} is a pair of markers that tell the computer 6 * pi is the source code and should be executed. Note here pi (π) is a constant in R.

If you know a Web scripting language such as PHP (which can embed program code into HTML documents), this idea should look familiar. The above example shows the inline code output, which means source code is mixed inline with a sentence. The other type of output is the chunk output, which gives the results from a whole block of code. The chunk output has much more flexibility; for example, we can produce graphics and tables from a code chunk.

Figure 1.1 was dynamically created with a chunk of R code, which is printed below:

FIGURE 1.1: A simulation of Brownian motion for 100 steps:

x_{1} = ϵ_{1}, x_{i + 1} = x_{i} + ϵ_{i + 1}, ϵ_{i} \underset{\sim}{i i d} N (0, 1), i = 1, 2, \dots, 100

If we were to do this by hand, we would have to open R, paste the code into the R console to draw the plot, save it as a PDF file, and insert it into a

document with \includegraphics{}. This is both tedious for the author and difficult to maintain — supposing we want to change the random seed in set.seed(), increase the number of steps, or use a scatterplot instead of a line graph, we will have to update both the source code and the output. In practice, the computing and analysis can be far more complicated than the toy example in Figure 1.1, and more manual work will be required accordingly.

The spirit of dynamic documents may best be described by the philosophy of the ESS project (Rossini et al., 2004) for the S language:

The source code is real.

Philosophy for using ESS[S]

Since the output can be produced by the source code, we can maintain the source code only. However, in most cases, the direct output from the source code alone does not constitute a report that is readable for a human. That is why we need the literate programming paradigm. In this paradigm, an author has two tasks:

1. write program code to do computing, and

2. write narratives to explain what is being done by the program code

The traditional approach to doing the second task is to write comments for the code, but comments are often limited in terms of expressing the full thoughts of the authors. Normally we write our ideas in a paper or a report instead of hundreds of lines of code comments.

Let us change our traditional attitude to the construction of programs: Instead of imagining that our main task is to instruct a computer what to do, let us concentrate rather on explaining to humans what we want the computer to do.

Donald E. Knuth

Literate Programming, 1984

Technically, literate programming involves three steps:

1. parse the source document and separate code from narratives

2. execute source code and return results

3. mix results from the source code with the original narratives

These steps can be implemented in software packages, so the authors do not need to take care of these technical details. Instead, we only control what the output should look like. There are many details that we can tune for a report (especially for reports related to data analysis), although the idea of literate programming seems to be simple. For example, data reports often include tables, and Table 1.1 is a table generated from the R code below using the kable() function in knitr:

Think how easy it is to maintain two lines of R code compared to maintaining many lines of messy

code!

Generating reports dynamically by integrating computer code with narratives is not only easier, but also closely related to reproducible research, which we will discuss in the next chapter.

TABLE 1.1 A subset of the mtcars dataset: the first 6 rows and 6 columns.

Reproducible Research

Results from scientific research have to be reproducible to be trustworthy. We do not want a finding to be merely due to an isolated occurrence, e.g., only one specific laboratory researcher can produce the results on one specific day, and nobody else can produce the same results under the same conditions.

Reproducible research (RR) is one possible by-product of dynamic documents, but dynamic documents do not absolutely guarantee RR. Because there is usually no human intervention when we generate a report dynamically, it is likely to be reproducible since it is relatively easy to prepare the same software and hardware environment, which is everything we need to reproduce the results. However, the meaning of reproducibility can be beyond reproducing one specific result or one particular report. As a trivial example, one might have done a Monte Carlo simulation with a certain random seed and got a good estimate of a parameter, but the result was actually due to a “lucky” random seed. Although we can strictly reproduce the estimate, it is not actually repr...

Cover
Title Page
Copyright Page
Dedication
Table of Contents
Preface
Author
List of Figures
List of Tables
1 Introduction
2 Reproducible Research
3 A First Look
4 Editors
5 Document Formats
6 Text Output
7 Graphics
8 Cache
9 Cross Reference
10 Hooks
11 Language Engines
12 Tricks and Solutions
13 Publishing Reports
14 R Markdown
15 Applications
16 Other Tools
Appendix
Bibliography
Index

Frequently asked questions

Yes, you can cancel anytime from the Subscription tab in your account settings on the Perlego website. Your subscription will stay active until the end of your current billing period. Learn how to cancel your subscription

No, books cannot be downloaded as external files, such as PDFs, for use outside of Perlego. However, you can download books within the Perlego app for offline reading on mobile or tablet. Learn how to download books offline

Perlego offers two plans: Essential and Complete

Essential is ideal for learners and professionals who enjoy exploring a wide range of subjects. Access the Essential Library with 800,000+ trusted titles and best-sellers across business, personal growth, and the humanities. Includes unlimited reading time and Standard Read Aloud voice.
Complete: Perfect for advanced learners and researchers needing full, unrestricted access. Unlock 1.4M+ books across hundreds of subjects, including academic and specialized titles. The Complete Plan also includes advanced features like Premium Read Aloud and Research Assistant.

Both plans are available with monthly, semester, or annual billing cycles.

We are an online textbook subscription service, where you can get access to an entire online library for less than the price of a single book per month. With over 1 million books across 990+ topics, we’ve got you covered! Learn about our mission

Look out for the read-aloud symbol on your next book to see if you can listen to it. The read-aloud tool reads text aloud for you, highlighting the text as it is being read. You can pause it, speed it up and slow it down. Learn more about Read Aloud

Yes! You can use the Perlego app on both iOS and Android devices to read anytime, anywhere — even offline. Perfect for commutes or when you’re on the go.
Please note we cannot support devices running on iOS 13 and Android 7 or earlier. Learn more about using the app

Yes, you can access Dynamic Documents with R and knitr by Yihui Xie in PDF and/or ePUB format, as well as other popular books in Économie & Statistiques pour les entreprises et l'économie. We have over one million books available in our catalogue for you to explore.

About this book

Tools to learn more effectively

Information

Table of contents

Frequently asked questions