Analyzing Health Data in R for SAS Users
Monika Maya Wahi,Peter Seebach

  1. 320 pages
  2. English
About This Book

Analyzing Health Data in R for SAS Users is aimed at helping health data analysts who use SAS accomplish some of the same tasks in R. It is targeted to public health students and professionals who have a background in biostatistics and SAS software, but are new to R.

For professors, it is useful as a textbook for a descriptive or regression modeling class, as it uses a publicly-available dataset for examples, and provides exercises at the end of each chapter. For students and public health professionals, not only is it a gentle introduction to R, but it can serve as a guide to developing the results for a research report using R software.

Features:

  • Gives examples in both SAS and R
  • Demonstrates descriptive statistics as well as linear and logistic regression
  • Provides exercise questions and answers at the end of each chapter
  • Uses examples from the publicly available dataset, Behavioral Risk Factor Surveillance System (BRFSS) 2014 data
  • Guides the reader on producing a health analysis that could be published as a research report
  • Gives an example of hypothesis-driven data analysis
  • Provides examples of plots with a color insert

Information

Year: 2017
ISBN: 9781351394277
Edition: 1
1 Differences between SAS and R
This chapter is meant to help the SAS user conceptualize the important differences between R and SAS that will affect the work of the healthcare analyst who knows SAS but is looking to learn how to use R. The first section describes important differences in the structures of the SAS and R programs. This leads to the discussion in the second section, which focuses on how these differences in structure affect the differences in data handling between the two programs. The third section of this chapter contextualizes the choice between using R versus SAS for a healthcare analytics project and provides a guide for selecting which software to use. Optional practice exercises are included in the fourth section.
Structure of Program
Perhaps the most important difference between SAS and R is the structure of how the program is built and maintained. To begin to explain this difference, we will start with considering how different the download and install process is between PC SAS and R. Next, we will cover how these differences are also reflected in the differences in the way licensing is maintained in the two programs. Third, the differences between SAS components and their parallel in R, called R packages, will be discussed, and after that, differences between SAS and R in approach to maintaining the most current version in a production environment will be explained. The differences between the activities of SAS and R user communities will be described, followed by the differences between the SAS and R user interfaces. Finally, some thoughts on the principles of organizing code, metadata, and documentation in SAS and R are presented.
Installation of PC Version: SAS versus R
With a typical PC SAS license from a university, the data analyst has to download and install the program on her Windows personal computer (PC), as the current operating system (OS) for Macintosh is not supported.* To do this, the analyst is given either a set of CDs, which the installer prompts her to insert into and remove from her CD/DVD drive during installation, or an extremely large setup file that takes a very long time to download (plan for hours, not minutes). The analyst is also given a small text file containing the license information unique to her institution.
Once the analyst has these setup files, she runs the setup program on her PC, which takes a while to extract. The installation must be monitored because the user has to click through many menus: SAS institutional licenses are very extensive and include many add-ons and components. This heavy-handed installation process is designed to make sure the analyst has access to all the components to which she is entitled under the license (which, at a university, tends to be a large number).
However, when working in SAS, the analyst may encounter the rare case in which she is using an analytic function that is not included in the SAS license components, and this throws up an error message. This is a confusing situation because the error message does not usually point to a missing component; it simply rejects the code as being incorrect in some way, so troubleshooting can be challenging.
Because R is open source, meaning that R developers are volunteers and make the code and documentation for how R runs readily available on the Internet, there is no need for the advanced functions to control licensing that SAS employs. This makes the way R is distributed different from SAS. Anyone can go to the public website called “The Comprehensive R Archive Network” (CRAN) (https://cran.r-project.org) [1] and download the latest version of R for Mac or Windows and install the core program. The user interface (UI) looks slightly different on Mac versus Windows, but the differences are minor.† The R installation file is small relative to the SAS installation file, and once the user downloads this file and runs it, the setup wizard is clear and easy, making for a quick installation process.
Licensing Differences: SAS versus R
The exact components of the SAS institutional license are communicated to SAS during the installation process, at the step where the setup wizard asks the user to reference the small text file provided with the setup files. As stated earlier, the list of components included in a SAS institutional license is usually long. When SAS negotiates enterprise licenses with universities, they are set up such that a large number of components is included, and students and faculty members pay only a negligible fee or nothing to obtain this licensed version, provided they can prove their status with the university. This is because SAS wants to promote use and learning of all its components at universities.
Importantly, SAS prices its licenses for non-university businesses differently. As an example, an independent nonprofit research institute on the campus of a state university in Florida contacted SAS and asked if the research institute could use the university license. SAS did not agree, and instead prepared a license for PC SAS strictly for the research institute. This license had only one seat, and included only the base component of SAS (“base SAS”) and the basic engine that runs regression functions (“SAS STAT”). This one-seat license cost the nonprofit approximately $10,000 per year in 2007, so the nonprofit could not add extra seats. Hence, researchers who learned SAS at the university and then went on to be hired as scientists at the nonprofit research institute were not able to use their extensive knowledge of all the components of SAS due to the prohibitively expensive licensing approach.
Because R is open source, there is no licensing fee, so R is perfectly suited for public health work in low- to middle-income (LME) countries, nonprofits, and businesses with a low profit margin. This book can hopefully help bridge the gap between SAS and R for SAS-experienced healthcare analysts who are priced out of the market by this situation.
SAS Components versus R Packages
Base SAS, the core of SAS, is a rather large program. As mentioned before, the main R program that is downloaded from CRAN is much smaller than the SAS setup program, and therefore, downloading it is fast and installing it is very easy. The drawback is that just about anything the user tries to do after installing R will require an outside component not in the base program, called an R “package.”
Because R is not licensed for purchase, it is much more efficient for each user to simply build their own version of R by installing the packages needed. Admittedly, this can be a daunting task, as some packages depend on other packages, and all of these need to be installed. For example, to make a Kaplan–Meier plot in R, the user must install the “survival” package, as well as the “KMsurv” package [2]. More recently, packages have been designed to automatically install the packages on which they depend, so this problem does not occur as frequently anymore.
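As a rough illustration of how packages build on one another, the dependencies a package declares can be read from its description metadata once it is installed. The sketch below assumes the “survival” package is already in the local library (it ships with most standard R installations); the object name `desc` is only illustrative.

```r
# Read the DESCRIPTION metadata of an installed package.
# Assumption: "survival" is already present in the local library.
desc <- utils::packageDescription("survival")

# Packages (and the minimum R version) that must be present for it to load
desc$Depends

# Packages whose functions it uses internally
desc$Imports
```

The two fields printed here are among those R consults when it decides which additional packages to fetch during installation.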
Luckily, as with the base R program, all the packages are free and are easy to install from within the program. In the native R UI, there is a menu with only seven selections, and “Packages” is one of them. If the user chooses the “Packages” menu and selects “Install package(s),” the UI presents a list of packages that can be installed, and the user just needs to select the correct ones and install them; “Load package” then makes an installed package available in the current session. Packages can also be installed using commands, but the menu makes the process extremely easy.
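The command route might look like the following minimal sketch. `install.packages()`, `requireNamespace()`, and `library()` are standard R functions; the helper name `install_if_missing` and the choice of CRAN mirror are assumptions made for illustration, not something the book prescribes.

```r
# Hypothetical helper (not part of R): install a package from CRAN only
# if it is not already in the local library, then attach it.
install_if_missing <- function(pkg, repos = "https://cloud.r-project.org") {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    # By default, install.packages() also installs the packages that
    # `pkg` requires (its Depends, Imports, and LinkingTo entries).
    install.packages(pkg, repos = repos)
  }
  # character.only = TRUE lets library() accept the name as a string
  library(pkg, character.only = TRUE)
}

install_if_missing("survival")
```

Calling the helper again with "KMsurv" would fetch the second package named in the Kaplan–Meier example. Checking first with `requireNamespace()` avoids downloading a package that is already installed.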
What is RStudio?
This book gives guidance on using the PC version of R in Windows as it appears with its native UI. R’s native UI is typically sufficient to be used by healthcare analysts to develop statistical models. However, there is also an integrated development environment (IDE) that can be used called RStudio. This is supported by the R Consortium, which is a collaboration between the R Foundation (which maintains CRAN), RStudio (a collection of R developers working on the IDE), and other big tech companies such as Microsoft and Google [3]. Like R, RStudio is also open source and free to individual users.
RStudio is different from R in that it is an IDE and includes a source code editor, build automation tools, and a debugger. RStudio can be run as a desktop or server version, so it is used at universities in programs that teach programming in IDEs [4]. It is an excellent tool for deploying Shiny, which is an R package that interfaces R with the web and turns R analyses into web applications [5].
Generally, the capabilities afforded by RStudio are required for deploying web applications, but for hypothesis-driven healthcare analytics, RStudio can be overkill. For example, RStudio has several windows that are associated specifically with the IDE, and would not appear in R. Therefore, unless the analyst needs an IDE, using R rather than RStudio is preferred.
Maintaining Current Versions in SAS versus R
A university SAS license is set up as a yearly license, so it expires in a particular month each year. When the license expires, the user can still open the SAS program, but it throws up an error message indicating that the user needs an updated text file with a new license, and it does not let the user unlock the program until this file is loaded. At this point, the user must obtain the new text file from the university and load the license from within SAS, thus unlocking the program. Although SAS does update its base program with a new version from time to time, it does not do so often, so renewing the license is much more common than installing a new version of SAS.
Part of the reason SAS does not update its base program often has to do with how it determines what to put in updates. Independent SAS programmers typically develop macros (canned code procedures) in SAS macro language and make these available on the Internet. These are not held in a central repository but posted all over the Internet, and are also highlighted in SAS white papers presented for regional as well as national and international user groups [6–8], which are also not available in a central indexed repository (although they are posted on SAS-sponsored websites and are easy to find through a search on the web).
If a particular macro becomes popular with SAS users, some may choose to write a peer-reviewed article about it [9], fostering discussion of the macro in the SAS community. If SAS receives enough requests to include the macro as a main function in SAS, and can verify...

Citation styles for Analyzing Health Data in R for SAS Users

APA 6 Citation

Wahi, M. M., & Seebach, P. (2017). Analyzing Health Data in R for SAS Users (1st ed.). CRC Press. Retrieved from https://www.perlego.com/book/1615397/analyzing-health-data-in-r-for-sas-users-pdf (Original work published 2017)
