The New S Language
eBook - ePub

  1. 720 pages
  2. English

About this book

This book provides documentation for a new version of the S system released in 1988. The new S enhances the features that have made S popular: interactive computing, flexible graphics, data management and a large collection of functions. The new S features make possible new applications and higher-level programming, including a single unified language, user-defined functions as first-class objects, symbolic computations, more accurate numerical calculations and a new approach to graphics. S now provides direct interfaces to the powerful tools of the UNIX operating system and to algorithms implemented in Fortran and C.

The New S Language by R. Becker is available in PDF and/or ePUB format and is catalogued under Mathematics (Probability & Statistics).
1 How to Beat the Lottery
One of the best ways of getting acquainted with S is to use it to help you understand a particular set of data. Let’s look at a situation where you might be motivated to perform data analysis.
1.1 Using S to Understand Data
The lottery is a common feature of modern life. Lotteries range from the Irish Sweepstakes, with its yearly large drawings and enormous payoffs, to daily numbers games run by state governments (as well as illegal games run by bookies).
You might wonder why we are presenting lottery data here. There are several answers. First, there is the traditional association between probability theory and gambling: the foundations of statistics go back to studies of games of chance. Lotteries raise many interesting questions. In fact, data analysis may be the only practical way of answering questions such as “Is the lottery fair?” A second reason is that the ubiquity of gambling and lotteries has acquainted almost everyone with the basic concepts involved. A third reason is that a scientific look at lottery data may provide answers to the important questions: “Should I play, and if so, how should I play?”
1.2 New Jersey Pick-It Lottery Data
The specific data we will look at concerns the New Jersey Pick-It Lottery, a daily numbers game run by the state of New Jersey to aid education and institutions. Our data is for 254 drawings just after the lottery was started, from May, 1975 to March, 1976. Pick-It is a parimutuel game, meaning that the winners share a fraction of the money taken in for the particular drawing. Each ticket costs fifty cents and at the time of purchase the player picks a three-digit number ranging from 000 to 999. Half of the money bet during the day is placed in a prize pool (the state takes the other half) and anyone who picked the winning number shares equally in the pool.
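The payoff rule just described can be written as a short S function. This is a minimal sketch rather than code from the book: the names `pickit.payoff` and `tickets.per.winner`, their argument names, and the ticket counts in the usage note are hypothetical. Only the fifty-cent price and the one-half pool share come from the text. The code uses S syntax that R also accepts.

```r
# Parimutuel payoff per winning ticket (illustrative sketch).
# tickets: number of tickets sold for the drawing (hypothetical input)
# winners: number of those tickets that picked the winning number
pickit.payoff <- function(tickets, winners, price = 0.5, pool.share = 0.5) {
  pool <- pool.share * price * tickets   # half of the money bet forms the prize pool
  pool / winners                         # winners share the pool equally
}

# Inverse view: tickets sold per winning ticket implied by an observed payoff.
tickets.per.winner <- function(payoff, price = 0.5, pool.share = 0.5) {
  payoff / (pool.share * price)
}
```

For a made-up drawing with 100,000 tickets and 100 winners, `pickit.payoff(100000, 100)` gives 250. Read the other way, `tickets.per.winner(190)` gives 760: under this rule a payoff of $190.00 means roughly one ticket in 760 held the winning number.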
The data available from the NJ Lottery Commission gives for each drawing the winning number and the payoff for a winning ticket. The winning numbers are:†
Image
The corresponding payoffs are:
Image
Image
Thus, for the first drawing, the winning number was 810 and it paid $190.00 to each winning ticket holder. Streams of numbers like this are both difficult to use and boring. One of the best ways to understand the data is to look at it graphically. Before doing any plots, however, we should think of the questions we might want to ask of the data. For example, there have been notorious cases of fraud in lotteries (see Figure 1.1).
Although a single rigged drawing is something that we could not detect with our data, we may be able to detect long-term irregularities. Let’s look at the winning numbers to see if they appear to be chosen at random.
> hist (lottery.number) # Figure 1.2
The histogram looks fairly flat—no need to inform a grand jury.
Of course, most of our attention will probably be directed at the payoffs. Elementary probabilistic reasoning tells us that, unless we can predict the future or rig the lottery, a single number that we pick has a 1 in 1000 chance of winning. If we play many times, we expect about 1 winning number per 1000 plays. Since a ticket costs fifty cents, 1000 plays will cost $500, so we hope to win at least $500 each time we win, otherwise we will lose money in the long run.
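The break-even reasoning can be written out explicitly. The symbols are introduced here for illustration and are not the book's notation: P is the payoff received on a win, and E is the expected net gain per fifty-cent ticket, assuming every win pays the same amount P.

```latex
\[
E \;=\; \frac{1}{1000}\,P \;-\; 0.50,
\qquad
E \ge 0 \iff P \ge 500 .
\]
```

With the $190.00 payoff of the first drawing, E = 0.19 - 0.50 = -0.31, an average loss of 31 cents per ticket if every win paid that amount.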
Image
Figure 1.1. A case of lottery fraud in September, 1980. (© 1980 by The New York Times Company. Reprinted by permission. Wide World Photos.)
Image
Figure 1.2. Histogram of winning numbers from 254 lottery drawings. Since there are 10 bars, the count should be approximately 25 in each bar, if the winning numbers are drawn at random. The small bar at the left represents the one time that 000 was the winning number.
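The visual impression of flatness can be backed by a count per group and a Pearson chi-square statistic. The sketch below is an addition, not the book's method. The function name `flatness` is hypothetical, and it groups by leading digit, so 000 falls in the first group instead of getting its own bar as in Figure 1.2. It uses S syntax that R also accepts.

```r
# Count drawings by leading digit and measure departure from a flat histogram.
# number: vector of winning numbers between 0 and 999
flatness <- function(number) {
  digit <- number %/% 100                 # leading digit, 0 through 9
  counts <- numeric(10)
  for (d in 0:9) counts[d + 1] <- sum(digit == d)   # drawings in each digit group
  expected <- length(number) / 10         # 25.4 when there are 254 drawings
  list(counts = counts,
       chisq = sum((counts - expected)^2 / expected))  # Pearson chi-square statistic
}
```

Calling `flatness(lottery.number)` would return the ten counts and the statistic. If the numbers are drawn at random, the statistic should rarely exceed about 16.9, the upper 5% point of the chi-square distribution with 9 degrees of freedom.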
Let’s make a histogram of the payoffs.
> hist (lottery.payoff) # Figure 1.3
Image
Figure 1.3. A histogram of the lottery payoffs shows that payoffs range from less than $100 to more than $800, although the bulk of the payoffs are between $100 and $400.
In our set of data there were a number of payoffs larger than $500—perhaps we have a chance. The widely varying payoffs are primarily due to the parimutuel betting in the lottery; if you win when few others win, you will get a large payoff. If you are unlucky enough to win along with lots of others, the payoff may be relatively small. Let’s see what the largest and smallest payoffs and corresponding winning numbers were:
> max (lottery.payoff) # the largest payoff
[1] 869.5
> lottery.number[ lottery.payoff==max(lottery.payoff) ]
[1] 499
> min(lottery.payoff) # the smallest payoff
[1] 83
> lottery.number[ lottery.payoff==min(lottery.payoff) ]
[1] 123
Winners who bet on “123” must have been disappointed; $83 is not a very large payoff. On the other hand, $869.50 is very nice.
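The expressions above rely on logical subscripting: the comparison yields one TRUE or FALSE value per drawing, and subscripting by that vector keeps the elements where it is TRUE. Here is a minimal sketch using only the three number and payoff pairs quoted in this chapter. The names `number`, `payoff` and `largest` are stand-ins chosen for this illustration, not the full `lottery.number` and `lottery.payoff` datasets.

```r
number <- c(810, 499, 123)     # winning numbers mentioned in the text
payoff <- c(190, 869.5, 83)    # their payoffs in dollars

largest <- payoff == max(payoff)   # FALSE TRUE FALSE
number[largest]                    # 499
number[payoff == min(payoff)]      # 123
```

If two drawings tied for the largest payoff, the same expression would return both winning numbers.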
Since the winning numbers and the payoffs come in pairs, a number and a payoff for each drawing, we can produce a scatterplot of the data to see if there is any relationship between the payoff and the winning number.
> plot(lottery.number, lottery.payoff) # Figure 1.4
What do you see in the picture? Does the payoff seem to depend on the position of the winning number? Perhaps it would help to add a “middle” line that follows the overall pattern of the data:
Image
Figure 1.4. Scatterplot of winning number and payoff for the 254 different lottery drawings.
> lines ( lowess (lottery.number, lottery.payoff, f=.2) )
> # Figure 1.5
Can you see the interesting characteristics now in Figure 1.5? There are substantially higher payoffs for numbers with a leading zero, meaning fewer people bet on these numbers. Perhaps that reflects people’s reluctance to think of numbers with leading zeros. After all, no one writes $010 on a ten dollar check! Also note that, except for the numbers with leading zeros, payoffs seem to increase as the winning number increases.
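The leading-zero effect seen in the smooth curve can also be checked numerically by comparing average payoffs for the two groups. This is a sketch added for illustration. The function name `leading.zero.summary` is hypothetical, and no result for the actual lottery data is claimed here. It uses S syntax that R also accepts.

```r
# Compare average payoffs for winning numbers with and without a leading zero.
# number, payoff: paired vectors, one element per drawing
leading.zero.summary <- function(number, payoff) {
  zero <- number < 100                       # TRUE for 000 through 099
  c(leading.zero = mean(payoff[zero]),       # average payoff, leading-zero numbers
    other = mean(payoff[!zero]))             # average payoff, all other numbers
}
```

With the chapter's data attached, the call would be `leading.zero.summary(lottery.number, lottery.payoff)`. A noticeably larger first value would agree with what the curve in Figure 1.5 suggests.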
It would be interesting to see exactly what numbers correspond to the large payoffs. Fortunately, with an interactive graphical input device, we can do that by simply pointing at the “outliers”:
> identify (lottery.number, lottery.payoff, lottery.number)
> # Figure 1.6
Can you see the pattern in the numbers with very high payoffs? Spend some time thinking before looking at the footnote, which contains the explanation.† Did you find the pattern? If so, you have accomplished something very important—you learned something new by looking at the data, and afterwards found that it could be explained by the rules of the game. Much of data analysis consists of detecting clues from patterns in the data and then following up on the clues to better understand the data.
Image
Figure 1.5. A smooth curve is superimposed on the winning number and payoff scatterplot.
Image
Figure 1.6. Outliers on the scatterplot are labelled...

Table of contents

  1. Cover
  2. Half Title
  3. Title Page
  4. Copyright Page
  5. Dedication
  6. Table of Contents
  7. 1 How to Beat the Lottery
  8. 2 Tutorial Introduction to S
  9. 3 Using the S Language
  10. 4 Graphical Methods in S
  11. 5 Data in S
  12. 6 Writing Functions
  13. 7 More on Writing Functions
  14. 8 More about Data
  15. 9 Examples and Case Studies
  16. 10 Advanced Graphics
  17. 11 How S Works
  18. Bibliography
  19. Appendix 1 S Function Documentation
  20. Appendix 2 S Dataset Documentation
  21. Appendix 3 Index to S Functions
  22. Appendix 4 Old-S and S
  23. Index