Environmental Statistics and Data Analysis
Wayne R. Ott
About this book

This easy-to-understand introduction emphasizes the areas of probability theory and statistics that are important in environmental monitoring, data analysis, research, environmental field surveys, and environmental decision making. It communicates basic statistical theory with very little abstract mathematical notation, but without omitting importa…

1. Random Processes
Random: A haphazard course — at random: without definite aim, direction, rule or method.¹
The concept of “randomness,” as used in common English, is different from its meaning in statistics. To emphasize this difference, the word stochastic commonly is used in statistics for random, and a stochastic process is a random process. A stochastic process is one that includes one or more random components, and a process without random components is called deterministic. Because environmental phenomena nearly always include random components, the study of stochastic processes is essential for making valid environmental predictions.
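To make the distinction concrete, consider a minimal computational sketch (Python, purely illustrative; the flow rates and the random component below are assumptions, not values from the text):

    import random

    # Deterministic process: identical inputs always yield identical outputs.
    def flow_deterministic(valve_open):
        """Flow rate (liters/min) completely determined by the valve."""
        return 6.0 if valve_open else 0.0

    # Stochastic process: the output includes a random component, so
    # repeated trials under identical conditions can differ.
    def flow_stochastic(valve_open):
        """Flow rate perturbed by unmodeled random influences."""
        if not valve_open:
            return 0.0
        return max(0.0, random.gauss(6.0, 1.5))  # assumed mean and spread

    print([flow_deterministic(True) for _ in range(3)])        # [6.0, 6.0, 6.0]
    print([round(flow_stochastic(True), 2) for _ in range(3)])  # varies each run

Repeated calls to flow_deterministic are perfectly predictable; repeated calls to flow_stochastic are not, which is precisely the property that motivates the models developed in later chapters.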
To most of us, it is comforting to view the world we live in as consisting of many identifiable cause-effect relationships. A “cause-effect” relationship is characterized by the certain knowledge that, if a specified action takes place, a particular result always will occur, and there are no exceptions to this rule. Such a process is called deterministic, because the resulting outcome is determined completely by the specified cause, and the outcome can be predicted with certainty. Unfortunately, few components of our daily lives behave in this manner.
Consider the simple act of obtaining a glass of drinking water. Usually, one seeks a water faucet, and, after it is found, places an empty glass beneath the faucet and then turns the handle on the faucet. Turning the handle releases a piston inside the valve, allowing the water to flow. The process is a simple one: the act of turning the handle of the faucet (the “cause”) brings about the desired event of water flowing (the “effect”), and soon the glass fills with water.
Like so many other events around us, this event is so familiar that we ordinarily take it for granted. If, before we operated the faucet, someone asked us, “What will happen if the handle of the faucet is turned?”, we would be willing to predict, with considerable certainty, that “water will appear.” If we had turned the handle and no water appeared, we probably would conclude that there is something wrong with the plumbing. Why do we feel so comfortable about making this simple cause-effect prediction? How did we arrive at this ability to predict a future event in reality?
In our mind, we possess a conceptual framework, or a “model,” of this process. This model has been developed from two sources of information: (1) Our knowledge of the physical structure of faucets, valves, and water pipes and the manner in which they are assembled, and (2) Our historical experience with the behavior of other water faucets, and, perhaps, our experience with this particular faucet. The first source of knowledge comes from our understanding of the physical construction of the system and the basic principles that apply to water under pressure, valves that open and close, etc. For example, even if we had never seen a faucet or a valve before, we might be willing to predict, after the mechanism and attached pipe were described to us in detail, that turning the handle of the faucet would release the water. The second source of knowledge is derived from what we have learned from our experience with other, similar faucets. We reason thus: “Turning the faucet handle always has caused the water to flow in the past, so why shouldn’t it do so in the future?” The first source of knowledge is theoretical and the second source is empirical (that is, based on actual observations). If only the second source of knowledge were available — say, 179 cases out of 179 tries in which the faucet handle is turned on and the water appears — we probably would be willing to predict (based on this information alone and with no knowledge of the internal workings of the system) that turning the handle the next time — for the 180th try — would allow us to fill the glass with water.
These two independent sources of information — physical knowledge of the structure of a system and observational knowledge about its behavior — greatly strengthen our ability to make accurate predictions about the system’s future behavior. From the first source of information, we can construct a conceptual model based on the internal workings of the system. From the second source of information, we can validate the conceptual model with real observations. The first source of information is theoretical in nature; the second one is empirical. A theoretical model validated by empirical observation usually provides a powerful tool for predicting future behavior of a system or process.
Unfortunately, the world about us does not always permit the luxury of obtaining both sources of information — theory and observation. Sometimes, our knowledge of the system’s structure will be vague and uncertain. Sometimes, our observational information will be very limited. Despite our lack of information, it may be necessary to make a prediction about the future behavior of the system. Thus, a methodology that could help us analyze existing information about the system to improve the accuracy of our predictions about its future behavior would be extremely useful.
Consider the above example of the water faucet. Suppose little were known about its construction, attached pipes, and sources of water. Suppose that the faucet behaves erratically: when the handle is turned, sometimes the water flows and sometimes it does not, with no obvious pattern. With such uncertain behavior of the device, we probably would conclude that unknown factors (for example, clogged pipes, defective pumps, broken valves, inadequate water supplies, wells subject to rising and falling water tables) are affecting this system. The faucet may, in fact, be attached to a complex network of pipes, tanks, valves, filters, and other devices, some of which are faulty or controlled by outside forces. Because the arrival of water will depend on many unknown factors beyond the user’s control, and because the outcome of each particular event is uncertain, the arrival of water from the faucet may behave as a stochastic process.
How can one make predictions about the behavior of such a process? The first step is to express the event of interest in some formal manner — such as a “1” or “0” — denoting the presence or absence of water, or a quantitative measure (g or m³) denoting the amount of water arriving in a fixed time interval. Such a quantitative measure is called a random variable. A random variable is a function of other causative variables, some of which may or may not be known to the analyst. If all of the causative variables were known, and the cause-effect relationships were well-understood, then the process would be deterministic. In a deterministic process, there are no random variables; one can predict with certainty the rate at which the water will flow whenever the valve is opened by examining the status of all the other contributing variables.
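As a brief illustration (not from the text), the indicator form of such a random variable can be written down directly; the unknown causative variables are collapsed here into a single assumed probability:

    import random

    # The hidden causative variables (pump status, blocked pipes, water
    # table, ...) are collapsed into one assumed success probability.
    P_WATER = 0.75  # hypothetical value, for illustration only

    def turn_handle():
        """One trial: return the indicator K (1 = water flows, 0 = it does not)."""
        return 1 if random.random() < P_WATER else 0

    print([turn_handle() for _ in range(5)])  # e.g., [1, 1, 0, 1, 1]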
In view of the uncertainty present in this system, how does one develop sufficient information to make a prediction? If we had no prior information at all — neither theoretical nor empirical — we might want to flip a coin, showing our total uncertainty, or lack of bias, about either possible outcome. Another approach is to conduct an experiment. Let K be a random variable denoting the arrival of water: if K = 0, water is absent, and if K = 1, water is present. Each turning of the faucet handle is viewed as a trial of the experiment, because we do not know beforehand whether or not the water will appear. Suppose that the faucet handle is tried 12 times, resulting in the following series of observations for K: {1,0,1,1,1,0,1,0,1,1,1,1}. Counting the ones indicates that the water flowed in 9 of the 12 cases, or 3/4 of the time. What should we predict for the 13th trial? The data we have collected suggest there may be a bias toward the presence of water, and our intuition tells us to predict a “success” on the 13th trial. Of course, this bias may have been merely the result of chance, and a different set of 12 trials might show a bias in the other direction. Each separate set of 12 trials is called a realization of this random process. If there are no dependencies between successive outcomes, and if the process does not change (i.e., remains “stationary”) during the experiment, then the techniques for dealing with Bernoulli processes (Chapter 4) provide a formal methodology for modeling processes of this kind.
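The counting argument in this paragraph is easy to reproduce. The sketch below is illustrative only (the Bernoulli-process machinery itself appears in Chapter 4); it estimates the success probability from the 12 observations and simulates another realization under the independence and stationarity assumptions just stated:

    import random

    # One realization of the process: the 12 observed trials.
    trials = [1, 0, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1]

    successes = sum(trials)          # 9 trials in which water appeared
    p_hat = successes / len(trials)  # empirical frequency = 0.75

    print(f"successes = {successes}, estimated P(water) = {p_hat:.2f}")

    # A second realization of the same (assumed stationary, independent)
    # process will generally differ, and may even lean the other way.
    another = [1 if random.random() < p_hat else 0 for _ in range(12)]
    print("simulated realization:", another)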
Suppose that we continue our experiment. The faucet is turned on 100 times, and we discover that the water appears in 75 of these trials. We wish to predict the outcome on the 101st trial. How much would we be willing to bet an opponent that the 101st trial will be successful? The information revealed from our experiment of the first 100 trials suggests a bias toward a successful outcome, and it is likely that an opponent, witnessing these outcomes, would not accept betting odds of 1:1. Rather, the bet might be set at odds of 3:1, the ratio of past successes to failures. The empirical information gained from our experiment has modified our future predictions about the behavior of this system, and we have created a model in our minds. If an additional number of trials now were undertaken, say 1,000, and if the basic process were to remain unchanged, we would not be surprised if water appeared in, say, 740 outcomes.
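The arithmetic behind the betting odds and the 1,000-trial figure can be checked with the binomial model; the calculation below is a plausibility check added here, not a computation from the text:

    import math

    p_hat = 75 / 100             # empirical success frequency after 100 trials
    odds = p_hat / (1 - p_hat)   # 3.0, i.e., betting odds of 3:1

    # If the process remains unchanged over 1,000 further trials:
    n = 1000
    expected = n * p_hat                     # 750 successes expected
    sd = math.sqrt(n * p_hat * (1 - p_hat))  # about 13.7

    # An outcome of 740 lies well within one standard deviation of 750,
    # so it would not be surprising.
    print(f"odds {odds:.0f}:1, expected {expected:.0f} +/- {sd:.1f}")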
The problem becomes more difficult if we are asked to compare several experiments — say, two or more different groups of 1,000 observations from the faucet — to determine if a change has occurred in the basic process. Such a comparison is a “trend” analysis, since it usually utilizes data from different time periods. Probabilistic concepts must be incorporated into such trend analyses to help assess whether a change is real or due to chance alone.
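One conventional way to make that assessment is a two-proportion z-test; the sketch below is a standard-statistics illustration, not necessarily the method this book develops:

    import math

    def two_proportion_z(s1, n1, s2, n2):
        """Test whether two success frequencies differ by more than chance
        alone would explain, assuming independent Bernoulli trials."""
        p1, p2 = s1 / n1, s2 / n2
        pooled = (s1 + s2) / (n1 + n2)
        se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
        z = (p1 - p2) / se
        p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided
        return z, p_value

    # Two hypothetical groups of 1,000 faucet trials from different periods:
    z, p = two_proportion_z(750, 1000, 700, 1000)
    print(f"z = {z:.2f}, two-sided p = {p:.4f}")  # z = 2.50, p = 0.0124

A small p-value suggests the shift from 750 to 700 successes reflects a real change in the process rather than chance variation between realizations.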
Our intuitive model of this process is derived purely from our empirical observations of its behavior. A model is an abstraction of reality that allows one to make predictions about how the real process will behave in the future. Obviously, our model will be more successful if it is based on a physical understanding of the characteristics of the process as well as its observed past behavior. The techniques described in this book provide formal procedures for constructing stochastic models of environmental processes, and these techniques are illustrated by applying them to examples of environmental problems.
Many of these techniques are new and have not been published elsewhere, while others rely on traditional approaches from the field of stochastic modeling. It is hoped that, by bridging several fields and presenting new theory, each technique will offer the reader fresh insight or a practical tool of genuine use.
STOCHASTIC PROCESSES IN THE ENVIRONMENT
The process described above is an extremely simple one. The environmental variables actually observed are the consequence of thousands of events, some of which may be poorly defined or imperfectly understood. For example, the concentration of a pesticide observed in a stream results from the combined influence of many complex factors, such as the amount of pesticide applied to crops in the area, the amount of pesticide deposited on the soil, irrigation, rainfall, seepage into the soil, the contours of the surrounding terrain, porosity of the soil, mixing and dilution as the pesticide travels to the stream, flow rates of adjoining tributaries, chemical reactions of the pesticide, and many other factors. These factors will change with time, and the quantity of pesticide observed in the stream also varies with time. Similarly, the concentrations of an air pollutant observed in a city often are influenced by hundreds or thousands of sources in the area, atmospheric variables (wind speed and direction, temperature, and atmospheric stability), mechanical mixing and dilution, chemical reactions in the atmosphere, interaction with physical surfaces or biological systems, and other phenomena. Even more complex are the factors that affect pollutants as they move through the food chain — from sources to soils, to plants, to animals, and to man — ultimately becoming deposited in human tissue or in body fluids. Despite the complexity of environmental phenomena, many of these processes share certain traits, and it is possible to model them stochastically. There is a growing awareness within the environmental community of the stochastic nature of environmental problems.
Ward and Loftis² note that, with the passage of the Clean Water Act (Public Law 92-500), water quality management expanded both its programs (permits and planning) and the money devoted to wastewater treatment plants. The data collected at fixed water quality monitoring stations* assumed a new role: to identify waters in violation of standards and to evaluate the effectiveness of expenditures of the taxpayers’ money. They conclude that water quality monitoring was expected to serve as a “feedback loop” by which to evaluate the effectiveness of regulatory programs. Unfortunately, unless the stochastic properties of these data are taken into account, the inherent randomness of these data will conceal real changes in environmental conditions:
When data, collected to check only if a sample meets a standard, are used to evaluate management’s success, the stochastic variation in the data often completely masks any improvement in controlling society’s impact on water quality. Since the data cannot show management’s effectiveness, the conclusion is that fixed station data are useless. However, the data are not useless: they are simply being asked to provide information they cannot show, without further statistical analysis.²
Standards in air and water pollution control usually are long-term goals that are well-suited to statistical formulation. Previous environmental standards often have been deterministic, however, perhaps because it was believed that probabilistic forms would complicate enforcement activities. Practically, it is impossible to design a regulatory program that can guarantee that any reasonable standard never will be violated, and there is a growing awareness that probabilistic concepts should be an integral part of the standard setting process. Ewing⁴ states that water quality standards should be formulated in a probabilistic manner:
The establishment of state water quality standards for both interstate and intrastate streams has recently been accomplished. In practically every case, DO [dissolved oxygen] requirements have been set without any reference to the probability of these levels being exceeded. It would seem that the state-of-the-art is rapidly advancing to the point, however, where the probabilistic concept should be recognized more specifically in the statement of the water quality standards themselves.
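A standard stated in this probabilistic form can be evaluated directly. The sketch below assumes, purely for illustration, lognormally distributed concentrations (the model treated in Chapter 9); the geometric mean and geometric standard deviation used are hypothetical:

    import math

    def prob_exceed(standard, gm, gsd):
        """Probability that a lognormally distributed concentration
        exceeds a standard, given geometric mean gm and geometric
        standard deviation gsd (both assumed here for illustration)."""
        z = (math.log(standard) - math.log(gm)) / math.log(gsd)
        return 0.5 * math.erfc(z / math.sqrt(2))  # upper-tail probability

    # Hypothetical stream: geometric mean 40, gsd 1.8, standard set at 80:
    print(f"P(exceed) = {prob_exceed(80, 40, 1.8):.3f}")  # about 0.119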
Drechsler and Nemetz⁵ believe that incorporation of probabilistic concepts places greater demands on the design of the monitoring program but will yield a more efficient and effective water pollution control program:
We recommend that, where appropriate, standards be altered to reflect the probability of occurrence of pollution events. This will require a greater degree of information concerning both the distribution of pollutant discharges and biological damage functions…. These measures will help overcome some of the significant weaknesses in the current regulatory system for the control of water pollution and will be more efficient and effective in the protection of both corporate and social interests.
In the air pollution field, significan...

Table of contents

  Cover
  Title Page
  Copyright Page
  Dedication
  Table of Contents
  1. RANDOM PROCESSES
  2. THEORY OF PROBABILITY
  3. PROBABILITY MODELS
  4. BERNOULLI PROCESSES
  5. POISSON PROCESSES
  6. DIFFUSION AND DISPERSION OF POLLUTANTS
  7. NORMAL PROCESSES
  8. DILUTION OF POLLUTANTS
  9. LOGNORMAL PROCESSES
  INDEX