Statistical Methods and the Geographer
eBook - ePub

  1. 240 pages
  2. English
  3. ePUB (mobile friendly)

About this book

First published in 1978. For the non-mathematician, even the simpler introductory books on statistics often raise considerable problems. This second edition makes several changes. First, some attention has been given to the problem of the transformation of data, in order to reinforce the appreciation of the need for normally-distributed data in so many techniques. Secondly, the use of probability paper, at least in simple terms, has been introduced to illustrate the ways in which the labour of probability assessments can be circumvented. Thirdly, the treatment of non-parametric testing has been radically changed and considerably expanded, to provide a more systematic approach to what is a most important group of possible techniques for geographers. Fourthly, change and expansion are also reflected in the sections on correlation and regression, including some simple consideration of curvilinear relationships and the presentation of computational techniques geared more to the use of desk calculators than to long-hand methods. Finally, the bibliography has been expanded to incorporate a wider range of books on techniques and a selection of research papers using such techniques in a geographical (or near-geographical) context.

By Stanley Gregory

Information

Publisher
Routledge
Year
2014
Print ISBN
9780367239879
eBook ISBN
9781317873105
Edition
1
Subtopic
Geography
Chapter 1

Characteristics of data

The methods and techniques used in the analysis of statistical data are in large measure controlled by the very character of the statistical data themselves. It is therefore necessary to begin with a very brief consideration of some of these characteristics so that the varied themes that will be introduced later will be more readily understood.
When any collection of data, representing some quantitative value of any given phenomenon, is to be processed it will be found that although such data all represent the same phenomenon they are not all of exactly the same value. Thus if a study were being made of the distance inland from the coast that vessels of a given draught could sail it would be found that these distances vary markedly between one river and another, or between one part of the world and another. Again, if the number of vessels sailing along these rivers were examined a very wide range in values would be found between the different rivers. This highly variable nature of the numerical data is common, to a greater or lesser extent, to all sets of data, and this quantity which varies (mileage, or numbers of vessels, in the two cases given above) is known as the variate, or sometimes as the variable.
Three broad sets of distinctions concerning such variates need to be borne in mind. Firstly, there is a range of possible types of units in terms of which data are expressed: nominal or classificatory; ordinal or ranking; interval; ratio.
(a) Nominal. This is a group of data which all too often in the past was assumed by geographers to preclude quantitative description or testing. However, it is a frequently occurring category of data in geography—the distinction of settlements into Celtic, Anglo-Saxon and Scandinavian origins; the classifying of soils into podzols, brown earths and rendzinas; the distinction of forest, grassland and heath vegetation complexes; the recognition of various tribal, racial or cultural groups; the functional divisions of towns or the land-use division of rural areas. None of these carries implications of quantity, nor even of relative order of magnitude; they simply refer to categories that are different from one another. Nevertheless, under sampling the various categories may occur with differing degrees of frequency, and these provide data in a form that can be analysed statistically.
(b) Ordinal. This is also a very common group of data in geography, in that the relative importance (or order of magnitude) of data may be known, even though their absolute values are not. In other words, the data can be ranked or put in order, either individually or in classes. Sometimes this reflects constraints that exist upon data collection, such that only rankings are known; in other cases, the use of data in ordinal form is a deliberate choice, even though other data forms could have been used.
(c) Interval. When not only is the order of magnitude known, but also the actual degree of magnitude as well, then an interval scale exists. This is characteristic of rainfall data, production values, population returns, and many other types of data of geographical relevance. In all these and similar cases, either exact measurements are made in some standard unit, or the occurrences of the phenomenon are counted.
(d) Ratio. In this fourth category, interval data have been converted into another form. For example, the number of persons in a given socio-economic group may be expressed as a proportion of the total population, or the number of persons voting for a particular party expressed as a percentage of the total electorate. Again, measured values may have been converted into an index, such as a pH value or an index of production. Such ratio values are often, but not invariably, characterized by finite upper and lower limits.
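The four types of unit described above can be sketched in a few lines of Python. This is only an illustrative sketch: every value below (the soil sample, the station rainfall totals, the voting figures) is invented for the example and does not come from the book, and "ratio" is used in the chapter's sense of an interval value converted into a proportion or index.

```python
from collections import Counter

# Hypothetical sample data, invented purely for illustration.
soil_types = ["podzol", "brown earth", "podzol", "rendzina", "podzol", "brown earth"]
rainfall_mm = {"A": 812.0, "B": 1430.5, "C": 655.2, "D": 990.0}
votes_for_party = 18_400
total_electorate = 46_000

# Nominal: categories carry no magnitude, but their frequencies can be counted.
soil_counts = Counter(soil_types)

# Ordinal: keep only the order of magnitude (rank 1 = wettest station).
ordered = sorted(rainfall_mm, key=rainfall_mm.get, reverse=True)
rainfall_rank = {station: rank for rank, station in enumerate(ordered, start=1)}

# Interval: the measured values themselves, so differences are meaningful.
difference_b_minus_c = rainfall_mm["B"] - rainfall_mm["C"]

# Ratio (in the chapter's sense): a count expressed as a percentage of a total.
vote_share_pct = 100 * votes_for_party / total_electorate

print(soil_counts)
print(rainfall_rank)
print(round(difference_b_minus_c, 1), vote_share_pct)
```

Note that the ranking step discards the actual rainfall amounts, which is exactly what distinguishes ordinal from interval data.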
Secondly, a distinction must be made between continuous and discrete variates. For example, in the case of the navigable mileage of rivers outlined above, it is possible for any mileage value to be recorded and for fractions of a mile to be included. In other words, it is a continuous variate such that there are no clear-cut or sharp breaks between the values that are possible. Such continuous variates occur with measured interval data, or with ratio data. On the other hand, the number of vessels actually sailing these rivers can only be in terms of whole numbers or integers, for fractions of vessels cannot be recorded. Such a variate is known as discrete, and special care must be taken when interpreting the results of the analysis of such discrete variates. Interval data based on the counting of occurrences fall into this category.
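One possible reason for the care needed with discrete variates, offered here as an assumption rather than as the book's own explanation, is that a summary value such as the mean may not be a value the variate can actually take. A minimal sketch with invented figures:

```python
from statistics import mean

# Hypothetical figures, invented for illustration only.
navigable_miles = [12.4, 87.25, 140.0, 5.75]  # continuous: fractions of a mile allowed
vessel_counts = [3, 0, 12, 7]                 # discrete: whole vessels only

mean_miles = mean(navigable_miles)
mean_vessels = mean(vessel_counts)

# A mean mileage is itself a possible mileage; a mean vessel count need not
# be a possible count, because fractions of vessels cannot be recorded.
mean_is_attainable = float(mean_vessels).is_integer()

print(mean_miles, mean_vessels, mean_is_attainable)
```

Here the mean vessel count is fractional, so it describes the set of rivers as a whole rather than any river that could be observed.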
A third distinction that must be made is between data for individual items and data that are grouped into classes or cells. The listing of each item separately is possible for all types of data units except the nominal category, which by definition implies the number of occurrences in a given class. The grouping of data can be effected for all types of data, whether this be because of the form in which data are made available, because of doubts concerning the precise accuracy of interval or ratio measurements, or for convenience in calculations or testing procedures. For example, economic or social data may be obtained from official bodies such as employment exchanges or government departments, which are often precluded by law from making individual values available. Thus the numbers of people employed by individual firms may vary from one to some high value, but data may be available only in a series of classes (1–50, 51–100, etc.). Again, the profitability or costs of certain operations may be defined by firms or farmers as high, medium and low, because they are unwilling to make actual values available. At other times, difficulties of measurement or recording may make the ordinal form of data more convenient than the interval form, as when classifying river-bed load as coarse, medium and fine, or slopes as steep, moderate and gentle, or soils as acid, neutral and alkaline. In all these cases, however, there exists some implicit underlying continuum in terms of magnitude, the discrete categories being merely a convenient division.
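The grouping of individual values into classes such as 1–50 and 51–100 can be sketched as follows. The function name `class_label`, the class width of 50 and the list of firm sizes are all hypothetical choices made for this illustration.

```python
from collections import Counter


def class_label(employees: int, width: int = 50) -> str:
    """Return the class (e.g. '1-50') that contains a positive employee count."""
    if employees < 1:
        raise ValueError("employee counts start at 1")
    # Classes run 1-50, 51-100, ... so shift by one before integer division.
    lower = ((employees - 1) // width) * width + 1
    return f"{lower}-{lower + width - 1}"


# Hypothetical numbers of people employed by individual firms.
employees_per_firm = [1, 37, 50, 51, 98, 100, 101, 240]

# Only the class frequencies survive; the individual values are no longer visible.
grouped = Counter(class_label(n) for n in employees_per_firm)

for label, count in grouped.items():
    print(label, count)
```

Once grouped, a firm with 51 employees and one with 100 are indistinguishable, which is the loss of detail that official sources accept in exchange for confidentiality.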
The variable nature of geographical data can best be understood and appreciated if the data are plotted graphically to show the frequency of occurrence of values of different given amounts. The data are first grouped into ‘classes’, so that it is known how many occurrences fall into each of a series of quantitatively different sets of conditions. Then the number of occurrences is plotted against the appropriate ‘class’, and a diagram drawn in the form of ‘building blocks’. Such a diagram is known as a histogram and the pattern which it presents is called the frequency distribution for that set of data. From such a diagram a smoothed curve can be interpolated, this being known as the ‘frequency curve’ of that set of data. Thus in Fig. 1 can be seen the frequency distribution for population densities of the European nation-states. The values for individual states are grouped into various classes depending on their order of magnitude (e.g. 0–49.9 persons per sq. km.; 50–99.9 persons per sq. km.), and the variable character of these population densities is readily apparent. The way in which these population densities vary is shown both by the ‘blocks’ and by the smoothed curve. A similar frequency distribution curve can be constructed for any and all sets of data. Figure 2, for example, shows the distribution of hill summit heights in North Wales based on summit ring-contours taken from the provisional edition of the O.S. 1:25,000 maps. As with the population densities, these summit heights are a continuous variate. Moreover, both Fig. 1 and Fig. 2 also display another feature of many distribution curves. It can be seen clearly that these curves are not symmetrical, having their peak markedly to one side. Such a distribution is known as skew, and the problems which this introduces, together with various methods by which these problems may be largely solved, w...
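The construction of a frequency distribution can be sketched as follows. The density values are invented for illustration and are not the data behind Fig. 1; the helper `frequency_table` and the class width of 50 are likewise assumptions. The comparison of mean and median at the end is a common rough indicator of skew, not a method taken from this excerpt.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical population densities (persons per sq. km.), invented for illustration.
densities = [14.0, 22.0, 48.0, 55.0, 61.0, 77.0, 83.0, 92.0,
             104.0, 118.0, 131.0, 190.0, 236.0, 342.0, 410.0]


def frequency_table(values, width=50.0):
    """Return (class lower bound, frequency) pairs, including empty classes."""
    counts = Counter(int(v // width) for v in values)
    return [(i * width, counts.get(i, 0)) for i in range(max(counts) + 1)]


table = frequency_table(densities)

# A text histogram: one 'building block' (#) per occurrence in each class.
for lower, freq in table:
    print(f"{lower:5.0f}-{lower + 49.9:5.1f} | {'#' * freq}")

# Rough indicator only: a mean well above the median suggests a peak
# towards the low values with a long tail towards the high values.
peak_to_the_low_side = mean(densities) > median(densities)
print(peak_to_the_low_side)
```

With these invented values the tallest block falls in the second class and the blocks thin out towards the high densities, giving the asymmetrical, skew shape described above.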

Table of contents

  1. Cover
  2. Half Title
  3. Title Page
  4. Copyright Page
  5. Dedication
  6. Table of Contents
  7. Acknowledgements
  8. Preface
  9. Introduction
  10. 1 Characteristics of data
  11. 2 Taking a sample
  12. 3 Measures of central tendency
  13. 4 Deviation and variability
  14. 5 The normal frequency distribution curve and its characteristics
  15. 6 Probability assessments
  16. 7 Sample characteristics and sampling error
  17. 8 The comparison of sample values—I (Non-parametric tests)
  18. 9 The comparison of sample values—II (Parametric tests)
  19. 10 Methods of correlation
  20. 11 Regression lines and confidence limits
  21. 12 Fluctuations and trends
  22. 13 The way ahead
  23. A selective bibliography
  24. Formulae index
  25. General index