Choosing and Using Statistics

A Biologist's Guide

Calvin Dytham

About this book

Choosing and Using Statistics remains an invaluable guide for students using a computer package to analyse data from research projects and practical class work. The text takes a pragmatic approach to statistics with a strong focus on what is actually needed. There are chapters giving useful advice on the basics of statistics and guidance on the presentation of data. The book is built around a key to selecting the correct statistical test and then gives clear guidance on how to carry out the test and interpret the output from four commonly used computer packages: SPSS, Minitab, Excel, and (new to this edition) the free program, R. Only the basics of formal statistics are described and the emphasis is on jargon-free English, but any unfamiliar words can be looked up in the extensive glossary. This new 3rd edition of Choosing and Using Statistics is a must for all students who use a computer package to apply statistics in practical and project work.

Features new to this edition:

Now features information on using the popular free program, R
Uses a simple key and flow chart to help you choose the right statistical test
Aimed at students using statistics for projects and in practical classes
Includes an extensive glossary and key to symbols to explain any statistical jargon
No previous knowledge of statistics is assumed

Information

Publisher: Wiley-Blackwell
Year: 2011
ISBN: 9781444348217
Subject: Biological Sciences
Subtopic: Biology

1
Eight steps to successful data analysis

This is a very simple sequence that, if you follow it, will integrate the statistics you use into the process of scientific investigation. As I make clear here, statistical tests should be considered very early in the process and not left until the end.

1 Decide what you are interested in.
2 Formulate a hypothesis or several hypotheses (see Chapters 2 and 3 for guidance).
3 Design the experiment, manipulation or sampling routine that will allow you to test the hypotheses (see Chapters 2 and 4 for some hints on how to go about this).
4 Collect dummy data (i.e. make up approximate values based on what you expect to obtain). The collection of ‘dummy data’ may seem strange but it will convert the proposed experimental design or sampling routine into something more tangible. The process can often expose flaws or weaknesses in the data-collection routine, and finding them at this stage can save a huge amount of time and effort.
5 Use the key presented in Chapter 3 to guide you towards the appropriate test or tests.
6 Carry out the test(s) using the dummy data. (Chapters 6–9 will show you how to input the data, use the statistical packages and interpret the output.)
7 If there are problems go back to step 3 (or 2); otherwise, proceed to the collection of real data.
8 Carry out the test(s) using the real data. Report the findings and/or return to step 2.
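
Steps 4 to 6 can be tried out before any real data exist. What follows is only a minimal sketch in Python (not one of the packages covered in this book), assuming a simple comparison of two groups. The function names, the group labels, the expected means and the minimum of five observations per group are all invented for illustration.

```python
import random
from collections import Counter

def make_dummy_data(expected_means, n_per_group, spread, seed=1):
    """Invent approximate values for each group before any real data exist."""
    rng = random.Random(seed)
    rows = []
    for group, centre in expected_means.items():
        for _ in range(n_per_group):
            rows.append((group, round(rng.gauss(centre, spread), 1)))
    return rows

def check_design(rows, min_per_group=5):
    """Return a list of design problems that the dummy data make visible."""
    counts = Counter(group for group, _ in rows)
    problems = []
    if len(counts) < 2:
        problems.append("fewer than two groups to compare")
    if len(set(counts.values())) > 1:
        problems.append("unbalanced groups: " + str(dict(counts)))
    for group, n in counts.items():
        if n < min_per_group:
            problems.append(f"only {n} observations in group {group!r}")
    return problems

# Hypothetical expectations: two cultivars, eight bulbs each, weights in grams.
dummy = make_dummy_data({"cultivar A": 50.0, "cultivar B": 55.0},
                        n_per_group=8, spread=6.0)
print(check_design(dummy))  # an empty list means no problem was spotted
```

If the list of problems is not empty, that is the cue to go back to step 3 before any real observations are collected.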

I implore you to use this sequence. I have seen countless students who have spent a long time and a lot of effort collecting data only to find that the experimental or sampling design was not quite right. The test they are forced to use is much less powerful than one they could have used with only a slight change in the experimental design. This sort of experience tends to turn people away from statistics and leaves them ‘scared’ of the subject. This is a great shame as statistics are a hugely useful and vital tool in science.

The rest of the book follows this eight-step process, but you should also use the book for guidance and advice whenever you become unsure of what to do.

2
The basics

The aim of this chapter is to introduce, in rather broad terms, some of the recurring concepts of data collection and analysis. Everything introduced here is covered at greater length in later chapters and certainly in the many statistics textbooks that aim to introduce statistical theory and experimental design to scientists.

The key to statistical tests in the next chapter assumes that you are familiar with most of the basic concepts introduced here.

Observations

These are the raw material of statistics and can include anything recorded as part of an investigation. They can be on any scale from a simple ‘raining or not raining’ dichotomy to a very sophisticated and precise analysis of nutrient concentrations. The type of observations recorded will have a great bearing on the type of statistical tests that are appropriate.

Observations can be simply divided into three types: categorical, where the observations can be in a limited number of categories which have no obvious scale (e.g. ‘oak’, ‘ash’, ‘elm’); discrete, where there is a real scale but not all values are possible (e.g. ‘number of eggs in a nest’ or ‘number of species in a sample’); and continuous, where any value is theoretically possible, restricted only by the measuring device (e.g. lengths, concentrations).
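
The three types can be told apart roughly by how the values are recorded. The following Python sketch is a heuristic only, and the function name is made up for illustration. It assumes that categories are stored as text, counts as whole numbers and measurements as decimal numbers, so categories coded as numbers would be misclassified.

```python
def observation_type(values):
    """Guess the type of a set of observations from how they are stored."""
    def is_number(v):
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(v, (int, float)) and not isinstance(v, bool)

    if all(isinstance(v, str) for v in values):
        return "categorical"
    if all(isinstance(v, int) and not isinstance(v, bool) for v in values):
        return "discrete"
    if all(is_number(v) for v in values):
        return "continuous"
    raise ValueError("mixed or unsupported observations")

print(observation_type(["oak", "ash", "elm"]))  # tree species
print(observation_type([3, 4, 0, 5]))           # eggs in a nest
print(observation_type([12.5, 9.8, 11.1]))      # lengths
```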

Different types of observations are considered in more detail in Chapter 5.

Hypothesis testing

The cornerstone of scientific analysis is hypothesis testing. The concept is rather simple: almost every time a statistical test is carried out it is assessing how probable the observed data would be if a hypothesis were correct. If that probability is small then the hypothesis is deemed to be untrue and it is rejected in favour of an alternative. This is done in what seems to be a rather upside-down way, as the test is always of what is called the null hypothesis rather than the more interesting hypothesis. The null hypothesis is the hypothesis that nothing is going on (it is often labelled as H₀). For example, if the weights of bulbs for two cultivars of daffodils were being investigated, the null hypothesis would be that there is no weight difference between cultivars: ‘the weights of the two groups of bulbs are the same’ or, more correctly, ‘the two groups of bulbs are samples from a larger population with the same distribution’. A statistical test is carried out to find out how consistent the data are with that null hypothesis. If we decide to reject the null hypothesis we accept the alternative, more interesting, hypothesis (H₁) that: ‘the weights of bulbs for the two cultivars are different’ or, more correctly, that ‘the groups are samples from populations with different distributions’.
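
The daffodil example can be made concrete with a permutation test. This is one simple way of testing the null hypothesis that both groups come from the same distribution, and is not necessarily the test the key in Chapter 3 would select. The sketch below is in Python, the bulb weights are made up, and the function name and the number of shuffles are arbitrary choices.

```python
import random
from statistics import mean

def permutation_p_value(group_a, group_b, n_shuffles=5000, seed=0):
    """Estimate the P-value for H0: both groups come from one distribution."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    as_extreme = 0
    for _ in range(n_shuffles):
        # Under H0 the group labels are arbitrary, so reshuffle them.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            as_extreme += 1
    # Add one to numerator and denominator so the estimate is never exactly zero.
    return (as_extreme + 1) / (n_shuffles + 1)

# Made-up bulb weights in grams (not real data).
cultivar_a = [48.2, 51.5, 46.9, 53.0, 49.7, 50.8]
cultivar_b = [55.1, 58.4, 54.0, 57.2, 59.3, 56.6]
p = permutation_p_value(cultivar_a, cultivar_b)
print(p)
```

A small value of p says that a difference in mean weight this large would rarely arise if the cultivar labels were meaningless, which is the basis for rejecting H₀.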

P-values

The P-value is the bottom line of most statistical tests. (Incidentally, you may come across it written in upper or lower case, italic or not: e.g. P value, P-value, p value or p-value.) It is the probability of seeing data this extreme or more extreme if the null hypothesis is true. So if a P-value is given as 0.06 it indicates that you have a 6% chance of seeing data this extreme or more extreme if the null hypothesis is true. In biology it is usual to take a value of 0.05 or 5% as the critical level for the rejection of a hypothesis. This means that we reject a hypothesis when data this extreme or more extreme would be seen less than one time in 20 if it were true. As it is the null hypothesis that is nearly always being tested, we are usually looking for low P-values to reject this hypothesis and accept the more interesting alternative hypothesis.

Clearly the smaller the P-value the more confident we can be in the conclusions drawn from it. A P-value of 0.0001 indicates that if the null hypothesis is true the chance of seeing data as extreme or more extreme than that being tested is one in 10 000. This is much more convincing than a marginal P = 0.049.
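
The phrase ‘this extreme or more extreme’ can be worked through exactly in a simple case. The Python sketch below assumes a made-up example that is not from this book: 16 females among 20 offspring, with the null hypothesis that the sex ratio is 1:1. The function name is invented for illustration.

```python
from math import comb

def two_sided_binomial_p(successes, trials):
    """Exact P-value for H0: each trial succeeds with probability 0.5."""
    observed_distance = abs(successes - trials / 2)
    # Count every outcome at least as far from the expected half as the data.
    extreme_outcomes = sum(
        comb(trials, k)
        for k in range(trials + 1)
        if abs(k - trials / 2) >= observed_distance
    )
    # Under H0 each of the 2**trials sequences is equally likely.
    return extreme_outcomes / 2 ** trials

p_16 = two_sided_binomial_p(16, 20)  # below the usual 0.05 level
p_14 = two_sided_binomial_p(14, 20)  # above the usual 0.05 level
print(p_16, p_14)
```

Both results lean towards females, but only the more extreme one would lead to rejection of the null hypothesis at the 5% level.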

P-values and the types of errors that are implicitly accepted by their use are considered further in Chapter 4.

Sampling

Observations have to be collected in some way. This process of data acquisition is called sampling. Although there are almost as many different methods that can be used for sampling as there are possible things to sample, there are some general rules. One of the most obvious is that a large number of observations is usually better than a small number. Balanced sampling is also important (i.e. when comparing two groups take the same number of observations from each group).

Most statistical tests assume that samples are taken at random. This sounds easy but is actually quite difficult to achieve. For example, if you are sampling beetles from pitfall traps the sample may seem totally random but in fact is quite biased towards those species that move around the most and fail to avoid the traps. Another common bias is to choose a point at random and then measure the nearest individual to that point, assuming that this will produce a random sample. It will not be random at all as isolated individuals and those at the edges of clumps are more likely to be selected than those in the middle. There are methods available to reduce problems associated with non-random sampling but the first step is to be aware of the problem.
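
The bias of the ‘nearest individual to a random point’ method is easy to demonstrate by simulation. The Python sketch below uses an entirely hypothetical layout: nine individuals in a tight clump and one isolated individual in a unit square. The function name, the coordinates and the number of random points are assumptions made for illustration.

```python
import random
from collections import Counter

def nearest_individual_counts(positions, n_points=20000, seed=0):
    """Count how often each individual is the nearest to a random point."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_points):
        x, y = rng.random(), rng.random()  # random point in the unit square
        nearest = min(
            range(len(positions)),
            key=lambda i: (positions[i][0] - x) ** 2 + (positions[i][1] - y) ** 2,
        )
        counts[nearest] += 1
    return counts

# Hypothetical layout: a 3 x 3 clump near one corner plus one isolated individual.
clump = [(0.10 + 0.02 * i, 0.10 + 0.02 * j) for i in range(3) for j in range(3)]
isolated = (0.80, 0.80)
positions = clump + [isolated]

counts = nearest_individual_counts(positions)
total = sum(counts.values())
share_isolated = counts[len(positions) - 1] / total
share_centre = counts[4] / total  # index 4 is the middle of the clump
print(share_isolated, share_centre, 1 / len(positions))
```

With truly random sampling each of the ten individuals would be chosen about one time in ten. Here the isolated individual is chosen far more often than that and the individual in the middle of the clump hardly at all.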

A further assumption of sampling is that individuals are either only measured once or they are all sampled on several occasions. This assumption is often violated if, for example, the same site is visited on two occasions and the same individuals or clones are inadvertently remeasured.

The sets of observations collected are called variables. A variable can be almost anything it is possible to record as long as different individuals can be assigned different values.

Some of the problems of sampling are considered in Chapter 4.

Experiments

In biology many investigations use experiments of some sort. An experiment occurs when anything is altered or controlled by the investigator. For example, an investigation into the effect of fertilizer on plant growth will use a control plot (or several control plots) where there is no fertilizer added and then one or more plots where fertilizer has been added at known concentrations set by the investigators. In this way the effect of fertilizer can be determined by comparison of the different concentrations of fertilizer. The condition being controlled (e.g. fertilizer) is usually called a factor and the different levels used called treatments or factor levels (e.g. concentrations of fertilizer). The design of this experiment will be determined by the hypothesis or hypotheses being investigated. If the effect of the fertilizer on a particular plant is of interest then perhaps a range of different soil types might be used with and without fertilizer. If the effect on plants in general is of interest then an experiment using a variety of plants...