S. Kamal ( Institute of Statistics, University of the Punjab, Lahore. )

S. Zaman ( Department of Social and Preventive Paediatrics, King Edward Medical College, Lahore. )

#### April 1995, Volume 45, Issue 4

### Practical Epidemiology and Biostatistics in Research

### Abstract

A soundly planned study may well lead to findings that are of wide scientific application and interest. This paper is intended to provide a simple and systematic guideline pertinent to the design, analysis and interpretation of studies, especially in health care. It emphasizes the importance of ‘Statistics’ in the design, conduct, analysis and interpretation of studies. It is pointed out that all the stages of a research study are vulnerable to statistical mismanagement. The most important thing is to balance the interests of the individuals in the study with those of the much larger number who may benefit in the long run. Concerning the proper use of statistics, the following are recommended: (a) statistical advice is needed at the planning stage of a project and not at its end; (b) statistical advice may be sought at almost every stage of a project; (c) the logic and correctness of the argument should be carefully examined before interpreting the results; and (d) finally, before any work is published, the opinion of an expert statistician must be sought, so that erroneous conclusions do not become enshrined as the truth.

### Introduction

In research, the purpose of collecting information is to provide facts. These facts in turn lead to policy formation. If the facts are true, policies can be upheld. The facts depend on a sound study design and its interpretation. The closer the attention to minor details, the more successful the study can be. The basic aim of this article, therefore, is to provide simple and practical outlines for designing a study, i.e., planning, conducting and analyzing it, especially in the medical field. The importance of ‘statistics’ at various stages of research is also described.

**The guidelines relating to the research studies mainly consist of the following steps:**

(a) Identifying objectives

(b) Hypothesis formulation

(c) Choosing a study design

(d) Planning of methods

**i) Study population**

Selection and definition

Sampling

Size

**ii) Variables**

Selection

Definition

Scales of measurement

**iii) Methods of data collection**

(e) Data recording and data processing

(f) Data analysis

(g) Presentation and interpretation of results

(h) Report writing/publication.

The above guidelines, specifically the importance of statistics at each stage, will be discussed briefly in the sections that follow. The main emphasis is on how the objectives of the study are defined and what procedures are used to attain them. However, such guidelines cannot be a set of rules, but rather advice.

**a) Identifying objectives**

This is the first and foremost step, in which the researcher has to identify the problem and clarify why he is studying it. For example, does he want to obtain information to provide a basis for the utilization of resources (counting the number of diarrhoeal patients in order to request more rehydration salts, etc.), or to identify persons at risk of a disease (bottle-fed infants having more diarrhoea than breast-fed infants), or to study the effect of an intervention policy (effect of immunization on the incidence of diseases), or to follow a population or disease over a period of time (outcome or incidence of disease), or to educate the public on defined aspects of health (health education on using ORS to prevent dehydration in diarrhoea), etc.?

The identification of the problem will only be justified or clarified further when in-depth knowledge of the particular problem is obtained. This can be done through libraries or personal communications. In most instances, this is straightforward and the objectives can be reasonably well defined. Sometimes this may not be possible. Broader study objectives can then be formulated and tested.

**b) Hypothesis formulation**

The better defined the objectives are, the easier it becomes to formulate the hypothesis. Precisely what information the researcher wishes the study to yield is the question to be asked by the investigator. His decision will determine the further planning of the study. For example, suppose he wants to identify a causative factor for a disease outcome. He will need to know about the disease, the burden of the disease, the possible causative factors, confounders, etc., which may affect the interpretation of the results. Any preconceived ideas about the study on the part of the researcher can introduce bias in the findings. Planning may then have to be done accordingly. Geitgey and Metz^{1} have expressed this as “If necessity is the mother of invention, the awareness of problem is the mother of research”.

**c) Choosing a study design**

The choice of a study design will depend upon the objective(s) of the study. The question asked will point towards an appropriate study design, e.g., descriptive studies, either cross-sectional or longitudinal, or case-control studies; evaluative studies; intervention studies; experimental studies; trials, etc. The hypothesis that ‘influenza during pregnancy leads to congenital anomalies’ can be studied retrospectively by comparing the illness history of mothers of malformed children with that of mothers of normal children, or prospectively by following pregnant mothers and comparing the number of malformed children born to mothers with and without influenza.

**d) Planning of methods**

The value of any investigation depends on sound planning. This may necessitate a considerable amount of effort, and it is therefore possible that this phase may take more time than the actual study itself. The better the techniques of investigation, the greater the prospects of obtaining useful and reproducible results. The planning phase mainly consists of:

1. The study sample

2. Sample size

3. Type of statistical design - treatment allocation method etc.

4. Methods of data collection

5. Methods of data recording and processing

In the planning stage of a study, it is very important to seek the expert advice of a statistician. If this is not practicable, the investigator should have a sound understanding of the statistical techniques or methods he employs. Statistical techniques play an extremely important role in planning a good study that will yield productive findings. Generally, there is a tendency to treat the statistician as a ‘Magic Box’ into which one can feed the data at the end of the study and which, after a period of arcane deliberation, will eject a result^{2}. This is not advised, since it is not productive in the end. Good subsequent analysis depends on the proper design and planning of a study, which is not possible without taking statistical techniques into consideration. No statistician, however clever the analysis, will be able to compensate for major flaws in the design. As the saying goes, “Garbage in, garbage out”. However, almost every study is constrained by practical difficulties, oversights and accidents resulting in methodological imperfections. The important thing is that the investigator should be aware of these difficulties, examine their impact on the results and take them into account in the interpretation of the findings.

**(i) Study population**

**Sampling methods**

A properly drawn sample will enable the results of the study to yield valid conclusions. A sample which is chosen in a haphazard way, or because it is handy to collect, is unlikely to be representative and will invalidate the results of the study. Such samples, in any field of investigation and especially in medical research, may produce incorrect results. The recommended method of sampling is probability sampling^{3}. In probability sampling each individual unit in the population has a known probability of being selected. Some of the probability sampling methods are random sampling, stratified sampling, systematic sampling, cluster sampling, etc. Sometimes purposive sampling or quota sampling (non-probability sampling) is also done. The decision to do such sampling is based on some specific reasons, for example the view that such a sample is more representative of the population. The most important thing to note is that whatever type of sampling is done, the rules to be followed must be decided in advance, so that bias is avoided. A statistician should be consulted at the planning phase in order to avoid flaws which may emerge later on.
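
Three of the probability sampling methods named above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sampling frame of 200 patient IDs, the urban/rural strata and the function names are hypothetical and do not come from any particular study.

```python
import random

def simple_random_sample(units, n, seed=0):
    """Simple random sampling: every unit has the same chance n/N of selection."""
    rng = random.Random(seed)
    return rng.sample(units, n)

def systematic_sample(units, n, seed=0):
    """Systematic sampling: random start, then every k-th unit (k = N // n)."""
    rng = random.Random(seed)
    k = len(units) // n
    start = rng.randrange(k)
    return [units[start + i * k] for i in range(n)]

def stratified_sample(strata, fraction, seed=0):
    """Stratified sampling: the same sampling fraction is drawn in each stratum."""
    rng = random.Random(seed)
    sample = {}
    for name, members in strata.items():
        size = max(1, round(len(members) * fraction))
        sample[name] = rng.sample(members, size)
    return sample

# Hypothetical sampling frame: 200 patient IDs, split into two strata.
patients = list(range(1, 201))
strata = {"urban": patients[:120], "rural": patients[120:]}

srs = simple_random_sample(patients, 20)
sys_s = systematic_sample(patients, 20)
strat = stratified_sample(strata, 0.10)
print(len(srs), len(sys_s), {k: len(v) for k, v in strat.items()})
```

Fixing the seed and the rule (`n`, `fraction`) before any data are seen reflects the point made above: the selection rules are decided in advance, so the investigator cannot influence who enters the sample.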

**Sample size**

Sample size estimation is another important factor in the success of any study. This is the stage where a statistician can give the most important advice, so that the study has a reasonable chance of achieving a significant result. Some researchers do consult a statistician at this point, whereas others give insufficient attention to this important factor and choose the most convenient number, such as 20, 30, 50 or 100, or a convenient time period, say one month or one year, for the study. Researchers who neglect sample size should realize that there are important statistical implications hidden in the choice. For example, if a sample is too small it may be impossible to generalize the results to the population with sufficient precision and confidence, or to obtain statistically significant results when testing associations. Similarly, an over-large sample may be deemed unethical, as it wastes resources which are almost always scarce, and the unnecessary involvement of extra subjects is not advisable. Moreover, large samples may over-emphasize a trivial difference, which might sometimes lead to false conclusions. Sackett^{4} expressed the problem of sample size as “Samples which are too small can prove nothing; samples which are too large can prove anything”. In the medical literature it is common to find studies with too small a sample size^{5,6} and probably very rare to find over-large ones. The ethical implications of inappropriate sample size are rarely considered by researchers^{7}.

A statistician can help in making an important decision regarding the sample size of a study. There is a mathematical relationship between ‘alpha’ (the probability of rejecting the null hypothesis when it is actually true), ‘beta’ (the probability of accepting the null hypothesis when it is false), ‘delta’ (the difference between the treatments that the study should detect) and ‘n’ (sample size). By fixing three of these factors the fourth can be computed. Hence, a statistician can make a reasonably good estimate of the appropriate sample size, provided he has some information regarding how precise the investigator wants his estimate to be (i.e., the margin of error), what confidence level he requires, and what risk he is ready to take that the actual error is larger than this margin^{8,9}. It should be kept in mind that in reality other factors, such as the objectives of the study, the design of the study, time constraints, resources available, etc., also play their part and should be taken into account while determining the sample size. In studies where one of the objectives is to compare one sample with another, it is better to keep the sample sizes equal, though this is not essential^{10}.
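
As an illustration of how fixing three of these quantities determines the fourth, the sketch below uses the common normal-approximation formula for comparing the means of two equal-sized groups, n = 2(z₁₋α/₂ + z₁₋β)²σ²/δ² per group. The formula choice, the standard deviation `sigma` (which must be supplied from pilot data or the literature) and the function name `n_per_group` are assumptions made for this example; other designs need other formulas.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(delta, sigma, alpha=0.05, beta=0.20):
    """Approximate sample size per group for a two-sided comparison of two means.

    delta : smallest difference worth detecting
    sigma : assumed common standard deviation (an input, not estimated here)
    alpha : probability of a type I error
    beta  : probability of a type II error (power = 1 - beta)
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(1 - beta)
    return ceil(2 * ((z_alpha + z_beta) * sigma / delta) ** 2)

# Halving the difference to be detected roughly quadruples the required size.
print(n_per_group(delta=10, sigma=10))  # 16 per group
print(n_per_group(delta=5, sigma=10))   # 63 per group
```

The example makes the trade-off concrete: a study that can only recruit about 16 subjects per group has a reasonable chance of detecting a difference of one standard deviation, but little chance of detecting a difference half that size.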

**Design**

Another important factor in the success of any research study is the design of the study. There is no one best design for all medical studies. The choice of study design depends on the nature of the study. However, whatever design is selected, it should clearly describe the following points:

(a) Relevant information on treatment allocations

(b) Nature of the treatments

(c) Sample selection

(d) If and how randomization will be done

(e) Whether or not the study will be ‘blind’ in any way

(f) The response time

One of the objectives is that the research should provide useful results, and this may often be achieved best by a randomized study^{11}. May^{12} has expressed the importance of the design of a study as: “A poorly designed or poorly conceived experiment is unethical by definition and should not be permitted; ensure that the conception and design meet the accepted canons of scientific method because we are dealing with experimentation which may not be for the individual subjects’ direct benefit”.
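
One way of deciding in advance "if and how randomization will be done" is permuted-block randomization, sketched below. The article does not prescribe this particular scheme; it is shown here only as one common option, and the treatment labels, block size and function name are hypothetical.

```python
import random

def block_randomize(n_subjects, treatments=("A", "B"), block_size=4, seed=0):
    """Allocate subjects to treatments in randomly permuted, balanced blocks."""
    if block_size % len(treatments) != 0:
        raise ValueError("block_size must be a multiple of the number of treatments")
    rng = random.Random(seed)
    per_block = block_size // len(treatments)
    allocation = []
    while len(allocation) < n_subjects:
        block = list(treatments) * per_block  # e.g. A, B, A, B
        rng.shuffle(block)                    # random order within the block
        allocation.extend(block)
    return allocation[:n_subjects]

allocation = block_randomize(20)
print(allocation)
print(allocation.count("A"), allocation.count("B"))
```

Because every complete block contains each treatment equally often, the groups stay balanced throughout recruitment, while the order within a block remains unpredictable to the person enrolling subjects.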

**(ii) The variables**

The characteristics that are measured are referred to as variables. The variables can be quantitative (e.g., weight, height or age) or qualitative (e.g., sex, smoker or non-smoker or presence or absence of a disease).

**Selection**

The objective of any investigation plays a significant role in the selection of the variables. The more specific the formulation of the objective, the greater the number of variables that will have to be selected. Apart from the variables which have an obvious relationship to the study objectives, consideration should also be given to the following three types of variables:

1. Universal variables e.g., sex, age, ethnic group, marital status, social class, religion, place of residence, geographical mobility (i.e., native or immigrant) etc.

2. Measures of time, e.g., in a longitudinal investigation it may be necessary to record the dates on which the subject entered and left the study.

3. Variables relating to the study population or populations, i.e., the characteristics of the study population, which may indicate the extent to which generalizations may be made from the findings.

**Number:** The answer to the question of how many variables should be studied depends on many things, e.g., the objectives of the study, resources, etc. Generally the initial list of variables is too long, but factors like data collection and processing, resources, etc. play a significant role in reducing this list.

**Definition**

The selected variables should be properly defined and clarified. When defining the variables two things should be considered. First, the variable should be defined in terms of objectively measurable facts, stating, if necessary, how these facts are to be obtained. Secondly, the scale of measurement to be used in data collection should be specified, e.g., for the variable ‘social class’: (Definition) father’s occupational grade, using a standard grading scheme; (Scale) Social class I, Social class II, ..., Social class V.

**Scales of measurement**

In order to clarify each of the variables to be studied, it is necessary to specify its scale of measurement. The selection of this scale depends partly on the variable itself and partly on the method available for measuring it. The planned data analysis also plays a significant role in determining the scale of measurement of a variable. The following should be kept in mind when deciding on a satisfactory scale: the scale of measurement should be appropriate, practicable and powerful, and its categories should be clearly defined, sufficient in number, comprehensive and mutually exclusive.

**There are four types of scales:**

Nominal: Consists of two or more categories that are mutually exclusive e.g., absent and present, male and female, yes and no, single, married, widowed and divorced etc.

Ordinal: The categories are ranked (i.e., they are like the positions on a ladder), e.g., social class: I, II, III, IV and V. Education: 0, <10, 11-14, more than 14.

Interval scale: The intervals between classes are equal, e.g., temperature, measured in degrees.

Ratio scale: e.g., weight, height, income, etc.

**(iii) Data collection methods**

Another important thing to decide during the planning phase of the study is how the required information will be collected. Even a well-designed study can go wrong in the process of data collection, e.g., the Lanarkshire milk experiment^{13}. There are various methods of collecting data, viz. observation (e.g., clinical examinations), interviews, self-administered questionnaires and secondary data collection (i.e., using documentary sources, e.g., published statistics, census, etc.). The method of data collection is decided by the investigator on the basis of the nature and accuracy of the information required for a particular variable in the study. The choice is also constrained by practical considerations, e.g., the time span available for the study, limitation of resources both financial and human, technical skills available, etc.

**e) Data recording and data processing**

Before the analysis of any data, it is recommended to carry out some degree of data screening, i.e., checking the plausibility of the recorded values, as it is usually impossible to know whether the recorded information is correct. Data screening or data cleaning means checking the collected information for each variable (i.e., that the recorded values are within reasonable limits). Wherever possible cross-checking should be performed. For instance, observations such as a fifteen-year-old woman having six children, or a woman giving birth to another baby within six months of her last pregnancy, are inconsistent. It has been pointed out in the literature^{14} that a lot can be learnt about the data by an initial close examination, i.e., by simple frequency tables, cross-tabulations, scatter diagrams and histograms. Moreover, close checking of the data may reveal outliers, missing values for a variable and the need to transform a variable (e.g., in the case of data with large variability). Investigators and researchers usually give little attention to the sensitive issue of data screening, even though it may have major implications for the results and hence can put the validity of the study in question. This attitude goes unchecked because of the reliance on computers and calculators, which are machines and cannot examine the data critically.
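
The range checks and cross-checks described above can be written down as explicit rules before the data arrive. The sketch below is a minimal illustration: the field names and the numerical limits (plausible ages of 10 to 60, at most one birth per year from age 13) are assumptions chosen for the example, and real limits would be set with subject-matter knowledge.

```python
def screen_record(record):
    """Return a list of plausibility problems found in one record."""
    problems = []
    age = record.get("age_years")
    children = record.get("n_children")
    gap = record.get("months_since_last_birth")

    # Missing values are reported rather than silently dropped.
    for name, value in (("age_years", age), ("n_children", children)):
        if value is None:
            problems.append(f"{name} is missing")
    if problems:
        return problems

    # Range check (illustrative limits).
    if not 10 <= age <= 60:
        problems.append("age_years outside 10-60")
    # Cross-check: assume at most one birth per year from age 13 onwards.
    if children > max(0, age - 13):
        problems.append("n_children implausible for age")
    # Cross-check: a new birth within six months of the previous one.
    if gap is not None and gap < 6:
        problems.append("birth interval under 6 months")
    return problems

# Hypothetical records, including the two inconsistencies mentioned in the text.
records = [
    {"id": 1, "age_years": 28, "n_children": 2, "months_since_last_birth": 30},
    {"id": 2, "age_years": 15, "n_children": 6, "months_since_last_birth": 14},
    {"id": 3, "age_years": 31, "n_children": 3, "months_since_last_birth": 5},
    {"id": 4, "age_years": None, "n_children": 1, "months_since_last_birth": None},
]
flagged = {r["id"]: screen_record(r) for r in records if screen_record(r)}
for rec_id, issues in flagged.items():
    print(rec_id, issues)
```

Flagged records are listed for review rather than corrected automatically, since only someone who knows how the data were collected can decide whether a value is an error or a genuine outlier.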

**f) Data analysis**

Data analysis is a tricky phase of the study. A lot depends on the results of the study. Therefore, the data set should be analyzed very carefully and judiciously by the investigator. It is recommended that the study data be analyzed by an experienced statistician who has been involved in the study from the early stage of its planning, or that the analysis be done under his guidance. The involvement of a statistician at the early stage of planning also helps the investigator to decide in broad outline how the data of the proposed study will be analyzed. It is often helpful to draw up a number of specimen skeleton tables for the various variables, indicating their scales. This sort of planning not only helps in overcoming gaps and defects in data collection but also shows whether the study objectives will be fulfilled. The literature^{15,16} is full of reviews indicating many common errors in published papers, especially in the use of statistical methods; for example, regression methods, the t-test, the chi-square test, etc. are applied without sufficient attention to the assumptions required for their valid use. Such inappropriate use of statistical methods is as bad as the misuse of any laboratory equipment or technique. Both may lead to incorrect findings and invalid conclusions, thus making the research worthless. Some of the common errors are enumerated below:

**1. Use of t-test**

People often use Student’s t-test incorrectly to compare two groups of measurements. The t-test is based on certain assumptions, and if a data set seriously violates these assumptions, the application of the t-test becomes incorrect. In many cases it is not checked before applying the t-test whether the two sets of data come from normal populations and have the same variance. Another source of error is to ignore whether the two sets of measurements relate to the same individuals, i.e., matched pairs. In such cases, the paired t-test should be used. White^{17} has discussed these problems widely in his paper.
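
The difference between the two tests can be seen by computing both statistics on the same matched data. The sketch below uses only the Python standard library and computes test statistics, not P-values; the before/after measurements on eight subjects are invented for illustration.

```python
from math import sqrt
from statistics import mean, stdev, variance

def unpaired_t(x, y):
    """Student's t for two independent groups, using the pooled variance.

    Valid only if both groups come from normal populations with equal variances.
    """
    nx, ny = len(x), len(y)
    pooled = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(x) - mean(y)) / sqrt(pooled * (1 / nx + 1 / ny))

def paired_t(x, y):
    """Paired t: a one-sample t-test on the within-subject differences."""
    d = [a - b for a, b in zip(x, y)]
    return mean(d) / (stdev(d) / sqrt(len(d)))

# Hypothetical measurements on the same eight subjects, before and after treatment.
before = [140, 152, 138, 160, 147, 155, 149, 143]
after = [135, 147, 136, 154, 143, 150, 146, 139]

t_unpaired = unpaired_t(before, after)  # wrongly treats the groups as independent
t_paired = paired_t(before, after)      # respects the matched-pairs design
print(round(t_unpaired, 2), round(t_paired, 2))
```

With these data the unpaired statistic is about 1.19 (below the two-sided 5% critical value of 2.145 for 14 degrees of freedom), whereas the paired statistic is about 9.38 (well above the critical value of 2.365 for 7 degrees of freedom). The between-subject variation hides a consistent within-subject change when the pairing is ignored.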

**2. Use of Chi-square test**

Another test which is widely used is the chi-square test to compare proportions. The main problem in the application of the chi-square test arises when any cell frequency has too few observations (usually <5); or in comparing observed numbers with expected numbers when the sample size is too small; or when the data do not follow a normal distribution. The misuse of the chi-square test in such cases may lead to false conclusions.
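
A simple safeguard is to compute the expected cell counts and check them before trusting the test. The sketch below does this for a contingency table given as a list of rows; the two tables are hypothetical, and the threshold of 5 is applied here to the expected counts, which is the usual form of the rule of thumb.

```python
def expected_counts(table):
    """Expected cell counts under independence of rows and columns."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]

def chi_square(table):
    """Pearson chi-square statistic (no continuity correction)."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )

def small_cells(table, threshold=5):
    """Positions (row, column) whose expected count is below the threshold."""
    return [
        (i, j)
        for i, row in enumerate(expected_counts(table))
        for j, e in enumerate(row)
        if e < threshold
    ]

large = [[12, 8], [5, 15]]  # hypothetical 2x2 table with adequate counts
small = [[3, 1], [1, 4]]    # hypothetical 2x2 table with sparse counts

print(round(chi_square(large), 3), small_cells(large))
print(small_cells(small))
```

For the sparse table every expected count is below 5, so the chi-square approximation should not be relied on; an exact method such as Fisher's exact test is the usual alternative in that situation.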

**3. Correlation and regression**

Closely related but conceptually very much different are correlation analysis and regression analysis. The correlation coefficient is a measure of degree of linear association between two random variables, whereas regression analysis is concerned with the study of the dependence of one variable on one or more other independent variables and provides an estimating equation.

**Correlation Coefficient:** The simple correlation coefficient measures the strength of linear association (as mentioned above). For example, the association between smoking and lung cancer, or between scores in intermediate and BA examinations, is measured by a correlation coefficient. It is worthwhile to note that if the relationship between two variables is not linear, say it is curved, then the correlation coefficient will show a lower degree of association between the two variables. In the same way, a few quite different observations (outliers) may artificially increase the correlation coefficient. Hence, the simple correlation coefficient has no real meaning in the case of non-linear relations. For these reasons it is not wise to put a lot of emphasis on the value of a correlation coefficient without actually looking at a scatter plot of the data. Correlation analysis can also be misleading if the two data sets relate to totally different characteristics, e.g., the correlation between the growth of botanical plants and the number of cases of dysentery is meaningless. Another problem which may arise in correlation analysis concerns the test of significance of a correlation coefficient (i.e., H₀: no association). This test is based on the assumption of joint normality of the two variables. If for any reason this assumption is violated, the test is invalid. This problem can be overcome either by transforming the data or by simply computing a rank correlation (which does not require the normality assumption). A lot of emphasis is placed on the calculation of correlation coefficients in medical research, but correlation should really be considered only as an investigative analysis, and significance tests should be avoided.
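
The three pitfalls described above (a curved relationship, an outlier, and a monotone but non-linear relationship) can be reproduced with small invented data sets. The sketch below implements the simple (Pearson) coefficient and a rank (Spearman) coefficient by hand; all data values are hypothetical.

```python
from math import sqrt
from statistics import mean

def pearson(x, y):
    """Simple (product-moment) correlation coefficient."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result

def spearman(x, y):
    """Rank correlation: Pearson's coefficient applied to the ranks."""
    return pearson(ranks(x), ranks(y))

# 1. A perfect but U-shaped relationship gives a coefficient of zero.
x = list(range(-5, 6))
r_curved = pearson(x, [v ** 2 for v in x])

# 2. Four unrelated points plus one outlier give a coefficient near one.
r_cloud = pearson([1, 2, 3, 4], [2, 1, 1, 2])
r_outlier = pearson([1, 2, 3, 4, 20], [2, 1, 1, 2, 20])

# 3. A monotone, non-linear relationship: rank correlation captures it fully.
dose = list(range(1, 9))
response = [2 ** d for d in dose]
r_linear = pearson(dose, response)
r_rank = spearman(dose, response)

print(r_curved, r_cloud, round(r_outlier, 3), round(r_linear, 3), r_rank)
```

None of these numbers would be interpretable without the scatter plot: the first hides a strong dependence, and the second suggests an association that rests on a single observation.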

**Regression analysis:** In regression analysis, we try to relate one variable mathematically to one or more other variables. Again, the most important underlying assumption is that the dependent variable Y is normally distributed with constant variance. Any major departure from this assumption can render the whole analysis invalid. There is no restriction imposed on the values of the independent variables (i.e., the X-variables). Regression analysis is mainly used to predict the value of Y from X, e.g., the weight of an individual from a given height. One should be very careful when using the regression equation to predict Y-values for X-values lying outside the range of the original data (i.e., extrapolation). Care is also needed when a straight line is fitted to data that show curvature or that consist of heterogeneous subgroups. It is hard to detect an error made on the basis of regression analysis unless the results are also presented graphically.
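
One practical way of guarding against extrapolation is to make the prediction function refuse X-values outside the observed range. The sketch below fits a least-squares line by hand and applies such a guard; the infant heights and weights are invented, and the function names are hypothetical.

```python
from statistics import mean

def fit_line(x, y):
    """Least-squares straight line; returns (intercept, slope)."""
    mx, my = mean(x), mean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum(
        (a - mx) ** 2 for a in x
    )
    return my - slope * mx, slope

def predict(x_new, x, y):
    """Predict Y at x_new, refusing to extrapolate beyond the observed X range."""
    low, high = min(x), max(x)
    if not low <= x_new <= high:
        raise ValueError(f"x={x_new} is outside the observed range {low}-{high}")
    intercept, slope = fit_line(x, y)
    return intercept + slope * x_new

# Hypothetical infant data: height in cm, weight in kg.
height = [50, 55, 60, 65, 70, 75]
weight = [3.4, 4.6, 6.0, 7.1, 8.3, 9.4]

print(round(predict(62, height, weight), 2))  # interpolation: allowed
try:
    predict(170, height, weight)              # an adult height: refused
except ValueError as err:
    print("refused:", err)
```

The guard only checks the range of X; it does not check linearity or constant variance, which still need to be judged from a plot of the data and the fitted line.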

**Reference Ranges:** Statistical methods are often misused and applied blindly in the calculation of reference ranges against which future observations are judged. Apparent differences in reference ranges for the same index can often be attributed to one or more of them having been calculated incorrectly. A small sample size also affects the reliability of the calculated range. The usual 95% reference range (mean ± 2 SD) is again based on the assumption that the data follow a Gaussian or normal distribution. The normality assumption is often not fulfilled, yet people still calculate reference limits in this way and obtain the wrong ones. Healy^{18} and Oldham^{19} have discussed the issue of correct computation of reference ranges in their papers.
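
The effect of ignoring the normality assumption can be illustrated with simulated skewed data. In the sketch below, 500 positive values are drawn from a log-normal distribution (an assumption made purely for illustration) and the mean ± 2 SD range is compared with a range based on the empirical 2.5th and 97.5th percentiles. The percentile approach is one common alternative and is not necessarily the method recommended in the cited papers.

```python
import random
from statistics import mean, quantiles, stdev

def gaussian_range(values):
    """Mean +/- 2 SD; appropriate only if the data are roughly normal."""
    m, s = mean(values), stdev(values)
    return m - 2 * s, m + 2 * s

def percentile_range(values):
    """Empirical 2.5th and 97.5th percentiles; no normality assumption."""
    cuts = quantiles(values, n=40)  # 39 cut points at 2.5%, 5%, ..., 97.5%
    return cuts[0], cuts[-1]

# Simulated right-skewed, strictly positive measurements.
rng = random.Random(1)
skewed = [rng.lognormvariate(0.0, 1.0) for _ in range(500)]

g_low, g_high = gaussian_range(skewed)
p_low, p_high = percentile_range(skewed)
print("mean +/- 2 SD:", round(g_low, 2), round(g_high, 2))
print("percentiles:  ", round(p_low, 2), round(p_high, 2))
```

For such data the lower limit of the mean ± 2 SD range is negative, which is impossible for a quantity that can only take positive values, while the percentile limits stay within the range of the observations.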

**Type of data to be analyzed:** In every study another problem arises in selecting which data are to be analyzed. In a comparison of several groups of subjects it is not valid to select the groups with the highest and lowest values and apply the usual significance test for means purely on that basis, because the usual null hypothesis of no difference is not appropriate when the largest difference is being examined. Another error is to exclude from the analysis, without any justification, those variables for which data are missing. The basic principle is to analyze according to the original hypothesis and experimental design.

**(g) Interpreting results**

Finally, the researcher faces the problem of interpreting the results. For a good and sensible interpretation of a statistical analysis, the investigator should know what the data are and how they were collected, that is, he is required to have a feeling for the data. The investigator should understand the statistical methods used in the analysis and the limitations of these methods. Errors in design or analysis may lead to incorrect results and thus erroneous conclusions. Some specific errors are commonly made in the interpretation of results. One of these relates to significance tests.

The level of significance should be properly understood: it is just an indication of the degree of plausibility of the ‘null hypothesis’. If the null hypothesis is found not to be tenable, it is rejected in favour of the alternative hypothesis, meaning that the treatments differ in their effects. The P-value (significance level) should be considered only as a guide to interpretation^{20,21}, not as a strict rule. For example, describing a result of P=0.05 as ‘probably significant’^{22} shows how much weight is placed on which side of 0.05 a value falls. Values of P=0.055 and P=0.045 can then lead to quite opposite conclusions, which is hardly justified. Similarly, a non-significant result just means that the results were not strong enough to reject the null hypothesis; ‘not significant’ does not imply either ‘not important’ or ‘non-existent’. Likewise, ‘significant’ only means that an observed difference is likely to be a real one; statistical significance is quite different from scientific significance. Hence, the important thing to find is the magnitude of an effect, rather than whether the null hypothesis is accepted or rejected^{23}. In this respect confidence intervals are valuable, and it is recommended, especially in the case of negative results, to give a confidence interval around the observed effect^{24,25}. A decision regarding some variable’s effect should never be based solely on a P-value. Sometimes investigators carry out several significance tests on one data set. Often this is done to see which pairs among a number of groups are significantly different from each other. This is not good practice, because a ‘false-positive’ result becomes more likely as more tests of significance are run. In such a situation Meier^{26} has suggested that a good compromise “is to treat a small number of tests as being of main importance, and to regard other findings as tentative, subject to confirmation in future experiments”.
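
Two of the points above lend themselves to a short numerical sketch: the growth of the false-positive risk with the number of tests, and the reporting of an effect with its confidence interval. The first function assumes the tests are independent, which is a simplification; the second uses a normal approximation and takes the standard error of the difference as given. The numbers are invented for illustration.

```python
from statistics import NormalDist

def familywise_error(k, alpha=0.05):
    """Chance of at least one false positive in k independent tests at level alpha."""
    return 1 - (1 - alpha) ** k

def ci_difference(mean1, mean2, se_diff, level=0.95):
    """Normal-approximation confidence interval for a difference between two means."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    diff = mean1 - mean2
    return diff - z * se_diff, diff + z * se_diff

for k in (1, 5, 10, 20):
    print(k, round(familywise_error(k), 3))

# A 'non-significant' difference of 2 units with a standard error of 1.5.
low, high = ci_difference(10.0, 8.0, se_diff=1.5)
print(round(low, 2), round(high, 2))
```

With ten independent tests at the 5% level the chance of at least one false-positive result is already about 40%. In the second example the interval runs from roughly -0.94 to 4.94: it includes zero, so the result is ‘not significant’, yet it is also compatible with a difference of nearly 5 units, which shows why ‘not significant’ cannot be read as ‘non-existent’.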

**Associations:** Often people have presumptions about the underlying relationship between variables even in the absence of any supporting evidence. Many investigators reach their decisions about the associations among the variables on the basis of the values of the correlation coefficient. As has been pointed out earlier, it is dangerous to decide about the association between two variables only on the basis of the correlation coefficient. In other words, when two variables both change with time, they may display an association even though there is no causal relationship at all; an example of such a case is the association between the divorce rate and the price of petrol.

**Prediction:** Regression equations are usually used to describe the relationships between continuous variables. One thing to note here is that these estimated equations are approximate, i.e., they are computed from sample data, and the imposition of an exact relationship (e.g., linear or curvilinear) may be more convenient than realistic. The degree of scatter of the observations around the estimated line indicates the closeness of the relationship between the variables and thus the uncertainty associated with predicting one from the other in specific cases, e.g., a regression of weight on height for infants would show a clear positive relationship with a large amount of scatter. Regression equations should be used for prediction only within their limitations, e.g., the regression line described above would be inappropriate for either teenagers or adults. Such extrapolation is completely invalid.

**(h) Report writing or publication**

Finally, the end point of any research is to write a report or publish a paper on it. It is important that the report or research paper includes information on the relevant aspects of design, the methods of data collection, the amount of data used in the analysis and the statistical methods used. If unusual statistical methods are used in the analysis, a reference for them and the reason for their use should be clearly mentioned.

The main findings should be included in the summary.

Negative findings, if any, should be brought forward instead of being hidden.

**The format of a research paper may be:**

- Introduction

- Design

- Methodology

- Analysis

- Results

- Discussion

- Summary

**(i) Conclusion**

It is not an easy job to do good and productive research. In this article an attempt has been made to provide an overall picture of the basic rules and techniques required in creating a study. The importance of statistics, which is required to make the study results productive and valid, is indicated at each stage.

With the advent of calculators and ready-made ‘statistical packages’ on computers, investigators with little knowledge of statistical skills and methods tend to carry out statistical analysis on their own. It is not an easy job to carry out statistical analysis adequately, for the following reasons:

(1) statistics is not such an easy subject that, by taking a short introductory course or reading a book, one can acquire enough knowledge to perform analysis adequately;

(2) in order to apply statistical methods to research, it is necessary to have greater understanding of statistical concepts;

(3) ready-made packages provide the statistical analysis, but cannot judge whether the application of these methods is valid in the case concerned. A statistician laments that “in the past we have had misgivings about ‘cook book’ statistics and now what has evolved has to be termed the ‘TV dinner’.... Previously, we could believe that the user would at least have to read the recipe”^{27}. Another observes that “it is not hard to do a bad analysis... all that is needed for a bad analysis is a canned computer programme ... and a lack of competence in biostatistics”^{28}.

(4) A poorly designed study cannot support any valid conclusion.

In the literature Yates and Healy^{29} wrote “it is depressing to find how much good biological work is in danger of being wasted through incompetent and misleading analysis of numerical results”. Therefore, it is recommended to consult a statistician for the analysis of the study data. It has been greatly emphasized that the statistician should be consulted at the planning phase of the study and not at the end. No amount of good and intelligent statistical analysis can overcome major flaws in the design of the study. Hence, if the statistician is involved in the early planning phase, he can help the investigator with the shortcomings of the design, the sample size required to make the results valid, the data collection methods and the appropriate analysis, making it possible for the investigator to achieve his goals. A statistician can help in achieving the objectives of the study, while statistical methods can only facilitate the interpretation of the study by quantifying, testing hypotheses, explaining behaviour and generally obtaining quantitative answers to research questions. They do not generally give any insight into the mechanism of action. Hence, the final interpretation must be based on biological, biochemical or other substantive grounds; otherwise any analysis will be incomplete.

Finally, one point not mentioned above is the literature review on the research topic. It is very important to see what other people have written on similar topics and how they have handled their problems. The literature review is a valuable resource in designing, conducting and analyzing the study. It helps in understanding the problems faced by others in conducting their work, and thus the investigator can improve his own work.

### Acknowledgement

The authors would like to thank Professor Clive J. Lawrence, MSOR Department, University of Exeter, U.K., for his helpful comments on the manuscript.

### References

1. Gietgey, D. A. and Metz, E. A. Nursing Research, 1969;18:339.

2. Abramson, J. H. Survey methods in community medicine: an introduction to epidemiological and evaluative studies, 3rd edition, London, Churchill Livingstone, 1986.

3. Armitage, P. Statistics in medical research. Blackwell, Oxford, 1990.

4. Sackett, D. L. Bias in analytic research. J. Chronic Dis., 1979;32:51-63.

5. Boag, J. W., Haybittle, J. L., Fowler, J. F. et al. The number of patients required in a clinical trial, Br. J. Radiol., 1971;44:122-5.

6. Ambroz, A., Chalmers, T. C., Smith, H. et al. Deficiencies of randomized control trials. Clin. Res., 1978;26:280.

7. Newell, D. J. Type II error and ethics, Br. Med. J., 1978;ii:1789.

8. Freiman, J. A., Chalmers, T. C., Smith, H. et al. The importance of beta, type II error and sample size in the design and interpretation of the randomized control trial, N. Engl. J. Med., 1978;299:690-702.

9. Altman, D. G. How large a sample, in: Statistics in Practice, 3rd Edition, London, British Medical Association, 1985, pp. 6-8.

10. Abramson, J. H. Sampling methods, in: Survey methods in community medicine: an introduction to epidemiological and evaluative studies, 3rd edition, London, Churchill Livingstone, 1986.

11. May, W. W. The composition and function of ethical committees. J. Med. Ethics, 1975;1:23-9.

12. Meier, P. Terminating a trial - the ethical problem, Clin. Pharmacol. Ther., 1979;25:633-40.

13. ‘Student’, The Lanarkshire milk experiment. Biometrika, 1931;23:398-406.

14. Healy, M. J. R. The disciplining of medical data, Br. Med. Bull., 1968;24:210.

15. Schor, S. and Karten, I. Statistical evaluation of medical journal manuscripts. JAMA, 1966;195:1123-28.

16. Gore, S. M., Jones, I. G. and Rytter, E. C. Misuse of statistical methods: critical assessment of articles in BMJ from January to March 1976. Br. Med. J., 1977;i:85-7.

17. White, S. J. Statistical errors in papers. Br. J. Psychiat., 1979;135:336-43.

18. Healy, M. J. R. Normal values from a statistical view point, Bulletin de l'Académie Royale de Médecine de Belgique, 1969;9:703-18.

19. Cox, D. R. The role of significance tests (with Discussion). Scand. J. Stat., 1977;4:49-70.

20. Gibbons, J. D. and Pratt, J. W. P-values: Interpretation and Methodology. Am. Statistician, 1975;29(1):20-5.

21. Newton, J., Illingworth, R., Elias, J. et al. Continuous intrauterine copper contraception for three years in comparison of replacement at two years with continuation of use. Br. Med. J., 1977;1:197-9.

22. Schlesselman, J. J. Case Control Studies: Design, Conduct, Analysis. Oxford, Oxford University Press, 1982.

23. Rose, G. Beta-blockers in immediate treatment of myocardial infarction. Br. Med. J., 1980;280:1088.

24. Chalmers, T. C., Matta, R. J., Smith, H. et al. Evidence favouring the use of anticoagulants in the hospital phase of acute myocardial infarction. N. Engl. J. Med., 1977;297:1091-6.

25. Meier, P. Statistics and medical experimentation. Biometrics, 1975;31:511-29.

26. Bross, I. D. J. Scientific strategies to save your life: A statistical approach to primary prevention, New York, Marcel Dekker, 1981, pp. 42-3.

27. Yates, F. and Healy, M. J. R. How should we reform the teaching of statistics? J. R. Stat. Soc., 1964;127:199-210.

28. Mahmood, Z. Uses and abuses of biostatistics in medical research in Pakistan. J. Pak. Med. Assoc., 1990;40:270-71.

**Journal of the Pakistan Medical Association has agreed to receive and publish manuscripts in accordance with the principles of the following committees:**