LETTER TO THE EDITOR
Year: 2013 | Volume: 45 | Issue: 2 | Page: 205
Basic statistics for postgraduate students
Jaykaran Charan
Department of Pharmacology, Govt. Medical College, Surat, Gujarat
Date of Web Publication: 11-Mar-2013
Correspondence Address: Jaykaran Charan, Department of Pharmacology, Govt. Medical College, Surat, Gujarat
Source of Support: None, Conflict of Interest: None
DOI: 10.4103/0253-7613.108331
How to cite this article: Charan J. Basic statistics for postgraduate students. Indian J Pharmacol 2013;45:205
Sir,
I read with interest the article "Basic biostatistics for post-graduate students". [1] In my opinion, a few issues need further exploration. They are as follows:
- Types of data: Data can be classified in various ways. One important classification is into quantitative and qualitative data. The authors describe nominal, ordinal and interval as types of data, but these are more suitably called scales of measurement. Even for scales of measurement, the majority of biostatisticians recognise four types: nominal, ordinal, interval and ratio. The only difference between the interval and ratio scales is the absence of an "absolute zero" in the interval scale. Also, in ordinal data the orders/scores are assigned by convenience, so the intervals between successive orders are not equal. For example, the difference between mild and moderate need not equal the difference between moderate and severe.
- The authors mention that data can be converted to make them normally distributed. Usually data are not "converted" but "transformed" into another form, such as the log or square root, to make them normally distributed. Conversion of one form of data into another (ratio to nominal), however, is often done for convenience in summarisation as well as for finding significant results. [2]
- It is mentioned that if the sample is more than 30, then it follows the normal distribution. However, I have observed that a sample size of more than 30 does not guarantee that the assumption of normal distribution holds.
- Again, it is stated that for the correlation coefficient both variables should be normally distributed. But this is needed only for the Pearson correlation coefficient. For the Spearman correlation coefficient (for non-parametric data) and the contingency coefficient (nominal data) there is no need to meet this assumption. If ratio or interval data do not follow the normal distribution, the Spearman correlation coefficient can be used.
- The authors mention that in the case of normal distribution skew is zero. This statement needs some more clarification. A small amount of skewness or even kurtosis will not make a sample non-normal. Skewness and kurtosis can be measured by any biostatistics software or Excel, and if their values lie between -1 and +1, the sample should be considered as following the normal distribution. Whether the data follow the normal distribution can be checked by histogram, skewness, kurtosis, Q-Q plot and statistical tests like the Kolmogorov-Smirnov test and the Shapiro-Wilk test.
- Instead of stating "null hypothesis is accepted" it is better to state "failed to reject the null hypothesis", because a non-significant P value may arise from a true absence of difference, from low power of the study or from a type 2 error. Non-significant P values do not show equality or that "samples are from same population", as "absence of evidence should not be considered as evidence of absence". [3]
- The standard explanation of the null and alternative hypotheses given by the authors is applicable to the majority of study designs and trials (superiority trials), but in non-inferiority or equivalence trials the null and alternative hypotheses do not follow these rules. [4]
- The sample size calculation mentioned in the article is applicable only to intervention studies or trials involving quantitative variables. This formula cannot be used in cross-sectional studies, other epidemiological studies or clinical studies involving qualitative data. [5]
- The authors have not mentioned the reason for using analysis of variance (ANOVA) over the "t" test when there are three or more study groups. It is very important to understand that the more statistical tests are performed, the greater the chance of inflating the type 1 error. ANOVA is used to prevent this.
- The statement about the Chi-square test, that "even if sample size is small (< 30), this test is used by using Yates correction, but frequency in each cell should not be less than 5", seems erroneous. In my opinion, Yates correction with the Chi-square test should be used when the frequency in any cell is less than 5.
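The point above about the Pearson and Spearman correlation coefficients can be illustrated with a short sketch. It is only an illustration: the function names and the sample values are hypothetical, ties are handled with average ranks (one common convention), and the code uses nothing beyond the Python standard library.

```python
from statistics import fmean


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def average_ranks(values):
    """Ranks from 1 to n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Hypothetical data: y doubles with each step of x (monotonic, not linear).
    dose = [1, 2, 3, 4, 5, 6]
    response = [2, 4, 8, 16, 32, 64]
    print("Pearson :", round(pearson(dose, response), 3))
    print("Spearman:", round(spearman(dose, response), 3))
```

Because the Spearman coefficient depends only on the ordering of the values, it reaches 1 for any strictly increasing relationship, whereas the Pearson coefficient falls below 1 when that relationship is not linear.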
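The skewness and kurtosis screen described above (values between -1 and +1) can be sketched as follows. Several things here are assumptions of the sketch rather than statements from the letter: the simple moment-based formulas, the use of excess kurtosis (kurtosis minus 3, so that a normal distribution scores zero), the function names and the sample values. Statistical packages and Excel may apply small-sample corrections, so their output can differ slightly.

```python
from statistics import fmean


def skewness(values):
    """Moment-based skewness m3 / m2**1.5, without small-sample correction."""
    mean = fmean(values)
    m2 = fmean([(x - mean) ** 2 for x in values])
    m3 = fmean([(x - mean) ** 3 for x in values])
    return m3 / m2 ** 1.5


def excess_kurtosis(values):
    """Moment-based kurtosis m4 / m2**2 minus 3 (zero for a normal curve)."""
    mean = fmean(values)
    m2 = fmean([(x - mean) ** 2 for x in values])
    m4 = fmean([(x - mean) ** 4 for x in values])
    return m4 / m2 ** 2 - 3.0


def within_rule_of_thumb(values, limit=1.0):
    """True when both skewness and excess kurtosis lie within +/- limit."""
    return (abs(skewness(values)) <= limit
            and abs(excess_kurtosis(values)) <= limit)


if __name__ == "__main__":
    # Hypothetical samples: one roughly bell-shaped, one with a large outlier.
    bell_like = [2, 3, 3, 4, 4, 4, 5, 5, 5, 5, 6, 6, 6, 7, 7, 8]
    outlier = [1, 1, 1, 1, 2, 2, 3, 20]
    for name, sample in (("bell_like", bell_like), ("outlier", outlier)):
        print(name, round(skewness(sample), 2),
              round(excess_kurtosis(sample), 2),
              within_rule_of_thumb(sample))
```

A screen of this kind is a quick check only and is best read alongside a histogram, a Q-Q plot or a formal test such as the Shapiro-Wilk test.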
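The inflation of type 1 error with repeated testing, mentioned above in connection with ANOVA, can be shown with a little arithmetic. The sketch below assumes that every pair of groups is compared with its own test at a significance level of 0.05 and that the tests are independent. Both are simplifying assumptions (pairwise "t" tests on shared groups are not truly independent), and the function names are hypothetical.

```python
from math import comb


def pairwise_comparisons(groups):
    """Number of distinct two-group comparisons among `groups` study groups."""
    return comb(groups, 2)


def familywise_error(tests, alpha=0.05):
    """Chance of at least one false positive across `tests` independent tests.

    Assumes every null hypothesis is true and each test uses level `alpha`.
    """
    return 1 - (1 - alpha) ** tests


if __name__ == "__main__":
    for groups in (2, 3, 4, 5):
        tests = pairwise_comparisons(groups)
        print(f"{groups} groups: {tests} pairwise tests, "
              f"chance of a false positive = {familywise_error(tests):.3f}")
```

Under these assumptions, three groups need three pairwise tests and the chance of at least one false positive rises from 5% to about 14%. A single ANOVA keeps the overall level at the chosen 5%.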
References
1. Dakhale GN, Hiware SK, Shinde AT, Mahatme MS. Basic biostatistics for post-graduate students. Indian J Pharmacol 2012;44:435-42.
2. Charan J. Beware -statistics may deceive you. Int J Med Sci Public Health 2012;1:10-3.
3. Jaykaran, Saxena D, Yadav P, Kantharia ND. Nonsignificant P values cannot prove null hypothesis: Absence of evidence is not evidence of absence. J Pharm Bioallied Sci 2011;3:465-6.
4. Gupta SK. Non-inferiority clinical trials: Practical issues and current regulatory perspective. Indian J Pharmacol 2011;43:371-4.
5. Patra P. Sample size in clinical research, the number we need. Int J Med Sci Public Health 2012;1:5-9.