Year: 2014 | Volume: 1 | Issue: 2 | Page: 154-159
Types of observational studies in medical research
Rajeev Kumar1, Amir Maroof Khan2, Pranab Chatterjee2
1 Department of Biostatistics and Medical Informatics, University College of Medical Sciences, Delhi, India
2 Department of Community Medicine, University College of Medical Sciences, Delhi, India
Date of Web Publication: 31-Jul-2014
Dr. Amir Maroof Khan
Department of Community Medicine, University College of Medical Sciences, Dilshad Garden, Delhi
Source of Support: None, Conflict of Interest: None
Study design forms a core component of research. It is mainly determined by the study objectives and, in turn, decides the type of statistical analyses to be carried out. In observational studies, in contrast to interventional studies, the investigator has no control over the assignment of a subject to the treated or control group. Even though randomized controlled trials are seen as the best study design, evidence shows that properly conducted observational studies give similar results, and they are relevant in medical research where ethics and feasibility concerns assume great significance. Observational studies point towards possible causal associations, are less resource intensive than trials and have better external validity. This review article discusses various types of observational study designs such as case reports, cross-sectional, cohort, case-control and nested case-control studies, with examples from the published literature.
Keywords: Case control, case series, cohort, ecologic, nested case control, observational study
How to cite this article:
Kumar R, Khan AM, Chatterjee P. Types of observational studies in medical research. Astrocyte 2014;1:154-9
Introduction
Designing a study is relatively more important than analyzing its data, because an error made in the design can never be corrected unless the researcher devises a fresh study, whereas wrongly analyzed data can be re-analyzed to get meaningful results.  A better understanding of study designs is essential for the proper conduct and interpretation of medical research. This review article focuses on observational study designs which are commonly used in medical research.
Medical researchers are usually interested in finding out the effect of a particular risk/therapeutic factor on a disease/health outcome. Comparison between two groups is a logical method for providing such evidence. The groups to be compared can be based on a) exposure/risk factor, b) disease/outcome or c) intervention. In observational studies, the groups to be compared are not based on intervention/manipulation by the investigator. Belonging to an exposure or outcome group, hence, forms the mainstay of this type of research study. An example of an observational study: to study the effect of industrial pollution on the health of the people, the health status of people living in an industrial area can be compared with that of people living in a nonindustrial area.
Due to a lack of intervention by the investigator, these types of studies reflect the comparative effectiveness in real world scenarios, and therefore have a higher external validity. Observational studies are not without limitations, though. Selection bias, information bias and confounding are the major issues to be taken care of in observational studies. Details of other biases that can affect observational studies are presented elsewhere. 
I. Why observational studies, when randomized controlled trials are considered the gold standard in medical research?
Even though randomized controlled trials (RCTs) have certain unique advantages like randomization, allocation concealment etc., there are instances when RCTs or other interventional studies are inappropriate or not feasible. Experimentation may prove to be inadequate in cases where the outcome of the intervention is determined by the activities of the care provider, such as physiotherapy, surgery etc.  When studying rare diseases, it may not be possible to have a sufficient number of patients for conducting a clinical trial. It may also be unethical to perform an interventional study in certain situations. For example, to study the effect of a harmful substance such as tobacco on health status, you cannot expose someone to the harmful substance for the purpose of your study. In such instances, the most appropriate study designs are observational study designs. A recent Cochrane review has reported that there is little difference between the results obtained from observational studies and RCTs, and hence factors other than study design per se need to be considered when exploring reasons for a lack of agreement between results of RCTs and observational studies.  Thus a proper appraisal of the study design is critical to interpret the findings. The 'type of study design' alone cannot be the endpoint in debates concerning the strength of the evidence generated.
II. Types of observational studies
Observational studies are usually categorized as case reports or case series, ecologic, cross-sectional (prevalence), case-control and cohort studies. Other variants of these observational studies are also possible, such as the nested case-control study, the case-cohort study etc.
A. Case reports and case series
A description of a single case, typically describing the manifestations, clinical course, and prognosis of that case, is a case report. An example is a case report of a patient with severe lactic acidosis and multiorgan failure due to thiamine deficiency during total parenteral nutrition (TPN).  When multiple case reports are reviewed for a particular disease condition, diagnosis or treatment procedure, etc., they form a case series. An example is a case series reporting the clinical presentation and management of six cases presenting to an emergency department following synthetic cannabinoid intoxication.  Due to the lack of a control or comparison group, these types of studies do not explore cause-effect relationships. Nonetheless, researchers can use them as a source of hypotheses to design studies that provide stronger empirical evidence.
B. Cross-sectional (prevalence) study
If the data regarding both the exposure and the outcome are collected simultaneously from a selected group of individuals belonging to a specified population at a given point of time, the study is called a cross-sectional study. Frequently referred to as a survey, this is a preferred design to find out the prevalence and associated risk factors of a particular outcome/disease. A cross-sectional study may extend from weeks to months depending upon the magnitude of the survey, but the variables from a subject are measured at only a single point of time and the measurements are not repeated at different time intervals. The data collected in a cross-sectional study can be grouped into diseased/nondiseased and exposed/nonexposed groups and certain associations can be studied. As temporality is not taken into account, one cannot comment on causality, though associations can be pointed out. For example, if a cross-sectional study finds that milk drinking is associated with peptic ulcer, is that because milk causes the disease, or because ulcer sufferers drink milk to relieve their symptoms? 
C. Ecological (Aggregate) Study
An observational study where two variables, that is, a risk/therapeutic factor and an outcome, are studied and at least one of the variables is measured at the population level is an ecological study. Group comparisons are made in these types of studies, as opposed to individual level comparisons. A recent study reported that the prevalence of obesity in a given population is inversely related to the prevalence of Helicobacter pylori in the same population.  It would be fallacious to conclude from this study that obese people have a lower chance of being infected with H. pylori, or that H. pylori protects against obesity, as individual level data on obese and nonobese people and their individual H. pylori status were not collected.  This faulty interpretation is common with ecological studies and is called the ecological fallacy. These types of studies nevertheless provide some direction for more in-depth research, even though their results by themselves provide very weak empirical evidence.
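The ecological fallacy can be illustrated with a small numerical sketch. The counts below are entirely hypothetical (they are not taken from the study cited above) and are chosen only to show that a group-level pattern need not hold for individuals. The names `populations` and `summarize` are illustrative.

```python
# Hypothetical data: for each population, (number obese, group size)
# among H. pylori infected and uninfected individuals.
populations = {
    "P1": {"infected": (72, 600), "uninfected": (28, 400)},
    "P2": {"infected": (100, 400), "uninfected": (100, 600)},
    "P3": {"infected": (80, 200), "uninfected": (220, 800)},
}


def summarize(pop):
    """Return group-level prevalences and the within-population risk ratio."""
    obese_inf, n_inf = pop["infected"]
    obese_un, n_un = pop["uninfected"]
    n = n_inf + n_un
    return {
        # What an ecological study sees: one number per population.
        "hp_prevalence": n_inf / n,
        "obesity_prevalence": (obese_inf + obese_un) / n,
        # What only individual level data can show.
        "risk_ratio_within": (obese_inf / n_inf) / (obese_un / n_un),
    }


summaries = {name: summarize(pop) for name, pop in populations.items()}

for name, s in summaries.items():
    print(
        name,
        "H. pylori prevalence:", round(s["hp_prevalence"], 2),
        "obesity prevalence:", round(s["obesity_prevalence"], 2),
        "risk ratio within population:", round(s["risk_ratio_within"], 2),
    )
```

In this constructed example, populations with a higher H. pylori prevalence have a lower obesity prevalence, yet within every population infected individuals are obese more often than uninfected ones. The group-level data alone could not reveal this.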
D. Cohort study
A cohort is a group of subjects who represent the population of interest, share certain common characteristics and are studied over a sufficient period of time, for example, the cohort of survivors of the Bhopal gas tragedy or a cohort of children born in a particular year.
In these types of studies, cohorts with and without the exposure are selected and then followed up to measure the occurrence/incidence of disease in them. By comparing the incidence of the disease in the two groups, one can provide some evidence on the cause-effect relationship between the exposure and the outcome, as here you are sure that the exposure preceded the outcome. If you select the cohorts at the present time point and then follow them into the future, it is called a prospective cohort study. Alternatively, if you have detailed past records of the exposure and health characteristics of the individuals, you can select cohorts from those records and then follow them up to the present time point to look for the occurrence of the disease. This is known as a retrospective or historical cohort study design. The term 'retrospective' in study designs is used when past records are taken into account. [Figure 1] shows a schematic presentation of a cohort study design.
An example of a prospective cohort study: To find out whether peer victimization during adolescence is associated with the risk of anxiety disorder in adulthood, two parallel cohorts of adolescents aged 13-14 years, with and without peer victimization experience, were followed up to the age of 18 years and the incidence of anxiety disorder was compared between the cohorts. 
Example of a retrospective cohort study: Villwock et al. conducted a study to find the effect of age (≤80 vs. >80 years) on acute ischemic stroke (AIS) outcomes in patients following endovascular mechanical thrombectomy (EMT). A retrospective cohort study design was adopted by selecting, from the Nationwide Inpatient Sample database, the entire cohort of patients from 2008 to 2010 with a primary diagnosis of AIS who received EMT. 
Some studies may start from the past records and move ahead in time to the present time point and then further continue to follow these cohorts in future too, thus combining the retrospective and prospective study design into one, thereby known as ambi-directional cohort study.
At times, one cohort is selected and comparisons are made between its subgroups, also known as internal cohorts. For example, to study the effect of body mass index (BMI) on cardiovascular disease, a cohort of subjects free of cardiovascular disease was selected, categorized into internal cohorts based on BMI and followed up for a sufficient period of time to capture and compare the incidence of cardiovascular disease.
In some instances, multiple cross-sectional studies are conducted on a certain population. This is known as a pseudo-cohort study because, unlike in a true cohort study, the same individuals are not followed over time. For example, a pseudo-cohort study using national cross-sections (2001, 2004, 2007, and 2010) was conducted to examine differences in smoking prevalence under different smoking ban policies. 
Cohort study designs enable researchers to study the temporal association between cause and effect, study multiple outcomes in the cohort, find out the incidence of the disease and calculate relative and attributable risks. But these types of studies are not suitable for rare diseases or those that have a long latent period. Other major limitations of cohort studies are that they are relatively expensive, time consuming, and prone to attrition.
E. Case-control study
Case-control studies begin when the outcome/disease has already occurred. Here two groups, namely cases (those having the outcome of interest, e.g., a particular disease or complication) and controls (those not having the outcome of interest), are selected, and information about the exposure(s)/risk factor(s) under consideration is collected from existing records, patient examinations, personal interviews, etc., for comparison. [Figure 2] shows a schematic presentation of a case-control study design.
Even though the population-based case-control study is more appropriate because subjects are captured from a representative population, hospital-based case-control studies, where both the cases and controls are captured from the same hospital, are more common for the sake of convenience. Even the landmark study that found the association between smoking and lung carcinoma was a hospital-based case-control study. 
Selection of cases: Only confirmed cases, and those that are homogeneous with respect to disease severity, should be included. In general, incident (new) cases are preferred over prevalent cases (both new and old) because newly diagnosed cases participate more actively, are less prone to recall bias, and are diagnosed with more consistent criteria than subjects captured from different periods (prevalent cases). Because adequate incident cases are not available at one point of time, researchers recruit the cases over a period of time, and this should not be misunderstood as a follow-up study. When case-control studies include prevalent cases, this may introduce survival bias. For instance, cases that were exposed to the factor under study may have poorer survival than those not exposed, so prevalent cases do not represent the exact population of cases, and this distortion increases as time progresses. 
Selection of controls (those who do not have the disease concerned):
To ensure comparability, the controls, along with the cases, should represent the same population and meet the same inclusion criteria. It is at times assumed that if controls are selected from the same hospital as the cases, they are being selected from the same population, but this may not be true, as the catchment populations of the cases and controls may be entirely different. In order to reduce this selection bias, controls may be recruited from the cases' friends or relatives, who usually represent the same population. Inclusion of blood relatives is generally not preferred due to the risk of overmatching in terms of exposure to various risk factors.
Another situation where selection bias is possible is when the controls' disease, although different from the case definition, shares risk factors with the disease under study. For example, in a case-control study exploring the effect of alcohol on liver cancer, the strength of association would be underestimated if the controls were patients from the accident and emergency ward, because exposure to alcohol is more frequent among such patients. One way to reduce this bias is to select controls with a range of different diseases rather than one particular disease.
For rare diseases, a sufficient number of cases may not be available, and in such situations there is an option to include multiple controls (up to three or four) per case, which increases the power of the study. 
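The limit of three or four controls per case can be motivated with a widely quoted rule of thumb that is not part of this article: relative to an unlimited supply of controls, the statistical efficiency of using m controls per case is approximately m / (m + 1). The sketch below, with the illustrative function name `relative_efficiency`, tabulates this approximation under that assumption.

```python
def relative_efficiency(controls_per_case):
    """Approximate efficiency of m controls per case relative to
    unlimited controls, using the rule of thumb m / (m + 1)."""
    if controls_per_case < 1:
        raise ValueError("at least one control per case is required")
    return controls_per_case / (controls_per_case + 1)


for m in range(1, 7):
    # Gain in efficiency from adding the m-th control per case.
    gain = relative_efficiency(m) - (relative_efficiency(m - 1) if m > 1 else 0.0)
    print(m, round(relative_efficiency(m), 3), round(gain, 3))
```

Under this approximation the efficiency rises from 0.5 with one control per case to 0.8 with four, while a fifth and a sixth control add only about 0.03 and 0.02, which is why recruiting more than four controls per case is seldom considered worthwhile.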
Case-control studies are relatively less time consuming, relatively less expensive and best suited for rare diseases or those having a long latent period. These studies usually serve as a quick assessment of associations between suspected exposures and outcomes, which can later be studied with more costly and robust study designs like cohort studies.
Still, these types of studies are prone to recall and interviewer biases, which affect their internal validity. A temporal sequence cannot be reliably established in a case-control study that includes prevalent cases, but it may be established when only incident cases are included.
F. Nested case-control study: Variation of a case-control design
In the nested case-control study, cases of a disease that occur in a defined cohort are identified and, for each, a specified number of matched controls is selected from among those in the cohort who have not developed the disease by the time of disease occurrence in the case.  The type of nested case-control study depends on how the controls are selected. i) At the end of the cohort study, controls are randomly selected from the disease-free subjects. This design is known as the exclusive design or cumulative incidence sampling; it is the one usually applied in case-control studies, and the odds ratio is calculated for such a design.  ii) All controls are randomly selected from the cohort at the start of the follow-up. Here, a control at the beginning of the study may become a case during the subsequent follow-up; hence, it is called the inclusive design and, as the controls are representative of the full cohort, it is also known as a case-cohort study.  iii) The cohort is followed at regular intervals and, as each new case is diagnosed, a control is selected from the population at risk at that point of time. This is the concurrent type. In the research literature, the last two designs are usually the ones referred to as nested case-control studies. The main feature of these studies is that both cases and controls are selected from the same source population, which reduces the chance of selection bias. In addition, these studies are cost effective compared with full cohort studies, although the decrease in sample size may reduce the statistical power.
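The concurrent design (iii) can be sketched in a few lines of code. This is a minimal illustration under stated assumptions, not a description of any particular study: the cohort records, the field names `id`, `time` and `is_case`, and the function `risk_set_sample` are all hypothetical, `time` is the time of diagnosis for cases and the end of follow-up for everyone else, and matching on other characteristics is omitted.

```python
import random


def risk_set_sample(cohort, controls_per_case=2, seed=42):
    """For each case, draw controls from subjects still at risk
    (under follow-up and not yet diagnosed) at the case's diagnosis time."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    matched_sets = []
    cases = sorted((s for s in cohort if s["is_case"]), key=lambda s: s["time"])
    for case in cases:
        # Subjects whose diagnosis or end of follow-up comes later are at risk.
        # A subject who becomes a case later is still eligible as a control now.
        risk_set = [s for s in cohort if s["time"] > case["time"]]
        k = min(controls_per_case, len(risk_set))
        controls = rng.sample(risk_set, k)
        matched_sets.append(
            {"case": case["id"], "controls": [c["id"] for c in controls]}
        )
    return matched_sets


# Hypothetical cohort: times are years since the start of follow-up.
cohort = [
    {"id": "A", "time": 2.0, "is_case": True},
    {"id": "B", "time": 5.0, "is_case": False},
    {"id": "C", "time": 3.5, "is_case": True},
    {"id": "D", "time": 1.0, "is_case": False},
    {"id": "E", "time": 6.0, "is_case": False},
    {"id": "F", "time": 4.0, "is_case": False},
]

time_of = {s["id"]: s["time"] for s in cohort}
matched = risk_set_sample(cohort)
for m in matched:
    print(m["case"], time_of[m["case"]], m["controls"])
```

In this sketch subject D, whose follow-up ended before the first diagnosis, can never be chosen as a control, whereas subject C is eligible as a control for case A because C had not yet developed the disease at that time.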
Example of a nested case-control study: To evaluate whether treatment with sodium valproate (exposure) was associated with a reduced risk of stroke (outcome), a nested case-control study was implemented with cases diagnosed with incident nonhemorrhagic stroke and controls matched for sex, year of birth, and study start date. The cohort from which these cases and controls were selected came from electronic health records data, which were extracted from the Clinical Practice Research Database for participants ever diagnosed with epilepsy and prescribed antiepileptic drugs. 
G. Measures of association
Relative risk (RR), risk difference (RD) and odds ratio (OR) are the measures of association frequently reported in observational studies. RR is the ratio of the probability of the occurrence/incidence of an event in the exposed group to that in the nonexposed group. RD is the difference in risk between the exposed group and the nonexposed group. As temporality is ascertained in cohort studies, incidence rates and hence risks can be computed with this type of study design. In case-control studies, the measure of association is the odds ratio. Simply put, OR is the ratio of the odds of exposure among the diseased group to the odds of exposure among the non-diseased group. ORs and RRs, although conceptually different, are often mistakenly interpreted in the same way. Suffice it to state here that the odds ratio approximates the relative risk when the disease is rare, which is known as the rare disease assumption.
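These definitions can be made concrete with a 2x2 table. The following is a minimal sketch using hypothetical counts; the cell labels a, b, c, d and the function name `two_by_two_measures` are conventions assumed here, not taken from the article.

```python
def two_by_two_measures(a, b, c, d):
    """Measures of association from a 2x2 table.

    a: exposed with disease      b: exposed without disease
    c: nonexposed with disease   d: nonexposed without disease
    """
    risk_exposed = a / (a + b)
    risk_nonexposed = c / (c + d)
    relative_risk = risk_exposed / risk_nonexposed
    risk_difference = risk_exposed - risk_nonexposed
    # Odds of exposure among the diseased (a / c) divided by the
    # odds of exposure among the non-diseased (b / d).
    odds_ratio = (a * d) / (b * c)
    return relative_risk, risk_difference, odds_ratio


# Hypothetical rare disease: risks of 0.2% and 0.1%.
rare = two_by_two_measures(20, 9980, 10, 9990)
# Hypothetical common disease: risks of 40% and 20%.
common = two_by_two_measures(400, 600, 200, 800)

for label, (rr, rd, odds_ratio) in (("rare", rare), ("common", common)):
    print(label, "RR:", round(rr, 3), "RD:", round(rd, 3), "OR:", round(odds_ratio, 3))
```

With these hypothetical counts the relative risk is 2 in both tables, while the odds ratio is about 2.002 when the disease is rare and about 2.67 when it is common, which illustrates why the two measures should not be interpreted interchangeably outside the rare disease setting.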
III. Reporting an observational study
The STROBE statement, which is being endorsed by a growning number of biomedical journals, provides a checklist of items to be included in reports of observational studies.  STROBE stands for an international, collaborative initiative of epidemiologists, methodologists, statisticians, researchers, and journal editors involved in the conduct and dissemination of observational studies, with the common aim of Strengthening the Reporting of Observational studies in Epidemiology. The statement is not a prescription for designing or conducting observational studies, and it focuses on cross-sectional, case-control, and cohort study designs.  It is nevertheless recommended that researchers refer to the STROBE statement at the planning stage of an observational study so that the chances of making inadvertent omissions are reduced.
Conclusion
Observational studies are relevant in medical research where feasibility and ethics are indispensable components. The various types of observational studies have their own merits and limitations and a proper understanding of these is required for implementation and interpretation of such study designs.
References
1. Campbell MJ, Machin D. Medical Statistics: A Common-sense Approach. 2nd ed. Chichester: Wiley; 1993. p. 2.
2. Indrayan A. Medical Biostatistics. 3rd ed. Boca Raton: Chapman and Hall/CRC Press; 2012.
3. Thadani R. Formal trials versus observational studies in Fabry disease: Perspective from 5 years of FOS. In: Mehta A, Beck M, Sunder-Plassmann G, editors. Oxford: Oxford Pharma Genesis; 2006. Available from: http://www.ncbi.nlm.nih.gov/books/NBK11597/ [Last accessed on 2014 Jun 5].
4. Anglemyer A, Horvath HT, Bero L. Healthcare outcomes assessed with observational study designs compared with those assessed in randomized trials. Cochrane Database Syst Rev 2014 [In Press].
5. Ramsi M, Mowbray C, Hartman G, Pageler N. Severe lactic acidosis and multiorgan failure due to thiamine deficiency during total parenteral nutrition. BMJ Case Reports 2014 [In Press].
6. Harris CR, Brown A. Synthetic cannabinoid intoxication: A case series and review. J Emerg Med 2013;44:360-6.
7. BMJ. Case control and cross sectional studies. BMJ. Available from: http://www.bmj.com/about-bmj/resources-readers/publications/epidemiology-uninitiated/8-case-control-and-cross-sectional [Last accessed on 2014 Jun 3].
8. Lender N, Talley NJ, Enck P, Haag S, Zipfel S, Morrison M, et al. Review article: Associations between Helicobacter pylori and obesity - an ecological study. Aliment Pharmacol Ther 2014;40:24-31.
9. Press Trust of India. Stomach bacteria may protect against obesity. The Financial Express. 2014 Jun 3 Health. Available from: http://www.financialexpress.com/news/stomach-bacteria-may-protect-against-obesity/1256946 [Last cited on 2014 Jun 3].
10. Stapinski LA, Bowes L, Wolke D, Pearson RM, Mahedy L, Button KS, et al. Peer victimization during adolescence and risk for anxiety disorders in adulthood: A prospective cohort study. Depress Anxiety 2014 [In Press].
11. Villwock MR, Singla A, Padalino DJ, Deshaies EM. Acute ischaemic stroke outcomes following mechanical thrombectomy in the elderly versus their younger counterpart: A retrospective cohort study. BMJ Open 2014;4:e004480.
12. Tabuchi T, Hoshino T, Hama H, Nakata-Yamada K, Ito Y, Ioka A, et al. Complete Workplace Indoor Smoking Ban and Smoking Behavior among Male Workers and Female Nonsmoking Workers' Husbands: A Pseudo Cohort Study of Japanese Public Workers. Biomed Res Int 2014;2014:303917.
13. Doll R, Hill AB. Smoking and lung carcinoma. Br Med J 1950;2:739-48.
14. Hill HS, Kleinbaum DG. Bias in observational studies. In: Gail MH, Benichou J, editors. Encyclopaedia of Epidemiologic Methods. England: John Wiley and Sons; 2000.
15. Fletcher RH, Fletcher SW, Fletcher GS. Clinical Epidemiology: The Essentials. Philadelphia: Lippincott Williams and Wilkins; 2014.
16. Ernster VL. Nested case-control studies. Prev Med 1994;23:587-90.
17. Rothman KJ, Greenland S. Modern Epidemiology. 2nd ed. Philadelphia, PA: Lippincott-Raven Publishers; 1998.
18. Prentice RL. A case-cohort design for epidemiologic cohort studies and disease prevention trials. Biometrika 1986;73:1-11.
19. Dregan A, Charlton J, Wolfe CD, Gulliford MC, Markus HS. Is sodium valproate, an HDAC inhibitor, associated with reduced risk of stroke and myocardial infarction. Pharmacoepidemiol Drug Saf 2014 [In Press].
20. von Elm E, Altman DG, Egger M, Pocock SJ, Gøtzsche PC, Vandenbroucke JP; STROBE Initiative. Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement: Guidelines for reporting observational studies. BMJ 2007;335:806-8.
21. STROBE Initiative. STROBE checklist. Available from: http://www.strobe-statement.org/fileadmin/Strobe/uploads/checklists/STROBE_checklist_v4_combined.pdf [Last accessed on 2014 Jun 3].
[Figure 1], [Figure 2]