American Journal of Ophthalmology
Volume 147, Issue 5 , Pages 766-767, May 2009

Logistic Regression Analysis: Applications to Ophthalmic Research

  • Stanley Lemeshow
    • College of Public Health, The Ohio State University, Columbus, Ohio
    • Inquiries to Stanley Lemeshow, College of Public Health, The Ohio State University, M116 Starling Loving Hall, Columbus, OH 43210
  • David W. Hosmer Jr
    • School of Public Health and Health Sciences, Department of Public Health, University of Massachusetts, Amherst, Massachusetts

Accepted 23 July 2008.

The purpose of this Editorial is to present a brief overview of logistic regression, an extremely useful analytical tool for the analysis of data frequently arising in health sciences research. To facilitate understanding of the method, we consider a hypothetical study of 286 patients enrolled for the treatment of a particular ophthalmic condition. Two treatments are available (let us call these treatments A and B). Both treatments offer immediate results, and after just one week of receiving treatment, all symptoms and evidence of disease are gone for all 286 patients. However, disease will recur in a certain proportion of patients, so the study is designed to last for a total of 20 months to determine which treatment is more effective for the relief of symptoms and disease in this period. In our hypothetical data, 76 (26.57%) of the 286 patients experienced a recurrence of symptoms before the end of the study. The breakdown of recurrence by treatment is shown in Table 1.

TABLE 1. Cross Classification of Recurrence of Disease and Treatment

Recurrence    Treatment A    Treatment B    Total
No                    114             96      210
Yes                    16             60       76
Total                 130            156      286

It is clear from this cross tabulation that 38.5% (60/156) of patients receiving treatment B experience a recurrence of symptoms, whereas the rate is only 12.3% (16/130) among patients receiving treatment A. To put this in epidemiologic terms, the odds of recurrence among patients receiving treatment A are (16/130)/(114/130) = 16/114 = 0.140. For patients who receive treatment B, the odds of recurrence are (60/156)/(96/156) = 60/96 = 0.625. The odds ratio (OR) is the ratio of these odds and is 0.140/0.625 = 0.224. That is, the odds of recurrence in patients who received treatment A (x = 1) are only 22% as great as in those who received treatment B (x = 0).
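The arithmetic above can be reproduced in a few lines. The following is a minimal Python sketch using only the counts from Table 1; the variable and function names are illustrative and not part of the original analysis.

```python
# Counts taken from Table 1 (recurrence by treatment).
recur_a, no_recur_a = 16, 114   # treatment A
recur_b, no_recur_b = 60, 96    # treatment B


def odds(events, non_events):
    """Odds of the event: events divided by non-events."""
    return events / non_events


# Recurrence proportions within each treatment group.
risk_a = recur_a / (recur_a + no_recur_a)
risk_b = recur_b / (recur_b + no_recur_b)

# Odds of recurrence in each group and their ratio (A relative to B).
odds_a = odds(recur_a, no_recur_a)
odds_b = odds(recur_b, no_recur_b)
odds_ratio = odds_a / odds_b

print(f"Recurrence, treatment A: {risk_a:.1%}")
print(f"Recurrence, treatment B: {risk_b:.1%}")
print(f"Odds A = {odds_a:.3f}, odds B = {odds_b:.3f}")
print(f"Odds ratio (A vs B) = {odds_ratio:.3f}")
```

Note that the denominators (130 and 156) cancel when forming the odds, which is why each odds reduces to a simple ratio of cell counts.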

The logistic regression model is appropriate for modeling a binary outcome (such as the recurrence of disease at the end of the study). The model is as follows:

$$\pi(x) = \frac{e^{\beta_0 + \beta_1 x}}{1 + e^{\beta_0 + \beta_1 x}}$$

where x is the independent variable. So, if we know the value of x, that is, the treatment the patient received, then π(x) expresses the probability that a patient with that value of x will experience a recurrence. One of the great appeals of logistic regression analysis is that exponentiating the coefficient associated with the independent variable results in a direct estimate of the OR. That is, $e^{\beta_1} = \mathrm{OR}$.
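The link between the coefficient and the OR can be verified in two steps. The derivation below is a sketch that assumes the standard single-covariate logistic form, $\pi(x) = e^{\beta_0 + \beta_1 x}/(1 + e^{\beta_0 + \beta_1 x})$, and the coding x = 1 for treatment A and x = 0 for treatment B.

```latex
\begin{align*}
\frac{\pi(x)}{1 - \pi(x)} &= e^{\beta_0 + \beta_1 x}
  && \text{(odds of recurrence at } x\text{)} \\[4pt]
\mathrm{OR} &= \frac{\pi(1)/[1 - \pi(1)]}{\pi(0)/[1 - \pi(0)]}
  = \frac{e^{\beta_0 + \beta_1}}{e^{\beta_0}}
  = e^{\beta_1}
\end{align*}
```

The constant $\beta_0$ cancels in the ratio, so the OR depends only on the coefficient of the independent variable.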

The fitted logistic regression model for these data is shown in Table 2. Exponentiating the estimated coefficient for treatment in Table 2 gives $e^{-1.494} = 0.224$, the identical OR we computed earlier from the contingency table in Table 1.

TABLE 2. Fitted Logistic Regression Model Containing Treatment

             Coefficient    Standard Error        z    P value    95% Confidence Interval
Treatment         −1.494            0.3136    −4.76      <.001    −2.108 to −0.879
Constant          −0.470            0.1646    −2.86       .004    −0.793 to −0.147
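With a single binary independent variable, the maximum likelihood estimates have a closed form in the cell counts of Table 1, so the entries of Table 2 can be checked without fitting software. The sketch below assumes the coding x = 1 for treatment A and x = 0 for treatment B, Wald-type standard errors and confidence intervals, and a two-sided normal P value; the helper name `wald_summary` is hypothetical.

```python
import math

# Table 1 counts; x = 1 codes treatment A, x = 0 codes treatment B.
a_yes, a_no = 16, 114
b_yes, b_no = 60, 96

# Closed-form estimates for one binary covariate:
# the constant is the log odds in the x = 0 group,
# the slope is the difference in log odds (the log odds ratio).
beta0 = math.log(b_yes / b_no)
beta1 = math.log(a_yes / a_no) - beta0

# Large-sample standard errors from the reciprocal cell counts.
se_beta0 = math.sqrt(1 / b_yes + 1 / b_no)
se_beta1 = math.sqrt(1 / a_yes + 1 / a_no + 1 / b_yes + 1 / b_no)

Z_975 = 1.959964  # 97.5th percentile of the standard normal


def wald_summary(coef, se):
    """Return (z, two-sided P value, lower 95% limit, upper 95% limit)."""
    z = coef / se
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p, coef - Z_975 * se, coef + Z_975 * se


z1, p1, lo1, hi1 = wald_summary(beta1, se_beta1)
z0, p0, lo0, hi0 = wald_summary(beta0, se_beta0)

print(f"Treatment {beta1:.3f} {se_beta1:.4f} {z1:.2f} {p1:.3g} {lo1:.3f} to {hi1:.3f}")
print(f"Constant  {beta0:.3f} {se_beta0:.4f} {z0:.2f} {p0:.3g} {lo0:.3f} to {hi0:.3f}")
print(f"exp(beta1) = {math.exp(beta1):.3f}")
```

This shortcut works only for the one-variable model; once age is added, the estimates must be obtained iteratively by statistical software.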

Researchers will immediately recognize that one potential reason why treatment A performed so much better than treatment B is that the patients who received treatment A may have differed with respect to some other characteristic, such as age. If age is related to recurrence and is also distributed differently in the two treatment groups, then age may be a confounder of the association between treatment and recurrence. To explore this possibility using a logistic regression model, we only have to include age in the previous model (Table 3).

TABLE 3. Fitted Logistic Regression Model Containing Treatment and Age

             Coefficient    Standard Error        z    P value    95% Confidence Interval
Treatment         −1.460            0.3162    −4.62      <.001    −2.080 to −0.841
Age                0.025            0.0130     1.88       .060    −0.001 to 0.050
Constant          −1.171            0.4217    −2.78       .005    −1.997 to −0.344
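The coefficients in Table 3 can also be turned into estimated probabilities of recurrence. The sketch below assumes the usual multivariable extension of the model, in which the linear predictor is the constant plus each coefficient times its variable, that age is measured in years, and that x = 1 codes treatment A. The age of 60 is an arbitrary illustrative value, and `predicted_probability` is a hypothetical helper.

```python
import math

# Rounded coefficients as reported in Table 3.
CONSTANT = -1.171
B_TREATMENT = -1.460   # x = 1 for treatment A, x = 0 for treatment B
B_AGE = 0.025          # assumed to be per year of age


def predicted_probability(treatment_a, age):
    """Estimated probability of recurrence by the end of the study."""
    linear = CONSTANT + B_TREATMENT * treatment_a + B_AGE * age
    return 1.0 / (1.0 + math.exp(-linear))


age = 60  # illustrative value only
p_a = predicted_probability(1, age)
p_b = predicted_probability(0, age)

# The ratio of the two odds recovers exp(B_TREATMENT) at any fixed age.
odds_ratio = (p_a / (1 - p_a)) / (p_b / (1 - p_b))

print(f"Age {age}, treatment A: {p_a:.3f}")
print(f"Age {age}, treatment B: {p_b:.3f}")
print(f"Odds ratio at fixed age: {odds_ratio:.3f}")
```

Because the coefficients are rounded to three decimals, the probabilities are approximate.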

If we exponentiate the coefficient associated with treatment, we obtain $e^{-1.46} = 0.232$. This is the adjusted OR, where we have controlled for age. Note that the OR did not change much from the crude OR (ie, controlling for nothing), and hence, age is judged not to be a confounder of treatment in these data.
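The comparison of crude and adjusted estimates can be made explicit by exponentiating the reported coefficients and confidence limits. The sketch below uses the rounded values from Tables 2 and 3; expressing the shift as a percent change, and scaling the age effect to 10 years, are illustrative choices rather than steps taken in the original analysis.

```python
import math

# Rounded treatment coefficients from Table 2 (crude) and Table 3 (adjusted).
crude_coef = -1.494
adjusted_coef = -1.460
adjusted_ci = (-2.080, -0.841)   # 95% limits on the coefficient scale
age_coef = 0.025                 # Table 3, assumed per year of age

crude_or = math.exp(crude_coef)
adjusted_or = math.exp(adjusted_coef)

# Exponentiating the limits gives a confidence interval for the OR itself.
adjusted_or_ci = tuple(math.exp(limit) for limit in adjusted_ci)

# Relative change in the OR after adjusting for age.
relative_change = (adjusted_or - crude_or) / crude_or

# OR for recurrence associated with a 10-year increase in age.
age_or_10y = math.exp(10 * age_coef)

print(f"Crude OR    = {crude_or:.3f}")
print(f"Adjusted OR = {adjusted_or:.3f} "
      f"(95% CI {adjusted_or_ci[0]:.3f} to {adjusted_or_ci[1]:.3f})")
print(f"Change after adjustment = {relative_change:.1%}")
print(f"OR per 10 years of age  = {age_or_10y:.3f}")
```

A shift of only a few percent between the crude and adjusted ORs is consistent with the conclusion that age does not confound the treatment effect in these data.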

Note that to use the logistic regression model, all we need to know about recurrence is whether it is present (y = 1) or absent (y = 0) for each subject at the end of the study. The fact that subjects might have been under observation for varying lengths of time over the course of the study was not considered or used in any way. The resulting estimate of effect for treatment is the OR, adjusted for age, and is applicable as a measure of effect only at the end of the study. Hosmer and Lemeshow provide a detailed treatment of modeling binary outcome data using the logistic regression model [1].

The authors indicate no financial support or financial conflict of interest. Both authors were involved in design and conduct of study; data collection; analysis and interpretation of data; and preparation and review of the manuscript.

Reference 

  1. Hosmer D, Lemeshow S. Applied Logistic Regression. 2nd ed. New York, New York: John Wiley & Sons Inc; 2000.

PII: S0002-9394(08)00610-7

doi:10.1016/j.ajo.2008.07.042
