Estimating the Concordance Probability in a Survival Analysis with a Discrete Number of Risk Groups (2024)


Lifetime Data Anal. Author manuscript; available in PMC 2017 Apr 1.

Published in final edited form as:

Lifetime Data Anal. 2016 Apr; 22(2): 263–279.

Published online 2015 May 29. doi:10.1007/s10985-015-9330-3

PMCID: PMC4886856

NIHMSID: NIHMS783569

PMID: 26022558

Glenn Heller1,* and Qianxing Mo2


The publisher's final edited version of this article is available at Lifetime Data Anal

Abstract

A clinical risk classification system is an important component of a treatment decision algorithm. A measure used to assess the strength of a risk classification system is discrimination, and when the outcome is survival time, the most commonly applied global measure of discrimination is the concordance probability. The concordance probability represents the pairwise probability of lower patient risk given longer survival time. The c-index and the concordance probability estimate have been used to estimate the concordance probability when patient-specific risk scores are continuous. In the current paper, the concordance probability estimate and an inverse probability censoring weighted c-index are modified to account for discrete risk scores. Simulations are generated to assess the finite sample properties of the concordance probability estimate and the weighted c-index. An application of these measures of discriminatory power to a metastatic prostate cancer risk classification system is examined.

Keywords: C-index, Concordance probability estimate, Discrimination, Inverse probability censoring weight, Risk classification

1. Introduction

Treatment decisions are based on prognosis derived from a set of molecular, pathological, and histological patient factors. A treatment decision algorithm is often determined from a reduced set of important prognostic markers that are categorized to create a risk classification system. The quality of the treatment decision algorithm is tied to the strength of the risk classification.

Discrimination is an important metric in assessing the strength of a risk classification system in survival analysis. When the endpoint is survival time, good discrimination implies that the ordinality of the risk groups predicts the ranking of survival times. If a risk group classification can be constructed to confidently delineate long-term survivors from short-term survivors, then it will provide utility in clinical decision making.

The most widely used metric for the global assessment of discrimination is the concordance probability. Its popularity stems from its equivalence to the area under the receiver operating characteristic curve when the outcome is binary, and its relation to Kendall’s tau and the Goodman-Kruskal gamma for continuous outcomes (Sprent 1989). For survival outcomes, the concordance probability is defined for two hypothetical subjects as Pr[R2 > R1|T1 > T2], where R represents the risk score and T is survival time. The patient-specific risk scores represent the discrete risk groups and may be determined through a regression model via the linear combination β̂TX, with X as a dummy variable. If there are p risk groups, the kth component of the dummy variable X takes the value 1 if the patient is in group k (for k = 1, …, p − 1) and all other components of X are zero. In this paper it is assumed that the proportional hazards model h(t|X) = h0(t) exp(βTX) was used to determine patient risk. The application of this model requires that the adequacy of the proportional hazards assumption has been empirically tested and that the model assumptions cannot be rejected. Two common methods used to test the proportional hazards assumption are developed in Lin et al. (1993) and Grambsch and Therneau (1994).

The outline of this paper is as follows. In Section 2, two previously developed estimates of the concordance probability are presented for continuous risk scores. The estimates, the weighted c-index and the concordance probability estimate (CPE), have derived asymptotic distributions with continuous risk scores. In Section 3, these estimates are adapted for discrete risk scores. The asymptotic distributions of the CPE and the weighted c-index with discrete risk scores are presented. An analytic expression for the asymptotic variance of the CPE is derived. For the weighted c-index, a resampling based estimate of the asymptotic variance is utilized. Simulations are presented in Section 4 to examine the finite sample properties of the estimates. The development of a risk group classification system in prostate cancer is used to illustrate the methods in Section 5, and further discussion points are provided in Section 6.

2. Concordance probability with continuous risk scores

It is assumed that a correctly specified proportional hazards model was used to develop the risk score and that larger values of βTX represent a greater risk of death. The concordance probability, derived from this regression model, is

$$\mathcal{C} = \Pr(\beta^T X_1 > \beta^T X_2 \mid T_2 > T_1).$$

An early estimate of the concordance probability, adapted for survival analysis, is the c-index (Harrell et al. 1982)

$$\frac{\sum_i \sum_j I(y_i < y_j)\, I(\delta_i = 1)\, I(\hat\beta^T X_i > \hat\beta^T X_j)}{\sum_i \sum_j I(y_i < y_j)\, I(\delta_i = 1)},$$

where y is the minimum of the survival time t and censoring time u, δ = I(t ≤ u), and β̂TX is the estimated risk score from the regression model. Although this has been the predominant discrimination statistic used in survival analysis, it has been observed that the statistic is biased and its distribution is a function of the censoring distribution (Gönen and Heller 2005, Uno et al. 2011). Indeed, the limiting value of Harrell’s c-index is Pr[βTX1 > βTX2|T1 < T2, T1 < U1 ∧ U2], where U1 ∧ U2 is the smaller of the two censoring times. Thus, the c-index has the undesirable property that it is influenced by the rate of patient accrual and the length of the study.
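The unweighted c-index above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation; the function name `harrell_c_index` and the toy data are assumptions made for the example.

```python
def harrell_c_index(y, delta, risk):
    """Unweighted c-index: among usable pairs (y_i < y_j with delta_i = 1),
    the fraction in which subject i also has the higher risk score."""
    concordant = 0
    usable = 0
    n = len(y)
    for i in range(n):
        if delta[i] != 1:
            continue  # a pair is usable only if the earlier time is a death
        for j in range(n):
            if y[i] < y[j]:
                usable += 1
                if risk[i] > risk[j]:
                    concordant += 1
    return concordant / usable


# Hypothetical toy data: follow-up times, death indicators, risk scores.
y = [2.0, 4.0, 5.0, 7.0, 9.0]
delta = [1, 1, 0, 1, 0]
risk = [1.8, 0.9, 1.1, 0.4, 0.2]
print(harrell_c_index(y, delta, risk))  # 7 concordant of 8 usable pairs
```

Censored subjects never anchor a pair in this statistic, which is one way to see why its limiting value depends on the censoring distribution.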

To address this limitation, Uno et al. (2011) introduced inverse probability censoring weights to the c-index. The inverse probability censoring weights are incorporated to produce a consistent estimate of the concordance probability when the support of the survival distribution is less than the support of the censoring distribution. The weighted c-index is computed as

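$$\frac{\sum_i \sum_j I(y_i < y_j)\, I(\delta_i = 1)\, I(\hat\beta^T X_i > \hat\beta^T X_j)\, \{\hat G(y_i)\}^{-2}}{\sum_i \sum_j I(y_i < y_j)\, I(\delta_i = 1)\, \{\hat G(y_i)\}^{-2}},$$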

where Ĝ(y) is the Kaplan-Meier estimated survival function for the censoring time random variable. For this weighted c-index, it is assumed that the censoring random variable is independent of the risk covariates. An extension developed by modeling the censoring distribution as a function of the covariates was proposed by Gerds et al. (2013). In general, measures based on inverse probability censoring weights require care as they may be sensitive to late failure times.

If {Ti, Xi} are continuous, independent identically distributed random variables, an application of Bayes theorem shows that the concordance probability is also equal to Pr[T2 > T1|βTX1 > βTX2]. This form of the concordance probability is used to derive an alternative estimate (Gönen and Heller 2005), based on the proportional hazards specification

m(T) =βTX +ε,

where ε is a standard extreme value random variable and m is a monotone function (Kalbfleisch 1978). It follows that

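$$\Pr(T_2 > T_1 \mid \beta^T X_1 > \beta^T X_2) = \frac{\iint_{\beta^T X_1 > \beta^T X_2} \left[1 + \exp\{\beta^T (X_2 - X_1)\}\right]^{-1} dF(\beta^T X_1)\, dF(\beta^T X_2)}{\iint_{\beta^T X_1 > \beta^T X_2} dF(\beta^T X_1)\, dF(\beta^T X_2)}$$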

where F is the distribution function of the risk score βTX. The concordance probability estimate (CPE) is attained by substituting the partial likelihood estimate for β and the empirical distribution function for F

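$$\frac{2}{n(n-1)} \sum_{i<j} \left\{ \frac{I(\hat\beta^T X_{ji} < 0)}{1 + \exp(\hat\beta^T X_{ji})} + \frac{I(\hat\beta^T X_{ij} < 0)}{1 + \exp(\hat\beta^T X_{ij})} \right\},$$

where $X_{ji} = X_j - X_i$.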
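As a rough sketch of the CPE for continuous risk scores, the following Python function evaluates the double sum directly from a list of fitted scores. The name `cpe_continuous` is hypothetical, and the input is assumed to hold the values β̂TX for each subject.

```python
import math


def cpe_continuous(risk):
    """CPE of Section 2: average over pairs i < j of the model-based
    probability that the lower-risk subject outlives the higher-risk one."""
    n = len(risk)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            d_ji = risk[j] - risk[i]  # beta-hat^T X_ji
            if d_ji < 0:
                total += 1.0 / (1.0 + math.exp(d_ji))
            elif d_ji > 0:  # equivalently beta-hat^T X_ij < 0
                total += 1.0 / (1.0 + math.exp(-d_ji))
            # exactly tied scores contribute nothing in this form
    return 2.0 * total / (n * (n - 1))


print(cpe_continuous([0.0, 0.4, 1.1, 1.9]))
```

No survival or censoring times enter the calculation; they influence the estimate only through β̂.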

3. Concordance probability with discrete risk scores

The concordance probability is adapted for use with discrete risk scores. In general, the creation of risk groups induces a large number of ties in the risk scores. For p groups, with nk subjects in group k, the proportion of subject pairs in the same group is

$$\frac{\sum_{k=1}^{p} n_k (n_k - 1)}{\left(\sum_{k=1}^{p} n_k\right)\left[\left(\sum_{k=1}^{p} n_k\right) - 1\right]}.$$

Thus for example, a 200 patient study with 50 patients in each of 4 risk groups, nets approximately 25% of the patient pair comparisons from the same group.
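The 25% figure can be checked with a short calculation. The helper name `same_group_pair_fraction` is an assumption made for this illustration.

```python
def same_group_pair_fraction(group_sizes):
    """Proportion of subject pairs in which both subjects share a risk group."""
    n = sum(group_sizes)
    return sum(nk * (nk - 1) for nk in group_sizes) / (n * (n - 1))


# 200 patients, 50 in each of 4 risk groups: 9800 / 39800
print(round(same_group_pair_fraction([50, 50, 50, 50]), 4))  # 0.2462
```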

3.1 Concordance parameters

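There are two paths to modify the concordance probability 𝒞 for discrete risk scores. The first is to acknowledge the ambiguity resulting from the inclusion of ties in the risk scores

$$\mathcal{C}_I = \Pr(\beta^T X_1 > \beta^T X_2 \mid T_2 > T_1) + 0.5 \times \Pr(\beta^T X_1 = \beta^T X_2 \mid T_2 > T_1),$$

and the second is to exclude patient pairs with the same risk score

$$\mathcal{C}_E = \Pr(\beta^T X_1 > \beta^T X_2 \mid T_2 > T_1,\ \beta^T X_1 \ne \beta^T X_2).$$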

The concordance measures 𝒞I and 𝒞E are the survival analogs of Kendall’s tau-a and Kendall’s tau-b, respectively. Yan and Greene (2008) demonstrate that 𝒞I is a convex combination of 𝒞E and 0.5

𝒞I =[𝒞E×{1−Pr(βTX1 =βTX2)}] +[0.5×Pr(βTX1 =βTX2)].

The parameter 𝒞E is a common concordance measure which ranges from 0.5 to 1.0. In contrast, the inclusion of ties attenuates 𝒞I toward 0.5, with its upper bound equal to 1−0.5 × Pr(βTX1 = βTX2). Thus, the use of 𝒞I penalizes the concordance measure, with fewer risk groups incurring a greater penalty.
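The convex-combination relation and the resulting upper bound are easy to tabulate. The sketch below uses hypothetical function names and input values chosen only for illustration.

```python
def concordance_including_ties(c_excl, p_tie):
    """C_I as a convex combination of C_E and 0.5, weighted by the tie probability."""
    return c_excl * (1.0 - p_tie) + 0.5 * p_tie


def upper_bound_including_ties(p_tie):
    """Largest attainable C_I, reached when C_E = 1."""
    return 1.0 - 0.5 * p_tie


# Hypothetical values: C_E = 0.80 with a quarter of all pairs tied.
print(concordance_including_ties(0.80, 0.25))  # approximately 0.725
print(upper_bound_including_ties(0.25))        # 0.875
```

With fewer risk groups the tie probability grows, so the same 𝒞E maps to a smaller 𝒞I.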

It is assumed that β̂TX1 ≠ β̂TX2 unless X1 = X2 (patient pairs within the same risk group). Under this assumption, conditional on X, the asymptotic distribution of the estimate of 𝒞I follows immediately from the asymptotic distribution of the estimated 𝒞E. As a result, the focus in this paper is on the statistical properties of concordance estimates based on 𝒞E.

3.2 Concordance parameter estimates

The weighted c-index adapted for discrete risk scores is

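$$C_{n,E}(\hat\beta) = \frac{\sum_i \sum_j I(y_i < y_j)\, I(\hat\beta^T X_i > \hat\beta^T X_j)\, I(\delta_i = 1)\, \{\hat G(y_i \mid \hat\beta^T X_i)\, \hat G(y_i \mid \hat\beta^T X_j)\}^{-1}}{\sum_i \sum_j I(y_i < y_j)\, I(\hat\beta^T X_i \ne \hat\beta^T X_j)\, I(\delta_i = 1)\, \{\hat G(y_i \mid \hat\beta^T X_i)\, \hat G(y_i \mid \hat\beta^T X_j)\}^{-1}}.$$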

This estimate explicitly excludes ties in the risk scores and has been modified to account for possibly unequal follow-up between risk groups, using Ĝ(y|β̂TX) as the within group Kaplan-Meier survival function for the censoring time random variable. The weighted c-index is a consistent estimate of 𝒞E when the right support of the survival distribution is less than the right support of the censoring distribution. If the support of the survival distribution exceeds the support of the censoring distribution, then the estimate is stabilized by truncating the follow-up time at τ so that G(τ|βTX) > 0. For the weighted c-index, this is incorporated into the statistic as

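$$C_{n,E}(\hat\beta; \tau) = \frac{\sum_i \sum_j I(y_i < y_j)\, I(\hat\beta^T X_i > \hat\beta^T X_j)\, I(\delta_i = 1)\, I(y_i < \tau_i)\, \{\hat G(y_i \mid \hat\beta^T X_i)\, \hat G(y_i \mid \hat\beta^T X_j)\}^{-1}}{\sum_i \sum_j I(y_i < y_j)\, I(\hat\beta^T X_i \ne \hat\beta^T X_j)\, I(\delta_i = 1)\, I(y_i < \tau_i)\, \{\hat G(y_i \mid \hat\beta^T X_i)\, \hat G(y_i \mid \hat\beta^T X_j)\}^{-1}}$$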

where τi represents the truncation time within the ith subject’s risk group.
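The untruncated statistic Cn,E can be sketched as follows. This is an illustrative reading of the formula, not the code used in the paper: the function names are hypothetical, subjects are identified by a group label, and two subjects are assumed to have different risk scores exactly when their groups differ. The Kaplan-Meier estimate is evaluated at y_i, as written in the formula; other conventions (for example the left limit) are possible.

```python
def censoring_km(times, delta):
    """Kaplan-Meier estimate of the censoring survival function in one group.
    A censored observation (delta == 0) is treated as the event."""
    steps = []
    surv = 1.0
    for t in sorted(set(times)):
        at_risk = sum(1 for y in times if y >= t)
        censored = sum(1 for y, d in zip(times, delta) if y == t and d == 0)
        surv *= 1.0 - censored / at_risk
        steps.append((t, surv))

    def G(t):
        value = 1.0
        for s, g in steps:
            if s > t:
                break
            value = g
        return value

    return G


def weighted_c_index_excluding_ties(y, delta, group, risk_of_group):
    """Inverse probability censoring weighted c-index over pairs of subjects
    in different risk groups, with within-group censoring weights."""
    n = len(y)
    G = {}
    for g in set(group):
        idx = [k for k in range(n) if group[k] == g]
        G[g] = censoring_km([y[k] for k in idx], [delta[k] for k in idx])
    num = den = 0.0
    for i in range(n):
        if delta[i] != 1:
            continue
        for j in range(n):
            if y[i] < y[j] and group[i] != group[j]:
                w = 1.0 / (G[group[i]](y[i]) * G[group[j]](y[i]))
                den += w
                if risk_of_group[group[i]] > risk_of_group[group[j]]:
                    num += w
    return num / den


# Hypothetical toy data with one censored low-risk subject.
y = [1.0, 2.0, 3.0, 4.0]
delta = [1, 0, 1, 1]
group = ["H", "L", "L", "H"]
print(weighted_c_index_excluding_ties(y, delta, group, {"H": 1.0, "L": 0.0}))
```

In the toy data the single discordant pair receives weight 2 because half of the low-risk group has been censored by that time, so the weighted value (0.5) is lower than the unweighted proportion (2/3).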

The asymptotic normality of the weighted c-index for continuous risk scores was demonstrated in Uno et al. (2011). For discrete risk scores, the asymptotic normality is derived in the appendix. This extension follows directly from the work in Cheng et al. (1995) and Fine et al. (1998). Estimation of the asymptotic variance, however, requires density estimation for the within group censoring random variable. To avoid kernel smoothing and bandwidth selection, a bootstrap standard error estimate of the weighted c-index, stratified by risk group, is used in the simulations. An alternative resampling approach was used in Uno et al. (2011).
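A bootstrap stratified by risk group resamples subjects with replacement within each group, so every replicate keeps the original group sizes. The sketch below is a generic illustration under that reading; `stratified_bootstrap_se`, its arguments, and the default of 200 replicates are assumptions, not details taken from the paper.

```python
import random
import statistics


def stratified_bootstrap_se(records, group_of, statistic, n_boot=200, seed=0):
    """Standard deviation of `statistic` over bootstrap samples drawn
    with replacement separately within each risk group."""
    rng = random.Random(seed)
    strata = {}
    for rec in records:
        strata.setdefault(group_of(rec), []).append(rec)
    estimates = []
    for _ in range(n_boot):
        sample = []
        for members in strata.values():
            sample.extend(rng.choices(members, k=len(members)))
        estimates.append(statistic(sample))
    return statistics.stdev(estimates)


# Hypothetical records (group, value); the statistic here is just a mean.
records = [("a", 1.0), ("a", 2.0), ("b", 5.0), ("b", 7.0), ("b", 9.0)]
mean_value = lambda s: sum(r[1] for r in s) / len(s)
print(stratified_bootstrap_se(records, lambda r: r[0], mean_value))
```

In practice `statistic` would be a function that refits the model and recomputes the weighted c-index on each resampled data set.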

The CPE with discrete risk scores is

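$$K_{n,E}(\hat\beta) = \frac{\sum_i \sum_j I(\hat\beta^T X_i > \hat\beta^T X_j)\, [1 + \exp(\hat\beta^T X_{ji})]^{-1}}{\sum_i \sum_j I(\hat\beta^T X_i > \hat\beta^T X_j)}.$$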

The CPE is a consistent estimate of 𝒞E when the proportional hazards specification is correct. The consistency and the asymptotic distribution of the CPE, including its asymptotic variance, are derived in the appendix.

Theorem

Under the standard conditions for the proportional hazards model, $n^{1/2}[K_{n,E}(\hat\beta) - \mathcal{C}_E]$ is asymptotically normal with mean 0.

The CPE is a model based measure of concordance and therefore relies on the correctness of the proportional hazards model. The advantage of using the CPE is that it does not require inverse probability censoring weights, which may be sensitive to late failures. Use of the CPE provides additional insight into the discriminatory power of a risk classification system derived with p risk groups by noting that the CPE is a weighted average of pairwise probabilities, with each element representing the probability the lower risk patient lives longer

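$$K_{n,E}(\hat\beta) = \frac{\sum_{k=1}^{p} \sum_{l=1}^{p} I[\hat\beta_k > \hat\beta_l]\, n_k n_l\, [1 + \exp\{\hat\beta_l - \hat\beta_k\}]^{-1}}{\sum_{k=1}^{p} \sum_{l=1}^{p} I[\hat\beta_k > \hat\beta_l]\, n_k n_l}.$$

Thus, in addition to computing the overall CPE, we can extract the individual probability $[1 + \exp\{\hat\beta_l - \hat\beta_k\}]^{-1}$ that a patient in the lower-risk group l lives longer than a patient in the higher-risk group k (where β̂k > β̂l).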
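The group-level form of the CPE lends itself to a direct calculation from the fitted coefficients and the group sizes. The following is a minimal sketch; the function name, the coefficient values, and the group sizes are hypothetical, and the reference group is assumed to be coded with coefficient 0.

```python
import math


def cpe_by_group(beta, n):
    """CPE excluding ties from group coefficients beta[k] and group sizes n[k].
    Also returns the pairwise probabilities, keyed (lower-risk, higher-risk)."""
    num = den = 0.0
    pairwise = {}
    for k in range(len(beta)):
        for l in range(len(beta)):
            if beta[k] > beta[l]:
                # probability that the group-l patient outlives the group-k patient
                p = 1.0 / (1.0 + math.exp(beta[l] - beta[k]))
                pairwise[(l, k)] = p
                num += n[k] * n[l] * p
                den += n[k] * n[l]
    return num / den, pairwise


# Hypothetical three-group example; group 0 is the reference group.
cpe, pairwise = cpe_by_group([0.0, 0.5, 1.0], [40, 30, 30])
print(round(cpe, 3), {pair: round(p, 3) for pair, p in pairwise.items()})
```

The weights n_k n_l show that pairs drawn from the largest groups dominate the overall estimate.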

4. Simulations

Simulation experiments were conducted to examine the accuracy of the CPE and weighted c-index for a four group risk classification system. The data were generated from Weibull and lognormal regression models. For the Weibull model the survival times were produced from

$$T_i = \exp[-0.5 x_{1i} - 0.25 x_{2i} - 0.10 x_{3i}]\, \varepsilon_i$$

where X = (x1, x2, x3)T are dummy variables. The errors {εi} were independent identically distributed Weibull random variables with scale parameter 1 and shape parameters {1.85, 4.1, 7.3, 13.5}, which were chosen to produce concordance probabilities equal to {0.6, 0.7, 0.8, 0.9}. The Weibull regression model satisfies the proportional hazards assumption. For the lognormal model, the log survival times were created using the same covariates and regression parameters. The log of the error random variables {log εi} were independent identically distributed normal random variables with mean 0 and scale parameters {0.62, 0.28, 0.16, 0.09}, which were again chosen to produce concordance probabilities equal to {0.6, 0.7, 0.8, 0.9}. The lognormal simulations were constructed to examine the robustness of the CPE and the weighted c-index statistics when the regression coefficients were computed using the proportional hazards model but the data do not satisfy the proportional hazards assumptions.

For each group, independent censoring times, uniform on (0, Mk) (k = 1, 2, 3, 4), were generated to determine the overall proportion censored. The sample size for each simulation was 200 and the proportion of patients in the four groups was {0.1, 0.2, 0.3, 0.4}. One thousand simulations were run in each setting to compute the weighted c-index excluding ties (Cn,E), and the CPE excluding ties (Kn,E). The results of the simulations are summarized in Tables 1–3.
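One replicate of the Weibull design could be generated along the following lines. This is a sketch under stated assumptions: the censoring limits `cens_max` are placeholders (the values of Mk are not given in the text), and the assignment of the proportions {0.1, 0.2, 0.3, 0.4} to particular groups, with the last group as the reference, is a guess made for illustration.

```python
import math
import random


def simulate_weibull(n=200, shape=1.85, props=(0.1, 0.2, 0.3, 0.4),
                     cens_max=(3.0, 3.0, 3.0, 3.0), seed=1):
    """Return a list of (y, delta, group) tuples for one simulated data set."""
    rng = random.Random(seed)
    coef = (-0.5, -0.25, -0.10)  # coefficients of x1, x2, x3
    data = []
    for k, prop in enumerate(props):
        for _ in range(round(n * prop)):
            # Dummy coding: groups 0-2 switch on x1-x3; group 3 is the reference.
            x = [1 if k == m else 0 for m in range(3)]
            eps = rng.weibullvariate(1.0, shape)  # scale 1, given shape
            t = math.exp(sum(c * xi for c, xi in zip(coef, x))) * eps
            u = rng.uniform(0.0, cens_max[k])     # group-specific censoring time
            data.append((min(t, u), int(t <= u), k))
    return data


data = simulate_weibull()
print(len(data), sum(d for _, d, _ in data) / len(data))  # n and proportion of deaths
```

Tuning `cens_max` per group is how a target censoring proportion such as 0.25 or 0.50 would be reached.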

Table 1

Proportional hazards data

| CP | Prop cen | Cn,E avg | Kn,E avg | Cn,E sim se | Kn,E sim se | se(Cn,E) avg | se(Kn,E) avg | cov(Cn,E) | cov(Kn,E) | RE |
|---|---|---|---|---|---|---|---|---|---|---|
| 0.6 | 0.00 | 0.602 | 0.605 | 0.028 | 0.025 | 0.028 | 0.027 | 0.953 | 0.958 | 1.212 |
| 0.6 | 0.25 | 0.606 | 0.605 | 0.031 | 0.029 | 0.030 | 0.032 | 0.938 | 0.956 | 1.151 |
| 0.6 | 0.50 | 0.607 | 0.609 | 0.037 | 0.036 | 0.037 | 0.041 | 0.927 | 0.958 | 1.030 |
| 0.6 | 0.75 | 0.618 | 0.624 | 0.054 | 0.046 | 0.056 | 0.058 | 0.929 | 0.941 | 1.204 |
| 0.7 | 0.00 | 0.700 | 0.702 | 0.026 | 0.022 | 0.025 | 0.024 | 0.944 | 0.958 | 1.385 |
| 0.7 | 0.24 | 0.701 | 0.703 | 0.027 | 0.026 | 0.029 | 0.028 | 0.955 | 0.963 | 1.066 |
| 0.7 | 0.49 | 0.698 | 0.702 | 0.035 | 0.034 | 0.035 | 0.035 | 0.935 | 0.955 | 1.059 |
| 0.7 | 0.75 | 0.693 | 0.701 | 0.053 | 0.046 | 0.052 | 0.051 | 0.929 | 0.955 | 1.350 |
| 0.8 | 0.00 | 0.798 | 0.800 | 0.021 | 0.019 | 0.021 | 0.020 | 0.937 | 0.963 | 1.233 |
| 0.8 | 0.24 | 0.800 | 0.800 | 0.023 | 0.022 | 0.023 | 0.023 | 0.949 | 0.953 | 1.093 |
| 0.8 | 0.50 | 0.797 | 0.801 | 0.029 | 0.027 | 0.030 | 0.028 | 0.944 | 0.947 | 1.164 |
| 0.8 | 0.75 | 0.784 | 0.797 | 0.045 | 0.040 | 0.045 | 0.041 | 0.950 | 0.946 | 1.418 |
| 0.9 | 0.00 | 0.900 | 0.900 | 0.015 | 0.013 | 0.015 | 0.014 | 0.928 | 0.949 | 1.331 |
| 0.9 | 0.24 | 0.899 | 0.901 | 0.016 | 0.014 | 0.016 | 0.015 | 0.939 | 0.957 | 1.305 |
| 0.9 | 0.50 | 0.900 | 0.901 | 0.020 | 0.018 | 0.020 | 0.019 | 0.936 | 0.940 | 1.231 |
| 0.9 | 0.75 | 0.894 | 0.899 | 0.031 | 0.027 | 0.031 | 0.026 | 0.942 | 0.928 | 1.366 |

CP = Concordance Probability; Prop cen = proportion censored

Cn,E = Weighted c-index excluding ties; Kn,E = CPE excluding ties

avg = average; sim se = simulation standard error; cov = coverage

RE = Mean square error efficiency of the CPE relative to the weighted c-index

Table 3

Log normal survival time data

| CP | Prop cen | Cn,E avg | Kn,E avg | Cn,E sim se | Kn,E sim se | se(Cn,E) avg | se(Kn,E) avg | cov(Cn,E) | cov(Kn,E) |
|---|---|---|---|---|---|---|---|---|---|
| 0.6 | 0.00 | 0.602 | 0.590 | 0.028 | 0.026 | 0.030 | 0.027 | 0.953 | 0.940 |
| 0.6 | 0.24 | 0.605 | 0.601 | 0.032 | 0.028 | 0.031 | 0.032 | 0.934 | 0.975 |
| 0.6 | 0.50 | 0.612 | 0.614 | 0.037 | 0.034 | 0.037 | 0.040 | 0.920 | 0.950 |
| 0.6 | 0.75 | 0.629 | 0.641 | 0.055 | 0.048 | 0.055 | 0.057 | 0.903 | 0.873 |
| 0.7 | 0.00 | 0.699 | 0.675 | 0.028 | 0.026 | 0.027 | 0.024 | 0.931 | 0.836 |
| 0.7 | 0.25 | 0.701 | 0.686 | 0.031 | 0.030 | 0.031 | 0.028 | 0.934 | 0.916 |
| 0.7 | 0.50 | 0.700 | 0.700 | 0.038 | 0.035 | 0.037 | 0.035 | 0.946 | 0.942 |
| 0.7 | 0.75 | 0.705 | 0.726 | 0.056 | 0.045 | 0.052 | 0.049 | 0.911 | 0.906 |
| 0.8 | 0.00 | 0.800 | 0.769 | 0.020 | 0.023 | 0.021 | 0.021 | 0.950 | 0.691 |
| 0.8 | 0.24 | 0.801 | 0.779 | 0.024 | 0.026 | 0.025 | 0.023 | 0.948 | 0.829 |
| 0.8 | 0.49 | 0.800 | 0.789 | 0.032 | 0.030 | 0.031 | 0.028 | 0.929 | 0.936 |
| 0.8 | 0.75 | 0.795 | 0.807 | 0.050 | 0.040 | 0.046 | 0.038 | 0.921 | 0.926 |
| 0.9 | 0.00 | 0.900 | 0.873 | 0.015 | 0.018 | 0.014 | 0.015 | 0.939 | 0.583 |
| 0.9 | 0.24 | 0.899 | 0.877 | 0.017 | 0.020 | 0.017 | 0.017 | 0.944 | 0.725 |
| 0.9 | 0.49 | 0.900 | 0.885 | 0.020 | 0.024 | 0.021 | 0.019 | 0.933 | 0.858 |
| 0.9 | 0.75 | 0.898 | 0.893 | 0.033 | 0.031 | 0.033 | 0.026 | 0.926 | 0.913 |

CP = Concordance Probability; Prop cen = proportion censored

Cn,E = Weighted c-index excluding ties; Kn,E = CPE excluding ties

avg = average; sim se = simulation standard error; cov = coverage

When proportional hazards is correct, the results shown in Table 1 demonstrate that the CPE is relatively unbiased, except when the concordance probability is small. The bias that occurs when the regression model is weak is a function of the bias in the Cox regression coefficient estimates in these simulations. For the weighted c-index, in addition to the bias when the CP = 0.60, a bias is present when the censoring proportion is high. The deficiency of the weighted c-index when data are highly censored is likely attributable to the shorter censoring distribution support relative to the survival distribution support; a violation of a key assumption for the weighted c-index stated in Section 2. The weighted c-index with a time truncation, Cn,E(β̂, τ) in Section 3.2, attenuated the weighted c-index and was not helpful in reducing the bias (Table 2). The truncation time was chosen such that within each risk group, G(τ|βTX) > 0.10.

Table 2

Weighted c-index with time truncation; proportional hazards data

| CP | Prop cen | Cn,E avg | Cn,E(τ) avg | Cn,E sim se | Cn,E(τ) sim se | se(Cn,E) avg | se(Cn,E(τ)) avg | cov(Cn,E) | cov(Cn,E(τ)) |
|---|---|---|---|---|---|---|---|---|---|
| 0.6 | 0.00 | 0.602 | 0.602 | 0.028 | 0.028 | 0.028 | 0.028 | 0.953 | 0.953 |
| 0.6 | 0.25 | 0.606 | 0.606 | 0.031 | 0.031 | 0.030 | 0.030 | 0.938 | 0.938 |
| 0.6 | 0.50 | 0.607 | 0.604 | 0.037 | 0.038 | 0.037 | 0.037 | 0.927 | 0.930 |
| 0.6 | 0.75 | 0.618 | 0.612 | 0.054 | 0.053 | 0.056 | 0.055 | 0.929 | 0.945 |
| 0.7 | 0.00 | 0.700 | 0.700 | 0.026 | 0.026 | 0.025 | 0.025 | 0.944 | 0.944 |
| 0.7 | 0.24 | 0.701 | 0.701 | 0.027 | 0.027 | 0.029 | 0.029 | 0.955 | 0.955 |
| 0.7 | 0.49 | 0.698 | 0.696 | 0.035 | 0.036 | 0.035 | 0.035 | 0.935 | 0.933 |
| 0.7 | 0.74 | 0.693 | 0.677 | 0.053 | 0.056 | 0.052 | 0.055 | 0.929 | 0.914 |
| 0.8 | 0.00 | 0.798 | 0.798 | 0.021 | 0.021 | 0.021 | 0.021 | 0.937 | 0.937 |
| 0.8 | 0.24 | 0.800 | 0.800 | 0.023 | 0.023 | 0.023 | 0.023 | 0.949 | 0.949 |
| 0.8 | 0.50 | 0.797 | 0.796 | 0.029 | 0.030 | 0.030 | 0.030 | 0.944 | 0.944 |
| 0.8 | 0.75 | 0.784 | 0.769 | 0.045 | 0.051 | 0.045 | 0.050 | 0.950 | 0.935 |
| 0.9 | 0.00 | 0.900 | 0.900 | 0.015 | 0.015 | 0.015 | 0.015 | 0.928 | 0.928 |
| 0.9 | 0.24 | 0.899 | 0.899 | 0.016 | 0.016 | 0.016 | 0.016 | 0.939 | 0.939 |
| 0.9 | 0.49 | 0.900 | 0.900 | 0.020 | 0.020 | 0.020 | 0.020 | 0.936 | 0.936 |
| 0.9 | 0.75 | 0.894 | 0.888 | 0.031 | 0.033 | 0.031 | 0.033 | 0.942 | 0.945 |

CP = Concordance Probability; Prop cen = proportion censored

Cn,E = Weighted c-index excluding ties

Cn,E(τ) = Time-truncated weighted c-index excluding ties, G(y|βTX) > 0.10

avg = average; sim se = simulation standard error; cov = coverage

The analytic estimated standard error of the CPE derived in this paper is close to its simulation standard error, and the empirical coverage for the concordance probability based on the asymptotic 95% confidence interval Kn,E ± 1.96 × se(Kn,E) is good. As expected, the estimated standard error of the CPE increases as the percent censoring increases and the concordance probability (CP) decreases. For the weighted c-index, the stratified bootstrap resampling standard error and coverage estimate are also accurate.

If the proportional hazards assumption is true, a benefit to using the CPE relative to the weighted c-index is efficiency. The simulation standard errors of Kn,E are uniformly smaller than the simulation standard errors of Cn,E. In terms of mean squared error, the relative efficiency of the CPE to the weighted c-index is always greater than 1 and is as high as 1.4.
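The relative efficiency can be recomputed from the tabulated summaries if the mean squared error is taken to be the squared bias plus the squared simulation standard error. That decomposition is an assumption of this sketch, as are the function names; applied to the rounded entries of the first row of Table 1 (CP = 0.6, no censoring) it returns the tabulated value.

```python
def mse(avg, sim_se, truth):
    """Mean squared error as squared bias plus simulation variance."""
    return (avg - truth) ** 2 + sim_se ** 2


def relative_efficiency(truth, c_avg, c_se, k_avg, k_se):
    """MSE of the weighted c-index divided by MSE of the CPE."""
    return mse(c_avg, c_se, truth) / mse(k_avg, k_se, truth)


# First row of Table 1: CP = 0.6, proportion censored = 0.00
print(round(relative_efficiency(0.6, 0.602, 0.028, 0.605, 0.025), 3))  # 1.212
```

Values above 1 favor the CPE. Other rows may differ in the last digit because the table entries are rounded.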

For nonproportional hazards data, Table 3 indicates that the CPE is not accurate. The CPE is biased and its coverage may be poor. The inadequacy of the CPE in this case is unsurprising, since its kernel is based on a result stemming from extreme value survival times

$$\Pr[T_2 > T_1 \mid X_1, X_2] = \frac{1}{1 + \exp\{\beta^T (X_2 - X_1)\}},$$

whereas for normally distributed log survival times

$$\Pr[T_2 > T_1 \mid X_1, X_2] = 1 - \Phi\left(\frac{\beta^T X_1 - \beta^T X_2}{\sqrt{2}\,\sigma}\right),$$

with Φ(·) representing the standard normal distribution function and σ the scale parameter. The results in Table 3 reinforce the use of diagnostics to assess proportionality before applying the CPE to a given set of data. Conversely, the weighted c-index appears to be sufficiently robust and provides similar bias and variance accuracy for both error distributions.

5. Prostate cancer example

In recent years, considerable progress in the treatment of metastatic prostate cancer has resulted in new treatments using immunotherapy, hormone therapy, and chemotherapy, with each demonstrating a prolongation of life. As a result, patient prognosis, as determined through risk groups, can affect the choice of treatment. Importantly, the incorporation of a risk classification system in the treatment decision process requires an understanding of the strength of this system to discriminate clinical outcome. A risk classification system was developed from 148 newly diagnosed metastatic prostate cancer patients. After screening multiple factors, patient risk was based on a categorization of two important biomarkers in this patient population: circulating tumor cells (CTC) and lactate dehydrogenase (LDH) (Scher et al. 2009). CTC is a blood-based assay that provides information on the accumulation of tumor cells in the peripheral blood. LDH is a marker of cell turnover and is considered an indirect marker of tumor burden. The finding that five or more CTCs at baseline is associated with shorter survival times has been found in multiple metastatic solid tumor populations (Danila et al. 2011). The upper limit of normal for LDH was defined as 250 units per liter as determined by the central laboratory used for this study. As a result of these previously determined normal/abnormal ranges, the risk classification system derived from the two biomarkers was:

| Risk group | Definition |
|---|---|
| Low risk | CTC < 5 |
| Intermediate risk | CTC ≥ 5, LDH ≤ 250 |
| High risk | CTC ≥ 5, LDH > 250 |
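Expressed as a rule, the classification reads as follows. The function name and argument names are hypothetical; CTC is taken as the baseline count and LDH in units per liter, as in the text.

```python
def risk_group(ctc, ldh):
    """Three-level risk classification from CTC count and LDH (units per liter)."""
    if ctc < 5:
        return "low"
    if ldh <= 250:
        return "intermediate"
    return "high"


print(risk_group(2, 300), risk_group(8, 250), risk_group(8, 251))
# low intermediate high
```

LDH plays no role when CTC is below 5, so the low risk group includes patients with elevated LDH.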

The Kaplan-Meier estimates of survival were well separated by risk and are depicted in Figure 1. The median survival times in the three groups were: 23.6 months (low risk), 13.6 months (intermediate risk), and 8.2 months (high risk). A proportional hazards model was generated using this risk classification and a test of the proportional hazards assumption (Grambsch and Therneau 1994) provided insufficient evidence that the proportional hazards assumption was violated (p= 0.293). From the results of the proportional hazards model, the estimated log relative risk of death was 0.860 (se=0.304) and 1.584 (se = 0.280) for the intermediate and high risk groups respectively, compared to the low risk group. This risk classification is clearly prognostic, but p-values alone, which were both <0.005, are insufficient to assess the strength of this classification. In general, p-values are sensitive to the sample size of the data, with a large number of failures producing a downward influence on the p-value. In order to ascertain the strength of this classification system in discriminating patient risk, the concordance probability estimate and weighted c-index were computed.


Figure 1

Kaplan-Meier estimates of survival based on clinical risk.

The CPE and weighted c-index excluding ties were computed to assess the discrimination strength of this risk group classification. The CPE and weighted c-index were equal to 0.753 (se=0.036) and 0.761 (se = 0.060), respectively. These results indicate that the three group staging system provides good discriminatory power and may be used to assist in the treatment decision process.

When patient pairs in the same risk group were included in the discrimination measures, the CPE was equal to 0.659 and the weighted c-index was equal to 0.666, indicating that the derived metastatic prostate cancer risk groups were not sufficiently strong at discriminating survival rates. However, the overall proportion of pairwise comparisons with patients in the same risk group was 0.30, which explains the attenuation of these discrimination measures.

The CPE and weighted c-index provide a measure of the overall discriminatory power of the risk group classification system. In addition to this overall measure of strength, the CPE can also be used to estimate the discriminatory power between individual risk groups. Using the proportional hazards specification

$$\Pr[T_2 > T_1 \mid X_1, X_2] = [1 + \exp\{\beta^T (X_2 - X_1)\}]^{-1},$$

and substituting β̂, the probability is 0.83 that a patient in the low risk group will survive longer than a patient in the high risk group. The probability is 0.70 that a patient in the low risk group survives longer than a patient in the intermediate risk group, and approximately the same probability holds for a patient in the intermediate risk group relative to a patient in the high risk group. This illustrates that although the overall CPE is 0.753, the discriminatory strength between individual groups varies. As noted in Section 3, the overall CPE is a weighted average of these individual measures.
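As a check on the arithmetic, and assuming the reported log relative risks (0 for low, 0.860 for intermediate, 1.584 for high risk) are used as printed, the three pairwise probabilities work out to:

$$
\begin{aligned}
\Pr[T_{\text{low}} > T_{\text{high}}] &= [1 + \exp(-1.584)]^{-1} \approx 0.83,\\
\Pr[T_{\text{low}} > T_{\text{int}}] &= [1 + \exp(-0.860)]^{-1} \approx 0.70,\\
\Pr[T_{\text{int}} > T_{\text{high}}] &= [1 + \exp\{-(1.584 - 0.860)\}]^{-1} = [1 + \exp(-0.724)]^{-1} \approx 0.67.
\end{aligned}
$$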

The discriminatory power of a risk classification system should be validated before the system is put into practice. An independent cohort of 97 metastatic prostate cancer patients (Danila et al. 2014) was used to validate the estimates of the concordance probability. The CPE and weighted c-index with ties excluded were 0.772 and 0.783, respectively. These results are comparable to the development study results and confirm the good discriminatory power of the CTC+LDH risk classification system in this patient population.

6. Discussion

The interpretation of the concordance probability differs depending on whether ties in the risk groups are included or excluded. The inclusion of patient pairs within the same risk group reduces the concordance probability (𝒞I) parameter toward the null value 0.50 and is therefore affected by the distribution of the subjects across risk groups. In contrast, the parameter (𝒞E), based on the exclusion of patient pairs within the same risk group is unaffected by the distribution of patients across groups.

The most commonly applied measure of discrimination is the unweighted c-index. An advantage of the c-index is that it is nonparametric and may be useful if the proportional hazards assumptions do not hold. We previously showed that with continuous risk scores the unweighted c-index had a bias that was a function of the percent censored (Gönen and Heller 2005). We have adapted the weighted c-index (Uno et al. 2011) to the case when patient pairs in the same risk group are excluded in the calculation. Our simulations indicate that the inverse probability censoring weighted c-index excluding ties is robust to the violation of proportional hazards.

Under proportional hazards, the semiparametric CPE excluding ties was more efficient than the nonparametric weighted c-index. Both statistics were relatively unbiased, but at high censoring rates the weighted c-index demonstrated some bias. In general, time truncation is used to reduce bias for an inverse probability censoring weight statistic, but our simulations demonstrated a lack of improvement using the time truncated weighted c-index. In contrast, the CPE is a model based measure of discrimination. It does not require inverse probability censoring weights and therefore does not require a stabilizing truncation point. It does, however, assume that the proportional hazards relation is correct.

Clinical risk classification systems are ubiquitous in medicine and their usefulness in clinical decision making is a function of their discriminatory power. Risk classification systems differ in their discrimination capability and a measure to assess the strength of a classification system should be applied. Our findings indicate that if the proportional hazards model is employed to develop the classification system, and diagnostics are used to confirm the proportional assumptions, then the CPE is an efficient measure of discrimination. If an alternative approach to risk classification is employed, then the nonparametric weighted c-index produces an accurate assessment of discrimination.

R code to compute the CPE and its standard error, with the option to include or exclude ties in the risk scores, is available in the R package CPE. For the weighted c-index, the R package pec gives the user the option to include or exclude ties in the risk scores.

Appendix

The asymptotic distribution for the CPE excluding ties

It is assumed that the proportional hazards model $h(t|X) = h_0(t)\exp(\beta_0^T X)$ is the correct specification for the data and that the standard conditions for the asymptotic normality of $n^{1/2}(\hat\beta - \beta_0)$ apply (Andersen and Gill, 1982).

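Consistency of the CPE excluding ties: $K_{n,E}(\hat\beta) \xrightarrow{p} \mathcal{C}_E$ where

$$K_{n,E}(\hat\beta) = \frac{\sum_i \sum_j I(\hat\beta^T X_i > \hat\beta^T X_j)\, [1 + \exp(\hat\beta^T X_{ji})]^{-1}}{\sum_i \sum_j I(\hat\beta^T X_i > \hat\beta^T X_j)}$$

and

$$\mathcal{C}_E = \Pr[\beta_0^T X_1 > \beta_0^T X_2 \mid T_2 > T_1,\ \beta_0^T X_1 \ne \beta_0^T X_2].$$

Using the proportional hazards specification provided in Section 2 and the law of large numbers,

$$K_{n,E}(\hat\beta) \xrightarrow{p} \Pr[T_2 > T_1 \mid \beta_0^T X_1 > \beta_0^T X_2].$$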

Note that the concordance probability excluding ties may be written as

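$$\mathcal{C}_E = \frac{\Pr[T_2 > T_1 \mid \beta_0^T X_1 > \beta_0^T X_2] \times \Pr[\beta_0^T X_1 > \beta_0^T X_2]}{\Pr[T_2 > T_1,\ \beta_0^T X_1 \ne \beta_0^T X_2]}. \tag{A.1}$$

Now using the independence between subjects, the following relations hold

$$\Pr[\beta_0^T X_1 > \beta_0^T X_2] = \frac{\Pr[\beta_0^T X_1 \ne \beta_0^T X_2]}{2} \tag{A.2}$$

$$\frac{1}{2} = \Pr[T_2 > T_1 \mid \beta_0^T X_1 \ne \beta_0^T X_2]\, \Pr[\beta_0^T X_1 \ne \beta_0^T X_2] + \Pr[T_2 > T_1 \mid \beta_0^T X_1 = \beta_0^T X_2]\, \Pr[\beta_0^T X_1 = \beta_0^T X_2] \tag{A.3}$$

Substituting (A.2) into (A.3), it follows that

$$\Pr[T_2 > T_1,\ \beta_0^T X_1 \ne \beta_0^T X_2] = \Pr[\beta_0^T X_1 > \beta_0^T X_2] \tag{A.4}$$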

and substitution of (A.4) into the denominator of (A.1) proves the consistency of the CPE.

Asymptotic distribution of $n^{1/2}[K_{n,E}(\hat\beta) - \mathcal{C}_E]$

To prove the asymptotic normality of the CPE, it is first shown that if

$$|\beta_0^T X_{ij}| > \varepsilon > 0, \tag{A.5}$$

then $n^{1/2} K_{n,E}(\hat\beta)$ is asymptotically equal to

$$n^{1/2} \tilde K_{n,E}(\hat\beta) = \frac{n^{-3/2} \sum_i \sum_j I(\beta_0^T X_{ij} > 0)\, [1 + \exp(\hat\beta^T X_{ji})]^{-1}}{n^{-2} \sum_i \sum_j I(\beta_0^T X_{ij} > 0)}$$

where within the indicator functions the estimate β̂ is replaced with the true regression coefficient β0 and $X_{ij} = X_i - X_j$. Note that in the discrete covariate case there are a finite number of covariate values Xij. As a result, (A.5) is not a strong assumption.

The asymptotic equality is demonstrated by considering the cases Xij = 0 and Xij ≠ 0 separately.

  • If Xij = 0, then clearly $I(\hat\beta^T X_{ij} > 0) = I(\beta_0^T X_{ij} > 0)$.

  • If Xij ≠ 0, then using the consistency of β̂ and assumption (A.5), for n sufficiently large, |β̂TXij| > ν > 0 and $I(\hat\beta^T X_{ij} > 0) = I(\beta_0^T X_{ij} > 0)$ for all Xij. Therefore, for n large,

    $$n^{1/2} K_{n,E}(\hat\beta) = n^{1/2} \tilde K_{n,E}(\hat\beta).$$

A Taylor expansion for the asymptotically equivalent CPE produces

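$$n^{1/2}[\tilde K_{n,E}(\hat\beta) - \mathcal{C}_E] = n^{1/2}[\tilde K_{n,E}(\beta_0) - \mathcal{C}_E] + \left(\frac{\partial \tilde K_{n,E}}{\partial \beta}\right)^T [n^{1/2}(\hat\beta - \beta_0)] + o_p(1).$$

The partial derivative ∂K̃n,E/∂β is asymptotically constant. Since $n^{1/2}(\hat\beta - \beta_0)$ has asymptotic mean zero conditional on X, it is asymptotically independent of $n^{1/2}[\tilde K_{n,E}(\beta_0) - \mathcal{C}_E]$. Therefore the asymptotic variance of K̃n,E(β̂) is

$$\mathrm{Var}\{n^{1/2}[\tilde K_{n,E}(\beta_0) - \mathcal{C}_E]\} + \left(\frac{\partial \tilde K_{n,E}}{\partial \beta}\right)^T \mathrm{Var}[\hat\beta] \left(\frac{\partial \tilde K_{n,E}}{\partial \beta}\right).$$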

The individual components of the asymptotic variance can be estimated as follows. In each case, the substitution of β̂ for β0 provides a consistent estimate.

The Var(β̂) is estimated from the inverse of the second derivative of the partial likelihood.

The partial derivative evaluated at β0 is equal to

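$$\frac{\partial \tilde K_{n,E}(\beta)}{\partial \beta} = \frac{\sum_i \sum_j I(\beta_0^T X_{ij} > 0)\, \{-X_{ji}\} \exp(\beta_0^T X_{ji})\, [1 + \exp(\beta_0^T X_{ji})]^{-2}}{\sum_i \sum_j I(\beta_0^T X_{ij} > 0)}.$$

The first term, $n^{1/2}[\tilde K_{n,E}(\beta_0) - \mathcal{C}_E]$, may be approximated by the U-statistic

$$\pi^{-1} n^{-3/2} \sum_i \sum_j I(\beta_0^T X_{ij} > 0) \left\{ [1 + \exp(\beta_0^T X_{ji})]^{-1} - \mathcal{C}_E \right\}$$

where $\pi = \lim_{n \to \infty} n^{-2} \sum_i \sum_j I(\beta_0^T X_{ij} > 0)$.

The asymptotic variance of this U-statistic is

$$\pi^{-2} n^{-3} \sum_i \sum_j \sum_{k \ne j} [u_{ij} + u_{ji}][u_{ik} + u_{ki}],$$

where

$$u_{ij} = I(\beta_0^T X_{ij} > 0) \left\{ [1 + \exp(\beta_0^T X_{ji})]^{-1} - \mathcal{C}_E \right\}.$$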

Combining these results provides the estimated asymptotic variance of the CPE excluding ties.

Asymptotic distribution of $n^{1/2}[C_{n,E}(\hat\beta) - \mathcal{C}_E]$

From Section 3.2, this expression may be written as

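$$\frac{n^{-3/2}\sum_i\sum_j \delta_i I(y_i<y_j)\,I(\hat\beta^T X_i \neq \hat\beta^T X_j)\,\{\hat G(y_i|\hat\beta^T X_i)\hat G(y_i|\hat\beta^T X_j)\}^{-1}\,[I(\hat\beta^T X_i > \hat\beta^T X_j) - \mathcal{C}_E]}{n^{-2}\sum_i\sum_j \delta_i I(y_i<y_j)\,I(\hat\beta^T X_i \neq \hat\beta^T X_j)\,\{\hat G(y_i|\hat\beta^T X_i)\hat G(y_i|\hat\beta^T X_j)\}^{-1}}$$

$$= \psi^{-1} n^{-3/2}\sum_i\sum_j \frac{\delta_i I(y_i<y_j)\,I(\beta_0^T X_i \neq \beta_0^T X_j)\,[I(\beta_0^T X_i > \beta_0^T X_j) - \mathcal{C}_E]}{\hat G(y_i|\hat\beta^T X_i)\,\hat G(y_i|\hat\beta^T X_j)} + o_p(1)$$

where $\psi = \lim_{n\to\infty} n^{-2}\sum_i\sum_j \delta_i I(y_i<y_j)\,I(\hat\beta^T X_i \neq \hat\beta^T X_j)\,\hat G^{-1}(y_i|\hat\beta^T X_i)\,\hat G^{-1}(y_i|\hat\beta^T X_j)$.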
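For orientation, the weighted c-index that underlies this ratio can be sketched in Python. The sketch assumes that risk groups are identified by equal values of the score $\hat\beta^T X$, that $\hat G$ is a Kaplan-Meier estimate of the censoring survival function computed separately within each risk group, and that it is evaluated as a left limit at $y_i$ with events preceding censorings at tied times. These conventions and all names are assumptions of the example, not details confirmed by this appendix.

```python
def censoring_km_left(t, y, delta):
    """Kaplan-Meier estimate of P(C >= t) from one group's (y, delta)."""
    g = 1.0
    censor_times = sorted({yk for yk, dk in zip(y, delta) if dk == 0 and yk < t})
    for u in censor_times:
        censored = sum(1 for yk, dk in zip(y, delta) if yk == u and dk == 0)
        # Events tied with a censoring time are assumed to occur first.
        at_risk = censored + sum(1 for yk in y if yk > u)
        g *= 1.0 - censored / at_risk
    return g

def ipcw_cindex_excluding_ties(y, delta, score):
    """IPCW c-index over pairs with y_i < y_j, delta_i = 1 and unequal scores."""
    n = len(y)
    groups = {}
    for k in range(n):
        groups.setdefault(score[k], []).append(k)

    def g_hat(t, s):
        idx = groups[s]
        return censoring_km_left(t, [y[k] for k in idx], [delta[k] for k in idx])

    num = den = 0.0
    for i in range(n):
        if delta[i] != 1:
            continue
        for j in range(n):
            if y[i] < y[j] and score[i] != score[j]:
                w = 1.0 / (g_hat(y[i], score[i]) * g_hat(y[i], score[j]))
                den += w
                if score[i] > score[j]:      # higher risk paired with shorter survival
                    num += w
    return num / den

# Hypothetical data: two risk groups (scores 0 and 1), two censored subjects.
y = [1, 2, 3, 4, 5, 6]
delta = [1, 0, 1, 1, 0, 1]
score = [1, 0, 1, 0, 1, 0]
c_weighted = ipcw_cindex_excluding_ties(y, delta, score)
c_uncensored = ipcw_cindex_excluding_ties([1, 2, 3, 4], [1, 1, 1, 1], [1, 0, 1, 0])
```

With no censoring every weight equals one and the statistic reduces to the proportion of concordant untied pairs, as in `c_uncensored`.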

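Letting

$$e_{ij} = \delta_i I(y_i<y_j)\,I(\beta_0^T X_i \neq \beta_0^T X_j)\,[I(\beta_0^T X_i > \beta_0^T X_j) - \mathcal{C}_E]$$

and Taylor expanding $\hat\beta$ around $\beta_0$,

$$\begin{aligned}
n^{1/2}[C_{n,E}(\hat\beta) - \mathcal{C}_E] ={}& \psi^{-1} n^{-3/2}\sum_i\sum_j \frac{e_{ij}}{G(y_i|\beta_0^T X_i)\,G(y_i|\beta_0^T X_j)} \\
&+ \psi^{-1} n^{-3/2}\sum_i\sum_j \frac{e_{ij}}{G(y_i|\beta_0^T X_i)\,G(y_i|\beta_0^T X_j)}\left[\frac{G(y_i|\beta_0^T X_i) - \hat G(y_i|\beta_0^T X_i)}{G(y_i|\beta_0^T X_i)} + \frac{G(y_i|\beta_0^T X_j) - \hat G(y_i|\beta_0^T X_j)}{G(y_i|\beta_0^T X_j)}\right] \\
&+ \psi^{-1} n^{-3/2}\sum_i\sum_j \left\{\left[\frac{\partial}{\partial\beta}\,\frac{e_{ij}}{G(y_i|\beta^T X_i)\,G(y_i|\beta^T X_j)}\right]^T\right\}(\hat\beta - \beta_0) + o_p(1)
\end{aligned}$$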

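Apart from the factor $\psi^{-1}$, the second term may be rewritten using the martingale representation theorem as

$$2 n^{-1/2}\sum_{k=1}^{n}\int_{t=0}^{\infty}\frac{q_{X_k}(t)}{\theta_{X_k}(t)}\,dM_{X_k}(t)$$

where

$$q_{X_k}(t) = \lim_{n\to\infty}\sum_i\sum_j \left\{\frac{e_{ij}}{G(y_i|\beta_0^T X_i)\,G(y_i|\beta_0^T X_j)}\right\}\{I(y_j \geq t)[I(X_i = X_k) + I(X_j = X_k)]\}$$

$$\theta_{X_k}(t) = \lim_{n\to\infty}\sum_j I(y_j \geq t,\, X_j = X_k)$$

$$M_{X_k}(t) = I(y_k \leq t,\, \delta_k = 0) - \int_{u=0}^{t} I(y_k \geq u)\,d\Lambda_{X_k}(u)$$

and $\Lambda_{X_k}$ is the cumulative hazard of the censoring random variable belonging to group $X_k$ (Cheng et al. 1995, Fine et al. 1998).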

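It follows that

$$\begin{aligned}
n^{1/2}[C_{n,E}(\hat\beta) - \mathcal{C}_E] ={}& \psi^{-1} n^{-3/2}\sum_i\sum_j \frac{e_{ij}}{G(y_i|\beta_0^T X_i)\,G(y_i|\beta_0^T X_j)} + 2\psi^{-1} n^{-1/2}\sum_{k=1}^{n}\int_{t=0}^{\infty}\frac{q_{X_k}(t)}{\theta_{X_k}(t)}\,dM_{X_k}(t) \\
&+ \psi^{-1}\left\{n^{-2}\sum_i\sum_j \left[\frac{\partial}{\partial\beta}\,\frac{e_{ij}}{G(y_i|\beta^T X_i)\,G(y_i|\beta^T X_j)}\right]^T\right\} n^{1/2}(\hat\beta - \beta_0) + o_p(1)
\end{aligned}$$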

Each component is asymptotically normal with mean zero. Analytic estimation of the asymptotic variance, however, requires density estimation of the censoring random variable. Alternatively, a stratified bootstrap resampling approach is utilized to estimate the asymptotic variance.
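A stratified bootstrap of this kind can be sketched in Python as below. The sketch assumes that the strata are the risk groups and that each stratum is resampled with replacement at its original size; the appendix does not spell out the stratification variable, so this is an assumption, as are the function name, the `stat` callback signature, and the default number of replicates. In a full analysis the callback would typically refit the model and recompute the weighted c-index on each resample.

```python
import random
import statistics

def stratified_bootstrap_variance(stat, y, delta, group, n_boot=200, seed=0):
    """Bootstrap variance of stat(y, delta, group), resampling within groups."""
    rng = random.Random(seed)
    strata = {}
    for k, g in enumerate(group):
        strata.setdefault(g, []).append(k)
    reps = []
    for _ in range(n_boot):
        idx = []
        for members in strata.values():
            # Draw len(members) indices with replacement from this stratum only.
            idx.extend(rng.choices(members, k=len(members)))
        reps.append(stat([y[k] for k in idx],
                         [delta[k] for k in idx],
                         [group[k] for k in idx]))
    return statistics.variance(reps)

# Hypothetical data and a placeholder statistic (mean observed time).
y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
delta = [1, 0, 1, 1, 0, 1]
group = [0, 0, 0, 1, 1, 1]
mean_time = lambda yy, dd, gg: sum(yy) / len(yy)
var_mean = stratified_bootstrap_variance(mean_time, y, delta, group)
```

Because sampling is done within strata, every replicate keeps the original group sizes, so a statistic that depends only on group counts has zero bootstrap variance.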

References

  • Andersen PK, Gill RD. Cox’s regression model for counting processes: A large sample study. Annals of Statistics. 1982;10:1100–1120.
  • Cheng SC, Wei LJ, Ying Z. Analysis of transformation models with censored data. Biometrika. 1995;82:835–845.
  • Cox DR. Regression models and life tables (with Discussion). Journal of the Royal Statistical Society, Series B. 1972;34:187–220.
  • Cox DR. Partial likelihood. Biometrika. 1975;62:269–276.
  • Danila DC, Fleisher M, Scher HI. Circulating tumor cells as biomarkers in prostate cancer. Clinical Cancer Research. 2011;17:3903–3912.
  • Danila DC, Anand A, Schultz N, Heller G, Wan M, Sung CC, Dai C, Khanin R, Fleisher M, Lilja H, Scher HI. Analytic and clinical validation of a prostate cancer-enhanced messenger RNA detection assay in whole blood as a prognostic biomarker for survival. European Urology. 2014;65:1191–1197.
  • Fine JP, Ying Z, Wei LJ. On the linear transformation model for censored data. Biometrika. 1998;85:980–986.
  • Gerds TA. Prediction Error Curves for risk prediction models in survival analysis. R package version 2.4.2. 2014.
  • Gerds TA, Kattan MW, Schumacher M, Yu C. Estimating a time-dependent concordance index for survival prediction models with covariate dependent censoring. Statistics in Medicine. 2013;32:2173–2184.
  • Gönen M, Heller G. Concordance probability and discriminatory power in proportional hazards regression. Biometrika. 2005;92:965–970.
  • Grambsch PM, Therneau TM. Proportional Hazards Tests and Diagnostics Based on Weighted Residuals. Biometrika. 1994;81:515–526.
  • Harrell FE, Califf RM, Pryor DB, Lee KL, Rosati RA. Evaluating the yield of medical tests. Journal of the American Medical Association. 1982;247:2543–2546.
  • Kalbfleisch JD. Likelihood methods and nonparametric tests. Journal of the American Statistical Association. 1978;73:167–170.
  • Lin DY, Wei LJ, Ying Z. Checking the Cox Model with Cumulative Sums of Martingale-Based Residuals. Biometrika. 1993;80:557–572.
  • Mo Q, Gönen M, Heller G. CPE: Concordance Probability Estimates in Survival Analysis. R package version 1.4.3. 2012.
  • R Development Core Team. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2011. URL http://www.R-project.org/
  • Scher HI, de Bono JS, Fleisher M, Pienta KJ, Raghavan D, Heller G. Circulating tumour cells as prognostic markers in progressive, castration-resistant prostate cancer: a reanalysis of IMMC38 trial data. Lancet Oncology. 2009;10:233–239.
  • Sprent P. Applied Nonparametric Statistical Methods, Second edition. London: Chapman and Hall; 1989.
  • Uno H, Cai T, Pencina MJ, D’Agostino RB, Wei LJ. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Statistics in Medicine. 2011;30:1105–1117.
  • Yan G, Greene T. Investigating the effects of ties on measures of concordance. Statistics in Medicine. 2008;27:4190–4206.