Pulmonary Arterial Hypertension: Epidemiology and Registries
Registries of pulmonary arterial hypertension (PAH) are important means by which to characterize the presentation and outcome of patients and to provide a basis for predicting the course of the disease. This article summarizes the published conclusions of the World Symposium of Pulmonary Hypertension task force that addressed registries and epidemiology of PAH.
Collection of patient information into registry databases enables characterization of pulmonary arterial hypertension (PAH) in terms of demographics, clinical presentations, and outcomes. Because this type of information provides the foundation for recognizing PAH and assessing the utilization of treatment strategies, the Fifth World Symposium for Pulmonary Hypertension included a task force to summarize what has been learned from PAH registries, to outline appropriate interpretation of registry data, and to recommend how registries ought to be pursued for optimal acquisition of useful knowledge in the future. This article will summarize some of the major conclusions of that effort that have been published previously.1
The common aims of all PAH registries are to describe patients with PAH, to determine the impact of the disease (outcome), to elucidate how the outcome is determined by patient characteristics (risk), and to document how outcome may be broadly altered by therapy.
The task force described at the outset what sort of information could be included in registries and what factors must be considered in meaningfully analyzing that data. All PAH registries considered by the task force sought to be as comprehensive as possible in assimilating variables while simultaneously recognizing the limitations imposed by available resources required to collect the data. Thus, a registry must: (i) wisely confine its methodology to addressing carefully constructed and clearly articulated questions, (ii) understand and transparently describe limitations, and (iii) identify potential biases imposed by the methodology. All major registries have been observational and descriptive. Therefore, conclusions emerge about how PAH is identified and handled in “the real world” rather than within a framework of “ideal” management as advised by consensus guidelines. Moreover, registries have varied with respect to the exact selection criteria, which in turn may predetermine the nature of some conclusions. Some of the specific ways in which registries have differed from one another include the clinical and hemodynamic definitions used to identify the types of patients enrolled in the studies, the use of newly diagnosed and/or previously diagnosed patients, the specific data collected, and the frequency and duration of follow-up.
The determination of patient eligibility depends to a large extent on the goal of the particular registry. Registries that intend to evaluate a previously well-specified population are carefully designed to include only patients who meet the accepted definition of disease. Thus, these PAH registries enroll patients in whom other types of pulmonary hypertension (PH) have been conscientiously excluded by clinical and hemodynamic criteria. The strength of these types of registries is that they describe the behavior of a well-circumscribed disease entity, which can be compared to similar populations from other eras or geographic locales. An example of this type of registry is the French registry.2 Other registries may be more interested in identifying the characteristics of a more loosely circumscribed population to uncover the limits that define post-hoc a cohesive group that could be considered to have PAH, without recourse to a precise, prespecified consensus definition. This approach is exemplified by the REVEAL registry, in which a pulmonary arterial wedge pressure up to 18 mm Hg was permitted and the clinical diagnosis of PAH was based only on the opinion of the treating physician.3,4 Inclusion of patients with “nonconforming” high wedge pressures (pulmonary artery wedge pressure ranging from 16 to 18 mm Hg) allows for these patients to be excluded or included in individual analyses so that similarities and differences between groups may be evaluated.5
Likewise, some registries focus on describing the course of disease exclusively from the time of its first documentation by right heart catheterization (so-called incident patients) to unambiguously understand the “full” natural history of PAH from the time of diagnosis. Others emphasize trying to understand the course of disease from any time point in its trajectory, and therefore include both incident and previously diagnosed (“prevalent”) patients to compare these 2 groups and attempt to identify predictors of survival (risk factors) independent of time of diagnosis. Survival studies emerging from the UK6 and REVEAL registries,7 respectively, are representative of these 2 approaches. Of course, in a registry that enrolls both types of patients, analyses can be performed on either subpopulation or on both together, depending on the specific question being asked. Some investigators favor restricting survival analyses to incident patients,6 while others point out that risk stratification or a delayed entry model accounting for left truncation is preferable to excluding prevalent patients from PAH registries.8 A population is considered left truncated if patients may have been excluded from a cohort due to events that occurred prior to the study. Patients who die prior to study initiation are excluded, while patients who survive to study initiation are included from the point in their survival at which they were enrolled. An approach to analyzing survival from diagnosis, utilizing both newly diagnosed and previously diagnosed patients, was used in the US-REVEAL protocol, as well as in the French registry. Survival from time of diagnosis, utilizing data from both incident and prevalent patients, is comparable to survival estimates that are restricted to incident patients.2,3,9
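The delayed entry idea described above can be illustrated with a minimal sketch. It is not the method of any specific registry: it is a generic Kaplan-Meier estimator in which a patient joins the risk set only after enrollment. The function name `km_delayed_entry`, the tuple layout, and all patient values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from typing import List, Tuple

# Hypothetical patient record: (entry, exit, died)
#   entry = years from diagnosis to registry enrollment (0.0 for incident patients)
#   exit  = years from diagnosis to death or last follow-up (must exceed entry)
#   died  = True if exit marks a death, False if the patient was censored
Patient = Tuple[float, float, bool]


def km_delayed_entry(patients: List[Patient]) -> List[Tuple[float, float]]:
    """Kaplan-Meier survival from diagnosis, allowing for delayed entry.

    At each death time t, a patient is in the risk set only if
    entry < t <= exit, so prevalent patients contribute no time
    before they were actually enrolled.
    """
    death_times = sorted({exit_ for _, exit_, died in patients if died})
    survival = 1.0
    curve = []
    for t in death_times:
        at_risk = sum(1 for entry, exit_, _ in patients if entry < t <= exit_)
        deaths = sum(1 for _, exit_, died in patients if died and exit_ == t)
        survival *= 1.0 - deaths / at_risk
        curve.append((t, survival))
    return curve


# Made-up cohort: three incident and three prevalent patients.
cohort: List[Patient] = [
    (0.0, 1.0, True),   # incident, died 1 year after diagnosis
    (0.0, 2.0, True),   # incident, died at 2 years
    (0.0, 4.0, False),  # incident, censored at 4 years
    (1.5, 3.0, True),   # prevalent, enrolled at 1.5 years, died at 3 years
    (2.5, 5.0, False),  # prevalent, enrolled at 2.5 years, censored
    (2.5, 6.0, False),  # prevalent, enrolled at 2.5 years, censored
]

# Delayed-entry estimate: prevalent patients join the risk set at enrollment.
adjusted = km_delayed_entry(cohort)

# Naive estimate: pretend every patient was observed from diagnosis (entry = 0).
naive = km_delayed_entry([(0.0, exit_, died) for _, exit_, died in cohort])

for (t, s_adj), (_, s_naive) in zip(adjusted, naive):
    print(f"t={t:.0f} y  delayed entry={s_adj:.3f}  naive={s_naive:.3f}")
```

With these invented values, the delayed-entry estimate of survival at 3 years is 1/3, whereas the naive analysis gives 1/2. The naive figure is inflated because prevalent patients are credited with pre-enrollment time during which, by design, they could not have died.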
The key to interpreting registries using different study populations is clearly understanding the broad population to whom the results can be generalized. For example, using an outcome measure (ie, survival) derived from a prevalent population as a basis for comparing outcome in newly diagnosed patients is inappropriate, whereas generalizing it to the population of patients with previously diagnosed disease is legitimate. Additionally, survival estimates from one incident cohort may not be generalizable to another incident cohort if diagnosis method or time from symptom onset to diagnosis differs between cohorts. The task force recognized that it is not appropriate to define an at-risk period that includes time during which patients were not on study. Doing so leads to immortal time bias because patients are guaranteed to have survived the prestudy period.
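A small worked example may help make immortal time bias concrete. The sketch below computes a crude death rate per 100 person-years in a purely hypothetical prevalent cohort, first counting only on-study time and then (incorrectly) also counting the prestudy period. The function name and all numbers are illustrative assumptions and are not drawn from any registry.

```python
from typing import List, Tuple

# Hypothetical prevalent patient: (enrolled, exit, died), in years since diagnosis.
Patient = Tuple[float, float, bool]


def deaths_per_100_person_years(patients: List[Patient],
                                count_prestudy_time: bool) -> float:
    """Crude mortality rate; optionally (and wrongly) counts prestudy time."""
    deaths = sum(1 for _, _, died in patients if died)
    person_years = sum(
        exit_ if count_prestudy_time else exit_ - enrolled
        for enrolled, exit_, _ in patients
    )
    return 100.0 * deaths / person_years


# Made-up cohort of four previously diagnosed patients.
prevalent: List[Patient] = [
    (3.0, 5.0, True),   # enrolled 3 years after diagnosis, died at 5 years
    (4.0, 5.0, False),  # enrolled at 4 years, censored at 5 years
    (2.0, 6.0, True),   # enrolled at 2 years, died at 6 years
    (1.0, 4.0, False),  # enrolled at 1 year, censored at 4 years
]

# At-risk period restricted to time on study: 2 deaths over 10 person-years.
on_study_rate = deaths_per_100_person_years(prevalent, count_prestudy_time=False)

# At-risk period wrongly extended back to diagnosis: 2 deaths over 20 person-years.
biased_rate = deaths_per_100_person_years(prevalent, count_prestudy_time=True)

print(on_study_rate, biased_rate)
```

In this invented cohort, including the prestudy years halves the apparent death rate (10 versus 20 deaths per 100 person-years), because no deaths could have been recorded during the time before enrollment.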
The use of registry data for comparative effectiveness is difficult and controversial;10,11 since aggressive treatments will generally be applied to the sickest patients, the worst outcomes will occur frequently among these patients, thereby confounding assessment of efficacy. A variety of methods exist to adjust for confounding. Matching, multivariable risk-adjusted models of outcomes, and propensity scores can be effective if all confounding variables have been identified and measured. In PAH, it is plausible that most (but not all) important potential confounders have been successfully identified.
Finally, a source of potential bias is the means of funding for a registry. Registries are expensive. Costs include funding for site coordinators, project management, in-person meetings, data management, and statistical analysis. When studies receive industry sponsorship, the relationship of the sponsor and advisors must be clearly delineated, and it is similarly important for data ownership and data access rules to be specified contractually. Disclosing conflict of interest is critical, but there are many important scientific objectives where the interests of industry, patients, and the scientific community are fully aligned.
The task force summarized characteristics of 11 major registries in which 6 countries were represented. All registries enrolled patients with idiopathic and heritable PAH, 7 included other PAH patients, and 1 also included chronic thromboembolic PH (CTEPH, PH Group 4) (Table 1). The number of patients in each registry ranged from 72 to 3515, and participating centers ranged from 1 to 55. Table 2 provides the basic presenting characteristics of patients enrolled in each registry.


In general, survival in registry populations has improved as treatment options increase (Table 3). Data from the US-REVEAL registry suggest that current median survival is 7 years for patients with PAH9 compared to 2.8 years for patients with primary PH (PPH, now referred to as idiopathic/heritable PAH [IPAH/HPAH]) in the US-National Institutes of Health (NIH) registry.12

Considerable changes in the PAH phenotype have been observed over time. These include substantial changes in age, gender, comorbidities, and survival (Tables 2 and 4). While the mean age of patients with IPAH in the first registry created in 1981 (US-NIH registry) was 36 ± 15 years,13 PAH is now more frequently diagnosed in elderly patients, resulting in a mean age at diagnosis between 50 ± 14 and 65 ± 15 years in current registries (Table 2). Furthermore, the female predominance is quite variable among registries and may not be present in elderly patients.14 A potential explanation for the change in phenotype may be the increased awareness of PAH in the modern management era as effective therapies become available. For example, since PPH was considered a rare disease that affected young women at the time of the initial US-NIH registry, it is likely that older patients and men were often not considered for the diagnosis at that time. Other factors contributing to biased enrollment include lack of awareness of this registry among nonexperts in the community and unavailability of widespread screening tools such as Doppler echocardiography. Since PAH may be detected more frequently in elderly patients, one should also be cautious about possible misclassifications between PAH and non-PAH PH (particularly postcapillary PH due to heart failure with preserved ejection fraction, HFpEF), which may occur particularly in elderly patients as a consequence of uncertainties in the current definitions and difficulties in the measurement of the pulmonary arterial wedge pressure.

Registries from China15 and other developing countries demonstrate similar demographics and characteristics to the early studies of the US-NIH registry, suggesting that some differences in phenotype might be related to the health care environment rather than to different expressions of the disease. Nonetheless, specific sources of systematic bias in PAH registries include: (i) changes in the classification of PH, which have led to inclusion of a varying spectrum of patients in modern registries; (ii) changing interest in PH by academic physicians, producing more development and dissemination of information; (iii) increased awareness of PH by clinicians due to availability and marketing of efficacious therapy, with associated education from pharmaceutical representatives;16 (iv) easier access to medical information by patients, who may then influence their own referral to specialized care; and (v) widespread use of noninvasive techniques (Doppler echocardiography), which allow for disease detection even in the absence of prior suspicion, thereby leading to a perception of increased disease prevalence.17 Thus, it appears that the changing phenotype of patients with PH in modern registries is potentially influenced by factors that are independent of the disease itself.
An important asset of registries is the capability of identifying patient characteristics that predict outcome. The US-NIH registry was the first to develop a prognostic equation.12 Use of this equation in the current treatment era has limitations, as it provides information only on the natural history of untreated PPH rather than on Group 1 PH (PAH). More recent registries have identified predictors of outcome (Table 4) that show substantial homology between studies, including disease etiology, patient gender, and factors reflective of right heart function. Four registries (US-REVEAL, US-PHC, French, and UK) employed multivariable analyses to develop prognostic equations (US-REVEAL, US-PHC, French) or calculators (US-REVEAL, UK). Despite the US-REVEAL equation's derivation in a combined incident and prevalent cohort at the time of enrollment, the equation demonstrated equal prognostic power when tested at time of diagnosis and was validated in an entirely incident population18 and in distinct PH populations at other institutions.19–21 The UK prognostic score was also validated in a second set of incident patients taken retrospectively from the UK registry only (derivation was from the Scottish registry only). Both the French and US-REVEAL equations have shown strong predictive power when cross-validated in matched patients from the US-REVEAL and French registries, respectively.22,23 It appears that concerns about the relative contribution to mortality risk of “newly” and “previously” diagnosed patients are minimized and overshadowed by the overall contribution of individual risk profiles in each of these populations. In other words, a newly diagnosed patient is not “independently” at risk of dying by the mere fact of being “newly diagnosed,” but rather because they have a larger proportion of “at-risk” factors than those previously diagnosed.8,24
The task force discussed how future registry databases could be expanded to better understand broad PH populations. Although patients belonging to Group 2 (PH due to left heart diseases) and Group 3 (PH due to chronic lung diseases and/or hypoxia) represent an increasing part of clinical practice, there is little information about the demographics and clinical course of this segment of the PH population, suggesting that registry database methodology may be useful for these groups. However, the structure of registries incorporating “non-PAH” PH is problematic. A single registry could include all patients with any type of PH from which defined subgroups (ie, PH associated with interstitial lung disease, chronic obstructive pulmonary disease [COPD], left ventricular systolic dysfunction, or left ventricular HFpEF) could be extracted for analyses. An advantage of this model is that all patients would be enrolled from the same sites, which would permit direct comparisons between cohorts with minimal adjustment for differences in enrollment patterns, location, or follow-up. Disadvantages are that many patients would need to be enrolled to provide sufficient cohort size for characterization of all groups and a single case report form (CRF) may not be appropriate for all cohorts. The ASPIRE registry has attempted to assess the spectrum of PH across the 5 PH groups encountered in a single specialist referral center, allowing specific descriptions of PH patients with associated diseases such as COPD and other comorbidities.25,26 An alternative model would be to develop separate registries around specific disease entities of interest, using focused CRFs at less anticipated cost. This has been successfully proposed for CTEPH.27
The task force recognized that unless all patients who have PH within a population are enrolled in a registry, estimates of incidence or prevalence of disease in a prespecified population are not possible. To understand the chances of PH developing in a population requires that the population at risk be observed systematically over time in order to detect the occurrence of PH. Examples of populations of interest in whom the risk of developing PH makes systematic data collection likely to yield clinically useful information include patients with known BMPR2 mutations, with 2 or more family members with PH, with systemic sclerosis, with cirrhosis and portal hypertension, with past or present methamphetamine use, with mean pulmonary artery pressure of 20–25 mm Hg, or with PH observed only during exercise.
Since not all factors that may be determinants of outcome can be anticipated, registries must be designed to accommodate and explore future advances in knowledge as they develop. This requires CRFs to be fluid enough to allow changes in coding variables over time, but more importantly mandates that blood and tissue of participants be collected and stored so that biomarker and genetic correlates to clinical phenotypic expression can be examined both in the present and the future.
The profile of PH varies throughout the world, and comparison between environments, population demographics, and health care delivery systems may permit the development of hypotheses about how PH is best diagnosed and managed under different conditions. Accordingly, systematic acquisition of clinical data in registries worldwide represents a desirable objective.28 Collaborative efforts among registries have been useful in creating hypotheses about these observations, but have been hampered to an extent by differences in study design, patient ascertainment, entry criteria, and follow-up. More uniformly designed and orchestrated registry data acquisition and analysis will likely yield more coherent observations and conclusions.
The overriding question is not so much whether a global approach to PH registry data is desirable, but how it could be achieved. Several models can be considered: (i) a single global registry with a unified funding source under the direction of a single steering committee; (ii) a variety of national or regional registries, each with distinct funding sources and separate steering committees, but using a common (or overlapping) CRF and comparable enrollment principles; and (iii) independently developed and operated databases using separate CRFs, which can be compared using adjustments for differences to the extent possible during post-hoc collaborations. Of these, model (ii) seems to be the best compromise between collaboration and feasibility.
Contributor Notes
Disclosure: No conflicts disclosed.