Int Neurourol J > Volume 28(Suppl 2); 2024 > Article
Shin, Ko, Park, Han, Yeom, and Lee: Machine Learning Models for the Noninvasive Diagnosis of Bladder Outlet Obstruction and Detrusor Underactivity in Men With Lower Urinary Tract Symptoms

ABSTRACT

Purpose

This study aimed to develop and evaluate machine learning models, specifically CatBoost and extreme gradient boosting (XGBoost), for diagnosing the causes of lower urinary tract symptoms (LUTS) in male patients. The objective was to differentiate between bladder outlet obstruction (BOO) and detrusor underactivity (DUA) using a comprehensive dataset that includes patient-reported outcomes, uroflowmetry measurements, and ultrasound-derived features.

Methods

The dataset used in this study was collected from male patients aged 40 and older who presented with LUTS and sought treatment at the urology department of Samsung Medical Center. We developed and trained CatBoost and XGBoost models using this dataset. These models incorporated features like prostate size, voiding parameters, and responses from questionnaires. Their performance was assessed using standard metrics such as accuracy, precision, recall, F1-score, and area under the receiver operating characteristic curve (AUROC).

Results

The CatBoost models displayed greater sensitivity, making them suitable for initial screening, where identifying true positive cases is the priority. Conversely, the XGBoost models showed higher specificity and precision, making them more suitable for confirming diagnoses and reducing false positives. In terms of overall performance, XGBoost surpassed CatBoost, achieving AUROCs of 0.826 for BOO and 0.819 for DUA.

Conclusions

Integrating these machine learning models into the diagnostic workflow for LUTS can significantly enhance clinical decision-making by offering noninvasive, cost-effective, and patient-friendly diagnostic alternatives. The combined application of CatBoost and XGBoost models has the potential to improve diagnostic accuracy and provide customized treatment plans for patients, ultimately leading to better clinical outcomes.

INTRODUCTION

In middle-aged and older men, conditions such as benign prostatic hyperplasia (BPH), age-related diseases, neurological disorders, and hormonal changes frequently lead to lower urinary tract symptoms (LUTS), which significantly impact quality of life [1,2]. Key voiding symptoms, including a slow stream, hesitancy, and straining, primarily result from 2 causes: bladder outlet obstruction (BOO) and detrusor underactivity (DUA) [3]. Accurately distinguishing between these conditions is crucial for effective treatment, as BOO and DUA necessitate different management strategies. BOO is generally treated with medications like alpha-blockers or 5-alpha reductase inhibitors, and in more severe cases, surgical options such as transurethral resection of the prostate may be considered [4]. On the other hand, treatments for DUA might include bladder-emptying techniques such as intermittent catheterization or pharmacological methods to stimulate bladder contractions [5,6].
Despite these therapeutic distinctions, LUTS alone cannot differentiate between BOO and DUA, necessitating the use of invasive urodynamic studies (UDS) to measure real-time urine flow and detrusor pressure [7]. However, UDS poses discomfort and risks to patients, underscoring the need for noninvasive diagnostic alternatives [8]. This study introduces an artificial intelligence-based approach that utilizes the International Prostate Symptom Score (IPSS), uroflowmetry data, and transrectal ultrasound (TRUS) measurements to accurately differentiate between BOO and DUA. By comparing CatBoost and XGBoost models, our aim is to develop a reliable, noninvasive diagnostic tool that can improve clinical decision-making and patient outcomes without relying on invasive urodynamic tests.

MATERIALS AND METHODS

Participants

The study participants were male patients aged 40 years and older who presented with LUTS and visited the urology department between December 2006 and December 2020. To qualify for inclusion, participants were required to have undergone various examinations, including UDS, IPSS assessment, TRUS, and uroflowmetry with measurement of maximum flow rate and residual urine volume. Furthermore, only patients with a maximum flow rate of less than 15 mL/sec were included. BOO was defined as a bladder outlet obstruction index (BOOI) of 40 or greater, and DUA was defined as a bladder contractility index (BCI) of less than 100 [9].
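The labeling rule above can be sketched in a few lines of Python. The thresholds (40 and 100) are those stated in the text; the index formulas are not given in this article and are assumed here to be the commonly used definitions, BOOI = PdetQmax - 2 x Qmax and BCI = PdetQmax + 5 x Qmax, with detrusor pressure at maximum flow in cmH2O and maximum flow rate in mL/sec. The function names are hypothetical.

```python
def booi(pdet_qmax: float, qmax: float) -> float:
    """Bladder outlet obstruction index (assumed standard formula)."""
    return pdet_qmax - 2.0 * qmax


def bci(pdet_qmax: float, qmax: float) -> float:
    """Bladder contractility index (assumed standard formula)."""
    return pdet_qmax + 5.0 * qmax


def label_patient(pdet_qmax: float, qmax: float) -> dict:
    """Apply the study's thresholds: BOO if BOOI >= 40, DUA if BCI < 100."""
    return {
        "BOO": booi(pdet_qmax, qmax) >= 40,
        "DUA": bci(pdet_qmax, qmax) < 100,
    }


# Illustrative (made-up) pressure-flow values, not patient data.
print(label_patient(pdet_qmax=50, qmax=5))   # both conditions present
print(label_patient(pdet_qmax=30, qmax=6))   # DUA only
```

Because the two labels are computed independently, a patient can be positive for both, one, or neither condition, which is why each condition is treated as its own binary classification task.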

Collected Data

The dataset included a range of parameters, and the measurements are summarized in Table 1.

Feature Engineering

In addition to the previously mentioned data, further features were extracted from the IPSS questionnaire to enhance the understanding of patient symptoms. These additional features encompass the sum of voiding symptom scores from questions 1, 3, 5, and 6; the sum of storage symptom scores from questions 2, 4, and 7; the total sum of all symptom scores from questions 1 through 7; and the quality-of-life score derived from question 8.
From the TRUS imaging data, numerical features, including prostate volume, width, length, and height, as well as those of the transition zone, were extracted. These features were utilized during model training and played a critical role in evaluating the physical characteristics of the prostate associated with conditions such as BOO and DUA.
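The IPSS-derived features described above amount to simple sums over fixed question groups. A minimal sketch follows; the question groupings come from the text, while the function name, the dictionary keys, and the input format (a mapping from question number to score) are hypothetical choices made for illustration.

```python
from typing import Dict

VOIDING_ITEMS = (1, 3, 5, 6)   # voiding symptom questions
STORAGE_ITEMS = (2, 4, 7)      # storage symptom questions


def ipss_features(answers: Dict[int, int]) -> Dict[str, int]:
    """Derive IPSS subscores from answers keyed by question number (1-8)."""
    for q in range(1, 8):
        if not 0 <= answers[q] <= 5:
            raise ValueError(f"IPSS question {q} must be scored 0-5")
    if not 0 <= answers[8] <= 6:
        raise ValueError("IPSS quality-of-life question must be scored 0-6")
    return {
        "ipss_voiding_sum": sum(answers[q] for q in VOIDING_ITEMS),
        "ipss_storage_sum": sum(answers[q] for q in STORAGE_ITEMS),
        "ipss_total": sum(answers[q] for q in range(1, 8)),
        "ipss_qol": answers[8],
    }


# Illustrative (made-up) questionnaire responses.
example = {1: 3, 2: 2, 3: 4, 4: 1, 5: 5, 6: 2, 7: 3, 8: 4}
print(ipss_features(example))
```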

Model Development

In this study, 4 models were developed, comprising 2 CatBoost models and 2 XGBoost models. Each model was specifically designed to diagnose either BOO or DUA, conditions traditionally identified through invasive UDS. The CatBoost models utilized the CatBoostClassifier from the CatBoost library in Python. The categorical features incorporated into these models included variables such as cerebrovascular accident, diabetes, dementia, hypertension, IPSS, International Continence Society Male Short-Form, age group, and history of radical pelvic surgery.
Similarly, the XGBoost models employed the XGBClassifier from the XGBoost library in Python. To ensure a fair comparison between the 2 algorithms, the same set of categorical features used in the CatBoost models was applied to the XGBoost models. This method facilitated a consistent evaluation of the models’ performance in diagnosing BOO and DUA.

Model Training and Evaluation

The models were trained on the dataset, which included appropriate preprocessing steps to handle categorical data, missing values, and normalization as needed. Each model’s effectiveness in diagnosing BOO and DUA was evaluated using standard performance metrics, including accuracy, precision, recall, F1-score, and area under the receiver operating characteristic curve (AUROC).
Our approach integrated patient-reported outcomes, objective flow measurements, and precise imaging-derived diagnostics to compare the performance of CatBoost and XGBoost models. This comparison aimed to determine which model offered greater accuracy and reliability in distinguishing between BOO and DUA, potentially reducing the need for invasive urodynamic tests.
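For reference, the threshold-based metrics listed above can all be derived from the 4 cells of a confusion matrix, and AUROC can be computed as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below uses only the Python standard library; the function names and the toy counts and scores are hypothetical and are not taken from the study data.

```python
from typing import Dict, Sequence


def confusion_metrics(tp: int, fn: int, fp: int, tn: int) -> Dict[str, float]:
    """Metrics from confusion-matrix counts (assumes nonzero denominators)."""
    sensitivity = tp / (tp + fn)          # recall, true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    precision = tp / (tp + fp)            # positive predictive value
    npv = tn / (tn + fn)                  # negative predictive value
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "npv": npv,
        "f1": f1,
    }


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC via pairwise comparison of positive and negative scores."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    # A tie between a positive and a negative score counts as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy values for illustration only.
print(confusion_metrics(tp=40, fn=10, fp=20, tn=30))
print(auroc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1]))
```

Note that sensitivity and recall are the same quantity, as are precision and PPV, so a table that lists both pairs should report identical values within each pair.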

RESULTS

In the study, a total of 4,817 patients were enrolled. Among them, 676 patients were diagnosed with both BOO and DUA, 1,058 with BOO only, and 2,335 with DUA only. The remaining 748 patients did not have a diagnosis of either BOO or DUA.
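The cohort breakdown above implies the class balance of each binary task, since patients with both conditions count as positive for BOO and for DUA. The short calculation below uses only the counts reported in this paragraph; the variable names are illustrative.

```python
# Patient counts as reported in the Results section.
counts = {"both": 676, "boo_only": 1058, "dua_only": 2335, "neither": 748}

total = sum(counts.values())
boo_positive = counts["both"] + counts["boo_only"]
dua_positive = counts["both"] + counts["dua_only"]

print(f"total patients: {total}")
print(f"BOO positive: {boo_positive} ({boo_positive / total:.3f})")
print(f"DUA positive: {dua_positive} ({dua_positive / total:.3f})")
```

The 4 groups sum to the reported 4,817 patients, with roughly 36% positive for BOO and 63% positive for DUA, so the 2 tasks have opposite class imbalance.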

BOO Diagnosis

In Tables 2 and 3, we present the results of BOO diagnosis using CatBoost and XGBoost. XGBoost demonstrated superior overall performance compared to CatBoost, with an AUROC of 0.826 versus 0.809 for CatBoost. In terms of accuracy, XGBoost achieved a higher value of 0.755, while CatBoost reached 0.730. For sensitivity, CatBoost slightly outperformed XGBoost, scoring 0.767 versus 0.756, indicating a better ability to identify true positive cases. Specificity was higher for XGBoost at 0.755, compared to 0.709 for CatBoost (Table 2), indicating that XGBoost was more effective in identifying true negative cases. Precision was higher for XGBoost at 0.648, compared to 0.610 for CatBoost, suggesting that XGBoost produced fewer false positives. Lastly, XGBoost recorded a marginally better F1-score of 0.697, with CatBoost close behind at 0.680.
In Fig. 1, the 2 bar plots illustrate the key features utilized by the CatBoost and XGBoost models to predict BOO. The first plot reveals that the CatBoost model considers the maximum flow rate measured during uroflowmetry to be the most crucial feature, along with significant consideration given to PSA and age. In contrast, the second plot highlights that the XGBoost model prioritizes the height of the transition zone as the most important feature, in addition to other anatomical measurements of the prostate and transition zone. The XGBoost model also underscores the significance of the maximum flow rate from uroflowmetry measurements. Overall, while the CatBoost model focuses on demographic and clinical measurements, the XGBoost model leans more toward detailed anatomical characteristics. Despite their different emphases, both models integrate these data types to predict BOO.
Because the 2 models emphasized different features, we developed an ensemble model that combines CatBoost and XGBoost to leverage their complementary strengths. This model was constructed by assigning different weights to the predictions from CatBoost and XGBoost, followed by calculating the weighted average of their outputs. Specifically, we assigned weights of 0.6 to CatBoost and 0.4 to XGBoost, based on their individual performance and contributions to the final prediction. This strategy was designed to capitalize on the unique contributions of each model and was expected to yield a more robust diagnostic tool with higher overall prediction accuracy for BOO. However, experiments that involved varying the weights showed that the ensemble model did not significantly enhance performance compared to the better-performing XGBoost model alone. As indicated in Table 4, the performance metrics of the ensemble model are very close to those of XGBoost. This lack of significant improvement can be attributed to the similarity in the prediction values of both models. Despite the different features emphasized by each model, their predictions are sufficiently similar, resulting in minimal gains from the ensemble approach.
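The weighted-average ensemble described above can be sketched as follows. The weights (0.6 for CatBoost, 0.4 for XGBoost) are those stated in the text; the function name, the use of predicted probabilities as inputs, and the 0.5 decision threshold are assumptions made for illustration, since the text does not specify them.

```python
from typing import List, Sequence


def weighted_ensemble(
    p_catboost: Sequence[float],
    p_xgboost: Sequence[float],
    w_catboost: float = 0.6,
    w_xgboost: float = 0.4,
) -> List[float]:
    """Weighted average of two models' predicted probabilities."""
    if len(p_catboost) != len(p_xgboost):
        raise ValueError("both models must score the same patients")
    if abs(w_catboost + w_xgboost - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return [w_catboost * a + w_xgboost * b
            for a, b in zip(p_catboost, p_xgboost)]


# Toy probabilities for two hypothetical patients.
blended = weighted_ensemble([0.80, 0.30], [0.70, 0.50])
predicted = [int(p >= 0.5) for p in blended]  # 0.5 threshold is an assumption
print(blended, predicted)
```

When the 2 models produce similar probabilities for most patients, the blended score stays close to either input, which is consistent with the small gain reported for the ensemble.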

DUA Diagnosis

Tables 2 and 3 show the DUA diagnosis results using both models. XGBoost demonstrated superior performance in terms of AUROC, scoring 0.819 versus 0.803 for CatBoost. Both models achieved similar accuracy, with CatBoost at 0.739 and XGBoost at 0.734. However, CatBoost exhibited a higher sensitivity of 0.807 compared to XGBoost’s 0.754, indicating that CatBoost was more effective at identifying true positive cases. Conversely, XGBoost outperformed CatBoost in specificity, scoring 0.701 compared to 0.621, which suggests that XGBoost was more adept at identifying true negative cases. In terms of precision, XGBoost again showed superior performance with a score of 0.813, versus CatBoost’s 0.786. Finally, XGBoost achieved a higher F1-score of 0.782, while CatBoost recorded 0.712.
Feature importance plots for DUA prediction are illustrated in Fig. 2. The CatBoost model for DUA identifies uroflowmetry voiding time as the most crucial feature, closely followed by maximal flow rate and chart mean bladder capacity. Additionally, this model emphasizes the importance of anatomical features such as prostate height and transition zone height. In contrast, the XGBoost model for DUA assigns the greatest importance to transition zone height, with other significant features including uroflowmetry maximal flow rate and ICS_Q_I4. This model also considers prostate height and radical pelvic surgery as important factors. Both models underscore the significance of transition zone height, uroflowmetry maximal flow rate, and prostate height, indicating these features are essential for predicting DUA. However, there are notable differences in their focus: CatBoost prioritizes uroflowmetry voiding time and chart mean bladder capacity, whereas XGBoost places more emphasis on specific questionnaire items (ICS_Q_I4 and ICS_Q_V_Sum) and radical pelvic surgery.
Similar to the approach used for BOO, an ensemble model was created by combining CatBoost and XGBoost. However, as indicated in Table 4, the performance of the ensemble model did not improve.

DISCUSSION

In this study, we developed machine learning models to distinguish between BOO and non-BOO, as well as DUA and non-DUA, in male patients with LUTS. Utilizing a comprehensive dataset that includes uroflowmetry, ultrasound-derived parameters, and patient-reported outcomes, our CatBoost and XGBoost models demonstrated strong performance in classifying BOO and DUA.
There have been ongoing efforts to develop noninvasive methods as alternatives to invasive UDS for diagnosing BOO and DUA. Various noninvasive evaluation techniques have been explored, including the penile cuff test (PCT), bladder wall thickness (BWT), detrusor wall thickness (DWT), and intravesical prostatic protrusion (IPP). The PCT, for instance, provides a noninvasive alternative to pressure flow studies (PFS) by measuring isovolumetric bladder pressure during micturition. Although PCT has demonstrated a high negative predictive value for BOO and offers shorter procedure times than PFS, it is limited by a low positive predictive value and diagnostic uncertainty due to variability in patient responses and voided volumes [10,11]. Similarly, techniques such as BWT and DWT, while useful for measuring anatomical changes, are constrained by the absence of standardized protocols and defined cutoff values, which diminish their diagnostic accuracy. IPP, although offering valuable insights into prostate obstruction, is highly dependent on the operator, introducing an additional layer of variability in clinical practice. Most of these studies have focused on differentiating BOO from non-BOO, with less attention given to diagnosing DUA [12].
Recent advancements in AI technology within the field of urology have been notable [13]; however, there are relatively few studies that focus specifically on developing machine learning models for diagnosing BOO and DUA in male patients with LUTS. Bang et al. [14] utilized deep learning techniques, including CNNs, to analyze uroflowmetry graphs and predict BOO and DUA. Despite this innovative approach, their models demonstrated only moderate performance, achieving AUROCs of approximately 73%. In a similar vein, Matsukawa et al. [15] created an AI-based diagnostic system for LUTS that depended exclusively on uroflowmetry data to classify BOO and DUA. Although this system reached an accuracy of 84%, its reliance on a single data source hindered its ability to comprehensively address the complex, multifactorial nature of LUTS. In a 2023 follow-up study, Matsukawa et al. [16] further explored the characteristics of uroflowmetry patterns, such as the initial peak flow rate, to improve differentiation between BOO and DUA. While they successfully identified significant patterns in the uroflowmetry data, their study did not incorporate the use of advanced machine learning algorithms.
Our study improved diagnostic accuracy by integrating multiple data sources, including prostate ultrasound and patient-reported outcomes. This comprehensive dataset allows our models to provide a more nuanced assessment of bladder function and obstruction. Utilizing CatBoost and XGBoost, our approach leverages a broad set of clinical data and effectively handles both categorical and continuous variables, thus making our models adaptable to various patient profiles. The CatBoost models have shown high sensitivity, effectively identifying true positive cases, which is essential for initial screenings. Conversely, the XGBoost models exhibit higher specificity and precision, making them ideal for confirming diagnoses and reducing false positives. This distinction underscores the strengths of each model, depending on clinical priorities such as minimizing false negatives or false positives. Our findings indicate that incorporating CatBoost and XGBoost models into the diagnostic workflow for LUTS could significantly improve clinical decision-making. These models provide noninvasive, cost-effective, and patient-friendly alternatives to traditional protocols, potentially offering more accurate diagnoses and customized treatment plans.
The limitations of our study are as follows. First, our models were specifically designed to distinguish between BOO and non-BOO, as well as DUA and non-DUA, rather than directly differentiating between BOO and DUA. This presents a limitation in clinical settings where both conditions might coexist, as ideally, a single model that can differentiate between BOO and DUA would be more beneficial. Second, the inherent complexity of DUA makes its definition based solely on the BCI somewhat limited. Although the BCI is a common metric in clinical research, it may not adequately reflect the complex nature of DUA. Nevertheless, to maintain consistency with previous studies and to establish a clear baseline, we chose to define DUA using the most widely accepted BCI thresholds found in the literature.
In conclusion, our machine learning-based approach to diagnosing BOO and DUA represents a significant advancement over previous studies, as it incorporates a broader array of clinical features and utilizes more sophisticated machine learning algorithms. This approach offers a promising noninvasive diagnostic tool that could improve clinical decision-making. Future research should aim to validate these models using larger datasets and incorporate more clinical and genetic data to further enhance their performance. The ongoing development of machine learning models shows great potential for transforming the diagnosis and management of BPH and other medical conditions.

NOTES

Grant/Fund Support
This work was supported by a National IT Industry Promotion Agency (NIPA) grant funded by the Korean government (MSIT) (No. H0401-24-1001, Development of AI Precision Medical Solution (Doctor Answer 2.0)).
Research Ethics
The data for this study were collected from Samsung Medical Center, following approval from the Institutional Review Board (IRB) under File No. SMC 2021-08-116.
Conflict of Interest
No potential conflict of interest relevant to this article was reported.
AUTHOR CONTRIBUTION STATEMENT
· Conceptualization: IY, KSL
· Data curation: KJK, DHH, HS, WJP
· Formal analysis: HS, WJP, IY
· Funding acquisition: KSL
· Methodology: DHH
· Project administration: KSL
· Visualization: WJP, IY
· Writing - original draft: KJK, HS
· Writing - review & editing: IY, KSL

REFERENCES

1. Foo KT. What is a disease? what is the disease clinical benign prostatic hyperplasia (BPH)? World J Urol 2019;37:1293-6. PMID: 30805683
2. Chapple CR, Wein AJ, Abrams P, Dmochowski RR, Giuliano F, Kaplan SA, et al. Lower urinary tract symptoms revisited: a broader clinical perspective. Eur Urol 2008;54:563-9. PMID: 18423969
3. Ko KJ, Lee CU, Lee KS. Clinical implications of underactive bladder. Investig Clin Urol 2017;58(Suppl 2):S75-81. PMID: 29279879
4. Murad L, Bouhadana D, Nguyen DD, Chughtai B, Zorn KC, Bhojani N, et al. Treating LUTS in men with benign prostatic obstruction: a review article. Drugs Aging 2023;40:815-36. PMID: 37556075
5. Rademakers KL, van Koeveringe GA, Oelke M. Detrusor underactivity in men with lower urinary tract symptoms/benign prostatic obstruction: characterization and potential impact on indications for surgical treatment of the prostate. Curr Opin Urol 2016;26:3-10. PMID: 26574876
6. Li X, Liao L. Updates of underactive bladder: a review of the recent literature. Int Urol Nephrol 2016;48:919-30. PMID: 26931421
7. Osman NI, Chapple CR, Abrams P, Dmochowski R, Haab F, Nitti V, et al. Detrusor underactivity and the underactive bladder: a new clinical entity? a review of current terminology, definitions, epidemiology, aetiology, and diagnosis. Eur Urol 2014;65:389-98. PMID: 24184024
8. Swavely NR, Speich JE, Stothers L, Klausner AP. New diagnostics for male lower urinary tract symptoms. Curr Bladder Dysfunct Rep 2019;14:90-7. PMID: 31938079
9. Nitti VW. Pressure flow urodynamic studies: the gold standard for diagnosing bladder outlet obstruction. Rev Urol 2005;7 Suppl 6:S14-21. PMID: 16986024
10. Ko KJ, Suh YS, Kim TH, Sung HH, Ryu GH, Lee KS. Diagnosing bladder outlet obstruction using the penile cuff test in men with lower urinary tract symptoms. Neurourol Urodyn 2017;36:1884-9. PMID: 28220532
11. Khosla L, Codelia-Anjum A, Sze C, Martinez Diaz S, Zorn KC, Bhojani N, et al. Use of the penile cuff test to diagnose bladder outlet obstruction: a systematic review and meta-analysis. Low Urin Tract Symptoms 2022;14:318-28. PMID: 35716000
12. Malde S, Nambiar AK, Umbach R, Lam TB, Bach T, Bachmann A, et al. Systematic review of the performance of noninvasive tests in diagnosing bladder outlet obstruction in men with lower urinary tract symptoms. Eur Urol 2017;71:391-402. PMID: 27687821
13. Kim ES, Eun SJ, Youn S. The current state of artificial intelligence application in urology. Int Neurourol J 2023;27:227-33. PMID: 38171322
14. Bang S, Tukhtaev S, Ko KJ, Han DH, Baek M, Jeon HG, et al. Feasibility of a deep learning-based diagnostic platform to evaluate lower urinary tract disorders in men using simple uroflowmetry. Investig Clin Urol 2022;63:301-8.
15. Matsukawa Y, Kameya Y, Takahashi T, Shimazu A, Ishida S, Yamada M, et al. Development of an artificial intelligence diagnostic system for lower urinary tract dysfunction in men. Int J Urol 2021;28:1143-8. PMID: 34342055
16. Matsukawa Y, Kameya Y, Takahashi T, Shimazu A, Ishida S, Yamada M, et al. Characteristics of uroflowmetry patterns in men with detrusor underactivity revealed by artificial intelligence. Int J Urol 2023;30:907-12. PMID: 37345347

Fig. 1.
Comparison of feature importance for CatBoost and XGBoost classifiers in predicting bladder outlet obstruction. (A) CatBoost. (B) XGBoost. XGBoost, extreme gradient boosting; BOO, bladder outlet obstruction; TZ, transitional zone; PSA, prostate-specific antigen; ICS, International Continence Society.
Fig. 2.
Comparison of feature importance for CatBoost and XGBoost classifiers in predicting detrusor underactivity. (A) CatBoost. (B) XGBoost. XGBoost, extreme gradient boosting; DUA, detrusor underactivity; TZ, transitional zone; PSA, prostate-specific antigen; ICS, International Continence Society.
Table 1.
Collected data

| Category | Data |
|---|---|
| Demographic and clinical data | Age, date of urodynamic study, diagnosis, medical history (diabetes mellitus, hypertension, cerebrovascular accident, radical pelvic surgery, spinal cord injury, dementia, and other neurological disorders) |
| Laboratory tests | Prostate-specific antigen levels |
| Transrectal ultrasound | Prostate size, T-zone size |
| Voiding diary parameters | Frequency, nocturia, functional bladder capacity, maximum bladder capacity, urgency scale, urgency (more than 3 episodes), urge urinary incontinence, nocturnal index, nocturnal polyuria index, nocturnal bladder capacity index, 24-hour total urine volume |
| Simple uroflowmetry | Maximal flow rate, voided volume, postvoided residual, average flow rate, voiding time, flow time, time to maximum flow, voiding efficiency |
| Cystometrogram | Maximum cystometric capacity, detrusor pressure at maximum flow |
| Pressure flow study | Maximum flow rate, voided volume, postvoided residual, first sensation of bladder filling, first desire to void, strong desire to void |
| Indices | Bladder outlet obstruction index, bladder contractility index |
| Questionnaires | International Continence Society Male Short-Form, International Prostate Symptom Score |

Table 2.
CatBoost versus XGBoost for BOO and DUA diagnosis

| Metric | BOO, CatBoost | BOO, XGBoost | DUA, CatBoost | DUA, XGBoost |
|---|---|---|---|---|
| AUROC | 0.809 | 0.826 | 0.803 | 0.819 |
| Accuracy | 0.730 | 0.755 | 0.739 | 0.734 |
| Sensitivity | 0.767 | 0.756 | 0.807 | 0.754 |
| Specificity | 0.709 | 0.755 | 0.621 | 0.701 |
| Precision | 0.610 | 0.648 | 0.786 | 0.813 |
| Recall | 0.767 | 0.756 | 0.651 | 0.754 |
| F1-score | 0.680 | 0.697 | 0.712 | 0.782 |
| PPV | 0.610 | 0.648 | 0.786 | 0.813 |
| NPV | 0.836 | 0.838 | 0.651 | 0.623 |

XGBoost, extreme gradient boosting; BOO, bladder outlet obstruction; DUA, detrusor underactivity; AUROC, area under the receiver operating characteristic curve; PPV, positive predictive value; NPV, negative predictive value.

Table 3.
Confusion matrices for BOO and DUA

| Condition | Model | Actual class | Predicted positive | Predicted negative |
|---|---|---|---|---|
| BOO | CatBoost | Positive | 120 | 60 |
| BOO | CatBoost | Negative | 54 | 248 |
| BOO | XGBoost | Positive | 86 | 94 |
| BOO | XGBoost | Negative | 18 | 284 |
| DUA | CatBoost | Positive | 261 | 44 |
| DUA | CatBoost | Negative | 81 | 96 |
| DUA | XGBoost | Positive | 196 | 109 |
| DUA | XGBoost | Negative | 32 | 145 |

BOO, bladder outlet obstruction; DUA, detrusor underactivity; XGBoost, extreme gradient boosting.

Table 4.
Performance of the ensemble model combining CatBoost and XGBoost for BOO and DUA prediction

| Metric | BOO | DUA |
|---|---|---|
| AUROC | 0.825 | 0.818 |
| Accuracy | 0.768 | 0.709 |
| Sensitivity | 0.483 | 0.656 |
| Specificity | 0.937 | 0.802 |
| Precision | 0.821 | 0.851 |
| F1-score | 0.608 | 0.741 |
| PPV | 0.821 | 0.851 |
| NPV | 0.753 | 0.575 |

XGBoost, extreme gradient boosting; BOO, bladder outlet obstruction; DUA, detrusor underactivity; AUROC, area under the receiver operating characteristic curve; NPV, negative predictive value; PPV, positive predictive value.

Copyright © 2024 by Korean Continence Society.
