INTRODUCTION
Vasectomy is a common male contraceptive method that involves cutting and suturing the vas deferens to block the discharge of sperm, thereby achieving contraception. The specifics of the procedure can vary depending on the surgical method, but it is generally simpler, quicker, and less expensive than female sterilization surgeries, with a shorter recovery period [1,2]. Unlike female contraceptive devices such as Mirena or the Loop, which have expiration dates, vasectomy offers the advantage of being semipermanent and reversible. Despite its minimal impact on family planning and sexual relationships, vasectomy is considered a matter of personal preference in Korea [3].
In South Korea, sterilization has been closely tied to cultural and social issues. From the 1960s to the 1980s, the South Korean government implemented family planning policies in order to curb rapid population growth. With slogans such as “Let's have only 2 children and raise them well,” the government actively campaigned to lower birth rates, and both vasectomies and tubal ligations were actively encouraged. The government offered incentives such as reduced surgery fees and exemptions from military service. Notably, in 1977, the subscription system introduced for the housing subsidy program gave priority to those who had undergone sterilization procedures, including vasectomies, which contributed to the increased popularity of these procedures [4].
This topic gained attention in newspapers at the time, with statistics showing a significant rise in the number of sterilizations from 80,000 at the end of 1976 to over 140,000 by the end of August 1977, and more men undergoing vasectomies than women undergoing sterilization. This surge was related to the Korean ‘dream of owning a house’ and future family planning [5].
Even in modern times, vasectomies continue to be associated with cultural and social issues. For example, in 2018, news articles reported on an 18-year-old high school senior who had a vasectomy during his vacation. In South Korea, it is uncommon to educate teenagers about proper contraception, so some parents tend to choose vasectomies to prevent unwanted pregnancies [6]. Some urology clinics and doctors refuse to perform vasectomies on unmarried men out of ethical concerns, and there are also articles that challenge the notion of performing vasectomies on minors [7]. Thus, it is evident that cultural and social factors significantly influence the awareness and acceptance of vasectomy.
Although numerous studies have confirmed the effectiveness and safety of vasectomy, misconceptions and biases persist, with some men fearing sexual deterioration and a loss of masculinity postsurgery [8]. Therefore, there is a need for research that identifies the cultural and social issues surrounding vasectomy.
To gain insight into these issues, it is crucial to analyze public discourse shared on social network platforms and other online public resources. Big data analysis, particularly using natural language processing (NLP) techniques focused on keywords, is increasingly being used to explore public perceptions and social phenomena [9-11]. In this regard, this study utilizes the keyword ‘vasectomy’ to identify relevant issues and topics within Q&A social platforms in South Korea through co-occurrence matrix analysis.
Related Work
NAVER Jisik iN
NAVER Jisik iN (knowledge exchange service between NAVER users) is one of the most actively used Q&A social services in South Korea. It serves as a community where users from diverse age groups and backgrounds engage in exchanging questions and answers [12]. This platform is particularly valuable for gathering data to identify public perceptions and to collect a wide range of perspectives on topics related to health, medical care, social issues, and more.
As shown in Fig. 1, questions on NAVER Jisik iN are submitted anonymously, and anyone can freely write answers. However, in the medical field, doctors and industry professionals can contribute by offering expert answers through the platform’s knowledge partner services, which consist of individuals from related fields. In addition, during the question submission process, the platform provides clear guidelines as shown in Table 1 and recommends specific doctor categories to enhance the quality of the inquiries. Table 2 summarizes the health consultation categories available on NAVER Jisik iN and the number of questions in each category as of July 2024.
By offering these features, NAVER Jisik iN enables users to freely ask and answer questions about sensitive topics such as vasectomy, while maintaining anonymity. This allows individuals to share their personal experiences and situations, while professionals and experts can contribute answers based on their specialized knowledge and experience. Consequently, the platform provides valuable data that is well-suited for analyzing public perceptions of vasectomy. Therefore, this study aims to analyze the keyword ‘vasectomy’ on NAVER Jisik iN to identify and explore social and cultural issues related to vasectomy in South Korea.
Co-occurrence matrix
In the field of NLP, a co-occurrence matrix is a method used to analyze the relationship between words by examining how frequently certain words appear together in textual data [13]. As shown in Fig. 2, the matrix records the number of times 2 words co-occur within a sentence or document in order to numerically express the association between words through a specific formula.
Co-occurrence matrices are employed in various NLP tasks, such as topic modeling, document clustering, and word meaning analysis. They are particularly valuable for determining semantic similarities between words or identifying key terms associated with a particular topic. Words that frequently co-occur around a topic are likely to be strongly linked to that topic, so they can provide insights into public perceptions or highlight the main issues being discussed. These matrices can be visualized to show the correlation between words based on their frequency, thus revealing hidden patterns within the data. This enables the identification of related topics or issues centered around specific keywords and facilitates the analysis of complex relationships in text data.
By applying a co-occurrence matrix to the data collected from NAVER Jisik iN, this study will identify key issues and topic words related to vasectomy.
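As an illustration of the idea, the following Python sketch builds a co-occurrence matrix from tokenized documents. The source does not state the formula used to weight word pairs, so the construction here (multiplying per-document word counts, which is equivalent to the product of a term-count matrix with its transpose) is an assumption. It is shown because it naturally yields mirrored pairs with equal weights as well as self-pairs such as (surgery - surgery). The function name and the sample documents are hypothetical.

```python
from collections import Counter, defaultdict


def cooccurrence_matrix(documents):
    """Return C where C[a][b] = sum over documents of count(a) * count(b).

    `documents` is a list of token lists, e.g. the nouns of each question.
    """
    matrix = defaultdict(lambda: defaultdict(int))
    for tokens in documents:
        counts = Counter(tokens)  # how often each word occurs in this document
        for word_a, count_a in counts.items():
            for word_b, count_b in counts.items():
                matrix[word_a][word_b] += count_a * count_b
    return matrix


# Hypothetical toy documents (already tokenized).
docs = [
    ["vasectomy", "surgery", "surgery", "pain"],
    ["surgery", "hospital"],
    ["vasectomy", "surgery", "hospital"],
]
C = cooccurrence_matrix(docs)
print(C["vasectomy"]["surgery"], C["surgery"]["vasectomy"], C["surgery"]["surgery"])
```

Under this construction the matrix is symmetric, so the pairs (a - b) and (b - a) always carry the same weight, and the diagonal entry of a word grows with how often that word is repeated within documents.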
Network analysis for visualization
This study aims to visualize the results of the co-occurrence matrix analysis as a network graph to better understand the correlations between words as shown in Fig. 3. To achieve this, this study will analyze the network’s centrality and grouping based on word correlations. Specifically, it will employ network analysis techniques such as degree centrality and community detection to assess the importance of specific nodes in the data visualization process and to construct the overall network structure.
Degree centrality: Degree centrality is a measure of the centrality of a particular node within a network. It reflects the number of direct connections (edges) the node has with other nodes. This measure helps evaluate the importance of a node by counting how many edges are directly linked to it. Degree centrality is further divided into in-degree and out-degree based on the direction of relationships between nodes. In-degree represents the node’s popularity, while out-degree indicates its influence. As networks grow in size, the degree centrality index may become distorted, making normalization necessary. Normalized degree centrality allows for comparisons across networks of different sizes, which is useful for evaluating a node’s influence or importance. Nodes with high degree centrality values have more connections to other nodes, indicating that they hold significant information or influence within the network [14].
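A minimal Python sketch of normalized degree centrality follows. It assumes the word network is undirected (so in-degree and out-degree coincide) and normalizes each degree by n - 1, the maximum number of neighbors a node can have in a network of n nodes. The function name and the example edges are hypothetical, and self-pairs are ignored because they do not connect two different words.

```python
def normalized_degree_centrality(edges):
    """Degree of each node in an undirected graph, divided by (n - 1)."""
    neighbors = {}
    for a, b in edges:
        if a == b:
            continue  # skip self-pairs such as (surgery - surgery)
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adjacent) / (n - 1) for node, adjacent in neighbors.items()}


# Hypothetical word pairs used as edges.
edges = [
    ("surgery", "vasectomy"),
    ("surgery", "hospital"),
    ("surgery", "pain"),
    ("vasectomy", "contraception"),
]
centrality = normalized_degree_centrality(edges)
print(centrality["surgery"], centrality["vasectomy"], centrality["pain"])
```

In this toy network ‘surgery’ is linked to three of the four other words, so it receives the highest normalized score.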
Community detection: Community detection is a method for identifying groups (communities) of closely connected nodes within a network. A community is a set of nodes that interact more frequently with each other than with those outside the community. Each community typically aggregates nodes based on specific characteristics or topics. Community detection helps to understand the modular structure of the network and to derive relationships between nodes. The Girvan-Newman algorithm is a widely used method for community detection, which identifies communities by iteratively removing the most central edges (those with the highest edge betweenness) within the network, as shown in Table 3 [15].
In this study, the Girvan-Newman algorithm was applied to detect communities within the network and assign community numbers to each node for visualization. This process is represented in the network visualization by grouping nodes that belong to the same community, making it easier to visually distinguish them. These techniques, degree centrality and community detection, are crucial tools in network visualization and analysis [16,17].
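The edge-removal idea behind the Girvan-Newman algorithm can be sketched in plain Python as below. This is an illustrative sketch rather than the study's implementation: it treats the network as unweighted and undirected, performs only the first split (the full algorithm keeps removing edges to produce a hierarchy of communities), and all function names and example edges are hypothetical.

```python
from collections import deque


def connected_components(adj):
    """Return the connected components of an adjacency dict as a list of sets."""
    seen, components = set(), []
    for start in adj:
        if start in seen:
            continue
        component, queue = set(), deque([start])
        seen.add(start)
        while queue:
            node = queue.popleft()
            component.add(node)
            for nxt in adj[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        components.append(component)
    return components


def edge_betweenness(adj):
    """Brandes-style edge betweenness for an unweighted, undirected graph."""
    score = {}
    for source in adj:
        # Breadth-first search counting shortest paths (sigma) from the source.
        order, preds = [], {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[source], dist[source] = 1, 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, sharing path credit across edges.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                credit = sigma[v] / sigma[w] * (1.0 + delta[w])
                edge = tuple(sorted((v, w)))
                score[edge] = score.get(edge, 0.0) + credit
                delta[v] += credit
    return score


def girvan_newman_first_split(edges):
    """Remove highest-betweenness edges until the graph gains a component."""
    adj = {}
    for a, b in edges:
        if a == b:
            continue  # self-pairs carry no information about communities
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    initial = len(connected_components(adj))
    while len(connected_components(adj)) == initial:
        score = edge_betweenness(adj)
        if not score:
            break  # no edges left to remove
        a, b = max(sorted(score), key=score.get)  # sorted() makes ties deterministic
        adj[a].discard(b)
        adj[b].discard(a)
    return connected_components(adj)


# Hypothetical network: two word triangles joined by one bridging edge.
edges = [
    ("surgery", "hospital"), ("hospital", "pain"), ("surgery", "pain"),
    ("contraception", "pregnancy"), ("pregnancy", "woman"), ("contraception", "woman"),
    ("surgery", "contraception"),
]
communities = girvan_newman_first_split(edges)
print(sorted(sorted(group) for group in communities))
```

Every shortest path between the two triangles passes through the (surgery - contraception) edge, so it has the highest betweenness and is removed first, leaving two communities. Graph libraries such as NetworkX also provide a `girvan_newman` function in their community module, which may be preferable for larger networks.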
MATERIALS AND METHODS
This study involves data collection, data analysis, and visualization as shown in Fig. 4. The first step is to identify the structure and CSS (Cascading Style Sheets) elements of the NAVER Jisik iN platform, which is essential for efficiently extracting the necessary information during data collection. By understanding the structure of the web pages, this study establishes a foundation for accurately gathering only the relevant source data required for analysis as shown in Table 4. To automate data collection, this study configures a Python virtual environment using Selenium, a web automation tool. It begins by collecting URLs of detailed question pages that appear when searching for the keyword ‘vasectomy’ on NAVER Jisik iN. Using these URLs, it automates the extraction of data from each detailed page. The collected data is then categorized and saved as a CSV (comma-separated values) file in an automated environment.
The stored data undergoes preprocessing steps, including text cleaning, the removal of stop words, and the elimination of duplicate data, in preparation for analysis. The preprocessed data is then normalized, and only noun words are extracted using Python’s Open Korean Text morphological analyzer [18]. After morphological analysis, the data is subjected to co-occurrence matrix analysis to explore the relationships between words. From the numerous word pairs derived from the co-occurrence matrix, the top 200 word pairs are selected based on their weight. These selected word pairs are then used to calculate the degree centrality of each word, and the Girvan-Newman algorithm is applied to detect communities within the word network, preparing the data for visualization.
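The preprocessing and pair-selection steps described above can be sketched in Python as follows. This is a simplified stand-in under stated assumptions: whitespace tokenization of English text replaces the Korean noun extraction performed with the Open Korean Text analyzer, the stop-word list is hypothetical because the study's list is not given, and the function names are illustrative. The example weights are the question-data weights reported in the Results.

```python
import heapq
import re

# Hypothetical stop-word list; the list used in the study is not given.
STOP_WORDS = {"the", "a", "is"}


def preprocess(texts, stop_words=STOP_WORDS):
    """Clean text, drop exact duplicates, and remove stop words."""
    seen, documents = set(), []
    for text in texts:
        cleaned = re.sub(r"[^\w\s]", " ", text.lower())  # strip punctuation
        cleaned = re.sub(r"\s+", " ", cleaned).strip()   # collapse whitespace
        if not cleaned or cleaned in seen:
            continue  # skip empty entries and duplicates
        seen.add(cleaned)
        documents.append([t for t in cleaned.split() if t not in stop_words])
    return documents


def top_weighted_pairs(pair_weights, n=200):
    """Return the n heaviest (pair, weight) items, heaviest first."""
    return heapq.nlargest(n, pair_weights.items(), key=lambda item: item[1])


documents = preprocess(["Is the surgery painful?", "is the surgery painful", "Hospital cost"])
print(documents)

# Word-pair weights taken from the question-data results reported in this paper.
weights = {
    ("surgery", "surgery"): 28060,
    ("vasectomy", "surgery"): 16394,
    ("woman", "woman"): 8564,
    ("marriage", "marriage"): 7440,
    ("surgery", "alcohol"): 4903,
}
top_pairs = top_weighted_pairs(weights, n=3)
print(top_pairs)
```

Using `heapq.nlargest` avoids sorting the full list of pairs when only the top entries are needed, which can matter when the co-occurrence matrix contains many word pairs.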
Finally, based on the results of degree centrality and community detection, this study visualizes the data as a network graph using the R programming language, which offers extensive functionality for data visualization. The resulting visualizations reveal the relationships between words, allowing us to intuitively understand how public perceptions of vasectomy and major issues are interconnected on NAVER Jisik iN.
RESULTS
This study analyzes question data to identify word pairs related to topics of user interest and examines how public responses are structured in the answer data. It extracts key terms from both questions and answers, analyzing their connections. Co-occurrence matrices from each dataset are visualized as network graphs, highlighting word centrality and interconnections. By comparing these graphs, the study assesses whether specific keywords consistently hold importance across questions and answers or if they are used differently in each context [19,20].
Data Collection
The data collection is summarized in Table 5. The data collection period spanned from July 2019 to July 31, 2024, totaling 5 years, during which data from 13,487 detail pages were gathered. The search keyword ‘vasectomy’ was used, and data not related to urology were removed. Additionally, entries consisting solely of images without accompanying text, as well as irrelevant macro answers and duplicate data unrelated to the intent of the question, were excluded. In total, 5,838 questions and 11,546 answers were retained for analysis. Due to the nature of the service, each detail page with a question may contain multiple answers, and there were instances where the questioner deleted some of the responses.
Co-occurrence Matrix Analysis Results for Question Data
Table 6 presents the results of the co-occurrence matrix analysis for the question data. This study focused on analyzing the intent behind the top 50 word pairs out of the top 200 based on their weights.
First, the word pair (surgery - surgery) has the highest weight (28,060), indicating that ‘surgery’ is frequently mentioned as a central topic in the question content. This suggests that users are particularly concerned with the surgical aspect of vasectomy. The word pairs (vasectomy - surgery) and (surgery - vasectomy) both have a weight of 16,394, highlighting that vasectomy itself is a prominent topic within the questions, with users seeking specific information or experiences related to surgery.
Next, high-ranking pairs like (woman - woman) (8,564) and (marriage - marriage) (7,440) suggest frequent mentions of these topics, reflecting the social and cultural contexts of vasectomy, particularly regarding marital relationships and gender roles. Pairs such as (surgery - alcohol) (4,903) and (alcohol - surgery) (4,903) also indicate common queries about alcohol consumption after surgery. These questions might involve concerns about how drinking could affect surgical outcomes or recovery.
Other notable word pairs include (pregnancy - surgery) (3,892), (surgery - pregnancy) (3,892), and (contraception - contraception) (3,868), which point to questions about the relationship between vasectomy, pregnancy, and contraceptive effectiveness.
In addition, the word pairs (surgery - hospital) (3,420) and (hospital - surgery) (3,420) indicate that many questions pertain to where to undergo a vasectomy. This reflects an interest in selecting a reputable medical center or hospital for the procedure.
Finally, the word pairs (surgery - pain) (2,365) and (pain - surgery) (2,365) suggest that pain management and concerns about postsurgical discomfort are significant topics in the questions.
Co-occurrence Matrix Analysis Results for Answer Data
Table 7 shows the co-occurrence matrix analysis for the answer data, focusing on the top 50 word pairs out of the top 200 based on weight.
The highest-weighted pair, (surgery - surgery) (327,416), highlights that ‘surgery’ is a central theme, with frequent discussions on the surgical aspects of vasectomy. The pairs (surgery - vasectomy) and (vasectomy - surgery) (both 151,954) reinforce the significance of these terms in the context of vasectomy.
High-ranking word pairs such as (contraception - contraception) (140,856), (contraception - method) (77,197), and (method - contraception) (77,197) indicate that contraception is closely associated with the purpose of vasectomy in the answers. The frequent mention of these terms suggests that respondents often discuss vasectomy in relation to birth control.
Other significant word pairs include (contraception - intake) (58,316) and (intake - contraception) (58,316), which highlight discussions about vasectomy in the context of women’s use of birth control pills. Additionally, the word pairs (contraception - menstruation) (47,201) and (menstruation - contraception) (47,201) reflect discussions about how vasectomy and contraception relate to menstruation and family planning.
The word pairs (surgery - penis) (30,834) and (penis - surgery) (30,834) indicate that many answers address how vasectomy is performed in relation to male genitalia. This reflects concerns about potential physical changes or health issues following the procedure.
Lastly, the word pairs (surgery - hospital) (27,192) and (hospital - surgery) (27,192) show that respondents often provide information about selecting a hospital for a vasectomy, which aligns with the questioners’ interest in finding a reliable medical facility.
Network Visualization and Analytics for Question Data
Fig. 5 presents a network graph based on the key words derived from the co-occurrence matrix analysis of the question data with the centrality of connections and community structures. This graph provides a clear view of how keywords related to vasectomy are interconnected, highlighting the importance and relationships of each word.
The word ‘surgery’ is notably central, serving as a pivotal node with extensive connections, positioning it as a core topic in vasectomy discussions. Other significant nodes, such as ‘vasectomy’ and ‘contraception,’ underscore their frequent mention alongside vasectomy.
The graph is color-coded to represent different communities, each representing a distinct theme. Table 8 shows the detailed information of each community.
The edges, or connecting lines, between words indicate the frequency with which 2 words are mentioned together. The thickness and color intensity of these edges visually represent the strength of the connection. For instance, the thick, dark line between ‘surgery’ and ‘vasectomy’ emphasizes the frequent co-occurrence of these terms.
Network Visualization and Analytics for Answer Data
Fig. 6 presents a network graph based on the key words derived from the co-occurrence matrix analysis of the answer data to analyze the centrality of connections and community structures. This graph visually illustrates how key words are interconnected in the answers related to vasectomy and offers an intuitive understanding of the importance of each word and their relationships.
The centrality of the words ‘surgery’ and ‘contraception’ is a defining feature of the graph. Both words are positioned at the center of their respective network groups and are extensively connected to other key words. The frequent mention of vasectomy in the answer data, particularly in the context of the surgical procedure and its contraceptive effects, is evident.
The color-coded communities within the graph highlight the connections between different topics. Table 9 shows the detailed information of each community.
The edges between words visually emphasize the frequency of their co-occurrence, with strong connections between ‘surgery’ and ‘vasectomy’ indicating that these terms are often mentioned together. Similarly, the strong connection between ‘contraception’ and ‘woman’ highlights the importance of discussions about contraception and its impact on women.
Public Perceptions of Vasectomy and Side Effects: BPH and Dysuria
While vasectomy is a widely accepted method of contraception for men, there are some public concerns regarding potential side effects such as benign prostatic hyperplasia (BPH), dysuria, and prostate cancer. However, no objective link or clear correlation between vasectomy and these conditions has been conclusively identified [21-23].
In this regard, this study extracted keywords related to ‘BPH’ and ‘dysuria’ from the analyzed vasectomy-related data and conducted an in-depth analysis of public perceptions concerning these side effects.
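The keyword-focused extraction can be illustrated with a short Python sketch that pulls every word pair containing a target term out of a table of pair weights and ranks the partner words by weight. The function name is hypothetical, and the example weights are the answer-data values for ‘BPH’ reported in this section, plus one unrelated pair from the answer-data results to show the filtering.

```python
def related_terms(pair_weights, keyword):
    """Rank the words that co-occur with `keyword`, heaviest first."""
    related = {}
    for (word_a, word_b), weight in pair_weights.items():
        if word_a == word_b:
            continue  # ignore self-pairs
        if keyword in (word_a, word_b):
            partner = word_b if word_a == keyword else word_a
            # Mirrored pairs carry the same weight, so keep a single entry.
            related[partner] = max(weight, related.get(partner, 0))
    # Sort by descending weight, breaking ties alphabetically.
    return sorted(related.items(), key=lambda item: (-item[1], item[0]))


# Weights for 'BPH' in the answer data, as reported in this section.
answer_pairs = {
    ("BPH", "prostate cancer"): 502,
    ("prostate cancer", "BPH"): 502,
    ("BPH", "symptoms"): 176,
    ("BPH", "surgery"): 164,
    ("BPH", "risk"): 96,
    ("BPH", "patient"): 96,
    ("surgery", "hospital"): 27192,  # unrelated pair, filtered out
}
bph_terms = related_terms(answer_pairs, "BPH")
print(bph_terms)
```

Because the pairs are filtered by keyword before ranking, a term such as ‘BPH’ can be examined even when its weights are far too low to appear among the top 200 pairs of the full matrix.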
Benign prostatic hyperplasia
The word analysis in Table 10 presents public perceptions of vasectomy, focusing on the terms ‘BPH’ and ‘prostate.’ The analysis was conducted using the direct medical term ‘BPH’ and the related term ‘prostate’ in order to provide a broader scope.
In the questions, the word pair ‘BPH’ and ‘surgery’ (weight: 15) had the highest relevance, though the low weight indicates that ‘BPH’ is rarely mentioned. This suggests that while the term is relevant, it is not frequently brought up by the public in their inquiries. In contrast, the word pair ‘BPH’ and ‘prostate cancer’ (weight: 502) showed the highest relevance in the answers. This indicates that experts or respondents frequently discuss the potential connection between BPH and prostate cancer. Other highly ranked terms include ‘surgery’ (164), ‘symptoms’ (176), ‘risk’ (96), and ‘patient’ (96), reflecting a strong interest in the various symptoms, surgical risks, and diseases associated with BPH.
Prostate
To broaden the scope of the analysis, this study also includes the term ‘prostate.’ In the questions, ‘prostate’ and ‘surgery’ (weight: 270) have the highest correlation. This indicates that the public is curious about the relationship between vasectomy and prostate surgery, or is interested in surgical approaches to address prostate-related health issues. Other notable terms include ‘vasectomy’ (135), ‘test’ (87), ‘alcohol’ (79), ‘testicles’ (76), and ‘pain’ (65). This suggests significant public interest in the connections between vasectomy, prostate health, related tests, and postoperative pain.
Similar to the BPH analysis, the word pair ‘prostate’ and ‘prostate cancer’ (weight: 1,868) had the highest relevance in the answers. Additional terms with high relevance include ‘vasectomy’ (1,197), ‘surgery’ (1,599), ‘symptoms’ (1,061), and ‘treatment’ (1,088), highlighting public interest in understanding the symptoms, risks, and treatment options for prostate-related conditions.
This analysis indicates that while the public’s questions may focus more on procedural aspects and immediate health concerns, the answers provided by experts often emphasize potential long-term risks, particularly the association between BPH, prostate surgery, and prostate cancer.
Urination
Table 11 provides an analysis of public perceptions concerning the keywords ‘urination’ and ‘urine’ in discussions related to vasectomy. The analysis highlights the close association of these terms with ‘dysuria’, a condition that may occur after a vasectomy, and the various questions and answers related to this issue.
In the questions, the word pair ‘urination’ and ‘pain’ (weight: 22) is the most closely related, though the low weight value suggests that the term ‘urination’ is not frequently used by the public in their inquiries. This indicates that while there is some concern about pain during urination post-vasectomy, the term itself may not be a primary focus in public questions.
Conversely, the word pair ‘urination’ and ‘prostate cancer’ (weight: 504) shows the highest relevance in the answers, suggesting that many responses discuss the potential relationship between urination problems after a vasectomy and prostate cancer. Other frequently associated words include ‘symptoms’ (371), ‘prostate’ (292), and ‘occurrence’ (259), which further underscore the strong connection between urination issues and prostate health in public discourse. The mention of ‘prostatitis’ (185) also indicates that additional conditions related to postvasectomy urination problems are being considered and discussed.
Urine
To broaden the scope, an additional analysis was conducted using the generic term ‘urine.’ The results show that the word pair ‘urine’ and ‘surgery’ (weight: 375) had the highest correlation in the questions, indicating that the public is interested in the relationship between vasectomy and urinary issues. Other frequently mentioned words include ‘urine’ (296), ‘test’ (250), and ‘vasectomy’ (204), which indicate that people have various questions about urine-related symptoms or tests after a vasectomy. The word ‘pain’ (194) further suggests that individuals have experienced pain while urinating after a vasectomy and are actively seeking information.
In the answers, the word pair ‘urine’ and ‘prostate cancer’ (weight: 1,005) had the highest relevance, suggesting that many responses address the connection between urinary problems after a vasectomy and prostate cancer. Additional highly relevant terms such as ‘surgery’ (930), ‘pregnancy’ (772), ‘contraception’ (653), and ‘test’ (583) reveal that discussions also focus on the use of urine tests in confirming pregnancy and contraception.
The analysis revealed that the public harbors significant concerns about potential prostate issues, urination problems, and urine-related symptoms that might arise after a vasectomy. This concern is particularly pronounced among middle-aged and older men, who are especially worried about the health risks associated with post-surgical urination problems. Additionally, the keywords ‘prostate,’ ‘BPH,’ ‘urination,’ and ‘urine’ were all heavily weighted in responses related to the keyword ‘prostate cancer.’ This likely reflects the concerns of individuals who either feared developing prostate cancer after a vasectomy or experienced urination discomfort. In response, expert respondents frequently addressed the uncertain causal relationship between vasectomy and prostate cancer, often using the term ‘prostate cancer’ in their explanations, which contributed to its prominence in the analysis. Representative questions and answers are provided in Table 12.
DISCUSSION
The purpose of this study is to conduct an in-depth analysis of public perceptions of vasectomy and the major issues surrounding it by using data collected from the NAVER Jisik iN service in South Korea. To this end, this study employed co-occurrence matrix analysis to examine Q&A data, and visualized the relationships between words through degree centrality and community analysis. The findings revealed that keywords such as ‘surgery,’ ‘vasectomy,’ and ‘contraception’ are central to public discussions, with their interconnections illustrating how public interest in vasectomy is structured.
A comparison of the visualization graphs derived from the Q&A data showed that discussions on vasectomy in both datasets focus on specific topics. Questions predominantly sought information related to the surgical procedure, while answers emphasized the physiological aspects related to contraceptive effectiveness and women postsurgery. These results suggest that public awareness of vasectomy is shaped not only by the procedure itself but also by its effects, side effects, and the cultural and social issues surrounding it.
It is noteworthy that negative keywords such as ‘prostate,’ ‘BPH,’ ‘urination,’ and ‘urine’ were heavily weighted in the answer analysis, along with terms related to ‘prostate cancer.’ This likely reflects the concerns of individuals who are either worried about the potential link between vasectomy and prostate cancer or who have experienced urination discomfort and suspect it might be related to prostate cancer. In response, expert respondents frequently discussed the uncertain causal relationship between vasectomy and prostate cancer, often referencing the medical term ‘prostate cancer’ in their explanations, which likely contributed to its prominence in the analysis.