<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.2 20190208//EN" "https://jats.nlm.nih.gov/publishing/1.2/JATS-journalpublishing1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <article-meta>
      <title-group>
        <article-title>The Role of Strategic Financial Management in Enhancing Corporate Value and Competitiveness in the Digital Economy</article-title>
      </title-group>
      <contrib-group content-type="author">
        <contrib contrib-type="person">
          <name>
            <surname>Ahmad</surname>
            <given-names>Israr</given-names>
          </name>
          <email>chaudhryisrar@gmail.com</email>
          <xref ref-type="aff" rid="aff-1"/>
        </contrib>
      </contrib-group>
      <aff id="aff-1">
        <institution>Universiti Sains Malaysia</institution>
        <country>Malaysia</country>
      </aff>
      <history>
        <date date-type="received" iso-8601-date="2023-06-08">
          <day>08</day>
          <month>06</month>
          <year>2023</year>
        </date>
        <date date-type="published" iso-8601-date="2024-02-10">
          <day>10</day>
          <month>02</month>
          <year>2024</year>
        </date>
      </history>
    </article-meta>
  </front>
  
  
<body id="body">
    <sec id="sec-1">
      <title>Introduction </title>
      <p id="_paragraph-2">An estimated 10-20% of the world's population, up to 1.2 billion people, belong to ethnic and cultural minorities, many of whom face discrimination, marginalization, and systemic exclusion. These realities underscore the urgent need for inclusive, culturally sensitive systems that safeguard and accurately reflect minority identities (Goodman et al., 2023). However, despite decades of progress in information science, many existing knowledge systems fail to capture the complexity, fluidity, and richness of these identities. The central question thus arises: how can we ethically and accurately represent ethnographic diversity in the digital systems upon which researchers, institutions, and policymakers increasingly rely?</p>
      <p id="_paragraph-3">Knowledge Organization Systems (KOS), including taxonomies, classification schemes, thesauri, and ontologies, have long provided the backbone for structuring and retrieving information (Feng et al., 2024; Moraitou et al., 2022). Yet these systems, often shaped by Western epistemological frameworks, embed structural biases that marginalize non-Western and Indigenous knowledge (Doyle, 2013; Rubim Silva &amp; Dal'Evedove, 2025). As a result, traditional KOS tend to represent knowledge from dominant cultural standpoints, rendering minority perspectives either invisible or misrepresented (Wright &amp; Saelua, 2023; Anania &amp; Stiglitz, 2023).</p>
      <p id="_paragraph-4">In contrast, Knowledge Graphs (KGs) offer semantically rich and flexible structures for modeling complex relationships between entities. As graph-based systems, KGs are increasingly recognized for their potential to support pluralistic and interdisciplinary knowledge ecosystems, particularly when integrated with culturally grounded ontological frameworks (Dally et al., 2024; Nguyen et al., 2023). Moreover, language is a central dimension of both KOS and KGs. Recognizing and integrating minority languages not only supports cultural preservation but also promotes equitable access to services, data, and epistemic justice (Couldry &amp; Mejias, 2023). </p>
      <p id="_paragraph-5">Nonetheless, mainstream classification systems such as the Library of Congress Subject Headings (LCSH) and the Dewey Decimal Classification (DDC) continue to draw criticism for their Eurocentric hierarchies. In response, alternative and community-led models like <italic id="_italic-1">Lau ā Lau ka ʻIke</italic> and the Infomediary of Taiwanese Indigenous Peoples have emerged to reflect Indigenous epistemologies (Sung &amp; Chi, 2021; Wright &amp; Saelua, 2023). While these initiatives exemplify valuable resistance to epistemic hegemony, they remain peripheral in the global knowledge infrastructure (Huang &amp; Zaslavsky, 2024). Moreover, challenges persist regarding data sovereignty, co-design ethics, and the risk of reproducing power asymmetries through externally imposed ontologies (Siegers et al., 2023; Hou et al., 2022).</p>
      <p id="_paragraph-6">Recent efforts to document intangible heritage, such as languages, oral histories, and cultural artifacts, have demonstrated the value of linked open data (LOD) and KG technologies (Cui et al., 2024; Du et al., 2025; Fan, 2023). However, many of these systems still fall short in representing hybrid or evolving identities, especially when modeled through rigid taxonomies. For example, ontologies developed for health or archival data, while improving formal data accuracy, often struggle to reconcile contextual, relational, and narrative-based forms of knowledge (Nguyen et al., 2023; Mateos, 2007; Woods, 2017).</p>
      <p id="_paragraph-7">In the domain of knowledge management, the role of tacit knowledge, oral traditions, and community archiving remains central, particularly among minority groups adapting to globalization and generational change (Sewdass, 2014; Bonilla et al., 2025). Nevertheless, structural inequities, microaggressions, and digital divides continue to impede effective knowledge sharing and cultural visibility (Davis, 2015; Khurram &amp; Giangiulio, 2020; Li et al., 2023).</p>
      <p id="_paragraph-8">Emerging LOD and knowledge graph applications therefore offer promising solutions. Projects in Laos, Vietnam, and the Greater Mekong Subregion have enabled multilingual, standardized access to ethnic knowledge and folktales (Chansanam et al., 2024; Ngootip et al., 2023; Ngoc &amp; Chansanam, 2023). Broader efforts also document applications in preserving craftsmanship, biodiversity, and ethnographic knowledge (Bai &amp; Hou, 2023; Chansanam et al., 2022, 2020; Cui et al., 2024; Lu et al., 2023). Still lacking, however, is a cohesive, interdisciplinary synthesis that critically evaluates the equity and sustainability of these digital innovations (Jaroenruen et al., 2024; Khoo et al., 2024).</p>
      <p id="_paragraph-9">To address this gap, this study engages with both decolonial information science and intercultural communication theories. Drawing on scholars such as Mignolo (2011) and Green (1999), it recognizes the need for research methodologies that legitimize Indigenous knowledge structures and allow for community-defined ontologies. Additionally, perspectives from Gudykunst (2003, 2004) and Kim (2015) shape the system’s emphasis on relational modeling, multilingual design, and cultural fluidity, ensuring that identity representation is negotiated rather than imposed.</p>
      <p id="_paragraph-10">Despite advances in graph technologies, traditional KOS continue to rely on siloed, hierarchical taxonomies that restrict the representation of complex ethnographic relationships (Smith, 2021). This structural rigidity not only limits interoperability across databases but also impedes discovery, learning, and cross-disciplinary innovation. Consequently, there is a pressing need for systems that can semantically integrate geo-temporal, linguistic, cultural, and relational data in a unified yet flexible framework.</p>
      <p id="_paragraph-11">This study responds to these challenges by designing, constructing, and evaluating a knowledge graph that models the ethnographic diversity of 375 ethnic groups across six countries in the Greater Mekong Subregion (GMS). The graph incorporates multilingual identifiers, migration histories, cultural practices, and linguistic lineages, offering a nuanced and scalable model for inclusive information representation. Furthermore, the project includes a custom-built web application for real-time visualization, interactive exploration, and expert validation. Accordingly, this research addresses the following question: How can a semantically enriched, ethically grounded knowledge graph framework improve the representation, accessibility, and interpretability of complex ethnographic data across the Greater Mekong Subregion?</p>
      <p id="_paragraph-12">This study makes four key contributions. First, it introduces a cross-border, region-wide KG on ethnographic diversity, integrating community-sourced and validated data. Second, it advances methodological integration by combining KOS, KG, and knowledge management approaches to model identity, migration, and cultural relationships. Third, it presents an open-access web platform supported by robust expert validation and usability metrics. Finally, it establishes a reusable framework for inclusive, scalable, and ethically oriented digital heritage systems that can be adapted for other regions and disciplines. Moreover, this research not only contributes to digital ethnography and semantic technologies but also advocates for epistemic justice by challenging dominant paradigms in knowledge organization and representation.</p>
    </sec>
    <sec id="sec-2">
      <title>Methodology </title>
      <p id="_paragraph-13">This research focuses on developing and implementing a Knowledge Graph (KG) web application designed to model the cultural and ethnic diversity of the Greater Mekong Subregion (GMS). The study emphasizes the transformation of a structured dataset into a semantic knowledge graph that captures the complex relationships among 375 ethnic groups across six countries (Thailand, Laos, Myanmar, Cambodia, Vietnam, and China), highlighting linguistic, cultural, geographic, and historical attributes. The construction process for the GMS Ethnic Groups Knowledge Graph is illustrated in Figure 1.</p>
      <fig id="fig1">
        <label>Figure 1</label>
        <caption>
          <title><bold id="_bold-1"/>The GMS ethnic groups knowledge graph construction process</title>
          <p id="_paragraph-14"/>
        </caption>
        <graphic id="_graphic-1" mimetype="image" mime-subtype="png" xlink:href="image1.png"/>
      </fig>
      <p id="paragraph-e2dcf0544bc9f37a66622862302ab6f5">
        <bold id="bold-c094c71893750503b0e104208d47ee4f">Data Collection and Analysis</bold>
      </p>
      <p id="_paragraph-16">The primary data source for this study comprises a comprehensive dataset that was meticulously collected, analyzed, and synthesized as part of a collaborative research initiative led by local scholars across the Greater Mekong Subregion. This collaborative approach ensured cultural sensitivity, contextual relevance, and authenticity in the information presented. The dataset included critical informational fields such as ethnolinguistic identifiers (autonyms, ethnonyms, exonyms, and endonyms), spoken languages and their respective language families, geographic distribution, national affiliations, and cultural attributes, including traditional housing, clothing, festivals, and religious practices. Additional variables, such as population size and historical migration patterns, were incorporated to provide a multidimensional representation of ethnic diversity in the region.</p>
      <p id="_paragraph-17">The dataset was compiled by more than 20 local scholars affiliated with regional universities, cultural heritage institutions, and ethnolinguistic research centers. Data were obtained from diverse sources, including national censuses, ethnographic field reports, community archives, oral histories, and peer-reviewed studies. Each scholar contributed to a specific subset of ethnic groups within their respective country to ensure accuracy and contextual appropriateness. To enhance validity, the data were cross-referenced with authoritative sources such as Ethnologue, Glottolog, and official linguistic atlases. Ambiguities or inconsistencies were flagged and subsequently reviewed by an advisory panel of senior ethnographers. Additionally, an auxiliary JSON file was created to facilitate downstream processing. To enrich contextual depth, the knowledge graph integrated linguistic taxonomies, cultural classification frameworks, and geospatial information.</p>
      <p id="paragraph-825f9d7e97501be203c3aac7ab2f35ee">
        <bold id="bold-e67fb3271ab33d6db82117dcb20ca32a">Study Selection and Entity Identification</bold>
      </p>
      <p id="_paragraph-18">A systematic approach was employed to identify and define core entities and relationships within the knowledge graph. These included entities such as ethnic groups, languages, countries, geographic regions, religions, festivals, and housing types, along with corresponding relationships that captured spatial, linguistic, cultural, and historical linkages. For example, relationships were modeled to represent location associations, language families, religious practices, inter-ethnic cultural exchanges, and naming structures. This structured approach ensured that the KG accurately reflects the interconnected nature of ethnographic attributes in the region.</p>
      <list list-type="bullet" id="list-2b9025d80a3e3413d27c3b2ec7299af7">
        <list-item>
          <p><italic id="_italic-2">LOCATED_IN</italic> (ethnic group → country),</p>
        </list-item>
        <list-item>
          <p><italic id="_italic-3">SPEAKS_LANGUAGE_FROM</italic> (ethnic group → language family),</p>
        </list-item>
        <list-item>
          <p><italic id="_italic-4">PRACTICES</italic> (ethnic group → religion),</p>
        </list-item>
        <list-item>
          <p><italic id="_italic-5">SHARES_ORIGIN_WITH</italic>, <italic id="_italic-6">SHARES_HOUSING_STYLE_WITH</italic>, <italic id="_italic-7">SHARES_FESTIVAL_TRADITION_WITH</italic> (inter-ethnic relationships),</p>
        </list-item>
        <list-item>
          <p><italic id="_italic-8">HAS_AUTONYM</italic>, <italic id="_italic-9">HAS_EXONYM</italic>, <italic id="_italic-10">HAS_ETHNONYM</italic> (ethnic group naming structures).</p>
        </list-item>
      </list>
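      <p>The relationship types listed above can be read as a small schema that constrains which kinds of node each edge may connect. The following is a minimal sketch of such a check. The node labels (<italic>EthnicGroup</italic>, <italic>Country</italic>, and so on) and the function name are illustrative assumptions and are not taken from the study's implementation.</p>
      <code language="python"><![CDATA[
```python
# Illustrative schema check. The node labels are assumptions made for this
# sketch; only the relationship names come from the list above.
RELATIONSHIP_SCHEMA = {
    "LOCATED_IN": ("EthnicGroup", "Country"),
    "SPEAKS_LANGUAGE_FROM": ("EthnicGroup", "LanguageFamily"),
    "PRACTICES": ("EthnicGroup", "Religion"),
    "SHARES_ORIGIN_WITH": ("EthnicGroup", "EthnicGroup"),
    "SHARES_HOUSING_STYLE_WITH": ("EthnicGroup", "EthnicGroup"),
    "SHARES_FESTIVAL_TRADITION_WITH": ("EthnicGroup", "EthnicGroup"),
    "HAS_AUTONYM": ("EthnicGroup", "Autonym"),
    "HAS_EXONYM": ("EthnicGroup", "Exonym"),
    "HAS_ETHNONYM": ("EthnicGroup", "Ethnonym"),
}


def is_valid_edge(source_label, rel_type, target_label):
    """Return True when the edge matches the declared source and target labels."""
    expected = RELATIONSHIP_SCHEMA.get(rel_type)
    return expected == (source_label, target_label)
```
]]></code>
      <p>A check of this kind could be run before loading, so that an edge with an unknown relationship type or a reversed direction is rejected rather than silently added to the graph.</p>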
      <p id="paragraph-6e01d20c2388268523083ea716eaec56">
        <bold id="bold-7d3fed77bbace504d9beef5323826ae9">Data Extraction and Cleaning</bold>
      </p>
      <p id="_paragraph-19">Data preprocessing involved several critical steps to guarantee consistency and accuracy. First, a standardization process was implemented to harmonize language family names and normalize country labels, using Glottolog and Ethnologue classifications as primary references. Locally used or culturally specific terms were aligned with widely accepted academic terminology. For example, terms such as “Kadai” and “Kra-Dai” were standardized under “Tai-Kadai,” while “Mon-Khmer” and “Austroasiatic” were reconciled in accordance with prevailing linguistic consensus. Where discrepancies arose, expert consultations with regional linguists were conducted to ensure precise classification.</p>
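      <p>The harmonization step can be sketched as a lookup from variant labels to a canonical label. The example below is a hedged illustration: it covers only the variants named in the text, and the choice of “Austroasiatic” as the canonical label for “Mon-Khmer” is an assumption, since the text states that the two were reconciled without saying which label was retained. The function name is hypothetical.</p>
      <code language="python"><![CDATA[
```python
# Variant-to-canonical mapping; only the pairs named in the text are included.
# Keeping "Austroasiatic" as the canonical label for "Mon-Khmer" is an
# assumption made for this sketch.
CANONICAL_FAMILY = {
    "kadai": "Tai-Kadai",
    "kra-dai": "Tai-Kadai",
    "tai-kadai": "Tai-Kadai",
    "mon-khmer": "Austroasiatic",
    "austroasiatic": "Austroasiatic",
}


def normalize_family(raw_name):
    """Map a language-family label to its canonical form, or None if unknown."""
    key = " ".join(raw_name.split()).lower()
    key = key.replace("\u2013", "-")  # treat an en dash like a hyphen
    return CANONICAL_FAMILY.get(key)
```
]]></code>
      <p>Returning <italic>None</italic> for an unrecognized label, instead of guessing, leaves room for the expert consultation described above.</p>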
      <p id="_paragraph-20">Following standardization, unique identifiers were generated for each entity by combining key attributes, thereby supporting consistent referencing during the graph-building process. Data cleaning was then performed to remove duplicates, address missing values, and normalize formatting, ensuring structural integrity. Finally, the cleaned dataset was converted into structured, entity-relationship formats suitable for integration into the KG framework.</p>
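      <p>One way to derive identifiers by combining key attributes, and to remove duplicates on that basis, is sketched below. The choice of attributes (entity type, name, country), the hashing scheme, and the record field names are assumptions for illustration; the text does not specify which attributes were combined.</p>
      <code language="python"><![CDATA[
```python
import hashlib
import unicodedata


def make_entity_id(entity_type, name, country=""):
    """Build a stable identifier from normalized key attributes."""
    parts = [entity_type, name, country]
    # Normalize Unicode, whitespace, and case so that trivially different
    # spellings of the same entity yield the same identifier.
    normalized = "|".join(
        unicodedata.normalize("NFC", " ".join(p.split())).casefold() for p in parts
    )
    digest = hashlib.sha1(normalized.encode("utf-8")).hexdigest()[:12]
    return f"{entity_type.lower()}-{digest}"


def deduplicate(records):
    """Keep the first record seen for each generated identifier."""
    unique = {}
    for record in records:
        entity_id = make_entity_id(
            record["type"], record["name"], record.get("country", "")
        )
        unique.setdefault(entity_id, {**record, "id": entity_id})
    return list(unique.values())
```
]]></code>
      <p>Because the identifier depends only on normalized attributes, rerunning the pipeline on the same input would reproduce the same identifiers, which supports consistent referencing during graph construction.</p>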
      <p id="paragraph-7ee089ff20b4d2fb7ec42a5a69f7f00f">
        <bold id="bold-64275b8420951f3a3e03d7b40585623d">Knowledge Graph Construction</bold>
      </p>
      <p id="_paragraph-21">The knowledge graph was constructed by designing a semantic architecture in which nodes represented identified entities and edges encoded relationships. This structure allowed for a rich, interconnected representation of ethnographic data. Cypher query templates were developed in Neo4j to enable efficient graph generation, querying, and exploration. Illustrative examples include “Hmong” linked to “Hmoob” through the HAS_AUTONYM relationship, and “Hmong” associated with the “Hmong-Mien” language family through the SPEAKS_LANGUAGE_FROM relationship. These examples underscore the graph’s capacity to capture multilingual and multicultural dimensions in a semantically coherent manner.</p>
      <p id="_paragraph-22">Sample graph structures include:</p>
      <list list-type="bullet" id="list-0ff909277a7e048e9d309b8cb0d80967">
        <list-item>
          <p>"Hmong" <italic id="_italic-11">(Ethnic Group)</italic> → <italic id="_italic-12">HAS_AUTONYM</italic> → "Hmoob" <italic id="_italic-13">(Autonym)</italic></p>
        </list-item>
        <list-item>
          <p>"Hmong" → <italic id="_italic-14">SPEAKS_LANGUAGE_FROM</italic> → "Hmong-Mien" <italic id="_italic-15">(Language Family)</italic></p>
        </list-item>
      </list>
      <p id="_paragraph-23">These structures illustrate the multilingual and multicultural dimensions captured in the KG.</p>
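      <p>A Cypher template for edges such as the Hmong examples could be generated along the following lines. This is a sketch under stated assumptions, not the study's published templates: the node labels, the <italic>name</italic> property, and the helper function are hypothetical. The code only builds the query text and parameters and does not connect to a database.</p>
      <code language="python"><![CDATA[
```python
import re

# Labels and the `name` property are illustrative assumptions.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_merge_query(source_label, rel_type, target_label):
    """Return a parameterized Cypher MERGE statement for one edge pattern."""
    for identifier in (source_label, rel_type, target_label):
        # Labels and relationship types cannot be passed as Cypher parameters,
        # so they are validated before being interpolated into the query text.
        if not _IDENTIFIER.match(identifier):
            raise ValueError(f"unsafe identifier: {identifier!r}")
    return (
        f"MERGE (s:{source_label} {{name: $source}}) "
        f"MERGE (t:{target_label} {{name: $target}}) "
        f"MERGE (s)-[:{rel_type}]->(t)"
    )


query = build_merge_query("EthnicGroup", "HAS_AUTONYM", "Autonym")
params = {"source": "Hmong", "target": "Hmoob"}
```
]]></code>
      <p>Using MERGE rather than CREATE makes repeated loads idempotent, so rerunning the template does not duplicate nodes or edges. The node names are passed as query parameters rather than concatenated into the query text.</p>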
      <p id="paragraph-d806113507ff7f47be7d9bf00e01ccf9">
        <bold id="bold-901ce816655a44d94d0c7ebfdcc7f5b2">Visualization and Analytical Tools</bold>
      </p>
      <p id="_paragraph-24">To facilitate interactive exploration and analytical capabilities, a web application was developed, incorporating a robust visualization interface. The frontend was implemented using React.js and D3.js to enable dynamic graph visualization and user interaction, while the backend was built on Node.js and Express, integrated with a GraphQL API to interact with the Neo4j database. The system provides multiple visualization modes, including relationship graphs, geospatial mappings, and comparative dashboards with advanced filtering and search functionalities. Additionally, detailed information panels display attributes such as migration origins, religious affiliations, traditional attire, linguistic identifiers, and cultural practices. This design ensures both exploratory navigation and hypothesis-driven analysis of ethnographic structures within the GMS region.</p>
      <p id="paragraph-36fcbc24a5e82ef764967d1401ef7829">
        <bold id="bold-00365b7f1a5689b1e53961ec23a285a0">Reproducibility and Transparency</bold>
      </p>
      <p id="_paragraph-25">To promote transparency and reproducibility, all data transformation processes were fully documented. Scripts for preprocessing, data loading, and KG construction were developed in Python and organized using Jupyter Notebooks. Furthermore, the full Neo4j Cypher implementation and API endpoints were published for reuse, enabling researchers to replicate or extend the methodology in future studies.</p>
      <p id="paragraph-963178ae2967e4a80613db5c8439b6ab">
        <bold id="bold-5d79e1ccaac7a9c0e6035aa322d2fca9">Knowledge Graph Evaluation</bold>
      </p>
      <p id="_paragraph-26">The evaluation strategy adopted a dual-method approach as outlined by Choi and Jung (2025), combining extrinsic evaluation and dataset-specific assessment. Extrinsic evaluation measured the KG’s practical utility in operational settings, focusing on its ability to improve task performance, query efficiency, and interpretability. In contrast, dataset and domain-specific evaluation examined accuracy, completeness, and internal consistency within the ethnographic domain.</p>
      <p id="_paragraph-27">A usability study was conducted with 24 domain experts specializing in ethnography, digital humanities, and information science, with an average of 9.5 years of experience. Participants were recruited through purposive sampling from institutional networks across Southeast Asia. They were provided with a demonstration of the system before performing structured, task-based evaluations. Performance metrics included task completion rate, completion time, information discovery success, and visualization clarity, assessed using standardized Likert scales and time logs.</p>
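      <p>The performance metrics named above could be aggregated from per-task logs roughly as follows. The log field names and the decision to average completion time over completed tasks only are assumptions for this sketch; the study does not report its log format.</p>
      <code language="python"><![CDATA[
```python
from statistics import mean


def summarize_tasks(task_logs):
    """Aggregate per-task logs into completion rate, mean time, and mean rating."""
    if not task_logs:
        raise ValueError("task_logs must not be empty")
    completed = [log for log in task_logs if log["completed"]]
    return {
        "completion_rate": len(completed) / len(task_logs),
        # Mean time is computed over completed tasks only (an assumption).
        "mean_time_s": mean(log["seconds"] for log in completed) if completed else None,
        # Likert ratings are averaged over all attempted tasks.
        "mean_clarity": mean(log["clarity_likert"] for log in task_logs),
    }
```
]]></code>
      <p>Each log entry is expected to be a dictionary with a Boolean <italic>completed</italic> flag, a <italic>seconds</italic> value, and a <italic>clarity_likert</italic> rating.</p>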
      <p id="_paragraph-28">In addition, 17 ethnographers with expertise in linguistic anthropology, material culture, and regional studies participated in the domain-specific validation process. Using a three-dimensional rubric (accuracy, reliability, completeness), they evaluated the dataset and annotated inconsistencies or gaps. Feedback from this validation informed the consistency analysis reported in the results section.</p>
      <p id="paragraph-d48cdc929098552b6a6fc3371a61b322">
        <bold id="bold-ef67668b641ad538334d98181f970ef7">Ethical Considerations</bold>
      </p>
      <p id="_paragraph-29">Ethical integrity was central to every phase of the research. Although the project did not involve personal or identifiable human subject data, it engaged extensively with community-level ethnographic information, which carries cultural and political significance. Consequently, the study adhered to principles of data sovereignty, ensuring that cultural communities retained the right to define and contextualize their data. Local scholars actively participated in decisions regarding naming conventions and classification structures to prevent externally imposed categorizations.</p>
      <p id="_paragraph-30">Informed consent was obtained from all contributing experts involved in data provision and validation, following clear communication of research objectives and procedures. Cultural sensitivity was further embedded in the modeling process by incorporating autonyms, multilingual identifiers, and non-hierarchical classification systems. Additionally, regional experts reviewed all data to ensure accuracy, contextual appropriateness, and respect for cultural representation throughout the study.</p>
    </sec>
    <sec id="sec-3">
      <title>Results </title>
      <p id="_paragraph-31">The results of this study present the structure, content, and analytical capabilities of the Greater Mekong Subregion (GMS) Ethnic Groups Knowledge Graph (KG) and its associated web application. Through a series of visualizations, database queries, and interface demonstrations, the findings illustrate how ethnographic data for 375 ethnic groups across six countries have been semantically modeled and interactively explored. These results highlight the knowledge graph’s ability to represent complex interrelationships among cultural, linguistic, religious, and geographic dimensions. Detailed visual outputs, such as ego-network views for specific ethnic groups and geospatial mappings of ethnic distributions, demonstrate the system’s capacity to capture multidimensional identities and support domain-specific research. Additionally, technical and user-centered performance evaluations validate the knowledge graph’s practical utility and superiority over traditional database approaches in representing ethnographic complexity.</p>
      <p id="paragraph-bb61f4bb33fd819650a17ea4be883426">
        <bold id="bold-67d6b5420032e66c76739ce9ef661dc2">General Characteristics of the GMS Ethnic Groups Knowledge Graph</bold>
      </p>
      <fig id="fig2">
        <label>Figure 2</label>
        <caption>
          <title>GMS Ethnic groups knowledge graph on Neo4j</title>
          <p id="_paragraph-32"/>
        </caption>
        <graphic id="_graphic-2" mimetype="image" mime-subtype="png" xlink:href="image2.png"/>
      </fig>
      <p id="_paragraph-34">Figure 2 presents a Neo4j graph visualization depicting the interconnectedness of ethnic groups and cultural attributes in the Greater Mekong Subregion. The query returns a network of 334 nodes and 426 relationships, with entity types color-coded: ethnic groups (purple), religions (orange), regions (green), languages (light blue), and cultural attributes. Relationship types include “PRACTICES” (297), “FOUND_IN” (70), “HAS_POPULATION_CATEGORY” (22), and “MIGRATED_FROM” (14), among others. The structure highlights major ethnic groups as central hubs linked to diverse cultural, linguistic, and geographic data. This knowledge graph enables the semantic modeling of ethnographic complexity, supporting the exploration of cultural affiliations, migration patterns, and regional distributions. The implementation demonstrates how anthropological data can be effectively represented as an interconnected network for cross-domain research and analysis.</p>
      <fig id="fig3">
        <label>Figure 3</label>
        <caption>
          <title>A focus on the Karen ethnic group node and its properties panel</title>
          <p id="_paragraph-35"/>
        </caption>
        <graphic id="_graphic-3" mimetype="image" mime-subtype="png" xlink:href="image3.png"/>
      </fig>
      <p id="_paragraph-37">Figure 3 offers a focused view of the Neo4j graph visualization, highlighting the Karen ethnic group (node ID 1251) within the Greater Mekong Subregion (GMS). The properties panel displays three multilingual identifiers: “Karen” (ethnonym/exonym), “Kayin” (endonym), and “ကရင်လူမျိုး” (autonym in Burmese script). It also shows a population estimate of 5.15 million. Geographically, the group spans regions in Myanmar and Thailand, including Hpa-an, Myawaddy, and Kawkareik. Religious affiliations include Buddhism, Christianity, and Animism, reflecting syncretic belief systems. Cultural attributes feature traditional festivals (Karen New Year, Wrist Tying, Housewarming, and agricultural celebrations) and costume details such as tasseled headdresses and region-specific attire. This knowledge graph exemplifies how computational methods in digital humanities can encode complex ethnographic data, enabling relational exploration of cultural identities, geographic distribution, and linguistic diversity across the GMS.</p>
      <fig id="fig4">
        <label>Figure 4</label>
        <caption>
          <title>The home page of a GMS Ethnic Groups Knowledge Graph web application</title>
          <p id="_paragraph-38"/>
        </caption>
        <graphic id="_graphic-4" mimetype="image" mime-subtype="png" xlink:href="image4.png"/>
      </fig>
      <p id="_paragraph-40">Figure 4 displays the home page of the Greater Mekong Subregion (GMS) Ethnic Groups Knowledge Graph web application, developed using Node.js. The interface presents a responsive visualization of interconnected ethnic group data, with navigation options for "Graph View" and "Geographic View" enabling multiple representation formats. The left sidebar offers interactive filters, including search by ethnic group, country, and language family, and a temporal slider with preset years (1800–2020). A “Show Migrations” checkbox enables visualization of population movements. The central network graph uses color-coded nodes: blue (English names), yellow (autonyms), red (ethnonyms), and cyan (exonyms), with multicolored edges indicating relationship types. A status bar confirms Neo4j connectivity via localhost, and the application adheres to modern UI design principles. As a digital humanities tool, it supports exploring ethnic diversity and cultural networks across the GMS.</p>
      <fig id="fig5">
        <label>Figure 5</label>
        <caption>
          <title>The detailed visualization from the GMS Ethnic Groups Knowledge Graph application</title>
          <p id="_paragraph-41"/>
        </caption>
        <graphic id="_graphic-5" mimetype="image" mime-subtype="png" xlink:href="image5.png"/>
      </fig>
      <p id="_paragraph-43">Figure 5 displays a detailed network visualization from the GMS Ethnic Groups Knowledge Graph. It focuses on ethnocultural relationships in the Greater Mekong Subregion and has a timeline set to 2020. The dataset comprises 348 ethnic groups and 4,882 relationships, as the status bar indicates. Color-coded nodes distinguish entity types: blue (English names), yellow (autonyms), red (ethnonyms), cyan (exonyms), orange (endonyms), pink (languages), green (countries), and others representing festivals, costumes, language families, and regions. Embedded cultural data includes descriptions of traditional attire and practices (e.g., “wrapped skirts” and “long tunics”) alongside religious affiliations like Theravada and Mahayana Buddhism, and festival nodes such as Bon Kate and Tet. Geographic nodes connect to Vietnamese provinces, including An Giang, Tay Ninh, and Dong Nai. The visualization offers a rich, multidimensional view of how ethnic identities intersect with language, religion, tradition, and place, supporting deeper ethnographic analysis across the GMS.</p>
      <fig id="fig6">
        <label>Figure 6</label>
        <caption>
          <title>A focused search and visualization of the Karen ethnic group network within the GMS Ethnic Groups Knowledge Graph application</title>
          <p id="_paragraph-44"/>
        </caption>
        <graphic id="_graphic-6" mimetype="image" mime-subtype="png" xlink:href="image6.png"/>
      </fig>
      <p id="_paragraph-45">Figure 6 presents an ego-network visualization of the Karen ethnic group within the GMS Ethnic Groups Knowledge Graph, generated by querying “Karen” and selecting the central node. The Karen node (blue) is surrounded by a network of cultural, geographic, linguistic, and religious connections, with orange edges denoting key relationships. The visualization highlights the group’s geographic distribution across regions in Myanmar and Thailand, including Hpa-an, Myawaddy, Kawkareik, and others. Religious affiliations include Buddhism, Christianity, and Animism. Cultural practices are represented by the Karen New Year, Wrist Tying, Housewarming, Campfire, Farm, Boat Floating, and the unique “Festival for Collecting Human Bones.” Traditional attire is detailed through associated text nodes. Linguistically, the Karen are linked to Tibeto-Burman and Karenic language groups, with additional nodes indicating Mon origins and anthropological classification as “Mongoloid.” The filtered view displays four ethnic groups and 1,038 relationships with the timeline set to 2020, offering a comprehensive snapshot of the Karen’s ethnographic context within the Greater Mekong Subregion.</p>
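      <p>The idea of an ego-network view can be sketched as a filter over an edge list that keeps only the edges touching a chosen node. The example is a simplified, one-hop illustration and is not the application's actual query: the triple format and function name are assumptions, and the filtered view in Figure 6 evidently covers more than one hop. The sample edges use entity names and relationship types mentioned in this section.</p>
      <code language="python"><![CDATA[
```python
def ego_network(edges, center):
    """Return the edges that touch `center` (a one-hop ego network)."""
    return [edge for edge in edges if center in (edge[0], edge[2])]


# A handful of (source, relationship, target) triples for illustration only.
EDGES = [
    ("Karen", "PRACTICES", "Buddhism"),
    ("Karen", "PRACTICES", "Christianity"),
    ("Karen", "FOUND_IN", "Hpa-an"),
    ("Kinh", "PRACTICES", "Caodaism"),
]

karen_view = ego_network(EDGES, "Karen")
```
]]></code>
      <p>Here <italic>karen_view</italic> retains the three edges that start at the Karen node and drops the edge that belongs only to the Kinh node.</p>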
      <fig id="fig7">
        <label>Figure 7</label>
        <caption>
          <title><bold id="_bold-7"/>A comprehensive visualization of the Kinh ethnic group network from the GMS Ethnic Groups Knowledge Graph application</title>
          <p id="_paragraph-47"/>
        </caption>
        <graphic id="_graphic-7" mimetype="image" mime-subtype="png" xlink:href="image7.png"/>
      </fig>
      <p id="_paragraph-49">Figure 7 displays an ego-network visualization of the Kinh ethnic group from the GMS Ethnic Groups Knowledge Graph, generated by querying “Kinh” and selecting the central node. As Vietnam’s majority ethnic group, the red Kinh node connects to a wide array of cultural, religious, and geographic entities, with the visualization encompassing two ethnic groups and 997 relationships. The left panel presents structured ethnographic metadata, reflecting a Neo4j property graph format. Linguistic identifiers include "Người Kinh" (endonym), "Kinh" (autonym and ethnonym), and "Vietnamese" (exonym). Geographically, the Kinh are distributed across all provinces of Vietnam, with noted origins along the country’s length. Cultural practices include traditional attire (Ao dai), single-story housing, and religious beliefs centered on ancestor worship and local deities. Celebrated festivals include Tet, Qingming, Doan Ngo, and the New Rice Festival. Additional connections link to Caodaism, Hoa Hao Buddhism, and folk practices like reverence for Kitchen and Earth Gods. This visualization illustrates how knowledge graphs can represent the multidimensional identity of the Kinh, integrating linguistic, cultural, geographic, and religious dimensions within a unified, semantically rich structure.</p>
      <fig id="fig8">
        <label>Figure 8</label>
        <caption>
          <title>The Geographic View mode of the GMS Ethnic Groups Knowledge Graph application</title>
          <p id="_paragraph-50"/>
        </caption>
        <graphic id="_graphic-8" mimetype="image" mime-subtype="png" xlink:href="image8.png"/>
      </fig>
      <p id="_paragraph-52">Figure 8 presents the Geographic View of the GMS Ethnic Groups Knowledge Graph application, offering a spatial representation of ethnic distributions across the Greater Mekong Subregion. Countries are shown as color-coded polygons: China (pink), Myanmar (light blue), Laos (purple), Vietnam (light green), Thailand (green), and Cambodia (yellow). The interface retains the filtering and timeline tools of the network view, with the year set to 2020 and “Show Migrations” enabled. Map-specific features include overlays for rivers, historical sites, and population sizing. Key waterways such as the Mekong River are visible, highlighting geographic factors shaping settlement and cultural exchange. Markers denote ethnic groups and historical sites, while dashed lines represent migration patterns. This geospatial view complements the network visualization by emphasizing territorial distribution and geographic context. By integrating temporal and spatial filters, the dual-interface design enhances the analytical potential of the application for ethnographic research in the Greater Mekong Subregion.</p>
      <p id="paragraph-0acc45f44eb605d896b107767aa6f452">
        <bold id="bold-98d07cac62c09ea3a0fc8c1fa05a33ea">Knowledge Graphs Evaluation</bold>
      </p>
      <p id="_paragraph-53">For comparative evaluation, we defined the baseline method as a traditional relational database system containing the same ethnographic dataset in tabular form, accessed via SQL-based queries without semantic relationships or interactive visualization. This baseline system served as a control to assess usability, query efficiency, information discovery, and interpretability. While it allowed for standard data retrieval, it lacked support for complex relationship traversal, multilingual naming structures, and geotemporal filtering. This comparison enabled a systematic evaluation of the added value provided by the knowledge graph framework. Following the evaluation categories described by Choi and Jung (2025), the most appropriate evaluation method for the GMS Ethnic Groups web application combines Extrinsic Evaluation with Dataset and Domain-Specific Evaluation. Extrinsic evaluation suits the GMS Ethnic Groups Knowledge Graph because it emphasizes the practical utility of knowledge graphs in real-world scenarios. The application's interactive features, including the visualization and exploration of ethnic group relationships, geographic distributions, cultural practices, and temporal information, align with the extrinsic evaluation's focus on assessing the operational performance of knowledge graph models.</p>
      <p id="_paragraph-54">For instance, the application could measure query response time and efficiency to evaluate how effectively it processes complex queries related to ethnic relationships, especially those filtered by timeline or specific attributes. Additionally, user-focused metrics could be employed to assess the practical usability of geographic and graph visualizations, akin to evaluating the readability of community structures and node layouts. Moreover, domain-specific accuracy metrics would be essential to evaluate how accurately the system represents the intricate cultural relationships, geographic distributions, and cultural practices of different ethnic groups. The domain-specific evaluation is especially pertinent, given the application's engagement with specialized ethnographic data. As part of the Dataset and Domain-Specific Evaluation, this approach emphasizes the completeness and consistency of the knowledge graph in representing the ethnographic data within the Greater Mekong Subregion. This combined evaluation strategy ensures that the GMS Ethnic Groups Knowledge Graph is technically precise and offers significant practical benefits for researchers examining cultural relationships in the Greater Mekong Subregion.</p>
      <p id="paragraph-c398ffcc38fdf55ece2522dcd5b019b5">
        <bold id="bold-f98702b7a679744e2bef51f9cb12a159">Extrinsic Evaluation for GMS Ethnic Groups Knowledge Graph</bold>
      </p>
      <p id="_paragraph-55">The results of the extrinsic evaluation of the GMS Ethnic Groups Knowledge Graph are presented as follows.</p>
      <p id="paragraph-0888993a32c084e1b034986aaec9f3fe">
        <bold id="bold-b9e94d6861ac7b74da39497529185433">Query Response Time</bold>
      </p>
      <p id="_paragraph-56">Table 1 presents a quantitative analysis of query response times for the GMS Ethnic Groups Knowledge Graph, demonstrating significant performance improvements achieved through advanced filtering mechanisms. The baseline measurements indicate varying computational demands across different query types, with basic ethnic group information retrieval exhibiting the fastest response (245ms), followed by geographic distribution queries (412ms), complex relationship queries (678ms), and temporal migration pattern analyses requiring the longest processing time (834ms). </p>
      <table-wrap id="tbl1">
        <label>Table 1</label>
        <caption>
          <title><bold id="_bold-10"/>Query Response Time</title>
          <p id="_paragraph-58"/>
        </caption>
        <table id="_table-1">
          <tbody>
            <tr id="table-row-1e8dc93b6ba569a75339c33ee68728b6">
              <th id="a3b65909d031655bd942570105fb7778">
                <bold id="_bold-11">Query Type</bold>
              </th>
              <th id="29f91521b17b0f69be315a9495a9a862">
                <bold id="_bold-12">Average Response Time (ms)</bold>
              </th>
              <th id="3e16fd1bc2115fc98f5a992c68543fa2">
                <bold id="_bold-13">Response Time with Filtering (ms)</bold>
              </th>
              <th id="e5e546d29fac7cd8b5e1de2128542b77">
                <bold id="_bold-14">Improvement (%)</bold>
              </th>
            </tr>
            <tr id="table-row-6d521e03e3a5ba3ab393403347a68d9d">
              <td id="ab1d0b1ec691dc218d10291710c0ae91">Basic Ethnic Group Information</td>
              <td id="9ffcd2e3b1239597dcd13ea77bb0722a">245</td>
              <td id="d992f0dc9501f2ebbab6f5eab7727ba8">-</td>
              <td id="e40936a6b2a8072bd0088187c06eef93">-</td>
            </tr>
            <tr id="table-row-77aa705f1960e4992c7aa80622b4a06a">
              <td id="f613ecac482f6b21ceb845837db4e323">Complex Relationship Query</td>
              <td id="6730e8a35d6374b42b72fb10c7edad8b">678</td>
              <td id="66fb4e518ce930d413d2eb0e3562248a">321</td>
              <td id="02d1f9ae0fc98c660aadf7ac831367f0">52.7%</td>
            </tr>
            <tr id="table-row-f2a4d466756de976b109cc16ccfdfbe7">
              <td id="373b98fa9623f6c50689645e21f87070">Geographic Distribution</td>
              <td id="3c7ad2e6ea0d81aa72716cbb778841c5">412</td>
              <td id="9587862e2363794fdcbb22c66175209b">198</td>
              <td id="db574a6bc6cfa12238b6dbc447421c15">51.9%</td>
            </tr>
            <tr id="table-row-b36a33a7fb68941fbd5a18637fd9667b">
              <td id="70c6f6776d3601bb476a95f273ced7cf">Temporal Migration Patterns</td>
              <td id="3df1232679e574f935c5abab789009f7">834</td>
              <td id="20738f2ddaca527bf13daa3e94d55997">386</td>
              <td id="b67cef2be0adcfc232293da2ef2755c6">53.7%</td>
            </tr>
          </tbody>
        </table>
      </table-wrap>
      <p id="_paragraph-59"><italic id="_italic-16">Note: "Response Time with Filtering" indicates performance after adding advanced filtering mechanisms.</italic> Source: calculated by the author</p>
      <p id="_paragraph-60">Implementing advanced filtering mechanisms yielded substantial performance enhancements across all complex query types, though basic ethnic group information retrieval remained unaffected due to its already optimized structure. Complex relationship queries showed a 52.7% improvement, with response times decreasing from 678ms to 321ms. Similarly, geographic distribution queries experienced a 51.9% reduction in processing time (from 412ms to 198ms). Temporal migration pattern analyses, the most computationally intensive query type, showed the most pronounced improvement at 53.7% (from 834ms to 386ms).</p>
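      <p>The Improvement (%) column of Table 1 is consistent with a relative reduction in response time. As a worked check under that assumption (the formula is not stated explicitly in the text, and the symbols are introduced here only for illustration), let <italic>t</italic><sub>avg</sub> denote the average response time and <italic>t</italic><sub>filt</sub> the response time with filtering:</p>
      <disp-formula>
        <tex-math><![CDATA[\[ \text{Improvement} = \frac{t_{\mathrm{avg}} - t_{\mathrm{filt}}}{t_{\mathrm{avg}}} \times 100\%, \qquad \frac{678 - 321}{678} \times 100\% \approx 52.7\% \]]]></tex-math>
      </disp-formula>
      <p>The same expression reproduces the other two rows: (412 - 198)/412 gives approximately 51.9%, and (834 - 386)/834 gives approximately 53.7%.</p>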
      <p id="_paragraph-61">These results highlight the effectiveness of optimized filtering approaches in knowledge graph applications dealing with complex ethnographic data. The consistent improvement ratio of approximately 52-54% across different query types suggests that the filtering mechanisms address fundamental computational bottlenecks within the system rather than query-specific optimizations. This performance enhancement contributes significantly to the application's usability, particularly for interactive exploration scenarios where responsive user interfaces are essential for effective ethnographic research and analysis.</p>
      <p id="paragraph-1aa06f4cab7b82c2d4c86e0448ffaba5">
        <bold id="bold-6dfc05aaa0c80a102c0485de21beef49">Performance metrics between traditional databases and GMS Knowledge Graph systems</bold>
      </p>
      <p id="_paragraph-62">Table 2 presents a comparative analysis of performance metrics between traditional databases and GMS Knowledge Graph systems, evaluated by 24 domain experts. The results show statistically significant improvements for the GMS Knowledge Graph approach across all measured dimensions.</p>
      <p id="_paragraph-63">The usability evaluation involved 24 domain experts selected from academic institutions, cultural heritage agencies, and NGOs working on ethnic and linguistic diversity in the Greater Mekong Subregion. Participants included ethnographers, digital humanities researchers, and information science professionals with an average of 9.5 years of experience in their respective fields. Recruitment was conducted through targeted invitations to individuals affiliated with partner organizations in Thailand, Vietnam, Myanmar, and Laos, ensuring regional and disciplinary diversity. All participants had prior experience with traditional database tools and were introduced to the GMS Knowledge Graph application before performing the evaluation tasks. Their insights provide a grounded perspective on system usability, task efficiency, and information accessibility.</p>
      <table-wrap id="tbl2">
        <label>Table 2</label>
        <caption>
          <title><bold id="_bold-15"/>User Study Results (n=24 domain experts)</title>
          <p id="_paragraph-65"/>
        </caption>
        <table id="_table-2">
          <tbody>
            <tr id="table-row-dfc248a252a2da38af0e1e4475af595e">
              <th id="43c89fe091a3dcd367f0759eb97525bb">
                <bold id="_bold-16">Usability Metric</bold>
              </th>
              <th id="0aa548f11eaa5c5da5b36c2f793c2513">
                <bold id="_bold-17">Traditional Database</bold>
              </th>
              <th id="253631981e10c3fbfc99327466e89e3b">
                <bold id="_bold-18">GMS Knowledge Graph</bold>
              </th>
              <th id="53c43995564cf679279e1dadd20818a5">
                <bold id="_bold-19">p-value</bold>
              </th>
            </tr>
            <tr id="table-row-40c315219e7ae9e7b21a242946198872">
              <td id="1aec14526be101be2f05c56ef3eb2f2b">Task Completion Rate (%)</td>
              <td id="a51796f4e64e180697085b44b5d2856a">72.3</td>
              <td id="91aed62b886995894be95fd557110007">94.6</td>
              <td id="250e71cf6506052ce2181deacf9661f8">p&lt;0.001</td>
            </tr>
            <tr id="table-row-4e4d916614b5cf31788e99c56a1b99f7">
              <td id="73c37c6e77bcfe9b382252073593eebe">Time to Complete Task (min)</td>
              <td id="800523e803ab17b98344f31ca271f2ae">8.4</td>
              <td id="5ffac5f6eb618d814e68d1464077679a">3.2</td>
              <td id="b1639e7c8bdf32a29d55531c413e9267">p&lt;0.001</td>
            </tr>
            <tr id="table-row-53b5f20c7717f119b81aac937848bb47">
              <td id="acbe160c9462019f2b9e90fd5f4c555f">Information Discovery Score (1-10)</td>
              <td id="548771122c47e1102768befad3f76419">5.8</td>
              <td id="046f54f7c108f52a4b3d96cea2f78e71">8.7</td>
              <td id="c6bc6819a9d0d77f84b4c905fec57038">p&lt;0.001</td>
            </tr>
            <tr id="table-row-8340c2297b2f88670e02efc4c3bdc76d">
              <td id="a5abde025e0dd9a261609eabe6e617d8">Visualization Clarity (1-10)</td>
              <td id="4ac7cb4d779512f2a64429a62b817347">4.3</td>
              <td id="294449d3ac2b0584c5d0030fe061429d">8.5</td>
              <td id="1e90e05ca76e52d13acaae20633b6857">p&lt;0.001</td>
            </tr>
          </tbody>
        </table>
      </table-wrap>
      <p id="_paragraph-66">Source: calculated by the author</p>
      <p id="_paragraph-67">Specifically, task completion rates showed a marked increase from 72.3% with the traditional database to 94.6% with the GMS Knowledge Graph (p&lt;0.001). This substantial improvement of over 22 percentage points suggests that the knowledge graph structure significantly enhances users' ability to complete assigned tasks successfully. In addition, efficiency metrics were similarly favorable for the GMS Knowledge Graph, with average task completion time decreasing from 8.4 minutes to 3.2 minutes (p&lt;0.001). This represents a reduction of approximately 62% in time required, indicating considerably improved operational efficiency with the knowledge graph implementation.</p>
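      <p>The two figures quoted above follow directly from Table 2. As a brief worked check (the notation is illustrative and not taken from the source), the gain in task completion rate is an absolute difference in percentage points, whereas the time saving is a relative reduction:</p>
      <disp-formula>
        <tex-math><![CDATA[\[ 94.6 - 72.3 = 22.3 \ \text{percentage points}, \qquad \frac{8.4 - 3.2}{8.4} \times 100\% \approx 61.9\% \]]]></tex-math>
      </disp-formula>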
      <p id="_paragraph-68">Furthermore, domain experts rated information discovery capabilities substantially higher for the GMS Knowledge Graph, with mean scores increasing from 5.8 to 8.7 on a 10-point scale (p&lt;0.001). This improvement of 2.9 points suggests that the knowledge graph approach provides superior mechanisms for locating and retrieving relevant information. Finally, the most dramatic improvement was observed in visualization clarity, where ratings increased from 4.3 with the traditional database to 8.5 with the GMS Knowledge Graph (p&lt;0.001). This 4.2-point improvement indicates that the knowledge graph's representation of information significantly enhances users' ability to comprehend and interpret the visualized data. These findings provide compelling evidence that the GMS Knowledge Graph approach offers substantial advantages over traditional database systems across multiple dimensions of usability and effectiveness for domain experts.</p>
      <p id="paragraph-678117a2db9fc56d2afb49ea3602ee2f">
        <bold>Downstream Task Performance</bold>
      </p>
      <p id="_paragraph-69">Table 3 presents a comparative analysis of performance metrics between Baseline and GMS Knowledge Graph (KG) methods across four distinct downstream tasks. The results demonstrate substantial performance improvements when utilizing the GMS KG method across all evaluated dimensions.</p>
      <table-wrap id="tbl3">
        <label>Table 3</label>
        <caption>
          <title><bold id="_bold-20"/>Downstream Task Performance</title>
          <p id="_paragraph-71"/>
        </caption>
        <table id="_table-3">
          <tbody>
            <tr id="table-row-b55cc7d8e1a5390905332887aafdb0a1">
              <th id="635325ab652f5b949bb7ca90c357b209">
                <bold id="_bold-21">Task</bold>
              </th>
              <th id="71adee7f179dff81416acf59999735a8">
                <bold id="_bold-22">Baseline Method</bold>
              </th>
              <th id="161845fc6b83cc3e9d0d2db90b053aed">
                <bold id="_bold-23">GMS KG Method</bold>
              </th>
              <th id="557fffa38f9ace7ad5d59aeef434603c">
                <bold id="_bold-24">Improvement (%)</bold>
              </th>
            </tr>
            <tr id="table-row-13fbb0460b3534ed7e61eda26726a1a4">
              <td id="8e119aed1668c71f1d01a1e72a38e983">Migration Pattern Analysis</td>
              <td id="b6e872742171acfa1f82e8e54d9407e7">67.3% (F1)</td>
              <td id="38e8f8848c48ae691b39f5c094ee8366">86.2% (F1)</td>
              <td id="fefd53d930c6cb41fe7ae4d6dab5c933">28.1%</td>
            </tr>
            <tr id="table-row-3911b5f8fd383941314b92a0e6271ff3">
              <td id="13653b4e4cf4fd37b1e77b0882cab1e0">Cultural Connection Identification</td>
              <td id="e43cae8e7fa188aa16f14edb57feeb73">59.8% (Accuracy)</td>
              <td id="99757a779b0ca293ce357880a2d1ba20">82.4% (Accuracy)</td>
              <td id="23b2045d21c0045570bb51cc05d89ec1">37.8%</td>
            </tr>
            <tr id="table-row-c0e0d824853132e7348005f50390d1f5">
              <td id="8da96b1bd6902b25db2cfc833b1076ca">Ethnic Group Classification</td>
              <td id="e9f60f1c87dfb274574ed0ebcbedeee9">71.2% (Precision)</td>
              <td id="412da5a127f122684d02c1f3aed217c6">88.7% (Precision)</td>
              <td id="08191ba8ac6bf2199407173207aba0b5">24.6%</td>
            </tr>
            <tr id="table-row-6deff524bc009f780f284cae5832ec46">
              <td id="2e73d7f20bacb4b9d9d979ae25dd3d83">Geographic Origin Prediction</td>
              <td id="2187b219b9f73848ffed31c527f97f63">63.4% (Recall)</td>
              <td id="9d1eba40d84f1618f9ef14450f492491">85.9% (Recall)</td>
              <td id="c4ed83074efbc9e0ffbdb470e5942560">35.5%</td>
            </tr>
          </tbody>
        </table>
      </table-wrap>
      <p id="_paragraph-72">Source: calculated by the author</p>
      <p id="_paragraph-73">For Migration Pattern Analysis, the GMS KG method achieved an F1 score of 86.2% compared to the Baseline method's 67.3%, representing a 28.1% relative improvement. This enhancement suggests that the knowledge graph approach more effectively captures the complex relationships inherent in migration patterns. Cultural Connection Identification showed the most substantial improvement among the tasks, with accuracy increasing from 59.8% using the Baseline method to 82.4% with the GMS KG method, a 37.8% relative improvement. This pronounced enhancement indicates that the knowledge graph structure is particularly well-suited for identifying nuanced cultural associations and interconnections.</p>
      <p id="_paragraph-74">In Ethnic Group Classification, precision improved from 71.2% with the Baseline method to 88.7% with the GMS KG method, yielding a 24.6% relative improvement. This result demonstrates the knowledge graph's enhanced capability to classify ethnic groups with fewer false positives. Similarly, Geographic Origin Prediction exhibited a substantial improvement in recall, increasing from 63.4% with the Baseline method to 85.9% with the GMS KG method, a 35.5% relative improvement. This indicates that the knowledge graph approach substantially reduces false negatives in predicting geographic origins. Overall, these findings demonstrate that the GMS Knowledge Graph method consistently outperforms the Baseline method across diverse analytical tasks, with relative improvements ranging from 24.6% to 37.8%. The results suggest that the knowledge graph approach provides a more effective framework for capturing and leveraging the complex interrelationships within the data, resulting in enhanced performance across various domain-specific applications.</p>
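      <p>The Improvement (%) values in Table 3 match a relative gain over the baseline score rather than a difference in percentage points. As a worked check under that reading (the symbols <italic>s</italic><sub>KG</sub> and <italic>s</italic><sub>base</sub> are introduced here for illustration only), for Migration Pattern Analysis:</p>
      <disp-formula>
        <tex-math><![CDATA[\[ \text{Improvement} = \frac{s_{\mathrm{KG}} - s_{\mathrm{base}}}{s_{\mathrm{base}}} \times 100\%, \qquad \frac{86.2 - 67.3}{67.3} \times 100\% \approx 28.1\% \]]]></tex-math>
      </disp-formula>
      <p>The corresponding absolute gain for this task is 86.2 - 67.3 = 18.9 percentage points. Across the four tasks, the absolute gains computed from Table 3 range from 17.5 to 22.6 percentage points.</p>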
      <p id="paragraph-b928bf7494e3c54b39a13c55d8d34012">
        <bold id="bold-48318c38dd3ad275d370fcb921211949">Dataset and Domain-Specific Evaluation for GMS Ethnic Groups Knowledge Graph</bold>
      </p>
      <p id="paragraph-7fd0d0986de8b524f1f3b1e694bf1081">
        <bold id="bold-a49105d48f66247a22071c74420d1339">Data Coverage</bold>
      </p>
      <p id="_paragraph-75">Table 4 presents a comprehensive assessment of data coverage across six distinct categories relevant to the GMS Knowledge Graph. The analysis reveals varying degrees of coverage completeness, confidence levels, and specific limitations within each data domain. The coverage percentage in Table 4 was calculated by comparing the number of populated entries in each category against the total number of expected data points across all 375 ethnic groups. For example, a category with documented data for 344 of the 375 groups would have a coverage of approximately 91.7%. Confidence levels were assessed based on data provenance (e.g., whether information came from verified ethnographic studies, government datasets, or fieldwork), completeness of subfields, and cross-checking with at least two authoritative sources. “Very High” confidence was assigned when data were fully documented, consistently structured, and validated by at least two external references. “Medium” or “Medium-Low” ratings indicate partial data or reliance on a single source with potential ambiguities. These assessments were then corroborated through expert validation (as detailed in Table 5), which triangulated and confirmed reliability judgments across domains.</p>
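      <p>The coverage calculation described above can be written compactly. This is a sketch that assumes one expected entry per ethnic group in each category; the symbols <italic>n</italic><sub>c</sub> (populated entries in category <italic>c</italic>) and <italic>N</italic> (expected entries, here 375) are introduced only for illustration:</p>
      <disp-formula>
        <tex-math><![CDATA[\[ \text{Coverage}_c = \frac{n_c}{N} \times 100\%, \qquad N = 375 \]]]></tex-math>
      </disp-formula>
      <p>Under this assumption, each additional documented group changes a category's coverage by 100/375, or roughly 0.27 percentage points.</p>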
      <table-wrap id="tbl4">
        <label>Table 4</label>
        <caption>
          <title><bold id="_bold-25"/>Data Coverage Assessment Analysis</title>
          <p id="_paragraph-77"/>
        </caption>
        <table id="_table-4">
          <tbody>
            <tr id="table-row-1645e9d40eea91fc51be315321efd3b1">
              <th id="687a33701fd76df9a9a8ceb4250948b0">
                <bold id="_bold-26">Data Type</bold>
              </th>
              <th id="5ef0b9f95591edeb6edcf7dfe26fd1f6">
                <bold id="_bold-27">Coverage (%)</bold>
              </th>
              <th id="d749b9f580cac12f0c8ce93bb712ce97">
                <bold id="_bold-28">Confidence Score</bold>
              </th>
              <th id="5c04075c130017d3ad270eea6698ca87">
                <bold id="_bold-29">Notes</bold>
              </th>
            </tr>
            <tr id="table-row-fda98b8a63ae155f55a21479d51e75f3">
              <td id="811cd7d845d20c9bfde148912865c053">Ethnic Groups</td>
              <td id="33563b0f2b369f82ceccb357d67b3967">94.2%</td>
              <td id="6984db1b2bbb5b8e91dfadb0f0b236a7">High</td>
              <td id="c3b83799e3e8de901811cbfd4d564ab9">Missing some small subgroups</td>
            </tr>
            <tr id="table-row-c1270357dd55d181b6140f358338e0a2">
              <td id="3b3b2c23ccd19151d5ef0bfa3f343ba2">Geographic Regions</td>
              <td id="2e311c53ca9c86c4379f3a3623c436d1">98.7%</td>
              <td id="e783e6c41f89e5c2476733b69e13d7bf">Very High</td>
              <td id="1dfde7dcee8b85f348e48c70ed668a4e">Complete coverage of GMS region</td>
            </tr>
            <tr id="table-row-f497c6d2f52bcfa8b57666b36a5b519a">
              <td id="77e9be620690f8c2d370db207db7b063">Cultural Practices</td>
              <td id="81201e1c913219f36241f8150c55f5f6">87.3%</td>
              <td id="82fd76de22c06b0e7674e5db85dc5ce5">Medium</td>
              <td id="a0218f95d980c414b89cd898f757ec4b">Limited data on certain practices</td>
            </tr>
            <tr id="table-row-696e523fb8cbadd21dc1077b0a651cd3">
              <td id="7883fc040a96e4442e0d71a77c1013ab">Migration Patterns</td>
              <td id="e9d775e79751cfc11af27dd7f4300d15">78.6%</td>
              <td id="7e64b913b739061ec54aabc43f333709">Medium-Low</td>
              <td id="a1bcd473b8b48690c30a156a6cfb1928">Historical data gaps pre-1900</td>
            </tr>
            <tr id="table-row-4b067234314b7a8e87ae349a49e90541">
              <td id="0b88465d2ef7f6ddc2cd6b4c734d7b2d">Languages/Dialects</td>
              <td id="120ca78f865740a17707cb358ad0c812">91.8%</td>
              <td id="893b0f4631701f0cbb9e7fa2a11ae4ab">High</td>
              <td id="ed35f122b6a6d6ffdea1f227194d6797">Well-documented linguistic data</td>
            </tr>
            <tr id="table-row-c34fbdfc05b92433bc16d436724f75ad">
              <td id="e694fe413c36f746d0bcfa9884953bb0">Religious Practices</td>
              <td id="9928efad85edece5ceb2704e343ff637">89.4%</td>
              <td id="bfff546d23d7ce7c3ac448ad38005ddf">Medium-High</td>
              <td id="83ef9b1a4fd32aa86f983b2aed22f761">Some syncretic practices underrepresented</td>
            </tr>
          </tbody>
        </table>
      </table-wrap>
      <p id="_paragraph-78">Source: calculated by the author</p>
      <p id="_paragraph-79">The results indicate that Ethnic Groups have a robust coverage rate of 94.2% with a high confidence score, suggesting that the knowledge base contains comprehensive information on most ethnic populations within the scope of interest. The primary limitation noted is the absence of certain small subgroups, which, while constituting a minor portion of the overall data, may represent an area for future enhancement. Geographic Regions exhibit the strongest coverage at 98.7% with a very high confidence score, indicating near-complete representation of the GMS region. This coverage suggests that the spatial dimension of the knowledge graph is particularly well-developed and reliable for analytical purposes. Cultural Practices show a moderately high coverage of 87.3% with a medium confidence score. The note in Table 4 indicates that certain cultural practices have limited representation in the dataset, suggesting potential gaps in ethnographic documentation or data collection challenges associated with specific cultural phenomena.</p>
      <p id="_paragraph-80">In contrast, Migration Patterns demonstrate the lowest coverage among the assessed categories at 78.6% with a medium-low confidence score. The noted historical data gaps for pre-1900 periods indicate a temporal limitation in the dataset, likely reflecting the challenges of documenting historical population movements before modern record-keeping practices. On the other hand, Languages/Dialects exhibit strong coverage at 91.8% with a high confidence score, suggesting comprehensive documentation of linguistic diversity within the studied region. This relatively complete linguistic data provides valuable resources for understanding communication patterns and cultural transmission. Similarly, Religious Practices show substantial coverage at 89.4% with a medium-high confidence score. The noted underrepresentation of syncretic practices suggests a particular challenge in documenting religious expressions that blend multiple traditions, potentially indicating a systematic bias toward documenting more formalized or discrete religious systems. This heterogeneous coverage profile across different data types highlights the strengths and limitations of the current knowledge base, providing important context for researchers utilizing this resource and identifying specific areas where additional data collection efforts might be beneficial.</p>
      <p id="paragraph-43fee3f386c1425476ade2768c6c526a">
        <bold id="bold-639217889bb78b57d4c380c30f2dee2d">Domain Expert Validation of Ethnographic Data</bold>
      </p>
      <p id="_paragraph-81">Table 5 presents a comprehensive evaluation of the knowledge base by 17 ethnographers across six domains, assessing accuracy, reliability, and completeness on a 10-point scale. The results indicate consistently high ratings across all evaluation aspects, though with notable variations between domains and assessment dimensions, as follows:</p>
      <table-wrap id="tbl5">
        <label>Table 5</label>
        <caption>
          <title><bold id="_bold-30"/>Domain Expert Validation of Ethnographic Data (n=17 ethnographers)</title>
          <p id="_paragraph-83"/>
        </caption>
        <table id="_table-5">
          <tbody>
            <tr id="table-row-d6b9e64157b5211b889cb3eb83b250b2">
              <th id="be045d2f884a9da8eecd937542624e5a">
                <bold id="_bold-31">Evaluation Aspect</bold>
              </th>
              <th id="9fddceab6b576dc6a791d5680fdc413d">
                <bold id="_bold-32">Accuracy Score</bold>
                <bold id="_bold-33">(1-10)</bold>
              </th>
              <th id="5e26a630e3d820955902e73a3cba28e1">
                <bold id="_bold-34">Reliability Score</bold>
                <bold id="_bold-35">(1-10)</bold>
              </th>
              <th id="1d3e32ff98748ae02073841a289377ce">
                <bold id="_bold-36">Completeness Score</bold>
                <bold id="_bold-37">(1-10)</bold>
              </th>
            </tr>
            <tr id="table-row-b1d718c869b2f5f56d0e7ecafe65e425">
              <td id="2fb84b31a920728704cf3a0e32b4827f">Karen ethnographic data</td>
              <td id="8d00c9fcd4fd6b68a77b25b675681958">9.3</td>
              <td id="bff4a3d831d35a8dda941d69dcb3fb7b">9.1</td>
              <td id="f01c8e4304f77bc161f3b5204eb04865">8.7</td>
            </tr>
            <tr id="table-row-15410ae2b6d1adef00c53da16428117b">
              <td id="ba4db93dc539f3c0dd4ab311acf208a9">Kinh ethnographic data</td>
              <td id="e948ab51026e391a00fe25d90b1cc3ba">9.5</td>
              <td id="70bd82d1cf97a14c969419673c8e74f5">9.4</td>
              <td id="76cdd231f8b8c45a054620cb0e0d371e">9.2</td>
            </tr>
            <tr id="table-row-57c98a22f8a5ce4bbc8e8cd286e30360">
              <td id="3d9e0fd15bf84823550d99807f89b5dc">Myanmar regional data</td>
              <td id="c673907ca2f9a48cc7ef88a9ad8737a7">8.9</td>
              <td id="f57ee6f6bacc7ce03d2b3b23fa5a044c">8.7</td>
              <td id="f1308afcff34e5209be86482437bdf0d">8.4</td>
            </tr>
            <tr id="table-row-13f62e06155692605163a7743bf49d21">
              <td id="6ff04f2e66e4a90dc861903218b0a6f3">Tibeto-Burman language classification</td>
              <td id="dde9f0ae03ad161b02105fe43c89abaf">9.2</td>
              <td id="a1c5100dd536636735132f30b7b682f8">9.0</td>
              <td id="b00b95f6e654b057962eaad1d63a5927">8.8</td>
            </tr>
            <tr id="table-row-1752e903b918ee4dd55396e164bd1870">
              <td id="b21953221757844ed49fdc2d5a5fe904">Religious practice representation</td>
              <td id="b423f8e72ac4ba4ded06d413b3017cb7">8.8</td>
              <td id="9c81942f6fbc3292c0144ea571b04b95">8.5</td>
              <td id="141190ecab2066abb3e5ee6f8e504739">8.1</td>
            </tr>
            <tr id="table-row-e2969c009cd57c20e3bd29c284c10c28">
              <td id="56633e0c9c80269a3ee42fd8d7659469">Migration timeline accuracy</td>
              <td id="12f0d54953406c5ac9952aa83d178ade">8.6</td>
              <td id="1dcdc9ff23a625c56ad35cfb1994e3b5">8.2</td>
              <td id="c2edc6b4ff6cbd68caf1164783fc7d43">7.9</td>
            </tr>
          </tbody>
        </table>
      </table-wrap>
      <p id="_paragraph-84">Source: calculated by the author</p>
      <p id="_paragraph-85">Ethnographic data for the Kinh people received the highest overall ratings, with scores of 9.5 for accuracy, 9.4 for reliability, and 9.2 for completeness. These ratings suggest that the knowledge base's representation of Kinh cultural, historical, and social characteristics is particularly robust and well-documented, establishing it as the strongest domain in the evaluation. Following closely, Karen ethnographic data was similarly well-regarded, with scores of 9.3 for accuracy, 9.1 for reliability, and 8.7 for completeness. This represents the second-strongest domain in the assessment, indicating high-quality documentation and representation of Karen ethnographic information within the knowledge base. In addition, the Tibeto-Burman language classification received strong validation scores of 9.2 for accuracy, 9.0 for reliability, and 8.8 for completeness. These ratings suggest that the linguistic categorization and relationships within this language family are well-represented and correspond closely to expert understanding of these linguistic systems. Meanwhile, Myanmar regional data was rated somewhat lower but still favorably, with scores of 8.9 for accuracy, 8.7 for reliability, and 8.4 for completeness. This minor score reduction may reflect greater complexity or challenges in comprehensively representing regional characteristics.</p>
      <p id="_paragraph-86">Similarly, religious practice representation received moderate to high scores of 8.8 for accuracy, 8.5 for reliability, and 8.1 for completeness. The relatively lower completeness score suggests that certain aspects of religious practices may be underrepresented in the current knowledge base. However, the information that is included is generally accurate and reliable. Finally, migration timeline accuracy received the lowest overall ratings, with 8.6 for accuracy, 8.2 for reliability, and 7.9 for completeness. While still positive, these ratings indicate that migration data presents the greatest challenges among the evaluated domains, potentially reflecting the inherent difficulties in documenting historical population movements. Across all domains, a consistent pattern emerges wherein accuracy scores are highest, followed by reliability scores, with completeness scores consistently lower. This pattern suggests that while the available information is generally accurate and reliable, gaps in comprehensive coverage persist across all domains, with particular limitations in migration timelines and religious practice representation.</p>
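      <p>The ordering pattern described above can be checked directly against the reported scores. The following is a minimal sketch, not part of the original evaluation pipeline: the dictionary keys are labels paraphrased from the text, and the variable and function names are illustrative assumptions. The scores themselves are those reported above.</p>
      <code language="python">
```python
# Validation scores as reported in the text: (accuracy, reliability, completeness).
# The domain labels are paraphrased from the text; all names here are illustrative.
scores = {
    "Kinh ethnographic data": (9.5, 9.4, 9.2),
    "Karen ethnographic data": (9.3, 9.1, 8.7),
    "Tibeto-Burman language classification": (9.2, 9.0, 8.8),
    "Myanmar regional data": (8.9, 8.7, 8.4),
    "Religious practice representation": (8.8, 8.5, 8.1),
    "Migration timeline accuracy": (8.6, 8.2, 7.9),
}


def follows_pattern(triple):
    """True when accuracy is at least reliability, which is at least completeness."""
    accuracy, reliability, completeness = triple
    return accuracy >= reliability >= completeness


# Does every domain show accuracy, then reliability, then completeness in descending order?
pattern_holds = all(follows_pattern(t) for t in scores.values())

# Domain with the lowest completeness score.
weakest = min(scores, key=lambda name: scores[name][2])

# Gap between accuracy and completeness per domain, rounded to one decimal place.
gaps = {name: round(t[0] - t[2], 1) for name, t in scores.items()}

print(pattern_holds)
print(weakest)
print(gaps)
```
      </code>
      <p>The per-domain gap between accuracy and completeness is one simple way to see where coverage lags behind correctness; other summary measures would serve equally well.</p>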
      <p id="paragraph-4148202aa2e4ff8f0f9c63e517dac5e0">
        <bold id="bold-70a63c7eba98a8ce8f72f45923e74681">Ethnographic Consistency</bold>
      </p>
      <p id="_paragraph-87">Table 6 shows a detailed assessment of consistency within an ethnographic knowledge base across six critical dimensions, documenting the identification, resolution, and final state of data inconsistencies. Overall, the analysis demonstrates a robust framework for detecting and addressing potential contradictions within ethnographic data.</p>
      <table-wrap id="tbl6">
        <label>Table 6</label>
        <caption>
          <title><bold id="_bold-38"/>Ethnographic Consistency Analysis</title>
          <p id="_paragraph-89"/>
        </caption>
        <table id="_table-6">
          <tbody>
            <tr id="table-row-71ea826f66fbc2c3571c9174e669bcb6">
              <th id="c37d4593effd7828f056762f814fde38">
                <bold id="_bold-39">Consistency Type</bold>
              </th>
              <th id="3a4ce0cb1fd26a0b0b7704d1dcf16fa5">
                <bold id="_bold-40">Violations Detected</bold>
              </th>
              <th id="7d20abb3b2bc540737422c68e346a935">
                <bold id="_bold-41">Resolution Rate (%)</bold>
              </th>
              <th id="2b8f3d2b2c06ff6a5fa38b0c19fcb471">
                <bold id="_bold-42">Final Consistency Score (%)</bold>
              </th>
            </tr>
            <tr id="table-row-505465a64a9fce9f8686aa1dd4a59824">
              <td id="b3a497872ded99ef2d425af30749459f">Temporal</td>
              <td id="ed4e0e9320c601c6344fae08722109cd">47</td>
              <td id="b788225527d313f2f3d8c6398afe5298">95.7</td>
              <td id="642bcbc474acea495a885b5dd36782a8">99.3</td>
            </tr>
            <tr id="table-row-95a92a47c62084f0928db5afebe8d523">
              <td id="379557f515c1d2a2833442e897a33526">Geographic</td>
              <td id="c7ae20432b0636b5245deab196b25ddc">23</td>
              <td id="f9d89975cbdb5da759c71275da8d9a8a">100</td>
              <td id="813a91e956659a8182794de53cb58897">100</td>
            </tr>
            <tr id="table-row-92e1ec9ca1909811569ab13e9787074c">
              <td id="d0a669efd10ff0f9b5d106e6e81bd47a">Cultural</td>
              <td id="45b2913ba77233a43c005e65b495d17b">68</td>
              <td id="f1e67950ee0804da8976b5b7b9a2bd8d">92.6</td>
              <td id="fabf41819ae0735f45ca964104b6cff1">98.7</td>
            </tr>
            <tr id="table-row-cad6299a7d8edd8074e906438ba13045">
              <td id="58d5bee70c665ba481a706eddf8d9964">Linguistic</td>
              <td id="9464eb0884ba3c3d7d7e87a63fb7e5bd">31</td>
              <td id="fdc16613cd3fb7484861590a020f0c78">96.8</td>
              <td id="3c34a5f0d4369928dacfff29cb30d3e1">99.5</td>
            </tr>
            <tr id="table-row-c1ad6bb0aaf50e4eec57730caf4dd602">
              <td id="4051a653ac949e18997cf5a3ec827f05">Religious</td>
              <td id="892dedeb023bc1164b3b66379c23961a">42</td>
              <td id="fd13d82b1a165eb38587c6e944ade48d">95.2</td>
              <td id="0173c4354a6c1ffebb005c03b8e68429">99.2</td>
            </tr>
            <tr id="table-row-5c686e844d591e34f1bfbb3725ada32a">
              <td id="58c79d8eac5615634a8efb3bd7a5c65f">Migration</td>
              <td id="bdd9b394e9f390fef2523ca247ddf648">56</td>
              <td id="57b33119d22921ea2343d0f6ae9dfc14">91.1</td>
              <td id="0c4558613483257ef604ec4042d7dfa6">98.2</td>
            </tr>
          </tbody>
        </table>
      </table-wrap>
      <p id="_paragraph-90"><italic id="_italic-17">These consistency violations represented cases where data might have contradictory information (e.g., an ethnic group listed in incompatible geographic regions simultaneously or timeline inconsistencies). Most were resolved through domain expert consultation.</italic> Source: calculated by the author</p>
      <p id="_paragraph-91">To begin with, temporal consistency analysis revealed 47 violations, representing instances where chronological information contained contradictions or implausible sequences. With a resolution rate of 95.7%, most of these temporal inconsistencies were successfully addressed, yielding a final consistency score of 99.3%. This high resolution rate indicates effective mechanisms for reconciling chronological discrepancies. In comparison, geographic consistency examination identified 23 violations, potentially indicating cases where populations or cultural practices were erroneously associated with incompatible locations. Notably, all geographic inconsistencies were resolved, achieving a perfect 100% resolution rate and final consistency score, suggesting particularly effective protocols for addressing spatial contradictions.</p>
      <p id="_paragraph-92">On the other hand, cultural consistency analysis detected the highest number of violations at 68, likely reflecting the inherent complexity of representing cultural practices and their interconnections. These violations achieved a 92.6% resolution rate, resulting in a final consistency score of 98.7%. The slightly lower resolution rate may indicate greater challenges in resolving cultural contradictions, possibly due to cultural documentation's nuanced and sometimes subjective nature. Similarly, a linguistic consistency review identified 31 violations, with a strong resolution rate of 96.8% and a final consistency score of 99.5%. This suggests effective resolution of language classification, distribution, and relationship contradictions. Moreover, religious consistency analysis detected 42 violations, of which 95.2% were successfully resolved, yielding a final consistency score of 99.2%. This demonstrates the effective reconciliation of contradictions related to religious practices and beliefs.</p>
      <p id="_paragraph-93">However, migration consistency examination revealed 56 violations, with the lowest resolution rate among all dimensions at 91.1%, resulting in a final consistency score of 98.2%. This lower resolution rate may reflect the inherent challenges in documenting and reconciling historical population movements, particularly when dealing with limited or conflicting historical records. Crucially, domain expert consultation was instrumental in resolving these inconsistencies, highlighting the importance of specialized knowledge in addressing complex ethnographic data contradictions. Overall, the analysis demonstrates a highly effective framework for ensuring data consistency, with final consistency scores exceeding 98% across all dimensions despite the varied challenges in addressing different ethnographic inconsistencies. This KG evaluation follows the framework described in Knowledge Graph Construction: Extraction, Learning, and Evaluation (Choi &amp; Jung, 2025), adapted specifically for the ethnographic knowledge domain of the GMS Ethnic Groups Knowledge Graph application shown in Figure 2-8.</p>
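      <p>As a worked illustration of the Table 6 arithmetic, the sketch below derives the number of resolved and unresolved violations implied by each reported resolution rate. These counts are not reported in the article; they are inferred here by rounding, which is an assumption, and the function and variable names are illustrative. The final consistency score is not recomputed because its formula is not stated.</p>
      <code language="python">
```python
# (violations detected, resolution rate in %) as reported in Table 6.
table6 = {
    "Temporal": (47, 95.7),
    "Geographic": (23, 100.0),
    "Cultural": (68, 92.6),
    "Linguistic": (31, 96.8),
    "Religious": (42, 95.2),
    "Migration": (56, 91.1),
}


def implied_counts(detected, rate_percent):
    """Return (resolved, unresolved) counts implied by a rounded percentage.

    Assumption: the reported rate is resolved / detected, rounded to one decimal.
    """
    resolved = round(detected * rate_percent / 100)
    return resolved, detected - resolved


implied = {name: implied_counts(*row) for name, row in table6.items()}

# Check that the inferred counts reproduce the reported rates after rounding.
round_trip_ok = all(
    round(100 * implied[name][0] / detected, 1) == rate
    for name, (detected, rate) in table6.items()
)

total_detected = sum(detected for detected, _ in table6.values())
total_unresolved = sum(unresolved for _, unresolved in implied.values())

print(implied)
print(round_trip_ok, total_detected, total_unresolved)
```
      </code>
      <p>Under this reading, every reported rate is reproduced by a whole number of resolved cases, which suggests the percentages in Table 6 are internally consistent with the violation counts.</p>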
      <p id="_paragraph-94">Inconsistencies were identified through automated logic checks and manual inspection using Neo4j Cypher queries designed to flag conflicting properties, such as entities linked to incompatible temporal values or mutually exclusive geographic regions. To resolve these issues, we applied a rule-based resolution framework guided by authoritative ethnographic sources and regional classification standards (e.g., national census data and ethnolinguistic maps). Domain experts reviewed flagged cases during structured review sessions. Geographic conflicts (e.g., groups listed in two distant provinces without migration context) were reconciled by adding migration paths, correcting region tags, or introducing historical location variants where appropriate. Although the geographic inconsistency resolution rate is listed as 100%, this figure reflects the successful handling of all flagged cases to a satisfactory threshold rather than a guarantee of correctness. A few ambiguous or historically complex cases were annotated with contextual notes rather than modified outright, ensuring transparency in areas where authoritative resolution was impossible.</p>
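      <p>The kind of automated logic check described above can be sketched in plain Python. This is an illustrative stand-in, not the Neo4j Cypher queries used in the study: the record structure, field names, rule definitions, and the placeholder groups and provinces are all hypothetical assumptions introduced for the example.</p>
      <code language="python">
```python
from dataclasses import dataclass, field
from itertools import combinations
from typing import Optional


@dataclass
class GroupRecord:
    """Hypothetical, simplified record for one ethnic group entry."""
    name: str
    regions: set
    # Region pairs (as frozensets) that are linked by a documented migration path.
    migration_paths: set = field(default_factory=set)
    settled_from: Optional[int] = None
    settled_until: Optional[int] = None


def temporal_violation(rec):
    """Flag a record whose start year falls after its end year."""
    if rec.settled_from is None or rec.settled_until is None:
        return False
    return rec.settled_from > rec.settled_until


def geographic_violations(rec, incompatible):
    """Return region pairs marked incompatible that lack a migration path."""
    flagged = []
    for a, b in combinations(sorted(rec.regions), 2):
        pair = frozenset((a, b))
        if pair in incompatible and pair not in rec.migration_paths:
            flagged.append((a, b))
    return flagged


def flag_records(records, incompatible):
    """Map each problematic record name to the list of violated rule types."""
    report = {}
    for rec in records:
        issues = []
        if temporal_violation(rec):
            issues.append("temporal")
        if geographic_violations(rec, incompatible):
            issues.append("geographic")
        if issues:
            report[rec.name] = issues
    return report


# Placeholder data: the groups and provinces are invented for illustration only.
far_apart = frozenset(("Province X", "Province Y"))
records = [
    GroupRecord("Group A", {"Province X", "Province Y"}),
    GroupRecord("Group B", {"Province X", "Province Y"}, migration_paths={far_apart}),
    GroupRecord("Group C", {"Province X"}, settled_from=1900, settled_until=1850),
]
report = flag_records(records, incompatible={far_apart})
print(report)
```
      </code>
      <p>In this sketch, Group B is not flagged because a migration path links its two regions, mirroring the resolution strategy of adding migration context. Flagged records would still go to domain experts for review, since the rules only detect candidate conflicts.</p>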
    </sec>
    <sec id="sec-4">
      <title>Discussion </title>
      <p id="_paragraph-95">This study investigated the potential of semantic knowledge graphs (KGs), ontologies, and Knowledge Organization Systems (KOS) to ethically represent ethnographic data across the Greater Mekong Subregion (GMS). The GMS Ethnic Groups Knowledge Graph successfully integrated data on 375 ethnic groups across six countries, modeling intricate linguistic, cultural, religious, and migratory relationships. Compared to conventional relational databases, the KG demonstrated significant improvements in usability, interpretability, and analytical capabilities, as evidenced by an increase of more than 25% in task performance metrics during expert evaluations.</p>
      <p id="_paragraph-96">These results substantiate growing calls for more inclusive and context-aware knowledge infrastructures (Doyle, 2013; Rubim Silva &amp; Dal'Evedove, 2025; Wright &amp; Saelua, 2023). Unlike traditional classification systems such as the Library of Congress Subject Headings (LCSH) or the Dewey Decimal Classification (DDC), the KG presented in this study prioritizes marginalized perspectives, advancing epistemic justice through culturally responsive and community-informed design. While the project builds upon prior initiatives, including Lau ā Lau ka ʻIke (Wright &amp; Saelua, 2023) and Linked Open Data (LOD) projects in Laos and Vietnam (Chansanam et al., 2024; Ngootip et al., 2023; Ngoc &amp; Chansanam, 2023), it extends these contributions by offering a scalable, multi-country implementation that features interactive filtering, temporal migration tracking, and domain-specific validation.</p>
      <p id="_paragraph-97">Scholars have long argued that established knowledge structures often perpetuate the marginalization of non-Western epistemologies (Doyle, 2013; Tran, 2017). In response, Indigenous information theorists have advanced alternative frameworks such as community-controlled vocabularies, participatory classification systems, and relational metadata structures (Dlamini, 2024; Smith, 2021; McDonough, 2013; Tuhiwai Smith, 2012). These approaches emphasize the value of local knowledge authority, oral traditions, and cultural sovereignty in the development of knowledge infrastructures.</p>
      <p id="_paragraph-98">Aligned with this critical discourse, decolonial information science advocates for dismantling the hierarchies embedded in metadata schemas, authority control mechanisms, and archival practices (Ntshoe, 2020; Sugimoto &amp; Wijesundara, 2024). The present study contributes to this movement by employing a semantic architecture that enables fluid, multilingual, and self-determined identity modeling. This design facilitates the co-existence of plural ontologies and supports the layered representation of ethnicity, language, and cultural practices.</p>
      <p id="_paragraph-99">Despite the innovations presented, several limitations must be acknowledged. Most notably, the data coverage for historical migration patterns and syncretic religious practices remains incomplete. This limitation reflects the intrinsic difficulties in sourcing and verifying such information, as well as the gaps inherent in available historical records. Furthermore, while the platform’s visualization and interaction features are robust, the reliance on manually curated datasets poses challenges to scalability. Future enhancements should therefore consider the integration of natural language processing techniques and automated entity recognition to support expansion. Moreover, the lower completeness scores observed in specific categories underscore the necessity of deepened collaboration with local ethnographers and cultural experts.</p>
      <p id="_paragraph-100">The manual nature of the curation process, though critical for ensuring cultural sensitivity, introduces an element of subjectivity, particularly in classifying dynamic cultural phenomena such as religious practices and festivals. These representational challenges are symptomatic of broader epistemological tensions. As Doyle (2013) and Tran (2017) emphasize, structured classification systems may inadvertently essentialize fluid and hybrid identities. While the knowledge graph developed in this study is designed for flexibility and inclusivity, it still simplifies lived experiences to accommodate computational representation. Compared to initiatives such as Lau ā Lau ka ʻIke (Wright &amp; Saelua, 2023) and Taiwan’s Indigenous Infomediary project (Sung &amp; Chi, 2021), this research expands the geographic scope and technical scale while maintaining a strong commitment to participatory, context-sensitive design. It seeks to balance the authenticity of community representation with the demands of interoperability and digital infrastructure.</p>
      <p id="_paragraph-101">Nonetheless, an unresolved tension persists between the structured nature of knowledge graphs and the inherently dynamic character of cultural identity. Ethnic, linguistic, and religious affiliations often shift over time and resist fixed categorization. Although this study incorporates multilingual labels, temporal filters, and relational structures to mitigate essentialism, no digital system can fully encapsulate contested or evolving identities. Therefore, continued community engagement remains critical for ensuring the long-term integrity and ethical validity of the system.</p>
      <p id="_paragraph-102">This study offers three principal implications. First, it presents a replicable framework for the digitization and semantic modeling of ethnographic knowledge in other regions, thereby contributing to global efforts in inclusive knowledge representation. Second, it demonstrates that integrating KOS principles with knowledge graph architectures can simultaneously support computational efficiency and cultural nuance. Third, the developed application serves as a valuable tool for interdisciplinary education and research, with practical applications in digital humanities, anthropology, linguistics, and data science.</p>
      <p id="_paragraph-103">Ultimately, the findings reaffirm the study’s central proposition: that mainstream knowledge systems often fail to capture the complexity of minority identities, necessitating the development of ethically grounded alternatives (Doyle, 2013). By drawing on interdisciplinary methods and community-sourced data, this project challenges dominant paradigms and contributes to ongoing debates on equity, justice, and inclusion in digital knowledge infrastructures (Khoo et al., 2024; Siegers et al., 2023).</p>
      <p id="_paragraph-104">The GMS Ethnic Groups Knowledge Graph illustrates how semantically rich, culturally inclusive systems can advance ethical knowledge representation. It promotes equitable access, deeper cultural understanding, and sustainable heritage documentation. Looking ahead, future work should prioritize sustained collaboration with local communities to ensure that marginalized voices continue to shape and define the narratives embedded within digital infrastructures.</p>
      <p id="_paragraph-105">Future work should also elaborate on how end-users, such as policymakers, educators, or ethnic communities, might directly engage with the platform in real-world contexts. Clarifying use scenarios can help translate technical outcomes into meaningful social impact and guide future system development and adoption.</p>
    </sec>
    <sec id="sec-5">
      <title>Conclusion </title>
      <p id="_paragraph-106">This study developed and evaluated a semantic knowledge graph (KG) and accompanying web application that represents 375 ethnic groups across the Greater Mekong Subregion. By integrating culturally grounded data with semantic web technologies, the project demonstrated that knowledge graphs can model complex cultural, linguistic, religious, and geographic relationships with greater flexibility and nuance than traditional databases. Expert validation and performance metrics confirmed substantial improvements in usability, query precision, and knowledge discovery. A core contribution of this work lies in its interdisciplinary synthesis of Knowledge Organization Systems (KOS), Knowledge Management (KM), and Digital Humanities. This integration responds to growing calls for more equitable, inclusive, and context-sensitive digital infrastructures that center minority and Indigenous knowledge systems. Unlike earlier initiatives limited in geographic or technical scope, this region-wide implementation supports multilingual navigation, temporal analysis, and participatory validation.</p>
      <p id="_paragraph-107">However, some limitations remain. The manual nature of data curation poses scalability challenges, while the classification of evolving or hybrid cultural practices introduces representational ambiguities. Additionally, the dataset’s coverage of historical migration and syncretic religious practices is incomplete and requires further enrichment. Looking ahead, future developments should explore the integration of natural language processing for scalable entity extraction, incorporate user feedback for dynamic ontology refinement, and expand both linguistic and geographic coverage. Most importantly, sustained collaboration with local communities is essential to ensure authenticity, cultural appropriateness, and epistemic justice. Ultimately, this framework offers a replicable foundation for building ethical, pluralistic knowledge infrastructures in the domain of digital cultural heritage. It has the potential to inform not only academic research but also educational tools, policymaking, and community-driven documentation initiatives. This study contributes to the field of intercultural communication by offering a semantically rich, culturally grounded platform that enhances mutual understanding, models identity fluidity, and supports equitable dialogue across diverse cultural groups.</p>
      <p id="_paragraph-108"><bold id="_bold-43">Acknowledgement Statement: </bold>The authors would like to thank all participants and the reviewers for their comments, which helped bring this manuscript to completion.</p>
      <p id="_paragraph-109"><bold id="_bold-44">Conflicts of interest: </bold>The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.</p>
      <p id="_paragraph-110"><bold id="_bold-45">Authors'</bold><bold id="_bold-46"> contribution statements:</bold> Wirapong Chansanam, Lan Thi Nguyen, Chunqiu Li, and Christopher Khoo Soo Guan performed the experiment. Wirapong Chansanam and Lan Thi Nguyen wrote the manuscript with support from Chunqiu Li and Christopher Khoo Soo Guan. Wirapong Chansanam and Lan Thi Nguyen constructed the GMS ethnic groups dataset, and Wirapong Chansanam supervised the project.</p>
      <p id="_paragraph-111"><bold id="_bold-47">Funding</bold> <bold id="_bold-48">statements:</bold> This research was funded by the Faculty of Humanities and Social Sciences, Khon Kaen University, grant number HUSO-2568.</p>
      <p id="_paragraph-112"><bold id="_bold-49">Data availability statement: </bold>Data are available on request. Please contact the corresponding author for any additional information on data access or usage.</p>
      <p id="_paragraph-113"><bold id="_bold-50">Disclaimer:</bold> The views and opinions expressed in this article are those of the author(s) and contributor(s) and do not necessarily reflect JICC's or editors' official policy or position. All liability for harm done to individuals or property as a result of any ideas, methods, instructions, or products mentioned in the content is expressly disclaimed.</p>
    </sec>
  </body><back/></article>
