Introduction
Despite decades of efforts to broaden participation in science, technology, engineering, and mathematics (STEM), Black professionals remain significantly underrepresented across these fields (Fry et al., 2025). As of 2021, Black individuals comprised just 8% of the U.S. STEM workforce, which is significantly lower than their 11% representation in the overall workforce (National Center for Science and Engineering Statistics (NCSES), 2024). This gap extends beyond a simple numerical difference and reflects broader structural inequities within STEM fields (National Academies of Sciences, Engineering, and Medicine, 2023; National Science Foundation, 2023; Pew Research Center, 2021; Suran, 2021). It is sustained by systemic barriers, including inequitable access to funding, hiring discrimination, and exclusion from leadership and decision-making roles (Mays et al., 2023). These inequities also shape the narratives that define who belongs in STEM and whose contributions are recognized and remembered (E. O. McGee, 2020).
A central driver of these inequities is the dominance of Eurocentric epistemologies, which shape how knowledge is produced, validated, and disseminated in STEM (Henderson, 2024; Kayumova & Strom, 2023; E. O. McGee, 2020). STEM is often positioned as culturally neutral and universally objective, but its norms, values, and knowledge structures have historically been defined by Western perspectives (Henderson, 2024; Kayumova & Strom, 2023). These epistemologies prioritize white ways of knowing and treat scientific knowledge as detached from lived experience (Henderson, 2024; Kayumova & Strom, 2023). In practice, they shape whose expertise is validated, whose work is cited, and whose intellectual contributions are incorporated into the body of scientific knowledge (Almeida, 2015; Henderson, 2024). Through this process, Black scientific contributions have often been excluded from mainstream narratives, creating a false impression of absence where, in fact, there is a long-standing presence of innovation and excellence (Graves et al., 2022; E. O. McGee, 2020).
Eurocentrism also manifests through linguistic structures that govern STEM communication (Henderson, 2024). Language plays a powerful role in shaping the social and epistemic boundaries of STEM, influencing whose knowledge is legitimized and whose voices are marginalized (Kayumova & Dou, 2022; E. McGee, 2020). Dominant narratives, often rooted in Eurocentric traditions, frame scientific knowledge as objective and universal while marginalizing the voices, experiences, and innovations of Black scientists and technologists (Kwachou, 2025; Udah, 2024). These linguistic structures influence publication norms, institutional culture, and classroom hierarchies, reinforcing a cycle of exclusion (Loui & Fiala, 2024; Martinez-Acosta & Favero, 2018).
Challenging these inequities requires approaches that not only critique Eurocentrism but also offer alternative conceptualizations of science, technology, and futurity. Afrofuturism, positioned at the intersection of African diasporic identity, imagination, and technology, provides a powerful framework for disrupting these patterns. Afrofuturism invites us to reimagine futures where Black lives, creativity, and knowledge systems are central to innovation (Gipson, 2019). As both a cultural and political movement, Afrofuturism challenges traditional ideas of progress and imagines new paths toward justice, liberation, and innovation in science (Gipson, 2019).
Within this context, Natural Language Processing (NLP), a branch of artificial intelligence (AI) focused on analyzing and interpreting human language, offers a promising toolkit for uncovering patterns, narratives, and themes within Black STEM discourse. However, mainstream NLP methods often reproduce racialized biases because they are trained on corpora that marginalize or misrepresent Black speech, identity, and culture (Field et al., 2021; Mire et al., 2025). Applying NLP without a critical, culturally grounded lens risks perpetuating the very inequities it aims to study (Z. Liu, 2023). These risks become especially evident when NLP systems interact with Black linguistic and cultural expressions (Field et al., 2021; Mire et al., 2025).
For example, many pre-trained language models, including GPT, BERT, and RoBERTa, are trained on corpora that predominantly reflect English-language usage grounded in white, Western, male-dominated norms (Gallegos et al., 2024; Nemani et al., 2023). As a result, they absorb and reproduce the biases present in those corpora, from stereotyping Black names and vernaculars to underrepresenting or misinterpreting African American Vernacular English (AAVE) and other diasporic linguistic forms.
This underrepresentation means that Black digital discourse can often be flagged as “ungrammatical,” “aggressive,” or “incoherent” by automated systems, leading to disproportionate content moderation or outright silencing (Blodgett et al., 2020). Furthermore, the deployment of AI in surveillance and predictive policing has disproportionately impacted Black communities, extending algorithmic harm beyond the virtual realm into everyday life (NAACP, 2024). Language policing online, such as flagging or deplatforming culturally specific Black speech, raises significant ethical concerns about freedom of expression, cultural erasure, digital safety, consent, privacy, and the political misuse of data collected (Chin-Rothmann & Lee, 2022; NAACP, 2024; Tortora, 2024).
The intersection of NLP, Afrofuturism, and Black representation in STEM highlights a critical gap in existing AI research. While large language models (LLMs) can generate, summarize, and analyze vast text corpora, they often fail to capture the richness, nuance, and cultural grounding of Black STEM narratives (Z. Liu, 2023). These models are not trained to understand or amplify stories of Black innovation, nor are they designed to ask how systemic racism shapes the way Black people engage with science, technology, or the future.
This article is a theoretical and methodological contribution that advances the integration of Afrofuturism into computational research on Black STEM discourse. It does not present new empirical findings or propose a formalized framework. Instead, it offers a theoretically grounded approach for rethinking how NLP can be applied in ways that center Black epistemologies, cultural context, and ethical responsibility. By bridging Afrofuturism and computational methods, this article contributes to emerging conversations in AI ethics, critical data studies, and STEM equity by demonstrating how researchers can design more culturally responsive and justice-oriented analytical practices. This article makes three key contributions: (1) it conceptualizes Afrofuturism as a critical lens for computational research, (2) it identifies methodological considerations for applying NLP to Black STEM discourse, and (3) it examines ethical and practical implications for AI development, policy, and education.
Theoretical Foundations: Afrofuturism and Black Representation in STEM
Afrofuturism, a cultural aesthetic and theoretical framework, combines African diasporic histories with imagined futures driven by technology, science, and liberation (Dery, 1994; Womack, 2013). Emerging in the mid-20th century, it provided Black communities with a way to envision themselves beyond the constraints of systemic oppression (Dery, 1994; Womack, 2013). Pioneers like Sun Ra used music and cosmic imagery to reclaim Black existence in space and time, asserting an alternative Black ontology that defied Eurocentric narratives (Womack, 2013). Similarly, author Octavia Butler used science fiction to question power, gender, race, and human evolution, writing Black futures into genres that often ignored them (Womack, 2013). Collectively, these works position Black creativity and resilience at the center of technological and societal futures.
Although Afrofuturism imagines expansive Black futures, it is grounded in a long tradition of Black excellence in science and technology (Yaszek, 2006). This excellence has often been minimized or erased in mainstream narratives (Yaszek, 2006). However, this erasure is not accidental. Structural racism within academia and industry continues to dismiss and devalue Black excellence in STEM (Hatfield et al., 2022; E. O. McGee, 2020). Research shows persistent disparities in grant funding, citation rates, editorial inclusion, and review times for Black scholars (Ginther et al., 2011; Heidt, 2023; F. Liu et al., 2023; Taffe & Gilpin, 2021). These inequities reinforce the false perception that Black contributions to STEM are rare rather than chronically underacknowledged. Moreover, Eurocentric framings of science as culturally neutral further marginalize Black scientists by disconnecting scientific progress from their lived experiences and labor (E. O. McGee, 2020).
Given this pattern of erasure, Afrofuturism serves as a transformative framework for reimagining equity in STEM (E. O. McGee et al., 2025). Rather than promoting simple inclusion into existing systems, Afrofuturism encourages a radical rethinking of STEM education, research, and innovation (E. O. McGee et al., 2025). It invites educators and technologists to center African diasporic knowledge systems, cultural relevance, and principles of justice and community when conceptualizing the future of STEM fields (Eshun, 2003; King et al., 2023).
In the context of AI and data science, Afrofuturism challenges the myth of algorithmic neutrality by exposing how models encode dominant power structures (Benjamin, 2019; Noble, 2018). Predictive algorithms and NLP systems can distort or marginalize Black speech and identity when trained on biased or incomplete datasets (Hanna et al., 2020; Noble, 2018). Drawing on Afrofuturist principles in computational research emphasizes the development of systems rooted in Black linguistic, cultural, and epistemic traditions (Klassen et al., 2024). This includes designing algorithms that uplift and protect rather than extract or misrepresent Black communities (Klassen et al., 2024).
By integrating Afrofuturist thought with computational methods, researchers can develop new pathways for advocacy, analysis, and accountability in STEM fields (Johns & Howard, 2022). This approach treats the narratives and aspirations of Black professionals as central to reimagining the future of science and technology (Egede et al., 2024). Moreover, Afrofuturism emphasizes the importance of centering Black voices as innovators of technological futures (Hill-Jarrett, 2023). It advocates for equity not only in access but in authorship, demanding that Black scientists, engineers, and futurists lead the conversations about innovation, ethics, and the future of our world (Hill-Jarrett, 2023). As we move deeper into an AI-driven world, Afrofuturism offers a critical perspective for rethinking how computational systems are designed, interpreted, and evaluated, particularly in relation to historically marginalized communities (E. McGee, 2025).
Methodological Possibilities: NLP and AI in Social Justice Research
NLP is a subfield of AI that focuses on enabling computers to understand, interpret, and generate human language (Khurana et al., 2022). By combining computational linguistics with machine learning, NLP enables large-scale analysis of text data, making it extremely useful for examining digital communication, social trends, and cultural expression (Khurana et al., 2022).
In recent years, NLP has increasingly been used to investigate social and political discourse, particularly within movements focused on equity and justice. Researchers have utilized NLP to analyze narratives from Black Lives Matter (BLM), #MeToo, and other online activism campaigns, identifying how language is used to fight oppression and build solidarity (Alfonzo, 2021; Bisgin et al., 2022; Harb et al., 2020; Nguyen et al., 2023; Schneider & Carpenter, 2019; Ujah et al., 2023). Tools such as topic modeling, sentiment analysis, named entity recognition (NER), and discourse analysis have enabled scholars to identify patterns in how individuals express emotions, use language, and build shared identities online (Hankar et al., 2025; Li et al., 2024; Nandwani & Verma, 2021).
Within racial justice research, NLP has also been used to detect biased or racially coded language in media (Castillo-Campos et al., 2025), uncover patterns of hate speech and misinformation on social platforms (Davidson et al., 2017), and examine how racialized identities are represented and policed through language. These applications demonstrate NLP’s potential as a tool for documenting inequality, advocating for structural change, and amplifying historically silenced voices. However, they also raise important questions about how computational tools should be designed, interpreted, and ethically applied when engaging with Black discourse.
Applying Afrofuturist Principles to NLP in Black STEM Discourse
To explore how Black professionals navigate, critique, and reimagine STEM through discourse, this paper examines how Afrofuturist principles can inform the application of NLP to Black STEM narratives. This section outlines key considerations that shape how researchers might select data, apply computational methods, and interpret findings in culturally grounded ways. These considerations, such as diverse data sources, culturally aware NLP techniques, and ethical, community-driven research practices, are presented as interrelated domains. Together, they illustrate how Afrofuturist principles can shape computational analysis of Black STEM discourse.
Data Sources for Studying Black STEM Narratives
To meaningfully analyze Black STEM discourse, it is important to draw from diverse, culturally rich, and community-grounded sources. Social media platforms serve as real-time records of Black thought, resistance, and innovation (Maragh-Lloyd, 2024). Hashtags such as #BlackInSTEM, #Afrofuturism, and #BlackAndSTEM on X (formerly Twitter), Reddit threads like r/BlackTechnology, and TikTok video content provide insight into everyday experiences, emerging trends, and the formation of collective identity among Black STEM professionals and enthusiasts. These platforms also provide a space where Black individuals navigate issues of institutional racism, representation, and joy within scientific and technological spaces.
Beyond social media, Black STEM media outlets, such as Data for Black Lives and AfroTech, as well as speculative fiction blogs centered on Black science fiction authors, offer curated perspectives on systemic challenges, innovation, and futurism. These sources can blend technical discourse with cultural and political critique, making them well-suited for computational analysis informed by Afrofuturist theory.
Finally, academic publications authored by Black scholars working at the intersection of STEM, education, race, and equity represent a critically important source of discourse. These works often grapple with structural exclusion and propose frameworks for inclusive innovation, offering a theoretical foundation for computational inquiry. Including peer-reviewed literature in the dataset can help elevate marginalized scholarly voices within mainstream research and ground computational analysis in community-rooted knowledge.
NLP Techniques for Studying Black STEM Discourse
Analyzing Black STEM discourse may benefit from a combination of NLP techniques that can capture both content (what is being said) and context (how and why it is being said). Each method offers a different way of understanding the emotional, cultural, and structural dimensions of Black engagement in STEM.
Topic modeling refers to a class of algorithms that automatically identify groups of words that frequently occur together, which can be interpreted as “topics” within a collection of documents (Abdelrazek et al., 2022). Techniques such as Latent Dirichlet Allocation (LDA), Non-negative Matrix Factorization (NMF), Top2Vec, and BERTopic can help uncover these latent themes without needing predefined categories (Egger & Yu, 2022). When applied to Black STEM narratives, topic modeling can identify recurring discussions, such as institutional racism, educational inequity, cultural affirmation, or mentorship. These models can be used to examine how topics evolve and how conversations differ across platforms.
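To make the idea of latent themes concrete, the following is a minimal, dependency-free sketch of NMF using multiplicative updates on a small term-count matrix. The posts, the stopword list, and all function names are hypothetical and invented for illustration; they are not drawn from any real dataset. In practice, researchers would more likely rely on an established library implementation of LDA, NMF, or BERTopic.

```python
import random
from collections import Counter

# Hypothetical posts, invented for illustration only (not real data).
POSTS = [
    "mentorship program helped me find a mentor in my lab",
    "my mentor and the mentorship network kept me in engineering",
    "grant funding review was slow and funding was denied again",
    "funding gaps and grant review delays hurt our research",
]
STOPWORDS = {"a", "and", "in", "me", "my", "the", "was", "our", "again"}


def build_matrix(posts):
    """Return (vocab, matrix) where matrix[d][t] counts term t in post d."""
    tokenized = [[w for w in p.split() if w not in STOPWORDS] for p in posts]
    vocab = sorted({w for doc in tokenized for w in doc})
    matrix = []
    for doc in tokenized:
        counts = Counter(doc)
        matrix.append([float(counts[w]) for w in vocab])
    return vocab, matrix


def matmul(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(m):
    return [list(r) for r in zip(*m)]


def frobenius_error(v, w, h):
    """Squared reconstruction error between V and the product W H."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, k, iterations=200, seed=0, eps=1e-9):
    """Factor V (posts x terms) into W (posts x k) and H (k x terms)."""
    rng = random.Random(seed)
    n_docs, n_terms = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n_docs)]
    h = [[rng.random() + 0.1 for _ in range(n_terms)] for _ in range(k)]
    for _ in range(iterations):
        # Update H <- H * (W^T V) / (W^T W H), element-wise.
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps)
              for j in range(n_terms)] for i in range(k)]
        # Update W <- W * (V H^T) / (W H H^T), element-wise.
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps)
              for j in range(k)] for i in range(n_docs)]
    return w, h


def top_terms(h, vocab, n=3):
    """List the n highest-weighted terms for each topic row of H."""
    return [[vocab[j] for j in sorted(range(len(vocab)), key=lambda j: -row[j])[:n]]
            for row in h]
```

Calling build_matrix(POSTS), then nmf(matrix, k=2), then top_terms(h, vocab) yields one short word list per topic. Interpreting those lists as themes remains a human, culturally informed judgment rather than an output of the algorithm.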
Sentiment analysis is the computational process of determining whether a piece of text expresses a positive, negative, or neutral emotional tone (Jim et al., 2024). Tools like the Valence Aware Dictionary and sEntiment Reasoner (VADER) or transformer-based models like RoBERTa can be used to assess the emotional tone of posts and publications. This technique can be used to examine how Black individuals emotionally respond to moments of exclusion, celebration, resistance, or innovation in STEM environments.
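As a rough illustration of how lexicon-based sentiment scoring works, the sketch below averages word-level valence scores and flips the sign of a word that directly follows a negation. It is a simplified stand-in and not VADER itself: the lexicon entries, their weights, the threshold, and the function names are all hypothetical assumptions chosen for readability.

```python
import re

# Toy lexicon with hypothetical valence weights; real tools such as VADER
# ship much larger, empirically rated lexicons.
LEXICON = {
    "proud": 2.0, "celebrate": 2.0, "supported": 1.5, "thriving": 2.0,
    "isolated": -2.0, "ignored": -1.5, "dismissed": -2.0, "exhausted": -1.5,
}
NEGATIONS = {"not", "never", "no"}


def sentiment_score(text, lexicon=LEXICON):
    """Average the valence of lexicon words, flipping sign after a negation."""
    tokens = re.findall(r"[a-z']+", text.lower())
    scores = []
    for i, token in enumerate(tokens):
        if token in lexicon:
            value = lexicon[token]
            if i > 0 and tokens[i - 1] in NEGATIONS:
                value = -value
            scores.append(value)
    return sum(scores) / len(scores) if scores else 0.0


def label(score, threshold=0.5):
    """Map a numeric score to a coarse positive/negative/neutral label."""
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"
```

Because the lexicon is passed in as a parameter, a research team could substitute a community-reviewed word list rather than relying on defaults that may not reflect how a term is used within Black digital discourse.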
Named Entity Recognition (NER) is a technique for automatically detecting and classifying proper nouns in text, such as names of people, organizations, places, or events (Perera et al., 2020). In Black STEM discourse, NER can highlight frequently mentioned figures (e.g., Katherine Johnson, Mark Dean), institutions (e.g., Historically Black Colleges and Universities (HBCUs)), and initiatives (e.g., Black Girls Code). This can help map community focal points and provide insight into which individuals or entities are perceived as influential or symbolic within the conversation.
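The mapping of community focal points described above can be sketched with a simple gazetteer lookup, which matches text against a fixed list of known names. This is an assumption-laden simplification: statistical or neural NER models generalize to unseen names, whereas this sketch only finds entries in the list. The gazetteer uses the examples named in the text; the sample posts, labels, and function names are hypothetical.

```python
import re
from collections import Counter

# Surface form -> (canonical name, entity type). Built from the examples in
# the text; a real study would need a far larger, community-reviewed list.
GAZETTEER = {
    "Katherine Johnson": ("Katherine Johnson", "PERSON"),
    "Mark Dean": ("Mark Dean", "PERSON"),
    "HBCUs": ("HBCU", "INSTITUTION"),
    "HBCU": ("HBCU", "INSTITUTION"),
    "Black Girls Code": ("Black Girls Code", "INITIATIVE"),
}

# Longest names first so that "HBCUs" is tried before "HBCU".
_PATTERN = re.compile(
    r"\b(?:"
    + "|".join(re.escape(s) for s in sorted(GAZETTEER, key=len, reverse=True))
    + r")\b"
)

# Hypothetical posts, invented for illustration only.
POSTS = [
    "Katherine Johnson inspired me to study math at an HBCU.",
    "Shoutout to Black Girls Code and to Mark Dean!",
    "HBCUs shaped so many engineers, and Katherine Johnson is proof.",
]


def extract_entities(text):
    """Return (canonical name, type) for every gazetteer match in the text."""
    return [GAZETTEER[m.group(0)] for m in _PATTERN.finditer(text)]


def mention_counts(posts):
    """Count how often each entity is mentioned across a list of posts."""
    counts = Counter()
    for post in posts:
        counts.update(extract_entities(post))
    return counts
```

Running mention_counts(POSTS) returns a tally per entity, which could then be ranked to see which figures and institutions recur most often in a given corpus.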
Discourse analysis in NLP, particularly when powered by word embeddings (e.g., Word2Vec, GloVe) or contextual language models (e.g., GPT, BERT), examines how language constructs meaning across sentences and contexts (Joty et al., 2019). Unlike sentiment analysis, which focuses on tone, discourse analysis looks at how concepts are framed (Joty et al., 2019). This includes how terms like “representation,” “equity,” or “Afrofuturism” are used differently across posts or over time. These models can be used to examine how Black technologists and scholars discuss justice, innovation, and visibility, as well as how dominant narratives are challenged or reimagined through discourse.
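One way to approximate how a term is framed differently across sources is to compare the words that surround it. The sketch below builds a co-occurrence vector for a target term in each corpus and compares the two with cosine similarity. It is a deliberately simple stand-in for learned embeddings such as Word2Vec or BERT, and the window size and function names are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def context_vector(posts, target, window=2):
    """Count words appearing within `window` tokens of each target occurrence."""
    vec = Counter()
    for post in posts:
        tokens = re.findall(r"[a-z]+", post.lower())
        for i, token in enumerate(tokens):
            if token == target:
                lo, hi = max(0, i - window), i + window + 1
                vec.update(
                    t for j, t in enumerate(tokens[lo:hi], start=lo) if j != i
                )
    return vec


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0
```

A similarity near 1 suggests that the term appears in similar surroundings in both corpora, while a value near 0 suggests that it is framed through largely different vocabulary. Count-based vectors of this kind ignore word order and polysemy, which is one reason contextual models are often preferred for fine-grained framing analysis.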
Ethical Considerations and AI Bias
While computational analysis can be helpful, it also raises serious ethical questions, particularly when working with the digital expressions of marginalized communities. NLP tools are not neutral; they are shaped by the biases of the datasets and developers behind them (Ferrara, 2023). Many models underperform when analyzing AAVE, Caribbean dialects, or culturally specific terminology, leading to misclassification or erasure (Dunlap & McCoy, 2026; Mire et al., 2025). When analyzing Black STEM discourse, it is critical to evaluate and correct for bias in pre-trained models or consider fine-tuning on culturally relevant data.
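Evaluating a model for bias can begin with a disaggregated error audit: comparing how often a classifier wrongly flags benign text across language varieties. The sketch below computes per-group false positive rates and the gap between them. The records are synthetic placeholders and do not represent real measurements; the group names, record format, and function names are hypothetical.

```python
from collections import defaultdict

# Synthetic placeholder records: (language_variety, gold_label, predicted_label),
# where 1 means "flagged as offensive". These are NOT real measurements.
RECORDS = [
    ("variety_a", 0, 0), ("variety_a", 0, 0), ("variety_a", 0, 1), ("variety_a", 1, 1),
    ("variety_b", 0, 1), ("variety_b", 0, 1), ("variety_b", 0, 0), ("variety_b", 1, 1),
]


def false_positive_rates(records):
    """Share of benign (gold == 0) texts wrongly flagged, per language variety."""
    false_positives = defaultdict(int)
    negatives = defaultdict(int)
    for group, gold, predicted in records:
        if gold == 0:
            negatives[group] += 1
            false_positives[group] += int(predicted == 1)
    return {group: false_positives[group] / negatives[group] for group in negatives}


def max_gap(rates):
    """Largest difference in false positive rate between any two groups."""
    values = list(rates.values())
    return max(values) - min(values)
```

A large gap would indicate that benign posts in one variety are flagged more often than in another, which is the kind of disparity that would motivate fine-tuning on culturally relevant data or reconsidering the model altogether. Such an audit depends on gold labels assigned by annotators who are fluent in the varieties being evaluated.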
Moreover, ethical research on Black communities demands more than technical rigor. It requires intentional community engagement. This means designing research that includes Black STEM professionals as collaborators, not just as subjects of study. Participatory approaches can help ensure that the questions posed, tools used, and interpretations made are rooted in the values and lived experiences of those represented in the data.
Finally, data privacy and informed consent are non-negotiable. Even when data is publicly available online, Black digital expression often exists within vulnerable contexts and should not be mined carelessly. Researchers must avoid extractive practices and work to contextualize and protect the narratives they analyze, ensuring that representation does not become exploitation.
Taken together, these considerations demonstrate how Afrofuturist principles can inform the design, application, and interpretation of NLP in the study of Black STEM discourse. This perspective seeks to amplify, protect, and learn from Black voices rather than merely study them. Furthermore, it aligns with the broader goals of Afrofuturism by positioning Black professionals as agents of technological imagination and transformation.
Implications for STEM Policy, AI Ethics, and Education
The integration of Afrofuturist principles with NLP highlights transformative possibilities for shaping more equitable STEM systems. As STEM disciplines increasingly influence public policy, global innovation, and everyday life, insights gained from analyzing Black STEM discourse through an Afrofuturist lens can inform discussions across three key domains: policy, AI ethics, and STEM education.
Informing STEM Equity Policies
NLP has the potential to contribute to equity-focused STEM policy discussions. By analyzing discourse across social media, academic writing, and professional networks, computational methods can surface the concerns, barriers, and aspirations voiced by Black individuals in STEM. These insights may inform diversity and inclusion initiatives by identifying patterns of exclusion and areas where institutional support is most needed.
For policymakers and institutional leaders, such findings can provide a data-driven foundation for developing more responsive programs. For example, analysis of #BlackInSTEM posts may highlight underrepresentation in specific STEM fields or recurring experiences of isolation in academic departments. These patterns could inform recruitment strategies, mentorship programs, and efforts to evaluate bias in admissions, hiring, and promotion practices.
Prior research analyzing social media discourse within movements such as Black Lives Matter has demonstrated how computational methods can surface patterns of exclusion, collective identity, and calls for institutional change (Chang et al., 2022; Gallagher et al., 2018). Applying similar approaches to #BlackInSTEM discourse could give universities and funding agencies an evidence base for designing targeted interventions, such as departmental climate assessments or pipeline development and recruitment initiatives in the STEM fields where Black representation is lowest.
Building Afrocentric AI Models for STEM Inclusion
The push for inclusive STEM must also extend to the technologies we use to shape its future. Current AI models are rarely built with Black cultural knowledge or ethical concerns in mind (Hanna et al., 2020; Noble, 2018). There is an urgent need to develop Black-centered AI ethics frameworks, ones that prioritize cultural specificity, data justice, and community empowerment. These frameworks would not only critique existing systems but also actively reimagine what equitable AI could look like.
Afrofuturism offers a valuable foundation for this reimagining. It invites us to ask: What would AI look like if it were built by and for Black communities? How might algorithms function differently if they were trained on values of communal care, liberation, and ancestral knowledge? By incorporating Afrofuturist principles into the development of AI models, we can challenge existing biases in STEM hiring, funding distribution, and educational access. These processes could be transformed into tools of justice rather than tools that perpetuate inequality.
For example, recent work in algorithmic fairness has shown that AI systems can both reflect and detect patterns of bias in language, including forms of gendered and exclusionary phrasing that may shape opportunities in domains such as hiring (Yan et al., 2020; Zafar et al., 2017). Extending this approach, Afrofuturist-informed NLP models could be trained to detect racially coded or culturally exclusionary language in STEM hiring materials, grant calls, or promotion criteria. In practice, this could involve auditing institutional documents to identify patterns that may disadvantage Black applicants and recommending more inclusive language aligned with equity goals. These models could also suggest culturally affirming content in STEM classrooms and promote visibility for underrepresented scholars in academic publishing platforms.
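The document audit described above could start from something as simple as a reviewed phrase list before any model is trained. The sketch below scans a document line by line and reports flagged phrases alongside a suggested alternative. The watch list entries and suggestions are hypothetical placeholders and are not a validated lexicon; which phrases belong on such a list is a judgment that should be made with Black STEM professionals.

```python
import re

# Hypothetical watch list: phrase -> suggested alternative. These entries are
# illustrative placeholders, not a validated or community-reviewed lexicon.
WATCH_LIST = {
    "culture fit": "describe the specific skills or values required",
    "native english speaker": "state the required level of English proficiency",
}


def audit_document(text, watch_list=WATCH_LIST):
    """Return (line_number, phrase, suggestion) for each flagged phrase."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        lowered = line.lower()
        for phrase, suggestion in watch_list.items():
            if re.search(r"\b" + re.escape(phrase) + r"\b", lowered):
                findings.append((line_number, phrase, suggestion))
    return findings
```

Exact phrase matching cannot detect subtle or context-dependent coding, so this sketch is best read as a first-pass screening aid whose output still requires human review.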
Future Directions and Community-Centered Implementation
Advancing this work requires sustained interdisciplinary collaboration. AI researchers must work alongside Black scholars, educators, ethicists, and policymakers to ensure that computational tools serve real-world equity goals. These collaborations should be ongoing, reciprocal, and grounded in shared decision-making, not just one-time consultations or tokenized partnerships. Together, these groups can design research agendas, build ethical infrastructures, and co-create technologies that are responsive to the needs of Black communities in STEM.
Additionally, the field must invest in publicly available, ethical datasets that center Black STEM discourse. Currently, much of the digital narrative around Black experiences in STEM remains under-collected or isolated behind privately owned digital platforms or websites. Open-access datasets that are community-reviewed and ethically curated would make research more accessible, allowing more scholars, especially those from underrepresented backgrounds, to engage in this work. These datasets would also allow for longitudinal analyses of how discourse, representation, and policy evolve.
For example, educators could integrate Afrofuturist-informed datasets into data science or AI courses, allowing students to analyze Black STEM discourse using NLP tools. This might include analyzing hashtags such as #BlackInSTEM or curated datasets from Black-led organizations to explore themes of representation, identity, and innovation. Such approaches build technical skills and encourage students to critically engage with questions of bias, equity, and cultural context in AI systems.
Furthermore, the field must advocate for community-centered NLP approaches that prioritize data ownership, linguistic inclusivity, and cultural specificity. Drawing on Afrofuturist principles encourages researchers to recognize Black digital discourse as a site of creativity, resistance, and knowledge production. This includes building models that are designed in collaboration with Black communities, researchers, and technologists to ensure ethical alignment and relevance. These practices shift computational research from extractive approaches to participatory knowledge production.
For instance, initiatives such as the Algorithmic Justice League and Data for Black Lives demonstrate how community-centered data practices can support ethical and justice-oriented research (Algorithmic Justice League, n.d.; Data for Black Lives, n.d.; Hanna et al., 2020). A similar approach in NLP research could involve co-developing datasets with Black STEM professionals, where participants have input on how their narratives are collected, interpreted, and used. This model shifts data collection from extraction to collaboration and ensures that computational analyses remain accountable to the communities they represent.
Ultimately, the integration of NLP into social justice research is not just a technical opportunity. It is a political responsibility. If developed thoughtfully, it can help dismantle structural barriers and amplify historically silenced voices. In this way, NLP becomes more than a research method. It becomes a tool for challenging dominant narratives and shaping more equitable futures in STEM and beyond, where Black expression is essential to how we understand language, data, and power.
Conclusion
Black voices have consistently shaped science and technology through innovation, resistance, and vision (Lemelson Center for the Study of Invention and Innovation, 2021). However, they remain systematically marginalized in both representation and recognition (Lemelson Center for the Study of Invention and Innovation, 2021). This paper argues that Afrofuturism is not merely a critical lens but an essential perspective for reimagining STEM equity. Rather than advocating for inclusion within existing and unjust structures, Afrofuturism pushes for the transformation of those structures altogether by redefining who creates, controls, and benefits from technological progress.
Through this lens, language becomes a powerful site of both oppression and possibility. As this paper outlines, NLP techniques such as topic modeling, sentiment analysis, NER, and discourse analysis can be leveraged to amplify Black narratives, map cultural knowledge, and reframe innovation in the context of justice. However, such tools must be utilized with care. The risks of algorithmic bias, linguistic erasure, and extractive research practices are real and well-documented.
Ethical, community-driven applications of NLP demand transparency, cultural responsiveness, and deep engagement with Black scholars, practitioners, and communities. To build equitable futures in STEM, we must move beyond representation as a goal and toward liberation as a foundation. This requires urging AI researchers and data scientists to integrate Afrofuturist principles into model design, training, and evaluation. Moreover, it requires fostering interdisciplinary collaboration between technologists, educators, Black scholars, and community leaders to co-create inclusive approaches to AI and STEM. These partnerships are essential not only for technical innovation but for ensuring that technology serves the communities it claims to uplift.
This work builds on existing scholarship in AI ethics, algorithmic bias, and STEM equity by extending these conversations through an Afrofuturist lens (e.g., Buolamwini, 2023; Hanna et al., 2020; E. O. McGee, 2020; Noble, 2018). While prior research has documented bias in computational systems and inequities in STEM, this article contributes a theoretical and methodological approach for rethinking how these systems are designed and applied when analyzing Black discourse. Rather than treating culture as a limitation, this perspective positions Black linguistic, cultural, and epistemic practices as central to computational inquiry.
For research, this approach can be applied in future work by informing decisions about data selection, the application of NLP techniques, and the interpretation of findings. For example, scholars can incorporate culturally grounded datasets, critically evaluate how NLP models perform across linguistic contexts, and engage in community-centered research design. In doing so, this work provides a foundation for developing computational methods that are technically robust and ethically and culturally responsible. By applying critical, culturally grounded approaches to NLP and AI, we can move beyond diversity metrics and toward transformative change. Together, these approaches move us from analyzing futures to actively shaping them.
