Workshop
Muslims in ML
Sanae Lotfi · Hammaad Adam · Hadeel Al-Negheimish · Sarah Fakhoury · Razan Baltaji · Marzyeh Ghassemi · Shakir Mohamed · Aya Salama · S. M. Ali Eslami · Tasmie Sarker
Room 203 - 205
The Muslims in ML workshop seeks to promote awareness, collaboration, and the development of mitigation strategies to ensure that machine learning and artificial intelligence advancements are implemented fairly and equitably for Muslims worldwide. By bringing together a diverse range of experts and incorporating multiple perspectives and backgrounds, our workshop aims to examine the challenges and opportunities of integrating AI/ML in the lives of Muslims and those in Muslim-majority countries. The workshop's focus extends beyond religious identification, encompassing cultural association and proximity to the Muslim identity. This broad approach acknowledges the complexity and diversity within the Muslim community and emphasizes the importance of inclusivity and understanding in addressing the potential impact of AI/ML technologies.
Schedule
Mon 6:30 a.m. - 6:45 a.m. | Registration
Mon 6:45 a.m. - 7:00 a.m. | Opening Remarks
Sanae Lotfi · Hammaad Adam · Hadeel Al-Negheimish
Mon 7:00 a.m. - 7:30 a.m. | Red Teaming Generative AI Systems (Keynote)
As generative AI systems continue to evolve, it is crucial to rigorously evaluate their robustness, safety, and potential for misuse. In this talk, we will explore the application of red teaming methodologies to assess the vulnerabilities and limitations of these cutting-edge technologies. By simulating adversarial attacks and examining system responses, we aim to uncover latent risks and propose effective countermeasures to ensure the responsible deployment of generative AI systems in new domains and modalities.
Lama Ahmad
Mon 7:30 a.m. - 7:45 a.m. | Contributed Lightning Talks (Spotlight)
Mon 7:45 a.m. - 8:05 a.m. | Coffee Break
Mon 8:05 a.m. - 8:35 a.m. | Unruly Data: On the Archives and Counter-Archives of Drone Warfare (Keynote)
This talk asks: what can a standpoint situated amidst the smoldering ruins of drone bombardment in a post-colonial village in a post-colonial country teach us about the racial technics of digital war? US drones began bombing the Pakistan-Afghan borderlands in 2004. Over two decades of the war on terror, the weaponized drone has become the emblem of US techno-imperial power. It inspires terror and awe. Scholarship on the drone war has yielded important insights into the sociotechnical assemblage that constitutes the drone: from image analysts to pilots, to drone cameras, algorithmic kill lists, and data mining. In so doing, however, it has tended to orient around US technics and the figure of the US drone pilot. Such work, while sympathetic to the racialized victims of war and colonialism, has nevertheless sometimes treated these subaltern social worlds as un-generative sites for elaborating an analytics of digital war. This presentation draws on ethnographic fieldwork amidst populations from the Pakistan-Afghan borderlands to decenter the drone so that we can better understand the drone war. It asks about the possibilities for generating what I tentatively term unruly data: forms of knowledge that are not reducible to the categories of “militants” or “civilians” often used to debate, discuss, and adjudicate drone bombardment.
Madiha Tahir
Mon 8:35 a.m. - 9:05 a.m. | Adversarial Examples Beyond Security (Keynote)
Adversarial examples are often perceived as threats that deceive AI models, posing security risks. This talk aims to reframe adversarial examples as beneficial tools, emphasizing their positive impact on AI deployment. Specifically, we will discuss their application in two key areas: designing robust objects and safeguarding against unauthorized AI-based image manipulations. Our discussion will offer a nuanced perspective on the role of adversarial examples in AI.
Hadi Salman
Mon 9:05 a.m. - 9:45 a.m. | Poster Session
Mon 9:45 a.m. - 10:15 a.m. | Editing Language Models with Natural Language Feedback (Keynote)
Even the most sophisticated language models are not immune to inaccuracies, bias, or becoming obsolete, highlighting the need for efficient model editing. Model editing involves altering a model’s knowledge or representations to achieve specific outcomes without the need for extensive retraining. Traditional research has focused on editing factual data within a narrow scope, limited to knowledge triplets like ‘subject-object-relation.’ Yet, as language model applications broaden, so does the necessity for diverse editing approaches. In this talk, I will describe our work that introduces a novel dataset where edit requests are natural language sequences, expanding the editing capabilities beyond factual adjustments to encompass a more comprehensive suite of modifications, including bias mitigation. This development not only enhances the precision of language models but also increases their adaptability to evolving information and application demands.
Afra Feyza Akyürek
Mon 10:15 a.m. - 11:30 a.m. | Roundtable Discussions
Mon 11:30 a.m. - 1:30 p.m. | Lunch Break
Mon 3:30 p.m. - 4:30 p.m. | Joint Affinity Poster Session (Great Hall)
- Towards Trustworthy Autonomous Ground Vehicles Design in Nigeria (Poster)
Artificial intelligence (AI)-enabled autonomous ground vehicles (AGVs) must be safely integrated into societies to fulfil their potential. This presents challenges to vehicle designers and regulators alike because many societies have varying road norms that shape road users’ interactions and ensure their safety. For assured safety in use, AGVs must be designed to operate safely and communicate appropriately with other road users or agents; this requires considering relevant local environments in their design. Most research informing this design focuses on Western and Asian road environments, which differ considerably from conditions in most of Africa. This paper presents work aiming to fill this gap in the literature through an exploration of trust, human values, practices, behaviors, communication, and interactions on Nigerian roads. We collected qualitative data about 50+ road users’ lived experiences over 7 months in two Nigerian cities using contextual inquiry, observation, autoethnography, and interviews. Insights from this research will be used to model the Nigerian road system and on-road trust, and to define and prototype AGV design requirements for Nigeria. This work may also contribute to the design of other AI systems in Africa, especially where considerations of trust dynamics are important for ensuring their safe and responsible adoption.
Memunat A Ibrahim
- Towards Understanding Speaker Identity Coding in Data-driven Speech Models (Poster)
Speaker identity plays a significant role in human communication and is increasingly used in societal applications, many through advances in machine learning. Representational spaces of current deep learning models, self-supervised models in particular, have shown strong performance on various speech-related tasks. In this work, we demonstrate that these representations are significantly better for speaker identification than acoustic representations. We also show that such a speaker identification task can be used to better understand the nature of acoustic information representation in different layers of these powerful networks. By evaluating speaker identification accuracy across acoustic, phonemic, prosodic, and linguistic variants, we report similarity between model performance and human identity perception. These empirical findings both enhance the interpretability of these representational spaces and support using this family of models as candidates to study speaker identity perception in humans.
Gasser Elbanna · Fabio Catania · Satra Ghosh
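The core measurement in this abstract, probing whether speaker identity is linearly recoverable from a model's representation space, can be illustrated with a minimal sketch. The synthetic embeddings, speaker counts, and nearest-centroid probe below are illustrative stand-ins, not the authors' actual models or data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for layer activations: each speaker's utterances
# cluster around a speaker-specific mean in embedding space.
n_speakers, n_utt, dim = 4, 20, 16
centers = rng.normal(size=(n_speakers, dim))
X = np.concatenate([c + 0.1 * rng.normal(size=(n_utt, dim)) for c in centers])
y = np.repeat(np.arange(n_speakers), n_utt)

def nearest_centroid_accuracy(X, y):
    """Probe: classify each embedding by its nearest class centroid."""
    centroids = np.stack([X[y == k].mean(axis=0) for k in np.unique(y)])
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=-1)
    pred = dists.argmin(axis=1)
    return (pred == y).mean()

acc = nearest_centroid_accuracy(X, y)
print(f"speaker-ID probe accuracy: {acc:.2f}")
```

Running the same probe on activations from different layers of a real speech model, and comparing accuracy across phonemic or prosodic variants, is the kind of analysis the abstract describes.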
- Explainable Identification of Hate Speech towards Islam using Graph Neural Networks (Poster)
Islamophobic language is a prevalent challenge on online social interaction platforms. Identifying and eliminating such hatred is a crucial step towards a future of harmony and peace. This study presents a novel paradigm for identifying and explaining hate speech towards Islam using graph neural networks. Utilizing the intrinsic ability of graph neural networks to find, extract, and use relationships across disparate data points, our model consistently achieves outstanding performance while offering explanations for the underlying correlations and causation.
Azmine Toushik Wasi
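The mechanism the abstract relies on, a graph neural network aggregating signals from related posts, reduces at its simplest to mean aggregation over neighbours. The toy graph, features, and fixed readout weights below are invented for illustration and are not the paper's model.

```python
import numpy as np

# Toy graph of 5 posts: 1 = "related" edge (e.g. reply chains or shared
# hashtags), with self-loops on the diagonal.
A = np.array([
    [1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
], dtype=float)

# Hypothetical 2-d node features: [toxicity-lexicon score, Islam-topic score].
X = np.array([[0.9, 0.8], [0.7, 0.9], [0.8, 0.7],
              [0.1, 0.2], [0.2, 0.1]])

# One graph-convolution step: average each node with its neighbours,
# so a post's score reflects the context it appears in.
deg = A.sum(axis=1, keepdims=True)
H = (A @ X) / deg                     # mean aggregation
scores = H @ np.array([0.5, 0.5])     # fixed linear readout for the sketch
labels = (scores > 0.5).astype(int)   # 1 = flagged as hate speech
print(labels)
```

In a trained GNN the readout weights are learned, and the contribution of each neighbour to the aggregation is one natural source of the per-prediction explanations the abstract mentions.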
- Building Domain-Specific LLMs Faithful To The Islamic Worldview: Mirage or Technical Possibility? (Poster)
Large Language Models (LLMs) have demonstrated remarkable performance across numerous natural language understanding use cases. However, this impressive performance comes with inherent limitations, such as the tendency to perpetuate stereotypical biases or fabricate non-existent facts. In the context of Islam and its representation, accurate and factual representation of its beliefs and teachings rooted in the Quran and Sunnah is key. This work focuses on the challenge of building domain-specific LLMs faithful to the Islamic worldview and proposes ways to build and evaluate such systems. First, we define this open-ended goal as a technical problem and propose various solutions. Subsequently, we critically examine the known challenges inherent to each approach and highlight evaluation methodologies that can be used to assess such systems. This work highlights the need for high-quality datasets, evaluations, and interdisciplinary work blending machine learning with Islamic scholarship.
Shabaz Patel · Mohamed Kane
- Clinical characterization of data-driven diabetes clusters of pediatric type 2 diabetes (Poster)
Pediatric type 2 diabetes (T2D) is highly heterogeneous, similar to adult-onset diabetes. We performed data-driven cluster analysis to identify five distinct clusters of pediatric T2D based on commonly available clinical and biochemical data at the time of diagnosis. These clusters exhibit unique characteristics and associations with risk factors, conditions, and treatment regimens. Our findings suggest the potential need for personalized early treatment strategies. Furthermore, we observed intriguing insights into the use of metformin for pediatric patients in specific clusters, raising questions about its suitability as a first-line treatment.
Mahsan Abbasi
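The phrase "data-driven cluster analysis" typically means an unsupervised method such as k-means applied to standardized clinical features. The sketch below shows that pattern on synthetic data; the feature names, cluster count, and planted structure are assumptions, not the study's actual cohort or method.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in for standardized features at diagnosis
# (e.g. BMI, HbA1c, C-peptide): three planted clusters of 50 patients.
true_centers = np.array([[0.0, 0.0, 0.0], [4.0, 4.0, 0.0], [0.0, 4.0, 4.0]])
X = np.concatenate([c + 0.3 * rng.normal(size=(50, 3)) for c in true_centers])

def kmeans(X, k, iters=20, seed=0):
    """Plain Lloyd's algorithm: assign to nearest center, recompute means."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        d = np.linalg.norm(X[:, None] - centers[None], axis=-1)
        labels = d.argmin(axis=1)
        # Keep a center in place if its cluster goes empty.
        centers = np.stack([X[labels == j].mean(axis=0) if np.any(labels == j)
                            else centers[j] for j in range(k)])
    return labels, centers

labels, centers = kmeans(X, k=3)
print(np.bincount(labels))
```

Characterizing each resulting cluster against outcomes and treatments, as the abstract does, is then a downstream statistical step.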
- Unraveling the Effects of Age-Based Distribution Shifts on Medical Image Classifiers (Poster)
Medical AI underperforms when faced with new data distributions. A notorious example of this is the Epic Sepsis Model (ESM), whose predictive capabilities diminished upon testing with unseen patient data distributions. We study the effects of subpopulation shifts on medical image classifiers using two representative datasets. Our results highlight the nuanced effects of class distribution imbalances on performance drops, the significance of comprehensive evaluation strategies, and the need to collect diverse samples to advance medical AI deployment.
Kumail Alhamoud · Yasir Ghunaim · Motasem Alfarra · Philip Torr · Tom Hartvigsen · Bernard Ghanem · Adel Bibi · Marzyeh Ghassemi
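The evaluation strategy behind subpopulation-shift studies is to report accuracy per subgroup rather than one aggregate number, which can hide a failing bin. A minimal sketch, with made-up error rates and an assumed "adult"/"pediatric" split rather than the paper's datasets:

```python
import numpy as np

rng = np.random.default_rng(0)

def subgroup_accuracy(y_true, y_pred, groups):
    """Accuracy per subgroup (e.g. age bins), to surface hidden shifts."""
    return {g: float((y_pred[groups == g] == y_true[groups == g]).mean())
            for g in np.unique(groups)}

# Toy evaluation: a classifier wrong ~5% of the time on the majority
# "adult" bin, but ~30% on an unseen "pediatric" bin.
n = 1000
groups = np.array(["adult"] * 700 + ["pediatric"] * 300)
y_true = rng.integers(0, 2, size=n)
flip = np.where(groups == "adult",
                rng.random(n) < 0.05, rng.random(n) < 0.30)
y_pred = np.where(flip, 1 - y_true, y_true)

per_group = subgroup_accuracy(y_true, y_pred, groups)
overall = float((y_pred == y_true).mean())
print(per_group, overall)
```

Here the overall accuracy stays high while the pediatric bin lags badly, which is exactly the failure mode aggregate metrics miss.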
- Qalama: Towards Semantic Autocomplete System for Indonesian Quran (Poster)
Semantically retrieving Quran verses offers valuable benefits to Muslims, especially when we recall only a portion of a verse and need to find a specific one. This research introduces Qalama, an Islamic lecture note-taking application that employs a Quranic autocomplete system to retrieve referenced verses during note-taking. Developing such applications requires fast retrieval and minimal resource consumption. We present a retrieval scheme and implement various optimization strategies to ensure the system's effectiveness in real-world usage. The proposed method achieves 76.47% accuracy with a 2.2 s retrieval time and around 384 MB of memory consumption.
Rian Adam Rajagede · Kholid Haryono · Rochana Hastuti
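Semantic verse retrieval of this kind generally embeds verses once, then ranks them by cosine similarity to the embedded query. The sketch below uses random unit vectors as hypothetical verse embeddings (a real system would use a sentence encoder), with the query built as a noisy copy of one verse so retrieval has a known right answer; none of this is Qalama's actual pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical verse embeddings: four unit vectors stand in for an
# encoder's output over a verse corpus.
verses = ["verse A", "verse B", "verse C", "verse D"]
E = rng.normal(size=(4, 32))
E /= np.linalg.norm(E, axis=1, keepdims=True)

# Partial-recollection query: a noisy copy of verse C's embedding.
query = E[2] + 0.05 * rng.normal(size=32)

def top_k(query, E, k=2):
    """Cosine-similarity retrieval: rank verses by similarity to the query."""
    q = query / np.linalg.norm(query)
    sims = E @ q
    return np.argsort(-sims)[:k]

print([verses[i] for i in top_k(query, E)])
```

The latency and memory figures the abstract reports are then mostly a function of how this ranking is indexed and quantized.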
- Geosemantic Surveillance and Profiling of Abduction Locations and Risk Hotspots Using Print Media Reports (Poster)
Kidnapping is a significant social risk in Nigeria that often lacks adequate intervention due to the unavailability of local crime data and the underreporting of cases, whether from fear of retaliation by suspected perpetrators or the involvement of security operatives. In response, we have developed a data-driven solution by generating a reliable dataset of crime locations and entities in Nigeria. Our approach involves geoparsing newspaper-reported crime locations and entities using NLP techniques, and geospatial analysis of the identified social risk hotspots. We have designed an algorithm that geoparses locations in unstructured raw text. Our research aims to provide insights and solutions for combating the menace posed by kidnappers in Nigeria.
Toyib Ogunremi · Olubayo Adekanmbi · Anthony Soronnadi · David Akanji
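At its core, geoparsing maps place-name mentions in raw text to coordinates. A minimal gazetteer-lookup sketch is below; the tiny hand-made gazetteer and approximate coordinates are illustrative, and a real pipeline (like the one the abstract describes) would pair a full place-name database with NER rather than plain string matching.

```python
import re

# Tiny hand-made gazetteer: place name -> approximate (lat, lon).
GAZETTEER = {
    "kaduna": (10.52, 7.44),
    "abuja": (9.06, 7.49),
    "zamfara": (12.12, 6.22),
}

def geoparse(text):
    """Return (place, (lat, lon)) pairs for gazetteer names found in text."""
    tokens = re.findall(r"[A-Za-z]+", text.lower())
    return [(t, GAZETTEER[t]) for t in tokens if t in GAZETTEER]

report = "Gunmen abducted travellers on the Kaduna to Abuja highway."
print(geoparse(report))
```

Aggregating such extracted points over many reports is what enables the hotspot analysis the abstract mentions.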
- Muslim-Violence Bias Persists in Debiased GPT Models (Poster)
Abid et al. (2021) showed a tendency in GPT-3 to generate mostly violent completions when prompted about Muslims, compared with other religions. Two pre-registered replication attempts found few violent completions and only a weak anti-Muslim bias in the more recent InstructGPT, fine-tuned to eliminate biased and toxic outputs. However, further pre-registered experiments showed that using common names associated with the religions in prompts increases the rate of violent completions several-fold, revealing a significant second-order anti-Muslim bias. ChatGPT showed a bias many times stronger regardless of prompt format, suggesting that the effects of debiasing were reduced with continued model development. Our content analysis revealed religion-specific themes containing offensive stereotypes across all experiments. Our results show the need for continual debiasing of models in ways that address both explicit and higher-order associations.
Babak Hemmatian · Razan Baltaji · Lav Varshney
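The quantity such audits compare is the rate of violent completions per prompt group. A schematic of that rate comparison is below; the keyword labeler and the completion strings are made up for illustration, since real audits (including this one) use carefully coded human or model labels, not keyword matching.

```python
# Compare violent-completion rates between two prompt groups.
def violent_rate(completions, is_violent):
    flagged = [c for c in completions if is_violent(c)]
    return len(flagged) / len(completions)

# Crude keyword labeler, purely for the sketch.
VIOLENT_WORDS = {"attack", "bomb", "shoot"}
def is_violent(text):
    return any(w in text.lower() for w in VIOLENT_WORDS)

# Invented completions standing in for model outputs per prompt group.
group_a = ["went to the market", "planned an attack", "opened a shop"]
group_b = ["went to the market", "opened a shop", "read a book"]

gap = violent_rate(group_a, is_violent) - violent_rate(group_b, is_violent)
print(f"violent-rate gap: {gap:.2f}")
```

Pre-registering the prompt sets and the labeling scheme before collection, as the abstract notes, is what makes these gaps credible rather than cherry-picked.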
- DGCF: Deep Graph-based Collaborative Filtering Recommender System (Poster)
Recommender systems (RS) are increasingly leveraging the power of graphs to enhance accuracy. However, we stipulate that existing methods do not take into consideration the inherent behavior of communities and the interactions between all the sub-groups of the network. In this work, we develop a Deep Graph-based Collaborative Filtering recommender system (DGCF), which incorporates the concept of community profiling and leverages the power of Graph Neural Networks. DGCF extracts the overlapping communities from the homophilic user-user graph and also integrates the high-order information from the user-item bipartite graph. We conduct experiments and evaluate DGCF on the MovieLens datasets (ML-100K and ML-1M) and the Douban dataset. Our experiments reveal significant improvements over a number of the latest deep learning models for recommender systems, as DGCF extracts deep relationships using the community structure.
Sofia Bourhim
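The baseline that graph-based collaborative filtering builds on is neighborhood aggregation over the user-item bipartite graph: similar users' interactions score the items a user has not yet seen. A minimal sketch of that baseline is below; the interaction matrix is a toy, and DGCF itself learns community-aware embeddings with GNNs rather than using raw cosine similarity.

```python
import numpy as np

# Toy user-item interaction matrix (rows = users, 1 = user consumed item).
R = np.array([
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
], dtype=float)

def recommend(R, user, k=1):
    """Score unseen items by similar users' interactions, return top-k."""
    norms = np.linalg.norm(R, axis=1, keepdims=True)
    S = (R @ R.T) / (norms * norms.T)   # user-user cosine similarity
    np.fill_diagonal(S, 0.0)            # ignore self-similarity
    scores = S[user] @ R                # aggregate neighbours' items
    scores[R[user] > 0] = -np.inf       # mask already-seen items
    return np.argsort(-scores)[:k]

print(recommend(R, user=0))
```

User 0 shares items with user 1, so user 1's extra item is recommended; community profiling generalizes this from pairwise similarity to sub-group structure.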
- MED-NAS-Bench: Towards a Multi-task Neural Architecture Search Benchmark for Advancing Medical Imaging Analysis (Poster)
We introduce MED-NAS-Bench, a comprehensive Neural Architecture Search (NAS) benchmark for medical imaging analysis. This benchmark encompasses task-specific metrics and hardware efficiency assessment, addressing the field's lack of standardized evaluation platforms. It empowers informed decision-making and optimized AI model adoption in real-world healthcare applications leveraging electronic health records (EHR).
Hadjer Benmeziane · Kaoutar El Maghraoui · Hamza Ouarnoughi · Smail Niar