The Second Arabic NLP School
Co-located with EACL 2026
Rabat, Morocco
March 24, 2026
Organized by SIGARAB
The Second Arabic NLP School 2026 is a one-day event co-located with EACL 2026 in Rabat, Morocco, on Tuesday, March 24, 2026. The first edition, the Arabic NLP Winter School 2025, was co-located with COLING 2025 in Abu Dhabi, UAE (January 18–19, 2025) and drew 100 participants.
The 2026 edition of the School introduces a redesigned, highly practical format aimed at developing end-to-end research skills in NLP. The School will guide participants through the full research lifecycle: from conceiving an idea, formulating research questions, conducting responsible and thorough related-work reviews, designing meaningful experiments, and curating/annotating data, all the way to preparing submissions, producing camera-ready papers, and effectively presenting research at conferences. While these skills are central to any NLP research effort, the School focuses specifically on Arabic language technologies, Arab cultures, and the unique needs of emerging researchers from the Arab world.
The program is an intensive, hands-on experience combining expert panels, ethics-focused discussions, and collaborative teamwork to develop a complete, well-founded research proposal in Arabic NLP. The program seeks to empower the next generation of scholars and practitioners who will shape the future of Arabic NLP.
We anticipate a wide range of participants, including researchers, undergraduate and graduate students, PhD candidates, and industry and government practitioners with an interest in Arabic NLP.
Arabic NLP School 2026 is an application-based program offered free of charge to applicants who are accepted (thanks to our great ✨sponsors✨!). Application details are provided below.
🥇 TurathBench
Project: WarshRec: A Benchmark for Quranic Tajweed Recitation Error Detection and Correction
Team: Noureddine Khaous, Ismael, Amal El Mahraoui, Arif Hiba, Outhmane Marmouzi, and Fatmazzahra Elhoubri
Mentors: Kareem Darwish and Driss Namly
🥈 PolyArab
Project: Distinguishing Dialect from Disorder: Robust Clinical NLP Biomarkers for Arabic Speech
Team: Ghofrane Merhbene, Khalid Abid, Zakaria El-Aoufi, Nouhaila Houssa, and Zakaria Zaouak
Mentors: Hakim Hafidi and Khalil Mrini
🥉 The Sentimentals
Project: Thinking Emotionally in Arabic
Team: Aya Cherqi, Safae Bouhaddou, Fatima-zahra Aazi, Ikrame Kiyadi, Maida Aizaz, Mourad El Asmai, Nouhaila Rabii, and Rabia Rachidi
Mentors: Youness Moukafih and Haithem Afli
The School will take place at the EACL 2026 conference venue: Palais des Congrès Rabat Bouregreg (Google Maps).
8:00–9:00 — Registration
9:00–9:30 — Welcome, agenda, sponsor acknowledgment (Nizar Habash)
9:30–10:30 — Panel: Research from A to Z (Houda Bouamor, Nizar Habash, Walid Magdy, Moderator: Bashar Alhafni)
A comprehensive discussion on the full research pipeline and best practices.
Stories of failures, resilience, efficiency strategies, and pitfalls to avoid
10:30–11:00 — Coffee Break: Teams and Mentors Socialize
11:00–11:30 — Invited Talk: Arabic LLM Benchmarking (Kareem Darwish) [PDF]
Considerations for responsible research, especially in the context of Arabic NLP.
11:30–1:00 — Teamwork Session: Writing a Research Proposal
Teams complete a structured 4-page research proposal. The template focuses on research questions, related work, novelty, and feasibility.
1:00–2:00 — Lunch (Working Lunch)
2:00–3:30 — Proposal Finalization & Slide Preparation
Teams finalize proposals, exchange feedback with other teams, and prepare three presentation slides: title, research question, and plan.
3:30–4:00 — Reviewing
Each team will be judged by three reviewers (three mentors not associated with the team). 10 min per review.
4:00–4:30 — Coffee Break: Organizers identify the best 6 teams to present.
4:30–5:00 — Team Presentations
Each of the 6 top teams delivers a 5-minute presentation of their proposed research (1 representative, others on stage). Online voting by all school participants.
5:00–5:30 — Awards and Closing Session
The best teams are acknowledged. Sponsors, Mentors, and Participants are thanked.
Fuṣḥā'izi: Fusḥa wa Fuṣḥā: From dialectal wording to semi-MSA
Dialect - Team 1: Rayyan Merchant (Lead), Amir Ejmail, Mouaad Errami, Walid Kerroumi, Ismail El Bazi, Hanae Baraka
Mentors: Walid Magdy and Salam Khalifa
Natakallam: Enhancing Arabic Dialect Understanding with Lexicons and Synthetic Data
Dialect - Team 2: Hiba Bouhnin (Lead), Nathaniel Robinson, Yassine Farah, Khalil Mellouk, Yassine Bouras, Youssef Mahdoubi, Rebbah Yahya
Mentors: Salam Khalifa and Walid Magdy
PolyArab: How Well Do LLMs Really Understand Arabic Dialects? A Benchmarking Study
Dialect - Team 4: Ghofrane Merhbene (Lead), Khalid Abid, Zakaria El-Aoufi, Nouhaila Houssa, and Zakaria Zaouak.
Mentors: Hakim Hafidi and Khalil Mrini
(2) Arabic Data Resources & Benchmarking
Tibyan: Corpus-Grounded Retrieval-Augmented Generation for Reliable LLM-Based Arabic Reading Assessment
Data Resources - Team 1: Noura Ogbi (Lead), Samir Abdaljalil, Hassan Oukhouya, and Hafssa Ziyati.
Mentors: Hamdy Mubarak and Go Inoue
ArabicRAG-Eval: A Faithfulness Benchmark for Arabic Retrieval-Augmented Generation Systems
Data Resources - Team 2: Zaineb Rahmani (Lead), Abderrahmane Jouilili, Abdellah Hasnaoui, Moumni Mohammed, and Wissal Saib.
Mentors: Go Inoue and Hamdy Mubarak
TurathBench: MaktaBench: Advancing Arabic NLP through Benchmarking & Dataset Documentation
Data Resources - Team 3: Ismail El Jamiy (Lead), Hiba Arif, Larbi Boulaarab, Noureddine Khaous, Fatima Ezzahra El Houbri, Amal El Mahraoui, Majda Essaadi, and Outmane Marmouzi.
Mentors: Kareem Darwish and Driss Namly
TeamWise-QAMAR AI: QMHallucinate: A Benchmark for Detecting Hallucinations in LLM-Based Morphological Analysis of Quranic Arabic
Data Resources - Team 4: Sara Faqihi (Lead), Firdaous Ait Mohamed, Adnane El Amrani, and Omar Momen
Mentors: Driss Namly and Kareem Darwish
(3) Domain-Specific Arabic NLP
NurNLP: Improving Quranic Question Answering and Benchmarking: Towards Hallucination Detection, Complex Query Handling, and Enhanced Retrieval
Domain-Specific - Team 1: Youssef Amzoug (Lead), Hiba Benkaddour, Oumayma Elbiach, Anas Hajbi, and Sana Khayou.
Mentors: Bashar Alhafni and Ehsaneddin Asgari
DarijaMed: A Patient-Centered Assistant for Explaining French Medical Reports in Moroccan Darija
Domain-Specific - Team 2: Saad Frihi (Lead), Chaima Ben Jaafar, Khaoula Draoui, Wiam Makboul, and Abdelillah Omari.
Mentors: Ehsaneddin Asgari and Bashar Alhafni
AraFinMerge: Accelerating large language models development for financial Arabic text classification through model merging methods.
Domain-Specific - Team 3: Mohamed Khenchouch (Lead), Mohamed El Amraoui, Yassine Mountassir Idrissi, and Mohammed Lahlou
Mentors: Lamia Ben Hiba and Tamer Elsayed
Al-Bayyinah: AL-QADI: An LLM-as-a-Judge Framework for Hallucination Detection and Abstention in Arabic Islamic Content
Domain-Specific - Team 4: Amina El Ganadi (Lead), Fatima-ezzahra Darfaoui, Rokaya El Gounidi, Mouad Hakam, and Sanjeev Kumar.
Mentors: Tamer Elsayed and Lamia Ben Hiba
The Sentimentals: Challenges and Opportunities in MSA Emotion Detection: insights from large language models
Social Good - Team 1: Rabia Rachidi (Lead), Fatima-Zahra Aazi, Maida Aizaz, Aya Cherqi, Mourad El Asmai, Ikrame Kiyadi, and Nouhaila Rabii.
Mentors: Haithem Afli and Youness Moukafih
Lisan Lab: Dialect-Aware Counter-Speech Generation for Reducing Toxicity in Arabic Online Discourse.
Social Good - Team 3: Abed Qaddoumi (Lead), Chaimae Bouhatouss, Fatima Ezzahra Chourak, and Ouail Laamiri.
Mentors: Kawtar Younsi Dahbi and Wajdi Zaghouani
AraFair: A Multi-Dimensional Gender Bias Evaluation Framework for Arabic Large Language Models
Social Good - Team 4: Roaa Abdelmagid (Lead), Chaimae Abouzahir, Hammam Akrami, Rihab Atbir, Zakaria Baannou, Abdelmonim Benbouchta, Marème Diop, and Zakaria Mourid.
Mentors: Wajdi Zaghouani and Kawtar Younsi Dahbi
Research Paper Template (Make a copy and edit)
Proposal Presentation Template (Make a copy and edit)
Capacity: The school will host 120 in-person student participants. Attendance is not hybrid; all participants must be physically present.
Selection: Participants will be chosen through an application process. Mentors, drawn from respected members of the Arabic NLP community, will take part in reviewing applications.
Team-Based Research: Attendees will be organized into 16 teams of 6-8, each guided by a dedicated primary mentor and secondary mentor.
Workspace: Round-table seating will accommodate teams of nine (mentor + eight team members) to encourage active collaboration and co-creation.
The goal of the team structure is to ensure that every participant gains hands-on experience in collaborative research planning and communication.
Build deep, practical research skills among emerging scholars in Arabic NLP
Cultivate long-term mentorship relationships and professional networks
Foster collaborative projects rooted in regional linguistic and cultural needs
Equip participants with the tools to become future leaders and contributors to SIGARAB and the broader NLP field
The school is designed to be an intensive, community-driven experience that accelerates participants’ readiness to engage in high-impact Arabic NLP research.
The School’s 16 groups will be organized in four clusters:
This cluster focuses on modeling Arabic varieties (dialects, standard, and classical) and handling variation across regions, registers, and writing styles. It addresses code-switching, spelling variation, and transfer between Modern Standard Arabic (MSA) and dialects.
Dialect identification systems
Code-switching modeling (Arabic–French/English)
Morphology-aware language modeling
Dialect normalization systems
(2) Arabic Data Resources & Benchmarking
This cluster focuses on building, annotating, documenting, and evaluating Arabic datasets. It strengthens research foundations through improved benchmarks, greater reproducibility, and better experimental design.
Corpus construction and dataset documentation
Annotation guidelines and inter-annotator agreement studies
Arabic NLP benchmark creation
Reproducibility and model comparison studies
(3) Domain-Specific Arabic NLP
This cluster develops Arabic NLP systems tailored to specific domains such as healthcare, law, education, media, or multimodal applications, where terminology and context require specialized modeling.
Healthcare or clinical text modeling
Legal and policy document analysis
Educational Arabic NLP
Arabic Financial NLP
Bias and fairness evaluation in Arabic LLMs
Cultural representation analysis
Ethical data collection frameworks
Hallucination and misinformation detection
NLP tools for social good and public services
Nizar Habash, CAMeL Lab, New York University Abu Dhabi
Houda Bouamor, Carnegie Mellon University in Qatar
The School is supported by a team of 20 mentors.
🌟 Haithem Afli
🌟 Bashar Alhafni
🌟 Si Lhoussain Aouragh
🌟 Ehsan Asgari
🌟 Lamia Benhiba
🌟 Houda Bouamor
🌟 Kawtar Younsi Dahbi
🌟 Kareem Darwish
🌟 Mo El-Haj
🌟 Tamer Elsayed
🌟 Nizar Habash
🌟 Hakim Hafidi
🌟 Go Inoue
🌟 Salam Khalifa
🌟 Walid Magdy
🌟 Youness Moukafih
🌟 Khalil Mrini
🌟 Hamdy Mubarak
🌟 Driss Namly
🌟 Wajdi Zaghouani
🚀🚀🚀 The application form for The Arabic NLP School is provided 🌟🌟🌟here🌟🌟🌟.
Important Dates
Application Deadline: January 31, 2026
Acceptance Notification: February 9, 2026
Arabic NLP School: March 24, 2026
We received over 300 applications and selected 120 students to take part in the Arabic NLP School.
Interested in sponsoring the Arabic NLP School?
Check out our sponsorship levels and details.