I'm an ML Engineer working on the voice pipeline — STT, LLM, TTS — for voice agents in African and Arabic-speaking markets. Before that, six years of research data science: building multilingual datasets and shipping LLM tools for academic teams at Yale, Oxford, Georgetown, Princeton, and the Rockwool Foundation.
Build and maintain high-integrity social science datasets from diverse, multilingual, and often unstructured sources. Design rigorous data collection strategies, implement comprehensive quality control frameworks, and manage research teams for large-scale data products.
Develop custom Python-based AI and computational tools to automate and streamline multi-step academic research workflows. This includes using Large Language Models (LLMs) and APIs to extract, categorize, and structure complex variables, reducing manual effort in coding and data preparation.
Apply and develop Machine Learning (ML) and Deep Learning models to address social and public policy questions. This includes building classification systems for tasks such as social conflict prediction and sentiment analysis, and running rigorous accuracy tests to ensure models are reliable enough for research use.