Hello, I'm

Noah Darwich

ML Engineer & Research Data Scientist

I'm an ML Engineer working on the voice pipeline — STT, LLM, TTS — for voice agents in African and Arabic-speaking markets. Before that, six years of research data science: building multilingual datasets and shipping LLM tools for academic teams at Yale, Oxford, Georgetown, Princeton, and the Rockwool Foundation.

What I Do

Data Engineering & Quality Governance

Build and maintain high-integrity social science datasets from diverse, multilingual, and often unstructured sources. Design rigorous data collection strategies, implement comprehensive quality control frameworks, and manage research teams for large-scale data products.

AI/LLM-Powered Research Automation

Develop custom Python-based AI and computational tools to automate and streamline multi-step academic research workflows. These tools use Large Language Models (LLMs) and APIs to extract, categorize, and structure complex variables, significantly reducing manual effort in coding and data preparation.

Machine Learning & Predictive Modeling

Apply and develop robust Machine Learning (ML) and Deep Learning models to address critical social and public policy questions. This includes building classification systems for tasks such as social conflict prediction and sentiment analysis, and performing rigorous accuracy testing to ensure model reliability for research applications.

Institutions I've worked with