Researcher + Entrepreneur (ML + Healthcare)
I develop large language models for clinical applications, particularly focused on building AI that can generate rigorous medical insights from real-world data. I combine knowledge graphs and large language models to generate hypotheses, and extract structured clinical data to evaluate them.
Industry Experience
March 2024 - Present
Mentor:
Mehraveh Salehi,
Saman Zarandioon
Training agent-based LLM systems to perform end-to-end medical research with real-world data. Designed and implemented an LLM evaluation suite to automatically evaluate the quality of LLM outputs. Designed multi-agent LLM systems to automatically identify and correct erroneous training data. Current projects focus on automatically evaluating LLM performance on medical tasks and using LLMs to systematically probe for areas of model weakness.
Sept 2022 - April 2024
Built an LLM-based assistant to provide personalized navigation of medical bills and healthcare costs. Our service reduced medical bills by 67% on average across all uses.
Summer 2022
Mentor:
Daniel Domingo-Fernandez,
David Healey,
Joe Davison
Performed systematic survey + implementation of 20+ entity linking NLP models to improve accuracy of evidence-based compound prioritization
Summer 2021
Mentor:
Minhazul Islam Sk
Designed and trained transformer-based semantic search document retrieval and ranking system to improve efficiency of customer support agents
Summer 2020
Mentor:
Anne Cocos
Built model to jointly embed free-text entity mentions with structured entity knowledge graph for 30M research articles/abstracts and KG with 5M edges. Developed end-to-end pipeline to download, preprocess, and identify high-quality entity links for biomedical entities in 30M research articles. Engineered parallel model training workflow on distributed supercomputing cluster utilizing 10,000+ CPU cores and dozens of GPUs.
Nov 2018 - Aug 2019
Created credit scoring model and interactive job density visualizations to move into new domestic markets.
Summer 2018
Mentor:
Andrew Blevins
Developed and deployed recommender system to infer biological mechanism of action and repurposing potential of 1M+ compounds
May 2016 - May 2018
Mentor:
Andy Merrill
Built and deployed models to forecast individual patient risk of chronic disease onset and long-term complex care from EHR and environmental data. Published in IEEE ICHI (2017) and AJRCCM (2018).
Education
2018 - 2023
Ph.D. in Computational Science & Engineering
2017 - 2018
M.S. in Mathematics
GPA: 4.00/4.00
2010 - 2016
B.S. in Applied & Computational Mathematics
Magna Cum Laude, University Honors
Overall GPA: 3.96/4.00
Applied and Computational Mathematics Emphasis (ACME)
Honors and Awards
2022
1st Place and People's Choice, Georgia Tech Startup Exchange Pitch Competition
Medical billing startup to identify and correct errors in patient medical bills
2018
National Science Foundation GRFP Honorable Mention
Learning to Prescribe Optimal Disease Treatment via Machine Learning
2015
Dean and Helen Robinson Scholarship
Scholarship given to outstanding undergraduates in mathematics for Putnam Mathematics competition
2016
BYU University Honors
Awarded to undergraduates who write a thesis and complete requirements in leadership, service, and cross-disciplinary scholarship.
2010-2016
BYU Heritage Scholarship
Full-tuition merit-based scholarship for incoming students
2011
Amberly Rupp "Circle of Honor" Essay Contest Award
1st place in university-wide essay contest
2010
National Merit Scholarship
Merit-based scholarship awarded to top <1% of incoming university students
Selected Publications*
The 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP). Singapore, 2023.
@inproceedings{
kartchner2023bioel,
title={A Comprehensive Evaluation of Biomedical Entity Linking Models},
author={Kartchner, David and Deng, Jennifer and Lohiya, Shubham and Kopparthi, Tejasri and Bathala, Prasanth and Domingo-Fern\'andez, Daniel and Mitchell, Cassie S},
booktitle={The 2023 Conference on Empirical Methods in Natural Language Processing},
year={2023},
url={https://openreview.net/forum?id=5Ob6DsDv2V}
}
Biology (Biology). 2023.
@article{kartchner2023cvd,
title={Literature-Based Discovery to Elucidate the Biological Links between Resistant Hypertension and COVID-19},
volume={12},
ISSN={2079-7737},
url={http://dx.doi.org/10.3390/biology12091269},
DOI={10.3390/biology12091269},
number={9},
journal={Biology},
publisher={MDPI AG},
author={Kartchner, David and McCoy, Kevin and Dubey, Janhvi and Zhang, Dongyu and Zheng, Kevin and Umrani, Rushda and Kim, James J. and Mitchell, Cassie S.},
year={2023},
month={Sep},
pages={1269}
}
22nd Workshop on Biomedical Natural Language Processing (BioNLP). Toronto, Canada, 2023.
46th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR). Taipei, Taiwan, 2023.
@inproceedings{kartchner2023biosift,
author = {Kartchner, David and Al-Hussaini, Irfan and Turner, Haydn and Deng, Jennifer and Lohiya, Shubham and Bathala, Prasanth and Mitchell, Cassie},
title = {BioSift: A Dataset for Filtering Biomedical Abstracts for Drug Repurposing and Clinical Meta-Analysis},
year = {2023},
maintitle = {SIGIR},
booktitle = {46th International ACM SIGIR Conference on Research and Development in Information Retrieval},
}
AI (AI). Online, 2022.
@article{kartchner2022rule,
title={Rule-Enhanced Active Learning for Semi-Automated Weak Supervision},
author={Kartchner, David and Nakajima An, Davi and Ren, Wendi and Zhang, Chao and Mitchell, Cassie S},
journal={AI},
volume={3},
number={1},
pages={211--228},
year={2022},
publisher={MDPI}
}
IEEE International Conference on Healthcare Informatics (ICHI). New York City, NY, USA, 2018.
@inproceedings{christensen2018machine,
title={Machine learning methods for disease prediction with claims data},
author={Christensen, Tanner and Frandsen, Abraham and Glazier, Seth and Humpherys, Jeffrey and Kartchner, David},
booktitle={2018 IEEE International Conference on Healthcare Informatics (ICHI)},
pages={467--4674},
year={2018},
organization={IEEE}
}
Benjamin D. Horne,
Elizabeth A. Joy,
Michelle G. Hofmann,
Per H. Gesteland,
John B. Cannon,
Jacob S. Lefler,
Denitza P. Blagev,
E. Kent Korgenski,
Natalie Torosyan,
Grant I. Hansen,
David Kartchner,
C. Arden Pope III
American Journal of Respiratory and Critical Care Medicine (AJRCCM). New York, NY, USA, 2018.
@article{horne2018short,
title={Short-term elevation of fine particulate matter air pollution and acute lower respiratory infection},
author={Horne, Benjamin D and Joy, Elizabeth A and Hofmann, Michelle G and Gesteland, Per H and Cannon, John B and Lefler, Jacob S and Blagev, Denitza P and Korgenski, E Kent and Torosyan, Natalie and Hansen, Grant I and others},
journal={American journal of respiratory and critical care medicine},
volume={198},
number={6},
pages={759--766},
year={2018},
publisher={American Thoracic Society}
}
*For all publications, please see my CV
Volunteer & Leadership Experience
2019 - 2022
Youth Mentor
Organized community service projects and taught leadership & life skills to youth ages 8-17
Fall 2019
English Teacher
Taught semester-long English as a second language course for immigrants to the United States
2015-2018
Volunteer Translator
Provided occasional translation services to Tagalog-speaking visitors to BYU. Translation services provided for visiting dignitaries at international Law and Religion symposium and Filipino missionaries receiving training prior to full-time service.
2017-2018
Student Alumni Relations Representative
Organized college-wide student-alumni networking dinner. Organized fundraising event for student-to-student need-based scholarship program. Met regularly with dean to discuss and address student needs.
Nov 2011 - Nov 2013
Full-time Missionary and Representative
Taught lessons in Tagalog language designed to strengthen families and communities. Organized quarterly conference and trainings for volunteers across six cities. Gathered and analyzed organizational data for regional leadership. Organized and coordinated community service projects with local leaders.
Technical Skills
Mathematics:
Matrix Analysis,
Complex Analysis,
Functional Analysis,
Numerical Linear Algebra,
Control Theory,
Probability Theory,
Parallel Computing,
Algorithm Design,
Linear & Nonlinear Optimization,
Active Learning,
Advanced Econometrics,
Abstract Algebra,
Differential Equations
Machine Learning:
Natural Language Processing (NLP),
Large Language Models (LLMs),
Knowledge Graphs,
Deep Learning,
Bayesian Statistics,
Computer Vision,
Semi-Supervised Learning,
Weak Supervision,
Information Retrieval
Packages:
PyTorch,
Pandas,
spaCy,
NLTK,
RDKit,
Hugging Face,
LangChain,
OpenAI
Programming:
Python,
R,
Stata,
Mathematica
Web:
HTML,
Web scraping,
SQL,
Cypher,
LaTeX,
Markdown,
Jekyll,
Git,
Google API suite
Visualization:
Figma,
Seaborn,
Bokeh,
Draw.io
Languages:
English (Native),
Tagalog (Professional),
Spanish (Intermediate),
German (Intermediate)