

Hadas Baumer is a Senior AI Scientist at Intuit, where she focuses on building real-world applications by grounding Large Language Models (LLMs) in unique, use-case-specific data. Her approach is rooted in her academic background, an M.Sc. in Neuroscience from the Weizmann Institute of Science, and has been applied to complex challenges such as developing predictive safety models for the autonomous vehicle industry.
Her work is fueled by a deep-seated curiosity. For Hadas, brain sciences are an endless source of inspiration, providing novel perspectives and powerful analogies that she actively applies to pioneer new solutions in artificial intelligence.
Hadas Baumer
Senior AI Scientist @ Intuit
Languages: English, Hebrew

Location: Tel-Aviv, Israel

Can also give an online talk/webinar
Paid only. Contact speaker for pricing!
MY TALKS
AI Sanity Checks: A Neuroscientist's Guide to Unit Testing LLMs
Data / AI / ML



You just finished developing a cool new AI feature, and everybody’s excited to ship it to production. But… how can we know our LLM is doing what we think it should do? Testing a few examples looks fine, but is it enough?
We wouldn't merge a pull request without passing unit tests, so why are we deploying LLMs based on our gut feeling?
Before I was a data scientist, I was a neuroscientist, and I learned the hard way that a successful experiment comes only after a thoughtful, proactive search for faults and possible failures. I strongly feel that this mindset should be adopted in the tech world, and even more so as we advance into the new AI era, which constantly surfaces new challenges.
In this talk, I will share with you a practical framework for "unit testing" your LLM. I'll show how to move from vague business goals to a concrete set of evaluation probes by deconstructing any task into its fundamental capabilities. You'll learn how to derive targeted scenarios that test these capabilities, ensuring your validation set has maximum coverage and relevance, so you’ll be able to deploy your LLMs with confidence.
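To make the idea concrete, here is a minimal sketch of what such capability-based "unit tests" might look like. Everything in it is illustrative: `call_llm` is a hypothetical stand-in for a real model call (stubbed here so the example runs on its own), and the task, capabilities, and checks are invented for the example.

```python
# A minimal sketch of capability-based "unit tests" for an LLM feature.
# `call_llm` is a hypothetical model call, stubbed with canned answers
# so this example is self-contained and runnable.

def call_llm(prompt: str) -> str:
    canned = {
        "Extract the total from: 'Invoice total: $42.50'": "$42.50",
        "Summarize in one sentence: 'The cat sat. The cat slept.'":
            "The cat sat and slept.",
    }
    return canned.get(prompt, "")

# Each probe targets ONE fundamental capability of the task,
# paired with a concrete scenario and a pass/fail check.
PROBES = [
    # (capability, prompt, check)
    ("extraction",
     "Extract the total from: 'Invoice total: $42.50'",
     lambda out: "42.50" in out),
    ("brevity",
     "Summarize in one sentence: 'The cat sat. The cat slept.'",
     lambda out: out.count(".") <= 1 and len(out.split()) < 15),
]

def run_probes() -> dict:
    """Run every probe and report pass/fail per capability."""
    return {cap: check(call_llm(prompt)) for cap, prompt, check in PROBES}

if __name__ == "__main__":
    print(run_probes())
```

The point of the structure is that each failing entry names the capability that broke, not just "the model got worse", so the probe set doubles as a regression suite you can run before every deployment.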
