My name is Shir Caplan, and I'm 28 years old.
Driven by the thrill of turning ideas into reality, I combine my experience as a seasoned software engineer and manager with a passion for conceptualizing and executing both business and R&D goals.

* Led the development of a private cloud computing platform for a successful IPO company, demonstrating my ability to deliver impactful technology solutions.
* Honed my skills in an elite IDF cyber unit within 8200, where I also earned my BSc in Computer Science.
* Committed to continuous learning and collaboration, I'm a member of a closed community of CTOs & VP R&Ds and actively participate in Meta's program to promote women in senior positions.

Beyond the technical:

I thrive in an environment where innovation meets execution, and I'm always looking for new challenges and opportunities to contribute my expertise. When I'm not building the future, you can find me collaborating with like-minded individuals or exploring the latest advancements in technology.

Shir Caplan

Head of R&D
English, Hebrew
Tel Aviv, Israel
Can also give an online talk/webinar
Paid only. Contact speaker for pricing!


How to Run AI Models on High Volume in Real Time (ms): Unlocking Real-World Value

Data / AI / ML, Software Engineering, Backend


In today's data-driven world, the ability to deploy and run AI models at high volume with millisecond latencies is crucial for unlocking their true potential. This talk delves into the practical challenges and best practices for achieving real-time performance for your AI models, enabling them to power mission-critical applications and deliver seamless user experiences.

* **The impact of latency in real-world AI applications:** We'll explore the consequences of slow response times in domains such as fraud detection and electronic components.
* **Key considerations for high-volume, real-time inference:** We'll demystify critical factors like model architecture, hardware optimization, and deployment strategies, helping you choose the right approach for your needs.
* **Practical techniques and tools:** Discover proven methods for building efficient data pipelines and using containerization to streamline deployment. We'll also showcase popular tools and platforms designed for real-time AI inference.
