I'm a passionate AI engineer determined to leave the world a little better than I found it. Welcome to my personal space on the web!
Hello! I'm Prashant Gupta, an AI engineer based in San Jose, California, focused on Generative AI inference. I specialize in deploying and optimizing large-scale LLMs and multimodal models in production environments. I am also passionate about bringing cutting-edge AI research into real-world applications with performance, safety, and usability in mind.
My journey into Computer Science began while pursuing my Master's in CS at North Carolina State University. Since then, I've been dedicated to honing my skills in Cloud, DevOps, and now AI, and applying them to create impactful projects.
When I'm not coding or designing, I enjoy dancing and playing Pickleball. I'm always eager to learn new things and take on challenging projects.
IBM Spyre is the first production-grade Artificial Intelligence Unit (AIU) accelerator born out of IBM Research. The vLLM Spyre plugin (vllm-spyre) is a dedicated backend extension that enables seamless integration of the IBM Spyre Accelerator with vLLM. It follows the architecture described in vLLM’s Plugin System, making it easy to integrate IBM’s advanced AI acceleration into existing vLLM workflows.
My work focused on understanding the intricacies of inference optimizations and attention mechanisms in large language models to ensure efficient hardware utilization on the Spyre Accelerator.
IBM watsonx is a portfolio of AI products that accelerates the impact of generative AI in core workflows to drive productivity. Users are free to choose an open source foundation model, bring their own, or use existing models, and to run them on any cloud using open, transparent technology with governance and security controls built in.
My work focused on the inference stack, ensuring models performed optimally at scale.
A minimalistic Go library/app to keep your Mac active and alive! Its sole purpose is to move your mouse pointer at regular intervals so that your machine stays awake. Best of all, it works ONLY when you are not working, so rest assured that the mouse won't start moving on its own unless the machine is actually idle.
This was a fun weekend project to explore concurrency with goroutines in Go.
For a full overview of my activity, including commit history and contributions, please visit my GitHub Repositories and Profile Page.