Covalent is bridging the gap between HPC and AI solutions 🌏

Plus: Co-founders Oktay and Elliot on the evolution of AI infrastructure...

Published 11 Jun 2024

CV Deep Dive

Today, we’re talking with Oktay Goktas and Elliot MacGowan, Co-Founders of Agnostiq.

Agnostiq is operating at the forefront of AI infrastructure with their flagship product, Covalent. Covalent is a cloud-agnostic accelerated computing platform that lets startups and enterprises build any AI or HPC application in a simple, scalable, and cost-effective way. Founded in 2018, Agnostiq aims to bridge the gap between HPC and scalable AI solutions, making sophisticated AI accessible and manageable for developers across various industries.

Oktay earned his PhD in Physics at the Max Planck Institute, Germany, and served as a researcher at the University of Toronto, where he met Elliot, who was completing his MBA. The company most recently raised funding in April 2023 from Differential Ventures, with participation from Green Egg Ventures, Scout Ventures, Tensility Venture Partners, Boost VC, and Rob Granieri of Jane Street Capital.

In this conversation, Oktay and Elliot delve into the origins of Agnostiq, the unique capabilities of Covalent, and how their platform is transforming AI infrastructure by enabling scalable, production-grade AI solutions for a wide range of applications, from computational biology to AI agent systems.

Let’s dive in ⚡️

Read time: 8 mins


Our Chat with Oktay and Elliot 💬

Oktay and Elliot – welcome to Cerebral Valley! Firstly, give us a bit of background on yourselves and what led you to co-found Agnostiq.

Hey there - my name is Oktay, and I am a physicist by training. I did my PhD at the Max Planck Institute in Germany, followed by a post-doc and a few years in academia. My last position was at the University of Toronto, where I met Elliot, and where we started the company in 2018.

I’m Elliot! My background is in business operations, with an MBA from the University of Toronto, which is where I met Oktay. At the time, I was also working at the Creative Destruction Lab, an incubator and accelerator program. Before that, I worked in various corporate roles at Bell Canada.

We started out as a quantum and high performance computing (HPC) company, which ultimately exposed us to the infrastructure challenges we find ourselves solving today with Covalent, our flagship product.

How would you describe Covalent or Agnostiq to the uninitiated AI developer?

Covalent is the most intuitive way of doing scalable, production-grade AI. You interact only with Python and don’t need to switch to different frameworks or ways of thinking. You focus on your problem, reflect it in your Python workflow, and that’s it. Covalent makes it simple to create and scale computationally demanding applications seamlessly.

We started building Covalent around four and a half years ago. We think of Covalent as being as pivotal as the shift from traditional data handling to 'big data'. However, this time around, it isn’t just about scale; it’s about mastering the use of 'big compute' for AI in a way that’s incredibly efficient and accessible. Just as big data required more robust platforms, Covalent equips developers with the tools to easily manage and build on top of vast computational resources, from model training to real-time simulations.
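
To make the "stay in Python" idea concrete, here is a minimal sketch using only the Python standard library. It is not Covalent's API: the `task` decorator, the `preprocess`, `score`, and `workflow` functions, and the sample batches are all hypothetical, and a local thread pool stands in for the remote compute a platform like Covalent would manage.

```python
from concurrent.futures import ThreadPoolExecutor


def task(fn):
    """Tag a plain function as a unit of work.

    Hypothetical marker for illustration only; not a Covalent decorator.
    """
    fn.is_task = True
    return fn


@task
def preprocess(batch):
    # Stand-in for an expensive step such as tokenizing or featurizing.
    return [value * 2 for value in batch]


@task
def score(batch):
    # Stand-in for a model call or a simulation step.
    return sum(batch)


def workflow(batches, max_workers=4):
    """Run both stages over every batch.

    The local thread pool is an assumption standing in for remote compute.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        prepared = list(pool.map(preprocess, batches))
        return list(pool.map(score, prepared))


scores = workflow([[1, 2], [3, 4], [5]])
print(scores)
```

The point of the pattern is that the problem is expressed as ordinary functions and one workflow function, while the choice of where the work runs is kept out of the business logic.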

Who are your users today – who’s finding the most value out of using Covalent?

Our users range from teams working in Generative AI to traditional AI such as computer vision to scientific computing. We have many users in computational biology, material science, robotics, nano-electronics, universities, and major AI research labs – so it’s really a broad spectrum.

We recently released a service for inference, and it's starting to gain traction. Before that, we focused on large-scale batch jobs, especially in supercomputing centers. For instance, someone could connect their laptop to a supercomputer like IBM’s Summit, or even to a combination of supercomputers and clouds, and run massive AI workloads entirely in Python from their laptop.

Covalent is a general-purpose platform that supports a wide range of applications. Users can create digital twins or build complex, dynamic AI agent systems for tasks ranging from prompt-refining applications to chemistry simulations. Essentially, you can build almost anything you can imagine on top of Covalent.

We’ve seen more and more teams tackling infrastructure for GenAI. What sets Covalent apart from a developer perspective?

Covalent builds on the experience of traditional high-performance computing (HPC). While HPC is only now becoming a mainstream part of the AI software stack, it has been a staple in academia and the scientific disciplines for decades. We modernize this experience and make it accessible to end-users, allowing them to perform large-scale simulations and build AI applications at scale, similar to traditional HPC but in a very efficient, scalable, and easy-to-use manner.

One of the key differentiators is our ability to support multi-cloud federated computing. Users can combine traditional on-prem HPC with cloud computing, integrating platforms like Kubernetes and Slurm across multiple clouds and on-prem systems. We offer a seamless Python user experience that is unique in the market. This allows developers to build any kind of AI application, leveraging the power of traditional HPC within a unified Pythonic interface.

Are there any specific customer success stories that you’d like to share? How do you measure success?

We have a broad set of users, from supercomputing centers in the US, Europe and Japan, to various large corporations using our open-source tools. Our users include pharma companies and AI companies building applications ranging from AI agent workflows to video processing.

In terms of measuring success, we look at several factors:

  • Increased Hardware Utilization: We focus on increasing overall GPU or general hardware utilization rates. The serverless nature of Covalent allows for the highest possible utilization, while reducing costs and increasing availability.

  • Developer Experience: Covalent provides an intuitive development experience, allowing developers to focus on the main business logic rather than the painful and time-consuming infrastructure work.

  • Time to Production: While somewhat qualitative, faster times to production are crucial. We remove much of the in-between work from prototype to production, condensing the DevOps and platform setup into a single solution.

Time to market is critical, especially in the current competitive landscape. What might take months with best-in-class engineers can take just days or even hours with Covalent. Building a complex AI agent system and making it scalable can be done using a simple Python notebook without needing to interact with Kubernetes or any underlying GPU infrastructure. Covalent abstracts these complexities, enabling rapid productionalization of workflows in our compute environment or in the client's.

What's the hardest technical challenge that you’ve faced building Covalent?

Integrating the experience of traditional high-performance computing (HPC) with modern cloud stacks was one of the hardest technical challenges in building Covalent. This involves incorporating GPU and CPU parallelization, task parallelization, and graph optimization into contemporary cloud infrastructure - and across multiple clouds, including specialized GPU cloud providers.

Traditional HPC is not designed for compute at scale in the cloud. Combining the HPC experience with the modern cloud stack is a significant challenge. We’ve spent considerable time solving this, ensuring that we deliver extremely high quality HPC capabilities within the cloud, while maintaining a modern and intuitive UX. This integration is a key strength of Covalent, enabling us to offer advanced computational solutions effectively.
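
The task parallelization and graph scheduling mentioned above can be illustrated with a small, generic sketch. This is not how Covalent is implemented: it uses only the standard library (`graphlib`, Python 3.9+, and a thread pool), and the `run_graph` helper, the task names, and the toy data are assumptions made up for the example. Tasks whose dependencies are already satisfied are submitted together, so independent branches of the graph run in parallel.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter


def run_graph(tasks, deps, max_workers=4):
    """Execute a dependency graph, running independent tasks concurrently.

    tasks: maps a task name to a callable that receives a dict of the
           results of its dependencies.
    deps:  maps a task name to the set of task names it depends on.
    Both structures are hypothetical conventions chosen for this sketch.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            # Every task returned here has all of its dependencies finished.
            ready = sorter.get_ready()
            futures = {
                name: pool.submit(tasks[name], {d: results[d] for d in deps[name]})
                for name in ready
            }
            # Simplification: wait for the whole batch before asking for more.
            for name, future in futures.items():
                results[name] = future.result()
                sorter.done(name)
    return results


tasks = {
    "load": lambda inputs: [1, 2, 3, 4],
    "features": lambda inputs: [v * v for v in inputs["load"]],
    "stats": lambda inputs: sum(inputs["load"]),
    "report": lambda inputs: sum(inputs["features"]) / inputs["stats"],
}
deps = {
    "load": set(),
    "features": {"load"},
    "stats": {"load"},
    "report": {"features", "stats"},
}

results = run_graph(tasks, deps)
print(results["report"])
```

Here `features` and `stats` both depend only on `load`, so they are dispatched in the same batch; `report` runs once both have finished. A production scheduler would also handle retries, data movement, and placement across different backends, which this sketch leaves out.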

Could you give us an idea of your roadmap? How is Covalent going to progress over the next 6-12 months?

Our goal is to make Covalent the standard platform for building enterprise-grade AI applications, and the next 6-12 months are going to be all about building in that direction. Two weeks ago we announced a major new inference and function-serving capability within Covalent, and we have other similar releases planned to enhance the experience of AI engineers, making it even easier for them to develop and productionalize AI applications.

How does your team balance research with productization – especially with the pace of GPU and AI breakthroughs?

With Covalent, our guiding principle is application and hardware agnosticism—it's our true north. When faced with the challenge of leveraging high compute, you have a choice: rush to build quick, application-specific solutions, or craft a robust, universally adaptable framework. We've chosen the latter, inspired by the best practices from the traditional HPC/scientific computing world, and have followed this path steadfastly for the past five years.

By doing so, we have been able to easily bring advances in applications and hardware into the Covalent platform without disrupting end-user workflows. We have been designing Covalent with the anticipation that compute modalities and applications are going to keep changing, so we will be well positioned to help our users take advantage of those breakthroughs as they come.

Have you been thinking of the upcoming trends in AI, like Multimodal AI and Agentic AI, and how they affect Covalent?

Multimodal AI and agentic AI are going to be very important and will become integral to many workflows. Currently, people interact with these systems in simple ways, often as proofs of concept. But soon, they will be part of production environments everywhere.

Covalent is designed to make these agent-like workflows possible by integrating large language models and scientific computation. It supports any Pythonic workflow together with serving functionalities, acting like a collection of Lego blocks that let you build whatever you need. As a result, we believe Covalent is well positioned to help end-users build increasingly complex agentic systems, such as those for chemistry simulations or computational drug discovery. We already have some examples of companies building these systems using our platform today.

How would you describe the culture at Agnostiq? What do you look for in prospective hires?

High-trust, high-agency. Very interdisciplinary: the team is primarily composed of PhD and master's graduates in math, ML, and physics, alongside best-in-class software engineers. We are currently hiring for various roles on both the business and technical sides. We're looking for software engineers, machine learning engineers, and go-to-market leaders. Reach out if you’re interested in learning more at careers@agnostiq.ai.

Lastly, is there anything else you think we should know about Covalent?

We've been working on this problem for a long time. This isn't just a ChatGPT spin-up company reselling GPUs. As we alluded to earlier, we have a big vision here. While GenAI is all the rage right now, we’re continuing to build towards a future where high-compute is used for a lot more than just that - think digital twins, simulations, robotics, quantum computing, etc. We know that tools like Covalent are going to be needed to help developers and researchers make major breakthroughs in these domains and we’re very motivated by that.

If you’re interested in learning more about Covalent, check out our tutorials here.


Conclusion

To stay up to date on the latest with Covalent, follow them on X and learn more about them at Covalent.

If you would like us to ‘Deep Dive’ a founder, team or product launch, please reply to this email (newsletter@cerebralvalley.ai) or DM us on Twitter or LinkedIn.
