Our chat with OpenAI's Logan Kilpatrick (Pt. 1)

Plus: Logan on AI DevRel, Assistants API and the GPT Store...

Published 12 Feb 2024

CV Deep Dive

This week, we’re talking to none other than Logan Kilpatrick of OpenAI.

Logan is the prominent face of Developer Relations at OpenAI. Since joining the startup in November 2022, he’s been central to the way thousands of developers have interacted with OpenAI’s developer-facing products, including the GPT Series, various model APIs and the Plugins-turned-GPT-Store. He’s also involved in broader conversations around how OpenAI interacts with the developer ecosystem at large.

In Part 1 of this conversation, Logan walks us through his day-to-day at OpenAI, working with AI developers, and the launch of the GPT Store.

Let’s dive in ⚡️

Read time: 8 mins


Our Chat with Logan (Part 1) 💬

Logan - welcome to Cerebral Valley. First off, give us a bit of background on yourself and how you came to be the face of Developer Relations at OpenAI?

Like all great things, I think this is an accident that was in the making for many years. When I joined, the original perspective internally was like “hey, we have this amazing technology that developers don’t know much about”. There were a few companies building with our GPT-3.5 models - like Jasper AI and Copy AI - but those were very vertical-specific. And the team saw the potential of how transformative the technology could be, and thought “we need to get more developers building with this”.

I was brought in to help do that, and it just so happened that my first day was ChatGPT hitting 1 million users. All of a sudden, everybody was really excited about AI and we really haven’t had that same problem since! And so the problem space shifted from “hey, let’s get developers excited about this”, to “hey, now we have developers excited and we need to make the best developer product possible”.

When I joined, our API Platform team was roughly 7 or 8 people - and we really got to work on how we make things better for developers; from core dev-ex work to traditional developer relations where I’m doing a few demos here and there. Our work has been really focussed on our core product, improving our documentation, and getting developer feedback - this is what we care about most on a day-to-day basis.

Diving straight in - walk us through the immediate aftermath of ChatGPT’s explosion in December 2022. What was it like to experience?

It was a very surreal experience. I remember so many random tidbits from that first day - everything blowing up in that moment and being thrown right into the thick of it. There were so many more people joining our platform and so many more developers using our products. In a few words, everything was on fire.

This is the same experience everybody who onboards at OpenAI has - you get thrown into it and are immediately pushing on a really important thing. When I first joined on the developer platform side, we had the new Embeddings V2 coming out and I just got tossed straight into that. There was so much work to be done, in front of so many more eyeballs - it definitely felt higher stakes.

I don’t think people appreciate just how complex things get when millions of users show up and you have all this demand for your product and all this additional work to be done. Your intuition says “perfect, I’ll just go hire a bunch of people and that’ll solve our problems”. What people forget is that spinning up a hiring process takes time.

For us, even past the GPT-4 and ChatGPT API releases, we still had all of this work that needed to be done and not enough humans on the team who were able to do it all. It really wasn’t until the last 4 or 5 months that we scaled the team to the point that we can do all the things we need to do.

That sounds intense, especially in the first half of 2023. What does your current day-to-day look like in 2024?

It’s all over the place, and I think that’s what I love so much. This week, for example, we were working on the Assistants API beta and the GPT-4 fine-tuning waitlist. That said, I think there are a few core pillars of the work that I do on a weekly basis.

For example, I do a pretty heavy amount of work on our developer documentation. It’s super important to me because it’s the surface area for developers to use our products, and it’s fun too. I also run our developer forum - so if you’ve ever had questions and wound up on community.openai.com, that’s something I work on deeply with the teams there.

We’ve also spent a lot of time talking to developers about what needs to improve, hearing their feedback (including tweets), and trying to communicate what we’re working on as best we can. In many ways, it’s more like comms work - not in terms of content, but just being like “hey, developers, here’s what’s important to us and how people can best leverage the things we’re building”.

So again, not the same as what I’d consider normal developer advocacy, which is trying to convince developers that they need to build with your thing - but more so managing the huge number of developers we already have. And that’s super exciting.

Tell us about the challenges that go with that. For example, there was a recent stir about GPT-4’s laziness - how much of your role touches those aspects of user comms?

Great question. I think I actually bear the brunt of feedback about a lot of the emerging issues, because people will come to me thinking that I have the power to make the changes they need. Something’s not working, and I get randomly pinged by people - which I do think is super helpful since it keeps me in the loop.

A lot of that is just me taking whatever the feedback is and channeling it to the right team, since it’s not going to be me that fixes every single problem. There are a bunch of things that I can fix, which is nice, but model-level improvements are a great example of something that I can’t do myself.

With the narrative of the model being lazy, there are so many different stakeholders involved in addressing an issue like that. We just released a new iteration of the model that addressed a lot of feedback - and that was a huge collaboration between many folks. So it’s a lot of those types of things - like “hey, this API endpoint has some weird nuanced characteristic about it”. We can fix that, but the model-related fixes are definitely the hardest.

You just released the Assistants API. Tell us why developers should be excited to build on top of it?

Assistants API is the natural next step if you want to build more robust and comprehensive applications. We’re investing a ton into it so that developers get all of those benefits, and it also lays the groundwork for a lot of the future-facing work we’re doing. For example, we announced GPTs at Dev Day - which is really our first step towards building these agentic model systems - and Assistants is really what’s going to underpin that for developers.

Also, I’m super excited about expanding the surface area of tools we offer to developers today. You can use function calling and Code Interpreter and knowledge retrieval, but there’s such an opportunity for us to build more things for developers - e.g. building abstractions on the top 50 or 1,000 tools that developers would want to integrate into the Assistants API, and then making it available via an SDK. That would be a huge unlock - things like that are going to make building with these tools so much easier.
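To make that concrete, here’s a minimal sketch of how the three existing tool types are attached when creating an assistant with OpenAI’s Python SDK - the assistant name and the get_weather function are made up for illustration:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Attach the three tool types available on the Assistants API beta:
# Code Interpreter, knowledge retrieval, and a custom function.
assistant = client.beta.assistants.create(
    name="Data Helper",  # illustrative name
    instructions="Help users analyze data and answer questions.",
    model="gpt-4-turbo-preview",
    tools=[
        {"type": "code_interpreter"},
        {"type": "retrieval"},
        {
            "type": "function",
            "function": {
                # Hypothetical function, shown only to illustrate the schema.
                "name": "get_weather",
                "description": "Get the current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        },
    ],
)
```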

To the uninitiated, walk us through the journey from the original APIs to the Assistants API. What was the driving force to get here?

The progression of our APIs started with completions - where you would put in a sentence like “my name is”, and the model would use next-token prediction to finish that sentence for you. We then moved to chat completions, which came out with GPT-4 and 3.5-turbo, where you put in a system instruction plus a list of messages, and the model reads those and generates the next response from the assistant’s perspective.
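Here’s a minimal sketch of those two call shapes using OpenAI’s Python SDK (the model names and prompts are illustrative):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Completions: the model simply continues your text.
completion = client.completions.create(
    model="gpt-3.5-turbo-instruct",
    prompt="My name is",
    max_tokens=5,
)
print(completion.choices[0].text)

# Chat completions: a system instruction plus a message history;
# the model replies from the assistant's perspective.
chat = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Introduce yourself in one sentence."},
    ],
)
print(chat.choices[0].message.content)
```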

The challenge was that as you went into these multi-turn conversations and wanted different context to be accessible to the model, you had to manage all those messages yourself. So you’d have to put them into a vector store and then dynamically pull them in real time depending on the context.

With the Assistants API, you don't need to do that. You now have this notion of a thread - similar to iMessage or Twitter - where you can just append messages in and then give that thread ID to different assistants for different use cases. 

So now, you can actually have a general conversation with the model, and then an assistant that's specialized in life advice, for example. And you can decide dynamically where you want those to go, and which assistant you want them in, without having to manage the conversation history yourself - which is the huge part. There are also other benefits that come with it related to tool use - for code interpreter, knowledge retrieval, function calling and all that good stuff.
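As a rough illustration of that flow - the Assistants API is still in beta, so details may shift, and the “life advice” assistant here is a made-up example:

```python
import time

from openai import OpenAI

client = OpenAI()

# An assistant specialized for one use case.
advisor = client.beta.assistants.create(
    name="Life Advisor",  # illustrative example
    instructions="You give thoughtful, practical life advice.",
    model="gpt-4-turbo-preview",
)

# A thread holds the conversation history for you - no manual
# message management or vector store required.
thread = client.beta.threads.create()
client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="Should I switch careers into machine learning?",
)

# Hand the same thread to whichever assistant fits the conversation.
run = client.beta.threads.runs.create(
    thread_id=thread.id,
    assistant_id=advisor.id,
)

# Runs are asynchronous: poll until the assistant has responded.
while run.status in ("queued", "in_progress"):
    time.sleep(1)
    run = client.beta.threads.runs.retrieve(thread_id=thread.id, run_id=run.id)

# Messages are returned newest-first by default.
messages = client.beta.threads.messages.list(thread_id=thread.id)
print(messages.data[0].content[0].text.value)
```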

The GPT Store was another major release at the end of 2023. How are you encouraging developer activity for it, and what has surprised you most?

The GPT Store launch was super exciting. I’d spent a lot of time doing stuff with Plugins for developers, as that was our first real developer surface area inside of ChatGPT. People made such cool plugins, and devs got real traction from ideas that weren’t possible before. It also would have been hard for developers to find a market for Plugins if they hadn’t been built directly into ChatGPT, leveraging the existing user base our products have.

Fast forward, the GPT Store launch has been incredible. This is what people have been asking us for, for a long time - like, how do we get better discoverability, or better leaderboards, or social signals as to whether or not this is a useful GPT? With the Store, we’ve provided those things - plus a bunch more - and so I think it’s been incredible for developers. You can really build much more robust experiences.

Additionally, it unlocks an entirely new user base that wasn’t able to build before. If you were non-technical, you really couldn’t build a Plugin; you needed to have an API, an OpenAPI spec, and to actually put the AI plugin manifest file on a URL somewhere. There were a bunch of hurdles for people who weren’t technical. And now with GPTs, you can go in there, use the GPT builder, not have ever coded in your life, and build something that’s actually super useful.

A great example of this is my girlfriend, who came up with the idea for “Planty”, a houseplant gardening GPT. It’s one of the official OpenAI GPTs right now, and there was no coding required. And it’s materially better than default ChatGPT, because it combines the knowledge that’s already in the models with browsing and a few other capabilities to help you take better care of your plants - which has been super fun.

It feels like making AI tools accessible to non-technical folks is a major cornerstone of OpenAI’s strategy, at least in its product releases.

This is absolutely top of mind, because that’s part of our mission - and for us to be successful in achieving it, we have to provide tools that are accessible to more people. If only a small sliver of people can build using AI, the use cases will be representative of just those people. So, we have to build tools that make it easier to build with AI, even if you're not somebody who has technical domain expertise. 

I think GPTs are the first shot on goal at this problem, and I hope that we’ll continue to do more, because I meet people all the time who are so excited about AI but who aren’t developers. And there should be tools like this above and beyond OpenAI’s. There are lots of companies trying to solve the problem of how we make AI accessible to the masses, but I think it’s imperative for us to do it as well, because it’s impossible for us to fulfill our mission if we don’t.

You’ve mentioned reliability and robustness as two critical elements that developers are expecting from their APIs. What steps are you taking to ensure the models maintain and improve on those characteristics?

There are 3 different angles to this. Firstly, from an API reliability standpoint, uptime is front and center for us all the time. We know that the API needs to be super reliable so that developers feel comfortable building on it. So I think that’s one element of it, and the most important one from a developer point of view.

The second element we’re focussed on is reducing hallucinations - which largely stem from these models prioritizing giving you an answer to your question, even when they aren’t sure it’s correct. Hallucinations dropped by almost 40% from GPT-3.5 to GPT-4, which is awesome to see. And my hope is that the trend will continue with future model iterations, now that they have been trained with more up-to-date information. So I think we’re on a trendline to a point where hallucinations are far less prominent.

The last element of this is the reality that the raw model output is maybe not what you want to use every single time. There are now systems and guardrails from other providers that help ground the model and make sure the outputs are valid. This is also where RAG comes in: you can feed your desired context into the model and get more factual answers in that case.

So overall, I think developers are finding solutions on their own, and I think that trend is also going to continue. I really don't think the future for using these models is simply taking the raw output from the model as the answer for every single question. You're going to have to do those RAG workflows, and you're going to have to do a bunch of other things to make sure that the outputs are really high quality and in line with what you want.  
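As a minimal sketch of that kind of RAG workflow - the embedding model choice, the toy documents, and the prompt wording are all illustrative assumptions:

```python
import numpy as np
from openai import OpenAI

client = OpenAI()

# A toy document set standing in for your real knowledge base.
documents = [
    "Our refund policy allows returns within 30 days of purchase.",
    "Support is available weekdays from 9am to 5pm Eastern.",
]

def embed(texts: list[str]) -> np.ndarray:
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in resp.data])

doc_vectors = embed(documents)

question = "How long do I have to return an item?"
q_vector = embed([question])[0]

# Retrieve the most relevant document by cosine similarity.
scores = doc_vectors @ q_vector / (
    np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(q_vector)
)
context = documents[int(np.argmax(scores))]

# Ground the model in the retrieved context for a more factual answer.
answer = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": f"Answer using only this context: {context}"},
        {"role": "user", "content": question},
    ],
)
print(answer.choices[0].message.content)
```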

There have been questions around how much OpenAI intends to productize its research as an AGI-focussed company. No doubt there are startups that also feel existentially threatened with every new release. Could you speak to those concerns?

There are definitely tension points and core learnings that we’ve taken forward.

One is that it’s best to not spring new releases on developers without giving them a heads-up of what’s coming. That’s where Dev Day becomes super important, and there are similar parallels in other ecosystems - Apple has WWDC, and everybody knows when they can expect things to change and can plan ahead. I’m hopeful we’ll do that with Dev Day. That said, we have so much we’re working on that it’s not possible for us to release it all on a single Dev Day. There’ll be things that come before then. But I think that’s one of the pieces.

I also recognize that with each release of a model or new capabilities, some number of startups get existentially affected. This might be true to an extent, but I think it’s a very small number that are actually being disrupted. I think the interesting and opposing end of that spectrum is that with each new release, the number of newly-possible use cases and solutions also expands exponentially.

The very real example of this was the jump from GPT-3.5 to GPT-4. The number of startups building on GPT-3.5 was small, and the use case that worked well was mainly copywriting. GPT-4 brought a massive explosion in the capabilities of AI startups, and now you can pretty much tackle any problem. There are still a few verticals that aren’t perfect, and my guess is that with the next model we release, those will become accessible to developers to go and solve problems with. So I’m really excited, and I think that trend is going to continue.


Conclusion

That’s a wrap for Part 1 of our Deep Dive with Logan! Follow him on X and LinkedIn to keep up with his work at OpenAI.


If you would like us to ‘Deep Dive’ a founder, team or product launch, please send us an email to newsletter@cerebralvalley.ai or DM us on Twitter or LinkedIn.
