Summary ThursdAI - Full Interview with Killian Lucas, Author of Open Interpreter - YouTube (Youtube) www.youtube.com
11,104 words - YouTube video
Speaker 0 Hello, and welcome to this special episode. My name is Alex Volkov. I'm the host of ThursdAI and the founder of Targum.video. I'm very excited to bring you this special interview with Killian Lucas, the author of the very popular open source project called Open Interpreter.
Speaker 0 Killian came on the live podcast recording in the middle of the first week of Open Interpreter being open sourced by him. In fact, as I'm recording this, Open Interpreter is not even a week old. However, it has already exploded in popularity and unlocked the imagination of tens of thousands of people, boasting an incredible 23,000 GitHub stars at the time of this recording. Open Interpreter is a way to have AI run and execute code on your local machine, using either GPT-4 or a local model like Llama or Code Llama.
Speaker 0 And as you will hear, Killian has incredible plans for it already, many of which came from the community and the users. Given that ThursdAI has become a great place for the leaders of open source LLMs to discuss things like fine-tuning, datasets, and better and better open models, the conversation, of course, turned into a productive collaboration between the folks on stage and Killian. I'm sure you'll like this interview. And unlike some of our regular chats,
Speaker 0 this one does not require any specific knowledge about data science or large language models. In fact, Killian shared that he dropped out of college to pursue AI just a couple of years ago, and is using ChatGPT and Replit to build Open Interpreter. So here we are, folks: a great AI engineer creating an incredible product that is very much needed, as evidenced by its popularity. Just before we get started, I want to ask you:
Speaker 0 If you like this interview, please consider subscribing. It really does help me to continue doing these interviews and tuning in every week, as I want to invest more and more into ThursdAI and actually make it a big part of my career. ThursdAI started as just a few folks on a Twitter space chatting about AI, and I did not imagine in my wildest dreams that we would have
Speaker 0 sponsorship opportunities, premium subscribers, world-class guests, and co-hosts tuning in from week to week. I would like to invite you to join us and become a part of the ThursdAI community, whether you're coming to our live space recordings, reading the newsletter, listening to the podcast in your car on your way to work, or just following along on social media and leaving comments.
Speaker 0 I appreciate each and every one of you. Thank you for tuning in every week. And now, I give you Killian Lucas, author of Open Interpreter.
Speaker 1 And I want to welcome Killian to the stage. Go ahead and introduce yourself first, and then we're going to talk about Open Interpreter a little bit.
Speaker 2 Hey. It's great to be here. Can you hear me okay?
Speaker 1 Yeah. We can hear you fine. Thank you.
Speaker 2 Yes. So my name is Killian, and I've been working on an open source project called Open Interpreter, which is an interface between language models and computers. It's trying to build something that lets you combine these two worlds: the flexibility of language models and the rigidity and determinism of computers. And, by the way, I'm not deep enough as an engineer for this room. I'm so honored to be here.
Speaker 2 This is such incredible stuff to get to hear. I'm very far out on the application layer, where people are interacting with it. But, yeah, my main project right now is Open Interpreter.
Speaker 1 Awesome. First of all, welcome. These discussions happen every Thursday and will continue to happen, so feel free to keep calling in. We covered Open Interpreter before, but I want to scroll back even further: OpenAI gave us Code Interpreter.
Speaker 1 We've talked about Code Interpreter at length multiple times, and I and some other folks here are actually in a group that tries to take it to the extreme. And basically, we noticed back then that Code Interpreter is GPT-4, we think a fine-tuned version, that is able to correct its own mistakes. We also know that many agents do the same kind of thing, right? Many agents generate a task for themselves and then run in a loop to be able to execute this task, and we know GPT-4 is the best of them.
Speaker 1 And so then OpenAI gave us Code Interpreter, a way to upload files, run inference on them, and actually execute code. How early, when you saw this, did you start thinking about doing this openly on your own? And what led you to start working on this?
Speaker 2 Oh yeah, well, it was a few things, and it was earlier than that. I think a lot of people have been playing with this since Sharif Shameem in 2020 was doing stuff with GPT-3 and having it write code. And it was just like, holy shit, this thing can write and run code. And that was when it all started. I then went on and made a sort of platform where you could design your own AI agents and connect them to tools, and there was one type of AI agent that people were building on it that was unreasonably powerful. And it had access to one tool, and it was an interpreter.
Speaker 2 So it just blew everything else that people were building on the platform out of the water, just in terms of being able to integrate with anything and do complicated logic. And it was so clear to me that the API into actions was just going to be code, and that this idea of making toolkits and doing all these things was reinventing the wheel, given that code is basically our species' best attempt at making a tool out of language, at making something that turns language into actions. But really, honestly, it was me pasting something into ChatGPT, getting it to write some code, then going off and running it, and it would produce an error, and I'd be like, okay, then I'd copy that and put it back into ChatGPT. I was like, there's gotta be a way to just automate this. So the product version really came from that, and obviously from Code Interpreter, which is just so simple, such a dream, an incredible product. So yeah, it came from all those things, and from thinking: how much more can we do, and how close to people's machines can we bring this?
Speaker 1 Awesome. And so obviously Code Interpreter, which is now called Advanced Data Analysis...
Speaker 2 Advanced Data Analysis, yeah.
Speaker 1 Yeah. I don't know if you're going to rename yours to Open Advanced Data Analysis or not.
Speaker 2 No. It sucks. Yeah, it's terrible. It's terrible timing.
Speaker 2 It was like the same day.
Speaker 1 But funnily enough, for many folks this was the use case, right? It was the first OpenAI interface that allowed you to actually upload files and perform things on those files.
Speaker 1 However, folks who have followed OpenAI for a while may remember that when they released Codex, they actually had a playground that runs code. You were literally able to ask Codex to generate a website, and it would show you some JavaScript code, run it, and show you the website. And I think this has been on their docket for a while. And one of the things they did that people don't talk about often is the whole environment thing: you don't have to set up environments, Python...
Speaker 1 You don't have to do anything. You pay the 20 bucks, you go to the chat interface, you write and execute code in a safe environment. Safe being the operative word here. Open Interpreter, the great thing that you've open sourced and that has now exploded in popularity, which we'll talk about next, runs on my machine. How do you think about that safety, that security?
Speaker 1 Because, like, when I showed this at our AI meetup in Denver, the first person said, it will run rm -rf and delete my whole computer. So how do you think about, you know, execution environment safety?
Speaker 2 Yes. So this problem can be approached a lot of different ways. Because first of all, what's at the top of the repo is: let LLMs run code on your computer. And that was intentionally meant to be a bit controversial, in that I think a lot of the smartest people I know hate that.
Speaker 2 And they look at it and they're like, this is the end of... come on. It is not. First of all, I would encourage anybody to try it with GPT-4 and ask it to delete all your files. It won't. What's very interesting is that this is kind of an alignment thing now.
Speaker 2 It will say, I have to refuse that request, when it enters into that domain. For a while we had a forbidden commands list that would prevent it from running rm or doing things like that. But some people do want to use it to really freely delete directories and stuff.
Speaker 2 And I would say too, there are solutions to think about in terms of post-processing the code it writes. There's the idea of running, I think it's called Semgrep, and there's GuardDog, this thing that investigates npm packages and pip packages for malicious code before they're installed. So generating code and then scanning it, that's one way to think about solving the problem. But I think the best way to solve the problem is just the fact that, by default, this thing runs with confirmation.
Speaker 2 So you do need to actually give it permission. If you ask it to go and do something, it will write the code, and then it will sit there in a beautiful syntax-highlighted code block, and you can just hit yes to approve it and run it. So it runs in that mode by default. There's also the fun thing about Open Interpreter that's really exciting, and that I think people are really liking: the fact that it can run locally. And it's actually great in a cloud environment too.
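The safety flow described above, scan the generated code, show it, and only run it once the user approves, can be sketched roughly like this. The blocklist patterns and function names here are illustrative, not Open Interpreter's actual internals:

```python
import re
import subprocess
import sys

# Illustrative blocklist -- early Open Interpreter reportedly used a
# similar "forbidden commands" list before leaning on confirmation.
FORBIDDEN_PATTERNS = [
    r"\brm\s+-rf\b",         # recursive force-delete
    r"\bmkfs\b",             # format a filesystem
    r":\(\)\s*\{.*\};\s*:",  # classic shell fork bomb
]

def looks_dangerous(code: str) -> bool:
    """Cheap pre-execution scan of model-generated code."""
    return any(re.search(p, code) for p in FORBIDDEN_PATTERNS)

def run_with_confirmation(code: str, confirm=None) -> str:
    """Show the generated code, ask for approval, then run it.

    `confirm` is a callable returning True/False so the prompt can be
    replaced programmatically; by default it asks on stdin like a CLI.
    """
    if confirm is None:
        confirm = lambda: input(f"Run this code?\n{code}\n[y/N] ").lower() == "y"
    if looks_dangerous(code):
        return "refused: matched forbidden pattern"
    if not confirm():
        return "skipped by user"
    result = subprocess.run(
        [sys.executable, "-c", code], capture_output=True, text=True
    )
    return result.stdout.strip() or result.stderr.strip()
```

A scanner like Semgrep or GuardDog would replace the toy regex list with real static analysis; the confirmation gate is the part that matters.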
Speaker 2 So, I don't know, maybe I'm from a generation of programmers that... I started programming in 2020, and I found Google Colab, and I have not left. So I've never run any code on my computer, except when I bring Open Interpreter to it and ask it to do stuff. I have been entirely a cloud programmer. I built Open Interpreter in Replit.
Speaker 2 My style of programming has been just like, break shit, go and destroy the machine, bang up against the edges. Which, as funny as it sounds, is how Open Interpreter sometimes works: it's trying to find the edges of reality by running code, getting some feedback about how the world it's in works, and then bouncing off those edges so it can go a little further than if it was just set off in an open, you know, environment. So it's actually a lot of fun.
Speaker 2 And if you're worried about security, I really recommend running it in Replit, or running it in Google Colab. Then you just have this added layer: you can mount your Google Drive, and still use all the flexibility of a development environment that you control, as opposed to OpenAI's, but you can still not be concerned that it's going to delete your operating system. And it really does not want to do that.
Speaker 2 And by giving you the option to approve code before it's run, I think we got around a lot of that.
Speaker 1 Awesome. That's good to hear. So one of the... I won't say drawbacks, but definitely one of the differences between this and running Code Interpreter, which sits in the cloud, besides the price, besides everything, is the safety, right? Over there it's a sandboxed environment, and you're suggesting folks can run this in Replit or inside Google Colab. But the benefits of running this on a local machine, let's talk about some of those benefits. I want to specifically raise this next thing that Nisten and I ran into.
Speaker 1 Code Interpreter does not have access to the Internet. This restricts the number of tasks that you can do with it, right? This restricts the number of tasks it can do for you. Sometimes it tries and fails.
Speaker 1 So for example, geolocation and some other stuff. Running this locally, you have full Internet access. Scary, but very exciting. Can you talk about this in terms of how people use it and what it actually does, like installing different packages? You mentioned npm packages. Could you dive into that benefit?
Speaker 2 Yeah, for sure. And this is something to speculate about, as to the trajectory of what they're going to do. You know, we can all imagine they probably are going to open up Code Interpreter to the Internet at some point.
Speaker 2 But I don't know if they're going to open it up to really everything. Because, like, a lot of folks want to get structured data from the Internet. Web scraping, honestly, and those kinds of use cases. A lot of people are using Open Interpreter for that, and it's just completely impossible with Code Interpreter, obviously.
Speaker 2 A really fun... probably the most magic thing you can do with Open Interpreter is ask it to open Chrome and do something, because it'll use Selenium. It's on a local machine, and it's this magic experience of it actually opening Chrome, clicking buttons, navigating around, and you can see it filling things out.
Speaker 1 That's incredible. So, just for folks in the audience, Selenium is a thing that does what?
Speaker 2 Selenium is basically a wrapper around Chrome. It's something that can communicate with Chrome on your local machine. So just by running code, it can open the browser, get what's on the page, switch tabs, and do all this stuff. So all of the magic of ChatGPT, how you talk to it and how easy it is, all of that gets to be pushed into Chrome.
Speaker 2 I forget what it's called, the Chrome engine that opens up, but it's just incredible to watch. And by the way, there's really cool new browser stuff coming, because I think that's such an interesting use case, in that it opens up a lot of things that aren't APIs. So you can use tools that don't have an API, they just have a web, you know, thing on the front, and it can retain all your authentication, because you're logged into stuff. That's just a really interesting future for this. So there's great browser stuff coming to Open Interpreter really soon.
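For anyone curious what the primitive looks like in code, here is a minimal, hypothetical sketch: Selenium driving a local Chrome (through ChromeDriver, the "engine" mentioned above) and handing the page's visible text back as plain language a model can read. It assumes Selenium 4+ and a local Chrome install; the import sits inside the function so the sketch can be read without either:

```python
def grab_page_text(url: str, max_chars: int = 500) -> str:
    """Open Chrome via Selenium, load a page, and return its visible
    text -- roughly the primitive that lets a language model 'see'
    the browser. Selenium 4+ and a local Chrome are assumed.
    """
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    options = webdriver.ChromeOptions()
    options.add_argument("--headless=new")  # no visible window
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # The model usually needs a distilled view, not raw HTML.
        return driver.find_element(By.TAG_NAME, "body").text[:max_chars]
    finally:
        driver.quit()
```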
Speaker 1 Wow. Wow. Okay. So many questions here.
Speaker 1 But first of all, the basic question is: how does it see what's happening in my browser?
Speaker 2 Yeah. So I guess there's no real concern about me talking about this. Yeah, this is the stuff that I was talking about for the future.
Speaker 2 So right now, what we've experimented with is... I don't know if you remember natbot from a while ago.
Speaker 1 Yeah.
Speaker 2 So, just the idea of how you distill the HTML that's on the page into something that's reduced all the way down to its interactive elements, to what the language model actually needs to know. So maybe it just shows headings. We're exploring that, and we're going to be releasing a package soon called open-browser. And it's kind of like a wrapper around Selenium,
Speaker 2 which is built for language models to use. To me, this is the first of what I think is going to be a wave of programming tools that are built for language-model programmers. A totally different way of thinking about the kinds of things you need as a programmer that's trapped in a box, like a language model is. It's not trying to write perfect code every time. It's trying to work in this workflow where it runs code and finds the errors.
Speaker 2 It needs to be verbose by default, because it needs to know what's happening. There are all these little quirks of how you should write packages for language models to use; they're the main customer. Metaphor is an example of a company that's thinking about this kind of stuff a lot. But open-browser is going to be that for controlling a Selenium browser locally, and you'll be able to use it for free in Open Interpreter.
Speaker 2 And it's, yeah, it's going to be all those things, applied, with an LLM as a wrapper around it, so it's much easier to use. As for the idea of how it sees: I take Sam Altman very literally on this. He's talked about the fact that language models like GPT-4 are going to be able to see soon. I really think they'll open up that API.
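The natbot-style distillation described above, boiling a page down to the handful of elements a model can act on, can be approximated with Python's standard library alone. open-browser itself presumably does something far more sophisticated; this is just the shape of the idea:

```python
from html.parser import HTMLParser

class InteractiveElements(HTMLParser):
    """Collect only the elements a language model needs to act on a
    page: links, buttons, inputs, and headings. Body text is dropped."""

    KEEP = {"a", "button", "input", "h1", "h2", "h3"}

    def __init__(self):
        super().__init__()
        self.elements = []
        self._open = None  # tag whose text we are currently capturing

    def handle_starttag(self, tag, attrs):
        if tag in self.KEEP:
            self._open = tag
            if tag == "input":
                # Inputs carry their info in attributes, not text.
                kind = dict(attrs).get("type", "text")
                self.elements.append(f"<input type={kind}>")

    def handle_data(self, data):
        if self._open and self._open != "input" and data.strip():
            self.elements.append(f"<{self._open}> {data.strip()}")

    def handle_endtag(self, tag):
        if tag == self._open:
            self._open = None

def distill(html: str) -> str:
    """Reduce raw HTML to a short, model-readable list of elements."""
    parser = InteractiveElements()
    parser.feed(html)
    return "\n".join(parser.elements)
```

The distilled output for a login page would be a few lines like `<h1> Login` and `<button> Sign in`, small enough to fit in a prompt.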
Speaker 1 Yeah. We've covered this multiple times in the past, waiting for vision to release while we're tracking all the open source models, like Qwen and different things. Even today, we talked about a PDF model from Meta that can, like, see images basically. Yeah, please go ahead.
Speaker 4 Yeah, I'll just say real quick, Killian, in case you've heard of the open source multimodal models that can actually see: there's LLaVA, and there's Qwen-VL.
Speaker 1 Qwen was taken down from beta, but we'll find it through our friends; we'll find the weights. But, yeah, this is in terms of how these models can see, right?
Speaker 2 Oh yeah. So just to play with that for a second: there's obviously the idea of a multimodal model, and that's ideal. And if there's stuff out there, we really want to make it so Open Interpreter is kind of like a distributor of local language models. Tons of people are trying to install these things locally, because they get all this benefit of the interpreter stuff.
Speaker 2 So to include some multimodal models would be a lot of fun, and we're finding the right way to do it. But what we've played with in open-browser is the idea that you can actually just pass the visual of a website to something like Unstructured.io, and get much better results than you can get from the HTML. Because just scraping the HTML and trying to find the interactive elements doesn't really work; we see websites. The whole point is that you're actually seeing them.
Speaker 2 You can't look at them from the back and try to get everything from the HTML. So this is all coming out really soon, but there's also the idea of maybe using that open source OCR model from Meta. It seems like it might be an even better way than Unstructured, I don't know, to be able to pull out visual information and then just pass it as language to the model, so it can really effectively use a browser. But, yeah.
Speaker 2 So that's all fun: open-browser. Also, when it has Internet access and can download large files, whatever your environment has the disk for, all kinds of interesting use cases come out of that.
Speaker 1 So so here's, I wanna just quickly interject and say No from Meta was is this the model that, like sees and understand. And, yeah, The next point I wanted to ask you about is Can this Ai run other ais by downloading it. Huge file from the Internet from samsung on page.
Speaker 2 Yeah. Oh, well, wait. So first of all, with Open Interpreter the dream is really that you'd just be able to... and we're working to get it to work across a lot of different systems, it's a hard problem.
Speaker 2 And I'm not smart enough to solve it, I'm not at that deep level. So I'd actually love to connect with people who think about this a lot, and they're all here. But yeah. So actually running interpreter --local, where you could just pick a Hugging Face model and run it, just like that. I talked to the folks at GPT4All and got to hear about Vulkan two days ago, and that might be the way. So that's one version of it, obviously: getting it to run any of these models that we're talking about, and seeing how they handle code interpretation.
Speaker 2 But the idea of it downloading a model itself, and running it and doing things, that's a lot of fun. So, today we released R support. We're trying to get it to think in and run any programming language.
Speaker 2 It's basically a really simple idea, Open Interpreter. I'm sure you can all intuit it: you're just giving the language model access to one function, and that's execute. So it just executes code; you let it tell you what language it is and what code to run, and then we handle the real-time output of that really well and just make it work, so that state is saved between all these different programming languages when it's running the different blocks.
Speaker 2 So R support came this morning, and it's been a lot of fun to talk to this thing about training, like, linear regression models. So I'm going to put out a demo pretty soon to show the R support, where we're playing with it actually training a model on a dataset that's on my desktop. Just like, hey, I have these two folders on my desktop, can you use R and train a regression model or whatever.
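The one-function design described above can be sketched as a dispatcher from a language name to a subprocess. This is a simplification: the real tool streams output and keeps persistent per-language processes so state survives between blocks, while this version starts fresh each call, and the non-Python entries assume those interpreters are installed:

```python
import subprocess
import sys

# Map each language the model may name to a command that runs a code
# string. Entries other than Python are illustrative and require the
# corresponding interpreter to be on PATH.
RUNNERS = {
    "python": [sys.executable, "-c"],
    "shell": ["bash", "-c"],
    "r": ["Rscript", "-e"],
    "applescript": ["osascript", "-e"],
}

def execute(language: str, code: str) -> str:
    """The single tool exposed to the model: run `code` in `language`
    and return whatever it printed (stdout, falling back to stderr)."""
    cmd = RUNNERS.get(language.lower())
    if cmd is None:
        return f"unsupported language: {language}"
    result = subprocess.run(cmd + [code], capture_output=True, text=True)
    return result.stdout.strip() or result.stderr.strip()
```

Adding a language, as with the R support mentioned above, then amounts to adding one entry to the dispatch table.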
Speaker 2 There's also stuff that people have been playing with: can the interpreter run the interpreter? We're playing with interpreter --async, which would actually let you ask Open Interpreter to do something, and then it would spin up other instances of Open Interpreter to go off and do it.
Speaker 2 Because right now it runs all in one thread, in a synchronous way. And a lot of people have played with opening multiple terminal windows. You can do that just as well, talk to all of them at once, and they'll all just be working away, dealing with different tasks. So if you want different instances running, it's truly really easy to do now.
Speaker 2 But anyway, yeah: language models running language models, or other stuff. I don't know if that answers your question.
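The multiple-terminal-windows pattern amounts to running independent processes in parallel. A minimal sketch, with plain Python snippets standing in for the prompts you would hand to separate interpreter instances:

```python
import subprocess
import sys

def run_in_parallel(tasks):
    """Launch one worker process per task and collect their output.

    Each `task` here is a Python snippet standing in for a prompt you
    would hand to a separate interpreter instance.
    """
    procs = [
        subprocess.Popen(
            [sys.executable, "-c", code],
            stdout=subprocess.PIPE,
            text=True,
        )
        for code in tasks
    ]
    # Popen returns immediately, so all workers run concurrently;
    # communicate() then waits for each one in turn.
    return [p.communicate()[0].strip() for p in procs]
```

An `interpreter --async` mode would presumably do something similar under the hood, with each worker being a full interpreter session rather than a one-shot snippet.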
Speaker 1 It does, beautifully, and it takes us to the next point. So, running this async, and the concept of sub-agents, has been brought up in multiple agent conversations, right? Like, maybe an orchestrator agent that's good at splitting up tasks, and then a specific sub-agent per task. So you're saying there's a way to generalize all this with code: just one tool, one task.
Speaker 1 You have one tool in your toolbox. This is the interpreter: you write the code and you run it. And so the interesting thing I wanted to talk about, that I noticed, which is obviously only possible in Open Interpreter, is that when it runs on a Mac, it uses AppleScript. It knows that it's running on a Mac, so it uses AppleScript.
Speaker 1 And AppleScript, this is an amazing thing for folks in the audience who don't know: it can drive native, I think only native, but please tell me if I'm wrong, native macOS applications. There is a way to run them automatically; it actually clicks all the buttons and does all the features, like open and click, etcetera. And this is the way that you showed, in your incredibly well-produced launch video, that you can have it send an email.
Speaker 1 Right? It really ran the AppleScript that opens Mail and then creates a new message? Could you talk about AppleScript specifically, how well this runs on a Mac machine, but also, what are the alternatives for Linux and Windows? Does it work the same?
Speaker 2 Yes. Yeah. I cannot believe how lucky we are,
Speaker 2 and how clear it is that something like this is going to exist, in theory, because Apple has just a total ecosystem: you can control so much about your Mac. Even on iPhone, by the way. I want to mention this because there are some folks in the Open Interpreter Discord talking about making an iPhone app. There is apparently a way to communicate between the different applications on your phone, such that if you had something like this, you could ask it to open the trunk of your Tesla. It has such incredible communication between apps. Apple has just so nailed that.
Speaker 2 And, yeah, it's interesting: I didn't actually know that it could do that. I sent it to a friend a few days before releasing it, and he was like, yeah, I asked it to summarize my last 3 emails and then say that summary out loud.
Speaker 2 I was like, no, I don't think you did. That doesn't sound right. How would it do that? And he's like, well, it's on a Mac, and it just used shell, because it didn't have AppleScript support at the time. It just had shell.
Speaker 2 It used shell to run some AppleScript to get the last 3 emails, and then it used say, which is just a native macOS thing, to say it out loud. And so then I was like, oh, holy shit. That's when... so now AppleScript is supported as a language it can run.
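The anecdote boils down to two commands macOS ships with: `osascript` to run AppleScript from the shell, and `say` for text-to-speech. A hypothetical sketch of that workflow; the AppleScript itself is illustrative, not the script the model actually generated, and only the command construction is exercised off-Mac:

```python
import subprocess

# Illustrative AppleScript: ask Mail.app for its three newest message
# subjects. The exact script from the anecdote isn't known.
SUMMARY_SCRIPT = (
    'tell application "Mail" to get subject of messages 1 thru 3 of inbox'
)

def osascript_cmd(script: str) -> list[str]:
    """Build the argv that runs an AppleScript from the shell."""
    return ["osascript", "-e", script]

def say_cmd(text: str) -> list[str]:
    """Build the argv for macOS's built-in `say` text-to-speech."""
    return ["say", text]

def speak_mail_summary() -> None:
    """macOS only: fetch the subjects, then speak them aloud."""
    subjects = subprocess.run(
        osascript_cmd(SUMMARY_SCRIPT), capture_output=True, text=True
    ).stdout.strip()
    subprocess.run(say_cmd(subjects or "no mail found"))
```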
Speaker 2 I think that's... you'll probably also notice that the trailer is quite a lot based on the Windows Copilot trailer that came out. They also hit the same beats, of like, it works with your apps, it works with your documents. I was like, it's not broken, I'm not going to fix it, so I did the same thing. Windows is going to have this with Windows Copilot, and I see Open Interpreter as kind of the Linux of this type of system. That's really what we want to be: something open source and cross-platform. I think it's going to be the only way to do it on Linux for a little while, and then for the privacy-minded Windows and Apple folks who want everything running locally, they can use Open Interpreter as this kind of, you know, copilot for your whole computer.
Speaker 1 That is amazing, I agree. And it's unfortunate that it doesn't work as well for non-native applications, like apps built on web technologies. Hopefully folks will find a way there. Killian, this takes me to the next question real quick: you are the creator of this thing, right?
Speaker 1 You modeled it on other things, let's say, but you're now the creator, and still you sent it to somebody and he blew your mind. And there's this concept I've talked about in AI for a long time, which is the imagination unlock. With many of the things these new tools give us, we don't know what to do with them until the collective imagination unlocks, and suddenly we're like, oh, shit, we have this power now?
Speaker 1 Oh, crap, we can now do this. With whatever success you've had, and feel free to talk about the numbers, stars, whatever, one of the reasons something like AutoGPT also rose through the ranks and got very famous very fast is not the execution of it; AutoGPT is not the best performing agent. It's the size of the community, the support, and the folks showing each other things. The same with Midjourney.
Speaker 1 Right? You're in their Discord, you see other people's prompts, and suddenly you're like, oh, shit, I can do this and this. How are you thinking about the imagination unlock, now that it can run any code on anything? In terms of bootstrapping the community that now exists around Open Interpreter, and these new skills that we don't yet know how to use, right? They're all new.
Speaker 1 Are you thinking about this area, the imagination unlock of the tool, basically, for humans?
Speaker 2 Yeah, the imagination unlock of this. To me, the thing here is this community that's sprung up around it. When you have a general purpose tool... a lot of people before me told me to narrow down and to find a vertical.
Speaker 2 And I think it's just not time for that right now. This is a once in a lifetime thing, and who knows if it will happen again, that we have a general purpose technology, like at the beginning of computing, to do something general with. It's totally different. So then, trying to figure out how to use it, you get the blank cursor problem.
Speaker 2 I've heard that a lot from people that use it. They're like, this is amazing, I saw the trailer and I downloaded it, and then I'm like, can you make a text file on my desktop? Like, I don't know what I'm supposed to do. When it's that open, it's really the community that comes up with all these incredible workflows.
Speaker 2 So there's this one guy in Japan who made a workflow that actually cuts a clip from YouTube. You paste in a YouTube link and ask it to cut a clip from a certain segment. It uses yt-dlp, downloads it, subtitles it, hardcodes the subtitles in, translates it to Japanese. It's going to be those kinds of things, where I think people from all these different industries are figuring out what workflows this thing is good for. And I've talked to product managers, and they say, I want to use it for this and this.
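A workflow like that mostly reduces to shelling out to yt-dlp and ffmpeg. A hypothetical sketch of the two commands involved; only the argv construction is exercised here, since actually running them needs both tools installed and network access:

```python
def download_cmd(url: str, out: str = "video.mp4") -> list[str]:
    """yt-dlp invocation to fetch a YouTube video as MP4."""
    return ["yt-dlp", "-f", "mp4", "-o", out, url]

def clip_cmd(src: str, start: str, end: str, out: str = "clip.mp4") -> list[str]:
    """ffmpeg invocation to cut the [start, end] segment out of the
    downloaded file without re-encoding (-c copy)."""
    return ["ffmpeg", "-ss", start, "-to", end, "-i", src, "-c", "copy", out]

# An agent would run these in sequence, roughly:
#   subprocess.run(download_cmd("<youtube url>"))
#   subprocess.run(clip_cmd("video.mp4", "00:01:00", "00:01:30"))
```

Subtitling and translation would be further steps in the same chain; the point is that each step is just another command the interpreter can generate and run.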
Speaker 2 Is it good at that? I'm like, I don't know, I'm not a product manager. I was going to school to be a middle school science teacher before this. I have no idea what kind of cool stuff there is across all these different domains. We're going to look to really organize that soon, because I think what would be great is something like Open Interpreter for industrial design or something, openinterpreter.com slash whatever, where we would actually organize all of the workflows for all these different industries and verticals, because to me it's such a general purpose tool. There definitely also needs to be a desktop version, which would not run in your terminal.
Speaker 2 It could just be an app that you download from, like, the Mac App Store or something, with the kind of interface that suggests what you're going to do next. So, you know, maybe one of those suggestions is going to be: cut a clip from YouTube, make a music video. There's really cool stuff using Replicate in Colab right now, like turning every single frame of a YouTube video into an image-to-image version of itself, which makes this incredible thing.
Speaker 2 And the idea of opening all that up to artists and people who can't code is really the exciting thing about this to me. That part I understand, from an art perspective, because that's my whole world. I dropped out of high school to make dubstep, so those two use cases, music and art, and a little bit of teaching, I understand. But the imagination unlock comes from everybody sharing their...
Speaker 2 and I guess you'd call them workflows. That's the value, I understand, of this thing. It really is a simple tool; I'm trying to make it as simple as possible. I also actually want to say this, and I think people might find it kind of interesting, because it goes back to this idea of an orchestrator agent running other agents.
Speaker 2 I think we've been kidding ourselves that we know how these things should plan. And that's the stuff about AutoGPT and about BabyAGI that just does not jive with me. I read, I think it was called The Bitter Lesson, an essay from a few years ago, that just totally spoke to me.
Speaker 2 It was actually about training machine learning models, which I'm not anywhere near. And the point was that you just cannot expect that your domain expertise is going to be better than what the model can learn. If you just give it enough data, if you give it enough compute, it is going to learn the structures of these things, and anytime we try to inject ourselves and be like, no, come on,
Speaker 2 like, as humans we think we would know this kind of stuff. We don't. It's going to be better if you just let it do it on its own. So Open Interpreter is really an experiment in getting as close to the metal as possible, in just using the raw intelligence of these things to get tasks done.
Speaker 2 It's something that has no planning mechanism, and it just runs in a single thread, and I really think that's the future of this stuff. And it's banking on context windows getting bigger, obviously. I just think that if you get in the way of these things, by trying to build in some planning mechanism where it will first write a dependencies file and do all this stuff, you're getting in the way.
Speaker 2 And even the orchestration stuff, I'm not convinced that it calling other agents and doing some orchestration is any different from this. I kind of think that the way this thing is going to run is in a single thread, as close to the metal of its intelligence as possible.
Speaker 1 Awesome. This takes me to... So first of all, reflecting on what you just said in terms of the community providing different workflows or ideas. Right? The first thing that we started doing after the Open Interpreter space that we had as it came out...
Speaker 1 We're like, whoa, this is so amazing. I started collecting, with a hashtag called #openinterpreter, different things that I saw on Twitter, just because there wasn't any place for it. So folks, feel free to check out this hashtag; there's quite a lot there. And we were like, we'll run a group where folks try to, like, break the limits, but also share approaches.
Speaker 1 Then OpenAI kind of rebranded their tool to Advanced Data Analysis, because very quickly it turned out to be, like, hey, upload a few CSV files and you can do a bunch of stuff where you graph them, etcetera. It feels to me that this is so much bigger than just data analytics. Like, if it runs on my computer, and once you get, like, vision in there, it's way bigger than just data analytics, and folks will just need to know what to do with it. And I think one big thing that the community here can bring is those exact examples. And, like, planning ahead on tasks, what you said, that you don't feel that's the way: knowing in advance, from your position, what your tool is going to be used
Speaker 1 for, is also probably unrealistic compared to the collective human intelligence. So opening this up to the open source community, I think, is very interesting. So, connecting the dots to the next thing I wanted to ask you about, in terms of open source specifically. Right now the best way, I think, to run Open Interpreter is via a GPT-4 API token that you paste in or already have, and then maybe in your desktop application, which we can talk about later, this will be baked in. In addition, there's a local mode.
Speaker 1 And here on stage we have multiple folks from Alignment Lab, folks who fine-tune and know the best coding models. We've talked about a few of them today: Phi-1.5, StarCoder. All of these models get, like, very high code execution scores on tasks. Right? So they're not general. They can't maybe talk as well as GPT-4.
Speaker 1 GPT-4 is still the best at, like, code generation, but still, these models get, how should I say, impressive results in coding. However, they're fully local. How are you thinking in terms of... there's already the local mode, can you talk about that, and then also about specific models for task execution? Because one last thing that I wanna add here is what we know about GPT-4 and Code Interpreter.
Speaker 1 So the Code Interpreter GPT-4 in the UI was probably fine-tuned to also fix its own mistakes and also know whether or not the code executed. You now have opened up a way bigger domain in understanding whether or not the task was successful. So those are the two big questions I wanted to ask: how do you know if a task was successful, and how do you use local models to kind of improve that?
Speaker 2 Yeah. Yeah. So what's funny is that this was the intended experience of Open Interpreter, as you mentioned, with GPT-4. I built it with GPT-4, and it was, like, a few days before I was gonna put it out that I sent it to a friend. The same friend, by the way,
Speaker 2 Joe, and he was like, you should put Llama in it. I think people would like that. And I was like, oh, I don't know anything about that, because I'm just so, so far away from figuring out how to get these things to work locally. And so I...
Speaker 2 played around with llama-cpp-python, and it worked on my machine, and I put it in, and it has just been... I did not anticipate that that would be the thing people really wanted, but it is. Because there's this idea of keeping everything totally local that you can have with the code interpreter experience, or, what I think, yeah, you're right, is more open: it is a much more open-ended and interesting thing to talk about as kind of a natural-language interface to your computer. Something much bigger than advanced data analytics. But the idea of having that run fully locally is really exciting to me.
Speaker 2 And Alignment Lab, by the way, has been an incredible resource and help in this, and Austin, if you're here, and Blaze, they've just been fantastic.
Speaker 1 You mentioned Alignment Lab. We had an interview with Austin from Alignment Lab before; I see him in the audience. It's an organization of incredible people doing incredible open source work, and dear friends of the pod. So let me bring up Austin. Welcome, Austin.
Speaker 3 No, thank you. That's really nice of you. I just wanted to say...
Speaker 3 Like, I think for a lot of people the big appeal of having a local model doing it is... for people like me who have used GPT-4 frequently for a while, it's less gross when a model talks to you and doesn't sound exactly like GPT-4 every single time. Also, it's free, and we're GPU-poor. We gotta save our pennies.
Speaker 2 Yeah. Yeah. That's a huge part too. GPT-4 is really expensive, and it's just not tenable. But...
Speaker 2 that's gonna be how much it costs to use a system like this. And the idea of it being totally free is really exciting. So, yeah. And by the way, it's interesting that we just launched with Code Llama. And it was... by the way, just...
Speaker 2 Austin is, like, one of the smartest people I know. It's amazing to think that there's, like, a future here. And anybody who's kind of interested in this, in locally running language models, please do get in touch, because this has been a really hard problem: all of a sudden, all these people across all these different operating systems wanna install local language models and just have them work. And we're just...
Speaker 2 Yeah, but the idea of running it entirely on the CPU is one of those things that we talked about that is just brilliant. Because all the complexity around the GPU is really what a lot of people are facing with the install of the local models. Anyway, yeah, local is really exciting, not just for privacy but for cost. And there's the idea, too, that you might be able to even train a model on this workflow, where, you know... it's interesting how much of the dataset you might be able to throw out, because this thing is not writing, like, long-form functions. It's not trying to be a software developer and write large applications and things. It's not that.
Speaker 2 Use Cursor for that. This is about writing and running code right there. And this is, like, a particular use case where I do think we could probably get a really exciting local model that's just built for this. And, I don't know, I haven't looked into whether the OpenAI model is trained any differently. It honestly behaves very similarly.
Speaker 1 I think we don't know this for sure, but it's definitely a conclusion we've come to compared to the other one, because they were released around the time when they changed the models, I think around June, and then we saw that Code Interpreter behaves like the previous models, from before the change. And then we also saw a different kind of behavior in self-reflection and how it does different things.
Speaker 1 So, on the point of execution and fine-tuning and training: first of all, you're in the right place. Right? The folks who fine-tune open source models are here, and it's great that you guys are already talking. Second of all, it sounds like Open Interpreter, once it gets more usage, and it's already very popular, which I'd love your comment on, because I don't know if you expected this, is a great source of data collection as well.
Speaker 1 Right? OpenAI definitely collects every little piece that people interact with. This is why it's free. You now have many people using the tool, and there's definitely a place to collect a bunch of task-related data: did it work or not, did it mean this, etcetera.
Speaker 1 I want you to talk a bit about the explosion of this. Did you expect it, and how are you dealing with it? And could you please give a summary of some numbers as well? Because I think that's very pertinent for folks in something like this.
Speaker 2 As of yesterday, I haven't checked since, but it was number one on GitHub for the week. So it came out a week and a day ago, and it's at 20,000 stars. We passed that, I think, yesterday; maybe 21,000 now. All in a week.
Speaker 2 And for comparison, somebody was saying in the Discord that after 5 days we had the same number of stars as LangChain got in 5 months. So what LangChain did in 5 months, this did in 5 days, or the other way around. So it's been a lot of fun. I think you can just immediately play with it, and it's so easy to share and to make just incredible stuff.
Speaker 2 The imagination unlock that this represents... it really is about the people that are using it and the people who have come in. We've got 1,500 people in the Discord. I'd love for anybody here to come in and talk and riff about how to get these things running locally better. And it's been crazy.
Speaker 2 It's been crazy. I have never managed an open source project before. I barely even know how to use git. So I was so lucky, and it's just been a total alignment of the planets, that some of the most talented people I've met...
Speaker 2 just because they want this tool and they wanna use this tool, have joined. And they're in the Discord, and there's a fantastic product manager who's been instrumental in organizing the GitHub and making all this happen, just out of wanting a better tool. So that's great. Because, I don't know, you know, I definitely have been playing with these things for a long time. Like, I dropped out of high school to make dubstep, but then I did get my GED and went back, and went to Woodring College of Education at Western, up above Seattle.
Speaker 2 And then I dropped out when GPT-3 came out, because I was like, I'm either watching this happen or I'm a part of it. So I've been playing with these models for a long time, and I definitely am, you know, capable on that front. I really think I understand the quirks of language models and how to kind of twist them.
Speaker 2 But I definitely did not expect this. And, you know, I'm trying to learn as fast as I can about how to be a good maintainer and have an open source community. And if anyone here has run a large open source project, and I know there are incredible people in the room that have, I would also love to just hear from you. Right?
Speaker 2 Because, you know, it's a lot, and I didn't expect it at all.
Speaker 1 So again, huge congrats on the success of this. Right? Second of all, I wanna highlight the point you just made, where, like, you're fairly new to coding as well, and you came into the room and said, hey, there are bigger contributors, and that's all fine. Like, a lot of the stuff is just making it, and then being the point person; it sounds like you were that, and we're all grateful. I will add this additional thing: Austin from Alignment Lab told us about his path and how he started, here on stage, also fairly new. We're seeing this new wave of AI engineers: basically, a person who can intuit how to talk to these machines to extract value, and then has ideas and builds on those ideas.
Speaker 1 And this success, from what I hear, is partly coming from that. And so this is great. This is the age we're living in. Right? Like, people just
Speaker 1 stand up and talk to these machines, and then build tools for these machines. So definitely, at least to me, you, like everybody here in the audience who wasn't coding before and is now doing incredible things, are the embodiment of the AI engineer. And so I definitely wanna leave you some space to talk about how folks can help you. But to summarize: so far, there's a Discord going.
Speaker 1 Obviously, multiple people wanna run this in multiple environments, so if you folks in the audience have experience running on different environments, please go and contribute to Open Interpreter. There's a community here on Twitter that probably needs to happen as well, where people talk and share their ideas. So again, folks in the audience, I'll help Killian out a little bit:
Speaker 1 if you run Open Interpreter for a cool use case that you don't think anybody else has done, or even if you do think so, post it. Tag Killian, hashtag #openinterpreter, and let's have a thing going where we share the imagination unlock.
Speaker 5 Yeah, I'm having fun with it right now. I was just wondering: is there a guide, or has someone done an easy way to emulate the entire response of the OpenAI API, so that you can just run whatever model we choose as
Speaker 3 a local API, or
Speaker 5 a local API, yeah. Then you don't have to really swap in and out of
Speaker 3 the model. That'd be, like, really portable. Again, I'm uploading a Llama model right now, so I thought
Speaker 2 that's so cool.
Speaker 5 about throwing a code interpreter in there. By the way, we usually use these spaces as just, like, work meeting things, kind of like stand-ups. So... yeah.
Speaker 5 So feel free, if you have any problems or requests: a lot of library maintainers and stuff are sometimes in the audience. So yeah. That's awesome. That sounds good to me, by the way.
Speaker 2 But yeah. God, the DVD thing is so cool. I saw all of those. It's so cheap. It's, like, light; you're, like, encoding it in simple bumps that lasers read. It's insane to think of that thing spinning up a language model.
Speaker 1 This has been the experience of folks like Mike and Mr. Yang on the stage: many folks work in different areas, and then some folks meet. And I think Austin from Alignment Lab said the same thing. Right? Like, you don't know how big open source is until everybody knows everyone and everybody's friendly with everyone. So welcome to this group, and feel free to, like, ask for help, basically.
Speaker 1 And so, yeah, let me give you the stage on, like, how folks can find Open Interpreter, what they need to do to help, and what specific things you need or are thinking of in terms of contribution. We've often had folks ask for GPUs, and then they were given to them by the audience. So, yeah, the floor is yours in terms of how the community can support Open Interpreter.
Speaker 2 So, on the question about the idea that, like, this should be kind of an OpenAI-compatible endpoint, like, is there stuff for that? Yeah, there is. So this is what's coming by the end of the week, or maybe by Sunday.
Speaker 2 What you're going to find in Open Interpreter is that basically we're trying to refactor it so that essentially the language model just comes from an OpenAI-compatible endpoint. Because there's some brilliant work done by the folks at LiteLLM; it's a company that basically lets us just hit an OpenAI-compatible endpoint, and then that hits Claude, that hits Anthropic, that hits Cohere, as well as OpenAI, and they handle the cloud stuff. We're still trying to figure out the right provider for the local stuff. We played with GPT4All, which kind of behaves like this, though it's not an OpenAI-compatible endpoint, but it hasn't worked on some systems. We're not sure which way to go.
Speaker 2 And, yeah, Ollama has this too; I've been talking to Jeffrey Morgan about this, the maintainer of Ollama. It just spins up, literally, a localhost server that's an OpenAI-compatible endpoint. So as long as we can get everything down to that, and we're just talking to an OpenAI-compatible endpoint, theoretically you should be able to just flip between any cloud model you want and any local model you want. I've also talked to the LM Studio guys, who have been great,
Speaker 2 about building something that just spins up a localhost server: you pick a model from Hugging Face, and their thing is really like a distributor of locally running language models. So to just get something where folks can really easily tap in any local model is the dream. And, yep.
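The portability Killian is describing, where every backend speaks the same OpenAI-style HTTP API, can be sketched in a few lines. This is a toy illustration, not Open Interpreter's actual code; the URLs, port, and model names are all assumptions:

```python
# Toy sketch: once every backend exposes the same OpenAI-style HTTP API,
# swapping between a cloud model and a locally hosted one becomes just a
# change of base URL and model name. All values here are illustrative.

def endpoint_for(target: str) -> dict:
    """Return client settings for a given backend."""
    backends = {
        # hosted model behind the official API
        "openai": {"base_url": "https://api.openai.com/v1", "model": "gpt-4"},
        # a hypothetical local server exposing an OpenAI-compatible route
        "local": {"base_url": "http://localhost:8080/v1", "model": "code-llama"},
    }
    return backends[target]

cfg = endpoint_for("local")
```

An HTTP client pointed at `cfg["base_url"]` could then send the same chat-completion payload to either backend, which is the flipping between cloud and local models being discussed.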
Speaker 1 There's a community... I think window.ai, and OpenRouter, that you should probably also take a look at, if you haven't chatted with them. They also do local model inference in the style of GPT-4 as well.
Speaker 2 Oh, yeah. Yeah. And, by the way, yeah, that's great.
Speaker 2 And if anybody else has any other suggestions of people who have thought about this... because how to run language models is just not the problem that I think Open Interpreter should be trying to solve. It should just be standing on the shoulders of giants, of these brilliant people who have figured out how to run language models locally. But, yeah, on the idea of getting this behind an OpenAI-compatible endpoint, one thing I wanna mention is that right now, this thing is really heavily entangled with GPT-4, and to an extent Code Llama and those sorts of prompt templates, and with the terminal, because the terminal GUI is basically how this thing is meant to be used right now. We're gonna untangle that.
Speaker 2 So on Sunday, the idea is that you're actually just gonna be able to import interpreter and then use this as, like, a Python generator. I really think there's an interesting thing that could happen here where you can build Open Interpreter into the stuff you're working on. The goal is really to be the Linux of this space, this really early space of AI code interpretation, and find the right way to be this open source nexus between flexible, intelligent language models and this rigid, deterministic, powerful world of computing. Like, what is that connective tissue?
Speaker 2 And so Open Interpreter is an attempt to be that, and what I really want it to be is easy for developers to put into stuff. So I really want it to just be a Python generator where you put in a string of a message somebody sends, and then you can just start yanking tokens out of it. You yank out the tokens of the message it sends; then, when it goes to run code, it just runs the code, and then you start yanking out the output, line by line of what the code outputs. So it's really easy to build this thing in.
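The generator interface Killian sketches, yank out message tokens, then the code, then the output, might look roughly like this. Everything here is a toy stand-in: the `Chunk` type, the canned reply, and the `chat` function are invented for illustration, not the real Open Interpreter API.

```python
import contextlib
import io
from dataclasses import dataclass
from typing import Iterator

@dataclass
class Chunk:
    kind: str   # "message", "code", or "output"
    text: str

def chat(user_message: str) -> Iterator[Chunk]:
    """Toy generator mimicking the described flow: stream the reply,
    stream the code the model 'writes', execute it, stream the output."""
    # 1) Stream the assistant's natural-language reply token by token.
    for token in ["Sure,", " running", " it", " now."]:
        yield Chunk("message", token)
    # 2) Stream the code it decides to run (canned here for the demo).
    code = "print(2 + 2)"
    yield Chunk("code", code)
    # 3) Execute the code and stream its output line by line.
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        exec(code, {})
    for line in buf.getvalue().splitlines():
        yield Chunk("output", line)

chunks = list(chat("what is 2 + 2?"))
```

A host application can consume this exactly the way Killian describes: render the message chunks as they arrive, show the code chunk, then append output lines as the code runs.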
Speaker 2 So we're working on exposing that. And that plays very much into this idea of making it an OpenAI-compatible endpoint. And to get further along in this, and to make that happen... most people come from GitHub.
Speaker 2 It's been incredible to see that the most talented people just see it on the trending page, and just come and join the Discord and say, hey, I saw it on the trending page. So it actually matters a surprising amount, and if you could go star the repo, it actually contributes to the project a lot, just to get it seen. I'm noticing that that's where pretty much all of our traffic comes from.
Speaker 2 So that's a great way of supporting it. But really, the main thing is that most of the issues people are having with it are about getting language models running locally. This is a problem where I just would love to find the right people to work with, to make this something that we abstract away. So really, we're just kind of...
Speaker 2 we're talking to an OpenAI-compatible endpoint on someone's localhost, and finding something that really works across all these different systems we're testing has been a challenge. And this does seem like the room to ask. And by the way, here's what we're working on with this, and this might be fun for people who think about
Speaker 2 prompting. The model doesn't have to be a function-calling language model. Obviously, most of these models are not; it's just OpenAI that has function calling. So what we actually ask the model to do, and this is why any language model at all can use Open Interpreter, is we just say: hey, when you write a markdown code block, when you print one,
Speaker 2 so, three backticks and then the language, it will be run, and then you'll be given the output afterwards. So then suddenly... that's kind of a function call. So you give that ability to, like, any language model, and that's how we did it with Code Llama and all these other ones. It will just write it in markdown and specify the language, and then we just grab that.
Speaker 2 And then once it closes the code block, that closes the function call and sends it off, and then it gets the output, and we feed it back into the prompt. So this works with any language model; it's not function calling. Yeah. And since the biggest problem we're having right now is getting locally running language models working: please reach out.
Speaker 2 That, to me, is the most important thing. People want to use this thing locally.
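The markdown-code-block convention described above, three backticks, then the language, then run it and feed the output back, reduces to a small parse-run-feed-back loop. A minimal sketch with an invented reply string and message format; the fence marker is built up programmatically to keep the example readable, and the regex is an assumption, not Open Interpreter's actual parser:

```python
import contextlib
import io
import re

TICKS = "`" * 3  # three backquotes, built up to keep this block fence-safe

# Match the first fenced block: language tag, then the code body.
FENCE = re.compile(TICKS + r"(\w+)\n(.*?)" + TICKS, re.DOTALL)

def extract_code(model_reply: str):
    """Return (language, code) for the first fenced block, or None."""
    match = FENCE.search(model_reply)
    return (match.group(1), match.group(2)) if match else None

# An invented model reply following the convention:
reply = "I'll check that.\n" + TICKS + "python\nprint(21 * 2)\n" + TICKS
lang, code = extract_code(reply)

# Run the extracted code, capture its stdout, and form the next turn
# that would be fed back into the prompt.
buf = io.StringIO()
with contextlib.redirect_stdout(buf):
    exec(code, {})
next_turn = "Output of the last code block:\n" + buf.getvalue()
```

Because the loop only depends on the model emitting an ordinary markdown fence, any model that can write markdown gets this pseudo-function-call ability, which is the point being made.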
Speaker 5 Okay. The biggest problem we're having with the Hydra MoE model is, like, getting very good usability. We realized we haven't developed the UI as much, and actually we haven't developed automation in general as much.
Speaker 5 So I'm really glad that there are other people tackling these issues, so that people are a little bit less overwhelmed with the amount of tasks and stuff they're taking on. But, yeah, you're welcome to join in and contribute. I'm gonna actually try and just drop Open Interpreter in there. And there might also be some possibility... actually, yeah, I wanted to ask about this. Do you think, for example, on Linux, the models could be trained better for instruction following?
Speaker 5 Or for things like, you were using AppleScript on Mac for things, like using Selenium or Cypress to automate some of the web browsing work. Do you think the open source models could use another fine-tune for instruction following in this case? And what has been your experience with getting automation to work in general?
Speaker 5 Have you only gotten it to work on Mac and Windows? Can you tell us a little bit more about it?
Speaker 2 Yes. So, what we do is we pass into the prompt, and we say: this is the user's platform. So it works on Linux, and it works on Windows.
Speaker 2 It really works with GPT-4; the problem is getting the locally running language models to work across all the different platforms. We've seen it work across all three of them, but it seems like just crazy architecture stuff that has to be worked out. But, yeah.
Speaker 2 It is capable of running the equivalent command to, like, mdfind on a Mac, which uses Spotlight; if you ask it to, like, open up something, it will use the equivalent, because it knows whether you're on a Windows system or on a Linux system. So, like, I use it on Replit and Google Colab, and those are Linux. So it does. I don't know if I answered the question, but...
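Passing the user's platform into the prompt, as described here, can be as simple as string formatting. The prompt wording below is an assumption for illustration, not Open Interpreter's actual system message:

```python
import platform

# Sketch of the platform-aware system prompt described above: tell the
# model what OS it is driving so it picks native commands. The exact
# wording is an assumption.
def system_prompt() -> str:
    return (
        "You can run code on the user's machine.\n"
        f"The user's platform is: {platform.system()}.\n"
        "Prefer commands native to that platform, e.g. Spotlight's "
        "`mdfind` on macOS or `where` on Windows."
    )

prompt = system_prompt()
```

With that one line of context, the same model picks `mdfind` on a Mac and the equivalent elsewhere, which is the behavior described.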
Speaker 2 Oh, yes. And the idea, too, of fine-tuning this. God, there's something really interesting about that, because, again, it's a subsection of what these language models do. The idea of getting something small that runs on consumer hardware that's capable of this workflow of self-reflection on code outputs is really exciting.
Speaker 2 There's even exciting stuff, too, in the fact that we have such a clean separation between when it needs to be chatting and when it needs to be coding. One of the best experiences I've had with it was actually to download both Llama 2 chat and Code Llama, and then have this kind of hard mixture-of-experts situation where, when it's talking to me and it's planning and it's thinking and it's clarifying, it's Llama. Then, as soon as it opens up those three backticks and starts code, we switch to Code Llama. And so...
Speaker 2 There are really interesting things there, too. That's not fine-tuning; that's getting a little bit further by saying, we at least understand that about the problem and its workflow, so we can kind of wrap what we're doing around it. But I think there's so much to explore in the idea of fine-tuning it, and I'm just not the person. I don't know how to fine-tune.
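The "hard mixture of experts" hand-off described here, a chat model for prose, a code model once a fence opens, can be simulated with two stub token generators. Both stubs and the routing rule are illustrative assumptions; in practice each generator would be a separate locally running LLM:

```python
from typing import Callable, Iterator

TICKS = "`" * 3  # the fence marker, built up to keep this block fence-safe

def chat_stub(prompt: str) -> Iterator[str]:
    # stand-in for a chat model (e.g. Llama 2 chat): prose, then a fence
    yield from ["Here's ", "a script:\n", TICKS + "python\n"]

def code_stub(prompt: str) -> Iterator[str]:
    # stand-in for a code model (e.g. Code Llama): code, then close fence
    yield from ["print('hi')\n", TICKS]

def hard_moe(prompt: str,
             chat_model: Callable[[str], Iterator[str]],
             code_model: Callable[[str], Iterator[str]]) -> str:
    out = []
    for tok in chat_model(prompt):            # prose: chat model drives
        out.append(tok)
        if tok.endswith(TICKS + "python\n"):  # fence opened: hand to code model
            for code_tok in code_model("".join(out)):
                out.append(code_tok)
                if code_tok.endswith(TICKS):  # fence closed: hand back
                    break
            break
    return "".join(out)

reply = hard_moe("write hi", chat_stub, code_stub)
```

The routing trigger is the same backtick convention the harness already watches for, so no extra model capability is needed to make the switch.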
Speaker 1 You found the place. You found the people
Speaker 5 On the data you're generating: just really quickly, I wanna ask, are you keeping the logs? And is there a way, if we were to ask people in the community, for them to share the logs of their automation workflows, like, what they're getting from it?
Speaker 4 Yeah.
Speaker 5 That would be nice. If...
Speaker 3 That's gold. That data is the most valuable data ever generated.
Speaker 2 I think it would need to be really clear that you were consenting to it. But the idea that folks would be willing to accept, you know, to send it, so that we could make something for the open source community, that would...
Speaker 4 Yeah, so real quick, I was just gonna say that even if 1 percent of the users, especially considering it's already trending on GitHub and everything, even if 1 percent of the users explicitly opt in to have their data logged and sent to some centralized place that you can access and use for fine-tuning, that's super valuable data. If we end up fine-tuning a model specifically to do AppleScript, I think it could be amazing and way better than what GPT-4 can do, and simultaneously it can have a bunch of speed benefits. Just imagine something that's able to do all of these automations and all of this Open Interpreter stuff way faster than GPT-4 can do it.
Speaker 1 I wanna add multiple things to this, Killian. First, off the bat: I was the interviewer first, but I also have some ideas. One of them is LangSmith, which does all the logging of everything. So if you add LangSmith, you'll actually see the runs, if you ask a person to do this.
Speaker 1 Very easy. Like, a centralized place will be there. It will also make it easy to share what my session was versus your session. And second of all, we've had sponsorships before. We've talked about sponsorships.
Speaker 1 There's potential there: if somebody sponsors users' GPT-4 tokens, even if OpenAI sponsors you, then, you know, in return for that, they could share their sessions, kind of like the OpenAI model. That could also work. Killian, I wanna... first of all, huge congrats on this, and thank you for joining us and talking with us.
Speaker 1 It's great to see folks benefit from Open Interpreter and already adding it to their stuff. You can now consider yourself a friend of the pod; please feel free to come back here at any point. Thank you. And with every new release that you have, feel free to come here and chat with us about it.