Summary: LangChain just launched their new "LangSmith" platform - YouTube (www.youtube.com)
2,512 words - YouTube video
LangChain just announced their new platform called LangSmith, and it's a pretty big thing. It's a unified platform for debugging, testing, evaluating and monitoring your large language model applications. This video will be a brief first look at the platform. We'll go over what LangSmith is, why it exists, how you can use it and what you can actually do with it. So let's start with who this is actually for. As most of you probably know (otherwise you probably wouldn't have landed on this video), LangChain exists to make
it as easy as possible to develop large language model powered applications. This is what I've been diving into over the past couple of weeks, and I've made tons of videos about it as well. LangChain is just awesome for building applications with large language models. I've been experimenting with it both personally and professionally with the clients I work with, and LangChain has enabled us to develop prototypes pretty quickly. That was the first main blocker when you wanted to build something around large language models, as they highlight in this article. But now we're entering the next phase, where people know how to build these prototypes in literally like five lines of code with LangChain. Now the problem is: okay, how do we take that prototype, that proof of concept, and put it into production? How do we turn it into an actual application that users and companies can rely on for their workflow?
So that's really where LangSmith is coming from. It's a platform to help developers close the gap between prototype and production. LangSmith really is a platform for you if that is your goal: you don't just want to play around with large language models and LangChain, you actually want to put them into production, create applications and potentially implement this for your own company or sell them as a service or product to potential clients. And if that sounds like you, then make sure to like this video and also subscribe to the channel. I'm going all in on this and taking you guys with me on the journey.
Now, the platform is currently in a closed beta, and unfortunately I don't have access yet. You can already go to the website, smith.langchain.com, sign up and apply for the waitlist. I'll be eagerly waiting until I can get access and explore it further, but the documentation is already online, so we can already get a sneak preview of what it's like. And there's also already an official video available from LangChain going over the platform and what you can do with it.
So I basically went through all of the information that is available right now to give you a quick summary of everything you need to know. Now, this announcement and launch of LangSmith really came at a great time for me personally and for my company Datalumina, which is an e-learning business that also does consulting projects in the data science and AI space. We're currently working with a couple of clients where we're really trying to go from proof of concept for a large language model application to production, and we have literally been getting questions about, okay, how do we monitor and evaluate these applications over time? We were already looking internally into creating some kind of logging system to keep track of all of that, but now LangSmith is here to take care of it.
Probably the best way to illustrate the added value, the benefits of using LangSmith, is by giving you an example. As you all know, these large language models are stochastic in nature; they're non-deterministic, which means that every time you query them you can get a somewhat different result. You can tweak the temperature to make them a little more deterministic, but if you set up a system that takes user input, then depending on what kind of question the user asks the system, you won't always get the same result back. Every user is different, and everyone is going to interact differently with these applications. So that introduces a lot of ambiguity, right?
So let me actually show you an example from a recent project that I showed here on my YouTube channel. Here we're building a simple agent with LangChain, using SerpAPI and a math tool, and we ask: what is the median salary of a senior data scientist in 2023, and what is that figure if there's a 10 percent increment? Under the hood this is doing a lot of things. Let me just run this and show you what we get as an output. So it's using Google search and it's using a math tool to get these results. It loads for quite some time, and then it says the median salary of a senior data scientist in 2023 with a 10 percent increment, and then we get the total number. So this is the kind of answer that you would put into an application.
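To give you an idea, here's a minimal sketch of what that kind of setup looks like in code, assuming the standard LangChain agent API with the serpapi and llm-math tools; the model choice and the exact wording of the question are just illustrative:

```python
from langchain.llms import OpenAI
from langchain.agents import load_tools, initialize_agent, AgentType

# Assumes OPENAI_API_KEY and SERPAPI_API_KEY are set in the environment.
llm = OpenAI(temperature=0)

# "serpapi" gives the agent Google search, "llm-math" gives it a calculator.
tools = load_tools(["serpapi", "llm-math"], llm=llm)

agent = initialize_agent(
    tools,
    llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=False,  # flipped to True later to see the intermediate steps
)

result = agent.run(
    "What is the median salary of a senior data scientist in 2023, "
    "and what is that figure if there's a 10 percent increment?"
)
print(result)
```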
So you build this whole layer, all the logic, all the code, behind some application, behind the front end, and then users interact with it and ask a question. That would be the input over here, and then they get this as the output. But okay, what happened? What did it do? What were the steps?
How many tokens did we use? What is the API cost for this? All of that information is of course something you want to log in an application, because what if this answer is totally wrong? You of course want to monitor that. Now, what we can do is set verbose to true within this run. We run it one more time, and then you can already kind of get an idea of what it's doing. It's entering a new chain: "I need to find out what the median salary of a senior data scientist is," and it goes through its reasoning. This is something really awesome that LangChain built: it has these steps, observations and thoughts, and then it gets to the final answer. So this already gives us a little bit more insight into how it came to this answer.
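In code that just means re-creating the agent with the verbose flag switched on; the trace shown in the comments is abbreviated and only meant to illustrate the shape of the output:

```python
# Continuing from the snippet above (same tools and llm), but with
# verbose=True so the intermediate steps are printed to the console.
agent = initialize_agent(
    tools,
    llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,
)
agent.run(
    "What is the median salary of a senior data scientist in 2023, "
    "and what is that figure if there's a 10 percent increment?"
)

# Illustrative, abbreviated console output:
#
# > Entering new AgentExecutor chain...
# Thought: I need to find the median salary of a senior data scientist in 2023.
# Action: Search
# Action Input: "median salary senior data scientist 2023"
# Observation: ...
# Thought: Now I need to calculate a 10 percent increase on that figure.
# Action: Calculator
# Action Input: ...
# Observation: ...
# Thought: I now know the final answer.
# Final Answer: ...
# > Finished chain.
```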
But how do we log this? Where do we store it? What are the token costs? Which user made the query? How do we ensure that this answer is correct? You can see how all of those questions really start to become important, and start to become a problem, as you go from "okay, it's now working here locally in VS Code" to "I can put this on a server and users can interact with it via an API." All of this is happening behind the scenes, and even more; so what do we do with that?
Okay, so I think that gives a lot of context. The way LangSmith is trying to help you with this is by logging your runs. From the documentation and examples available right now, the process seems very straightforward. Once you get access to the platform, you can get an API key and optionally link a project, and you just export those as environment variables. So you export them, and then you can just run your application like you normally would. You interact with LangChain as normal, and in the back end, via that API connection and endpoint, it's going to log everything.
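Based on the documentation, the setup looks roughly like the sketch below. The exact environment variable names (LANGCHAIN_TRACING_V2, LANGCHAIN_ENDPOINT, LANGCHAIN_API_KEY, LANGCHAIN_PROJECT) and the project name are my reading of the docs, so treat them as assumptions and verify them against the LangSmith documentation:

```python
import os

# Point LangChain at LangSmith; after this, runs are logged automatically.
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_ENDPOINT"] = "https://api.smith.langchain.com"
os.environ["LANGCHAIN_API_KEY"] = "<your-langsmith-api-key>"
os.environ["LANGCHAIN_PROJECT"] = "salary-agent"  # hypothetical project name

# Then run your chain or agent exactly as before (e.g. the agent from the
# earlier sketch); nothing else in the application code has to change.
agent.run(
    "What is the median salary of a senior data scientist in 2023, "
    "and what is that figure if there's a 10 percent increment?"
)
```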
So what can you then do? This is where it gets exciting. First of all, you can visualize your runs. What does that look like? Here we have an example of an agent executor, similar to what we were doing; this one is a math problem, a division problem. Here we have the output, and not only can we see all the steps the agent took to get to that answer, shown visually in a tree that we can drill down into (which I'll show in a bit), we can also see how many tokens were used, which translates to the costs. And, and this is something for evaluation, we can also see a reference output. So this is an agent executor that was run on an evaluation dataset: you can provide the correct answer and then really visualize, hey, is what's going on in my application still what I wanted it to be? Now, in the LangSmith launch video, they go over some examples and also show you a little bit of what's behind the scenes and how all of this works.
They have a notebook and go over a couple of questions, which you can see in the inputs list over here, and then they show how it all shows up in the LangSmith platform. So, for example, here you have all the inputs, meaning the questions they ran, plus start time, latency and token stats. You can already tell that if your application is running in production and users are interacting with it, this platform, this dashboard, becomes extremely vital for monitoring everything: how long is it taking? What are the costs? So I highly recommend, if you want to learn more about this, watching that video, because they really give you an in-depth overview of what you can do.
Here's the example again that I just showed, where they go over the division problem and show the tree. What's also really cool is that you can do a drill-down: you can click on any of the steps and see the actual prompt that was sent to the API. Here you can see the human prompt and then the AI intermediate step, so you can really visualize what's going on. Because if you use LangChain and you're running all these chains, also using user input, it's combining all of that information into various prompts that it then sends to the API, and then you get the information back. You want to monitor that, and it's all in there.
Now, another common issue that you will run into, especially when you're debugging these applications, is experimenting with the prompts, tweaking them to get the desired output. If an answer from a large language model is not correct, you want to tweak the prompt to get it right. That's experimentation, and they have introduced a playground for that where you can play around, similar to the playground that OpenAI offers. But the cool thing is that you can go from LangSmith, from your tree, straight to the specific point that you want to investigate, and then in the top corner you can click "Open in Playground." So here you can see "Open in Playground" and boom, you're in there with the prompt already filled in; you can run it and check the output. This is also where you can play around with different models, temperatures, etcetera. A very, very helpful feature.
You can also share your work: there's a share button to send a run over to someone else, and I've heard they're working on a team feature where you can collaborate on these environments. But right now, the best way to do it is to share it with other people. Another great feature is the ability to create datasets for testing and evaluation. You have an input and an output, and you can put this into a dataset and use it as a reference. This is what we saw all the way at the beginning, where we have an input with an output and also a reference output; those reference outputs you can get from a dataset. This can all be done either in the platform itself or via the API.
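As a rough idea of the API route, a sketch like the one below should work with the langsmith Python client; the dataset name and the example question-and-answer pair are hypothetical, and the exact method signatures should be checked against the docs:

```python
from langsmith import Client

client = Client()  # picks up LANGCHAIN_API_KEY from the environment

# Hypothetical dataset holding question/reference-answer pairs.
dataset = client.create_dataset(dataset_name="salary-agent-eval")

client.create_example(
    inputs={"input": "What is the median salary of a senior data scientist in 2023, "
                     "and what is that figure if there's a 10 percent increment?"},
    outputs={"output": "The median salary is ..., and with a 10 percent increment it is ..."},
    dataset_id=dataset.id,
)
```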
For monitoring the quality of these applications, this again becomes crucial. And then, based on those datasets, based on those inputs and outputs, they also have models in place for evaluation. In the video they show multiple of them. I still have to really look into all of them specifically, but you can get scores like correctness and helpfulness to really assess and evaluate your applications. Again, very powerful stuff.
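From the docs, it looks like those evaluations can also be kicked off from code against a dataset like the one above. The sketch below is my best guess using LangChain's evaluation helpers; the RunEvalConfig / run_on_dataset names, their arguments and the "qa" (correctness-style) evaluator are all assumptions to verify against the LangSmith documentation:

```python
from langchain.smith import RunEvalConfig, run_on_dataset

# Grade each run against the reference outputs in the dataset,
# e.g. with a correctness-style ("qa") evaluator.
eval_config = RunEvalConfig(evaluators=["qa"])

# Reuses client, tools and llm from the earlier snippets; the factory builds
# a fresh agent for each example in the dataset.
run_on_dataset(
    client=client,
    dataset_name="salary-agent-eval",  # hypothetical dataset created above
    llm_or_chain_factory=lambda: initialize_agent(
        tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION
    ),
    evaluation=eval_config,
)
```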
So that covers most of what you need to know right now about LangSmith. It's a unified platform for debugging, testing, evaluating and monitoring your large language model applications. And like I've said, this really is for you if you want to take things to the next level, from prototype to production. If that's something you're interested in, then make sure to subscribe to the newsletter in the first comment below this video. In that newsletter, I share tips, stories and best practices that I've learned from actually building and developing large language model applications for the clients I work with, and also the insights I get from my mastermind, Data Freelancer, where we connect with a lot of data professionals internationally. Make sure to check that out; it's completely free. And if you want to see more examples of how I actually build these applications, then make sure