Summary: Ethan Steininger on Mixpeek and the AI Landscape - Weaviate Podcast #42 (YouTube)
14,457 words - YouTube video
One Line
The text covers a wide range of topics including MongoDB, Weaviate database, language models, chatbots, AI limitations, user understanding, data-to-text, API companies, open-source software, retrieval and instructor models, historical data splitting, serverless GPU space, Kubernetes challenges, state management, convolutional kernels, AI benefits, and the future of AI.
Key Points
- Ethan Steininger is a software engineer and entrepreneur with experience in search and AI projects.
- MongoDB has a change stream API and a search architecture that replicates changes into an inverted index.
- Mixpeek is a multi-modal indexing, embedding, and search API for working with different types of files.
- Language models and AI have potential applications in the building and construction industry and the Internet of Things.
- Understanding end users and building exceptional user experiences are crucial for startups in the AI landscape.
- OpenAI and other companies have licensing restrictions on their code and models.
- The use of language models and search engines can complement each other in providing accurate and relevant results.
- The future of AI involves the development of domain-specific models, self-hosting capabilities, and different niche applications.
Summaries
456 word summary
Ethan praises MongoDB.
Real-time alerts and hybrid search via MongoDB's change streams and search architecture. Mixpeek solves the enterprise PDF search problem.
Multi-modal API simplifies file types, enhances document search. OpenAI plug-in marketplace brings paradigm shift. Weaviate is default vector search engine. Mixpeek's API streamlines database management, keyword search, file parsing. PDF ingestion made easy. Opportunities in LLM-enabled ETL for unstructured data.
Microsoft research: chatbot capabilities, startup crises. Startups: pain points, user experiences. Chatbot marketplaces: service links.
Mixpeek: language interface, marketplace. Copilot X, ChatGPT. Content creation framework.
AI limitations, licensing, training, feedback for models. Language models in IoT convergence. Study on monitoring buildings through IoT.
Language models and AI
User-friendly interface simplifies AI, open-source framework enhances language models.
Language models for knowledge graphs, chatbots, and SQL queries.
Modules, adapters, repositories, filtering searches, code structure, workspaces, vector search, and production use cases discussed. Engineers acknowledged.
Understanding users. Embedded search bar tracks user activity and creates activity sequence for specific metric. Learn-to-rank model reorders results based on conversion propensity. Translating user features to text and relying on a content-only cross-encoder. Translating tabular features into text-based inputs for transfer learning. Mapping user actions to text sequences for sequential inference.
Data-to-text prevents overfitting. Combined models yield fast results. Metarank services handle hosting, ingestion, and validation. Innovative approaches needed for versioning and refreshing. Weaviate excels in AI production tooling.
API companies uncertain, OpenAI dominates. Fine-tune prompts, niche AI models important. OpenAI app store, domain-specific knowledge focus.
Open-source software, BERT training, language models, learning to rank, hybrid search engines, original content, query embeddings, Facebook's DPR, OpenAI or Cohere embedding models, zero-shot model with lexical BM25 effective in 80% of cases, future of TF-IDF/BM25 string matches in vector searches uncertain.
Applications use retrieval and instructor models for intent and prompting. The ArguAna and BEIR datasets focus on counter-arguments and use cases. An academic dataset is needed for capturing the evolution of documentation over time. Tuning the model using the delta between stages captures the evolution of a corpus.
Historical data splitting, audience building, open source, podcast potential.
Closed source benefits, Mosaic manages.
AI company offers seamless search experience with two API calls. Cloud providers neglect serverless GPU space. Modal and Banana deploy models with serverless GPUs.
Kubernetes challenges, serverless offers solutions.
Server challenges: state management, replication consistency, distributed systems. Azure cloud, Ray, Neural Magic for GPU management, model compression. Future models: smaller size, compatibility with commodity hardware. Different architectures, hardware influence deep learning decisions (e.g. large matrix multiplication due to GPU popularity).
Convolutional kernels, GPU use, quantum potential, fear of AI. Breaks, camper van living. Mention of blog.
AI benefits, policy simulations, AI's control, plug-in marketplace.
AI future: digital species, improved chatbots, verification
Temperature adjustment; content verification; classifiers and code compilation; Mixpeek and colleague.ai; new search era; innovative thinking.
745 word summary
Ethan Steininger praises MongoDB and other search engines.
MongoDB's change streams enable real-time alerts and hybrid search. Mixpeek solves the enterprise PDF search problem.
Mixpeek's multi-modal API simplifies working with different file types, addressing challenges in extracting and searching document contents. The OpenAI plug-in marketplace is expected to bring a paradigm shift. Weaviate is a default vector search engine. Mixpeek's API calls streamline database management, keyword search, and file parsing, highlighting the ease of data ingestion, particularly with PDFs. It also presents opportunities in LLM-enabled ETL for unstructured data.
Microsoft research showcased chatbot capabilities, solving startup crises. Startups should focus on understanding pain points and building user experiences. Chatbot marketplaces serve as links between services.
Mixpeek: language interface, marketplace, $20/month. Copilot X, ChatGPT: code-writing tools. Content creation framework, no final copies.
AI limitations for commercial use and competing models. Licensing, training, and feedback for AI models. Potential for language models in IoT convergence. Interesting study on monitoring buildings through IoT.
Ethan Steininger on language models and AI.
User-friendly interface addresses AI complexity, open-source framework improves language models.
Using language models to create knowledge graphs, instruct chatbots, and write SQL queries.
Discussion on modules, adapters, repositories, filtering searches, code structure, workspaces, vector search, and production use cases. Team of engineers acknowledged.
Understanding the end user is key. One use case is an embedded search bar that tracks user activity and creates a sequence of activities that can be converted into a specific metric. Building a learn-to-rank model that can reorder results based on propensity to convert aligns with business goals. Translating user features to text and relying on a content-only cross-encoder is interesting. Translating tabular features into text-based inputs for transfer learning involves mapping user actions to unique text sequences processed by a language model and stored in a search engine for sequential inference.
Translating data to text avoids overfitting; combining models yields fast results. Metarank services handle model hosting, data ingestion, and validation. Innovative approaches needed for model versioning and embedding refreshing. Weaviate excels in AI tooling for production.
API-based companies attract attention, uncertain if OpenAI dominates. Users can fine-tune prompts. Niche AI models emerging, self-hosting important. OpenAI as app store, others focus on domain-specific knowledge.
Mosaic ML offers open-source software for BERT training. Language models and learning to rank are important. Hybrid search engines with original content are significant. Potential ideas for updating query embeddings include Facebook's DPR and the OpenAI or Cohere embedding models. The zero-shot model with lexical BM25 is effective in 80% of cases. The future of TF-IDF/BM25 string matches in vector searches is uncertain.
Applications use retrieval and instructor models for intent and prompting in search. Datasets like ArguAna and BEIR focus on counter-arguments and use cases. An academic dataset is needed to capture the evolution of documentation over time. Capturing the evolution of a corpus requires tuning the model using the delta between stages.
Importance of historical data splitting in AI models. Building an audience and receiving feedback. Open source as a content strategy. Potential of podcasts for collaboration.
Closed source can benefit marketplace businesses and communities. Mosaic helps manage open source projects.
AI company abstracts patterns, extracts files, offers search. Seamless experience with two API calls. Cloud providers neglect serverless GPU space. Modal and Banana deploy models with serverless GPUs.
Interested in Kubernetes and scaling resources. Curious about Weights & Biases in deep learning. Realized need for different resources in cluster management with Determined AI. Kubernetes challenging, but serverless environments offer query embedding models. Cluster creation and maintenance challenging for me.
Challenges in server functions and databases include state management, replication consistency, and distributed systems. Azure cloud, Ray, and Neural Magic are mentioned for GPU management and model compression. The future of models involves reducing size and compatibility with commodity hardware. Different architectures and hardware influence decisions in deep learning, such as the use of large matrix multiplication due to the popularity of GPUs.
Convolutional kernels, attention matrix multiplication, GPU use. Quantum computers' potential, fear of AI. Take breaks, live in a camper van. Mention of blog about building a van.
AI in job roles may increase productivity, create equal opportunities, and concentrate power. Simulations inform policy decisions. AI's control over the universe is advancing. The plug-in marketplace showcases AI's problem-solving abilities.
AI future: digital species, improved chatbots, verification.
Temperature adjustment allows varied outputs; content verification filters. Steininger discusses classifiers determining code compilation. Mixpeek and colleague.ai mentioned. New search era and innovative thinking explored.
1294 word summary
Ethan Steininger, a software engineer and entrepreneur, discusses MongoDB and other search engines' simplicity and functionality.
MongoDB has a change stream API for real-time alerts and a search architecture supporting hybrid search. The Weaviate database is robust and can be easily integrated with existing systems. Mixpeek aims to solve the problem of searching PDF files for enterprise customers.
Mixpeek simplifies working with different file types through its multi-modal API. It addresses challenges in extracting and searching document contents. The OpenAI plug-in marketplace is expected to bring a paradigm shift. Weaviate is well positioned as a default vector search engine. Mixpeek's API calls streamline database management, keyword search, and file parsing. It also highlights the ease of data ingestion, particularly with PDFs, and opportunities in LLM-enabled ETL for unstructured data.
Microsoft research showcased impressive chatbot capabilities, potentially solving startup existential crises. Startups should focus on understanding business pain points and building exceptional user experiences. Chatbot marketplaces may resemble app stores or APIs, serving as links between devices or adhesive glue for different services.
Mixpeek offers a language interface and a marketplace for services. It has a monthly cost of $20. Copilot X and ChatGPT are useful code-writing tools. Mixpeek serves as a content creation framework but doesn't produce final copies.
Some AI models have limitations on commercial use and competing model development. Licensing policies may need to change. Training models involves fine-tuning and feedback loops. AI and IoT convergence has potential for language models at the edge. Monitoring buildings through IoT is an interesting area of study.
Ethan Steininger explores language models and AI in construction and IoT, emphasizing custom models and domain-specific datasets for businesses.
A user-friendly interface is essential for addressing AI complexity. An open-source framework involves using a search engine to embed and chunk content, then running a query to extract important information for generating responses. This framework can handle large amounts of context, enhancing language models by integrating LlamaIndex.
The speaker discusses using a language model to create a knowledge graph from search results and explores innovation in that area. They mention a paper on instructing chatbots to ask questions and create mental maps. The speaker also discusses using chatbots for extract-transform-load (ETL) activities and raises concerns about token limits. They discuss using language models for writing SQL queries and applying symbolic filters in Weaviate.
The speaker discusses modules, adapters, repo, filtering searches, folders, directories, code base structure, workspaces, symbolic properties, vector search, fine-tuning, production use cases for learn to rank, and re-ranking results. They acknowledge their team of engineers.
Understanding the end user is key. One use case is an embedded search bar that tracks user activity and creates a sequence of activities that can be converted into a specific metric. Building a learn-to-rank model that can reorder results based on propensity to convert aligns with business goals. The use of user features and interaction events in models like XGBoost is known, but translating these features to text and relying on a content-only cross-encoder is interesting. One idea is to translate tabular features into text-based inputs for transfer learning. This involves mapping user actions to unique text sequences processed by a language model and stored in a search engine for sequential inference.
To avoid overfitting, translating data into text and using user descriptions is effective. Combining a large language model with a high-performance cross-encoder model yields fast results. Metarank provides a solution for hosting models, handling data ingestion, versioning, and validation. Model versioning, hosting, and refreshing embeddings require innovative approaches. Weaviate's efforts in AI tooling are noteworthy for productionizing applications.
Interest in API-based companies attracting market and investor attention, uncertain if OpenAI will dominate, potential for monopoly-like situation. Users may fine-tune prompts if unsatisfied. Future of AI models: niches emerging, developer experience and self-hosting important. OpenAI as app store, other companies focusing on domain-specific knowledge bases.
Mosaic ML offers open-source software with managed enterprise hosting, cutting the cost of BERT training. Language models for custom applications and learning to rank using tabular features are important. Hybrid search engines requiring original content for re-embedding are becoming more significant. Potential ideas for updating query embeddings include Facebook's DPR model and the OpenAI or Cohere embedding models. The zero-shot model with lexical BM25 in a hybrid setup is effective in 80% of cases. The future of TF-IDF/BM25 string matches in vector searches is uncertain.
Some applications incorporate intent and prompting in search using retrieval models and instructor models. Datasets like ArguAna and BEIR focus on retrieving counter-arguments and understanding use cases. There is a need for an academic dataset that captures the evolution of documentation over time. The sequence problem in capturing the evolution of a corpus requires tuning the model using the delta between stages.
The importance of historical data splitting for training and testing in AI models is discussed, as well as the relevance of building in public and showcasing the steps taken. Building an audience, receiving feedback, and fostering a strong user base are emphasized. Open source as a content strategy, with examples like LangChain, is mentioned. The potential of podcasts to highlight the work of others and incentivize collaboration is also noted.
Closed source businesses can have advantages, especially for marketplace businesses or those reliant on community support. However, managing open source projects can be challenging, which is where companies like Mosaic come in to abstract server architecture.
A company in the AI landscape abstracts patterns, extracts file contents, and offers a search interface. They aim to provide a seamless experience with two API calls. Major cloud providers are not targeting the serverless GPU space. Some companies, like Modal and Banana, abstract model deployment using serverless GPUs, which is more efficient and user-friendly than traditional methods.
I am interested in Kubernetes and scaling resources. Deep learning made me curious about the value of Weights & Biases. I learned about cluster management with Determined AI and realized the need for different resources for callbacks and training. Kubernetes can be challenging, but serverless environments offer simple query embedding models. One challenge with Kubernetes is cluster creation and maintenance, while serverless environments can overcome this by sharing context between servers. This has always been a challenging aspect for me as a software developer.
State management, replication consistency, and distributed systems are challenges in server functions and databases. The Azure cloud, Ray, and Neural Magic are mentioned for GPU management and model compression. The future of models involves reducing size and making them compatible with commodity hardware. Different architectures and hardware influence decisions in deep learning, such as the use of large matrix multiplication due to the popularity of GPUs.
The speakers discuss convolutional kernels, attention matrix multiplication, and the use of GPUs in machine learning. They mention quantum computers' potential and the fear of AI models taking over. They advise taking breaks and living in a camper van to disconnect. The conversation ends with a mention of a blog about building a van.
AI in job roles may replace tasks, leading to power concentration, but also increase productivity and provide equal opportunities. Simulations with reinforcement learning inform policy decisions. AI's control over the universe is advancing. The plug-in marketplace showcases AI's problem-solving abilities and tool usage.
The future of AI could lead to the emergence of a new digital species, with companies consisting solely of language models. Sequential thinking in chatbots could be improved by utilizing context and layers of large language models. Content verification and decoding pathways are important in AI development.
Temperature adjustment in language models allows for varied outputs. Content verification layers filter outputs. Ethan Steininger discusses using classifiers to determine code compilation. He mentions Mixpeek and colleague.ai. The conversation explores new search era and innovative thinking.
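Temperature works by rescaling the model's token logits before the softmax; lower values sharpen the output distribution, higher values flatten it. A minimal plain-Python illustration with hypothetical logit values:

```python
import math

def softmax_with_temperature(logits, temperature):
    """Divide logits by the temperature, then apply softmax.
    Low temperature concentrates probability on the top token;
    high temperature spreads it across alternatives."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]  # hypothetical next-token scores
sharp = softmax_with_temperature(logits, 0.2)
flat = softmax_with_temperature(logits, 2.0)
# sharp puts almost all mass on the first token; flat spreads it out,
# which is what makes sampled outputs more varied.
```

This is why raising the temperature produces the "varied outputs" mentioned above: sampling from the flatter distribution picks non-top tokens more often.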
2672 word summary
Ethan Steininger is a software engineer and entrepreneur who has worked on various projects in the search and AI space. He has experience with embedding open source libraries, such as Lucene, into databases like MongoDB. Ethan has also worked as a sales engineer, focusing on the positioning and business pain points of products. He discusses the underlying technology of MongoDB and other search engines, highlighting the simplicity of their functionality.
The database has a change stream API that allows for real-time alerts of changes. The architects behind MongoDB created a search architecture that replicated changes into a Lucene inverted index. They also developed an aggregation pipeline for running search queries on the index. This allows for a hybrid search combining database queries with full-text search functionality. The Weaviate database is robust and can be easily integrated with existing systems of record. The founding vision behind Mixpeek was to solve the problem of searching the contents of PDF files for enterprise customers.
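The replication idea described here can be sketched in miniature: change events feed a small inverted index, which can then be queried together with structured field filters. This is a toy illustration in plain Python, not MongoDB's actual implementation; the class and method names are invented for the example.

```python
from collections import defaultdict

class TinySearchReplica:
    """Toy replica: applies change events to an inverted index so
    full-text queries can be combined with structured field filters."""
    def __init__(self):
        self.docs = {}                 # _id -> document
        self.index = defaultdict(set)  # token -> set of _ids

    def apply_change(self, event):
        # Mimics consuming a change-stream insert/update event.
        doc = event["fullDocument"]
        self.docs[doc["_id"]] = doc
        for token in doc["text"].lower().split():
            self.index[token].add(doc["_id"])

    def hybrid_search(self, keyword, **filters):
        # Full-text match intersected with exact-match field filters.
        hits = self.index.get(keyword.lower(), set())
        return [self.docs[i] for i in sorted(hits)
                if all(self.docs[i].get(k) == v for k, v in filters.items())]

replica = TinySearchReplica()
replica.apply_change({"fullDocument": {"_id": 1, "text": "Annual report PDF", "team": "finance"}})
replica.apply_change({"fullDocument": {"_id": 2, "text": "Design report draft", "team": "design"}})
results = replica.hybrid_search("report", team="finance")
```

The sketch shows why the pattern is attractive: the primary database stays the system of record, while the replica answers combined keyword-plus-filter queries.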
Mixpeek is a multi-modal indexing, embedding, and search API that simplifies the process of working with different types of files such as PDFs and spreadsheets. The development of Mixpeek was motivated by the challenges of extracting and searching document contents using existing frameworks and architectures. The introduction of vector kNN capability into the core branch of Lucene opened up new possibilities for vector search projects. As startups adapt to rapidly changing AI landscapes, the announcement of the OpenAI plug-in marketplace is expected to bring about a paradigm shift similar to the Apple App Store. Weaviate is well positioned to be a default vector search engine in this evolving landscape. Mixpeek's API calls streamline the messy aspects of database management, keyword search, vector search, file parsing, extraction, and re-ranking, making it easier to work with various types of data. The conversation also touches on the ease of data ingestion, particularly with PDFs, and the opportunities presented by innovations in LLM-enabled ETL for unstructured data and chunking.
Microsoft research recently released a paper on chatbots demonstrating artificial general intelligence. The paper showcased impressive use cases, including one where a chatbot not only described a picture but also extracted its contents. This kind of advanced interpretation by chatbots has the potential to render many startup existential crises moot. To succeed in this landscape, startups need to focus on understanding the business's pain points and building exceptional user experiences. For example, as an API developer, nailing the developer experience is crucial. While there are existential threats, getting close to the business and addressing their needs puts startups in a good position. The podcast also discussed the emerging topic of chatbot marketplaces and whether they will resemble app stores or APIs. There are two potential avenues: a Zapier-like experience where chatbots act as links between various devices, or an API-focused approach where chatbots serve as adhesive glue for different services.
There are two main areas of focus in Mixpeek: a language interface and a marketplace for procuring services. The language interface acts as a single entry point for accessing various apps and functionalities. The marketplace allows users to book flights and access other services within the chat interface. The business model involves a monthly cost of $20 for accessing the app store. ChatGPT has become the primary tool for coding, replacing the need for Stack Overflow. Copilot X is another exciting tool that helps with code writing and learning new languages. There are other options available, such as OpenAI's models and Stanford's open-source Alpaca. Overall, Mixpeek serves as a framework for content creation but does not produce final copies, except for short emails.
The licensing of code and models in the AI landscape is an interesting topic. Some models, like Llama, are only available for research purposes and cannot be used commercially. OpenAI also restricts the use of their models for developing competing models. There may need to be a shift in licensing policies for these code bases and models, especially when they are trained on closed source databases. The process of training models, like Alpaca, is fascinating as it involves fine-tuning with prompt-completion pairs and creating a feedback loop. The convergence of AI and IoT has potential for applications such as language models at the edge. Monitoring buildings through the Internet of Things is also an interesting area of study.
In the podcast, Ethan Steininger discusses the potential of language models and AI in the building and construction industry, as well as the Internet of Things. He considers the idea of fine-tuning language models for specific purposes and talks about conversations he's had with experts in the field. The focus is on creating custom language models for businesses, particularly in the enterprise market. Steininger highlights the importance of domain-specific datasets and training models to understand company knowledge bases. He also mentions the significance of search engines as a source of truth and how generative models and AI can complement the search landscape.
AI complexity is best addressed with a user-friendly interface. The proposed open-source framework involves using a search engine to embed and chunk content, then running a query to extract important information for generating responses. This combination of search and generative methods is a standard framework. The process can handle massive amounts of context, allowing for more powerful language models. The integration of LlamaIndex retrieves results and extracts structure for the language model to use.
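The chunk-embed-retrieve-generate loop described above can be sketched with a bag-of-words stand-in for a real embedding model; all function names here are illustrative, not from any specific framework.

```python
from collections import Counter
import math

def embed(text):
    # Stand-in for a real embedding model: bag-of-words term counts.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, chunks, k=1):
    # Rank pre-chunked content by similarity to the query embedding.
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

def build_prompt(query, chunks):
    # The retrieved context plus the question would be sent to a
    # language model to generate the final response.
    context = "\n".join(retrieve(query, chunks, k=1))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

chunks = ["Weaviate supports hybrid search.", "PDF parsing is hard at scale."]
prompt = build_prompt("how does hybrid search work", chunks)
```

Swapping `embed` for a real model and the final string for an LLM call turns this toy into the standard search-plus-generation framework the paragraph describes.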
The speaker discusses using a language model to turn the top 100 search results into a knowledge graph. They express curiosity about how to handle search results and the potential for innovation in that area. The speaker mentions an artificial general intelligence paper that demonstrated instructing a chatbot to ask questions and create a mental map of a house. They also mention the idea of using chatbots for extract-transform-load (ETL) activities. The speaker raises concerns about the token limit for chatbots and wonders if layering language models could solve that issue. They discuss the idea of using language models to write SQL queries and how to apply symbolic filters in Weaviate.
The speaker discusses the use of modules, adapters, and repo in the language model, and how to prompt the model to filter searches. They also mention the relationship between folders and directories in the Weaviate code base and how it affects queries. The structure of code bases and workspaces is also discussed, along with the use of symbolic properties and vector search. The speaker expresses interest in fine-tuning and asks about suitable production use cases for learn to rank. The speaker acknowledges their team of engineers who are working on core services and models. Re-ranking results is also mentioned.
At its core, understanding the end user is key. One use case being explored is an embedded search bar on websites that tracks user activity across multiple pages and creates a sequence of activities that can be converted into a specific metric defined by the company. Building a learn-to-rank model that can reorder results based on the propensity to convert aligns with the business goal. There is a relationship between this approach and fine-tuning, which was discussed in podcasts with Erika from Weaviate and with the Metarank team. The use of user features and interaction events in models like XGBoost is known, but translating these features to text and relying on a content-only cross-encoder is an interesting topic. One idea is to translate tabular features into text-based inputs for transfer learning. This approach involves mapping user actions to unique text sequences that are then processed by a language model and stored in a search engine for sequential inference.
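The mapping from clickstream events to a text sequence might look like the following sketch; the event names and phrase templates are invented for illustration, not taken from any real product.

```python
# Each raw interaction event becomes a short phrase, so that a user's
# session can be fed to a language model as one text sequence.
ACTION_TEMPLATES = {
    "view": "viewed {page}",
    "click": "clicked result {page}",
    "convert": "converted on {page}",
}

def session_to_text(events):
    phrases = [ACTION_TEMPLATES[e["action"]].format(page=e["page"])
               for e in events]
    return " then ".join(phrases)

session = [
    {"action": "view", "page": "pricing"},
    {"action": "click", "page": "enterprise-plan"},
    {"action": "convert", "page": "signup"},
]
text = session_to_text(session)
# text -> "viewed pricing then clicked result enterprise-plan then converted on signup"
```

The resulting string is what would be embedded and stored in a search engine for sequential inference, rather than a hand-engineered feature vector.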
When collecting numerous features, there is a risk of overfitting to specific patterns in the feature vector. Translating the data into text and incorporating user descriptions can be effective in avoiding overfitting. Utilizing a large language model for reasoning and combining it with a high-performance cross-encoder model can yield fast results. However, there is a need for software layers to make such processes accessible, particularly for training ranking models. Metarank offers a comprehensive solution for hosting models, handling data ingestion, model versioning, and validation. The challenges of model versioning, hosting, and refreshing embeddings require careful consideration and innovative approaches. Tooling in the AI space is crucial due to the complexity of productionizing applications, making Weaviate's efforts noteworthy.
There is a lot of interest in API-based companies that help simplify complex processes and this has attracted both market and investor attention. It is uncertain whether there will be a dominant winner among these models, as OpenAI currently seems to be dominating the field. However, some believe that relying on a single model may result in a monopoly-like situation. It is more likely that users will fine-tune their prompts if they are unsatisfied with the answers. The future of AI models is an interesting topic that requires further exploration. It is believed that different niches will emerge, such as domain-specific models and those focused on security. The developer experience and the ability to self-host models are also important considerations. It is possible that OpenAI could become the app store of AI models, while other companies like Cohere could focus on domain-specific knowledge bases, similar to Salesforce's approach.
Mosaic ML is an impressive company that offers an open-source software business model with managed enterprise hosting. They have the Composer library and are cutting the cost of BERT training. Language models for custom applications and learning to rank using specific tabular features are important in the AI landscape. Hybrid search engines that require original content for re-embedding are becoming more important. Facebook's DPR model and the OpenAI or Cohere embedding models are potential ideas for updating query embeddings. In 80% of cases, a zero-shot model with lexical BM25 in a hybrid setup is effective. The future of TF-IDF/BM25 string matches in vector searches is uncertain.
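One common way to combine a lexical BM25 ranking with a vector ranking in a hybrid setup is reciprocal rank fusion; this is a minimal sketch over two hypothetical ranked lists, not the fusion method of any particular engine.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse multiple ranked lists of doc ids: each list contributes
    1 / (k + rank) to a document's fused score."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25_ranking = ["doc_a", "doc_b", "doc_c"]    # hypothetical keyword results
vector_ranking = ["doc_b", "doc_c", "doc_a"]  # hypothetical embedding results
fused = reciprocal_rank_fusion([bm25_ranking, vector_ranking])
# doc_b ranks first: it is near the top of both lists.
```

Rank-based fusion sidesteps the problem that BM25 scores and cosine similarities live on incomparable scales.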
Some applications incorporate intent and prompting in search, going beyond plain BM25. Task retrieval with instructions and instructor models are used to prompt search in embedding models. One dataset called ArguAna focuses on retrieving counter-arguments. The relationship captured in the embedding model is based on semantic similarity. BEIR, a benchmark suite, has better understanding of use cases like "I am not happy" versus "I am happy." There is a need for an academic dataset that captures the evolution of documentation over time. The sequence problem in capturing the evolution of a corpus requires tuning the model using the delta between stages.
The text discusses the importance of using historical data splitting for training and testing in AI models. It also highlights the relevance of building in public and showcasing the steps taken to reach the final product. The benefits of building an audience, receiving feedback, and fostering a strong user base are emphasized. The concept of open source as a content strategy is mentioned, with examples such as LangChain. The potential of podcasts to highlight the work of others and incentivize collaboration is also mentioned.
Closed source businesses can have advantages, particularly if they are marketplace businesses or rely on community support. However, managing an open source project can be challenging, especially when it comes to ensuring high availability and guaranteed service level agreements for customers. This is where companies that abstract server architecture, such as Mosaic, play an important role.
There is a company in the AI landscape that focuses on abstracting patterns seen across other companies in order to provide a competitive advantage. They extract file contents while maintaining the structure and offer a search interface that spans all files. The goal is to provide a seamless experience with just two API calls. The ability to spin up an inference engine in a cheap and fast manner is a valuable asset for companies in the ML space. However, it seems that major cloud providers are not targeting the serverless GPU space. Some companies, like Modal and Banana, are abstracting the deployment of models using serverless GPUs. This approach is seen as more efficient and user-friendly compared to traditional methods like Kubernetes.
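The "two API calls" experience (one call to ingest a file, one to search across everything ingested) can be sketched as a stub client; the class, methods, and behavior here are hypothetical illustrations, not Mixpeek's real API.

```python
class FakeFileSearchClient:
    """Hypothetical client illustrating an ingest-then-search flow:
    call 1 uploads and indexes a file, call 2 queries across files."""
    def __init__(self):
        self._store = {}

    def upload(self, filename, contents):
        # Call 1: in a real service this would parse, extract, chunk,
        # and embed the file; here we just store lowercased text.
        self._store[filename] = contents.lower()
        return {"file": filename, "status": "indexed"}

    def search(self, query):
        # Call 2: return filenames whose extracted text matches the query.
        q = query.lower()
        return [name for name, text in self._store.items() if q in text]

client = FakeFileSearchClient()
resp = client.upload("report.pdf", "Q3 revenue grew 12 percent")
matches = client.search("revenue")
```

The point of the abstraction is that parsing, extraction, embedding, and ranking all hide behind those two calls.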
I have always been interested in Kubernetes and scaling resources for different types of jobs. As I studied deep learning, I became curious about how Weights & Biases could be valued at a billion dollars when it seemed like hyperparameter logging and tuning. Working with Determined AI taught me about cluster management, and that callbacks require different resources than training does. Kubernetes can be challenging, but serverless environments let you run a query-embedding model behind a simple Python function. One challenge with Kubernetes is creating and maintaining the cluster and ensuring consistent state across distributed inference engines; serverless environments can overcome this by sharing context between servers. As an active software developer, this has always been a challenging aspect for me.
State management is a challenge in serverless functions and databases, which leads to a discussion of replication consistency and the foundations of distributed systems. Azure, along with companies like Ray and Neural Magic, comes up in relation to GPU management and model compression. The future of models involves shrinking them to run on commodity hardware; examples are given of running models on phones and on a Raspberry Pi without a GPU. Different architectures and hardware shape the decisions made in deep learning, and the reliance on large matrix multiplications was itself influenced by the popularity of GPUs.
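The idea of shrinking models for commodity hardware can be illustrated with the simplest form of compression, symmetric int8 quantization; this is a toy per-tensor version, not the scheme any specific runtime uses:

```python
def quantize_int8(weights):
    """Symmetric int8 quantization: map floats in [-max|w|, max|w|]
    onto integers in [-127, 127], keeping one float scale per tensor.
    Storing 1 byte per weight instead of 4 is one route to running
    models on hardware without a GPU."""
    scale = max(abs(w) for w in weights) / 127 or 1.0  # avoid zero scale
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the int8 values."""
    return [v * scale for v in q]

weights = [0.51, -1.27, 0.0, 0.9]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
```

The round trip loses at most half a quantization step per weight, which is the precision/size trade-off compression schemes tune.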
In a conversation about machine learning, the speakers discuss how convolutional kernels and attention are implemented as matrix multiplications. They also discuss the dominance of GPUs in the ML space and speculate on potential alternatives; one speaker recalls a talk on quantum computers and their potential for such calculations. They touch on the fear and uncertainty surrounding AI models taking over, and one speaker advises taking breaks and operating in sprints. He also mentions living in a camper van, encourages others to find their own way to disconnect, and points to a blog about building out the van.
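The connection between convolutional kernels and matrix multiplication comes from the classic im2col lowering, sketched here for the 1-D case in plain Python:

```python
def im2col_1d(signal, k):
    """Unroll the sliding windows of length k into rows of a matrix."""
    return [signal[i:i + k] for i in range(len(signal) - k + 1)]

def conv1d_as_matmul(signal, kernel):
    """1-D convolution (cross-correlation form) lowered to a
    matrix-vector product: each output element is the dot product of
    one unrolled window with the kernel. This lowering is how
    convolutional layers get mapped onto the large matrix
    multiplications GPUs are built for."""
    rows = im2col_1d(signal, len(kernel))
    return [sum(a * b for a, b in zip(row, kernel)) for row in rows]

# A difference kernel applied to a ramp gives a constant output.
out = conv1d_as_matmul([1, 2, 3, 4, 5], [1, 0, -1])
```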
The use of AI in job roles is a concern, as it could potentially replace certain tasks. This may lead to a concentration of power at the top due to increased information control. However, increased productivity is a positive outcome of AI implementation, and once AI models become more accessible, it could level the playing field and provide equal opportunities for all. It is important to strive for equal baseline opportunities while recognizing that there will always be variations in advantage and outcome. Running simulations with reinforcement learning agents can inform policy decisions, and there are various interesting ideas surrounding AI's potential. While AI's complete control over the universe may be a distant possibility, it is definitely advancing. The plug-in marketplace offers impressive use cases, such as AI's ability to solve problems independently and utilize tools, similar to how humans discovered tool usage.
The future of AI could lead to the emergence of a new digital species. Language models could be used in various roles within companies, with different models accessing different information. This could result in companies consisting solely of language models with specific skill sets. The challenge of sequential thinking in chatbots could be overcome by incorporating context from previous states, utilizing layers of large language models and different agents. The possibilities in this field are vast and exciting, akin to a gold rush. Content verification layers and the ability to sample different decoding pathways are also important considerations in the development of AI.
Temperature can be adjusted to generate different outputs from a language model: the model decodes through a probability tree, so multiple pathways and generations are possible, and content verification layers filter the outputs. Ethan Steininger discusses using classifiers to determine whether generated code would compile. He thanks the podcast host and mentions his project Mixpeek and colleague.ai. The conversation closes on the new era of search and innovative ways of thinking about it.
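Temperature scaling as described here can be sketched as a softmax over logits followed by sampling; the logits are made up for illustration:

```python
import math
import random

def sample_with_temperature(logits, temperature=1.0, rng=None):
    """Convert logits to a distribution via temperature-scaled softmax,
    then sample one token index. Low temperature sharpens the
    distribution toward the argmax; high temperature flattens it,
    opening up alternative decoding pathways."""
    rng = rng or random.Random()
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Inverse-CDF sampling over the token distribution.
    r = rng.random()
    acc = 0.0
    for i, p in enumerate(probs):
        acc += p
        if r < acc:
            return i
    return len(probs) - 1

logits = [2.0, 1.0, 0.1]
# Near-zero temperature collapses sampling onto the argmax token.
greedy = [sample_with_temperature(logits, 0.01, random.Random(s)) for s in range(20)]
```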
Source: https://www.youtube.com/watch?v=EDPk1umuge0
Page title: Ethan Steininger on Mixpeek and the AI Landscape - Weaviate Podcast #42! - YouTube
Meta description: Thank you so much for watching the 42nd episode of the Weaviate Podcast! Ethan Steininger is the founder of Mixpeek, an intelligence layer that sits on top o...