all-MiniLM-L6-v2
A high-speed, 22-million parameter transformer optimized for converting sentences into dense 384-dimensional vector embeddings.
Engineered for production efficiency, all-MiniLM-L6-v2 balances performance and speed by distilling knowledge from large-scale models into a compact 6-layer architecture. It maps text to a 384-dimensional dense vector space, making it ideal for semantic search, clustering, and retrieval-augmented generation (RAG) tasks. The model handles up to 256 input tokens and was trained on over 1 billion sentence pairs to ensure robust cross-domain accuracy. Use this when you need sub-millisecond latency without sacrificing the semantic nuance required for modern NLP pipelines.