Summary: CodeCompose AI-assisted Code Authoring Deployment (arxiv.org)
8,982 words - PDF document
One Line
CodeCompose is an AI-assisted code authoring tool that suggests code based on contextual information, is fine-tuned for Meta-specific languages such as Hack and Flow, and offers features such as auto-completion, API discovery, and standard library suggestions.
Key Points
- CodeCompose is an AI-assisted code authoring system that suggests entire statements or blocks of code during development.
- It utilizes large language models (LLMs) to offer coding suggestions based on the organization's code repository.
- CodeCompose has been deployed on various code authoring surfaces across the company, offering quantitative metrics and qualitative feedback to measure its impact.
- The system has been trained on various corpora to assimilate vast amounts of knowledge and assist developers in working more efficiently.
- CodeCompose aims to improve developer productivity throughout the software development life cycle.
Summaries
345 word summary
CodeCompose is an AI-assisted code authoring tool that suggests entire statements or blocks of code during development based on contextual information. It is fine-tuned for Meta-specific languages such as Hack and Flow and utilizes large language models (LLMs) to offer coding suggestions based on the organization's code repository. CodeCompose aids in code generation, documentation, and suggestion accuracy, and offers features such as auto-completion, API discovery, and standard library suggestions. A reduction in coding iteration time has been observed for developers exposed to single-line ML completion, and CodeCompose generates code across 10+ languages using first-party training data from Meta's code repositories and notebooks. The system design allows CodeCompose to be plugged into any code editing surface, and the architecture makes it straightforward to integrate with in-house developer tools. The model is fine-tuned with a proposed Language Causal Masking (LCM) objective, and the system employs a suite of optimizations to bring the end-to-end latency below an acceptable threshold. CodeCompose is designed to improve developer productivity; its three primary components are the server, the Language Server Protocol (LSP) layer, and the client. The system generates a sequence of tokens to auto-complete statements, having been fine-tuned on Meta's internal code. The tool received a 91.5% favorable response from users: in a 15-day period, 4.5 million suggestions were shown to developers, 22% of which were accepted. CodeCompose also helps developers discover unfamiliar APIs and write documentation. The authors discuss building trust, designing the user experience, and measuring feedback to understand the system's impact on code authoring. Various AI companies offer commercial versions of code completion tools, including Google, Microsoft, and GitHub.
Amazon's CodeWhisperer is a fully functional code completion tool that can generate correct syntax and pass unit tests in programming languages it is not trained for. GitHub Copilot is a code completion tool that provides a useful starting point for programming tasks but has some shortcomings. IntelliCode Compose and Pythia are also code completion tools. CodeCompose is an entirely internal tool that has only been deployed at Meta.
592 word summary
CodeCompose is an AI-assisted code authoring deployment that aims to improve developer productivity. It has three primary components: the server, the Language Server Protocol (LSP) layer, and the client. CodeCompose generates a sequence of tokens to auto-complete the statement(s), having been fine-tuned on Meta's internal code. The system has a positive effect on the coding experience of developers, as shown by a 91.5% favorable feedback rating. In a 15-day time period, 4.5 million suggestions were shown to developers, 22% of which were accepted. The system also helps developers discover unfamiliar APIs and write documentation. The authors discuss building trust, designing the user experience, and measuring feedback to understand the impact of the system on code authoring. Various AI companies offer commercial versions of code completion tools, including Google, Microsoft, and GitHub. Amazon's CodeWhisperer is a fully functional code completion tool that can generate correct syntax and pass unit tests in programming languages it is not trained for. GitHub Copilot is a code completion tool that provides a useful starting point for programming tasks, but it has some shortcomings. IntelliCode Compose and Pythia are also code completion tools. CodeCompose is an entirely internal tool that has only been deployed at Meta. CodeCompose is an AI-assisted code authoring tool that aids in code generation, documentation, and suggestion accuracy. It predicts relevant suggestions and offers features such as auto-completion, API discovery, and standard library suggestions. It is well-received by developers who work on building pipelines and common infrastructure and those who follow typical coding patterns. However, it struggles with specialized APIs and libraries. A reduction in coding iteration time has been observed for developers exposed to single-line ML completion. CodeCompose generates code across 10+ languages using first-party training data from Meta's code repositories and notebooks.
The system design allows CodeCompose to be plugged into any code editing surface, and the architecture makes it straightforward to integrate with in-house developer tools. The system collects data by mining user feedback posts in the support group and manually analyzes suggestions. The tool received a 91.5% favorable response from users and made 4.5 million suggestions during the deployment period. CodeCompose suggests code and generates comments, messages, and documentation using LLMs. It is deeply integrated with Meta's version of VSCode and can suggest docstrings for functions. Suggestions need to appear within a certain timeframe, and the system has to match the developer's typing speed. The model is fine-tuned with a proposed Language Causal Masking (LCM) objective, and the tool employs a suite of optimizations to bring the end-to-end latency below an acceptable threshold. CodeCompose is an AI-assisted code authoring tool developed and deployed at Meta for serving tens of thousands of developers. It offers inline code suggestions based on contextual information and is fine-tuned for Meta-specific languages such as Hack and Flow. CodeCompose suggests entire statements or blocks of code during development, utilizing large language models (LLMs) to offer coding suggestions based on the organization's code repository. It has been deployed on various code authoring surfaces across the company, offering quantitative metrics and qualitative feedback to measure its impact. CodeCompose addresses the challenge of knowledge discovery in a dynamic environment by surfacing internal knowledge during code authoring, encouraging developers to produce better quality code. Trust-building was important for CodeCompose's productization, which was achieved by working with language partner teams at Meta and obtaining feedback from early adopters. The authors adopted a rollout strategy to incrementally build trust in CodeCompose by rolling it out to the company in waves of languages.
At every step, they were able to gather developer feedback and iterate on the product before rolling out further.
1080 word summary
CodeCompose is an AI-assisted code authoring system that suggests entire statements or blocks of code during development. It utilizes large language models (LLMs) to offer coding suggestions based on the organization's code repository. CodeCompose has been deployed on various code authoring surfaces across the company, offering quantitative metrics and qualitative feedback to measure its impact. The system has been trained on various corpora to assimilate vast amounts of knowledge and assist developers in working more efficiently.
CodeCompose addresses the challenge of knowledge discovery in a dynamic environment by surfacing internal knowledge during code authoring. Its impact on Meta's internal code authoring experience over a 15-day time window was significant, and it encourages developers to produce better quality code. Trust-building was important for CodeCompose's productization, which was achieved by working with language partner teams at Meta and obtaining feedback from early adopters.
CodeCompose is an AI-assisted code authoring tool developed and deployed at Meta for serving tens of thousands of developers. It offers inline code suggestions based on contextual information and is fine-tuned for Meta-specific languages such as Hack and Flow. It is based on the InCoder LLM that merges generative capabilities with bi-directionality and is multi-lingual.
The paper presents details about the CodeCompose model, system architecture, challenges for coding assistants, the industrial setting and how code is authored at Meta, developer feedback on the impact of CodeCompose at Meta, results from an extensive large-scale deployment, threats to validity, and related work. The authors adopted a rollout strategy to incrementally build trust in CodeCompose by rolling it out to the company in waves of languages. At every step, they were able to gather developer feedback and iterate on the product before rolling out further. CodeCompose suggests code and generates comments, messages, and documentation using LLMs. It is deeply integrated with Meta's version of VSCode and can suggest docstrings for functions. Code generation can be done at multiple levels, but requiring too much rework may reduce trust in the model. Suggestions need to appear within 300ms-500ms and not beyond 1s, and the system has to match the developer's typing speed. Balancing the rollout of suggestions is a unique challenge. The underlying LLM architecture involves a generative model trained with the Causal Masking (CM) objective, which CodeCompose adapts into a proposed Language Causal Masking (LCM) fine-tuning objective. Developers can request multi-line suggestions by pressing Tab multiple times. CodeCompose also employs a suite of optimizations to bring the end-to-end latency below an acceptable threshold. The deployment generates code across 10+ languages using first-party training data from Meta's code repositories and notebooks. With LCM, any task contains only one mask, and the tokenizer separately encodes the metadata, the code before the cursor, the code after the cursor, and the target. The system design allows CodeCompose to be plugged into any code editing surface, and the architecture makes it straightforward to integrate with in-house developer tools.
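As a rough illustration of how an LCM-style training example might be assembled (the trigger characters, sentinel strings, and splitting logic below are illustrative assumptions, not the paper's actual implementation):

```python
# Hypothetical trigger characters after which a mask may begin; the
# paper's actual trigger set is not reproduced here.
TRIGGERS = {"(", ".", "=", ":", ",", " ", "\n"}

def make_lcm_example(metadata: str, source: str, start: int, end: int) -> str:
    """Build a single-mask training example: source[start:end] becomes
    the target, and everything else is bidirectional context.

    Unlike generic causal masking, the mask is only allowed to begin
    right after a trigger character, so examples mirror the positions
    where real completion requests fire.
    """
    if start > 0 and source[start - 1] not in TRIGGERS:
        raise ValueError("mask must begin after a trigger character")
    before, target, after = source[:start], source[start:end], source[end:]
    # The real tokenizer encodes these segments separately; plain
    # sentinel strings stand in for special tokens here.
    return f"<meta>{metadata}<before>{before}<after>{after}<target>{target}"

example = make_lcm_example(
    "lang=python", "total = sum(values)\nprint(total)\n", 12, 20)
```

Placing the target after both context segments mirrors the described arrangement in which the model conditions on code before and after the cursor before generating the masked span.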
The system collects data by mining user feedback posts in the support group and manually analyzes suggestions. The tool received a 91.5% favorable response from users and made 4.5 million suggestions during the deployment period. CodeCompose is an AI-assisted code authoring deployment that accelerates coding and aids in discovery, documentation, and suggestion accuracy. It helps developers with coding tasks such as generating in-code documentation, writing boilerplate, and discovering new APIs. The tool successfully predicts relevant suggestions without being too noisy and offers features such as auto-completing lines of code, API discovery, generating in-code documentation, and suggesting standard libraries. CodeCompose struggles to suggest correct code in scenarios where developers use specialized APIs and libraries. However, it is well received by developers who work on building pipelines and common infrastructure and those who follow typical coding patterns. The closest work to CodeCompose is the deployment of hybrid semantic ML code completion at Google, where a reduction in coding iteration time was observed for developers exposed to single-line ML completion. Qualitative feedback from developers was grouped into categories, with the majority finding CodeCompose to be a net positive experience. However, there were some UX problems, such as overloaded keyboard shortcuts and disruptions caused by competing suggestion systems. Some developers found traditional auto-complete to be more useful in certain cases. Various AI companies offer commercial versions of code completion tools, including Google, Microsoft, and GitHub. Amazon's CodeWhisperer is a fully functional code completion tool that can generate correct syntax and pass unit tests in programming languages it is not trained for. GitHub Copilot is a code completion tool that provides a useful starting point for programming tasks, but it has some shortcomings.
IntelliCode Compose and Pythia are also code completion tools. CodeCompose is an entirely internal tool that has only been deployed at Meta.
The system architecture of CodeCompose is presented, along with the challenges faced in building an AI-assisted code authoring deployment. The authors discuss building trust, designing the user experience, and measuring feedback to understand the impact of the system on code authoring. A systematic literature survey of related work in this field is also summarized.
CodeCompose has a positive effect on the coding experience of developers, as shown by a 91.5% favorable feedback rating. In a 15-day time period, 4.5 million suggestions were shown to developers, 22% of which were accepted. The system also helps developers discover unfamiliar APIs and write documentation.
CodeCompose is an AI-assisted code authoring deployment that aims to improve developer productivity throughout the software development life cycle. The system is built using three primary components: the server, the Language Server Protocol (LSP) layer, and the client. The underlying InCoder-based LLM, fine-tuned on Meta's internal code, generates a sequence of tokens to auto-complete the statement(s).
In this paper, the authors introduced an AI-based coding assistant system named CodeCompose, which was scaled to 16k developers. The authors also referenced several related studies on user interactions with query auto completion, participant response bias in HCI, and efficient training of language models. They concluded by highlighting the potential productivity benefits of ML-enhanced code completion systems from Google, Microsoft, and Amazon. This article discusses improving code autocompletion with transfer learning and cites Attention Is All You Need and Pythia as relevant resources. It also examines the impact of AI on developer productivity and presents a method for automatic evaluation of machine translation. An empirical evaluation of GitHub Copilot's code suggestions is discussed, as well as the legal and ethical challenges of ChatGPT. The article also explores code prediction by feeding trees to transformers and natural language generation.
2348 word summary
Improving Code Autocompletion with Transfer Learning is discussed in an article from the International Conference on Software Engineering: Software Engineering in Practice. Attention Is All You Need and Pythia are cited as relevant resources. The impact of AI on developer productivity is examined in a paper from arXiv. A method for automatic evaluation of machine translation is presented in Proceedings of the 40th Annual Meeting on Association for Computational Linguistics. An empirical evaluation of GitHub Copilot's code suggestions is discussed in a paper by Nhan Nguyen and Sarah Nadi. The Dark Side of ChatGPT: Legal and Ethical Challenges from Stochastic Parrots and Hallucination is addressed in a paper by Zihao Li. Code Prediction by Feeding Trees to Transformers and Natural Language Generation are explored in an article from ACM Comput. Surv. In this paper, the authors introduced an AI-based coding assistant system named CodeCompose, which was scaled to 16k developers. The system uses generative models for code infilling and synthesis, pre-training of deep bidirectional transformers for language understanding, and learning from examples to improve code completion systems. The authors also referenced several related studies on user interactions with query auto completion, participant response bias in HCI, and efficient training of language models. They concluded by highlighting the potential productivity benefits of ML-enhanced code completion systems from Google, Microsoft, and Amazon. CodeCompose is an AI-assisted code authoring deployment that aims to improve developer productivity throughout the software development life cycle. The system is built using three primary components: the server, the Language Server Protocol (LSP) layer, and the client. The underlying InCoder-based LLM, fine-tuned on Meta's internal code, generates a sequence of tokens to auto-complete the statement(s).
CodeCompose offers suggestions for individual lines of code and leverages semantic information to perform pre-processing and post-processing to improve suggestion accuracy.
CodeCompose has a positive effect on the coding experience of developers, as shown by a 91.5% favorable feedback rating. In a 15-day time period, 4.5 million suggestions were shown to developers, 22% of which were accepted. The system also helps developers discover unfamiliar APIs and write documentation.
The system architecture of CodeCompose is presented, along with the challenges faced in building an AI-assisted code authoring deployment. The authors discuss building trust, designing the user experience, and measuring feedback to understand the impact of the system on code authoring. A systematic literature survey of related work in this field is also summarized. There are several generative AI companies that offer commercial versions of code completion tools, including Google, Microsoft, and GitHub. Amazon's CodeWhisperer is a fully functional code completion tool that can generate correct syntax and pass unit tests in programming languages it is not trained for. GitHub Copilot is a code completion tool that provides a useful starting point for programming tasks, but it has some shortcomings, such as generating code that can be further simplified and code that relies on undefined helper methods. There have been several empirical evaluations of GitHub Copilot, and the results have been positive overall. IntelliCode Compose is a general-purpose multilingual code completion tool that is built on a state-of-the-art generative transformer model trained on 1.2 billion lines of source code in Python, C#, JavaScript, and TypeScript programming languages. Pythia is a code completion tool deployed with IntelliCode by Microsoft that is built using DL models trained on code contexts extracted from abstract syntax trees. CodeCompose is an entirely internal tool that has only been deployed at Meta. The generalizability of CodeCompose has not been measured separately with statistical significance. CodeCompose is an AI-assisted code authoring deployment that aims to improve developer productivity. The scope of the paper is to present the results from building and deploying CodeCompose at scale and its usage. A reduction in coding iteration time was observed for developers exposed to single-line ML completion.
The closest work to CodeCompose is the deployment of hybrid semantic ML code completion at Google. Qualitative feedback from developers was grouped into categories, with the majority finding CodeCompose to be a net positive experience. However, there were some UX problems, such as overloaded keyboard shortcuts and disruptions caused by competing suggestion systems. Some developers found traditional auto-complete to be more useful in certain cases. CodeCompose struggles to suggest correct code in scenarios where developers use specialized APIs and libraries, and is not helpful for tasks that involve writing heavily templatized code. However, it is well received by developers who work on building pipelines and common infrastructure and those who follow typical coding patterns. CodeCompose offers features such as auto-completing lines of code, API discovery, generating in-code documentation, and suggesting standard libraries. The value-add of fine-tuning the LLM on Meta's internal code may not translate externally, and investing in UX research is important to make CodeCompose a productive experience for all developers. CodeCompose sometimes hallucinates suggestions and struggles with function signatures, but it has helped some developers reduce coding time and improve their workflow. CodeCompose is an AI-assisted code authoring deployment that helps developers with coding tasks such as generating in-code documentation, writing boilerplate, and discovering new APIs. It highlights the value of naming and quality of documentation, providing immediate effectiveness boosts. CodeCompose successfully predicts relevant suggestions without being too noisy. It helps speed up the coding process and is generally accurate. Developers have reported positive feedback about their experience using CodeCompose. However, some developers found it intrusive and requested to disable it.
Overall, CodeCompose is a useful tool for developers looking to improve their coding workflow. CodeCompose is an AI-assisted code authoring tool that accelerates coding, aids in discovery, documentation, and suggestion accuracy. The tool received positive feedback from users, who found it helpful in accelerating their coding activity, discovering new APIs, and automating tedious tasks such as boilerplate code. The tool was found to be accurate and able to auto-complete lines of code. The labeling process of user feedback involved three steps: independent labeling, team collaboration, and refinement of categories. The qualitative feedback was collected through a forum and analyzed manually. The tool also generates in-code and API documentation. CodeCompose is an AI-assisted code authoring tool that received a 91.5% favorable response from users, with Python being the most commonly used language. The tool uses a randomized rollout strategy for each language to avoid anomalies and collects data on the acceptance rate of suggestions, the number of characters typed, and the display time of suggestions. CodeCompose made 4.5 million suggestions during the deployment period and was integrated into internal tools to limit demands on the client. The Language Server Protocol (LSP) is responsible for recording telemetry and ensuring consistency across various surfaces where CodeCompose is used. An observational study was conducted to evaluate the usefulness of CodeCompose in developer workflows. CodeCompose is an AI-assisted code authoring deployment system that utilizes user feedback to improve code suggestions. The system collects data by mining user feedback posts in the support group and manually analyzes suggestions. To evaluate performance, the system tracks various events in the IDE, such as displaying a suggestion inline, accepting or rejecting a suggestion, and the length of accepted suggestions. 
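A toy aggregation over such IDE events might look like the following sketch; the event names and payloads are hypothetical, not CodeCompose's actual telemetry schema:

```python
from collections import Counter

def acceptance_metrics(events):
    """Aggregate hypothetical suggestion telemetry.

    Each event is a (kind, suggestion_text) tuple, where kind is one of
    'shown', 'accepted', or 'rejected' (illustrative names only).
    Returns counts, the acceptance rate, and total accepted characters,
    mirroring the kinds of metrics the deployment tracks.
    """
    counts = Counter(kind for kind, _ in events)
    accepted_chars = sum(len(text) for kind, text in events
                         if kind == "accepted")
    shown = counts["shown"] or 1  # guard against division by zero
    return {
        "shown": counts["shown"],
        "accepted": counts["accepted"],
        "acceptance_rate": counts["accepted"] / shown,
        "accepted_chars": accepted_chars,
    }

events = [
    ("shown", "x = 1"), ("accepted", "x = 1"),
    ("shown", "y = 2"), ("rejected", "y = 2"),
]
metrics = acceptance_metrics(events)
```

Counting accepted characters alongside the acceptance rate matters because, as noted above, acceptance rate alone may underrepresent the actual benefit of longer suggestions.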
The system is equipped with A100 GPUs and is fine-tuned with a customized LCM objective to improve performance for languages like Hack, C++, Python, and Flow (JavaScript). The system processes requests via Thrift and is optimized for latency rather than throughput. CodeCompose is an AI-assisted code authoring deployment that employs a client-server architecture with a language server that is reused across multiple editor integrations. The system design allows CodeCompose to be plugged into any code editing surface, and the architecture makes it straightforward to integrate with in-house developer tools. Clients are extensions that run locally on developers' machines and are responsible for ultimately displaying inline suggestions in the editor. To mediate requests between the client and server, a Language Server Protocol (LSP) conformant language server was implemented; it provides core request logic such as debouncing and caching, and supports only one meaningful request type: textDocument/inlineCompletions. InCoder-1.3B, the public model with 1.3 billion parameters, was fine-tuned further on first-party data with the LCM objective. CodeCompose can generate code across 10+ languages. The authors collected first-party training data from Meta's code repositories and notebooks, applying several filters to avoid bugs and exclude old/obsolete patterns. To keep the training data fresh, they only looked at diffs going back up to 2 years and only kept the latest versions of files, which avoids training on code that may have been added a long time ago but is never modified. They used a specialized Language Causal Masking (LCM) objective, in which any task has only one mask; LCM only masks at certain trigger characters and lifts the masking step to the language level. The tokenizer separately encodes the metadata, the code before the cursor, the code after the cursor, and the target.
They found an optimal 70-30 split of the model's input length between code before and code after the cursor. The model generates only one sequence of tokens for the user experience and stops the generation early once a newline token has been generated. They provide telemetry that is required to evaluate or monitor the product. CodeCompose is an AI-assisted code authoring deployment that aims to address issues related to compliance and proprietary data. Evaluating the usefulness of AI-generated suggestions is a major challenge, and metrics like acceptance rate may underrepresent the actual benefits. The underlying LLM architecture for CodeCompose involves a generative model trained with the Causal Masking (CM) objective, which CodeCompose adapts into a proposed Language Causal Masking (LCM) fine-tuning objective. Developers can request multi-line suggestions by pressing Tab multiple times. CodeCompose also employs a suite of optimizations to bring the end-to-end latency below an acceptable threshold. Developers require suggestions to appear within 300ms-500ms, and not beyond 1s. Suggestions are not shown if there is any code to the right of the cursor, except certain tokens such as parentheses, brackets, etc. The system has to match the developer's typing speed. Performance and latency are important considerations. Balancing the rollout of CodeCompose's suggestions is a unique challenge. Suggestions are generated at different granularities depending on factors such as suggestion confidence, user context, and task completion. Code generation can be done at multiple levels due to the generality of LLMs, but requiring a lot of rework can erode trust and confidence in the model. Studies show that developers are fine with reworking a suggestion as long as the model provides a useful starting point or structure with relevant libraries and function calls.
However, if the model is not generating enough suggestions or generates only a few accurate suggestions, developers might eventually stop caring about the system as it becomes sparse. Other factors like training data, security issues, vulnerabilities, constant data drift, biases in the source corpus used for training and its validity, and complexities of parsing source code or developing semantic understanding can also impact trust.
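The 70-30 context split and the rule that suggestions are suppressed when non-trivial code sits to the right of the cursor can be sketched as below. The character-based budgeting and the exact set of allowed trailing tokens are simplifying assumptions (the paper's split is over the model's token-level input length):

```python
def build_prompt(before: str, after: str, budget: int = 2000):
    """Split a character budget roughly 70/30 between the code before
    and after the cursor, truncating the furthest-away context first."""
    before_budget = int(budget * 0.7)
    after_budget = budget - before_budget
    return before[-before_budget:], after[:after_budget]

def should_suggest(line: str, col: int) -> bool:
    """Suppress suggestions when there is code to the right of the
    cursor, allowing only closing tokens such as parentheses and
    brackets (the allowed set here is illustrative)."""
    rest = line[col:].strip()
    return all(ch in ")]}>,;" for ch in rest)
```

Keeping the larger share of the budget for the code before the cursor reflects the intuition that the immediately preceding context is usually the strongest signal for what comes next.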
CodeCompose is a coding assistant that uses LLMs to suggest code and generate comments, messages, and documentation. Developers can write inline comments in natural language, and CodeCompose will generate code that adheres to the comment. CodeCompose is deeply integrated with Meta's version of VSCode and can suggest the docstring for a function from the code that conventionally appears after the docstring. CodeCompose can look beyond the cursor to suggest code at the current position, fluently generate comments, messages, and documentation, and handle natural language proficiently.
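Because the underlying model supports infilling, suggesting a docstring from the code that follows it can be framed as a fill-in-the-middle query. A minimal sketch, assuming an illustrative `<infill>` sentinel rather than the model's real mask token:

```python
def docstring_infill_query(signature: str, body: str) -> str:
    """Frame docstring generation as infilling: the model sees the
    function signature before the mask and the function body after it,
    and is asked to produce the missing docstring in between.
    The <infill> sentinel is a stand-in for the actual mask token."""
    return f'{signature}\n    """<infill>"""\n{body}'

query = docstring_infill_query("def add(a, b):", "    return a + b")
```

This is the same bidirectional trick that lets the system look beyond the cursor: the code after the mask constrains what the generated docstring should describe.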
The paper presents details about the CodeCompose model, system architecture, challenges for coding assistants, the industrial setting and how code is authored at Meta, developer feedback on the impact of CodeCompose at Meta, results from an extensive large-scale deployment, threats to validity, and related work. The authors adopted a rollout strategy to incrementally build trust in CodeCompose by rolling it out to the company in waves of languages: (i) only Python, (ii) Hack, Flow (JavaScript), and C++, (iii) others, and within each wave, they rolled it out to increments of 25% of the developer population. At every step, they were able to gather developer feedback and iterate on the product before rolling out further. CodeCompose is an AI-assisted code authoring tool developed and deployed at Meta for serving tens of thousands of developers. It offers inline code suggestions based on contextual information, such as the code after the cursor, the file being edited, or the kernel being used. CodeCompose is fine-tuned for Meta-specific languages such as Hack and Flow and has been trained on 10+ programming languages. It is based on the InCoder LLM that merges generative capabilities with bi-directionality and is multi-lingual.
CodeCompose has an overwhelming 91.5% positive reception and an acceptance rate of 22% across several languages, with 8% of the code typed by users coming from accepted CodeCompose suggestions. It helps generate more in-code documentation and identifies obsolete patterns. CodeCompose is part of Meta's Software Development Life Cycle (SDLC) and is embedded in the organization's internal code repository that hosts source code written in 20+ programming languages.
Deploying CodeCompose raises unique challenges in terms of user experience and metrics that arise when such tools are rolled out in large-scale industrial settings. CodeCompose's impact on Meta's internal code authoring experience over a 15-day time window was significant, and it encourages developers to produce better quality code. Trust-building was important for CodeCompose's productization, which was achieved by working with language partner teams at Meta and obtaining feedback from early adopters. CodeCompose is an AI-assisted code authoring system that suggests entire statements or blocks of code during development. It utilizes large language models (LLMs) to offer coding suggestions based on the organization's code repository. CodeCompose has been deployed on various code authoring surfaces across the company, offering quantitative metrics and qualitative feedback to measure its impact. The system has been trained on various corpora to assimilate vast amounts of knowledge and assist developers in working more efficiently. However, the dynamic environment at a large software company poses challenges with respect to knowledge discovery, as tools get added and deprecated, and developers move across teams. CodeCompose addresses this challenge by surfacing internal knowledge during code authoring.