Devin, from Cognition Labs, is an AI software engineer with its own command line, browser, and code editor. Got your interest yet? It should. Head over to https://www.cognition-labs.com/introducing-devin and check out what they’ve built.
Cognition describes Devin as the world’s first fully autonomous AI software engineer. It is designed either to work alongside human engineers or to complete coding tasks independently and hand them over for review. The goal is to let engineers focus on more interesting problems while Devin handles routine tasks.
Key capabilities of Devin include:
- Long-term reasoning and planning to execute complex engineering tasks
- Ability to use common developer tools such as a code editor, shell, and browser inside a sandbox
- Real-time progress reporting and collaboration by accepting feedback
- Learning to use new technologies by reading documentation
- End-to-end app building, deployment and adding requested features
- Autonomously finding and fixing bugs in codebases
- Training and fine-tuning its own AI models
- Contributing to open source by addressing bugs/issues on GitHub repos
- Completing real coding jobs from platforms like Upwork
Devin was evaluated on the SWE-bench coding benchmark, correctly resolving 13.86% of open source issues end-to-end. This exceeds the previous state-of-the-art of 1.96% for complete issue resolution on this benchmark.
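To put those two figures side by side, here is a minimal arithmetic sketch. It uses only the two percentages quoted above; the function name `compare_rates` is a hypothetical helper for illustration and is not part of SWE-bench or any Cognition tooling.

```python
# Resolution rates quoted in the text (percent of issues resolved end-to-end).
devin_rate = 13.86
previous_best_rate = 1.96


def compare_rates(new, old):
    """Return (absolute gain in percentage points, ratio of new to old)."""
    return new - old, new / old


gain_points, ratio = compare_rates(devin_rate, previous_best_rate)
print(f"Absolute gain: {gain_points:.2f} percentage points")
print(f"Relative improvement: {ratio:.1f}x")
```

Read this way, the reported result is roughly a sevenfold increase over the earlier figure, although about 86% of the issues in the evaluation still went unresolved.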
Cognition’s published examples showcase Devin’s skills in areas such as steganography, web development, debugging, and model fine-tuning, along with open source contributions to projects like sympy, Django, and scikit-learn, and real paid coding jobs.
In other words, Devin is presented as an autonomous AI that can take on a wide range of software engineering tasks, outperforming prior models on the SWE-bench benchmark while still working interactively with humans.