DeepSWE: Revolutionizing Coding Agents with Open-Source Reinforcement Learning
DeepSWE-Preview is a new open-source coding agent developed by the Agentica team in collaboration with Together AI. Trained with reinforcement learning (RL), the agent reaches a 59% pass rate on the SWE-Bench-Verified benchmark when test-time scaling is applied, according to Together AI.
Revolutionizing Software Engineering
DeepSWE-Preview is built on the Qwen3-32B model and is trained using only RL. Without test-time scaling, it achieves a Pass@1 rate of 42.2% and a Pass@16 rate of 71.0%, which places it ahead of other open-weight coding agents. Pass@1 is the share of tasks solved in a single attempt, while Pass@16 counts a task as solved if any of 16 attempts succeeds. The model was trained over six days on 64 H100 GPUs, working through 4,500 real-world software engineering tasks sourced from the R2E-Gym training environments.
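To make the Pass@1 and Pass@16 figures concrete, the sketch below computes Pass@k with the widely used unbiased estimator: given n sampled attempts on a task, of which c succeed, it estimates the probability that at least one of k randomly drawn attempts is correct. This is an illustration of the metric in general, not the evaluation code used for DeepSWE-Preview; the function name and the example counts are assumptions.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased Pass@k estimate for a single task.

    n: total attempts sampled for the task
    c: number of those attempts that passed
    k: attempt budget being evaluated (1 <= k <= n)
    """
    if not 0 <= c <= n:
        raise ValueError("c must be between 0 and n")
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and n")
    # If fewer than k attempts failed, every draw of k contains a success.
    if n - c < k:
        return 1.0
    # 1 - P(all k drawn attempts are failures)
    return 1.0 - comb(n - c, k) / comb(n, k)


# Hypothetical task: 16 attempts sampled, 4 of them passed.
single_attempt = pass_at_k(n=16, c=4, k=1)   # 0.25
full_budget = pass_at_k(n=16, c=4, k=16)     # 1.0
```

A benchmark-level score is the mean of this value over all tasks, which is why Pass@16 can sit far above Pass@1: a task counts as solved at k=16 even if only one of the 16 attempts succeeds.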
Harnessing the Power of rLLM
DeepSWE-Preview was trained with rLLM, Agentica’s framework for post-training language agents. The dataset, code, and training logs have been open-sourced alongside it, so that others can scale and improve agents using RL. The full training recipe for turning a 32B model into a coding agent is publicly available.
Emerging Behaviors and Performance
During training, DeepSWE-Preview developed emergent behaviors such as anticipating edge cases and running thorough regression tests. These behaviors matter for complex software engineering tasks, which require navigating large codebases and making sure a fix does not break existing functionality.
Test-Time Scaling and Further Developments
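DeepSWE-Preview employs test-time scaling (TTS): it generates multiple candidate solutions for a task and uses verifiers to choose among them. The approach is hybrid, combining execution-free verification, which judges a candidate without running it, with execution-based verification, which runs tests against it. This hybrid strategy lifts the agent's SWE-Bench-Verified result from the 42.2% single-attempt Pass@1 to the 59% reported by Together AI. Future research aims to explore larger models and extend the approach to other domains, including web agents.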
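The idea of combining the two kinds of verification can be sketched as a best-of-n selection. The example below is a minimal illustration under stated assumptions, not DeepSWE-Preview's actual selection procedure: the `Candidate` fields, the `hybrid_score` function, and the simple weighted average are all hypothetical, chosen only to show how an execution-free score and an execution-based score might be merged into one ranking.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One candidate patch for a task (all fields are hypothetical)."""
    patch_id: str
    verifier_score: float  # execution-free: a model's judgment in [0, 1]
    tests_passed: int      # execution-based: tests that passed when run
    tests_total: int       # execution-based: tests that were run


def hybrid_score(candidate: Candidate, weight: float = 0.5) -> float:
    """Weighted average of the execution-free and execution-based signals.

    The weighted average is an assumption made for illustration.
    """
    if candidate.tests_total > 0:
        execution_score = candidate.tests_passed / candidate.tests_total
    else:
        execution_score = 0.0
    return weight * candidate.verifier_score + (1.0 - weight) * execution_score


def select_best(candidates: list[Candidate], weight: float = 0.5) -> Candidate:
    """Return the candidate with the highest hybrid score."""
    if not candidates:
        raise ValueError("at least one candidate is required")
    return max(candidates, key=lambda c: hybrid_score(c, weight))


# Hypothetical candidates for a single task.
candidates = [
    Candidate("patch-a", verifier_score=0.9, tests_passed=1, tests_total=4),
    Candidate("patch-b", verifier_score=0.6, tests_passed=4, tests_total=4),
    Candidate("patch-c", verifier_score=0.2, tests_passed=3, tests_total=4),
]
chosen = select_best(candidates)
```

In this toy setup, the execution-free verifier alone would prefer `patch-a`, but the combined score favors `patch-b`, which passes every test. That illustrates why a hybrid can outperform either signal used on its own.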
DeepSWE-Preview represents a pivotal step in democratizing AI development, showcasing the potential of reinforcement learning to tackle long-horizon, multi-step challenges in software engineering. With its open-source nature, it invites the global research community to contribute to and build upon its successes.