AI-Powered Developer Tools: Weekly Roundup
MASAI: Modular Architecture for Software-engineering AI Agents
A new architecture that uses different LLM-powered sub-agents to solve complex software engineering problems, achieving high performance on the SWE-bench Lite dataset.

MASAI: Modular Architecture for Software-engineering AI Agents
CREF: Conversational Software Repair Framework
A new framework that uses LLMs to assist programming tutors in repairing code defects, showing significant improvement in repair capabilities through interactions with tutors and historical conversations.
CREF: An LLM-based Conversational Software Repair Framework for Programming Tutors
PerfCurator: Large-scale Performance Bug Dataset
A tool for collecting performance bug-related commits at scale, utilizing a BERT model to classify commits. The resulting dataset enhances the effectiveness of data-driven performance bug detection systems.
PerfCurator: Curating a large-scale dataset of performance bug-related commits from public repositories
Long Code Arena: Benchmarks for Long-Context Code Models
A suite of six benchmarks for code processing tasks that require project-wide context, including library-based code generation, CI build repair, and bug localization.
Long Code Arena: a Set of Benchmarks for Long-Context Code Models
WaDec: WebAssembly Decompiler Using LLM
A novel approach using a fine-tuned LLM to decompile WebAssembly binary code into more comprehensible source code, outperforming current state-of-the-art tools in various metrics.
WaDec: Decompile WebAssembly Using Large Language Model
Mokav: Execution-driven Differential Testing with LLMs
A tool that leverages LLMs to generate difference-exposing tests (DETs), which detect functional differences between two programs, outperforming state-of-the-art methods in effectiveness.
Mokav: Execution-driven Differential Testing with LLMs
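
To make the idea of a difference-exposing test concrete, here is a minimal sketch in Python. It is not Mokav's implementation: the LLM that proposes inputs is replaced by a plain loop over candidate values, and the two programs (clamp_v1, clamp_v2) and the helpers (observe, find_det) are hypothetical names invented for this illustration. The sketch shows only the core check, which is to run both programs on the same input and report the first input where their observable behavior differs.

```python
from typing import Any, Callable, Iterable, Optional, Tuple

# An outcome is either ("ok", return_value) or ("error", exception_name).
Outcome = Tuple[str, Any]


def clamp_v1(x: int) -> int:
    """Hypothetical original program: clamp x into the range [0, 100]."""
    return max(0, min(x, 100))


def clamp_v2(x: int) -> int:
    """Hypothetical modified program with an off-by-one upper bound."""
    return max(0, min(x, 99))


def observe(program: Callable[[int], int], x: int) -> Outcome:
    """Run a program on x and record its result or the exception type."""
    try:
        return ("ok", program(x))
    except Exception as exc:
        return ("error", type(exc).__name__)


def find_det(
    p: Callable[[int], int],
    q: Callable[[int], int],
    candidates: Iterable[int],
) -> Optional[Tuple[int, Outcome, Outcome]]:
    """Return the first candidate input on which p and q behave differently.

    Assumption: candidates stands in for the inputs an LLM would propose.
    """
    for x in candidates:
        out_p, out_q = observe(p, x), observe(q, x)
        if out_p != out_q:
            return (x, out_p, out_q)
    return None  # no difference found among the candidates


if __name__ == "__main__":
    # Prints (100, ('ok', 100), ('ok', 99)): the input 100 is a DET.
    print(find_det(clamp_v1, clamp_v2, range(-5, 106)))
```

In this sketch the candidate inputs are enumerated blindly, so a difference hidden behind a rare input could easily be missed; using execution feedback to steer input generation, as the tool's title suggests, is one way to address that.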