AI-Powered Developer Tools: Weekly Roundup

MASAI: Modular Architecture for Software-engineering AI Agents

A new architecture that divides complex software engineering problems among different LLM-powered sub-agents, achieving high performance on the SWE-bench Lite dataset.

MASAI: Modular Architecture for Software-engineering AI Agents

CREF: Conversational Software Repair Framework

A new framework that uses LLMs to help programming tutors repair code defects. Its repair capability improves significantly when it draws on interactions with tutors and on historical conversations.

CREF: An LLM-based Conversational Software Repair Framework for Programming Tutors

PerfCurator: Large-scale Performance Bug Dataset

A tool for collecting performance bug-related commits at scale, using a BERT model to classify commits. The resulting dataset makes data-driven performance bug detection systems more effective.

PerfCurator: Curating a large-scale dataset of performance bug-related commits from public repositories

Long Code Arena: Benchmarks for Long-Context Code Models

A suite of six benchmarks for code processing tasks that require project-wide context, including library-based code generation, CI build repair, and bug localization.

Long Code Arena: a Set of Benchmarks for Long-Context Code Models

WaDec: WebAssembly Decompiler Using LLM

A novel approach that uses a fine-tuned LLM to decompile WebAssembly binary code into more comprehensible source code, outperforming current state-of-the-art tools on several metrics.

WaDec: Decompile WebAssembly Using Large Language Model

Mokav: Execution-driven Differential Testing with LLMs

A tool that leverages LLMs to generate difference-exposing tests (DETs), which detect functional differences between two programs. It outperforms state-of-the-art methods in effectiveness.

Mokav: Execution-driven Differential Testing with LLMs
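
To make the idea of a difference-exposing test concrete, here is a minimal sketch of the execution-driven check at its core: run both programs on a candidate input and report the first input on which their outputs disagree. This is not Mokav's implementation. The function names (`find_det`, `is_leap_v1`, `is_leap_v2`) are hypothetical, and the fixed candidate list stands in for inputs that an LLM would propose.

```python
from typing import Callable, Iterable, Optional


def find_det(
    p: Callable[[int], bool],
    q: Callable[[int], bool],
    candidates: Iterable[int],
) -> Optional[int]:
    """Return the first candidate input on which p and q disagree, or None."""
    for x in candidates:
        # Execute both programs and compare their observable outputs.
        if p(x) != q(x):
            return x
    return None


# Two hypothetical versions of the same function (illustration only).
def is_leap_v1(year: int) -> bool:
    # Simplified rule: ignores the century exceptions.
    return year % 4 == 0


def is_leap_v2(year: int) -> bool:
    # Full Gregorian rule.
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


# Assumption: in a real tool these inputs would be generated, not hard-coded.
candidate_inputs = [2000, 2024, 1900, 2023]
det = find_det(is_leap_v1, is_leap_v2, candidate_inputs)
print(det)
```

The two versions agree on 2000 and 2024, so neither input exposes a difference; 1900 does, because only the full rule treats it as a non-leap year. The hard part, and the reason an LLM is involved, is proposing inputs that reach such corner cases.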