
[AI Dev Tools] Content Filtering, Diversity-Empowered Problem-Solving, LLM-Powered Requirements Engineering ...

Source: https://arxiv.org/abs/2408.07060v1

X Content Filter: Browser Extension for X.com Content Analysis and Filtering

X Content Filter is a browser extension that analyzes and filters content on X.com using the Groq API, based on user-configured topics and thresholds.

Key Features:
  • Automatically analyzes and hides posts on X.com that exceed configured thresholds.
  • Uses the Groq API for content analysis.
  • Customizable topics and thresholds for filtering.
  • Cache reset functionality available through the browser console.
  • Mobile support via userscript for iOS devices.
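The core decision the extension makes can be sketched as a small threshold check. This is an illustrative sketch, not the extension's actual code: `score_post` is a hypothetical keyword-based stand-in for the per-topic scores the real extension obtains from the Groq API.

```python
# User-configured topics and thresholds (hypothetical example values).
THRESHOLDS = {"politics": 0.7, "crypto": 0.5}

def score_post(text: str) -> dict[str, float]:
    """Stand-in for the Groq API call: returns a score in [0, 1] per topic.
    Here, crude keyword matching; the real extension asks an LLM."""
    keywords = {"politics": ["election", "senate"], "crypto": ["bitcoin", "token"]}
    lowered = text.lower()
    return {
        topic: min(1.0, 0.6 * sum(word in lowered for word in words))
        for topic, words in keywords.items()
    }

def should_hide(text: str, thresholds: dict[str, float] = THRESHOLDS) -> bool:
    """Hide a post if any topic's score meets or exceeds its threshold."""
    scores = score_post(text)
    return any(scores[t] >= th for t, th in thresholds.items() if t in scores)
```

With these thresholds, a post mentioning both "bitcoin" and "token" scores 1.0 on "crypto" and is hidden, while an off-topic post passes through.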
Source: https://github.com/ricklamers/x-ai-content-filter-groq

DEI: Enhancing Software Engineering Problem-Solving with Agent Diversity

DEI (Diversity Empowered Intelligence) is a framework that leverages the diverse expertise of LLM agents to improve problem-solving in software engineering tasks.

  • A meta-module designed to work with existing software engineering agent frameworks, managing agent collectives for enhanced performance.
  • Experimental results show significant improvements over individual agents. For example, a group of open-source agents guided by DEI achieved a 34.3% resolve rate on SWE-Bench Lite, a 25% improvement over the best individual agent's 27.3% rate.
  • The best-performing group using DEI reached a 55% resolve rate, securing the highest ranking on SWE-Bench Lite and outperforming most closed-source solutions.
  • This research contributes to the field of collaborative AI systems and their application in solving complex software engineering challenges.
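The idea of a meta-module aggregating a collective of agents can be illustrated with a simple committee vote. This is not the paper's actual algorithm (DEI's management of agent collectives is more involved); it only sketches the aggregation principle of preferring a candidate patch endorsed by multiple agents.

```python
from collections import Counter

def committee_select(candidates: dict[str, str]) -> str:
    """candidates maps agent name -> proposed patch text.
    Returns the patch proposed by the largest number of agents."""
    votes = Counter(candidates.values())
    patch, _count = votes.most_common(1)[0]
    return patch

# Hypothetical example: three agents, two of which converge on the same fix.
proposals = {
    "agent_a": "fix: check for None before dereference",
    "agent_b": "fix: check for None before dereference",
    "agent_c": "fix: widen exception handler",
}
```

Here `committee_select(proposals)` returns the None-check fix, since two of the three agents proposed it.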
Tools you can use from the paper:
No implementation tools or repository links are provided.

Source: Diversity Empowers Intelligence: Integrating Expertise of Software Engineering Agents

DeeperMatcher: LLM-Powered Crowd-Based Requirements Engineering

DeeperMatcher is an approach that helps agile teams utilize crowd-based requirements engineering (CrowdRE) in their issue and task management using LLMs.

  • The tool matches issues with relevant user reviews, aiding in the convergence of developers' and users' perspectives.
  • Validation was conducted on an existing English dataset from a well-known open-source project.
  • A single-case mechanism experiment was performed with developers using Brazilian Portuguese feedback to test multilingual support.
  • Preliminary analysis shows the accuracy of the approach is highly dependent on the text embedding method used.
  • Further refinements are needed to ensure reliable crowd-based requirements engineering with multilingual support.
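Since the approach matches issues to reviews by embedding similarity, and its accuracy hinges on the embedding method, the matching step can be sketched as follows. Assumption: a real system would use an LLM embedding model; here a bag-of-words vector stands in for the embedding, which is exactly the kind of substitution the paper's findings warn will change accuracy.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Stand-in embedding: bag-of-words term counts."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def best_review(issue: str, reviews: list[str]) -> str:
    """Return the user review most similar to the issue text."""
    issue_vec = embed(issue)
    return max(reviews, key=lambda r: cosine(issue_vec, embed(r)))

# Hypothetical data: one issue, two candidate user reviews.
reviews = [
    "the app crashes when I upload a photo",
    "love the new dark mode theme",
]
```

Note that this toy embedding already misses "crash" vs "crashes"; swapping in a multilingual LLM embedding is precisely where the paper reports accuracy varies.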

Source: Multilingual Crowd-Based Requirements Engineering Using Large Language Models

Hyperion: Detecting DApp Inconsistencies with LLMs and Symbolic Execution

Hyperion is an approach that automatically identifies inconsistencies between front-end descriptions and back-end smart contract implementations in decentralized applications (DApps).

  • The method combines a fine-tuned LLaMA2 model for analyzing DApp descriptions with dataflow-guided symbolic execution for contract bytecode analysis.
  • Seven types of inconsistencies were identified through an empirical study of real-world DApps, serving as a basis for detection patterns.
  • Evaluation on a ground truth dataset of 54 DApps showed 84.06% recall and 92.06% precision in reporting inconsistencies.
  • Analysis of 835 real-world DApps revealed 459 applications containing at least one inconsistency, highlighting the prevalence of this issue.
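The final comparison step can be sketched as checking a claim recovered from the front end against a fact recovered from the contract. This is an illustration, not Hyperion's implementation: in Hyperion the claim comes from a fine-tuned LLaMA2 model reading the description and the fact from dataflow-guided symbolic execution over bytecode; here both are supplied directly as dictionaries.

```python
def check_fee_claim(description_claim: dict, contract_fact: dict) -> list[str]:
    """Compare a front-end fee claim against the analyzed contract behavior.
    Returns a list of reported inconsistencies (empty if consistent)."""
    issues = []
    claimed = description_claim.get("fee_percent")
    actual = contract_fact.get("fee_percent")
    if claimed is not None and actual is not None and claimed != actual:
        issues.append(
            f"fee mismatch: description says {claimed}%, contract charges {actual}%"
        )
    return issues
```

For example, a description advertising zero fees paired with a contract that deducts 5% would be reported as one inconsistency, while matching values yield an empty report.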

Source: Hyperion: Unveiling DApp Inconsistencies using LLM and Dataflow-Guided Symbolic Execution

HGEN: Automated Software Documentation Generation

HGEN is a pipeline that automatically generates hierarchical software documentation from source code using LLMs, addressing the challenge of maintaining high-quality documentation.

  • The system transforms source code through six stages to create well-organized, formatted document hierarchies.
  • Evaluation involved generating documentation for three diverse projects and comparing it with manually-crafted documentation, showing similar quality but higher concept coverage.
  • HGEN was piloted in nine industrial projects, with stakeholder feedback highlighting its potential for accelerating code comprehension and maintenance tasks.
  • The tool aims to solve the common issue of inadequate documentation in codebases due to the time-consuming nature of creating and maintaining multi-level software documentation.
  • Results and supplemental materials are available at https://zenodo.org/records/11403244.
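The roll-up idea behind a document hierarchy can be sketched in a few lines. This is a hypothetical stand-in for HGEN's LLM-driven six-stage pipeline: it only shows the structural step of grouping file-level summaries into per-directory sections.

```python
from collections import defaultdict

def build_hierarchy(file_summaries: dict[str, str]) -> dict[str, list[str]]:
    """Group 'dir/file.py' summaries into {top_level_dir: [summary, ...]}."""
    tree: dict[str, list[str]] = defaultdict(list)
    for path, summary in sorted(file_summaries.items()):
        top = path.split("/", 1)[0] if "/" in path else "."
        tree[top].append(summary)
    return dict(tree)

# Hypothetical file-level summaries (in HGEN these would be LLM-generated).
summaries = {
    "auth/login.py": "Handles user login.",
    "auth/token.py": "Issues session tokens.",
    "db/models.py": "Defines ORM models.",
}
```

In a full pipeline, each grouped section would itself be summarized again to produce the higher levels of the hierarchy.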

Source: Supporting Software Maintenance with Dynamically Generated Document Hierarchies

Security Analysis of LLM-Generated Code

A study examining the security of code generated by major LLMs for Python and JavaScript, using the MITRE CWE catalog as a security benchmark.

  • For some LLMs, initial code generation produced insecure code in 65% of cases, as judged by trained security engineers.
  • The quality of the programmer's prompt significantly influences the security of the generated code.
  • With increasing manual guidance from a skilled engineer, almost all analyzed LLMs eventually produced nearly 100% secure code.
  • Lack of best-practice examples in training data may contribute to security weaknesses in AI-generated code.
  • The study emphasizes the ongoing need for human expertise in ensuring the security of AI-assisted programming tasks.
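The generate-review-reprompt loop the study describes can be sketched as follows. Both functions are stubs: `generate` stands in for an LLM call and `review` for a security engineer (or CWE-based scanner) flagging weaknesses; the stubs hard-code a single SQL-injection scenario purely to make the loop observable.

```python
def generate(prompt: str, findings: list[str]) -> str:
    """Stand-in for an LLM call; 'fixes' whatever the reviewer flagged."""
    if "CWE-89" in findings:
        # Parameterized query after the injection finding was fed back.
        return "cur.execute('SELECT * FROM users WHERE id = ?', (uid,))"
    # First attempt: string-interpolated SQL (vulnerable, CWE-89).
    return "cur.execute(f'SELECT * FROM users WHERE id = {uid}')"

def review(code: str) -> list[str]:
    """Stub security review: flag string-interpolated SQL as CWE-89."""
    return ["CWE-89"] if "{uid}" in code else []

def guided_generation(prompt: str, max_rounds: int = 3) -> tuple[str, int]:
    """Re-prompt with review findings until none remain (or rounds run out)."""
    findings: list[str] = []
    for round_no in range(1, max_rounds + 1):
        code = generate(prompt, findings)
        findings = review(code)
        if not findings:
            return code, round_no
    return code, max_rounds
```

In this toy run the first round yields vulnerable code, the finding is fed back, and the second round converges on the parameterized query, mirroring the study's observation that guided iteration drives the code toward secure output.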
Tools you can use from the paper:
No implementation tools or repository links are provided.

Source: "You still have to study" -- On the Security of LLM generated code