[AI Dev Tools] Chat with your dirs, Dependency Management, and Self-Verification
![[AI Dev Tools] Chat with your dirs, Dependency Management, and Self-Verification](/content/images/size/w960/2024/07/Screenshot-2024-07-12-165812.png)
Dir-assistant: Chat with Directory Files using Local or API LLMs
Dir-assistant is a tool that enables chatting with files in your current directory using local or API-based LLMs, featuring CGRAG (Contextually Guided Retrieval-Augmented Generation) for improved accuracy.
Key Features:
- Supports local LLMs via llama-cpp-python and API LLMs through LiteLLM, with platform support for various CPU and GPU architectures.
- Implements file watching to automatically update the index when files change, eliminating the need for manual restarts.
- Utilizes a RAG system with embedding models to identify the files most relevant to a query before passing them to the LLM (a minimal sketch of this idea appears after this list).
- Offers configuration options for both local and API LLMs, allowing customization of model parameters and API settings.
- Provides file ignoring capabilities through command-line arguments and a global ignore list in the configuration file.
- Example use: Navigate to a directory and run "dir-assistant" to start chatting with the files in that location.
- Example use: Ignore specific files or directories by running "dir-assistant --ignore some-project-directory .git .gitignore".
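To make the retrieval step concrete, here is a minimal Python sketch of embedding-based file selection. It is a simplified illustration, not dir-assistant's actual code: the `SentenceTransformer` model name, the truncation limit, and the `relevant_files` helper are assumptions made for the example.

```python
# Minimal, hypothetical sketch of embedding-based file selection
# (not dir-assistant's actual code).
from pathlib import Path

import numpy as np
from sentence_transformers import SentenceTransformer  # assumed embedding backend

model = SentenceTransformer("all-MiniLM-L6-v2")  # model chosen only for the example

def index_directory(root: str) -> dict[str, np.ndarray]:
    """Embed every readable, non-empty file under `root`."""
    index: dict[str, np.ndarray] = {}
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        try:
            text = path.read_text(errors="ignore")
        except OSError:
            continue
        if not text.strip():
            continue
        index[str(path)] = model.encode(text[:4000])  # truncate long files
    return index

def relevant_files(question: str, index: dict[str, np.ndarray], k: int = 5) -> list[str]:
    """Rank indexed files by cosine similarity to the question embedding."""
    q = model.encode(question)
    scores = {
        path: float(np.dot(q, emb) / (np.linalg.norm(q) * np.linalg.norm(emb)))
        for path, emb in index.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Usage: the top-k file contents are prepended to the question before the
# prompt is sent to the local or API LLM.
index = index_directory(".")
print(relevant_files("Where is the database connection configured?", index))
```

Roughly speaking, dir-assistant's CGRAG option layers a contextually guided second retrieval pass on top of a plain RAG flow like this one.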
DepsRAG: Managing Software Dependencies with LLMs
DepsRAG is a proof-of-concept approach that uses Retrieval Augmented Generation (RAG) to manage software dependencies across four popular ecosystems.
- The system constructs a Knowledge Graph (KG) of direct and transitive dependencies for software packages.
- It answers user questions about dependencies by generating queries to retrieve information from the KG and augmenting LLM inputs with this data.
- Web search capability is included to address questions beyond the KG's scope.
- DepsRAG aims to simplify the complex task of understanding dependencies, revealing hidden properties such as dependency chains and dependency depth (a simplified sketch appears after this list).
- While offering tangible benefits, the approach also has limitations that are acknowledged by the developers.
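For illustration only, here is a minimal Python sketch of the knowledge-graph idea. DepsRAG builds its KG automatically and answers questions by generating queries against it; the adjacency map, package names, and `transitive_deps` helper below are invented for the example.

```python
# Hypothetical dependency knowledge graph (not DepsRAG's actual implementation).
# Nodes are packages; edges point from a package to its direct dependencies.
from collections import deque

direct_deps: dict[str, list[str]] = {
    "my-app": ["requests", "flask"],
    "requests": ["urllib3", "idna"],
    "flask": ["werkzeug", "jinja2"],
    "jinja2": ["markupsafe"],
}

def transitive_deps(package: str) -> dict[str, int]:
    """Return every reachable dependency with its depth in the chain."""
    depths: dict[str, int] = {}
    queue = deque([(package, 0)])
    while queue:
        pkg, depth = queue.popleft()
        for dep in direct_deps.get(pkg, []):
            if dep not in depths or depth + 1 < depths[dep]:
                depths[dep] = depth + 1
                queue.append((dep, depth + 1))
    return depths

print(transitive_deps("my-app"))
# {'requests': 1, 'flask': 1, 'urllib3': 2, 'idna': 2, 'werkzeug': 2, 'jinja2': 2, 'markupsafe': 3}
```

In DepsRAG itself, questions about such chains are answered by having the LLM generate retrieval queries against the real graph rather than by hand-written traversals like this one.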
Source: DepsRAG: Towards Managing Software Dependencies using Large Language Models
FunCoder: Recursive Function Decomposition for Complex Code Generation
FunCoder is a code generation framework that uses recursive function decomposition and consensus-based evaluation to improve performance on complex programming tasks.
- The framework employs a divide-and-conquer strategy, breaking down complex requirements into smaller, manageable sub-functions organized in a tree hierarchy.
- Sub-functions are composed to achieve more complex objectives, allowing for better handling of intricate programming requirements.
- FunCoder uses functional consensus to select functions by identifying similarities in program behavior, which helps mitigate error propagation (a minimal sketch appears after this list).
- Benchmark results show FunCoder outperforms state-of-the-art methods by an average of 9.8% on HumanEval, MBPP, xCodeEval, and MATH datasets using GPT-3.5 and GPT-4.
- The framework also enhances performance of smaller models, enabling StableCode-3b to surpass GPT-3.5 by 18.6% and achieve 97.7% of GPT-4's performance on HumanEval.
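The consensus step can be pictured with a short Python sketch. This is a hypothetical illustration of the general idea, not FunCoder's implementation: the candidate functions, the shared test inputs, and the `functional_consensus` helper are all made up for the example.

```python
# Hypothetical sketch of functional consensus (not FunCoder's implementation).
# Candidate implementations of a sub-function are compared by their behavior on
# shared inputs; the candidate that agrees with the most others is kept.
from typing import Callable

def functional_consensus(candidates: list[Callable], inputs: list) -> Callable:
    def outputs(fn: Callable) -> list:
        results = []
        for x in inputs:
            try:
                results.append(fn(x))
            except Exception:
                results.append(None)  # a crash counts as its own "behavior"
        return results

    behaviors = [outputs(fn) for fn in candidates]
    # Score each candidate by how many other candidates produce identical outputs.
    scores = [
        sum(b == other for other in behaviors) - 1  # exclude self-agreement
        for b in behaviors
    ]
    return candidates[scores.index(max(scores))]

# Example: three sampled implementations of a "square" sub-function, one buggy.
square_a = lambda x: x * x
square_b = lambda x: x ** 2
square_c = lambda x: x + x  # buggy outlier

best = functional_consensus([square_a, square_b, square_c], inputs=[0, 1, 2, 3])
print(best(4))  # 16 -- the majority behavior wins
```

The selected sub-function is then composed with its siblings further up the function tree, which is where the divide-and-conquer decomposition pays off.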
Source: Divide-and-Conquer Meets Consensus: Unleashing the Power of Functions in Code Generation
ChatGPT's Self-Verification Capability in Code-Related Tasks: An Empirical Study
A comprehensive empirical study evaluates ChatGPT's ability to self-verify its own output in code generation, code completion, and program repair tasks.
- The study assesses ChatGPT's capability to generate correct code, complete code without vulnerabilities, and repair buggy code, followed by self-verification of these tasks.
- Findings reveal that ChatGPT often incorrectly predicts its generated faulty code as correct, demonstrating self-contradictory hallucinations in its behavior.
- The self-verification capability of ChatGPT can be improved with guiding questions that ask about assertions on incorrectly generated or repaired code and about vulnerabilities in completed code (a sketch of such a prompt appears after this list).
- ChatGPT-generated test reports can identify more vulnerabilities in completed code, but explanations for incorrectly generated code and failed repairs are mostly inaccurate.
- The study provides implications for future research and development using ChatGPT in software development processes.
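To illustrate the guiding-question idea, here is a hypothetical prompt format in Python. The wording, the `is_even` example, and the assertion are invented for this sketch and are not taken from the study.

```python
# Hypothetical guiding-question prompt for self-verification (the exact wording
# used in the study may differ). The generated code and a concrete assertion are
# fed back to the model, which is asked whether the assertion holds.
generated_code = """\
def is_even(n):
    return n % 2 == 1   # buggy: returns True for odd numbers
"""

assertion = "assert is_even(4) == True"

guiding_prompt = (
    "Here is a function you generated:\n\n"
    f"{generated_code}\n"
    "Consider the following assertion:\n\n"
    f"{assertion}\n\n"
    "Does this assertion pass when the function is executed? "
    "Answer 'yes' or 'no' and explain briefly."
)

# `guiding_prompt` would then be sent back to ChatGPT; anchoring the question on
# a concrete assertion makes blanket "the code is correct" answers less likely.
print(guiding_prompt)
```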
Source: Fight Fire with Fire: How Much Can We Trust ChatGPT on Source Code-Related Tasks?
LLM-Based Software Vulnerability Detection: A Benchmarking Study
A comprehensive study evaluates the effectiveness of LLMs in detecting software vulnerabilities, comparing their performance to that of traditional static analysis tools.
- The study proposes using LLMs to assist in finding vulnerabilities in source code, leveraging their ability to understand and generate code.
- Multiple state-of-the-art LLMs were tested to identify the best prompting strategies for optimal performance in vulnerability detection.
- LLMs outperformed traditional static analysis tools in terms of recall and F1 scores, identifying a greater number of issues (the metrics are sketched after this list).
- The research provides an overview of the strengths and weaknesses of the LLM-based approach to vulnerability detection.
- The findings aim to help software developers and security analysts keep their code free of vulnerabilities.
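As a reminder of what the recall and F1 comparison measures, here is a small Python sketch with invented findings; the file names, line numbers, and counts are made up and do not come from the study's benchmark.

```python
# Hypothetical evaluation sketch (invented data, not the study's benchmark).
# Recall and F1 are computed by comparing reported findings against a set of
# known (ground-truth) vulnerability locations.
ground_truth = {("app.py", 12), ("db.py", 40), ("auth.py", 7), ("api.py", 88)}
llm_findings = {("app.py", 12), ("db.py", 40), ("auth.py", 7), ("util.py", 3)}
static_tool_findings = {("app.py", 12)}

def precision_recall_f1(findings: set, truth: set) -> tuple[float, float, float]:
    true_positives = len(findings & truth)
    precision = true_positives / len(findings) if findings else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

print("LLM:         P=%.2f R=%.2f F1=%.2f" % precision_recall_f1(llm_findings, ground_truth))
print("Static tool: P=%.2f R=%.2f F1=%.2f" % precision_recall_f1(static_tool_findings, ground_truth))
```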