
[AI Dev Tools] Command-Line Assistants, LLM-Powered Linting, AI-Enhanced Development Workflows ...

Source: What’s Wrong with Your Code Generated by Large Language Models? An Extensive Study https://arxiv.org/pdf/2407.06153v1

Shell-ask: AI-Powered Command-Line Assistant

Shell-ask is a command-line tool that allows users to interact with LLMs directly from their terminal, enabling natural language queries and task automation.

Key Features:
  • Supports multiple LLMs including OpenAI, Anthropic Claude, Ollama, Google Gemini, and Groq.
  • Processes natural language queries and generates appropriate shell commands or explanations.
  • Allows piping of command outputs or file contents as context for queries.
  • Provides options for follow-up questions, web searches, and fetching web page content.
  • Offers customizable result types, including command-only output and structured data.
  • Enables the creation of reusable AI commands for common tasks.
  • Installation is simple via npm: `npm i -g shell-ask`.
  • For example, you can generate git commit messages from diff output or convert video files using ffmpeg with natural-language instructions; see the sketch after this list.
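
The two examples above might look like this in practice. This is an illustrative sketch assuming the `ask` binary that shell-ask installs; the `-c` (command-only) flag mirrors the result-type option in the feature list, but verify flag names against `ask --help`.

```sh
# Install the CLI globally, then pipe command output in as context.
npm i -g shell-ask

# Draft a commit message from the staged diff.
git diff --staged | ask "write a concise git commit message for this change"

# Ask for a command only (no explanation); the -c flag is an assumption here.
ask "ffmpeg command to convert screen.mov to a 720p mp4" -c
```
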
Source: https://github.com/egoist/shell-ask

Code2Prompt: Codebase-to-AI-Prompt Conversion Tool (1)

Code2Prompt is a command-line tool that generates comprehensive prompts from codebases, facilitating interactions between developers and LLMs for code analysis, documentation, and improvement tasks.

Key Features:
  • Generates well-structured Markdown prompts that capture the entire project's essence, including a hierarchical view of the codebase structure.
  • Offers customizable prompt templates using Jinja2, allowing tailored outputs for specific AI tasks.
  • Implements smart token management to ensure compatibility with various LLM token limits.
  • Integrates with .gitignore rules and provides flexible file handling using glob patterns for accurate project representation.
  • Provides multiple output options, including clipboard copying, file saving, and console display.
  • Enhances code readability by adding line numbers to source code blocks for precise referencing.
  • Enables contextual understanding for LLMs, leading to more accurate suggestions, improved documentation, and efficient refactoring recommendations.
  • Includes a token price estimation feature for various AI providers and models, helping developers manage costs effectively; a brief usage sketch follows this list.
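
A brief usage sketch, assuming the PyPI package name and the `--path` and `--tokens` options described in the project's README; flags may differ between releases, so check `code2prompt --help`.

```sh
# Install the Python CLI (package name assumed from the README).
pip install code2prompt

# Turn a whole project into a single Markdown prompt.
code2prompt --path ./my-project --output prompt.md

# Report the token count so the prompt fits the target model's context window.
code2prompt --path ./my-project --tokens
```
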
Source: https://github.com/raphaelmansuy/code2prompt

GPTLint: LLM-Powered Code Quality Enforcement

GPTLint uses LLMs to enforce higher-level best practices across codebases, extending traditional static analysis tools like ESLint.

Key Features:
  • Uses LLMs to enforce complex best practices that lie beyond the reach of traditional AST-based analysis.
  • Uses simple markdown format for rules, with easy customization and project-specific rule creation.
  • Integrates seamlessly with existing workflows, supporting the same CLI and config format as ESLint, including inline overrides and config files.
  • Provides content-based caching and outputs LLM stats per run for cost and token usage tracking.
  • Supports various LLM providers and local models, with extensively tested built-in rules.
  • For example, you can enforce array indexing best practices or create custom rules specific to your project's needs; a minimal run is sketched after this list.
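
A minimal run might look like the following; the npm package name and OpenAI key setup follow the README, and the remaining ESLint-style options can be listed with `npx gptlint --help`.

```sh
# Install as a dev dependency and provide an LLM API key.
npm install -D gptlint
export OPENAI_API_KEY='sk-...'

# Lint the project with the built-in rules; file globs and config files
# work much like ESLint's.
npx gptlint
```
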
Source: https://github.com/gptlint/gptlint

llm.nvim: Neovim Plugin for LLM-Assisted Programming

llm.nvim is a Neovim plugin that enables LLM-assisted programming with a minimalist approach.

Key Features:
  • Integrates LLM capabilities directly into Neovim, allowing users to prompt LLMs with selected text or file content up to the cursor.
  • Supports multiple LLM services including Groq, OpenAI, and Anthropic, with the ability to add custom OpenAI-compatible services.
  • Offers flexible configuration options, including timeout settings and service-specific parameters.
  • Provides functions for creating dedicated LLM interaction files and using text objects and motions for LLM prompts.
  • Installation requires setting up API keys for the desired LLM services and adding the plugin to Neovim using a package manager like lazy.nvim; the key setup is sketched after this list.
  • Users can customize keybindings for various LLM interactions, such as prompting, replacing text, and using different services.
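
The plugin's configuration itself is Lua, so only the environment side is sketched here; the variable names below are the conventional ones for these providers and should be checked against the plugin's README.

```sh
# Export keys for whichever services you enable in the plugin config.
export GROQ_API_KEY='gsk_...'
export OPENAI_API_KEY='sk-...'
export ANTHROPIC_API_KEY='sk-ant-...'
```
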
Source: https://github.com/melbaldove/llm.nvim

LLM Answer Engine: Advanced Query Processing with Multiple AI Technologies

An answer engine that leverages multiple AI technologies to process user queries and return comprehensive results including sources, answers, images, videos, and follow-up questions. A local setup sketch follows the feature list.

Key Features:
  • Utilizes a combination of technologies including Groq, Mistral AI's Mixtral, Langchain.JS, Brave Search, Serper API, and OpenAI for query processing and content retrieval.
  • Built with Next.js and Tailwind CSS, providing a modern and responsive user interface.
  • Implements RAG (Retrieval-Augmented Generation) techniques using OpenAI Embeddings and Langchain.JS for text operations.
  • Offers optional features like Ollama for local inference and embeddings, and Upstash Redis for rate limiting and semantic caching.
  • Includes function calling support for enhanced capabilities such as location services, shopping, stock data, and Spotify integration.
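
To try it locally, the setup is the standard Next.js workflow; the environment variables below are assumptions based on the services listed above, so take the exact names from the repository's documentation.

```sh
git clone https://github.com/developersdigest/llm-answer-engine
cd llm-answer-engine
npm install

# Keys for the services above (Groq, OpenAI, Brave Search, Serper);
# exact variable names are documented in the repo.
export GROQ_API_KEY='...'
export OPENAI_API_KEY='...'

npm run dev   # Next.js dev server, typically http://localhost:3000
```
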
Source: https://github.com/developersdigest/llm-answer-engine

code2prompt: Codebase to LLM Prompt Converter (2)

This is another project called code2prompt, with similar functionality. It is a CLI tool that transforms a codebase into a single LLM prompt, featuring a source-tree view, prompt templating, and token counting.

Key Features:
  • Generates well-formatted Markdown prompts from entire codebases, respecting .gitignore and allowing file filtering via glob patterns.
  • Customizable prompt generation using Handlebars templates, with built-in templates for common use cases like code documentation and security vulnerability detection.
  • Displays token count of generated prompts using various tokenizers compatible with OpenAI models.
  • Includes optional features like Git diff output, automatic clipboard copying, and line numbering for source code blocks.
  • Supports user-defined variables in templates, allowing for dynamic prompt customization based on user input.
  • For example, you can use it to generate a Git commit message for staged files or create a GitHub Pull Request description by comparing branches; see the sketch after this list.
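
A hedged usage sketch for this Rust CLI; the flag names follow the README but may differ across versions, and the template filename is purely illustrative.

```sh
# Install from crates.io, then prompt-ify a codebase with a token count.
cargo install code2prompt
code2prompt ./my-project --tokens

# Respect .gitignore while filtering files with glob patterns.
code2prompt ./my-project --include "src/**/*.rs" --exclude "**/tests/**"

# Feed the staged Git diff through a Handlebars template, e.g. to draft
# a commit message (commit.hbs is a hypothetical template file).
code2prompt ./my-project --diff --template commit.hbs
```
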
Source: https://github.com/mufeedvh/code2prompt

Plock: LLM-Powered Text Generation from Anywhere

Plock is a tool that enables users to generate text using LLMs directly from any text input field, with real-time streaming output.

Key Features:
  • Seamless integration with any application where text can be typed. Users can write a prompt, select it, and use a keyboard shortcut to replace it with the LLM-generated output.
  • Context-aware generation using clipboard content. Users can copy text as context before writing a prompt, enhancing the relevance of generated content.
  • Fully local operation by default, with options to use external APIs or custom shell scripts for text generation.
  • Customizable settings through a JSON file, allowing users to modify shortcuts, models, prompts, and output behavior.
  • Flexible trigger system that supports chaining actions, environment variable storage, and multiple output options including streaming text, writing final text, or displaying images.
Source: https://github.com/jasonjmcghee/plock

CodeUpdateArena: Benchmark for Knowledge Editing in Code LLMs

CodeUpdateArena is a benchmark designed to evaluate how Large Language Models (LLMs) can update their knowledge about evolving API functions in the code domain.

  • The benchmark consists of synthetic API function updates paired with program synthesis examples utilizing the updated functionality.
  • It covers 54 functions from seven diverse Python packages, with 670 program synthesis examples across various update types.
  • Success requires LLMs to correctly reason about the semantics of modified functions, not just reproduce syntax.
  • Experiments show that existing knowledge editing techniques and prepending update documentation to open-source code LLMs have limited effectiveness.
  • The benchmark aims to inspire new methods for knowledge updating in code LLMs, addressing the challenge of keeping models current with evolving libraries and APIs.
Tools you can use from the paper:

Source: CodeUpdateArena: Benchmarking Knowledge Editing on API Updates

Rectifier: Error Correction for LLM-Based Code Translation

A general corrector model designed to repair translation errors in code generated by LLMs during language migration tasks.

  • Rectifier addresses common errors in LLM-based code translation, including compilation, runtime, functional, and non-terminating execution issues.
  • The model learns from errors generated by existing LLMs, making it applicable to correct mistakes produced by any large language model.
  • Experimental results demonstrate Rectifier's effectiveness in repairing translations between C++, Java, and Python.
  • Cross-experiments highlight the robustness of the method, suggesting its potential for broader application in software migration tasks.
Tools you can use from the paper:
No implementation tools or repository links are provided.

Source: Rectifier: Code Translation with Corrector via LLMs

Prompting Techniques for Secure Code Generation: A Systematic Investigation

A study investigating the impact of different prompting techniques on the security of code generated by LLMs from natural language instructions.

  • The research identified and classified potential prompting techniques for code generation through a systematic literature review.
  • A subset of these techniques was adapted and evaluated for secure code generation tasks using GPT-3, GPT-3.5, and GPT-4 models.
  • The evaluation used an existing dataset of 150 security-relevant code-generation prompts.
  • Results showed a reduction in security weaknesses across the tested LLMs, particularly when using the Recursive Criticism and Improvement (RCI) technique; the RCI loop is sketched after this list.
  • The study contributes valuable insights to the ongoing discourse on the security of LLM-generated code.
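
The paper's prompts are not reproduced here, but the shape of RCI (generate, self-critique, improve) is easy to sketch. The snippet below improvises it with the shell-ask CLI covered earlier; it is an assumption of this newsletter, not the authors' tooling.

```sh
# Recursive Criticism and Improvement, improvised with shell-ask's `ask`.
# 1) Generate code for a security-sensitive task.
code=$(ask "Write a C function that reads a username into a fixed-size buffer")

# 2) Ask the model to criticize its own output for security weaknesses.
critique=$(echo "$code" | ask "List security weaknesses (CWEs) in this code")

# 3) Ask it to improve the code against its own critique. Repeat as needed.
echo "$code" | ask "Rewrite the code to fix every issue in this critique: $critique"
```
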
Tools you can use from the paper:
No implementation tools or repository links are provided.

Source: Prompting Techniques for Secure Code Generation: A Systematic Investigation

LLM Code Generation: Performance Analysis and Bug Mitigation

A comprehensive study evaluating the performance of several LLMs in code generation, identifying common bugs, and proposing a novel method for improving code quality.

  • The study assessed three closed-source and four open-source LLMs on three popular benchmarks, analyzing code length, cyclomatic complexity, and API usage.
  • LLMs struggled with complex problems, often producing shorter but more complicated code compared to canonical solutions.
  • A taxonomy of bugs was developed, categorizing errors into three main categories and 12 sub-categories.
  • A new real-world benchmark of 140 code generation tasks revealed different bug distributions compared to existing benchmarks.
  • The researchers proposed a training-free iterative method using self-critique and compiler feedback, which increased the passing rate by 29.2% after two iterations; the loop is sketched after this list.
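
The authors' pipeline is not released, so the loop below is an improvised sketch using the shell-ask CLI covered earlier and plain Python syntax checking as a stand-in for the paper's compiler feedback.

```sh
# Training-free repair loop: generate, check, feed errors back. Improvised;
# not the paper's code. Assumes the model emits raw code with no fences.
TASK="reverse the words in each line of stdin"
ask "Write a Python script that will $TASK. Output only code." > solution.py

for i in 1 2; do   # the paper reports a +29.2% pass rate after two iterations
  feedback=$(python -m py_compile solution.py 2>&1)
  [ -z "$feedback" ] && break   # no compiler complaints; stop early
  cat solution.py | ask "Fix this code. Compiler feedback: $feedback. Output only code." > solution.py
done
```
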
Tools you can use from the paper:
No implementation tools or repository links are provided.

Source: What's Wrong with Your Code Generated by Large Language Models? An Extensive Study