AI-enabled pair programmer for Claude, GPT, the O series, Grok, DeepSeek, Gemini, and 300+ models
Updated Apr 12, 2026 · Rust
Every Code - push frontier AI to its limits. A fork of the Codex CLI with validation, automation, browser integration, multi-agent support, theming, and much more. Orchestrate agents from OpenAI, Claude, Gemini, or any provider.
Fast, cross-platform, real-time token usage tracker and cost monitor for Gemini CLI / Claude Code / Codex CLI / Qwen Code / Cline / Roo Code / Kilo Code / GitHub Copilot / OpenCode / Pi Agent / Piebald.
Modern desktop application (Rust + Tauri v2 + Svelte 5 + Candle (HF)) for communicating with AI models that runs completely locally on your computer. No subscriptions, no data sent to the internet — just you and your personal AI assistant.
Local LLM inference engine written from scratch in Rust — hand-written AVX-512 assembly kernels, Metal & Vulkan compute shaders. Supports Qwen3, Mistral3, ... Q4/INT8/BF16 quantization.
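The Q4/INT8/BF16 quantization this engine supports can be illustrated with the common symmetric per-tensor INT8 scheme. This is a generic sketch of the technique, not code from the repository; all names here are illustrative.

```rust
// Symmetric per-tensor INT8 quantization: map f32 weights into i8 with a
// single scale factor, so the tensor shrinks 4x at a small accuracy cost.
// Illustrative sketch only; real engines use per-group/per-channel scales.

/// Quantize f32 weights to i8 with one per-tensor scale.
fn quantize_int8(weights: &[f32]) -> (Vec<i8>, f32) {
    let max_abs = weights.iter().fold(0f32, |m, w| m.max(w.abs()));
    let scale = if max_abs == 0.0 { 1.0 } else { max_abs / 127.0 };
    let q = weights
        .iter()
        .map(|w| (w / scale).round().clamp(-127.0, 127.0) as i8)
        .collect();
    (q, scale)
}

/// Recover approximate f32 values from the quantized form.
fn dequantize_int8(q: &[i8], scale: f32) -> Vec<f32> {
    q.iter().map(|&v| v as f32 * scale).collect()
}

fn main() {
    let w = [0.5f32, -1.0, 0.25, 0.9];
    let (q, scale) = quantize_int8(&w);
    let back = dequantize_int8(&q, scale);
    // Round-trip error is bounded by scale / 2 per element.
    for (orig, approx) in w.iter().zip(&back) {
        assert!((orig - approx).abs() < 0.01);
    }
    println!("scale = {scale}, quantized = {q:?}");
}
```

Q4 works the same way with a 16-level grid instead of 255, which is why it is usually paired with small per-group scales rather than one per-tensor scale.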
Apple Neural Engine (ANE) LLM inference engine — reverse-engineered private APIs, Metal GPU shaders, hybrid ANE+GPU+CPU on Apple Silicon. 32 tok/s matching llama.cpp, 3.6 TFLOPS fused ANE mega-kernels.
A lightweight, portable executable for invoking LLMs with multi-API support - no installation required. A single ~10 MB binary that supports custom models (OpenAI, Claude, Gemini, DeepSeek, and other third-party APIs) and custom prompts.
Production-grade LLM client - Rust, Python, TypeScript. 100+ providers, 11,000+ models.
Rust SDK for the Alibaba DashScope Qwen LLM API.
A modern real-time chat application built with Rust, WebSocket, and OpenAI integration. NetChat supports both group chat and AI assistant features, along with file sharing capabilities.
Rust API wrapper for the chatllm.cpp LLM inference library.
Rust-native MoE inference runtime with custom CUDA kernels for Blackwell GPUs. Includes DFlash speculative decoding, multi-tier Engram memory, and entropy-adaptive routing. Targets Qwen3.5-35B-A3B on a single RTX 5060 Ti 16GB.
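The speculative decoding mentioned above (DFlash here) follows a draft-then-verify pattern; as a generic illustration of that pattern - not the repository's actual algorithm - here is a toy greedy accept/reject loop where both "models" are plain closures.

```rust
// Toy greedy speculative decoding: a cheap draft model proposes k tokens
// autoregressively, the expensive target model verifies them, and we keep
// the longest agreeing prefix plus the target's correction at the first
// mismatch. Closures stand in for real models; all names are illustrative.

/// Run one speculation round; returns how many tokens were appended to `ctx`.
fn speculate(
    target: impl Fn(&[u32]) -> u32,
    draft: impl Fn(&[u32]) -> u32,
    ctx: &mut Vec<u32>,
    k: usize,
) -> usize {
    let base = ctx.len();
    // Draft phase: propose k tokens with the cheap model.
    let mut proposed = ctx.clone();
    for _ in 0..k {
        proposed.push(draft(&proposed));
    }
    // Verify phase: the target recomputes each position; accept while the
    // greedy target token matches the draft, then stop at the correction.
    let mut accepted = 0;
    for i in 0..k {
        let t = target(&proposed[..base + i]);
        ctx.push(t);
        accepted += 1;
        if t != proposed[base + i] {
            break; // target disagreed; next round restarts from here
        }
    }
    accepted
}

fn main() {
    // When draft and target agree, all k drafted tokens are accepted.
    let target = |s: &[u32]| s.last().unwrap() + 1;
    let draft = |s: &[u32]| s.last().unwrap() + 1;
    let mut ctx = vec![0u32];
    let n = speculate(target, draft, &mut ctx, 4);
    println!("accepted {n} tokens, ctx = {ctx:?}");
}
```

The win comes from the verify phase: the target model can score all drafted positions in one batched forward pass instead of k sequential ones.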
CLI tool that explains programming errors using a locally-embedded LLM. No API keys. No internet. No patience required.
🖥️ Explore CPU-SLM, a Rust-based SLM/LLM project that runs on CPU, offering efficient inference and chat with minimal dependencies.
A voice assistant based on Qwen3-Omni, supporting voice cloning to customize the assistant's voice. No installation required - just a single ~18 MB executable. All chat audio is saved locally.
Deterministic Agent Control via Geometric Latent Space. An LLM-driven Lattice VM (Rust/WASM) that compiles natural language intent into verifiable assembly, eliminating execution hallucination.
All-in-one inference box for the vec* family of projects. Supports Qwen3-VL multimodal embeddings.
🤖 Enable runtime governance for AI systems, ensuring accountability and control by determining actions, interventions, and halting processes as needed.