yippie
This commit is contained in:
135
README.md
Normal file
135
README.md
Normal file
@@ -0,0 +1,135 @@
|
|||||||
|
# SmarterAgents: Autonomous Multi-Agent Orchestration Framework
|
||||||
|
|
||||||
|
An elite, production-grade local AI agent framework designed for unified memory architectures, specifically optimized for the **AMD Framework 16 Laptop (Ryzen 9 7940HS / Radeon 780M iGPU / GFX1103)** running accelerated inference via the Mesa Vulkan `RADV` driver on Fedora Linux.
|
||||||
|
|
||||||
|
This framework abandons monolithic agent loops in favor of a **Decoupled Model Context Protocol (MCP)** architecture. It utilizes a single, resident 14B-class LLM, dynamically toggling its cognitive reasoning state to orchestrate a multi-persona pipeline (Planner $\rightarrow$ Builder $\rightarrow$ Reviewer). All local state, execution failures, and design variables are actively vectorized into a localized CPU-bound RAG pipeline for low-latency, self-healing execution.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 1. System Architecture & Directory Layout
|
||||||
|
|
||||||
|
The environment strictly separates the transport broker layer from tool execution and active agent sandboxes. No automated agents write to the `core/` or `dev/` directories.
|
||||||
|
|
||||||
|
```text
|
||||||
|
.
|
||||||
|
├── agents # Immutable agent configuration profiles, tools, and dynamic spaces
|
||||||
|
│ ├── default_agent.md
|
||||||
|
│ ├── modules
|
||||||
|
│ │ └── geoscaper # Low-level workspace tool logic invoked by MCP server
|
||||||
|
│ ├── tools
|
||||||
|
│ │ └── tool_rules.gbnf # Strict structural logit constraints for Builder turns
|
||||||
|
│ └── workspaces/ # Isolated Agent Sandbox Environment
|
||||||
|
│ └── MMDDprojectname_workspace/ # Dynamic task-specific runtime directory
|
||||||
|
│ ├── 00cache.json # Global design vectors, tokens, and active styles
|
||||||
|
│ ├── 00memory_vault.db# In-process sqlite-vec engine (768-dim float array store)
|
||||||
|
│ ├── 00tasks.json # Project feature checklist and sequence goals [Fully Embedded]
|
||||||
|
│ ├── 00lessons_learned.md # Global closed-fault and successful repair ledger
|
||||||
|
│ ├── 01plan_handoff.md# Planner execution payload (Thinking: Enabled)
|
||||||
|
│ ├── 02build_handoff.md# Builder compilation summary payload
|
||||||
|
│ └── 03review_audit.md# Accumulative Audit Ledger [Append-Only Block Records]
|
||||||
|
├── core # Source logic and foundational runtime engines ONLY
|
||||||
|
│ ├── smarterframework.py # Transport broker (Discover, Constrain, Execute)
|
||||||
|
│ ├── llama.cpp # Core inference execution backend
|
||||||
|
│ ├── cache/ # System-level internal runtime engine caches
|
||||||
|
│ └── logs/ # Standard input/output framework system logs
|
||||||
|
├── dev # Strict User Space: Backups, manual synching, manual logs
|
||||||
|
│ ├── 00plan_v1.md
|
||||||
|
│ ├── 00scratch.txt
|
||||||
|
│ ├── agent_stream.log
|
||||||
|
│ ├── llama-server.log
|
||||||
|
│ └── staging
|
||||||
|
│ ├── 01CACHE
|
||||||
|
│ ├── 02OLD
|
||||||
|
│ └── 03PROTO
|
||||||
|
├── models # Binary Storage Tier
|
||||||
|
│ ├── Huihui-Qwen3-8B-abliterated-v2.i1-Q5_K_M.gguf
|
||||||
|
│ ├── Qwen3.5-4B-Q4_K_M.gguf
|
||||||
|
│ └── Qwen3-14B-Instruct-Abliterated-Q4_K_M.gguf
|
||||||
|
├── README.md
|
||||||
|
├── stage.sh
|
||||||
|
└── start.sh
|
||||||
|
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 2. Core Operational Pipelines
|
||||||
|
|
||||||
|
### The Decoupled Transport Broker (`smarterframework.py`)
|
||||||
|
|
||||||
|
The Python orchestrator is entirely tool-agnostic. It does not parse code, resolve paths, or execute system commands directly. It strictly manages the MCP network cycle:
|
||||||
|
|
||||||
|
1. **Discover:** Queries the MCP server for capabilities (`tools/list`).
|
||||||
|
2. **Constrain:** Maps schema parameters to local `GBNF` JSON grammars.
|
||||||
|
3. **Execute:** Forwards rigid argument values to the target tool (`tools/call`).
|
||||||
|
|
||||||
|
### The Multi-Persona State Machine
|
||||||
|
|
||||||
|
To prevent instruction leakage and context dilution, the orchestrator executes a hard wipe of the context window between every phase transition, toggling the model's cognitive mode natively:
|
||||||
|
|
||||||
|
* **Phase 1: The Planner.** (`enable_thinking=True`). Outputs strategic architecture to `01plan_handoff.md`.
|
||||||
|
* **Phase 2: The Builder.** (`enable_thinking=False` + GBNF Constraint). Emits zero-latency JSON tool payloads to execute components. Outputs to `02build_handoff.md`.
|
||||||
|
* **Phase 3: The Reviewer.** (`enable_thinking=True`). Audits compiled outputs against `00tasks.json` constraints. Appends results to `03review_audit.md`.
|
||||||
|
|
||||||
|
### Active Memory Pipeline (`MemoryVault`)
|
||||||
|
|
||||||
|
All workspace iterations are systematically vectorized in-process. To preserve Vulkan execution lanes for the primary 14B model, the embedding pipeline runs entirely on the host CPU.
|
||||||
|
|
||||||
|
* **Engine:** `sqlite-vec` + `nomic-embed-text-v1.5`.
|
||||||
|
* **Self-Healing Recovery:** If the Builder receives a non-zero exit code or the Reviewer logs a rejection block, the framework automatically extracts the error trace, executes a k-NN distance search against the `00memory_vault.db`, and spins up a fresh Builder instance pre-loaded with the top 3 historic fixes.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 3. Target Hardware Optimization
|
||||||
|
|
||||||
|
This framework operates safely within a **32GB Shared Memory Ceiling** operating at 89.6 GB/s bandwidth.
|
||||||
|
|
||||||
|
* **VRAM Allocation:** Up to ~9.5 GB allocated to the 14B Q4_K_M model matrix via Vulkan.
|
||||||
|
* **Context Cap Limit:** Context space is artificially restricted to `16384` (`--ctx-size 16384`) to prevent the KV Cache from exceeding ~4.4GB and causing out-of-memory kernel panics.
|
||||||
|
* **Unified Memory Safety:** Pre-allocates unified memory safely and prevents page table collisions by bypassing OS memory mapping (`--no-mmap`).
|
||||||
|
* **Physical Core Alignment:** Restricts processing to exactly `8` physical threads (`--threads 8`), eliminating SMT resource contention.
|
||||||
|
* **GPU Queue Splitting:** Batch processing is throttled (`--batch-size 128` and `--ubatch-size 64`) to prevent `amdgpu` driver timeouts (TDR).
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 4. Setup & Compilation
|
||||||
|
|
||||||
|
### Step 1: Install Python Dependencies
|
||||||
|
|
||||||
|
The orchestrator requires specific libraries for runtime type guarding and the local vector engine:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
pip install pydantic sentence-transformers sqlite-vec numpy
|
||||||
|
|
||||||
|
```
|
||||||
|
|
||||||
|
### Step 2: Compile the Vulkan Inference Engine
|
||||||
|
|
||||||
|
Compile `llama-server` locally using the Mesa Vulkan driver stack, utilizing Link-Time Optimization (`LTO`) and `jemalloc`:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
cd core/llama.cpp
|
||||||
|
cmake -B build -DGGML_VULKAN=1 -DLLAMA_LTO=ON -DLLAMA_JEMALLOC=ON -DCMAKE_BUILD_TYPE=Release -DCMAKE_C_FLAGS="-O3 -march=native -mtune=native" -DCMAKE_CXX_FLAGS="-O3 -march=native -mtune=native"
|
||||||
|
cmake --build build --config Release -j$(nproc)
|
||||||
|
|
||||||
|
```
|
||||||
|
|
||||||
|
### Step 3: ELF Dynamic Linking Fix (One-Time Execution)
|
||||||
|
|
||||||
|
To execute `llama-server` seamlessly from the root folder without dynamic linker errors, patch the binary to search its own directory for compiled shared objects:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
sudo dnf install patchelf
|
||||||
|
patchelf --set-rpath '$ORIGIN' ./core/llama.cpp/build/bin/llama-server
|
||||||
|
|
||||||
|
```
|
||||||
|
|
||||||
|
### Step 4: Execute the Pipeline
|
||||||
|
|
||||||
|
Run the unified start script from the root directory. This script initializes the Vulkan backend, performs memory health checks, boots the CPU-based `MemoryVault`, and triggers the asynchronous `smarterframework.py` orchestrator loop.
|
||||||
|
|
||||||
|
```bash
|
||||||
|
chmod +x start.sh
|
||||||
|
./start.sh
|
||||||
|
|
||||||
|
```
|
||||||
Reference in New Issue
Block a user