LlamaEdge is a lightweight runtime for running Large Language Models (LLMs) and other AI models, such as YOLO, Whisper, and Stable Diffusion, locally and on edge devices.
Characteristics:
- Size: the runtime is tens of megabytes, compared with gigabytes for a traditional framework such as PyTorch.
- Backend: built on WasmEdge, a CNCF project hosted by the Linux Foundation, and optimized for portability and performance across different GPUs and NPUs.
