Monday, May 6, 2024

How to Run Exl2 LLMs Locally for Fast Speed

This video shows how to install exllamav2 and run any model in exl2 format locally.


pip install huggingface_hub

huggingface-cli login

mkdir llama38b

cd llama38b

huggingface-cli download hjhj3168/Llama-3-8b-Orthogonalized-exl2 --local-dir . --local-dir-use-symlinks False

cd ..
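
Before moving on, it can help to confirm that the download actually landed where the later inference step expects it. The snippet below is only a minimal sketch: check_model_dir is a hypothetical helper (it is not part of huggingface-cli or exllamav2), and it assumes an exl2 model folder contains a config.json plus at least one .safetensors weights file.

```shell
# Hypothetical helper, not part of huggingface-cli or exllamav2.
# Assumption: an exl2 model folder holds config.json and one or more
# .safetensors files at its top level.
check_model_dir() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "missing directory: $dir" >&2
        return 1
    fi
    if [ ! -f "$dir/config.json" ]; then
        echo "no config.json in $dir" >&2
        return 1
    fi
    # Succeed as soon as one weights file is found directly inside the folder.
    for f in "$dir"/*.safetensors; do
        if [ -f "$f" ]; then
            echo "ok: $dir"
            return 0
        fi
    done
    echo "no .safetensors files in $dir" >&2
    return 1
}
```

For example, running check_model_dir /home/ubuntu/llama38b should print an ok line if the files are in place, and an error on stderr otherwise. A common failure is the model ending up one folder deeper than expected, which this check would report as a missing config.json.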

git clone

cd exllamav2

conda create -n exl2 python=3.11

conda activate exl2

pip install -r requirements.txt

pip install .

python test_inference.py -m /home/ubuntu/llama38b/ -p "To travel without ticket in train,"

