LLM Model Testing

Posted on 2024-11-04, updated on 2025-01-15, filed under AI

Downloading models from Hugging Face

First, set the proxy in PowerShell. Variables set this way only take effect in the current session.

$env:HTTP_PROXY="http://username:password@xxxx.xxxx.xxxx.xxxx:3030"
$env:HTTPS_PROXY="http://username:password@xxxx.xxxx.xxxx.xxxx:3030"
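
If the proxy password contains characters such as `@` or `:`, the URL generally has to be percent-encoded before it goes into `HTTP_PROXY`. The following is a minimal sketch of one way to build such a URL in Python and set both variables for the current process only, similar in spirit to the session-only `$env:` assignment. The function names, host, and credentials are placeholders chosen for illustration, not part of the original setup.

```python
import os
from urllib.parse import quote


def build_proxy_url(user: str, password: str, host: str, port: int) -> str:
    """Return an http:// proxy URL with percent-encoded credentials."""
    # safe="" makes quote() also encode "@", ":" and "/" inside the credentials.
    return f"http://{quote(user, safe='')}:{quote(password, safe='')}@{host}:{port}"


def set_session_proxy(proxy_url: str) -> None:
    """Set the proxy for this process (and child processes it starts) only."""
    os.environ["HTTP_PROXY"] = proxy_url
    os.environ["HTTPS_PROXY"] = proxy_url


# Hypothetical values; replace them with your own proxy details.
example_url = build_proxy_url("username", "p@ss:word", "proxy.example.com", 3030)
```

Calling `set_session_proxy(example_url)` before starting a download from the same script keeps the setting scoped to that run.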

Download the model you want. For Llama models, log in first and make sure your account has been granted access.

huggingface-cli login
huggingface-cli download meta-llama/Llama-3.2-1B
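
The same download can also be scripted. The sketch below assumes the `huggingface_hub` Python package is installed and that `huggingface-cli login` has already stored a token. It uses `snapshot_download` with a `local_dir`, and the `models/<name>` layout is an assumption chosen to match the path used in the conversion step later on. The helper names are hypothetical.

```python
from pathlib import Path


def local_dir_for(repo_id: str, root: str = "models") -> Path:
    """Map a repo id such as 'org/name' to '<root>/name'."""
    return Path(root) / repo_id.split("/")[-1]


def download_model(repo_id: str, root: str = "models") -> Path:
    """Download all files of a Hugging Face repo into local_dir_for(repo_id)."""
    # Imported lazily so local_dir_for() works even without huggingface_hub.
    from huggingface_hub import snapshot_download

    target = local_dir_for(repo_id, root)
    # Gated repos (for example meta-llama/*) need a stored login token
    # and approved access on the model page.
    snapshot_download(repo_id=repo_id, local_dir=target)
    return target
```

For example, `download_model("meta-llama/Llama-3.2-1B")` would place the files under `models/Llama-3.2-1B`.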

llama.cpp

Set up the llama.cpp environment

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
conda create -n llama.cpp python=3.11
conda activate llama.cpp
pip install -r requirements.txt

Verify that the dependencies were installed correctly

python convert_hf_to_gguf.py
usage: convert_hf_to_gguf.py [-h] [--vocab-only] [--outfile OUTFILE] [--outtype {f32,f16,bf16,q8_0,tq1_0,tq2_0,auto}] [--bigendian] [--use-temp-file] [--no-lazy] [--model-name MODEL_NAME] [--verbose]
                             [--split-max-tensors SPLIT_MAX_TENSORS] [--split-max-size SPLIT_MAX_SIZE] [--dry-run] [--no-tensor-first-split] [--metadata METADATA]
                             model
convert_hf_to_gguf.py: error: the following arguments are required: model

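If you see the usage message above, the script and its dependencies load correctly. The error only says that no model path was given.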

Convert the model to GGUF format

python convert_hf_to_gguf.py models/Llama-3.2-1B/
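
The usage message shown earlier lists `--outfile` and `--outtype` options. As a rough sketch, the conversion command could be assembled in Python like this. The helper name and the output file name are assumptions for illustration. The set of allowed types is copied from that usage message and may differ in other llama.cpp versions.

```python
import sys

# Output types taken from the usage message printed by convert_hf_to_gguf.py.
ALLOWED_OUTTYPES = {"f32", "f16", "bf16", "q8_0", "tq1_0", "tq2_0", "auto"}


def build_convert_command(model_dir: str, outfile: str, outtype: str = "f16") -> list:
    """Return the argument list for running convert_hf_to_gguf.py."""
    if outtype not in ALLOWED_OUTTYPES:
        raise ValueError(f"unsupported outtype: {outtype}")
    return [
        sys.executable,           # the Python of the active environment
        "convert_hf_to_gguf.py",  # run from inside the llama.cpp checkout
        model_dir,
        "--outfile", outfile,
        "--outtype", outtype,
    ]


cmd = build_convert_command("models/Llama-3.2-1B/", "Llama-3.2-1B-F16.gguf")
```

Passing `cmd` to `subprocess.run(cmd, check=True)` from the llama.cpp directory would then run the conversion.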

LM Studio

Place the converted GGUF model under C:\Users\<username>\.cache\lm-studio\models\ and LM Studio will pick it up as a local model.

For example:

"C:\Users\<username>\.cache\lm-studio\models\meta-llama\Llama-3.2-1B\Llama-3.2-1B-F16.gguf"
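
The example path follows a `<publisher>\<model>\<file>.gguf` layout under the models directory. A small sketch of copying a converted file into that layout is shown below. The function names are hypothetical, and the default root simply mirrors the directory named above, which may differ between LM Studio versions.

```python
import shutil
from pathlib import Path


def default_models_root() -> Path:
    """Models directory as described in this post (an assumption, not an API)."""
    return Path.home() / ".cache" / "lm-studio" / "models"


def install_gguf(src: Path, publisher: str, model: str, models_root: Path) -> Path:
    """Copy a .gguf file to <models_root>/<publisher>/<model>/<file name>."""
    dest_dir = Path(models_root) / publisher / model
    dest_dir.mkdir(parents=True, exist_ok=True)  # create missing folders
    dest = dest_dir / Path(src).name
    shutil.copy2(src, dest)  # keeps the original file in place
    return dest
```

For example, `install_gguf(Path("Llama-3.2-1B-F16.gguf"), "meta-llama", "Llama-3.2-1B", default_models_root())` reproduces the path shown above.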

Ollama

Loading a GGUF model in Ollama

Modelfile

FROM ./Llama-3.2-1B-F16.gguf

Create the model from the Modelfile

ollama create llama3.2:1b -f .\Modelfile

Run the model

ollama run llama3.2:1b
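
The Modelfile and create steps above can also be scripted. The sketch below writes a one-line Modelfile next to the GGUF file and assembles the same `ollama create` command. The helper names are hypothetical, and the sketch only builds the command rather than running it.

```python
from pathlib import Path


def write_modelfile(directory: Path, gguf_name: str) -> Path:
    """Write a Modelfile whose FROM line points at a GGUF file in the same folder."""
    modelfile = Path(directory) / "Modelfile"
    modelfile.write_text(f"FROM ./{gguf_name}\n", encoding="utf-8")
    return modelfile


def ollama_create_command(tag: str, modelfile: Path) -> list:
    """Return the argument list equivalent to `ollama create <tag> -f <Modelfile>`."""
    return ["ollama", "create", tag, "-f", str(modelfile)]
```

With Ollama installed, the returned list can be passed to `subprocess.run(..., check=True)`, after which `ollama run llama3.2:1b` works as shown above.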