internlm
/

Intern-S1-mini-GGUF

Image-Text-to-Text

Model card Files Files and versions

ocr能力很差，但是官方的demo是没问题

#2

by pypry - opened Aug 30, 2025

你们自己通过llama.cpp测过吗

我用的精度是f16的，llama sever的参数和系统提示词都是参考model card的设定

Intern Large Models org Sep 3, 2025

@pypry Hi, pls. provide sample code and data to reproduce.

@unsubscribe
随便给个中文文档的截图，提示词是“识别并输出图中文字”。模型会胡乱输出。但是给一个不包含文字的图片，它能正确描述图片内容。

还有个问题是不管我上传的图片多大，llama sever 都会对图片进行缩放，这样对于大图的识别肯定会有问题。作为对比，我使用minicpm 4.5模型，llama sever会对大图进行分片

我的测试方式是使用llama server跑模型，通过open ai api去访问

不知道这是模型问题，还是推理引擎问题，有没有尝试用同样方式测试其它视觉模型是否会出现类似问题？

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment