Issues: InternLM/lmdeploy
[Benchmark] benchmarks on different cuda architecture with mo...
#815 opened Dec 11, 2023 by lvhan028
[Bug] Same code works on A800 but gets stuck on A10 with MiniCPM-Llama3-V-2_5
#1938 opened Jul 6, 2024 by llmrainer
[Bug] unified_attention split-kv for prefill with a larger workspace causes a coredump
#1935 opened Jul 6, 2024 by snippetzero
[Bug] When debugging with VS Code, execution jumps straight to the model's returned result, so the intermediate step of feeding the processed inputs into the model to obtain outputs cannot be inspected
#1933 opened Jul 5, 2024 by AIFFFENG
[Bug] Encountered TCP error (port already in use) when deploying with PytorchEngine
awaiting response
#1925 opened Jul 5, 2024 by Desein-Yang
[Feature] Is there any plan to support InternLM-XComposer2.5 inference?
#1920 opened Jul 4, 2024 by Charles-Xie
[Bug] Failed to load InternVL-Chat-V1-5-Int8 quantized model. RuntimeError: Only Tensors of floating point and complex dtype can require gradients
#1907 opened Jul 3, 2024 by jiajie-yang
[Bug] Segmentation fault occurs and the machine running openEuler automatically reboots
#1905 opened Jul 3, 2024 by jiajie-yang
[Bug] Using the turbomind engine, prompting more than 10k tokens will result in garbage output.
#1896 opened Jul 2, 2024 by dafu-wu
[Bug] CUDA runtime error: an illegal memory access was encountered when 8-bit kv quant was enabled
#1895 opened Jul 1, 2024 by aabbccddwasd