Add image features. Add repeat detection to save tokens #88

ccetaw · 2023-03-22T15:00:06Z

I've implemented the following features:

Images that appear before the experiments/results section (Experiments, Evaluation, Results, ...) are now saved locally and appended to the end of the generated summary markdown file.
Non-exact keyword matching is supported; use --coarse to activate it.
The program now skips downloading and summarizing a paper whose name matches one already downloaded, which saves tokens; use --repeat to force downloading and summarizing again.
Generated .md files with the same name are now always overwritten.

The changes are concentrated in get_image_path(self, image_path='') in get_paper_from_pdf.py and in download_pdf() in chat_paper.py.
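
For readers who want a feel for the two flags, here is a minimal sketch of how the --coarse and --repeat checks could work. It is not the code from this PR: matches_keyword, safe_name, and needs_summary are hypothetical helpers, the file-naming rule is invented for illustration, and I am assuming "coarse" means that any single word of the keyword phrase may match, which may differ from the actual behavior.

```python
import re
from pathlib import Path


def matches_keyword(text: str, keyword: str, coarse: bool = False) -> bool:
    """Exact mode requires the whole keyword phrase; coarse mode (assumed
    semantics) accepts a match on any single word of the phrase."""
    text_lower = text.lower()
    if not coarse:
        return keyword.lower() in text_lower
    words = re.findall(r"\w+", keyword.lower())
    return any(word in text_lower for word in words)


def safe_name(title: str) -> str:
    """Hypothetical naming rule: collapse characters unsafe in file names."""
    return re.sub(r"[^\w\-]+", "_", title).strip("_")


def needs_summary(title: str, out_dir: Path, repeat: bool = False) -> bool:
    """Return False when a PDF with the same name already exists in out_dir,
    unless repeat is set (mirroring the idea behind --repeat)."""
    if repeat:
        return True
    return not (out_dir / f"{safe_name(title)}.pdf").exists()
```

With a check like this, a paper is only sent for summarization when needs_summary returns True, so rerunning the same query does not spend tokens on papers that are already on disk.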

… Could do coarse search. Avoid repeat summarizing papers to save tokens. Always overwrite existing files.

ccetaw · 2023-03-22T16:07:42Z

Known bugs:

A PDF may contain images that are not displayed, or a single figure may be composed of multiple image layers, so the extracted images can be very messy.
The generated .md files may contain many extra indents, for reasons not yet known.

thusjr · 2023-03-26T05:48:30Z

Keep it up!

Could now save images on local device and display the important ones.…

ffe2845

… Could do coarse search. Avoid repeat summarizing papers to save tokens. Always overwrite existing files.
