[Hacker News Repost] Show HN: I've built a locally running Perplexity clone
-
Title: Show HN: I've built a locally running Perplexity clone
Text: The video demo runs a 7B model on a normal gaming GPU. I think it already works quite well (accounting for the limited hardware power). :)
Url: https://github.com/nilsherzig/LLocalSearch
Post by: nilsherzig
Comments:
nilsherzig: Happy to answer any questions and open for suggestions :)

It's basically an LLM with access to a search engine and the ability to query a vector DB.

The top n results from each search query (initiated by the LLM) are scraped, split into little chunks, and saved to the vector DB. The LLM can then query this vector DB to get the relevant chunks. This obviously isn't as comprehensive as having a 128k-context LLM just summarize everything, but at least on local hardware it's a lot faster and way more resource friendly.

The demo on GitHub runs on a normal consumer GPU (AMD RX 6700 XT) with 12 GB of VRAM.
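The pipeline described above (search, scrape the top-n results, chunk, embed into a vector DB, retrieve only the relevant chunks) is easy to see in miniature. Below is a minimal, self-contained Python sketch of that flow. The `search()` results and the bag-of-words `embed()` are toy stand-ins so the example runs offline; the real project would use an actual search engine and embedding model, so treat this as an illustration of the shape, not LLocalSearch's implementation.

```python
# Sketch of the flow nilsherzig describes: top-n search results are
# chunked and embedded into a vector store, and answers are built from
# the most relevant retrieved chunks rather than one huge context.
import math
import re
from collections import Counter

CHUNK_SIZE = 200  # characters per chunk; real systems chunk by tokens

def search(query: str) -> list[str]:
    """Placeholder for the search-engine + scraping step (top-n pages)."""
    return [
        "LLocalSearch is a locally running search aggregator that uses "
        "LLM agents to answer questions with sources.",
        "A 7B parameter model can run on a consumer GPU with 12 GB of "
        "VRAM, which keeps the whole pipeline local.",
    ]

def chunk(text: str, size: int = CHUNK_SIZE) -> list[str]:
    """Split scraped text into little chunks."""
    return [text[i : i + size] for i in range(0, len(text), size)]

def embed(text: str) -> Counter:
    """Toy embedding: a bag of lowercased words. Stand-in for a real model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# The "vector db": (embedding, chunk) pairs built from the search results.
store: list[tuple[Counter, str]] = []
for page in search("what is llocalsearch"):
    for c in chunk(page):
        store.append((embed(c), c))

def query(question: str, k: int = 2) -> list[str]:
    """Retrieve the top-k chunks; this is what the LLM sees, which is
    what keeps the approach fast and light on local hardware."""
    qv = embed(question)
    ranked = sorted(store, key=lambda pair: cosine(qv, pair[0]), reverse=True)
    return [c for _, c in ranked[:k]]

print(query("how much VRAM does a 7B model need?"))
```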
xydac: This is cool, haven't run this yet but it seems really promising. I'm thinking about how super useful this could be hooked up to internal corporate search engines to get answers from them.

Good to see more of these non-API-key products being built (connected to local LLMs)
keyle: Impressive, I don't think I've seen a local model call upon specialised modules yet (although I can't keep up with everything going on).

I too use local 7b open-hermes and it's really good.
fnetisma: This is really neat! I have questions:

"Needs tool usage" and "found the answer" blocks in your infra, how are these decisions made?

Looking at the demo, it takes a little time to return results; of the search, vector storage, and vector DB retrieval steps, which takes the most time?
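The thread doesn't answer how the "needs tool usage" / "found the answer" branch is decided, but in typical LLM agent loops the model itself makes the call: each turn it must emit either a tool invocation or a final answer, and the surrounding loop parses which one it got. Here is a minimal sketch under that assumption; the `fake_llm` stub, the tool name, and the `ACTION:`/`FINAL:` protocol are all illustrative, not LLocalSearch's actual format.

```python
# Minimal agent loop: the model's own output routes between the
# "needs tool usage" branch (a parsed tool call) and the
# "found the answer" branch (a final answer).
from typing import Callable

def fake_llm(prompt: str) -> str:
    """Stand-in for the local 7B model; real code would call a local
    inference server instead."""
    if "Observation:" not in prompt:
        return "ACTION: web_search | top laptops 2024"
    return "FINAL: Based on the search results, ..."

TOOLS: dict[str, Callable[[str], str]] = {
    "web_search": lambda q: f"(scraped top-n results for {q!r})",
}

def run_agent(question: str, max_turns: int = 5) -> str:
    prompt = f"Question: {question}"
    for _ in range(max_turns):
        reply = fake_llm(prompt)
        if reply.startswith("FINAL:"):  # "found the answer" branch
            return reply.removeprefix("FINAL:").strip()
        # "needs tool usage" branch: parse the tool call, run it, and
        # feed the observation back into the next turn's prompt.
        name, _, arg = reply.removeprefix("ACTION:").partition("|")
        observation = TOOLS[name.strip()](arg.strip())
        prompt += f"\nObservation: {observation}"
    return "gave up after too many tool calls"

print(run_agent("what are good laptops?"))
```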
pants2: It says it's a "locally running search engine" - but not sure how it finds the sites and pages to index in the first place?