[Hacker News repost] Show HN: A tool to analyze Hacker News sentiment on any term in seconds
Title: Show HN: A tool to analyze Hacker News sentiment on any term in seconds
Text: Hi everyone, we developed a tool that can easily tell you the overall sentiment of a message based on a word. For now it's Hacker News only, but we think this thing has potential.

Whether you're a startup, solopreneur or product manager, you can track trends with it. We are also planning to add predictive tools and real-time analysis. Operationally this tool is a lot cheaper than Sprout Social or other similar solutions on the market.

No sign-up required. Just type and see results.

I'd love your feedback on the tool's usefulness and any ideas for improvement.
Url: https://classysoftware.io/chat-analysis/
## Post by: lorddustingale

### Comments:

**wongarsu**: If anything, this tool tracks with my general opinion on sentiment analysis: it would be awesome if it actually worked, but most algorithms just predict everything as neutral.

For example, if you search for bitwarden it ranks three comments as negative, all others as neutral. If I as a human look at actual comments about bitwarden [1] there are lots of comments about people using it and recommending it. As a human I would rate the sentiment as very positive, with some "negative" comments in between (that are really about specific situations where it's the wrong tool).

I've had some success using LLMs for sentiment analysis. An LLM can understand context and determine that in the given context "Bitwarden is the answer" is a glowing recommendation, not a neutral statement. But doing sentiment analysis that way eats a lot of resources, so I can't fault this tool for going with the more established approach that is incapable of making that leap.

[1]: https://hn.algolia.com/?dateRange=pastMonth&page=0&prefix=true&query=bitwarden&sort=byDate&type=comment

**codetrotter**: I think this is actually one of the very first times I have seen neumorphic design in the wild.

Prior to this I've mostly only seen it on dribbble.

I actually like this style a lot, and I wish more apps would use it. But at this point I thought that this style was one that "came and went" before it saw any significant actual use in any apps or OSes. Maybe there is still hope after all :)

Edit: oh and I had to try asking your tool for sentiment about neumorphic design after this of course. It returned my own comment lol :p and it called it "neutral". Is it only evaluating the first paragraph that the word appears in in the comment? (Also I guess other people more commonly refer to it as "neumorphism" than as "neumorphic design" and maybe that's why when I asked it for neumorphic design it returned my own comment.)

**caseyy**: NFTs: neutral with 98% confidence. Hmm…

Also, it seems like putting in the same phrase twice generates different graphs and results, at least sometimes. So it's difficult to use comparatively.

**solardev**: Hmm, it thinks HN is neutral on crypto. Hmmmm.

Is it actually doing anything?

**avodonosov**: Idea: single number metric, something like %-of-positive - %-of-negative averaged over a time period.

So that we could compare terms based on this result metric: google vs microsoft, rust vs go, rust vs microsoft, etc.

(Will not work for Go as it's a common word in addition to the programming language, but anyways)
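
The single-number metric avodonosov suggests (share of positive comments minus share of negative comments, averaged over a time period) could be sketched roughly as below. This is only an illustration, not how the tool works: the label strings `"positive"`, `"negative"` and `"neutral"`, the `(date, label)` input shape, the monthly bucketing, and the function names `net_sentiment` and `averaged_net_sentiment` are all assumptions, and the sample data is made up.

```python
from collections import Counter, defaultdict
from datetime import date


def net_sentiment(labels):
    """Return %-positive minus %-negative for one batch of labels (-100 to 100)."""
    if not labels:
        return 0.0
    counts = Counter(labels)
    return 100.0 * (counts["positive"] - counts["negative"]) / len(labels)


def averaged_net_sentiment(comments):
    """Average the per-month net sentiment over every month that has comments.

    `comments` is an iterable of (date, label) pairs. Monthly buckets are an
    assumption; any fixed period would work the same way.
    """
    by_month = defaultdict(list)
    for day, label in comments:
        by_month[(day.year, day.month)].append(label)
    if not by_month:
        return 0.0
    scores = [net_sentiment(labels) for labels in by_month.values()]
    return sum(scores) / len(scores)


# Made-up sample labels, for illustration only.
sample = [
    (date(2024, 5, 3), "positive"),
    (date(2024, 5, 9), "neutral"),
    (date(2024, 5, 20), "negative"),
    (date(2024, 5, 21), "positive"),
    (date(2024, 6, 2), "neutral"),
    (date(2024, 6, 14), "positive"),
]
print(averaged_net_sentiment(sample))  # 37.5 (May scores 25.0, June scores 50.0)
```

Computing one such score per search term would allow the side-by-side comparisons mentioned in the comment, although a classifier that labels nearly everything neutral, as other commenters report, would push every score toward zero.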