[Hacker News repost] Launch HN: Lumona (YC W24) – Product search based on Reddit and YouTube reviews
-
Title: Launch HN: Lumona (YC W24) – Product search based on Reddit and YouTube reviews
Text: Hey HN! We are Lumona (https://lumona.ai), a product search engine that recommends products based on what people on social media (Reddit and YouTube, for now) are saying about them.

Rather than going through SEO-filled Google results or adding site:reddit.com to your search, we explain what makes a good product, show you the best products, and back it up with Reddit and YouTube reviews about the product. We’re starting with skincare products (more on that below) and plan to expand from there.

Here’s a demo: https://www.youtube.com/watch?v=C4kKjW2YkZ4&lc=Ugzl94GP9SDBOvqA2Dd4AaABAg

We started off with skincare because, growing up, we struggled with acne but had no clue what skincare products could actually help us. Going down the rabbit hole of endlessly scrolling r/SkincareAddiction and watching countless hours of videos about cystic acne was not fun.

Lumona’s skincare search index was built by first scraping the internet for listings of skincare products, along with their ingredient lists, through a combination of SERP, Amazon’s API, and web page crawling. We then use a fine-tuned Mistral LLM to parse a large number of Reddit threads and YouTube transcripts and extract the opinions users expressed, along with the context in which they were expressed. These opinions are then matched with relevant products by another fine-tuned LLM, which looks at an opinion and any products that have a high cosine similarity to the opinion’s subject and decides whether the opinion is relevant to any of those products. Using a Mistral-7B fine-tune trained on GPT-4 outputs allowed us to parse hundreds of thousands of Reddit threads in a simple way with just hundreds of dollars of compute.

If your query relates to a specific situation (e.g. “cleansers for my son who has inflamed acne on his forehead”), we search semantically through the opinions of Redditors and YouTubers to retrieve the products recommended by those who have dealt with a similar situation. If your query relates to a specific product (e.g. “iunik centella gel”), we instead go through the product listings themselves to return the relevant products.

We also use an LLM to analyze your search query and tell you which ingredients or effects are preferable for your skin concern. For example, if you searched for “inflamed forehead acne”, properties like “Oil-Control” and “Azelaic Acid”, which are good for dealing with inflamed acne, would be explained to you, and results containing those properties would be boosted and tagged in our results. You can also try out searches like “korean cleansers under $20 with Cica” to filter for certain ingredients and price points.

While we think we’ve built a product search that would be pretty helpful for our teenage (and current!) selves, there are many improvements we’d like to make, such as getting opinions from TikTok and other social media platforms and making our opinion extraction process more robust for edge cases (e.g. by using OCR and video transcription tools). We’re also planning on allowing our users to upload their own reviews and content and on expanding our search across more products.

The long-term potential is to be a go-to product for anyone looking for what other people think about anything subjective (products, restaurants, B2B products, vacation planning, etc.). We believe that the entire discovery experience can be revolutionized by making it as easy as searching on Google to find out what the people you care about think about something. On the individual level, we want to make sharing your opinions with your friends and the world as easy as posting a picture on Instagram.

For now, if you have any skincare needs, whether it be to solve a skin concern, get rid of an annoying pimple, or just to find a good sunscreen, please give us a try: https://lumona.ai (We are an Amazon and Stylevana affiliate.)

We’d love to hear your feedback on our search engine, whether that be how the skincare search performs, what you think is missing, what products you want to see there, or any technical suggestions!
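
The opinion-to-product matching step described in the post can be illustrated with a minimal sketch. The post does not say which embedding model, similarity threshold, or candidate count Lumona uses, so the toy vectors, the `threshold` and `top_k` parameters, and the function names below are all assumptions for illustration. The sketch covers only the cosine-similarity shortlist; in the pipeline as described, a fine-tuned LLM would then decide whether the opinion is actually relevant to each shortlisted product.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def shortlist_products(subject_vec, product_vecs, threshold=0.8, top_k=5):
    """Return up to top_k (name, score) pairs scoring >= threshold, best first.

    threshold and top_k are hypothetical tuning knobs, not values from the post.
    """
    scored = [
        (name, cosine_similarity(subject_vec, vec))
        for name, vec in product_vecs.items()
    ]
    kept = [(name, score) for name, score in scored if score >= threshold]
    # Highest similarity first; ties broken by name for a stable order
    kept.sort(key=lambda pair: (-pair[1], pair[0]))
    return kept[:top_k]


# Toy 3-dimensional embeddings, made up for illustration only
products = {
    "centella gel": [0.9, 0.1, 0.0],
    "foaming cleanser": [0.1, 0.9, 0.1],
    "mineral sunscreen": [0.0, 0.1, 0.9],
}
opinion_subject = [0.8, 0.2, 0.0]  # stand-in for the embedded subject of one opinion

candidates = shortlist_products(opinion_subject, products)
for name, score in candidates:
    print(f"{name}: {score:.3f}")
```

Shortlisting by similarity first keeps the more expensive LLM relevance check limited to a handful of plausible products per opinion rather than the whole catalog.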
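
The boosting, tagging, and price filtering mentioned in the post could look roughly like the sketch below. It assumes the preferred properties and the price cap have already been extracted from the query (the post says an LLM does this) and that each product carries a relevance score from the opinion search. The `Product` fields, the additive `boost` value, and the sample catalog are hypothetical; the post does not describe the actual scoring formula.

```python
from dataclasses import dataclass


@dataclass
class Product:
    name: str
    price: float
    properties: set      # ingredient/effect tags, e.g. {"Oil-Control"}
    base_score: float    # hypothetical relevance score from the opinion search


def rank_products(products, preferred, max_price=None, boost=0.2):
    """Filter by price, boost products carrying preferred properties, tag matches."""
    results = []
    for product in products:
        if max_price is not None and product.price > max_price:
            continue  # hard filter, as in "under $20"
        tags = sorted(product.properties & preferred)
        score = product.base_score + boost * len(tags)
        results.append((product.name, score, tags))
    # Highest score first; ties broken by name so the order is reproducible
    results.sort(key=lambda r: (-r[1], r[0]))
    return results


# Made-up catalog entries for illustration only
catalog = [
    Product("gel cleanser", 14.0, {"Oil-Control", "Cica"}, 0.6),
    Product("cream cleanser", 18.0, {"Hydrating"}, 0.7),
    Product("luxury cleanser", 45.0, {"Oil-Control", "Azelaic Acid"}, 0.9),
]
preferred = {"Oil-Control", "Azelaic Acid"}

ranked = rank_products(catalog, preferred, max_price=20)
for name, score, tags in ranked:
    print(name, round(score, 2), tags)
```

Here the price cap acts as a hard filter while preferred properties only adjust the ordering, so a product without the tagged ingredients can still appear lower in the list.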
Url:
Post by: philena
Comments:
huevosabio: This is so cool. I already do this in a very ad-hoc way. Will definitely try it!

My only concern is that once Reddit reviews get used at scale for product discovery, we will see an inflow of fake and paid reviews in the comments. This will further pollute Reddit and probably drive discussions to forums closed from the public eye, e.g. Discord.

Obviously, this is not your fault at all, it's just the market dynamics at hand.

Anyway, let me try it!
sovnwnt: Strange that you chose acne as your demo topic, yet none of your results mention one of the most powerful treatments, if not the most powerful: Tretinoin/Retinol, which comes up in the first search results on Google.

The problem is that some of the best skincare is not available over the counter, and surfacing prescription treatments dips into medical care, which is a whole other can of worms.

In the end, you are missing valuable treatments while presenting a summary of poorly researched (by Reddit users) or anecdotal information.

I love the concept though and would love to see it catch on!
barbazoo: I just want to know what corded stick vacuum to buy. Where can I access something that a human has written? It’s become impossible for me. I’m on Kagi, I wonder if Google or Bing are better at this.
QAComet: This is a neat product, and I plan on trying out some of the recommendations for sunscreen.

During my journey using the app there were a few things I noticed:

1) It seems like the intermediate page is generating text from the LLM as well, which makes the whole process quite slow on my machine. It took maybe 10 seconds before the loader finished displaying the text. If I try and perform the same query again on the same browser, the results are somewhat quicker, maybe 700-800ms of wait time, but this still seems too slow. Once I ran the query five or so times, it was as quick as the demo queries on the front page.

2) Consistent results: If I use the same query on separate browsers, I'm given different products as the "Top Recommended Product", which seems odd. I know LLMs are stochastic, but the feed starting with the "Top Recommended Product" probably shouldn't have stochasticity. This problem opens up some interesting ML cans of worms, but I believe these issues could be overcome.

3) Another issue was that if I wanted to scroll in the left column while the right column was still loading, the scrolling was very janky. This was an issue on Firefox, and it took quite a long time for the app to be functional (> 10s).

4) Perhaps you could move the search bar and the logo to the top, so the logo is in the top left corner and the search bar takes the space to the right of it. This way there aren't overlapping elements. I'm sure there are some annoying edge cases there which would frustrate users.

5) For negative ingredients (and maybe any of the ingredients) it would be nice if you kept track of an ingredient database with references. I want to know why some ingredient is bad for my skin, and what I could expect.

6) If a product has many distributors, my first thought was that the arrow scrolling through products was a slider for the distributor list. I wonder if there's a nice way to differentiate the arrow further, so its functionality is more apparent.

Anyway, this is an excellent proof of concept, and I'm excited to see how this product develops.
ilrwbwrkhv: This is great. You can then start seeding products which give you a high cut and then proclaim them as the "best". Basically what Wired and all do now but without the whole article bit and you can claim "knowledge of the public".