[Hacker News Repost] Grok
-
Title: Grok
Text:
From: https://news.ycombinator.com/item?id=39737281
Url: https://github.com/xai-org/grok
(Fetching the URL with the webscraper tool timed out, so the article text could not be retrieved.)
Post by: pierre
Comments:
extheat: At 8x86B, looks like the largest open model yet by far. Would be interesting to hear how many tokens it's been trained on. Especially important for higher param models in order to efficiently utilize all those parameters.
ilaksh: Has anyone outside of x.ai actually done inference with this model yet? And if so, have they provided details of the hardware? What type of AWS instance or whatever?

I think you can rent like an 8 x A100 or 8 x H100 and it's "affordable" to play around with for at least a few minutes. But you would need to know exactly how to set up the GPU cluster.

Because I doubt it's as simple as just 'python run.py' to get it going.
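As a rough sanity check on why a multi-GPU rig comes up at all, here is a back-of-envelope memory estimate. It is only a sketch: it takes the "8x86B" figure quoted in the comment above at face value (a naive upper bound, since MoE experts share the non-expert layers) and assumes bf16 weights on 80 GB A100/H100 cards.

```python
# Back-of-envelope estimate of GPU memory needed just to hold the weights.
experts = 8
params_per_expert = 86e9          # the "8x86B" figure from the comment above

total_params = experts * params_per_expert   # naive upper bound for an MoE model
bytes_per_param = 2                          # bf16

weights_gb = total_params * bytes_per_param / 1e9
gpus_needed = weights_gb / 80                # 80 GB per A100/H100 card

print(f"~{weights_gb:.0f} GB of weights -> at least {gpus_needed:.1f} x 80GB GPUs")
```

Even ignoring activations and KV cache, bf16 weights alone would overflow a single 8 x 80 GB node at this naive count, which is why people reach for quantization or larger clusters before anything like 'python run.py' can work.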
nasir: I'd be very curious to see how it performs, especially on inputs that are blocked by other models. Seems like Grok will differentiate itself from other OS models from a censorship and alignment perspective.
simonw: "Base model trained on a large amount of text data, not fine-tuned for any particular task."<p>Presumably the version they've been previewing on Twitter is an instruction-tuned model which behaves quite differently from these raw weights.
simonw: ";基于大量文本数据训练的基础模型,不针对任何特定任务进行微调"<p> 据推测,他们的版本;我在推特上预览了一个经过指令调整的模型,它的行为与这些原始权重截然不同。
nylonstrung: For what reason would you want to use this instead of open source alternatives like Mistral?