Title: Grok
Text:
Url: https://github.com/xai-org/grok
(Page text unavailable: the scraper's connection was dropped by the remote host, so the content at the URL could not be retrieved.)
Post by: pierre
Comments:
extheat: At 8x86B, looks like the largest open model yet by far. Would be interesting to hear how many tokens it's been trained on. Especially important for higher param models in order to efficiently utilize all those parameters.
ilaksh: Has anyone outside of x.ai actually done inference with this model yet? And if so, have they provided details of the hardware? What type of AWS instance or whatever?

I think you can rent like an 8 x A100 or 8 x H100 and it's "affordable" to play around with for at least a few minutes. But you would need to know exactly how to set up the GPU cluster.

Because I doubt it's as simple as just 'python run.py' to get it going.
nasir: I'd be very curious to see how it performs, especially on inputs that are blocked by other models. Seems like Grok will differentiate itself from other OS models from a censorship and alignment perspective.
simonw: "Base model trained on a large amount of text data, not fine-tuned for any particular task."

Presumably the version they've been previewing on Twitter is an instruction-tuned model which behaves quite differently from these raw weights.
nylonstrung: For what reason would you want to use this instead of open source alternatives like Mistral?