[Hacker News Repost] Cerebras Trains Llama Models to Leap over GPUs
-
Title: Cerebras Trains Llama Models to Leap over GPUs
Text:
Url: https://www.nextplatform.com/2024/10/25/cerebras-trains-llama-models-to-leap-over-gpus/
(The following summary was generated from the title and URL alone, without access to the article text, so it is speculative.)

Summary: Cerebras Systems, a company focused on large-scale computing, recently announced a new approach to training large language models (LLMs) such as the Llama models. The approach aims to outperform conventional GPU accelerators by using the company's proprietary Wafer Scale Engine (WSE), a wafer-scale compute platform that integrates trillions of transistors on a single piece of silicon, delivering unprecedented compute capability. By leveraging this technology, Cerebras hopes to offer higher performance and efficiency, which is critical for training and running complex machine learning models. The move marks an important step forward in AI computing and could have far-reaching implications for future data centers and AI research.
Post by: rbanffy
Comments:
latchkey:
    1x MI300x has 192GB HBM3.
    1x MI325x has 256GB HBM3e.
They cost less, you can fit more into a rack, and you can buy/deploy at least the 300s today and the 325s early next year. AMD hardware and library software performance for AI is improving daily [0].
I'm still trying to wrap my head around how these companies think they are going to do well in this market without more memory.
[0] https://blog.vllm.ai/2024/10/23/vllm-serving-amd.html
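latchkey's point about memory capacity can be made concrete with a rough back-of-the-envelope calculation (a sketch; the 70B model size and 2 bytes/param FP16 figure are illustrative assumptions, not from the comment, and the count ignores KV cache, activations, and runtime overhead):

```python
import math

# Rough estimate: how many accelerators are needed just to hold the
# model weights in HBM (ignores KV cache, activations, and overhead).
def min_devices_for_weights(params_billion: float, bytes_per_param: float,
                            hbm_gb: float) -> int:
    weights_gb = params_billion * bytes_per_param  # 1e9 params x bytes -> GB
    return math.ceil(weights_gb / hbm_gb)

# A 70B-parameter model in FP16 (~2 bytes/param) needs ~140 GB for weights:
print(min_devices_for_weights(70, 2, 192))  # 192 GB MI300x -> 1 device
print(min_devices_for_weights(70, 2, 80))   # 80 GB H100 -> 2 devices
```

Under these assumptions a single 192 GB MI300x holds the whole model where an 80 GB Hopper needs at least two, which is the economics the comment is gesturing at.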
asdf1145: clickbait title: inference is not training
asdf1145: did they release MLPerf data yet, or would it not help their IPO?
7e: "It would be interesting to see what the delta in accuracy is for these benchmarks."
^ the entirety of it
7e: "So, the delta in price/performance between Cerebras and the Hoppers in the cloud when buying iron is 2.75X but for renting iron it is 5.2X, which seems to imply that Cerebras is taking a pretty big haircut when it rents out capacity. That kind of delta between renting out capacity and selling it is not a business model, it is a loss leader from a startup trying to make a point."
As always, it is about TCO, not who can make the biggest monster chip.
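The arithmetic behind the passage 7e quotes can be sketched as follows (the 2.75X and 5.2X ratios come from the quote; reading their ratio as an implied rental discount is an illustrative interpretation, not a figure from the article):

```python
# Price/performance gap vs. Hopper, per the figures quoted from the article.
buy_delta = 2.75   # Cerebras advantage when purchasing hardware outright
rent_delta = 5.2   # Cerebras advantage when renting cloud capacity

# If rented capacity looks ~5.2x better while the hardware itself is only
# ~2.75x better, Cerebras is effectively pricing rentals at a fraction of
# what the hardware economics would imply:
implied_discount = buy_delta / rent_delta
print(f"{implied_discount:.2f}")  # ~0.53 -> renting at roughly half the implied rate
```

That roughly-53% figure is one way to quantify the "haircut" the quoted passage describes.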