[Hacker News Repost] Cerebras Trains Llama Models to Leap over GPUs
-
Title: Cerebras Trains Llama Models to Leap over GPUs
Text:
Url: https://www.nextplatform.com/2024/10/25/cerebras-trains-llama-models-to-leap-over-gpus/
(The following summary was generated from the title and URL alone, without access to the article text, so it is speculative.)

Summary: Cerebras Systems, a company focused on large-scale computing, recently announced a new approach to training large language models (LLMs) such as the Llama models. The approach aims to outperform conventional GPU accelerators by using the company's proprietary Wafer Scale Engine (WSE), a wafer-scale compute platform that integrates trillions of transistors on a single piece of silicon, delivering unprecedented compute capability. By leveraging this technology, Cerebras hopes to offer higher performance and efficiency, which is critical for training and running complex machine learning models. The move marks an important step forward in AI computing and could have far-reaching implications for future data centers and AI research.
Post by: rbanffy
Comments:
latchkey:
    1x MI300x has 192GB HBM3.
    1x MI325x has 256GB HBM3e.
They cost less, you can fit more into a rack, and you can buy/deploy at least the 300s today and the 325s early next year. AMD hardware and library software performance for AI is improving daily [0].
I'm still trying to wrap my head around how these companies think they are going to do well in this market without more memory.
[0] https://blog.vllm.ai/2024/10/23/vllm-serving-amd.html
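latchkey's point about memory capacity can be made concrete with a rough back-of-the-envelope calculation (a sketch; the 70B model size and 2 bytes/param FP16 figure are illustrative assumptions, not from the comment, and the count ignores KV cache, activations, and runtime overhead):

```python
import math

# Rough estimate: how many accelerators are needed just to hold the
# model weights in HBM (ignores KV cache, activations, and overhead).
def min_devices_for_weights(params_billion: float, bytes_per_param: float,
                            hbm_gb: float) -> int:
    weights_gb = params_billion * bytes_per_param  # 1e9 params x bytes -> GB
    return math.ceil(weights_gb / hbm_gb)

# A 70B-parameter model in FP16 (~2 bytes/param) needs ~140 GB for weights:
print(min_devices_for_weights(70, 2, 192))  # 192 GB MI300x -> 1 device
print(min_devices_for_weights(70, 2, 80))   # 80 GB H100 -> 2 devices
```

Under these assumptions a single 192 GB MI300x holds the whole model where an 80 GB Hopper needs at least two, which is the economics the comment is gesturing at.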
asdf1145: clickbait title: inference is not training
asdf1145: did they release MLPerf data yet, or would it not help their IPO?
7e: "It would be interesting to see what the delta in accuracy is for these benchmarks."
^ the entirety of it
7e: "So, the delta in price/performance between Cerebras and the Hoppers in the cloud when buying iron is 2.75X but for renting iron it is 5.2X, which seems to imply that Cerebras is taking a pretty big haircut when it rents out capacity. That kind of delta between renting out capacity and selling it is not a business model, it is a loss leader from a startup trying to make a point."
As always, it is about TCO, not who can make the biggest monster chip.
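The arithmetic behind the passage 7e quotes can be sketched as follows (the 2.75X and 5.2X ratios come from the quote; reading their ratio as an implied rental discount is an illustrative interpretation, not a figure from the article):

```python
# Price/performance gap vs. Hopper, per the figures quoted from the article.
buy_delta = 2.75   # Cerebras advantage when purchasing hardware outright
rent_delta = 5.2   # Cerebras advantage when renting cloud capacity

# If rented capacity looks ~5.2x better while the hardware itself is only
# ~2.75x better, Cerebras is effectively pricing rentals at a fraction of
# what the hardware economics would imply:
implied_discount = buy_delta / rent_delta
print(f"{implied_discount:.2f}")  # ~0.53 -> renting at roughly half the implied rate
```

That roughly-53% figure is one way to quantify the "haircut" the quoted passage describes.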