[Hacker News repost] YouTube now requires to label their realistic-looking videos made using AI

Title: YouTube now requires to label their realistic-looking videos made using AI

Text:

From: https://news.ycombinator.com/item?id=39746468

Url: https://blog.google/intl/en-in/products/platforms/how-were-helping-creators-disclose-altered-or-synthetic-content/

Title: How we're helping creators disclose altered or synthetic content
Author: The YouTube Team
Published: not provided
Top image link: not provided
Text:

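Generative AI is transforming the ways creators express themselves, from storyboarding ideas to experimenting with tools that enhance the creative process. But viewers increasingly want more transparency about whether the content they're seeing is altered or synthetic. That's why today we're introducing a new tool in Creator Studio that requires creators to disclose to viewers when realistic content (content a viewer could easily mistake for a real person, place, or event) is made with altered or synthetic media, including generative AI.

As we announced in November, these disclosures will appear as labels in the expanded description or on the front of the video player. We won't require creators to disclose content that is clearly unrealistic, animated, includes special effects, or has used generative AI for production assistance. The new label is meant to strengthen transparency with viewers and build trust between creators and their audience. Some examples of content that require disclosure include:
Using the likeness of a realistic person: Digitally altering content to replace the face of one individual with another's, or synthetically generating a person's voice to narrate a video.
Altering footage of real events or places: Such as making it appear as if a real building caught fire, or altering a real cityscape to make it appear different than in reality.
Generating realistic scenes: Showing a realistic depiction of fictional major events, like a tornado moving toward a real town.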

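Of course, we recognize that creators use generative AI in a variety of ways throughout the creation process. We won't require creators to disclose if generative AI was used for productivity, like generating scripts, content ideas, or automatic captions. We also won't require creators to disclose when synthetic media is unrealistic and/or the changes are inconsequential. These cases include:
Clearly unrealistic content, such as animation or someone riding a unicorn through a fantastical world
Color adjustment or lighting filters
Special effects like background blur or vintage effects
Beauty filters or other visual enhancements
You can see a longer list of examples in our Help Center. For most videos, a label will appear in the expanded description, but for videos that touch on more sensitive topics, such as health, news, elections, or finance, we'll show a more prominent label on the video itself.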

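You'll start to see these labels across all YouTube surfaces and formats, beginning with the YouTube app on your phone and soon on your desktop and TV. While we want to give our community time to adjust to the new process and features, in the future we'll look at enforcement measures for creators who consistently choose not to disclose this information. In some cases, YouTube may add a label even when a creator hasn't disclosed it, especially if the altered or synthetic content has the potential to confuse or mislead people.

Importantly, we continue to collaborate across the industry to help improve the traceability of digital content. This includes our work as a steering member of the Coalition for Content Provenance and Authenticity (C2PA).

In parallel, as we previously announced, we're continuing to work on an updated privacy process so that people can request the removal of AI-generated or other synthetic or altered content that simulates an identifiable individual, including their face or voice. We'll share more soon on how we'll be introducing the process globally.

Creators are the heart of YouTube, and they'll continue to play an important role in helping their audience understand, embrace, and adapt to the world of generative AI. This will be an ever-evolving process, and we at YouTube will continue to improve as we learn. We hope that this increased transparency will help all of us better appreciate the ways AI continues to empower human creativity.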
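
The disclosure rules in the post above can be read as a small decision procedure: a label is required only when content is both realistic and meaningfully altered or synthetic, and the label moves onto the video itself for sensitive topics. The Python below is only an illustrative sketch of that reading. The `Video` fields, the `SENSITIVE_TOPICS` set, and both function names are hypothetical and are not part of any YouTube API; real policy decisions involve human judgment that no boolean flag captures.

```python
from dataclasses import dataclass
from typing import Optional

# Topics the post names as warranting a more prominent label (assumed spelling).
SENSITIVE_TOPICS = {"health", "news", "elections", "finance"}


@dataclass
class Video:
    # Hypothetical flags; in practice these are judgment calls by the creator.
    realistic: bool               # could be mistaken for a real person, place, or event
    altered_or_synthetic: bool    # meaningfully altered or synthetically generated
    production_aid_only: bool = False  # AI used only for scripts, ideas, captions
    topic: str = "other"


def needs_disclosure(video: Video) -> bool:
    """Return True when the post's rules would call for a disclosure label."""
    if video.production_aid_only:
        return False
    return video.realistic and video.altered_or_synthetic


def label_placement(video: Video) -> Optional[str]:
    """Return where the label would appear, or None if no label is needed."""
    if not needs_disclosure(video):
        return None
    if video.topic in SENSITIVE_TOPICS:
        return "video_player"
    return "expanded_description"


if __name__ == "__main__":
    print(label_placement(Video(realistic=True, altered_or_synthetic=True, topic="news")))
    print(label_placement(Video(realistic=False, altered_or_synthetic=True)))
```

Keeping `needs_disclosure` separate from `label_placement` mirrors the post, which first states when disclosure is required and only then says where the label is shown.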

Post by: marban

Comments:

extheat: At 8x86B, looks like the largest open model yet by far. Would be interesting to hear how many tokens it's been trained on. Especially important for higher param models in order to efficiently utilize all those parameters.

ilaksh: Has anyone outside of x.ai actually done inference with this model yet? And if so, have they provided details of the hardware? What type of AWS instance or whatever? I think you can rent like an 8 x A100 or 8 x H100 and it's "affordable" to play around with for at least a few minutes. But you would need to know exactly how to set up the GPU cluster. Because I doubt it's as simple as just 'python run.py' to get it going.

nasir: I'd be very curious to see how it performs, especially on inputs that are blocked by other models. Seems like Grok will differentiate itself from other OS models from a censorship and alignment perspective.

simonw: "Base model trained on a large amount of text data, not fine-tuned for any particular task." Presumably the version they've been previewing on Twitter is an instruction-tuned model which behaves quite differently from these raw weights.

nylonstrung: For what reason would you want to use this instead of open source alternatives like Mistral?

jjcm: I think it's smart to start trying things here. This has infinite flaws with it, but from a business and learnings standpoint it's a step toward the right direction. Over time we're going to both learn and decide what is and isn't important to designate as "AI" - Google's approach here at least breaks this into rules of what "AI" things are important to label:

• Makes a real person appear to say or do something they didn't say or do
• Alters footage of a real event or place
• Generates a realistic-looking scene that didn't actually occur

At the very least this will test each of these hypotheses, which we'll learn from and iterate on. I am curious to see the legal arguments that will inevitably kick up from each of these - is color correction altering footage of a real event or place? They explicitly say it isn't in the wider description, but what about beauty filters? If I have 16 video angles, and use photogrammetry / gaussian splatting / AI to generate a 17th, is that a realistic-looking scene that didn't actually occur? Do I need to have actually captured the photons themselves if I can be 99% sure my predictions of them are accurate?

So many flaws, but all early steps have flaws. At least it is a step.

summerlight: Looks like there is a huge gray area that they need to figure out in practice. From https://support.google.com/youtube/answer/14328491#:

Examples of content creators don’t have to disclose:

* Someone riding a unicorn through a fantastical world
* Green screen used to depict someone floating in space
* Color adjustment or lighting filters
* Special effects filters, like adding background blur or vintage effects
* Production assistance, like using generative AI tools to create or improve a video outline, script, thumbnail, title, or infographic
* Caption creation
* Video sharpening, upscaling or repair and voice or audio repair
* Idea generation

Examples of content creators need to disclose:

* Synthetically generating music (including music generated using Creator Music)
* Voice cloning someone else’s voice to use it for voiceover
* Synthetically generating extra footage of a real place, like a video of a surfer in Maui for a promotional travel video
* Synthetically generating a realistic video of a match between two real professional tennis players
* Making it appear as if someone gave advice that they did not actually give
* Digitally altering audio to make it sound as if a popular singer missed a note in their live performance
* Showing a realistic depiction of a tornado or other weather events moving toward a real city that didn’t actually happen
* Making it appear as if hospital workers turned away sick or wounded patients
* Depicting a public figure stealing something they did not steal, or admitting to stealing something when they did not make that admission
* Making it look like a real person has been arrested or imprisoned

the_duke: They don't bother to mention it, but this is actually to comply with the new EU AI Act.

> Providers will also have to ensure that AI-generated content is identifiable. Besides, AI-generated text published with the purpose to inform the public on matters of public interest must be labelled as artificially generated. This also applies to audio and video content constituting deep fakes

https://digital-strategy.ec.europa.eu/en/policies/regulatory-framework-ai

Some discussion here: https://news.ycombinator.com/item?id=39746669

yoavz: Most interesting example to me: "Digitally altering audio to make it sound as if a popular singer missed a note in their live performance". This seems oddly specific to the inverse of what happened recently with Alicia Keys from the recent Superbowl. As Robert Komaniecki pointed out on X [1], Alicia Keys hit a "sour note" which was silently edited by the NFL to fix it.

[1] https://twitter.com/Komaniecki_R/status/1757074365102084464

sigmoid10: > Some examples of content that require disclosure include: [...] Generating realistic scenes: Showing a realistic depiction of fictional major events, like a tornado moving toward a real town.

This sounds like every thumbnail on YouTube these days. It's good that this is not limited to AI, but it also means this will be a nightmare to police.