[Hacker News repost] VideoGigaGAN: Towards detail-rich video super-resolution
Title: VideoGigaGAN: Towards detail-rich video super-resolution
Url: https://videogigagan.github.io/
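VideoGigaGAN is a new generative video super-resolution (VSR) model that produces videos with high-frequency detail and temporal consistency. It builds on GigaGAN, a large-scale image upsampler. The authors identify several key techniques that significantly improve the temporal consistency of the upsampled video. According to their experiments, unlike previous VSR methods, VideoGigaGAN generates temporally consistent videos with finer appearance detail. They validate the model by comparing it with state-of-the-art VSR models on public datasets and by showing video results at 8× super-resolution.
The architecture builds on the asymmetric U-Net of the GigaGAN image upsampler. To enforce temporal consistency, the authors first inflate the image upsampler into a video upsampler by adding temporal attention layers to the decoder blocks. They further improve consistency by incorporating features from a flow-guided propagation module. To suppress aliasing artifacts, they use anti-aliasing blocks in the encoder's downsampling layers. Finally, they pass high-frequency features directly to the decoder layers through skip connections, compensating for the detail lost in the BlurPool process.
Compared with previous models, VideoGigaGAN produces results that are richer in detail with comparable temporal consistency, and it can handle generic videos across different categories.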
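To make the anti-aliasing idea concrete, here is a minimal 1D sketch in plain Python. It is not the authors' implementation: the real model applies learned 2D layers inside a U-Net, whereas this toy assumes a fixed `[1, 2, 1] / 4` blur kernel, a stride of 2, and hypothetical function names. It only illustrates why blurring before subsampling (BlurPool) suppresses aliasing, and why the high-frequency part removed by the blur has to reach the decoder some other way, for example through a skip connection.

```python
def blur(signal):
    """Low-pass filter with the kernel [1, 2, 1] / 4 (assumed), replicating the edges."""
    padded = [signal[0]] + list(signal) + [signal[-1]]
    return [
        (padded[i] + 2 * padded[i + 1] + padded[i + 2]) / 4
        for i in range(len(signal))
    ]


def naive_downsample(signal, stride=2):
    """Keep every `stride`-th sample with no filtering; prone to aliasing."""
    return list(signal[::stride])


def blurpool_downsample(signal, stride=2):
    """Blur first, then subsample: the BlurPool idea in one dimension."""
    return blur(signal)[::stride]


def high_frequency_residual(signal):
    """Detail removed by the blur; a skip connection could carry this to the decoder."""
    low = blur(signal)
    return [x - l for x, l in zip(signal, low)]


if __name__ == "__main__":
    # The highest-frequency pattern a sampled signal can hold.
    signal = [1, -1, 1, -1, 1, -1, 1, -1]
    print("naive:   ", naive_downsample(signal))
    print("blurpool:", blurpool_downsample(signal))
    print("residual:", high_frequency_residual(signal))
```

Naive subsampling turns the alternating pattern into a constant signal, which is exactly the kind of false structure aliasing produces. The blurred version stays close to zero away from the padded edge, and the residual holds the fine detail that the low-pass path discarded.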
Post by: CharlesW
metalrain: Video quality seems really good, but the limitations are quite restrictive: "Our model encounters challenges when processing extremely long videos (e.g. 200 frames or more)". I'd say most videos in practice are longer than 200 frames, so a lot more research is still needed.
Aissen: This is great for entertainment (and hopefully the main application), but we need clear marking of such type of videos before hallucinated details are used as "proofs" of any kind by people not knowing how this works. Software video/photography on smartphones is already using proprietary algorithms that "infer" non-existent or fake details, and this would be at an even bigger scale.
scoobertdoobert: Is anyone else concerned at the societal effects of technology like this? In one of the examples they show a young girl. In the upscale example it's quite clearly hallucinating makeup and lipstick. I'm quite worried about tools like this perpetuating social norms even further.
geor9e: This is great. I look forward to when cell phones run this at 60fps. It will hallucinate wrong, but pixel perfect moons and license plate numbers.
cjensen: The video of the owl is a great example of doing a terrible job without the average Joe noticing. The real owl has fine light/dark concentric circles on its face. The app turned it into gray because it does not see any sign of the circles. The real owl has streaks of spots. The app turned them into solid streaks because it saw no sign of spots. There's more where this came from, but basically it only looks good to someone who has no idea what the owl should look like.