Video Generation: A Short Survey (Draft)

Author: Feather轻飞 | Published: 2018-05-06 22:05

Year 2018

March

1. Probabilistic Video Generation using Holistic Attribute Control https://arxiv.org/pdf/1803.08085.pdf

   a. Videos express highly structured spatio-temporal patterns of visual data. A video can be thought of as governed by two factors:

        (i) temporally invariant (e.g., person identity) or slowly varying (e.g., activity) attribute-induced appearance, encoding the persistent content of each frame

        (ii) inter-frame motion or scene dynamics (e.g., encoding the evolution of the person executing the action).

   b. VideoVAE

       Handles both video generation and future prediction.

       Generates a video (short clip) by decoding samples drawn sequentially from a latent-space distribution into full video frames:

              - VAE: encodes/decodes frames into/from the latent space

              - RNN: models the dynamics in the latent space

       Improves generation consistency through temporally-conditional sampling, and quality by:

              - structuring the latent space with attribute controls

              - ensuring that attributes can be both inferred and conditioned on during learning/generation
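To make the "sample sequentially in latent space, then decode" idea concrete, here is a minimal toy sketch in plain Python. It is not the paper's architecture: the function names (`sample_latent_sequence`, `decode_frame`), the scalar `attribute`, the tanh state update, and all constants are assumptions chosen only for illustration. The point is just that each latent is drawn conditioned on the previous state and on the attribute, and each latent is then decoded into a frame.

```python
import math
import random

def sample_latent_sequence(num_frames, latent_dim, attribute, seed=0):
    """Draw z_1..z_T one at a time; each draw depends on the running state and the attribute."""
    rng = random.Random(seed)
    hidden = [0.0] * latent_dim  # recurrent state (toy stand-in for the RNN)
    latents = []
    for _ in range(num_frames):
        # Toy conditional prior: the mean depends on the hidden state and the attribute.
        mean = [0.9 * h + 0.1 * attribute for h in hidden]
        z = [m + 0.05 * rng.gauss(0.0, 1.0) for m in mean]
        latents.append(z)
        # Toy recurrent update: squash the new sample into the state.
        hidden = [math.tanh(v) for v in z]
    return latents

def decode_frame(z, height=2, width=2):
    """Toy stand-in decoder: map a latent vector to a tiny frame with pixel values in (0, 1)."""
    pixel = 1.0 / (1.0 + math.exp(-sum(z)))
    return [[pixel] * width for _ in range(height)]

latents = sample_latent_sequence(num_frames=5, latent_dim=3, attribute=1.0)
video = [decode_frame(z) for z in latents]
print(len(video), len(video[0]), len(video[0][0]))  # prints: 5 2 2
```

Because every draw is centered on a function of the previous state, consecutive latents drift smoothly instead of jumping, which is the intuition behind temporally-conditional sampling.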

2. Learning to Generate Time-Lapse Videos Using Multi-Stage Dynamic Generative Adversarial Networks

3. Every Smile is Unique: Landmark-Guided Diverse Smile Generation

Year 2017

- By the way

 I like this Stanford CS231n course report: http://cs231n.stanford.edu/reports/2017/pdfs/323.pdf

1. Dynamics Transfer GAN: Generating Video by Transferring Arbitrary Temporal Dynamics from a Source Video to a Single Target Image

- Spatial constructs come from the target image; dynamics come from the source video sequence.

 To preserve the spatial construct of the target image:

             - the appearance of the source video sequence is suppressed

             - only the dynamics are extracted (using the proposed appearance-suppressed dynamics feature) before being imposed onto the target image

 The spatial and temporal consistencies are verified via two discriminator networks:

             - discriminator A validates the fidelity of the generated frames' appearance

             - discriminator B validates the dynamic consistency of the generated video sequence
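A rough sketch of how a generator could be trained against two such discriminators at once. This assumes a standard binary cross-entropy adversarial term for each discriminator and a hypothetical `dynamics_weight` balancing factor; the paper's actual objective may differ, and the names `bce` and `generator_loss` are made up for this note.

```python
import math

def bce(prob, target):
    """Binary cross-entropy for a single probability in (0, 1)."""
    eps = 1e-12
    return -(target * math.log(prob + eps) + (1 - target) * math.log(1 - prob + eps))

def generator_loss(frame_scores, sequence_score, dynamics_weight=1.0):
    """Generator objective: fool both the per-frame and the whole-sequence discriminator.

    frame_scores: discriminator A's "real" probability for each generated frame.
    sequence_score: discriminator B's "real" probability for the whole generated sequence.
    """
    appearance_term = sum(bce(p, 1.0) for p in frame_scores) / len(frame_scores)
    dynamics_term = bce(sequence_score, 1.0)
    return appearance_term + dynamics_weight * dynamics_term

# The loss drops as both discriminators become more convinced the output is real.
print(generator_loss([0.9, 0.9], 0.9) < generator_loss([0.1, 0.1], 0.1))  # prints: True
```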

Results:

             - successfully transferred arbitrary dynamics of the source video sequence onto a target image

             - maintained the spatial constructs (appearance) of the target image while generating spatially and temporally consistent video sequences.

Note: It is ### everything (Literature Review in its intro) because it is quite new.

2. Deep Video Generation, Prediction and Completion of Human Action Sequences https://arxiv.org/pdf/1711.08682.pdf


3. Video Generation from Text https://arxiv.org/pdf/1710.00421.pdf

- Hybrid VAE plus GAN

- Two parts:

    - Static: uses a "gist" to sketch the text-conditioned background color and object layout (LSTM, RNN structure)

    - Dynamic: a Text2Filter (Section 3.3 of the paper)

- Note: Quite compact. Need time to digest.
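As a reminder of what "turn the text into a filter" could look like mechanically, here is a toy sketch. It assumes the text-derived filter is applied to the gist image by ordinary 2D convolution. The `text_to_filter` function below is a purely hypothetical stand-in (it hashes character codes into a normalized kernel); in the paper the filter is produced by a learned network, which this does not attempt to reproduce.

```python
def text_to_filter(text, size=3):
    """Hypothetical stand-in for Text2Filter: derive a normalized size x size kernel from the text."""
    codes = [ord(c) for c in text] or [1]
    raw = [[1 + codes[(r * size + c) % len(codes)] % 7 for c in range(size)]
           for r in range(size)]
    total = sum(sum(row) for row in raw)
    return [[v / total for v in row] for row in raw]

def convolve_valid(image, kernel):
    """Plain 2D cross-correlation without padding."""
    k = len(kernel)
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    return [[sum(image[i + a][j + b] * kernel[a][b] for a in range(k) for b in range(k))
             for j in range(out_w)]
            for i in range(out_h)]

gist = [[float((i + j) % 4) for j in range(5)] for i in range(5)]  # toy 5x5 "gist" image
kernel = text_to_filter("a person is swimming")
features = convolve_valid(gist, kernel)
print(len(features), len(features[0]))  # prints: 3 3
```

The only thing worth taking from this is the data flow: the same gist yields different features for different captions because the filter itself depends on the text.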

4. Learning to Generate Time-Lapse Videos Using Multi-Stage Dynamic Generative Adversarial Networks

   https://arxiv.org/pdf/1709.07592.pdf

5. MoCoGAN: Decomposing Motion and Content for Video Generation

   https://arxiv.org/pdf/1707.04993.pdf


6. To Create What You Tell: Generating Videos from Captions

    https://www.microsoft.com/en-us/research/wp-content/uploads/2017/11/BNI02-panA.pdf


- Temporal GANs conditioning on Captions, namely TGANs-C

     - the input (caption representation plus noise) is transformed into a frame sequence with 3D spatio-temporal convolutions

     - GAM evaluation metric (Section 3.4, Experimental Setting)

- Model Architecture

            - 3.1.1 Generator

                     - Given a sentence 𝒮, a bi-LSTM contextually embeds the input word sequence, and an LSTM-based encoder then produces the sentence representation S.

                     - The generator takes the concatenation of S and a random noise variable z as input and synthesizes realistic videos from it.

            - 3.1.2 Discriminator: the discriminator network 𝐷 includes three discriminators:

                           a. video discriminator: classifies real videos versus generated ones and optimizes video-caption matching

                           b. frame discriminator: discriminates between real and fake frames and aligns frames with the conditioning caption

                           c. motion discriminator: emphasizes that adjacent frames in the generated videos run smoothly

            - 3.1.3 The whole model is trained with three losses: video-level matching-aware loss, frame-level matching-aware loss, and temporal coherence loss
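To pin down what "matching-aware" means, here is a small sketch of a discriminator loss over three kinds of (video, caption) pairs. It follows the general matching-aware idea used in text-conditioned GANs; the exact form, the equal 1/3 weighting, and the name `matching_aware_loss` are my assumptions and are not copied from the paper.

```python
import math

def matching_aware_loss(d_real_match, d_real_mismatch, d_fake_match):
    """Discriminator loss over three (video, caption) pairings, each scored in (0, 1).

    d_real_match:    score for a real video with its correct caption (should go to 1)
    d_real_mismatch: score for a real video with a wrong caption     (should go to 0)
    d_fake_match:    score for a generated video with the caption    (should go to 0)
    """
    eps = 1e-12
    return -(math.log(d_real_match + eps)
             + math.log(1.0 - d_real_mismatch + eps)
             + math.log(1.0 - d_fake_match + eps)) / 3.0

confident = matching_aware_loss(0.9, 0.1, 0.1)
confused = matching_aware_loss(0.5, 0.5, 0.5)
print(confident < confused)  # prints: True
```

The mismatched-caption term is what forces the discriminator to check the caption at all, rather than only judging whether the video looks real.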


Year 2016

1. Generating Videos with Scene Dynamics

     https://arxiv.org/abs/1609.02612

- a spatio-temporal convolutional architecture

- untangles the scene’s foreground from the background.

- experiments show the model internally learns useful features for recognizing actions with minimal supervision

- suggests that scene dynamics are a promising signal for representation learning

- Slides : https://pdfs.semanticscholar.org/presentation/7188/6726f0a1b4075a7213499f8f25d7c9fb4143.pdf
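The foreground/background untangling comes down to a per-pixel blend: a moving foreground stream, a mask, and a single static background are combined as mask * foreground + (1 - mask) * background. Below is a tiny sketch of that composition on nested lists. The function name `compose_video` and the toy shapes are mine, and the networks that would produce the three inputs are left out entirely.

```python
def compose_video(foreground, mask, background):
    """Blend per pixel: mask * foreground + (1 - mask) * background.

    foreground, mask: T x H x W nested lists (one entry per frame).
    background:       H x W nested list, shared by every frame (static scene).
    """
    return [[[m * f + (1.0 - m) * b for m, f, b in zip(m_row, f_row, b_row)]
             for m_row, f_row, b_row in zip(m_frame, f_frame, background)]
            for m_frame, f_frame in zip(mask, foreground)]

foreground = [[[1.0, 1.0]], [[0.5, 0.5]]]  # 2 frames, 1 x 2 pixels each
mask = [[[1.0, 0.0]], [[0.0, 1.0]]]        # 1 selects foreground, 0 selects background
background = [[0.2, 0.2]]                  # one static 1 x 2 background

print(compose_video(foreground, mask, background))
# prints: [[[1.0, 0.2]], [[0.2, 0.5]]]
```

Since the background is shared across all frames, any motion in the output has to come through the mask and the foreground stream.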
