LoVA

Demos

We select both 10-second samples from VGGSound and long-form samples from UnAV100. Since the long-form videos are relatively long, we appreciate your patience while the server loads the content. Thank you!
For the best experience, we recommend using earphones, raising the volume, and zooming in on the videos.

Section 1: Long-Form UnAV100 results.

Sample 1: Playing the flute.
Ground Truth SpecVQGAN IM2WAV Diff-Foley
FoleyCrafter TiVA LoVA (Ours)
Sample 2: Wind, wave, boat.
Ground Truth SpecVQGAN IM2WAV Diff-Foley
FoleyCrafter TiVA LoVA (Ours)
Sample 3: Cattle mooing.
Ground Truth SpecVQGAN IM2WAV Diff-Foley
FoleyCrafter TiVA LoVA (Ours)
Sample 4: Raining, lightning and thunder.
Ground Truth SpecVQGAN IM2WAV Diff-Foley
FoleyCrafter TiVA LoVA (Ours)
Sample 5: Playing ping pong.
Ground Truth SpecVQGAN IM2WAV Diff-Foley
FoleyCrafter TiVA LoVA (Ours)

Section 2: 10-second VGGSound results.

Sample 1: Footstep, ice breaking.
Ground Truth SpecVQGAN IM2WAV Diff-Foley
FoleyCrafter TiVA LoVA (Ours)
Sample 2: Engine starting.
Ground Truth SpecVQGAN IM2WAV Diff-Foley
FoleyCrafter TiVA LoVA (Ours)
Sample 3: Child groaning.
Ground Truth SpecVQGAN IM2WAV Diff-Foley
FoleyCrafter TiVA LoVA (Ours)
Sample 4: Owl hooting.
Ground Truth SpecVQGAN IM2WAV Diff-Foley
FoleyCrafter TiVA LoVA (Ours)
Sample 5: Playing the violin.
Ground Truth SpecVQGAN IM2WAV Diff-Foley
FoleyCrafter TiVA LoVA (Ours)