У нас есть 2 видео. Одним из них является профессионально созданное видео исполнителя, который исполняет песню в определенных частях, чтобы позволить пользователю петь. Второе видео - это поток, который записывается, пока пользователь смотрит первое видео и поет в тех частях, где остановился исполнитель. Идея состоит в том, чтобы затем объединить эти 2 видео в одно видео, где они будут находиться рядом друг с другом. У нас проблема в том, что видеопоток содержит пропущенные кадры, и при объединении двух видео эти пропущенные кадры приводят к несинхронизации звука между двумя видео.
Команда, которую мы используем для объединения двух видео:
ffmpeg -i "OCTAVIA - DESPUES DE TI_330X220.mp4" -i input.flv -filter_complex "[0:v]setpts=PTS-STARTPTS, pad=iw*2:ih[bg]; [1:v]setpts=PTS-STARTPTS[fg]; [bg][fg]overlay=w; amix=duration=longest" -strict -2 merged.mp4
Вывод мы получаем из этой операции:
ffmpeg version 2.2.2 Copyright (c) 2000-2014 the FFmpeg developers
built on May 8 2014 19:48:20 with llvm-gcc 4.2.1 (LLVM build 2336.11.00)
configuration: --prefix=/Volumes/Ramdisk/sw --enable-gpl --enable-pthreads --enable-version3 --enable-libspeex --enable-libvpx --disable-decoder=libvpx --enable-libmp3lame --enable-libtheora --enable-libvorbis --enable-libx264 --enable-avfilter --enable-libopencore_amrwb --enable-libopencore_amrnb --enable-filters --enable-libgsm --enable-libvidstab --arch=x86_64 --enable-runtime-cpudetect
libavutil 52. 66.100 / 52. 66.100
libavcodec 55. 52.102 / 55. 52.102
libavformat 55. 33.100 / 55. 33.100
libavdevice 55. 10.100 / 55. 10.100
libavfilter 4. 2.100 / 4. 2.100
libswscale 2. 5.102 / 2. 5.102
libswresample 0. 18.100 / 0. 18.100
libpostproc 52. 3.100 / 52. 3.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'OCTAVIA - DESPUES DE TI_330X220.mp4':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
encoder : Lavf55.34.100
Duration: 00:01:40.62, start: 0.033333, bitrate: 379 kb/s
Stream #0:0(eng): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 330x220 [SAR 1:1 DAR 3:2], 246 kb/s, 24 fps, 24 tbr, 12288 tbn, 48 tbc (default)
Metadata:
handler_name : VideoHandler
Stream #0:1(eng): Audio: aac (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 128 kb/s (default)
Metadata:
handler_name : SoundHandler
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
Input #1, flv, from 'input.flv':
Metadata:
creationdate : Wed May 21 18:52:06
level : 3.1
keyFrameInterval: 15
bandwith : 0
fps : 15
codec : H264Avc
profile : baseline
Duration: 00:01:40.46, start: 0.000000, bitrate: 64 kb/s
Stream #1:0: Video: h264 (Baseline), yuv420p(tv), 330x220 [SAR 1:1 DAR 3:2], 15.17 fps, 15 tbr, 1k tbn, 30 tbc
Stream #1:1: Audio: nellymoser, 22050 Hz, mono, flt
File 'merged.mp4' already exists. Overwrite ? [y/N] y
[libx264 @ 0x7ff3f5000600] using SAR=1/1
[libx264 @ 0x7ff3f5000600] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.1 Cache64
[libx264 @ 0x7ff3f5000600] profile High, level 2.1
[libx264 @ 0x7ff3f5000600] 264 - core 142 - H.264/MPEG-4 AVC codec - Copyleft 2003-2014 - http://www.videolan.org/x264.html - options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=3 lookahead_threads=1 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=24 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
Output #0, mp4, to 'merged.mp4':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
encoder : Lavf55.33.100
Stream #0:0: Video: h264 (libx264) ([33][0][0][0] / 0x0021), yuv420p, 660x220 [SAR 1:1 DAR 3:1], q=-1--1, 12288 tbn, 24 tbc (default)
Stream #0:1: Audio: aac ([64][0][0][0] / 0x0040), 48000 Hz, stereo, fltp, 128 kb/s (default)
Stream mapping:
Stream #0:0 (h264) -> setpts
Stream #0:1 (aac) -> amix:input0
Stream #1:0 (h264) -> setpts
Stream #1:1 (nellymoser) -> amix:input1
overlay -> Stream #0:0 (libx264)
amix -> Stream #0:1 (aac)
Press [q] to stop, [?] for help
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623e=00:00:02.26 bitrate= 67.3kbits/s
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623e=00:00:03.47 bitrate= 199.4kbits/s
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623e=00:00:05.24 bitrate= 241.1kbits/s
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623e=00:00:07.70 bitrate= 279.7kbits/s
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623e=00:00:10.04 bitrate= 301.8kbits/s
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623e=00:00:11.96 bitrate= 323.1kbits/s
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623e=00:00:14.33 bitrate= 343.0kbits/s
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[Parsed_overlay_3 @ 0x7ff3f4800440] [framesync @ 0x7ff3f4800528] Buffer queue overflow, dropping.
Last message repeated 10 times
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[Parsed_overlay_3 @ 0x7ff3f4800440] [framesync @ 0x7ff3f4800528] Buffer queue overflow, dropping.
Last message repeated 16 times
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[Parsed_overlay_3 @ 0x7ff3f4800440] [framesync @ 0x7ff3f4800528] Buffer queue overflow, dropping.
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[Parsed_overlay_3 @ 0x7ff3f4800440] [framesync @ 0x7ff3f4800528] Buffer queue overflow, dropping.
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[Parsed_overlay_3 @ 0x7ff3f4800440] [framesync @ 0x7ff3f4800528] Buffer queue overflow, dropping.
Last message repeated 8 times
[Parsed_overlay_3 @ 0x7ff3f4800440] [framesync @ 0x7ff3f4800528] Buffer queue overflow, dropping.
Last message repeated 4 times
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623
[Parsed_overlay_3 @ 0x7ff3f4800440] [framesync @ 0x7ff3f4800528] Buffer queue overflow, dropping.
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[Parsed_overlay_3 @ 0x7ff3f4800440] [framesync @ 0x7ff3f4800528] Buffer queue overflow, dropping.
Last message repeated 10 times
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[Parsed_overlay_3 @ 0x7ff3f4800440] [framesync @ 0x7ff3f4800528] Buffer queue overflow, dropping.
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[Parsed_overlay_3 @ 0x7ff3f4800440] [framesync @ 0x7ff3f4800528] Buffer queue overflow, dropping.
Last message repeated 13 times
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[Parsed_overlay_3 @ 0x7ff3f4800440] [framesync @ 0x7ff3f4800528] Buffer queue overflow, dropping.
Last message repeated 4 times
- this happens a lot more times
Last message repeated 7 times
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[Parsed_overlay_3 @ 0x7ff3f4800440] [framesync @ 0x7ff3f4800528] Buffer queue overflow, dropping.
Last message repeated 19 times
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[Parsed_overlay_3 @ 0x7ff3f4800440] [framesync @ 0x7ff3f4800528] Buffer queue overflow, dropping.
Last message repeated 7 times
frame= 2414 fps=116 q=-1.0 Lsize= 4709kB time=00:01:40.62 bitrate= 383.4kbits/s
video:3067kB audio:1573kB subtitle:0 data:0 global headers:0kB muxing overhead 1.476866%
[libx264 @ 0x7ff3f5000600] frame I:11 Avg QP:17.45 size: 15375
[libx264 @ 0x7ff3f5000600] frame P:905 Avg QP:23.36 size: 2650
[libx264 @ 0x7ff3f5000600] frame B:1498 Avg QP:28.24 size: 383
[libx264 @ 0x7ff3f5000600] consecutive B-frames: 12.1% 7.2% 25.2% 55.5%
[libx264 @ 0x7ff3f5000600] mb I I16..4: 10.6% 49.9% 39.5%
[libx264 @ 0x7ff3f5000600] mb P I16..4: 0.9% 3.5% 1.2% P16..4: 15.9% 10.1% 6.6% 0.0% 0.0% skip:61.8%
[libx264 @ 0x7ff3f5000600] mb B I16..4: 0.1% 0.2% 0.0% B16..8: 14.8% 2.7% 0.8% direct: 0.7% skip:80.7% L0:41.5% L1:47.1% BI:11.4%
[libx264 @ 0x7ff3f5000600] 8x8 transform intra:61.0% inter:46.8%
[libx264 @ 0x7ff3f5000600] coded y,uvDC,uvAC intra: 49.2% 31.1% 19.7% inter: 6.3% 3.7% 0.7%
[libx264 @ 0x7ff3f5000600] i16 v,h,dc,p: 46% 20% 16% 18%
[libx264 @ 0x7ff3f5000600] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 28% 16% 31% 3% 4% 5% 4% 5% 4%
[libx264 @ 0x7ff3f5000600] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 31% 17% 15% 5% 6% 7% 6% 6% 7%
[libx264 @ 0x7ff3f5000600] i8c dc,h,v,p: 71% 11% 15% 3%
[libx264 @ 0x7ff3f5000600] Weighted P-Frames: Y:0.1% UV:0.1%
[libx264 @ 0x7ff3f5000600] ref P L0: 71.2% 13.7% 10.2% 4.9%
[libx264 @ 0x7ff3f5000600] ref B L0: 88.8% 9.0% 2.3%
[libx264 @ 0x7ff3f5000600] ref B L1: 95.6% 4.4%
[libx264 @ 0x7ff3f5000600] kb/s:249.78
Мы попытались преобразовать flv в mp4 перед попыткой слияния, но мы все еще получаем операторы переполнения буфера, и звук все еще не синхронизирован.
У нас есть более 100 видео, которые нужно обрабатывать таким образом, поэтому нам нужен не ручной способ решения этой проблемы.
Любые указатели на то, что мы могли бы попытаться исправить это пустошь, будут оценены.
Два файла, которые я пытаюсь объединить, можно найти здесь:http://www.lauchenauer.info/test/input.flv http://www.lauchenauer.info/test/artist.mp4
Вы можете увидеть проблему синхронизации через 35 секунд в песне, где исполнитель передает ее пользователю, где затем должно начаться аудио пользователя, но после слияния двух файлов она начинается примерно на 8-10 секунд раньше.