With the rollout of 4G networks and growing network bandwidth, video has become the main medium through which Internet users consume content, and users increasingly share and browse information through short videos. Video editing features have therefore become both more important and more common, and video editing apps have sprung up everywhere. To better support the Dewu App community business, Dewu has developed its own video editing tool tailored to its needs. We are committed to building a “faster, stronger” video editing tool.

To help everyone understand the Dewu App video editing tool, here is a brief introduction to its main functions and highlights:

  • Resources the video editing tool operates on:

    • Text: including plain text, special artistic text, decorative text, etc.;
    • Images: including static images such as JPEG/PNG, as well as dynamic images such as HEIC/GIF;
    • Video: including videos in various encoding and container formats; the mainstream combination is generally the MP4 container with H.264 video encoding and AAC audio encoding;
    • Audio: including audio in various encoding and container formats; a video file also contains audio tracks.
  • The main operations of the video editing tool:

    • Manipulating pictures and video frames: a video is a sequence of pictures, frame by frame, so manipulating a video frame is the same as manipulating a picture. We apply special effects to pictures and video frames to produce interesting results that attract users.
    • Manipulating audio: mainstream audio operations such as changing speed, adjusting volume, and changing pitch are staples of today's short videos.
  • The final output of the video editing tool is a new video, produced by applying special effects to the chosen resources.

The flow chart below illustrates the workflow of video editing. For simplicity, we take one video as input, add some special effects, and generate a new video.

As the process above shows, the original video A.mp4 is demuxed to separate the audio track and the video track. After both are decoded, audio effects are applied to the audio data and video effects are applied to the video frame data, and the results are then encoded and muxed into a new video. Both decoding and encoding are controlled by queues, which are marked on the flow chart. We do not go into detail here; a general understanding is enough.

With the introduction above, everyone should have a general understanding of video editing tools. To judge whether a video editing tool is good, we mainly look at three aspects: memory usage, export speed, and the clarity of the exported video.

The following explains in detail how we optimized the Dewu App video editing tool in each of these three aspects.

Performance is the primary measure of whether a program is good. No matter how powerful a tool is, if it crashes at random, its memory usage skyrockets, or the application freezes, it can hardly be called excellent. Below we describe in detail how we detect and optimize memory problems in the video editing tool.

Memory optimization starts with good coding habits, especially for audio and video applications, whose memory requirements are very high. For example, a decoded raw frame of a 1080 * 1920 video is also 1080 * 1920 pixels, and at 3 bytes per pixel it occupies 1080 * 1920 * (8 * 3) / 8 = 6,220,800 bytes, or about 5.93 MB. With that much memory per frame, the typical 30 frames in 1 second take up about 178 MB. Without control, no phone, however powerful, can withstand this. We hope the following memory detection and optimization approaches are of some help.
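As a quick check of the arithmetic above, here is a minimal Python sketch. The function name `frame_bytes` and the default of 24 bits per pixel are illustrative assumptions that match the (8 * 3) / 8 term in the text; they are not code from the Dewu tool.

```python
MIB = 1024 * 1024  # bytes per MiB


def frame_bytes(width: int, height: int, bits_per_pixel: int = 24) -> int:
    """Size in bytes of one decoded frame (assumes tightly packed pixels)."""
    return width * height * bits_per_pixel // 8


one_frame = frame_bytes(1080, 1920)  # one decoded 1080 * 1920 frame
one_second = one_frame * 30          # 30 decoded frames held at the same time

print(f"one frame: {one_frame / MIB:.2f} MiB")
print(f"30 frames: {one_second / MIB:.2f} MiB")
```

Other pixel formats change the result. For instance, a 12-bit-per-pixel YUV 4:2:0 layout would halve it, which is why the number of decoded frames kept in memory matters more than the exact format.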

3.1 Rational Design Queue

When introducing the video editing workflow above, we mentioned the decoding queue and the encoding queue. Queues are used very frequently in audio and video processing, and they are introduced as a control mechanism precisely because memory is limited. If this is still unclear, the flow chart below should make it click.
We analyze only the decoding part to show how the queues are used.

There are several important queues in a video editing tool:

  • During decoding:

    • Video Packet Queue: holds Packets before video decoding. The generally recommended queue size is 100.
    • Audio Packet Queue: holds Packets before audio decoding. The generally recommended queue size is 150.
    • Video Frame Queue: holds Frames after video decoding. The generally recommended queue size is 3.
    • Audio Frame Queue: holds Frames after audio decoding. The generally recommended queue size is 8.
  • During encoding:

    • Encode Video Packet Queue: holds Packets after video encoding. The generally recommended queue size is 100.
    • Encode Audio Packet Queue: holds Packets after audio encoding. The generally recommended queue size is 150.

Sizing the queues as above minimizes memory usage and improves the user experience while keeping everything functioning normally.
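To illustrate how a bounded queue caps memory, here is a minimal Python sketch, not the Dewu implementation. The dictionary keys, `decode_worker`, and `run_demo` are hypothetical names; only the capacities come from the list above. The producer blocks as soon as the Video Frame Queue holds 3 items, so decoded frames can never pile up faster than they are consumed.

```python
import queue
import threading

# Queue capacities recommended in the article (key names are illustrative).
QUEUE_SIZES = {
    "video_packet": 100,
    "audio_packet": 150,
    "video_frame": 3,
    "audio_frame": 8,
    "encode_video_packet": 100,
    "encode_audio_packet": 150,
}


def make_queues(sizes=QUEUE_SIZES):
    """Create one bounded queue per pipeline stage."""
    return {name: queue.Queue(maxsize=size) for name, size in sizes.items()}


def decode_worker(frames: queue.Queue, total_frames: int) -> None:
    """Stand-in decoder: put() blocks whenever the frame queue is full."""
    for index in range(total_frames):
        frames.put(index)
    frames.put(None)  # sentinel: no more frames


def run_demo(total_frames: int = 20):
    """Decode on a worker thread, consume on the calling thread."""
    frames = make_queues()["video_frame"]
    worker = threading.Thread(target=decode_worker, args=(frames, total_frames))
    worker.start()
    consumed, peak = [], 0
    while True:
        peak = max(peak, frames.qsize())  # observed depth never exceeds 3
        item = frames.get()
        if item is None:
            break
        consumed.append(item)
    worker.join()
    return consumed, peak
```

The small frame queue and the large packet queues reflect the cost difference: a compressed Packet is tiny compared with a decoded Frame, so buffering many packets is cheap while buffering many frames is not.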

3.2 Troubleshoot memory leaks

There are many ways to troubleshoot memory leaks on Android; here are two: Asan and the Android Studio Profiler.

Asan, short for AddressSanitizer, is a fast, compiler-based tool for detecting memory errors in native code. Asan detects the following four core classes of problems:

  • Stack and heap buffer overflow and underflow

  • Heap use after free

  • Stack use outside scope

  • Double free and wild free

For how to use Asan, we recommend the official Google documentation, so we will not go into more detail here: https://github.com/google/sanitizers/wiki/AddressSanitizer

As for the Profiler, detecting native memory usage requires API >= 29, and it needs to be used with great care.

The following is a stack trace we captured with Asan in a demo:

20042-20042/? A/DEBUG: *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
20042-20042/? A/DEBUG: Build fingerprint: 'samsung/t2qzcx/t2q:11/RP1A.200720.012/G9960ZCU2AUGE:user/release-keys'
20042-20042/? A/DEBUG: Revision: '13'
20042-20042/? A/DEBUG: ABI: 'arm64'
20042-20042/? A/DEBUG: Timestamp: 2021-09-17 00:32:31+0800
20042-20042/? A/DEBUG: pid: 19946, tid: 20011, name: AudioTrack  >>> com.jeffmony.audioplayer <<<
20042-20042/? A/DEBUG: uid: 10350
20042-20042/? A/DEBUG: signal 6 (SIGABRT), code -1 (SI_QUEUE), fault addr --------
2021-09-17 00:32:31.157 20042-20042/? A/DEBUG: Abort message: '=================================================================
    ==19946==ERROR: AddressSanitizer: heap-use-after-free on address 0x004ac1e41080 at pc 0x007157f69580 bp 0x00705c0bb350 sp 0x00705c0bab08
    READ of size 1792 at 0x004ac1e41080 thread T32 (AudioTrack)
        #0 0x7157f6957c  (/data/app/~~G094WKQQj7KZvdhvGYDLDA==/com.jeffmony.audioplayer-kcu1nmgzpBIQDRJDxCJDOQ==/lib/arm64/libclang_rt.asan-aarch64-android.so+0x9f57c)
        #1 0x706549c228  (/data/app/~~G094WKQQj7KZvdhvGYDLDA==/com.jeffmony.audioplayer-kcu1nmgzpBIQDRJDxCJDOQ==/lib/arm64/libltpaudio.so+0x14228)
        #2 0x706549bcd4  (/data/app/~~G094WKQQj7KZvdhvGYDLDA==/com.jeffmony.audioplayer-kcu1nmgzpBIQDRJDxCJDOQ==/lib/arm64/libltpaudio.so+0x13cd4)
        #3 0x70654994f0  (/data/app/~~G094WKQQj7KZvdhvGYDLDA==/com.jeffmony.audioplayer-kcu1nmgzpBIQDRJDxCJDOQ==/lib/arm64/libltpaudio.so+0x114f0)
        #4 0x70654a9cbc  (/data/app/~~G094WKQQj7KZvdhvGYDLDA==/com.jeffmony.audioplayer-kcu1nmgzpBIQDRJDxCJDOQ==/lib/arm64/libltpaudio.so+0x21cbc)
        #5 0x70654a91d4  (/data/app/~~G094WKQQj7KZvdhvGYDLDA==/com.jeffmony.audioplayer-kcu1nmgzpBIQDRJDxCJDOQ==/lib/arm64/libltpaudio.so+0x211d4)
        #6 0x715af9d188  (/system/lib64/libwilhelm.so+0x1c188)
        #7 0x71570ea290  (/system/lib64/libaudioclient.so+0x8b290)
        #8 0x71570e9480  (/system/lib64/libaudioclient.so+0x8a480)
        #9 0x7156b664d4  (/system/lib64/libutils.so+0x154d4)
        #10 0x71593e9974  (/system/lib64/libandroid_runtime.so+0xa5974)
        #11 0x7156b65db0  (/system/lib64/libutils.so+0x14db0)
        #12 0x7156ace234  (/apex/com.android.runtime/lib64/bionic/libc.so+0xb6234)
        #13 0x7156a68e64  (/apex/com.android.runtime/lib64/bionic/libc.so+0x50e64)


    0x004ac1e41080 is located 0 bytes inside of 1792-byte region [0x004ac1e41080,0x004ac1e41780)
    freed by thread T32 (AudioTrack) here:
        #0 0x7157f74c64  (/data/app/~~G094WKQQj7KZvdhvGYDLDA==/com.jeffmony.audioplayer-kcu1nmgzpBIQDRJDxCJDOQ==/lib/arm64/libclang_rt.asan-aarch64-android.so+0xaac64)
        #1 0x70654a6d2c  (/data/app/~~G094WKQQj7KZvdhvGYDLDA==/com.jeffmony.audioplayer-kcu1nmgzpBIQDRJDxCJDOQ==/lib/arm64/libltpaudio.so+0x1ed2c)
        #2 0x70654a6af0  (/data/app/~~G094WKQQj7KZvdhvGYDLDA==/com.jeffmony.audioplayer-kcu1nmgzpBIQDRJDxCJDOQ==/lib/arm64/libltpaudio.so+0x1eaf0)
        #3 0x706549bf4c  (/data/app/~~G094WKQQj7KZvdhvGYDLDA==/com.jeffmony.audioplayer-kcu1nmgzpBIQDRJDxCJDOQ==/lib/arm64/libltpaudio.so+0x13f4c)
        #4 0x706549bcd4  (/data/app/~~G094WKQQj7KZvdhvGYDLDA==/com.jeffmony.audioplayer-kcu1nmgzpBIQDRJDxCJDOQ==/lib/arm64/libltpaudio.so+0x13cd4)
        #5 0x70654994f0  (/data/app/~~G094WKQQj7KZvdhvGYDLDA==/com.jeffmony.audioplayer-kcu1nmgzpBIQDRJDxCJDOQ==/lib/arm64/libltpaudio.so+0x114f0)
        #6 0x70654a9cbc  (/data/app/~~G094WKQQj7KZvdhvGYDLDA==/com.jeffmony.audioplayer-kcu1nmgzpBIQDRJDxCJDOQ==/lib/arm64/libltpaudio.so+0x21cbc)
        #7 0x70654a91d4  (/data/app/~~G094WKQQj7KZvdhvGYDLDA==/com.jeffmony.audioplayer-kcu1nmgzpBIQDRJDxCJDOQ==/lib/arm64/libltpaudio.so+0x211d4)
        #8 0x715af9d188  (/system/lib64/libwilhelm.so+0x1c188)
        #9 0x71570ea290  (/system/lib64/libaudioclient.so+0x8b290)

The reported error is heap-use-after-free on address 0x004ac1e41080, meaning that memory which had already been freed was accessed. Reading further shows where the memory was freed: 0x004ac1e41080 is located 0 bytes inside of 1792-byte region [0x004ac1e41080,0x004ac1e41780), followed by the stack of the thread that freed it. A great advantage of Asan is that it can trace the path along which memory was freed, which helps prevent memory leaks and wild pointer problems. Wild pointers in particular are extremely difficult to troubleshoot once they occur and are simply a nightmare for C++ development. We hope everyone makes good use of these tools and cultivates good C++ coding habits.
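The frames in the trace above are unsymbolized: each one gives only a library path and an offset. As a small aid for reading such logs, the following Python sketch extracts the (frame number, library, offset) triples and keeps those from one library. The helper names `parse_frames` and `app_frames` are hypothetical, and the regular expression assumes the frame layout shown in this log. The resulting offsets are what you would normally pass to a symbolizer such as addr2line or llvm-symbolizer together with an unstripped copy of the library.

```python
import re

# Matches frames such as:
#   #1 0x706549c228  (/data/app/.../lib/arm64/libltpaudio.so+0x14228)
FRAME_RE = re.compile(r"#(\d+)\s+0x[0-9a-f]+\s+\((\S+)\+0x([0-9a-f]+)\)")


def parse_frames(log_text: str):
    """Return (frame_number, library_file_name, offset) for each stack frame."""
    frames = []
    for match in FRAME_RE.finditer(log_text):
        number, path, offset = match.groups()
        frames.append((int(number), path.rsplit("/", 1)[-1], int(offset, 16)))
    return frames


def app_frames(log_text: str, library: str):
    """Keep only the frames that belong to the given library."""
    return [frame for frame in parse_frames(log_text) if frame[1] == library]


# Three frames copied from the trace above.
sample = """
    #0 0x7157f6957c  (/data/app/~~G094WKQQj7KZvdhvGYDLDA==/com.jeffmony.audioplayer-kcu1nmgzpBIQDRJDxCJDOQ==/lib/arm64/libclang_rt.asan-aarch64-android.so+0x9f57c)
    #1 0x706549c228  (/data/app/~~G094WKQQj7KZvdhvGYDLDA==/com.jeffmony.audioplayer-kcu1nmgzpBIQDRJDxCJDOQ==/lib/arm64/libltpaudio.so+0x14228)
    #6 0x715af9d188  (/system/lib64/libwilhelm.so+0x1c188)
"""

for number, lib, offset in app_frames(sample, "libltpaudio.so"):
    print(f"#{number} {lib} +{offset:#x}")
```

Filtering to the app's own library (here libltpaudio.so) narrows the search quickly, because the system frames only show how the audio callback was reached.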

3.3 Optimizing threads

Threads are another important factor affecting memory, and a video editing tool involves many of them. Thread usage must follow some basic principles:

Taking the editing module as an example, here is a list of all the threads we use:

If a separate audio file is inserted, two additional threads need to be added:

  • Music file playback thread

  • Music file decoding thread

The list above is the minimum set of threads a video editing tool needs to work properly. If your video editing tool uses more threads than this, we suggest trimming them where you can. After all, one fewer thread means less overhead and one less thread to synchronize.

We also wrote our own message dispatch SDK at the C++ layer, modeled on Android's underlying message mechanism. We will explain this custom message dispatch SDK in a separate article, so we stop here.

The ultimate goal of using a video editing tool is to export a video, and a very slow export is unbearable. As introduced above, exporting a video goes through “decoding, applying special effects, encoding”, and the decoding and encoding steps have a crucial impact on speed. Because decoding and encoding video consume a lot of resources, there are currently two main approaches: “soft decoding/encoding” and “hard decoding/encoding”.

If you have used FFmpeg or another codec that processes video on the CPU, you may have run into slow processing. This is because “soft decoding/encoding” does all of the computation on the CPU, which is the original processing method, and the CPU processes video far more slowly than a DSP chip, so it naturally takes a long time. “Hard decoding/encoding” is handled by dedicated hardware, such as a DSP or hardware codec block, that is specially optimized for video decoding and encoding, so it is very fast.

Android implements “hard decoding/encoding” with MediaCodec, and iOS implements it with VideoToolbox. Here we focus on optimizing encoding and decoding speed on Android.

As the process above shows, encoding comes after decoding. For a 60s (30fps) video, 1800 frames must be decoded and then 1800 frames must be encoded to fully generate another video, so this serial waiting is the main reason export is time-consuming.

Borrowing from multi-threading, we divide the 60s video into two segments, decode and process the two segments at the same time, and export them as two 30s temporary cache files. We then merge the two 30s videos into a 60s B.mp4 and finally delete the temporary cache files. Each pipeline now only has to process 900 frames, which can theoretically double the export speed.
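The splitting step can be sketched in a few lines of Python. This is only an illustration of the frame arithmetic, with a hypothetical helper `split_segments`; a real exporter would most likely need to cut at keyframes so that each segment can be decoded independently, which this sketch ignores.

```python
def split_segments(total_frames: int, parts: int):
    """Split [0, total_frames) into `parts` contiguous (start, end) ranges."""
    if parts < 1:
        raise ValueError("parts must be at least 1")
    base, extra = divmod(total_frames, parts)
    segments, start = [], 0
    for index in range(parts):
        # Spread any remainder over the first `extra` segments.
        length = base + (1 if index < extra else 0)
        segments.append((start, start + length))
        start += length
    return segments


total = 60 * 30  # 60s at 30fps = 1800 frames
for start, end in split_segments(total, 2):
    print(f"frames {start}..{end - 1}: {end - start} frames")
```

With two segments each pipeline handles 900 frames, which is where the theoretical 2x speedup comes from. In practice the gain is smaller because the pipelines share hardware and the merge step adds work.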

This is parallel export. The following is the basic process of parallel export of Dewu App.

First of all, exporting video consumes a resource: MediaCodec, which ultimately hands the work to the hardware codec. The number of MediaCodec instances on a phone is limited. Under normal circumstances a phone provides at most 16 MediaCodec instances, and if more than 16 are in use at once, the phone will not work properly. MediaCodec resources are also shared by all apps on the phone. Therefore, more parallel segments is not always better.

  • With a single segment, two MediaCodecs are needed (one for decoding video and one for encoding video). Note: audio decoding and encoding do not have to use MediaCodec. Audio takes far less time and is not the bottleneck.

  • Two segments require four MediaCodecs, three segments require six, four segments require eight, and so on.

Here are the test results for parallel export:
With two parallel segments, speed improves by 50% to 70% and memory increases by 20%. With three parallel segments, speed improves by 60% to 90% and memory increases by 80%. Beyond three segments, speed no longer improves significantly. We recommend two parallel segments, and three on some high-performance models.
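Combining the codec budget with these measurements, the choice of segment count can be sketched as follows in Python. This is an illustrative policy, not the Dewu App's actual logic: `codecs_in_use` and `high_end_device` are hypothetical inputs that the caller would have to estimate, and the limit of 16 is the typical figure quoted above rather than a guaranteed value.

```python
CODECS_PER_SEGMENT = 2   # one video decoder + one video encoder per segment
DEVICE_CODEC_LIMIT = 16  # typical per-device maximum quoted in the article


def codecs_needed(segments: int) -> int:
    """MediaCodec instances required to export `segments` in parallel."""
    return segments * CODECS_PER_SEGMENT


def choose_segment_count(codecs_in_use: int, high_end_device: bool = False) -> int:
    """Prefer 2 segments (3 on high-end devices), capped by the codec budget."""
    preferred = 3 if high_end_device else 2
    free = max(DEVICE_CODEC_LIMIT - codecs_in_use, 0)
    affordable = free // CODECS_PER_SEGMENT
    # Fall back to a single segment; the caller must still handle the case
    # where even one decoder/encoder pair cannot be created.
    return max(1, min(preferred, affordable))
```

Capping at three segments follows the measurement that more segments bring little extra speed while memory keeps growing.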

If you still have questions about file operations during export, the following schematic diagram shows the parallel export process and how local files are handled:

  • During parallel export, two temporary files are generated

  • After the parallel export completes, the two temporary files are merged into a new file and then deleted (saving the user’s precious storage space)

  • The original file jeffmony_out.mp4 is never deleted or modified
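The file handling described in these three points can be sketched in Python as below. This is a toy model under loud assumptions: raw bytes stand in for video frames, `export_segment` stands in for the whole decode, effects, and encode pipeline, and the merge is a plain byte concatenation. A real MP4 cannot be merged this way and needs a muxer; only the lifecycle of the temporary files is the point here.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def export_segment(data: bytes, start: int, end: int, tmp_dir: str, index: int) -> str:
    """Stand-in for decode -> effects -> encode: writes one temporary file."""
    path = os.path.join(tmp_dir, f"segment_{index}.tmp")
    with open(path, "wb") as handle:
        handle.write(data[start:end])
    return path


def parallel_export(source_path: str, output_path: str, parts: int = 2) -> str:
    with open(source_path, "rb") as handle:  # the source is only ever read
        data = handle.read()
    bounds = [(len(data) * i // parts, len(data) * (i + 1) // parts)
              for i in range(parts)]
    tmp_dir = tempfile.mkdtemp(prefix="compile_")
    try:
        with ThreadPoolExecutor(max_workers=parts) as pool:
            futures = [pool.submit(export_segment, data, s, e, tmp_dir, i)
                       for i, (s, e) in enumerate(bounds)]
            tmp_paths = [future.result() for future in futures]
        with open(output_path, "wb") as out:  # merge in segment order
            for path in tmp_paths:
                with open(path, "rb") as segment:
                    out.write(segment.read())
    finally:
        for name in os.listdir(tmp_dir):  # always clean up the temporaries
            os.remove(os.path.join(tmp_dir, name))
        os.rmdir(tmp_dir)
    return output_path
```

Doing the cleanup in a `finally` block means the temporary files are removed even if one segment fails, so a failed export does not leave stale files in the user's storage.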

Tips: at present, the temporary files generated during processing and the final adapted files are saved in /sdcard/Pictures/duapp/Compile/, and cleaning up the temporary files after processing triggers a protection mechanism on some models. We recommend moving them to the app's private directory in the future.

Of course, there are other ways to improve export speed. For example, during video frame special effects processing, we recommend:

These practices come from our hands-on experience developing the video editing tool, and we hope they are of some help.

One important indicator of whether a video editing feature is good enough is whether the video it exports is clear enough under the same conditions. Generally speaking, there are two ways to measure video clarity:

  • Subjective criteria: have some users watch different videos and compare their clarity based on the users' perception. Users generally judge clarity by color, brightness, softness, and so on.

  • Objective criteria: use an algorithm to compute a quality score for the video images. We currently recommend VMAF, the open source library released by Netflix, to score the quality of video frames.

In practice, the subjective standard is relatively accurate but hard to operate: with massive numbers of videos it requires a lot of manpower and cannot be carried out effectively. In daily work we therefore recommend objective standards for large-scale measurement and subjective standards for key judgments, weighted according to the importance of the business.

The following are concrete ways to improve video clarity, based on our actual work:

  • Optimizing basic video encoding parameters

    • Profile: there are three profiles, namely Baseline, Main, and High. Baseline Profile gives the lowest quality and is supported from Android 3.0 onward. Main Profile gives better quality than Baseline Profile but is only supported from Android 7.0 onward. High Profile gives the highest quality and is also only supported from Android 7.0 onward. Before setting the encoder profile and level, we need to check whether it is supported on the current device.
    • Bitrate: the video bitrate is the number of data bits transmitted per unit of time, usually expressed in kbps. The higher the bitrate, the more data per unit of time and the higher the video quality. Higher is not always better, though: beyond a necessary limit, the quality gain is barely noticeable, so we recommend deriving the bitrate with an appropriate factor. Bitrate = width * height * frameRate * factor, where factor = 0.15.
    • Bitrate mode: there are three common bitrate modes: VBR (Variable Bit Rate), CBR (Constant Bit Rate), and ABR (Average Bit Rate). Of these, we find ABR the best choice because it balances quality and file size.
    • B frames: a video consists of I frames, P frames, and B frames. I frames are the largest, P frames are next, and B frames are the smallest. During encoding we use as many B frames as is reasonable, which greatly reduces the size of the video without reducing clarity, so that we can raise the bitrate accordingly and ultimately improve clarity.
  • HEVC encoding: HEVC can greatly improve clarity without increasing file size. At the same image quality, an HEVC-encoded video is about 40% smaller than an H.264-encoded one.

  • Color tuning

    • Adjust brightness, contrast, color temperature, saturation, sharpness, and other color parameters together to optimize the overall picture and make the video look “clearer”.
  • Super-resolution algorithm: we use the ESRGAN algorithm, taking advantage of machine learning to deblur, resize, denoise, and sharpen pictures and videos, reconstructing images to achieve super-resolution.

    • Feature extraction: computing noise
    • Nonlinear mapping: upscaling, blurring noise
    • Image reconstruction: interpolation, smooth transitions, denoising
  • The following is a before-and-after comparison using the super-resolution algorithm. The picture on the right is clearly sharper, with much less noise, a brighter image, and smoother transitions.
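To make the bitrate rule of thumb from the encoding settings above concrete, here is a minimal Python sketch. The function name `recommended_bitrate` is hypothetical, and reading the result as bits per second is an assumption on our part (it is the unit Android's `MediaFormat.KEY_BIT_RATE` expects); the formula and the factor of 0.15 come from the text.

```python
def recommended_bitrate(width: int, height: int, frame_rate: int,
                        factor: float = 0.15) -> int:
    """Bitrate = width * height * frameRate * factor, rounded to whole bits/s."""
    return round(width * height * frame_rate * factor)


bps = recommended_bitrate(1080, 1920, 30)
print(f"{bps} bps = {bps / 1000:.0f} kbps")
```

For a 1080 * 1920 video at 30fps this gives roughly 9.3 Mbps. Lowering the factor trades quality for file size, so the value is worth tuning per business scenario rather than treating 0.15 as fixed.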

If you want to know the technical details of video clarity optimization, you can refer to the article “Video Definition Optimization Guide”.

This article began by introducing the main functions of the Dewu App video editing tool and proposed three dimensions for optimizing it:

For “improving video export speed”, we focused on the “parallel export” technique. Judging from the final results, the improvement in export speed is very noticeable. We also explained why the “parallel export” process generates temporary files and why those files must be deleted once the export completes. We try our best to give users a better experience.

Finally, the super-resolution algorithm described under “Improving the Clarity of Exported Video” brings a significant improvement: compared with the original frame, the super-resolved video frame is clearer, has less noise, and shows more realistic detail.

In the future, we will share more in-depth technical content involving AR special effects, so stay tuned.

*Text / Jeff Mony
