When you look at a picture, which parts draw your eye first? Can a machine learning model predict in advance where a viewer's attention will land?
Based on this idea, Google trained a machine learning model to make exactly this prediction and applied it to the JPEG XL image format. With the model applied, the browser first loads the part of the image the viewer will look at first. From the user's perspective, the image appears to load noticeably faster, which can significantly improve the experience.
Of course, the model is not limited to JPEG XL encoding; it suits any project that needs to prioritize content loading according to user attention. In VR, for example, it could be combined with the headset's cameras to adjust image clarity, prioritizing the parts of the scene the user can actually see.
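For a sense of how such a prediction could be obtained, the open-source release includes a TensorFlow Lite model that maps an image to a predicted attention center. The sketch below shows the generic TF Lite inference pattern; the file name center.tflite, the float32 input, and the (x, y) output layout are illustrative assumptions, not the repository's documented interface.

```python
import numpy as np
import tensorflow as tf
from PIL import Image

# "center.tflite" is a placeholder name; the real model file ships in
# the google/attention-center repository.
interpreter = tf.lite.Interpreter(model_path="center.tflite")
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

# Resize the image to whatever input shape the model declares.
_, height, width, _ = inp["shape"]
image = Image.open("photo.jpg").convert("RGB").resize((width, height))
# Assumes a float32 input tensor; a real caller should check inp["dtype"].
pixels = np.expand_dims(np.asarray(image, dtype=np.float32), axis=0)

interpreter.set_tensor(inp["index"], pixels)
interpreter.invoke()

# Assumed output layout: a single (x, y) attention-center coordinate.
center_x, center_y = interpreter.get_tensor(out["index"]).squeeze()
print(f"Predicted attention center: ({center_x:.1f}, {center_y:.1f})")
```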
If you have been on the Internet long enough, you may remember the era of slow connections: a picture would appear line by line, with a jarring sense of fragmentation, and until an image was 60-70% loaded you often couldn't tell what it depicted at all. Connections are now fast enough that images usually load in an instant, and users rarely notice the process, but this model still matters in regions with limited bandwidth.
The model works as follows: when an image loads, a low-resolution version of the whole picture is displayed first (as shown in the picture above, loading has just begun). As your eyes turn to the image, the model has already predicted the area your gaze will focus on, and that region is loaded first until it is sharp (as shown in the figure below, 30% loaded).
Then, as your eyes move across the picture, the model has already guessed where you will look next, and those areas of the image gradually load and become clear (as shown in the picture below, 50% loaded).
The rest of the image continues to load progressively, ordered by predicted attention (as shown in the figure below, 80% loaded).
Finally come the edge areas the user's eyes may never focus on at all, and loading reaches 100%.
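This center-first ordering has to be set up by the encoder ahead of time: libjxl's cjxl encoder can store the image groups nearest a given point first in the file. Below is a minimal Python sketch of invoking cjxl with a precomputed attention center; the flag names follow cjxl's command line but may differ between libjxl versions, so treat them as assumptions.

```python
import subprocess

def encode_center_first(src: str, dst: str, cx: int, cy: int) -> None:
    """Encode src to JPEG XL so that the image groups nearest the
    attention center (cx, cy), in pixels, are stored first in the file."""
    subprocess.run(
        [
            "cjxl", src, dst,
            "--group_order=1",    # center-first instead of scanline order
            f"--center_x={cx}",   # horizontal attention center, in pixels
            f"--center_y={cy}",   # vertical attention center, in pixels
        ],
        check=True,
    )

# Example: store the groups around (800, 450) first.
encode_center_first("photo.png", "photo.jxl", 800, 450)
```

A progressive decoder that renders the bitstream as it arrives will then sharpen the attended region first, which is exactly the effect described above.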
If the model is accurate enough, the user may never notice that the image is loading piece by piece, and may even have the illusion that it was fully loaded from the start.
Google has also released a demo of the technology that users can try for themselves. For the best experience, use a Chromium-based browser with its experimental JPEG XL renderer enabled: go to chrome://flags, search for jxl, and enable the flag.
The demo uses the JPEG XL image format, yet in October Google announced it would remove JPEG XL support in an upcoming Chrome release (did the teams not talk to each other?). It remains unclear where Google plans to apply this machine learning model in the future.
The model's GitHub repository: https://github.com/google/attention-center