At the Google Search Central Live event held in Singapore a few days ago, Google Webmaster Trends Analyst Gary Illyes said that 60% of the content on the Internet is duplicate.

It’s a well-known fact that there is a lot of duplicate content on the Internet; the open question is how much. As the largest and most popular search engine, Google has its own answer.

Of course, everyone wants to know how Google defines “duplicate content”. Does it mean only 100% identical copies? Or does it also cover pages with different titles but the same underlying content, such as reposts on social networks or pages reworked for search engine optimization? According to the Google Search Central documentation, duplicate content generally refers to substantive blocks of content, within or across domains, that either completely match other content in the same language or are appreciably similar.

Audience members later in the session added context to Gary Illyes’ remark: the figure was mainly meant to explain how Google handles duplicates. Google’s definition of “duplicate content” here is based on how it deduplicates crawled data, roughly in the following steps (a code sketch follows the list):

1. Remove protocol duplicates, preferring HTTPS
2. Remove www/non-www duplicates
3. Remove URLs that contain useless parameters (e.g., session IDs)
4. Collapse trailing-slash/no-slash variants
5. Remove remaining duplicates by content checksum
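To make the list concrete, here is a minimal Python sketch of this kind of URL normalization and checksum deduplication. It is an illustration only: the TRACKING_PARAMS set, the canonicalize and content_checksum helpers, and the exact normalization rules are assumptions for the example, not Google’s actual crawling pipeline.

```python
import hashlib
from urllib.parse import urlparse, urlunparse, parse_qsl, urlencode

# Query parameters treated as "useless" for deduplication.
# This set is an assumption for the example, not Google's actual list.
TRACKING_PARAMS = {"sessionid", "sid", "utm_source", "utm_medium", "utm_campaign"}

def canonicalize(url: str) -> str:
    """Normalize a URL along the lines of steps 1-4 above."""
    parts = urlparse(url)

    # Step 1: prefer HTTPS over HTTP.
    scheme = "https" if parts.scheme in ("http", "https") else parts.scheme

    # Step 2: collapse www/non-www to a single host form.
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]

    # Step 3: drop useless parameters such as session IDs.
    query = urlencode(
        [(k, v) for k, v in parse_qsl(parts.query)
         if k.lower() not in TRACKING_PARAMS]
    )

    # Step 4: collapse trailing-slash/no-slash variants.
    path = parts.path.rstrip("/") or "/"

    return urlunparse((scheme, host, path, "", query, ""))

def content_checksum(html: str) -> str:
    """Step 5: a checksum lets byte-identical pages served from
    different URLs be treated as one document."""
    return hashlib.sha256(html.encode("utf-8")).hexdigest()

if __name__ == "__main__":
    # Both variants collapse to the same canonical URL.
    urls = [
        "http://www.example.com/page/?sessionid=abc123",
        "https://example.com/page",
    ]
    print({canonicalize(u) for u in urls})  # {'https://example.com/page'}
```

In a crawler, the pair (canonicalize(url), content_checksum(body)) could then serve as the deduplication key: pages that agree on both are stored and indexed only once.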



#Google #internet #duplicate #content
