Google is a semantic search engine, so it relies heavily on entities. Here’s how to optimize your entity-based SEO, with advanced use of structured data.
Did you know? Google has long since become a semantic search engine.
The consequences for SEO are enormous. I hope you have been taking this into account for years, otherwise you are seriously behind…
I will explain in a simplified way why SEO goes hand in hand with entities.
I will also cover what this changes for the optimization of your content.
And why you should seriously invest in structured data. I’m not just talking about getting rich snippets, it really goes beyond that.
How Google became a semantic search engine
What is a semantic search engine?
Instead of analyzing the words present in the pages, it seeks to identify the entities (and then the relations between them). Basically, Google is interested in the subjects / concepts / things mentioned in a page and not in the presence of this or that word.
If you do not organize your site around entities, if you are still focusing on “keywords”, then you are at least 10 years behind… And that could explain why you are having difficulty improving your SEO.
Let’s go back quickly to understand the evolution of Google’s algorithm.
The rise of the Knowledge Graph
In 2012, when we were barely recovering from the Google Panda and Google Penguin filters, Google released the Knowledge Graph (source). Google announced that its search engine was now interested in “things, not strings”. Its knowledge base was built from Freebase, Wikipedia and the CIA World Factbook, combined with an analysis of data available on the web. At launch, it had 500 million entities and 3.5 billion relationships between them. We have since become accustomed to finding a Knowledge Panel in the SERPs. In 2020, Google knew 5 billion entities and 500 billion relationships between them (source).
In 2013, Google announced Hummingbird, one of its biggest algorithm changes. Concretely, Google now tries to answer the full meaning of the query, using all the clues provided by the words it contains, rather than finding the web page that best matches those words. This is semantic search, which has since made it possible to move towards conversational search.
In 2015, Google was granted a first patent entitled “Ranking Search Results Based On Entity Metrics”. Others followed.
Since then, RankBrain and BERT have largely helped Google to further improve its understanding of human language, thanks to AI techniques. Entities are now spotted more easily.
What does Google use entity analysis for?
Based on entities, Google can:
- better estimate the user’s search intent (what exactly do they want, through the words typed or spoken?)
- better understand what a page is about and check that its purpose matches that of the user (navigation to a specific site, information, preparation of a transaction, transaction, etc.)
- evaluate the notoriety, popularity and reputation of a site, of the natural or legal person behind it, and of the related products or services, all without relying on links but on the relationships between entities
- answer questions asked, orally or not, to the Google Assistant or directly in the search engine, and develop a conversational mode (chatbot)
- etc.
For me, no more doubts: we are in the era of entity-based SEO.
Good SEO thanks to entities
The contents
The best technique at the moment for SEO content optimization is perhaps topic clustering. You will find all kinds of other, more or less similar names for it, depending on what someone is trying to sell you.
The idea is to work by subject (theme, topic) and to show Google that our site covers each subject thoroughly. To do that, each cluster contains a pillar page (the central page, which is meant to rank at the top of the SERPs) and secondary pages that address everything a reader needs on the subject. The internal linking between them must be done carefully, with an excellent choice of anchor text.
Working by subject (and not by keywords) is directly linked to the entity approach. We have to show Google that our page is complete on a subject, that it mentions all the other entities Google expects to find there. A mention can be thought of as a kind of link to another page.
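To make the pillar / secondary page structure more tangible, here is a minimal Python sketch. The URLs are hypothetical, and linking every secondary page to the pillar in both directions is my own assumption about what a careful internal linking looks like, not a rule stated by Google.

```python
# Hypothetical cluster: one pillar page and its secondary pages.
cluster = {
    "pillar": "/entity-seo",
    "secondary": ["/knowledge-graph", "/structured-data", "/topic-clusters"],
}


def internal_links(cluster):
    """Return (source, target) pairs linking the pillar and each secondary page both ways."""
    pillar = cluster["pillar"]
    links = []
    for page in cluster["secondary"]:
        links.append((pillar, page))  # pillar links down to the detail page
        links.append((page, pillar))  # detail page links back up to the pillar
    return links


links = internal_links(cluster)
for source, target in links:
    print(f"{source} -> {target}")
```

A list like this can serve as a checklist when auditing a cluster: every pair should exist as a real link on the site, with a meaningful anchor text.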
Backlinks
I think backlinks have gradually lost importance for Google as it has made more use of entity analysis. When a piece of content mentions an entity, it forms a triple: 2 entities and a relation between them. Google can use all these relationships to assess notoriety, reputation and authority.
Google’s algorithm will keep relying on backlinks for a long time, but it can already achieve excellent results with an entity-based approach.
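The triple idea can be sketched in a few lines of Python. The entity names, the relation labels and the naive counting below are all made up for illustration. This is not how Google computes authority, only a way to picture what "2 entities and a relation" looks like as data.

```python
from collections import Counter
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str   # entity the statement comes from
    relation: str  # how the two entities are related
    obj: str       # entity being mentioned


# Hypothetical triples, as they might be extracted from a few pages.
triples = [
    Triple("Example Bakery", "located_in", "Lyon"),
    Triple("Local Food Guide", "recommends", "Example Bakery"),
    Triple("City Newspaper", "mentions", "Example Bakery"),
    Triple("City Newspaper", "mentions", "Lyon"),
]


def mention_counts(items):
    """Count how many distinct subjects point at each entity."""
    seen = {(t.subject, t.obj) for t in items}
    return Counter(obj for _, obj in seen)


counts = mention_counts(triples)
print(counts.most_common())
```

An entity that many different sources point to, without any hyperlink involved, is the kind of signal the entity approach makes available.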
Declare entities explicitly: structured data
Where this can become really effective is when you help Google (and any other engines or tools that want to exploit this publicly available data!). To do this, we must declare structured data that specifies which entities a page deals with. You can be very precise and remove any ambiguity, which is a great strength of this mechanism.
It’s much better than doing nothing and assuming that Google can figure it out on its own without making the slightest mistake or missing the slightest detail.
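As an illustration, here is a minimal Python sketch that builds a JSON-LD block declaring what a page is about and which entities it mentions. `Article`, `about` and `mentions` come from the schema.org vocabulary. The headline, author and entity names are placeholders, and `to_jsonld_script` is a hypothetical helper of mine.

```python
import json

# All names below are placeholders for illustration.
article = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "Entity-based SEO explained",
    "author": {"@type": "Person", "name": "Jane Doe"},
    "about": {"@type": "Thing", "name": "Semantic search"},
    "mentions": [
        {"@type": "Thing", "name": "Knowledge Graph"},
        {"@type": "Thing", "name": "Structured data"},
    ],
}


def to_jsonld_script(data):
    """Serialize a dict as a JSON-LD <script> tag ready to paste into a page."""
    body = json.dumps(data, indent=2, ensure_ascii=False)
    # Escape "</" so a value can never close the script tag early.
    body = body.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'


snippet = to_jsonld_script(article)
print(snippet)
```

The resulting tag goes in the HTML of the page. In practice, a more specific type than `Thing` is usually preferable whenever schema.org offers one.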
Create your own Knowledge Graph and aim beyond Google
Even today, Google is the master of the place: it totally dominates “search”. Providing Google with all this structured data may seem like a bad idea, because it will have access to even more knowledge and will sometimes answer the user directly. That’s not wrong! But for the moment, it’s also an excellent way to make Google fully understand what a page is about, and therefore to get the page shown for the right searches.
In fact, besides using schema.org to declare your structured data (a repository currently dominated by Google, when that could be the role of the W3C…), you can go further. This involves creating your own repository of entities, with their relationships, while indicating for each one what it corresponds to.
For this, you not only have to use the schema.org repository (to be properly recognized by Google, especially for whatever triggers rich snippets), but you must also rely on other databases such as DBpedia and Wikidata. Any startup (or Internet giant) can exploit the data known to DBpedia and Wikidata. And if they refer to your site, you will be in the race!
Above all, by having your own repository, you become a supplier of structured data (which remains declared on your site, with its own identifiers mapped to those of other open databases). This is fundamental in trying to limit the power of Google, which helps itself to your data and manages its own knowledge base.
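Here is one possible way to picture such a repository, as a minimal Python sketch. Each entity gets an identifier hosted on your own site (`@id`) and points to its equivalents in open databases through the schema.org `sameAs` property. The site URL, the `/entities/` path and the external URIs are placeholders, not real identifiers.

```python
import json

SITE = "https://www.example.com"  # hypothetical site


def entity_node(slug, name, same_as):
    """Describe one entity with a site-owned identifier and external equivalents."""
    return {
        "@type": "Thing",
        "@id": f"{SITE}/entities/{slug}#entity",
        "name": name,
        "sameAs": list(same_as),
    }


graph = {
    "@context": "https://schema.org",
    "@graph": [
        entity_node(
            "semantic-search",
            "Semantic search",
            [
                # Placeholder URIs: replace with the real Wikidata / DBpedia identifiers.
                "https://www.wikidata.org/entity/Q_PLACEHOLDER",
                "http://dbpedia.org/resource/PLACEHOLDER",
            ],
        ),
    ],
}

print(json.dumps(graph, indent=2))
```

Pages on the site can then refer to the entity by its `@id`, so the definition lives in one place and stays under your control.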
How can structured data actually help SEO?
As already stated, it helps Google (and others) better understand your content and what exactly you are talking about.
But I will try to be more concrete here.
Adding the right structured data to the pages of your site also allows you to:
- get rich snippets more easily. For example, these could be reviews (with stars), event announcements, photos of cooking recipes, etc.
- get a better click-through rate in the SERPs (a consequence of the presence of a rich snippet)
- rank for additional queries, thanks to semantic analysis
- increase your chances of being picked for voice search (Alexa and many others can leverage your structured data and check that your content answers the query)
Creating your own knowledge graph (associated with the site) also improves user engagement on the site. For example, we can offer links to other pages based on the entities mentioned (without depending on a plugin), create a lexicon, find images related to the theme of the page, or offer a carousel to browse other content centered on the same entities…
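The "links to other pages based on the entities mentioned" idea can be sketched as follows in Python. The page URLs and entity sets are invented for the example, and ranking by the number of shared entities is just one simple scoring choice among many.

```python
# Hypothetical mapping of page URLs to the entities each page declares.
page_entities = {
    "/knowledge-graph": {"Knowledge Graph", "Google", "Wikidata"},
    "/structured-data": {"schema.org", "Google", "Rich snippet"},
    "/rich-snippets": {"Rich snippet", "schema.org"},
    "/voice-search": {"Google Assistant", "Alexa"},
}


def related_pages(url, pages, limit=3):
    """Rank other pages by the number of entities shared with `url`."""
    own = pages[url]
    scored = [
        (len(own & entities), other)
        for other, entities in pages.items()
        if other != url and own & entities
    ]
    # Most shared entities first; ties broken alphabetically by URL.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other for _, other in scored[:limit]]


suggestions = related_pages("/structured-data", page_entities)
print(suggestions)
```

Because the suggestions are derived from the entities you declared yourself, they need no third-party plugin and stay consistent with your structured data.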