Google has long been trying to incorporate artificial intelligence into its search algorithm. In the beginning, Google's trademark PageRank algorithm was the core of its search. PageRank produced exceptional results, but it had its limitations when it came to determining the "intent" of a query. PageRank was focused more on web data, i.e., it evaluated the SERPs for a query based on content and links. Google tried to refine it by adding other factors such as relevancy, so if a person searched for "best cars", Google tried to show results from auto-related websites only. However, sometimes a non-auto website like Reddit or BuzzFeed may have better results for the query "best cars".
So Google started adding features such as LSI, semantic search, verbatim search, personalized search, etc. to provide a supposedly better set of SERPs intended to "ANSWER" the query intent. All of these features pointed toward one goal: to successfully interpret the query. For example, if a person searches for "black swan", there is more than one possible intent behind the query:
1. Black Swan, the movie
2. The black swan theory
3. The black swan, the bird
Since Google's objective is to provide the best possible results to its users, it is crucial for Google to understand what the user wanted when searching this query. A simple method is to use the user's browsing behavior and history. For instance, if the person was on a movie-related webpage or had searched a movie-related query before searching "black swan", then they are most probably looking for the Black Swan movie. Another possible signal is the CTR of the web pages that appear for this query. If the CTR of the Wikipedia page about the Black Swan movie is higher than that of the Wikipedia page about the black swan theory, then it can be safely assumed that the majority of people searching for "black swan" want results about the movie. However, using so many factors during the online phase of Google's search algorithm is not easy, and it all depends on historical usage data.
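To make the idea concrete, here is a minimal, hypothetical sketch (not Google's actual implementation) of how historical click-through data and recent browsing context could be combined to pick the most likely intent for an ambiguous query. The intent labels, CTR numbers, context keywords, and boost weight are all invented for illustration.

```python
# Hypothetical sketch: disambiguating "black swan" from CTR plus recent context.
# The intents, CTR values, and context keywords below are invented for illustration.

HISTORICAL_CTR = {          # aggregate click-through rate per intent for this query
    "black swan movie": 0.48,
    "black swan theory": 0.31,
    "black swan bird": 0.21,
}

CONTEXT_HINTS = {           # keywords that might appear in the user's recent history
    "black swan movie": {"imdb", "trailer", "film", "cinema"},
    "black swan theory": {"taleb", "risk", "finance", "probability"},
    "black swan bird": {"wildlife", "species", "lake", "birdwatching"},
}

def guess_intent(recent_history, ctr=HISTORICAL_CTR, hints=CONTEXT_HINTS):
    """Score each intent by historical CTR plus a boost for matching recent context."""
    history_terms = {term.lower() for term in recent_history}
    scores = {}
    for intent, base_ctr in ctr.items():
        overlap = len(hints[intent] & history_terms)
        scores[intent] = base_ctr + 0.2 * overlap   # arbitrary boost per matching hint
    return max(scores, key=scores.get)

print(guess_intent(["trailer", "IMDb"]))    # -> black swan movie
print(guess_intent(["Taleb", "finance"]))   # -> black swan theory
print(guess_intent([]))                     # no context: falls back to CTR alone
```

Simple as it looks, even a toy like this shows why such signals only work when the query already has usage history behind it.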
Until now, Google has relied on hand-tuned machine learning models trained on usage history as well as on judgments from "human raters" employed by Google. Lately, Google has also been pushing webmasters to use features such as structured data to enhance its Knowledge Graph. However, even with tons of that data, Google had not perfected a model that could filter the SERPs on its own for new queries with no usage history. That now seems to be a thing of the past, as Google recently announced that it has successfully deployed an AI-based system that "interprets" search queries on its own, especially never-before-seen queries. Google named it RankBrain and claims it is now the third most important signal for evaluating web pages against a query.
In simpler words, Google wants to “UNDERSTAND” your query rather than just “MATCH” it.
About 15% of the 3.5 billion Google searches per day are never-before-seen queries, i.e., new queries with no historical data. RankBrain uses artificial intelligence to convert vast amounts of written language into mathematical entities, called vectors, that a computer can understand. It then processes never-before-seen queries against these vectors to find the closest match and filters the results accordingly. From the comments of Greg Corrado, a senior research scientist at Google, in the Bloomberg article, it seems that RankBrain is focused more on long-tail and ambiguous queries. Typically, "technical queries" fall into this category; however, judging by the discussion and examples on Hacker News, RankBrain is not yet working the way it is supposed to.
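To illustrate the general idea of "vectors plus nearest-match lookup" (and only that; this is not Google's implementation), here is a small, self-contained Python sketch. The toy term vectors and the known queries are invented; a real system would learn embeddings from large text corpora (word2vec-style training) rather than hard-coding them.

```python
import math

# Toy, hand-made "embeddings": each term is a small vector. In a real system these
# would be learned from large text corpora, not hard-coded.
TERM_VECTORS = {
    "cheap":    [0.90, 0.10, 0.00],
    "budget":   [0.85, 0.15, 0.00],
    "laptop":   [0.10, 0.90, 0.10],
    "notebook": [0.12, 0.88, 0.10],
    "repair":   [0.00, 0.30, 0.90],
    "fix":      [0.05, 0.25, 0.92],
}

def embed(query):
    """Represent a query as the average of its known term vectors."""
    vectors = [TERM_VECTORS[t] for t in query.lower().split() if t in TERM_VECTORS]
    if not vectors:
        return None
    return [sum(dim) / len(vectors) for dim in zip(*vectors)]

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm if norm else 0.0

# Queries we already have ranking data for.
KNOWN_QUERIES = ["cheap laptop", "laptop repair"]

def closest_known_query(new_query):
    """Map a never-before-seen query onto the most similar known query."""
    new_vec = embed(new_query)
    if new_vec is None:
        return None
    return max(KNOWN_QUERIES, key=lambda q: cosine(new_vec, embed(q)))

print(closest_known_query("budget notebook"))  # -> "cheap laptop"
print(closest_known_query("fix notebook"))     # -> "laptop repair"
```

The point is simply that a query the system has never seen before ("budget notebook") lands close, in vector space, to a query it already understands ("cheap laptop"), so existing ranking knowledge can be reused.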
Whatever the case may be, only once more details about RankBrain surface will it be possible to analyze how it can impact your website's search marketing strategy and how far RankBrain affects SEO.