To understand organic SEO (référencement naturel) and the process by which a website is discovered and ranked in the results, it is important to understand how a search engine works.
Search engines crawl hundreds of billions of pages using their own crawlers. These crawlers navigate the web by downloading page content and following links from page to page and site to site.
But what happens when you type a query and click Search? How do search engines work internally, and how do they decide what to display and in what order?
By understanding how they work, you can more easily grasp the rules of SEO (see: What is SEO on Google).
A search engine is software, generally accessible on the Internet, that collects and organizes the content it finds. The collected elements can be text, images, audio files or video files.
All of this collected data is then analyzed and ranked according to criteria specific to each search engine, with the aim of offering the most relevant answers possible.
Search engines are now an integral part of our daily lives, used every day to look up information, find products or perform local queries. Many different engines are available today, each with its own capabilities and features.
Google crushes the competition worldwide, and it also dominates the French ranking, with around 93% market share in both cases.
Before you can even enter a query and search the Web, search engines must perform many operations so that they can present you with a set of precise, quality results that answer your question or search intent.
They have three main functions:
- Exploration: robots called crawlers (or spiders) roam the web by navigating from link to link, moving from page to page and from site to site.
- Indexing: storage and analysis of the content discovered during exploration; in essence, the engines save it in their databases.
- Ranking: display of the documents answering a user's query according to secret formulas; the engine searches its gigantic index and uses these formulas to filter out what is relevant to you.
Engines run a set of programs called crawlers (also known as spiders or bots) that are responsible for finding publicly available information. Google's most famous robot is called Googlebot.
When visiting a site, crawlers follow all the internal and external links, moving from page to page and from site to site. The robots return later to finish the exploration work.
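The link-following behavior described above can be sketched as a breadth-first traversal. Here is a minimal illustration in Python, using a small in-memory set of pages in place of real HTTP fetches (all URLs and page contents are made up for the example):

```python
from collections import deque
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collects the href of every <a> tag found on a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawl(pages, start):
    """Breadth-first crawl: visit a page, extract its links, queue unseen ones."""
    seen, queue, order = {start}, deque([start]), []
    while queue:
        url = queue.popleft()
        order.append(url)
        parser = LinkExtractor()
        parser.feed(pages.get(url, ""))
        for link in parser.links:
            if link in pages and link not in seen:
                seen.add(link)
                queue.append(link)
    return order

# Tiny in-memory "web" standing in for real HTTP downloads.
pages = {
    "/": '<a href="/a">A</a> <a href="/b">B</a>',
    "/a": '<a href="/b">B</a>',
    "/b": '<a href="/">home</a>',
}
print(crawl(pages, "/"))  # ['/', '/a', '/b']
```

A real crawler adds politeness delays, robots.txt checks and URL normalization on top of this same queue-and-visit loop.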
Google assigns each site, according to numerous criteria, a crawl budget which limits the number of pages that its robots come to visit over a given period.
Crawlers also keep track of changes made to pages they already know, so that they can update their analysis of the content.
This process is called indexing, and the information found is added to a structure called the index.
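At its core, this index is an inverted index: a map from each term to the documents that contain it, which is what lets the engine answer queries without rescanning every page. A toy sketch (the URLs and page texts are invented):

```python
from collections import defaultdict

# Hypothetical mini-corpus: URL -> page text.
pages = {
    "/apple": "apple pie recipe",
    "/pie": "pie crust recipe",
}

# Build an inverted index: word -> set of URLs containing that word.
index = defaultdict(set)
for url, text in pages.items():
    for word in text.split():
        index[word].add(url)

print(sorted(index["recipe"]))  # ['/apple', '/pie']
print(sorted(index["apple"]))   # ['/apple']
```

Answering a query then amounts to looking up its terms in this map and ranking the matching documents.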
Indexing can be accelerated by:
- The age of the site and its popularity.
- Submitting the sitemap to the engine.
- An indexing request via webmaster tools.
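A sitemap is simply an XML file listing the URLs you want the engine to discover. A minimal sketch following the sitemaps.org format (the example.com URLs and date are placeholders):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.example.com/</loc>
    <lastmod>2023-01-15</lastmod>
  </url>
  <url>
    <loc>https://www.example.com/blog/</loc>
  </url>
</urlset>
```

The file can be submitted through the engine's webmaster tools or referenced from robots.txt with a `Sitemap:` line.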
You can check the indexing of a site's pages with specific commands performed directly on Google.
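The best-known of these commands is the `site:` operator, typed directly into the Google search bar (example.com is a placeholder):

```
site:example.com          → pages of the domain present in the index
site:example.com/blog     → indexed pages under a given path
```

The number of results gives a rough idea of how many pages of the site Google has indexed.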
There are a number of circumstances in which a URL will not be indexed:
- Exclusion of the page via the robots.txt file.
- On-page directives not to index the page, or a canonical tag pointing to another page.
- The algorithms judge the page to be of poor quality, for example duplicate content.
- The URL returns an error page, such as a 404.
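The first two cases can be illustrated concretely (the paths and URLs below are placeholders). A robots.txt rule that excludes a section from crawling:

```
User-agent: *
Disallow: /private/
```

And the on-page directives, placed in the HTML head, that either block indexing or designate another page as canonical:

```html
<meta name="robots" content="noindex">
<link rel="canonical" href="https://www.example.com/original-page/">
```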
The goal is to present a set of high-quality, relevant results that answer the query or question as quickly as possible.
When a user submits a query, all the pages deemed relevant are identified in the index, and algorithms are used to rank those documents.
Search engines use other relevant data:
- Geolocation: some requests depend on the location of the Internet user.
- Language detected: they return documents in the language of the user.
- Search history: they will return different links depending on what the user has already searched for.
- Device used: results adapt to the device (computer or mobile).
Google's primary purpose is to provide a list of answers that best match what the Internet user is trying to find, while operating its Google Ads advertising network.
On Google, relevance is determined by more than 200 factors, and user experience is taken into account in the ranking.
Google is the number 1 search engine, far ahead of its challengers, so it makes sense to optimize your site for it first.
It regularly holds over 90% market share, which translates to around 3.5 billion searches on its platform every day.
Google should be able to crawl your site easily. The first concern is to verify that spiders can discover your site without blocking points, and that sections that should not be indexed are properly protected.
The key is to simplify the exploration work as much as possible.
To be implemented for SEO:
- Use the robots.txt.
- Set up a sitemap.
- Create a simple site structure.
- Build relevant internal linking.
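To verify that a robots.txt behaves as intended before deploying it, one option is Python's standard urllib.robotparser module. A minimal sketch with a placeholder robots.txt (the example.com URLs are invented):

```python
from urllib.robotparser import RobotFileParser

# A sample robots.txt, as a string (rules and sitemap URL are placeholders).
robots_txt = """\
User-agent: *
Disallow: /private/
Allow: /

Sitemap: https://www.example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# Public pages are crawlable, the protected section is not.
print(parser.can_fetch("*", "https://www.example.com/blog/post"))  # True
print(parser.can_fetch("*", "https://www.example.com/private/x"))  # False
```

Running such a check on every rule change helps avoid accidentally blocking sections you want crawled.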
Google's site index contains billions of pages. Organizing this information is done through a machine learning algorithm called RankBrain and a knowledge base called the Knowledge Graph.
The more pages you have in the main index, the more likely you are to appear in search results.
To be implemented for SEO:
- Create quality publications.
- Avoid duplicate content.
- Make indexing requests via the search console.
Everything up to this point happens in the background, before a user interacts with the search functionality. Ranking happens based on what is searched: the engine must return the best possible results as fast as possible.
Google's algorithms form a complex system used to retrieve data from its index and instantly deliver the best possible results. The engine combines many factors and signals to deliver relevance-ranked pages on its SERPs:
- Search intent: Google analyzes each query using complex language models based on past search and usage behavior. Its objective is to understand the exact intention behind the user's request, taking their browsing history into account.
- Relevance: once the intent is determined, Google finds the most relevant content for this user in its index.
- Quality: It examines the quality of the content and prioritizes it based on many factors.
- UX: It attaches importance to the user experience and the speed of the sites.
- Popularity of the site: a popular and authoritative site in its sector of activity will have an easier time placing itself at the top.
- Device type: Those searching on mobile are offered mobile-friendly pages.
- Location: Those looking for local information will see responses related to their location.
- Context and additional settings: Personalization based on browsing history and specific settings from the Google platform.
To be implemented for SEO:
- Create quality content that the engine can understand (writing, semantics, structure, etc.).
- Respond to specific search intents.
- Take care of the UX (ergonomics, site speed, tree structure, etc.).
- Work on the netlinking and the popularity of the site.
- Have a site optimized for all media.
If your site traffic suddenly drops and you see a corresponding drop in your rankings, chances are you have been penalized by Google. In 2014, Google reported that its anti-spam team takes more than 400,000 manual actions against sites each month, which is only a fraction of the total number.
Many other sites are penalized when Google releases new algorithmic updates for Penguin or Panda.
In general, it penalizes sites in two ways: manual and algorithmic penalties .
The spam team may identify a problem on your site and take manual action, or you may experience an automatic drop due to an algorithm update. Either way, you'll need to find the root cause of the drop.
As a site editor, your SEO job is to make crawling easier for spiders by building sites with a simplified structure.
Other optimizations consist of working on content, user experience and popularity, sending the right signals that help the algorithms position you at the top of the results pages for your target keywords.