What a search engine looks like and how it works. The SEO jungle, or how do search engines work? What is a search engine?

Search engines are among the main and most important Internet services.

With the help of search engines, billions of Internet users find the information they need.

What is a search engine?

A search engine is a hardware and software system that uses special algorithms to process a huge amount of information about a wide variety of sites and their content, down to each individual page.

From the point of view of an ordinary visitor, a search engine is a smart site that holds a great deal of information and answers users' queries.

In different countries, Internet users use different search engines. In the English-speaking segment of the Internet, the most popular search engine is Google.

Search engines in Runet

In Russia, more than half of users prefer the Yandex search engine, and Google accounts for about 35% of queries. Other users use Rambler, Mail.ru, Nigma and other services.

In Ukraine, about 60% of users use Google, while Yandex accounts for just over 25% of processed requests.

Therefore, when promoting sites on the Runet, specialists focus on the Yandex and Google search engines.

Search engine tasks

In order to answer visitors' questions as accurately as possible, search engines must perform the following tasks:

Quickly and efficiently collect information about various pages of different sites.
Process information about these pages and determine which request or requests they correspond to.
Generate and issue search results in response to user requests.

Components of search engines

A search engine is a complex software package that consists of the following main blocks:

Data collection.
Indexing.
Calculation.
Ranking.

This division is approximate, since different search engines work somewhat differently.

1. Data collection

At this stage, the task is to find new documents and draw up a plan for visiting and scanning them.

Webmasters need to let search engines know that new material has appeared, either by submitting the page address through the search engine's add-URL form or by announcing the page on social networks.

Personally, I use the latter method and I think that this is quite enough.

A side note. I will digress a little to describe how posting announcements on social networks affects the speed at which new website pages are indexed.

I use the text.ru service to check and record the uniqueness of the text on the pages of my site.

It checks uniqueness thoroughly, records it, and lets you place a uniqueness banner on the pages of your site.

But sometimes this service has a long processing queue. On several occasions I did not wait for the uniqueness check, published the article on the site, and sent it to social networks.

If the uniqueness check was delayed by about an hour or more, the uniqueness score was always 0%. This means that less than an hour after publication the page had already been indexed and entered into the search engine's database.

2. Indexing

Search engines collect data on new web pages and store it in their database. At the same time an index is built, that is, a key that gives quick access to the data about a page whenever it is needed.
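To make the idea of an index as a "key for quick access" more concrete, here is a minimal sketch of an inverted index. It is only an illustration: the function names `build_index` and `lookup` and the sample addresses are my own assumptions, and real search engines store far more than a word-to-page mapping.

```python
from collections import defaultdict

def build_index(pages):
    """Map each lowercase word to the set of page addresses that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def lookup(index, word):
    """Return the addresses of pages containing the word, sorted for stable output."""
    return sorted(index.get(word.lower(), set()))

# Hypothetical sample pages, used only for illustration.
pages = {
    "example.com/a": "how search engines work",
    "example.com/b": "search robots crawl pages",
}
index = build_index(pages)
print(lookup(index, "search"))
```

With such a structure, finding the pages for a word is a single dictionary lookup instead of a scan through every stored page.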

3. Calculation

After entering the database, the pages of our sites go through a stage in which various parameters and indicators are calculated.

Exactly how many of these indicators there are and how they are calculated, no one can say except the developers of the search algorithms themselves.

4. Ranking

Then, based on the calculated parameters and indicators, the relevance of the page to particular queries is determined and the page is ranked.

This matters for building the search results page for those queries quickly and well.

Search engines form answers to user queries and present them as a search results page.
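As a rough illustration of the calculation and ranking stages, the sketch below computes one toy indicator (the share of a page's words that match the query) and sorts pages by it. This is my own simplification: the real indicators are not public, and the names `relevance` and `rank` and the sample pages are hypothetical.

```python
def relevance(query, text):
    """Toy indicator: fraction of the page's words that are query words."""
    words = text.lower().split()
    if not words:
        return 0.0
    terms = set(query.lower().split())
    return sum(1 for word in words if word in terms) / len(words)

def rank(query, pages):
    """Return page addresses ordered from most to least relevant."""
    return sorted(pages, key=lambda url: relevance(query, pages[url]), reverse=True)

# Hypothetical sample pages, used only for illustration.
pages = {
    "a.example": "buy plastic windows cheap",
    "b.example": "windows windows installation",
    "c.example": "cafe menu",
}
print(rank("windows", pages))
```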

It should be noted that the algorithms for processing page data, calculating indicators, and ranking are constantly being improved. Ranking priorities change.
Search engines strive to answer user queries as accurately as possible, trying to take into account the nature of the query and the interests of the particular user, as well as that user's place of residence, age, gender, habits, and inclinations.

Many people want to be in the TOP, but not everyone understands how search engines work. By the start of 2017, search engines' requirements for sites had become even more stringent (for more details, see the article). So, to stay in the top consistently, you first need at least to understand how search algorithms work.

After reading this article to the end, you will understand the principles on which Yandex and Google work and learn a little more about Mail.ru, Rambler, and Bing. We will not touch on site ranking factors here, because that is a very large topic that requires a separate publication.

The goal, purpose, or, if you like, mission of a search engine is to give the user the most accurate answer to a query in the form of a list of links to various resources.

To generate a high-quality list of sites, the search engine builds a database. That is, if your site or a new page of your site has not been indexed by Yandex or Google, it will not appear in the search results. The database of sites is built by search robots, which report information about sites to their "Boss", which in turn enters the data into the registry. For example, if you have registered your site in a search engine's webmaster service, you can find out there how many pages of your site the search engine has indexed.

Next, the entire registry of data from the pages of numerous sites is ranked according to certain parameters: region, relevance to the query, popularity of the resource, quality of the content, and so on. As I already said, we will go through the full list of ranking factors in a separate publication. The main task in promoting a site is precisely to influence these factors in order to raise the site to the TOP.

Features and characteristics of search engines in 2018

We have all seen the Google ads in which the search engine enters an unequal battle with a grandmother to find the nearest pharmacy. What does this mean? That search engines are learning and will soon stop working with keywords altogether and work only with meanings. After all, their main task is not to return an arbitrary list of sites but to help the user find a place, product, or service.

In our country, the share of voice search is still very small, but in the United States it accounts for about 50% of mobile traffic. This means the trend will soon reach Russia. Accordingly, there will be more informational queries (how, where, where to) and more queries that cannot be predicted, because they will not follow a template and will be dictated by the situation the person is in. For example, someone standing at an intersection asks where to turn to find a cafe with business lunches for up to 300 rubles. That is Google.

As for Yandex, it also presented an algorithm at the end of 2016, one that will likewise work primarily with meanings.

Which search engine is better or how does Yandex differ from Google?

From my own experience, I can say that both search engines are good in their own way. The obvious difference is that Yandex is a Russian search engine, while Google is the world's largest. What interests us, though, is not the outward difference between these search engines' sites and the services they provide, but how they form their search results, since these differ considerably.

Yandex pays more attention to regional search. That is, if you are in Vladivostok and enter a request without specifying a city or region, for example, "windows", first of all Yandex will show the sites of those companies that are located in Vladivostok and are somehow connected with windows.
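The regional preference described above can be pictured as a boost applied to local sites before sorting. The sketch below is purely an assumption for illustration: the `boost` multiplier, the function name `regional_rank`, and the sample scores are invented, and nothing here reflects how Yandex actually weighs regions.

```python
def regional_rank(results, user_region, boost=2.0):
    """Order results by score, multiplying the score of local sites by `boost`.

    Each result is a tuple (address, base_score, region); region may be None.
    """
    def adjusted(item):
        _, score, region = item
        return score * boost if region == user_region else score

    return [url for url, _, _ in sorted(results, key=adjusted, reverse=True)]

# Hypothetical results for the query "windows", used only for illustration.
results = [
    ("windows-moscow.example", 0.9, "Moscow"),
    ("windows-vl.example", 0.6, "Vladivostok"),
    ("windows-info.example", 0.7, None),
]
print(regional_rank(results, "Vladivostok"))
```

For a user in Vladivostok the local site moves to first place even though its base score is the lowest; for a user in a region with no local sites, the order follows the base scores.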

For Google, the popularity and citation of a resource (not only links to your site) matter more; on this basis it concludes whether your site is useful.

As for other search engines, mail.ru is a shell over Google search results, i.e. mail.ru itself does not analyze anything but simply shows what Google would show. Rambler.ru, by the same principle, is a shell over Yandex.

21.11.2017

Whatever question worries a modern person, he does not look for answers in books. He searches for them on the Internet. Moreover, you do not need to know the address of the site that contains the information you need. There are millions of such sites, and a search engine helps you find the right one.

Across our domestic Internet, the two most popular search engines are Google and Yandex.

Have you ever wondered how a search engine works? How does it decide which site to show, and which of the millions of resources has the answer to your query?

What is a search engine?

A search engine is a huge database of web documents that is constantly updated and expanded. Every search engine has search spiders, or robots: special bots that crawl sites, index the content posted on them, and then rank it by its quality and its relevance to users' search queries.

Search engines work to ensure that anyone can find any information. Therefore, they try to show first of all those web documents in which there is the most detailed answer to a person's question.

At its core, a search engine is a catalog of sites, a directory whose main function is to search for information within that very catalog.

As I wrote above, we have two popular systems: Google (worldwide) and Yandex (the Russian-speaking segment). But there are also systems such as Rambler, Yahoo, Bing, Mail.Ru and others. The principle of operation is similar for all of them; only the ranking algorithms differ (and even then not very significantly).

How does an Internet search engine work?

The way search engines work is very complicated, but I will try to explain it in simple words.

The search robot (spider) crawls the pages of a site, downloads their content, and extracts links. Next the indexer starts its work: this is a program that analyzes all the material downloaded by the spiders, following its own algorithms.

Thus, a database of the search engine is created, in which all documents processed by the algorithm are stored.
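The link-extraction step that the spider performs can be sketched with Python's standard `html.parser` and `urllib.parse` modules. This is a minimal sketch under my own assumptions: the class name `LinkExtractor`, the helper `extract_links`, and the sample addresses are hypothetical, and a real spider also handles downloading, robots rules, and errors.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkExtractor(HTMLParser):
    """Collect absolute addresses from the href attributes of <a> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Resolve relative links against the page address.
                    self.links.append(urljoin(self.base_url, value))

def extract_links(base_url, html):
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links

# Hypothetical downloaded page content, used only for illustration.
html = '<p><a href="/about">About</a> <a href="https://other.example/x">X</a></p>'
links = extract_links("https://site.example/index.html", html)
print(links)
```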

Work with a search query is carried out as follows:

the query entered by the user is analyzed;
the results of the analysis are passed to a special ranking module;
the data of all documents is processed, and those most relevant to the query are selected;
a snippet is generated: a title and a description, with the words from the query highlighted in bold;
the search results are presented to the user as a search engine results page (SERP).
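The snippet step from the list above can be sketched as follows. This is only an illustration under my own assumptions: the function name `make_snippet`, the `width` parameter, and the use of `<b>` tags for bold are hypothetical choices, not the way any particular search engine builds snippets.

```python
import re

def make_snippet(text, query, width=60):
    """Cut the text to `width` characters and wrap query words in <b> tags."""
    snippet = text[:width]
    terms = [re.escape(term) for term in query.split()]
    if not terms:
        return snippet
    # One pattern for all terms, so already inserted tags are never re-matched.
    pattern = re.compile(r"\b(" + "|".join(terms) + r")\b", re.IGNORECASE)
    return pattern.sub(r"<b>\1</b>", snippet)

print(make_snippet("Plastic windows in Vladivostok", "windows vladivostok"))
```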

How search engines work

The main task of any search engine is to give the user the most useful and accurate information for a query. Therefore, the search robot crawls sites all the time. Soon after your site launches, the spider visits it on a certain schedule and crawls a number of pages, which are then indexed.

The principle of operation of search engines is based on two main stages:

crawling pages, during which data is collected;
assigning an index, thanks to which the system can quickly search the content of the page.

As soon as a page of the site is indexed, it can appear in the search results for a specific query. You can check whether a new page is in the search engine's index using webmaster tools. For example, in Yandex.Webmaster you can see which pages were indexed and when, and which ones dropped out of the index and why.

But which results page it ends up on depends on how well it is indexed and on the quality of its content. If your page gives the most accurate answer to a query, it will rank above all the others.

The principles of ranking sites in search engines

We figured out what principle the search robots use. But how does site ranking work?

Ranking rests on two main pillars: the textual content of the page and non-textual factors.

Text content is the content of the page. The more complete, accurate, and relevant to the query it is, the higher the page will be in the search results. In addition to the text itself, the search engine pays attention to how the title (page title), description (page description), and H1 (text heading) tags are filled in.
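To see what these tags look like from a robot's side, here is a minimal sketch that pulls the title, description, and H1 out of a page with Python's standard `html.parser`. The class name `TextFactorParser` and the sample page are my own assumptions, and the sketch only extracts the fields; how a search engine scores them is not shown.

```python
from html.parser import HTMLParser

class TextFactorParser(HTMLParser):
    """Collect the title, meta description, and H1 text of a page."""

    def __init__(self):
        super().__init__()
        self.fields = {"title": "", "description": "", "h1": ""}
        self._current = None  # tag whose text is currently being collected

    def handle_starttag(self, tag, attrs):
        if tag in ("title", "h1"):
            self._current = tag
        elif tag == "meta":
            attributes = dict(attrs)
            if (attributes.get("name") or "").lower() == "description":
                self.fields["description"] = attributes.get("content") or ""

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current:
            self.fields[self._current] += data.strip()

# Hypothetical page, used only for illustration.
html = (
    "<html><head><title>Plastic windows</title>"
    '<meta name="description" content="Windows in Vladivostok">'
    "</head><body><h1>Our windows</h1><p>Text</p></body></html>"
)
parser = TextFactorParser()
parser.feed(html)
print(parser.fields)
```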

Non-textual factors are internal linking and external links. The point is this: if a site is interesting and useful, other thematically related resources link to it. And the more such links there are, the more authoritative the resource is.

But these are the most basic principles, very briefly. Let's go a little deeper.

The main site ranking factors

There are a number of factors that affect a site's ranking. The main ones are:

1. Internal site ranking factors

This is the text on the site and its design - subheadings, highlighting important points in the text. The use of internal linking also applies here. Visual elements are also important: the use of pictures, photos, videos, graphs. The quality of the text itself, its content is also important.

2. External site ranking factors

These determine the site's popularity. They are the external links that lead to your site from other resources. What counts is not only the number of linking sites but also their quality (ideally they are thematically similar to yours), as well as the overall quality of the link profile (how quickly the links appeared, and whether naturally or through purchase on a link exchange).
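One well-known way to turn "the number and quality of links" into a number is the PageRank idea, in which a link from an authoritative page counts for more than a link from an obscure one. The sketch below is a simplified textbook version under my own assumptions (the damping value 0.85, the fixed number of iterations, and the tiny sample graph); it is not the actual formula of any search engine.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank; `links` maps each page to the pages it links to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its authority on in equal shares.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page without outgoing links spreads its rank evenly.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank

# Hypothetical link graph: pages "a" and "b" link to "c", and "c" links to "a".
links = {"a": ["c"], "b": ["c"], "c": ["a"]}
rank = pagerank(links)
print(sorted(rank, key=rank.get, reverse=True))
```

In this graph page "c" comes out on top because two pages link to it, while "b", to which nobody links, ends up last.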

From the above, one conclusion can be drawn: search engines try to show the user the sites that give the most complete answer to the query and have already earned a certain authority. In doing so they take many factors into account: the site's content, its configuration, and how users respond to it. A site that is good in all respects will certainly take a high place in the SERP.

The most popular web service of our time is the search engine. That is understandable: the days when the first Internet users could keep track of everything new on the network are long gone.

So much information appears and accumulates that it has become very difficult for a person to find exactly what is needed. Imagine what searching the Internet would be like if an ordinary user had to look for information with no idea where to look. No idea, because very little can be found by manual search.

Search engine: what is it?

It is good if the user already knows sites that may have the necessary information, but what if not? To make it easier to find the necessary information on the Internet, search systems, or simply search engines, were invented. A search engine performs one very important function, without which the Internet would not be what we are used to: searching for information on the web.

A search engine is a special website that, in response to users' queries, provides hyperlinks to pages and sites that match the given search query.

To be a little more precise, it is a system for finding information on the Internet, implemented as a set of software and hardware together with a web interface for interacting with users.

A web interface, that is, a visible and understandable shell, was created so that people can interact with the search engine. This approach makes searching easier for many people. As a rule, search engines are used to search the Internet, but there are also search systems for FTP servers, for particular types of goods on the World Wide Web, for news, and for other areas.

The search can be carried out not only on the text content of sites, but also on other types of information that a person can search for: images, video, sound files, etc.

How does a search engine search?

Searching the Internet, just like browsing websites, is done with a web browser. The search itself is carried out only after the user has entered a query in the search bar.

Every search system is built on a software core, the search engine proper: a software package that provides the ability to search for information. Once a person has formed a search query and entered it in the search bar, the search engine generates a page with a list of search results, with those it considers most relevant placed higher.

Search relevance is the degree to which the materials found match the user's query; hyperlinks to them are placed on the SERP with the more accurate results above the others. This ordering of results is called site ranking.

So how does a search engine prepare its materials for the results page, and how does it search for information? Information on the network is collected by a robot unique to each search engine, also called a bot, a crawler, or a spider, and the work of the search system itself can be divided into three stages:

The first stage of a search engine's work is scanning sites on the global network and collecting copies of web pages on its own servers. This produces a huge amount of information that is not yet processed and not yet suitable for search results.

The second stage consists of putting in order the information received from the sites at the first stage. It is sorted in a way that makes possible, in the least amount of time, the high-quality search that users expect from the search engine. This stage is called indexing; after it the pages are ready to be served, and the current database is regarded as the index.

It is the third stage that determines the search results: after receiving a query from a user, the search engine relies on the keywords, or words close to them, specified in the query. This lets it select the information most appropriate to the query and then return it. Since there is a lot of information, the search engine ranks it in accordance with its algorithms.
The best search engine is considered to be the one that provides the material that most correctly answers the user's query. But even here there may be results influenced by people interested in promoting their sites; such sites often, though not always, appear in the search results, but not for long.

Although the leaders worldwide and in many regions are already clear, search engines continue to improve the quality of their search. The better the search they provide, the more people will use it.

How to use the search engine?

What a search engine is and how it works is now clear, but how do you use it correctly? Most such sites have a search bar, with a Find or Search button next to it. You enter a query in the search bar and then press the search button or, as is more often the case, the Enter key on the keyboard, and within seconds you get the result in the form of a list.

But the right answer to a search query does not always come the first time. So that the search for what you want does not become a chore, you need to compose the query correctly and follow the recommendations below.

We compose the search query correctly

Here are some tips for using the search engine. Following some tricks and rules when searching for information in a search engine will make it possible to get the desired result much faster. Follow these guidelines:

Correct spelling of words ensures the maximum number of matches with the information you are looking for (although modern search engines have learned to correct spelling errors, this advice should not be neglected).
By using synonyms in your query, you can cover a wider search range.
Sometimes changing a word in the query brings better results, so try rephrasing the query.
Make the query specific: use exact phrases that capture the essence of what you are searching for.
Experiment with keywords. Key words and phrases help define the main point, and the search engine will return more relevant results.

So a search engine is nothing more than a way to find the information you are interested in and to use it, usually completely free of charge: to learn something, understand something, or draw the right conclusion for yourself. Many people can no longer imagine life without voice search, where the text does not have to be typed: you only need to say your query aloud, and the input device is a microphone. All this testifies to the constant development of search technologies on the Internet and to the need for them.

The first computer program for searching the Internet was Archie, created in 1990 by students from Montreal. It downloaded lists of all files from all available FTP servers and built a database that could be searched by file name. The first full-text search engine was WebCrawler, launched in 1994, which indexed resources using a robot. It allowed users to search for any word on any web page. In 1998, Larry Page and Sergey Brin created the Google search engine based on their BackRub project. Their innovation was the PageRank algorithm, which ranks web pages based on the number of hyperlinks to them. Search that takes Russian morphology into account was first implemented in 1996 on the AltaVista search engine; Rambler and Aport were launched at the same time. And in September 1997, the Yandex search engine opened. How a search engine works is what we will talk about in today's issue.

First, the crawler needs to obtain the content, and the indexer needs to build a searchable index. A search robot, or crawler, is a program that automatically follows all the links found on a page and extracts them. Starting from a predetermined list of addresses, it looks for new documents not yet known to the search engine. New pages it finds are analyzed by the search engine for subsequent indexing. This is done by a special module, the indexer, which first splits pages into parts using lexical and morphological algorithms. Data about web pages is stored in an index database. The index makes it possible to find information quickly in response to user queries.
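The way a crawler starts from a predetermined list of addresses and discovers new documents can be sketched as a breadth-first traversal. In this minimal sketch the web is replaced by a small in-memory dictionary so that the example runs without a network; the names `crawl`, `fetch_links`, and `limit` are my own assumptions.

```python
from collections import deque

def crawl(seeds, fetch_links, limit=100):
    """Visit the seed addresses, then every newly discovered address, in order."""
    queue = deque(dict.fromkeys(seeds))  # drop duplicate seeds, keep their order
    seen = set(queue)
    order = []
    while queue and len(order) < limit:
        url = queue.popleft()
        order.append(url)
        for link in fetch_links(url):
            if link not in seen:  # only documents not yet known are queued
                seen.add(link)
                queue.append(link)
    return order

# Hypothetical link structure standing in for real pages.
web = {"a": ["b", "c"], "b": ["a", "d"], "c": [], "d": ["a"]}
order = crawl(["a"], lambda url: web.get(url, []))
print(order)
```

In a real crawler `fetch_links` would download the page and extract its links; here it is just a dictionary lookup.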

The search engine, in turn, works with files received from the indexer. When a user enters a query into a search engine, it checks its index and returns a list of the most relevant web pages.

Query analysis begins with determining the language, since the same word can mean different things in different languages. To do this, the system looks at the alphabet, the region, and the language of the user interface. Then the search engine moves on to morphology and determines which part of speech each word belongs to. This makes it possible to find documents containing different forms of the same words. The search engine also identifies various objects in the query, such as geographical names, names of people, and names of organizations, and, to cover all possible variants, supplements the query with new formulations that have the same meaning. In addition, the search engine automatically corrects errors or shows results for both the erroneous and the corrected query.
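Two of the steps above, looking at the alphabet and correcting errors, can be sketched with Python's standard library. This is a rough illustration under my own assumptions: the function names `detect_script` and `correct`, the 0.8 similarity cutoff, and the tiny vocabulary are hypothetical, and real search engines use far richer language models.

```python
import difflib
import unicodedata

def detect_script(query):
    """Guess the alphabet of a query by counting letters of each script."""
    counts = {"CYRILLIC": 0, "LATIN": 0}
    for char in query:
        if char.isalpha():
            name = unicodedata.name(char, "")
            for script in counts:
                if name.startswith(script):
                    counts[script] += 1
    if not any(counts.values()):
        return "UNKNOWN"
    return max(counts, key=counts.get)

def correct(word, vocabulary):
    """Replace a word with the closest known word, if one is similar enough."""
    matches = difflib.get_close_matches(word.lower(), vocabulary, n=1, cutoff=0.8)
    return matches[0] if matches else word

# Hypothetical vocabulary of known words, used only for illustration.
vocabulary = ["windows", "doors", "cafe"]
print(detect_script("окна"), correct("windws", vocabulary))
```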

Most search engines use ranking techniques and machine learning to display the “best” results at the top of the list.

In advanced search engines, neural networks transform search queries and web page titles into groups of numbers called semantic vectors. These can be compared with one another to give even more accurate results.

There are also search algorithms that compare the vectors of queries with those of whole web pages, not just their titles. This allows the system to understand the meaning of pages and select them correctly when people describe what they are looking for in their own words. To do this, the neural network converts the page texts into semantic vectors in advance, at the indexing stage. And when a person submits a query, the algorithm compares the query vector with the page vectors it already knows.
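The comparison of a query vector with page vectors prepared in advance can be sketched with cosine similarity. Note the loud assumption: here simple word counts stand in for the semantic vectors that a neural network would produce, so this sketch matches words, not meanings. The names `to_vector`, `cosine`, and `best_match` and the sample pages are hypothetical.

```python
import math
from collections import Counter

def to_vector(text):
    """Stand-in for a neural network: a vector of word counts."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse vectors (0.0 if either is empty)."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical pages; their vectors are computed in advance, at indexing time.
pages = {
    "pharmacy.example": "nearest pharmacy open now",
    "cafe.example": "cafe with business lunch",
}
page_vectors = {url: to_vector(text) for url, text in pages.items()}

def best_match(query):
    """Compare the query vector with every stored page vector."""
    query_vector = to_vector(query)
    return max(page_vectors, key=lambda url: cosine(query_vector, page_vectors[url]))

print(best_match("where is the nearest pharmacy"))
```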