Chapter 2: Search Engines
February 13, 2020
This is Part 2 of our SEO Series. Honey Whale takes you through all the fundamentals, tips, and tricks for Search Engine Optimization. This Series has been crafted for the South African market.
Search engines are answer machines that exist to discover, understand, and organise the internet's content in order to offer the most relevant results to the questions searchers are asking. To show up in search results, your content first needs to be visible to search engines. This is arguably the most important piece of the SEO puzzle: if your site can't be found, there's no way you'll ever show up in the SERPs (search engine results pages).
Search engines have three primary functions:
Crawling is the process by which search engines send out a team of crawlers or spiders (automated programs) to find new and updated content. Content can vary (it could be a webpage, an image, a video, a PDF, and so on), but regardless of the format, content is discovered by links.
Googlebot starts out by fetching a few web pages, and then follows the links on those pages to find new URLs. By hopping along this path of links, the crawler finds new content and adds it to Google's index, called Caffeine, a massive database of discovered URLs. Those URLs are later retrieved when a searcher is seeking information that their content is a good match for.
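As a rough illustration of the link-following step, the sketch below pulls the links out of a single page's HTML using only Python's standard library. It is not how Googlebot is implemented; the `LinkExtractor` class, the `extract_links` helper, and the sample page are hypothetical names made up for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class LinkExtractor(HTMLParser):
    """Collects the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            # Turn relative links into absolute URLs and drop "#fragment" parts,
            # since a fragment points inside a page rather than to a new page.
            absolute, _fragment = urldefrag(urljoin(self.base_url, href))
            self.links.append(absolute)


def extract_links(html, base_url):
    """Return the absolute URLs linked from one page of HTML."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links


# Hypothetical page content, for illustration only.
SAMPLE_PAGE = '<a href="/about">About</a> <a href="https://example.org/blog#top">Blog</a>'
print(extract_links(SAMPLE_PAGE, "https://yourdomain.com/"))
```

A crawler would repeat this for each newly found URL, which is why a page with no links pointing to it is so hard to discover.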
Search engines process and store information they find in an index, a huge database of all the content they’ve discovered and deem good enough to serve up to searchers.
When someone performs a search, search engines scour their index for highly relevant content and then order that content in the hopes of solving the searcher's query. This ordering of search results by relevance is known as ranking.
In general, you can assume that the higher a website is ranked, the more relevant the search engine believes that site is to the query. It’s possible to block search engine crawlers from part or all of your site, or instruct search engines to avoid storing certain pages in their index. While there can be reasons for doing this, if you want your content found by searchers, you have to first make sure it’s accessible to crawlers and is indexable. Otherwise, it’s as good as invisible.
Making sure your site gets crawled and indexed is a prerequisite to showing up in the SERPs. If you already have a website, it might be a good idea to start off by seeing how many of your pages are in the index. This will yield some great insights into whether Google is crawling and finding all the pages you want it to, and none that you don’t. One way to check your indexed pages is "site:yourdomain.com", an advanced search operator. Head to Google and type "site:yourdomain.com" into the search bar. This will return results Google has in its index for the site specified:
The number of results Google displays isn't exact, but it does give you an indication of which pages are indexed on your site and how they are currently showing up in search results. For more accurate results, use the Index Coverage report in Google Search Console.
You can sign up for a free Google Search Console account if you don't currently have one.
With this tool, you can submit sitemaps for your site and monitor how many submitted pages have actually been added to Google's index, among other things. If you're not showing up anywhere in the search results, there are a few possible reasons why:
Most people think about making sure Google can find their important pages, but it’s easy to forget that there are likely pages you don’t want Googlebot to find. These might include things like old URLs that have thin content, duplicate URLs (such as sort-and-filter parameters for e-commerce), special promo code pages, staging or test pages, and so on.
To direct Googlebot away from certain pages and sections of your site, use robots.txt.
Robots.txt files are located in the root directory of websites (ex. yourdomain.com/robots.txt) and suggest which parts of your site search engines should and shouldn't crawl, as well as the speed at which they crawl your site, via specific robots.txt directives.
If Googlebot can't find a robots.txt file for a site, it proceeds to crawl the site. If Googlebot finds a robots.txt file for a site, it will usually abide by the suggestions and proceed to crawl the site. If Googlebot encounters an error while trying to access a site’s robots.txt file and can't determine if one exists or not, it won't crawl the site.
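To see how robots.txt rules translate into crawl decisions, here is a minimal sketch using Python's built-in `urllib.robotparser`. The robots.txt content and the URLs are hypothetical, and Python's parser only approximates how any particular search engine interprets the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /staging/
Disallow: /promo-codes/
"""


def build_parser(robots_txt):
    """Parse robots.txt text into a RobotFileParser without fetching anything."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(SAMPLE_ROBOTS_TXT)

# A normal product page is not covered by any Disallow rule.
print(parser.can_fetch("Googlebot", "https://yourdomain.com/products/shoes"))

# Anything under /staging/ is disallowed for all user agents.
print(parser.can_fetch("Googlebot", "https://yourdomain.com/staging/new-home"))
```

Running a check like this against your own robots.txt before publishing it is a cheap way to confirm you are not accidentally blocking pages you want crawled.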
Not all web robots follow robots.txt. People with bad intentions (e.g., e-mail address scrapers) build bots that don't follow this protocol. In fact, some bad actors use robots.txt files to find where you’ve located your private content. Although it might seem logical to block crawlers from private pages such as login and administration pages so that they don’t show up in the index, placing the location of those URLs in a publicly accessible robots.txt file also means that people with malicious intent can more easily find them. It’s better to mark these pages as noindex and gate them behind a login form rather than list them in your robots.txt file.
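One common way to mark a page as noindex is a robots meta tag in the page's head, such as `<meta name="robots" content="noindex">`. The sketch below checks a page's HTML for that tag with Python's standard library. The `RobotsMetaParser` class and `is_noindexed` helper are hypothetical names, and the check ignores other mechanisms such as HTTP response headers.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives listed in <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = {name: (value or "") for name, value in attrs}
        if attributes.get("name", "").lower() == "robots":
            # The content attribute is a comma-separated list of directives.
            for directive in attributes.get("content", "").split(","):
                self.directives.add(directive.strip().lower())


def is_noindexed(html):
    """Return True if the page carries a robots meta tag containing noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Hypothetical login page markup, for illustration only.
LOGIN_PAGE = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
print(is_noindexed(LOGIN_PAGE))
```

Keep in mind that a crawler has to be able to fetch the page to see this tag, so a page blocked in robots.txt cannot also rely on it.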
Now that you know some tactics for ensuring search engine crawlers stay away from your unimportant content, let’s learn about the optimizations that can help Googlebot find your important pages.
A search engine will be able to find parts of your site by crawling it, but other pages or sections might be obscured for one reason or another. It's important to make sure that search engines are able to discover all the content you want indexed, and not just your homepage.
Ask yourself this: Can the bot crawl through your website, and not just to it?
If you require users to log in, fill out forms, or answer surveys before accessing certain content, search engines won't see those protected pages. A crawler is definitely not going to log in.
Robots cannot use search forms. Some individuals believe that if they place a search box on their site, search engines will be able to find everything that their visitors search for.
Non-text media forms (images, video, GIFs, etc.) should not be used to display text that you wish to be indexed. While search engines are getting better at recognizing images, there's no guarantee they will be able to read and understand them just yet. It's always best to add text within the HTML markup of your webpage.
Just as a crawler needs to discover your site via links from other sites, it needs a path of links on your own site to guide it from page to page. If you’ve got a page you want search engines to find but it isn’t linked to from any other pages, it’s as good as invisible. Many sites make the critical mistake of structuring their navigation in ways that are inaccessible to search engines, hindering their ability to get listed in search results.
This is why it's essential that your website has a clear navigation and helpful URL folder structures.
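To check whether a crawler could reach every page by following internal links alone, you can model the site as a map of pages to the pages they link to. The sketch below is a simplified illustration: the `SITE` map and the function names are hypothetical, and a real audit would build the map from an actual crawl.

```python
from collections import deque


def reachable_pages(link_graph, start):
    """Breadth-first walk over internal links, beginning at the start page."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in link_graph.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


def find_orphans(link_graph, start):
    """Return pages that exist on the site but cannot be reached from start."""
    all_pages = set(link_graph)
    for targets in link_graph.values():
        all_pages.update(targets)
    return sorted(all_pages - reachable_pages(link_graph, start))


# Hypothetical site: each page maps to the internal pages it links to.
SITE = {
    "/": ["/products", "/about"],
    "/products": ["/products/shoes"],
    "/products/shoes": ["/"],
    "/about": [],
    "/old-promo": ["/"],  # links out, but nothing links to it
}

print(find_orphans(SITE, "/"))
```

Any page this reports is one that a crawler starting at your homepage would never find through navigation alone.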
Information architecture is the practice of organizing and labeling content on a website to improve efficiency and findability for users. The best information architecture is intuitive, meaning that users shouldn't have to think very hard to flow through your website or to find something.
A sitemap is just what it sounds like: a list of URLs on your site that crawlers can use to discover and index your content. One of the easiest ways to ensure Google is finding your highest priority pages is to create a file that meets Google's standards and submit it through Google Search Console. While submitting a sitemap doesn’t replace the need for good site navigation, it can certainly help crawlers follow a path to all of your important pages.
If your site doesn't have any other sites linking to it, you still might be able to get it indexed by submitting your XML sitemap in Google Search Console. There's no guarantee they'll include a submitted URL in their index, but it's worth a try!
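For a sense of what an XML sitemap looks like, the sketch below builds a minimal one with Python's standard library, using the namespace defined by the sitemaps.org protocol. The URLs and the `build_sitemap` helper are hypothetical. Many platforms generate a sitemap for you, so treat this as an illustration of the file format rather than a required step.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a minimal XML sitemap (as a string) listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        url_element = ET.SubElement(urlset, "url")
        ET.SubElement(url_element, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical high-priority pages, for illustration only.
print(build_sitemap([
    "https://yourdomain.com/",
    "https://yourdomain.com/products/shoes",
]))
```

The resulting text would be saved as a file such as sitemap.xml on your site and then submitted in Google Search Console.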
In the process of crawling the URLs on your site, a crawler may encounter errors. You can go to Google Search Console’s “Crawl Errors” report to detect URLs on which this might be happening. This report will show you server errors and not found errors. Server log files can also show you this, as well as a treasure trove of other information such as crawl frequency, but because accessing and dissecting server log files is a more advanced tactic, we won’t discuss it at length in this guide.
Before you can do anything meaningful with the crawl error report, it’s important to understand server errors and "not found" errors.
4xx errors are client errors, meaning the requested URL contains bad syntax or cannot be fulfilled. One of the most common 4xx errors is the “404 – not found” error. These might occur because of a URL typo, deleted page, or broken redirect, just to name a few examples. When search engines hit a 404, they can’t access the URL. When users hit a 404, they can get frustrated and leave.
5xx errors are server errors, meaning the server the web page is located on failed to fulfill the searcher or search engine’s request to access the page. In Google Search Console’s “Crawl Error” report, there is a tab dedicated to these errors. These typically happen because the request for the URL timed out, so Googlebot abandoned the request. View Google’s documentation to learn more about fixing server connectivity issues.
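If you export crawl results (from server logs or a crawling tool), grouping the HTTP status codes makes the 4xx and 5xx problems easy to spot. The sketch below assumes a simple mapping of URL to status code; the sample data and function names are hypothetical.

```python
from collections import Counter


def classify_status(status_code):
    """Map an HTTP status code to the broad class it belongs to."""
    if 200 <= status_code <= 299:
        return "ok"
    if 300 <= status_code <= 399:
        return "redirect"
    if 400 <= status_code <= 499:
        return "client error"
    if 500 <= status_code <= 599:
        return "server error"
    return "other"


def summarize_crawl(crawl_results):
    """Count how many crawled URLs fall into each status class."""
    return Counter(classify_status(code) for code in crawl_results.values())


# Hypothetical crawl export: URL -> status code returned.
CRAWL_RESULTS = {
    "/": 200,
    "/products/shoes": 200,
    "/old-page": 404,
    "/moved": 301,
    "/checkout": 503,
}

print(summarize_crawl(CRAWL_RESULTS))
```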
Thankfully, there is a way to tell both searchers and search engines that your page has moved — the 301 (permanent) redirect. The 301 status code itself means that the page has permanently moved to a new location, so avoid redirecting URLs to irrelevant pages — URLs where the old URL’s content doesn’t actually live. If a page is ranking for a query and you 301 it to a URL with different content, it might drop in rank position because the content that made it relevant to that particular query isn't there anymore. 301s are powerful — move URLs responsibly!
You also have the option of 302 redirecting a page, but this should be reserved for temporary moves and in cases where passing link equity isn’t as big of a concern. 302s are kind of like a road detour. You're temporarily siphoning traffic through a certain route, but it won't be like that forever.
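Redirects can pile up over time, so it helps to know where an old URL finally lands and how many hops it takes to get there. The sketch below walks a redirect table until it reaches a URL that no longer redirects. The table, its paths, and the `resolve_redirects` helper are hypothetical. On a live site you would read this information from the status code and `Location` header of each response.

```python
def resolve_redirects(redirects, url, max_hops=10):
    """Follow redirects from url and return (final_url, hops).

    redirects maps an old URL to a (status_code, new_url) pair.
    hops lists each (status_code, new_url) step that was followed.
    """
    hops = []
    seen = {url}
    while url in redirects:
        status, target = redirects[url]
        hops.append((status, target))
        if target in seen:
            raise ValueError(f"redirect loop detected at {target}")
        if len(hops) > max_hops:
            raise ValueError(f"more than {max_hops} redirects in a row")
        seen.add(target)
        url = target
    return url, hops


# Hypothetical redirect table, for illustration only.
REDIRECTS = {
    "/old-shoes": (301, "/shoes"),           # permanent move
    "/shoes": (301, "/products/shoes"),      # moved again later
    "/sale": (302, "/summer-sale"),          # temporary detour
}

final_url, hops = resolve_redirects(REDIRECTS, "/old-shoes")
print(final_url, hops)
```

A chain like the one from `/old-shoes` is usually worth flattening so the old URL points straight at its final destination.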
Once you’ve ensured your site has been crawled, the next order of business is to make sure it can be indexed. That’s right — just because your site can be discovered and crawled by a search engine doesn’t necessarily mean that it will be stored in their index. In the previous section on crawling, we discussed how search engines discover your web pages. The index is where your discovered pages are stored. After a crawler finds a page, the search engine renders it just like a browser would. In the process of doing so, the search engine analyzes that page's contents. All of that information is stored in its index.
How do search engines ensure that when someone types a query into the search bar, they get relevant results in return? That process is known as ranking, or the ordering of search results by most relevant to least relevant to a particular query.
To determine relevance, search engines use algorithms, a process or formula by which stored information is retrieved and ordered in meaningful ways. These algorithms have gone through many changes over the years in order to improve the quality of search results. Google, for example, makes algorithm adjustments every day — some of these updates are minor quality tweaks, whereas others are core/broad algorithm updates deployed to tackle a specific issue, like Penguin to tackle link spam. Check out our Google Algorithm Change History for a list of both confirmed and unconfirmed Google updates going back to the year 2000.
Why does the algorithm change so often? Is Google just trying to keep us on our toes? While Google doesn’t always reveal specifics as to why they do what they do, we do know that Google’s aim when making algorithm adjustments is to improve overall search quality. That’s why, in response to algorithm update questions, Google will answer with something along the lines of: "We’re making quality updates all the time." This suggests that if your site suffered after an algorithm adjustment, you should compare it against Google’s Quality Guidelines or Search Quality Rater Guidelines. Both are very telling in terms of what search engines want.
Search engines have always wanted the same thing: to provide useful answers to searchers’ questions in the most helpful formats. If that’s true, then why does it appear that SEO is different now than in years past?
Think about it in terms of someone learning a new language.
At first, their understanding of the language is very rudimentary — “See Spot Run.” Over time, their understanding starts to deepen, and they learn semantics — the meaning behind language and the relationship between words and phrases. Eventually, with enough practice, the student knows the language well enough to even understand nuance, and is able to provide answers to even vague or incomplete questions.
When search engines were just beginning to learn our language, it was much easier to game the system by using tricks and tactics that actually go against quality guidelines. Take keyword stuffing, for example. If you wanted to rank for a particular keyword like “funny jokes,” you might add the words “funny jokes” a bunch of times onto your page, and make it bold, in hopes of boosting your ranking for that term:
Welcome to funny jokes! We tell the funniest jokes in the world. Funny jokes are fun and crazy. Your funny joke awaits. Sit back and read funny jokes because funny jokes can make you happy and funnier. Some funny favorite funny jokes.
This tactic made for terrible user experiences, and instead of laughing at funny jokes, people were bombarded by annoying, hard-to-read text. It may have worked in the past, but this is never what search engines wanted.
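To put a number on how stuffed a piece of text is, you can measure what share of its words belong to the target phrase. The sketch below is a rough diagnostic only. The `phrase_density` helper is a hypothetical name, and no search engine publishes a threshold for this value.

```python
import re


def phrase_density(text, phrase):
    """Return (occurrences, density) of phrase within text.

    density is the fraction of all words in the text that are part of
    an occurrence of the phrase.
    """
    words = re.findall(r"[a-z0-9']+", text.lower())
    phrase_words = phrase.lower().split()
    size = len(phrase_words)
    if not words or size == 0:
        return 0, 0.0
    occurrences = sum(
        1
        for i in range(len(words) - size + 1)
        if words[i:i + size] == phrase_words
    )
    return occurrences, occurrences * size / len(words)


# Short hypothetical snippet: 8 words, 4 of which belong to "funny jokes".
count, density = phrase_density("Funny jokes are fun. Read funny jokes today.", "funny jokes")
print(count, density)
```

Text written naturally for readers tends to score far lower than the stuffed example above, which is a useful sanity check when editing your own pages.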
When we talk about links, we could mean two things. Backlinks or "inbound links" are links from other websites that point to your website, while internal links are links on your own site that point to your other pages (on the same site).
Links have historically played a big role in SEO. Very early on, search engines needed help figuring out which URLs were more trustworthy than others to help them determine how to rank search results. Calculating the number of links pointing to any given site helped them do this.
Backlinks work very similarly to real-life WoM (Word-of-Mouth) referrals. Let’s take a hypothetical coffee shop, Jenny’s Coffee, as an example:
This is why PageRank was created. PageRank (part of Google's core algorithm) is a link analysis algorithm named after one of Google's founders, Larry Page. PageRank estimates the importance of a web page by measuring the quality and quantity of links pointing to it. The assumption is that the more relevant, important, and trustworthy a web page is, the more links it will have earned.
The more natural backlinks you have from high-authority (trusted) websites, the better your odds are to rank higher within search results.
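The idea behind PageRank can be sketched in a few lines. The version below is the simplified textbook form of the algorithm, not Google's production system: each page repeatedly shares its score among the pages it links to, and a damping factor (0.85 here, the value commonly used in textbook descriptions) models a reader who occasionally jumps to a random page. The link graph and function name are hypothetical.

```python
def pagerank(link_graph, damping=0.85, iterations=50):
    """Estimate page importance from a map of page -> pages it links to."""
    pages = set(link_graph)
    for targets in link_graph.values():
        pages.update(targets)
    count = len(pages)
    rank = {page: 1 / count for page in pages}

    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {page: (1 - damping) / count for page in pages}
        for page in pages:
            targets = link_graph.get(page, [])
            if targets:
                # Split this page's score evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score over all pages.
                share = damping * rank[page] / count
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical link graph: pages "a" and "b" both link to "c"; "c" links back to "a".
GRAPH = {"a": ["c"], "b": ["c"], "c": ["a"]}

for page, score in sorted(pagerank(GRAPH).items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

In this small graph, "c" earns the highest score because two pages link to it, "a" comes next thanks to its link from the well-ranked "c", and "b" comes last because nothing links to it.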
There would be no point to links if they didn’t direct searchers to something. That something is content! Content is more than just words; it’s anything meant to be consumed by searchers — there’s video content, image content, and of course, text. If search engines are answer machines, content is the means by which the engines deliver those answers.
Any time someone performs a search, there are thousands of possible results, so how do search engines decide which pages the searcher is going to find valuable? A big part of determining where your page will rank for a given query is how well the content on your page matches the query’s intent. In other words, does this page match the words that were searched and help fulfill the task the searcher was trying to accomplish?
Because of this focus on user satisfaction and task accomplishment, there are no strict benchmarks on how long your content should be, how many times it should contain a keyword, or what you put in your header tags. All those can play a role in how well a page performs in search, but the focus should be on the users who will be reading the content.
Today, with hundreds or even thousands of ranking signals, the top three have stayed fairly consistent: links to your website (which serve as third-party credibility signals), on-page content (quality content that fulfills a searcher’s intent), and RankBrain.
RankBrain is the machine learning component of Google’s core algorithm. Machine learning refers to computer programs that continue to improve their predictions over time through new observations and training data. In other words, RankBrain is always learning, and because it’s always learning, search results should be constantly improving.
For example, if RankBrain notices a lower ranking URL providing a better result to users than the higher ranking URLs, you can bet that RankBrain will adjust those results, moving the more relevant result higher and demoting the less relevant pages as a byproduct.
Because Google will continue leveraging RankBrain to promote the most relevant, helpful content, we need to focus on fulfilling searcher intent more than ever before. Provide the best possible information and experience for searchers who might land on your page, and you’ve taken a big first step to performing well in a RankBrain world.
With Google rankings, engagement metrics are most likely part correlation and part causation.
When we say engagement metrics, we mean data that represents how searchers interact with your site from search results. This includes things like:
Many tests have indicated that engagement metrics correlate with higher ranking, but causation has been hotly debated. Are good engagement metrics just indicative of highly ranked sites? Or are sites ranked highly because they possess good engagement metrics?
While they’ve never used the term “direct ranking signal,” Google has been clear that they absolutely use click data to modify the SERP for particular queries.
According to Google’s former Chief of Search Quality, Udi Manber:
“The ranking itself is affected by the click data. If we discover that, for a particular query, 80% of people click on #2 and only 10% click on #1, after a while we figure out probably #2 is the one people want, so we’ll switch it.”
Another comment from former Google engineer Edmond Lau corroborates this:
“It’s pretty clear that any reasonable search engine would use click data on their own results to feed back into ranking to improve the quality of search results. The actual mechanics of how click data is used is often proprietary, but Google makes it obvious that it uses click data with its patents on systems like rank-adjusted content items.”
Because Google needs to maintain and improve search quality, it seems inevitable that engagement metrics are more than correlation, but it would appear that Google falls short of calling engagement metrics a “ranking signal” because those metrics are used to improve search quality, and the rank of individual URLs is just a byproduct of that.
Various tests have confirmed that Google will adjust SERP order in response to searcher engagement:
Since user engagement metrics are clearly used to adjust the SERPs for quality, and rank position changes as a byproduct, it’s safe to say that SEOs should optimize for engagement. Engagement doesn’t change the objective quality of your web page, but rather your value to searchers relative to other results for that query. That’s why, after no changes to your page or its backlinks, it could decline in rankings if searchers’ behavior indicates they like other pages better.
In terms of ranking web pages, engagement metrics act like a fact-checker. Objective factors such as links and content first rank the page, then engagement metrics help Google adjust if they didn’t get it right.
A search engine like Google has its own proprietary index of local business listings, from which it creates local search results.
If you are performing local SEO work for a business that has a physical location customers can visit (ex: dentist) or for a business that travels to visit their customers (ex: plumber), make sure that you claim, verify, and optimize a free Google My Business Listing.
When it comes to localized search results, Google uses three main factors to determine ranking:
Relevance is how well a local business matches what the searcher is looking for. To ensure that the business is doing everything it can to be relevant to searchers, make sure the business’ information is thoroughly and accurately filled out.
Google uses your geo-location to better serve you local results. Local search results are extremely sensitive to proximity, which refers to the location of the searcher and/or the location specified in the query (if the searcher included one).
Organic search results are sensitive to a searcher's location, though seldom as pronounced as in local pack results.
With prominence as a factor, Google is looking to reward businesses that are well-known in the real world. In addition to a business’ offline prominence, Google also looks to some online factors to determine local ranking, such as:
The number of Google reviews a local business receives, and the sentiment of those reviews, have a notable impact on its ability to rank in local results.
A "business citation" or "business listing" is a web-based reference to a local business' "NAP" (name, address, phone number) on a localized platform (Yelp, Acxiom, YP, Infogroup, Localeze, etc.).
Local rankings are influenced by the number and consistency of local business citations. Google pulls data from a wide variety of sources in continuously making up its local business index. When Google finds multiple consistent references to a business's name, location, and phone number it strengthens Google's "trust" in the validity of that data. This then leads to Google being able to show the business with a higher degree of confidence. Google also uses information from other sources on the web, such as links and articles.
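You can audit citation consistency yourself by normalizing each listing's name, address, and phone number and flagging the ones that disagree with the majority. The sketch below is a simple illustration. The directory names, business details, and helper functions are all hypothetical, and the normalization is deliberately basic (for example, it does not reconcile country codes or abbreviations such as "St" and "Street").

```python
import re
from collections import Counter


def normalize_nap(name, address, phone):
    """Reduce a listing to a comparable form: lowercase words and phone digits."""
    def clean(text):
        return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()

    return clean(name), clean(address), re.sub(r"\D", "", phone)


def inconsistent_sources(citations):
    """Return the sources whose NAP differs from the most common version."""
    normalized = {source: normalize_nap(*nap) for source, nap in citations.items()}
    if not normalized:
        return []
    majority, _count = Counter(normalized.values()).most_common(1)[0]
    return sorted(source for source, nap in normalized.items() if nap != majority)


# Hypothetical listings for the coffee shop example: source -> (name, address, phone).
CITATIONS = {
    "directory_a": ("Jenny's Coffee", "12 Long St, Cape Town", "021 555 0100"),
    "directory_b": ("JENNY'S COFFEE", "12 Long St., Cape Town", "(021) 555-0100"),
    "directory_c": ("Jenny's Coffee", "12 Long St, Cape Town", "021 555 0199"),
}

print(inconsistent_sources(CITATIONS))
```

Here the third listing is flagged because its phone number differs, which is the kind of mismatch worth correcting at the source.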
SEO best practices also apply to local SEO, since Google also considers a website’s position in organic search results when determining local ranking.
In the next chapter, you’ll learn on-page best practices that will help Google and users better understand your content.