What is a Sitemap?
This article gives you a basic but thorough idea of what a sitemap is and why it matters. It covers what you need to know about sitemaps and how to use them on your site. They are building blocks of any site, yet people often make mistakes with them or do not use all the sitemap types their site requires.
A sitemap is a file that lists the URLs on your website. If you prepare a list of all the URLs on your website and paste it into a text file, that file shows all the content your site has. This is the most basic form. Later we will see other forms.
That is the simplest definition, and the simplest way to create a sitemap. But as the Internet has grown, sitemaps have become much more important. Search engines want more information about a site, so merely listing URLs in a text document is no longer the best approach.
But before diving into the different types of sitemap and the format of each, let's understand in simple terms why we need one for our website.
Why is a sitemap important?
Sitemaps are important because of search engines. Search engines are the major traffic generator for most sites, so it is important to see how they work. I will not dive into the details of search engine functionality, as the topic is huge. I will explain only the basics.
How do search engines work?
The Internet is a huge collection of websites, and every page has a different URL. Most of us rely on search engines like Google or Bing to find relevant content. To do that, they have to maintain a repository of most, if not all, links on the Internet.
Search engines display links relevant to what you search for. But how do they know which links are relevant to your query? They have to keep not only a list of URLs on the Internet but also details about each one, so that they can present a URL when your search matches it.
The more information search engines have about your content, the better. Search engines crawl every site they can reach to build their repository, adding, deleting and updating links as they go. This is not an easy task.
How do Sitemaps help Search Engines?
Let's assume you have created a new website. How will search engines know about it? Thousands of new websites are launched every day. If you wait for search engines to find your website, it will take time, because they have no solid reason to prefer your site over others. This is where sitemaps come into the picture.
You can create a sitemap for your site. As mentioned, a sitemap is a collection of all the URLs on your website. Later we will see the different types, grouped by format, which one to use, and the pros and cons of each.
Google and Bing have both created portals where you can submit your sitemap. This way they learn more about your website. Sitemaps make their task easier, help them discover your site and help them rank it properly for relevant search terms. You can read the step by step tutorials on Bing Webmasters and Google Webmasters (now known as Google Search Console).
Sitemaps can be divided into subtypes. The first classification is based on file format: a sitemap can be a simple text file, an HTML file or an XML file. Based on file format, sitemaps fall into the types below:
- Text Sitemap
- HTML Sitemap
- XML Sitemap
Text is the simplest format. You can create a text file named sitemap.txt and paste all the URLs of your site into it, one URL per line. The file name can be anything, but it has become standard to use a name that is easy to understand. You can submit this file to Google Search Console or Bing Webmasters.
Text sitemaps are easy to create. You do not need any tools. You can export all the URLs of your site and paste them into the file. Since it is the simplest format to create, there is less chance of error.
The text format does not provide much information about the content behind each URL. Also, if you are using a CMS like WordPress or Joomla, you are unlikely to find a tool that generates it, so creation is mostly manual. Most websites nowadays do not use it. Use it only if you cannot use another format such as XML or HTML.
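If you would rather not paste the URLs by hand, a few lines of script can produce the same file. The sketch below is only an illustration: the function name build_text_sitemap and the example.com URLs are assumptions of mine, not part of any plugin or tool.

```python
def build_text_sitemap(urls):
    """Return the body of a text sitemap: one absolute URL per line."""
    # Drop surrounding whitespace and skip empty entries.
    cleaned = [u.strip() for u in urls if u.strip()]
    return "\n".join(cleaned) + "\n"


# Hypothetical URLs, used only for illustration.
example_urls = [
    "https://example.com/",
    "https://example.com/about/",
    "https://example.com/blog/first-post/",
]

if __name__ == "__main__":
    print(build_text_sitemap(example_urls), end="")
```

Save the returned text as sitemap.txt in UTF-8 and upload it to your site.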
The HTML sitemap is more popular than the text format. HTML is displayed by all web browsers, so an HTML sitemap can be viewed by your website visitors. It is still a list of URLs, just in HTML format. Some sites use it, so it is not as rare as the text format.
It is human readable. Many plugins for the major CMSs will create an HTML sitemap from your website content. It helps visitors get an idea of the content on your site, and it gives both search engines and visitors a good picture of your website architecture.
It is not meant primarily for search engines. If used improperly, the HTML sitemap page itself may rank in search results. Later in this post we will see how to avoid that. An HTML sitemap also does not provide extended information about your content.
XML sitemaps use the XML file format. The major search engines have agreed on a common format for these files, so the same file can be fed to all of them. Of the three sitemap formats, it is the most popular and widely used one. Almost all sites use the XML format.
XML sitemaps are meant for search engines. They provide additional information about your website content, and this additional information helps your website immensely. Creating an XML sitemap by hand is complicated, but most CMSs have plugins for it.
The complexity of creating XML sitemaps is the big negative, although the tools available for each CMS solve it. Including only relevant pages is also important. If the file is not correct, it may have an adverse impact on your website's indexing and performance in search results. I will cover more on this later in the post.
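To give a feel for what a plugin generates, here is a minimal sketch that builds a web sitemap with Python's standard library. It follows the sitemaps.org format (a urlset element containing url entries with loc and an optional lastmod). The function name build_xml_sitemap and the example.com URLs are my own placeholders, and a real plugin does far more than this.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_xml_sitemap(pages):
    """Build a web sitemap from (url, last_modified) pairs.

    last_modified is a date string such as "2024-01-15", or None to omit it.
    """
    # Serialize the sitemap namespace as the default (unprefixed) namespace.
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = loc
        if lastmod:
            ET.SubElement(url, f"{{{SITEMAP_NS}}}lastmod").text = lastmod
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # Hypothetical pages, used only for illustration.
    print(build_xml_sitemap([
        ("https://example.com/", "2024-01-15"),
        ("https://example.com/about/", None),
    ]))
```

The standard library escapes special characters such as & in URLs, which is one of the easy mistakes to make when writing the file by hand.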
XML Sitemap types
You should always use an XML Sitemap. I would recommend using one HTML Sitemap as well. But if you can use only one format, it should be XML. Below we will see the different XML formats defined by search engines like Google.
Based on Content type
Search engines have also categorized XML sitemaps by content type. Separate protocols were defined to help search engines get more information. For example, if you have images on a page, the details of the page itself do not say much about those images.
The same is true of video. So separate protocols were defined for images and videos. These protocols give Google more information about images and videos and help it rank these content types better.
Below are the categories based on content type (or recommended by search engines for each content type):
- Web Sitemap
- Image Sitemap
- Video Sitemap
Based on Site type
Google has defined separate protocols for special websites such as news websites and multilingual websites. Sites falling into these two categories should use these special types for best results. News sites should always emphasize recent events, so a news sitemap should not list URLs older than 2 days.
Search engines devised a protocol to meet this requirement. Multilingual sites, in turn, should state which languages a page is offered in, and a protocol was devised to cater to this requirement as well.
- News Sitemap
- Multilingual Sitemap
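For the multilingual case, the language information is usually expressed by giving every URL a set of alternate links, one per language version. The sketch below shows the idea, assuming the xhtml:link with rel="alternate" and hreflang convention. The function name build_multilingual_sitemap and the example.com URLs are placeholders of mine.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"


def build_multilingual_sitemap(page_groups):
    """page_groups: list of dicts mapping a language code to that translation's URL."""
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("xhtml", XHTML_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for translations in page_groups:
        for page_url in translations.values():
            url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
            ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page_url
            # Each version lists every language version, including itself.
            for lang, alt_url in translations.items():
                ET.SubElement(
                    url,
                    f"{{{XHTML_NS}}}link",
                    {"rel": "alternate", "hreflang": lang, "href": alt_url},
                )
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # Hypothetical English and German versions of one page.
    print(build_multilingual_sitemap([
        {"en": "https://example.com/en/pricing/", "de": "https://example.com/de/preise/"},
    ]))
```

A page with two language versions therefore produces two url entries, each carrying two alternate links.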
Which Sitemap to use?
This is the most important question every website owner has. There are so many types out there, and selecting the best one depends on the nature and content of your website. With the details above you should have a fair idea of each type and its advantages and disadvantages. We will see when to use each type below.
All websites should use the normal (Web) XML Sitemap. The reason is simple: search engines want you to use one. Search traffic is a very important traffic source for any website, and using sitemaps is the first and most important step towards SEO.
It is a must for new and small websites. If you run an established website with plenty of links pointing to it, you can ignore sitemaps, because search engines will follow those external links to visit the site. But if you have few or no links, search engines will visit less frequently.
Small sites that do not use the web format will get less content indexed, and their new content will be indexed slowly. This adversely impacts traffic. For large sites this is less of an issue, as their new content gets shared and linked, so search engines index it faster.
Large sites use it to make sure all content on their site gets indexed, so that pages with few or no links also get search engine attention. For a small site it is a must for all content.
The Image Sitemap is for images. Google needs additional information about images, and that information is not present in the web format. So if you use many images on your site, or images are important for your site, you should use an Image Sitemap.
The image format is more complex than the normal (web) one. You can use one even if there are only a few images on your website. There is no harm in using more than one sitemap. In fact, you should use every sitemap type that fits your website's needs. The image format has special fields for image related information.
The image title, image caption and image description fields give search engines adequate knowledge about your images. With this additional information, search engines understand your images better. The more information they have about your images, the better the ranking, because they can work out which queries your images should be displayed for.
Images also complement the content of your page. So by providing more information about your images you are helping search engines understand your web content better. This helps your web ranking as well.
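As a rough illustration of the structure, the sketch below builds an image sitemap in which each page URL carries one image entry per image on that page. It uses Google's image sitemap namespace and emits only the image location. The optional fields mentioned above are left out on purpose, so check the search engine's current documentation before relying on them. The function name and example.com URLs are assumptions for the example.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
IMAGE_NS = "http://www.google.com/schemas/sitemap-image/1.1"


def build_image_sitemap(pages):
    """pages maps a page URL to the list of image URLs used on that page."""
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("image", IMAGE_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page_url, image_urls in pages.items():
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page_url
        for image_url in image_urls:
            # One <image:image> block per image, nested under its page.
            image = ET.SubElement(url, f"{{{IMAGE_NS}}}image")
            ET.SubElement(image, f"{{{IMAGE_NS}}}loc").text = image_url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # Hypothetical page with two images.
    print(build_image_sitemap({
        "https://example.com/gallery/": [
            "https://example.com/img/a.jpg",
            "https://example.com/img/b.jpg",
        ],
    }))
```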
The Video Sitemap is a special sitemap for videos. Google needs additional information about videos, such as the video title, description, thumbnail and duration. Mostly this information is not present in your page content.
So even if search engines crawl your pages, they do not have adequate information about your videos. This information is not present in the web format either. So if you use videos on your website, you should always use a Video Sitemap, which provides all the information search engines need.
Search engines use this information to rank your video content or include it in their search index.
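The fields listed above map directly onto tags in the video sitemap. Here is a minimal sketch, assuming Google's video sitemap namespace and its thumbnail_loc, title, description, content_loc and duration tags. The Video class, the function name and the example.com URLs are hypothetical and exist only for this example.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
VIDEO_NS = "http://www.google.com/schemas/sitemap-video/1.1"


@dataclass
class Video:
    page_url: str          # page on which the video is embedded
    title: str
    description: str
    thumbnail_url: str
    content_url: str       # URL of the video file itself
    duration_seconds: int


def build_video_sitemap(videos):
    """Build a video sitemap with one <url> entry per video."""
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("video", VIDEO_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for v in videos:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = v.page_url
        video = ET.SubElement(url, f"{{{VIDEO_NS}}}video")
        fields = (
            ("thumbnail_loc", v.thumbnail_url),
            ("title", v.title),
            ("description", v.description),
            ("content_loc", v.content_url),
            ("duration", str(v.duration_seconds)),
        )
        for tag, value in fields:
            ET.SubElement(video, f"{{{VIDEO_NS}}}{tag}").text = value
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body
```

A call such as build_video_sitemap([Video("https://example.com/how-to/", "How to", "A short demo", "https://example.com/thumb.jpg", "https://example.com/video.mp4", 95)]) returns the XML as a string.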
The HTML Sitemap provides information about your site structure to search engines. It is also human readable and helps site visitors. But it is not used much by websites, because it is not submitted to Google Search Console or Bing Webmasters.
I would still recommend using it on all websites. If you are using WordPress, you can use any of the HTML Sitemap generators. It is worth mentioning that your HTML sitemap page should carry a robots meta tag set to "noindex, follow". That is, the page itself should not be indexed and shown in search results, but the links on it should be followed.
Site structure is slowly gaining importance in search ranking. The HTML format is the best way to show site structure to both search engines and visitors.
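To make the robots tag recommendation concrete, here is a small sketch that renders an HTML sitemap page with the "noindex, follow" meta tag in its head. The function name build_html_sitemap and the page layout are my own assumptions. A WordPress plugin would normally produce this page for you.

```python
from html import escape


def build_html_sitemap(links, title="Sitemap"):
    """links is a list of (url, label) pairs; returns a complete HTML page."""
    items = "\n".join(
        f'    <li><a href="{escape(url)}">{escape(label)}</a></li>'
        for url, label in links
    )
    return (
        "<!DOCTYPE html>\n"
        '<html lang="en">\n'
        "<head>\n"
        '  <meta charset="utf-8">\n'
        # Keep this page out of search results but let crawlers follow its links.
        '  <meta name="robots" content="noindex, follow">\n'
        f"  <title>{escape(title)}</title>\n"
        "</head>\n"
        "<body>\n"
        f"  <h1>{escape(title)}</h1>\n"
        "  <ul>\n"
        f"{items}\n"
        "  </ul>\n"
        "</body>\n"
        "</html>\n"
    )


if __name__ == "__main__":
    # Hypothetical links, used only for illustration.
    print(build_html_sitemap([
        ("https://example.com/", "Home"),
        ("https://example.com/blog/", "Blog"),
    ]))
```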
How to create Sitemaps?
No single plugin creates all the required sitemap types mentioned above (Image, Video and HTML). You need a couple of plugins for it. This increases maintenance and leads to too much duplication between the sitemaps.
Plugins are also not always compatible with one another. For these reasons I have built a single plugin that takes care of all your website's needs. You can use it to create the sitemap types above.
Other Sitemap types
News and Multilingual sitemaps are special types. They are not required for most websites. The News format should be used only by news sites; other websites should not use it. Google recommends listing only URLs newer than 2 days in the News format.
Multilingual sitemaps are rare and are used only by multilingual sites. As these two are special types and their usage is well defined, I am not covering them in detail here. You can use the WordPress plugin below to create one for your website.
Use these special types only if you have that type of website. For example, the News format should be used only if you run a news website. Using it on a non-news website will not be beneficial.
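The 2 day rule for the News format is easy to apply in code before the sitemap is written. The sketch below only shows the filtering step. The function name recent_articles and the sample data are hypothetical, and publication times are assumed to be timezone-aware datetimes.

```python
from datetime import datetime, timedelta, timezone


def recent_articles(articles, now=None, max_age=timedelta(days=2)):
    """Keep only (url, published_at) pairs published within max_age of now."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - max_age
    return [(url, published) for url, published in articles if published >= cutoff]


if __name__ == "__main__":
    # Hypothetical articles, used only for illustration.
    reference = datetime(2024, 5, 10, 12, 0, tzinfo=timezone.utc)
    articles = [
        ("https://example.com/news/fresh-story/", datetime(2024, 5, 10, 8, 0, tzinfo=timezone.utc)),
        ("https://example.com/news/old-story/", datetime(2024, 5, 7, 8, 0, tzinfo=timezone.utc)),
    ]
    for url, _ in recent_articles(articles, now=reference):
        print(url)
```

Only the URLs that pass this filter would then go into the News sitemap. Older articles remain in the normal web sitemap.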
What to do after creating Sitemaps?
Creating a sitemap is important, but it is equally important to tell search engines where it is. You can name your sitemap file anything, and different plugins use different file names, so search engines cannot guess the location. To tell them, follow the two steps below.
Add to robots.txt
You should add links to all your sitemap files to your robots.txt file. This helps Google's bots and other search engine bots while they crawl your website. They know the location of the file, which makes scanning and indexing your website easier.
You should add the line below at the end of your robots.txt file:
Sitemap: https://udinra.com/sitemap.xml
Note: replace the URL https://udinra.com/sitemap.xml with your sitemap file URL. You can use as many Sitemap lines as you want in robots.txt. For example, if you are using the Image format and the Video format, you can use:
Sitemap: https://udinra.com/sitemap-image.xml
Sitemap: https://udinra.com/sitemap-video.xml
Do not forget to replace the links with your own website links before pasting.
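If you maintain robots.txt with a script, the same step can be automated. The sketch below adds a Sitemap line for each URL that is not already listed, so running it twice does not create repeated lines. The function name add_sitemap_lines and the example.com URLs are placeholders I made up for this example.

```python
def add_sitemap_lines(robots_txt, sitemap_urls):
    """Return robots.txt content with a Sitemap line for each URL not already listed."""
    lines = robots_txt.splitlines()
    # Collect sitemap URLs that robots.txt already declares.
    listed = {
        line.split(":", 1)[1].strip()
        for line in lines
        if line.strip().lower().startswith("sitemap:")
    }
    for url in sitemap_urls:
        if url not in listed:
            lines.append(f"Sitemap: {url}")
            listed.add(url)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical robots.txt content and sitemap URLs.
    existing = "User-agent: *\nDisallow: /wp-admin/\n"
    print(add_sitemap_lines(existing, [
        "https://example.com/sitemap.xml",
        "https://example.com/sitemap-image.xml",
    ]), end="")
```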
Add to Google Search Console
Google Search Console and similar tools also tell you the indexing rate of your Web, Image and Video content, so you can get an idea of how well your content is performing in the eyes of Google. If more than 90% of your URLs are indexed, that is a good sign. The percentage tends to be lower for large websites.
Google Search Console and the other search engine tools report errors as well, so you can monitor them. If you are submitting a new sitemap file, it is good to check the indexing rate and any errors or warnings after one week. If indexing has improved and there are no errors or warnings, you are good.
Tracking Sitemap performance
You can use Google Search Console to track sitemap performance. Tracking performance in these tools is very important for new websites at the start. As the website ages you can visit these tools less often, but it is better to check them at least once a month or once a quarter.
Google Search Console has one drawback. It does not let you compare your site's performance over a period of time. For example, if you made a change to your site, you would want to see whether that change improved your indexing or degraded it.
This relative comparison over time is not present in Google Search Console or any other webmaster tool. You cannot see how performance improved over a period of 3 or 6 months. Because of this limitation I created an Android mobile app. The app helps you track the performance of your pages and sitemap in Google Search Console over time. You can install it for free.
You can save the Google Search Console data in the app before making a change. That acts as a reference point for comparing your site's performance over time. After, say, 3 months you can compare the site metrics in Search Console against that data. The app shows the result in simple numbers, so you can confirm whether the change worked.
Common Sitemap related myths
I am covering a few very common myths about the usage of sitemaps. My clients have asked these questions, and some of them are popular on the Internet as well. I have picked some very common yet important ones.
Using a URL twice will cause duplicate content
The myth says that a sitemap should contain only unique URLs, because a repeated URL will cause a duplicate content issue. This statement is wrong. The only problem I can think of with a duplicate URL is that it increases the file size. There is no adverse impact on your SEO.
You will never get a duplicate content issue this way. Google's bots will ignore repeated URLs in a sitemap. But note that if two different URLs have the same content, you will have a duplicate content issue.
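If you still want to keep the file small, removing repeated URLs before writing the sitemap takes one line. This is just a sketch, and the function name unique_urls is my own.

```python
def unique_urls(urls):
    """Remove repeated URLs while keeping the original order."""
    # dict keys are unique and remember insertion order.
    return list(dict.fromkeys(urls))


if __name__ == "__main__":
    # Hypothetical list with one repeated URL.
    print(unique_urls([
        "https://example.com/",
        "https://example.com/about/",
        "https://example.com/",
    ]))
```

Note that this only catches identical strings. Two different URLs that serve the same content are a separate problem, and this step does not detect them.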
No need to use Image Sitemap
Many believe that the Web format is enough for all websites. The reason given is that it lists all URLs, so when Google's bot crawls a URL it will also learn about the images on it. Advocates of this theory point out that images have ALT text, which provides information about them. Note that ALT text is indeed very important for image SEO.
But going by this rule, the Web format is not required either, as Google's bot can crawl the website and learn about its content. Yet the Web format is required, as it provides information about URLs and improves indexing. The same goes for the Image format: it provides information about individual images and about the URLs containing them.
This improves the image indexing rate, and the extra information helps image ranking. It also helps search engines understand your web page better along with its images. The clearer search engines are about your pages, the better your chances of performing well in search results.
Now assume you have not used the Image format. There is a chance that you have missed the ALT text on some percentage of your images. In that case Google has very limited or no information about those images. So using the Image format safeguards you against the mistake of missing ALT text.
No need to use Video Sitemap
As with images, many do not want to use a Video Sitemap. While crawling, Google's bot often has only the video URL, with no additional information. Most sites only paste the video URL or use a video shortcode. They do not use the HTML5 video tag or Schema.org video markup to provide additional information about videos.
Implementing Schema.org markup for videos or using the HTML5 video tag is difficult, and you have to make sure you are using the tags correctly. You can add video details like title, description, thumbnail and duration as part of the Schema.org markup and the HTML5 video tag.
A Video Sitemap provides this information easily. You can install one plugin and forget the rest. This additional information helps search engines rank your video. They will know the context of your video and which search queries they should rank it for.
More than one sitemap file should not be present in web root
Many website owners worry about having more than one sitemap in the web root. In fact, Yoast SEO also gives a warning about it. But there is nothing to worry about. Google crawls only the files you list in robots.txt or submit in Google Search Console.
There is no need to fear the extra files. You can have two or three files present and list only one in robots.txt or Google Search Console without any issue. The only problem with multiple files is confusion: you might lose track of which one is listed in the webmaster tools or robots.txt. Apart from that there is no problem.
So if multiple files confuse you, delete the rest and keep the one you are using. If they do not, keep them as they are.
Using archive pages in Sitemap
Personally I am not a big advocate of including archive pages. Your pages and posts are your main traffic pages. Do you want traffic for your category, tag or other archive pages? There is no need for traffic to these pages.
Including these pages just increases the number of URLs. You should list only your main content pages and filter out all other URLs. Most website owners are confused about this because the major sitemap plugins offer a feature to add or remove archive pages. The feature may exist for flexibility, but there is no need to use it.
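If you build the URL list yourself, filtering out archive pages can be as simple as checking the URL path. The sketch below assumes archive URLs start with /category/, /tag/ or /author/, which is a common WordPress layout but depends on your permalink settings. The names ARCHIVE_PREFIXES and drop_archive_urls are hypothetical.

```python
from urllib.parse import urlparse

# Assumed archive path prefixes; adjust them to your own permalink structure.
ARCHIVE_PREFIXES = ("/category/", "/tag/", "/author/")


def drop_archive_urls(urls, prefixes=ARCHIVE_PREFIXES):
    """Keep only URLs whose path does not start with an archive prefix."""
    return [u for u in urls if not urlparse(u).path.startswith(prefixes)]


if __name__ == "__main__":
    # Hypothetical URLs, used only for illustration.
    print(drop_archive_urls([
        "https://example.com/my-first-post/",
        "https://example.com/category/news/",
        "https://example.com/tag/seo/",
        "https://example.com/about/",
    ]))
```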
Zipped version of sitemap is better
I also used to keep a zipped version of my sitemap. The reasoning was that compression decreases file size, so the file downloads faster and the crawl rate increases. But some time back I saw a non-zipped version get indexed faster than a zipped one. The problem occurred for a customer who was using the zipped version.
The sitemap was very slow to get indexed in Google Search Console. This was odd, as sitemaps created by my plugins are usually consumed very quickly by search engines. After ruling out every other mistake I could think of, I switched from the zipped version to the non-zipped one.
To my surprise, the non-zipped version performed much better. So it is better to use separate small files with an index file than one big gzipped file. I also started using the non-gzipped version myself, and I removed the option to create a gzipped sitemap from my plugins.
Had the option stayed, many would have used it, so removing it was the best choice. Note that there is no documented statement by Google on this. It is only what I found in my own testing, and it would be worth testing on a larger sample. But for now I think it is better to use the normal version instead of the gzipped one.
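The "separate small files with an index file" approach can be sketched as follows. The URL list is split into chunks, each chunk becomes its own sitemap file, and an index file lists where those files live. The sitemaps.org protocol caps a single sitemap at 50,000 URLs, which is used as the default chunk size here. The function names and example.com URLs are assumptions of mine, and this is not how my plugin is implemented.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50000  # upper limit per sitemap file in the sitemaps.org protocol


def split_urls(urls, max_per_file=MAX_URLS_PER_FILE):
    """Split a list of URLs into chunks, one chunk per sitemap file."""
    return [urls[i:i + max_per_file] for i in range(0, len(urls), max_per_file)]


def build_sitemap_index(sitemap_file_urls):
    """Build the index file that points to each individual sitemap file."""
    ET.register_namespace("", SITEMAP_NS)
    index = ET.Element(f"{{{SITEMAP_NS}}}sitemapindex")
    for file_url in sitemap_file_urls:
        sitemap = ET.SubElement(index, f"{{{SITEMAP_NS}}}sitemap")
        ET.SubElement(sitemap, f"{{{SITEMAP_NS}}}loc").text = file_url
    body = ET.tostring(index, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # Hypothetical site with 5 URLs, split into files of at most 2 URLs each.
    urls = [f"https://example.com/post-{n}/" for n in range(1, 6)]
    chunks = split_urls(urls, max_per_file=2)
    files = [f"https://example.com/sitemap-{n}.xml" for n in range(1, len(chunks) + 1)]
    print(build_sitemap_index(files))
```

You would then submit only the index file to the search engines and let them fetch the smaller files from it.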
Using Mobile Sitemap
The Mobile Sitemap is yet another XML sitemap type. I have not covered it earlier as it is a bit controversial and less well understood. It is not required for responsive websites, and since most of us use a responsive site we can skip it. But things have changed with the introduction of AMP.
Accelerated Mobile Pages (AMP) is a newer mobile SEO factor. If you are using an AMP plugin on your site for mobile visitors, you can create a sitemap that lists only the AMP pages. This improves indexing of the AMP pages and speeds up their discovery.
As of now there is no free WordPress plugin that caters to AMP pages. You can use my Sitemap plugin for WordPress.
Sitemaps are very important for every website. They are the first step towards SEO. If you are creating a website, this is perhaps the one thing you should have at the top of your list. Websites using well-formed sitemaps that match their content get indexed faster. They also tend to rank better than other websites.
You can publish content and see it indexed within hours or days, even if there is no external link to your site. In fact, sites with few or no external links benefit the most. Large, well-linked websites also use sitemaps, although they depend on them less because search engines can reach their pages by following links.
So start using one for your website if you have not so far. And if you are using only the web format, consider adding an HTML sitemap and an Image or Video sitemap as your website needs. Using all the types your site needs boosts your content indexing and, in turn, your ranking in search engines.
Feel free to share your thoughts and questions if any.