Web Development Core Concepts - What is a Clean URL?


Welcome back to web technology core concepts at Learning Journal. In the earlier video, we hosted our website on a Linux machine in the cloud environment. We can access our website using the following address.
http://40.114.67.142/myfirstpage.html
However, this website address looks unusual. A standard website will have a different type of web address. For example, the Learning Journal website is available at www.learningjournal.guru. Similarly, YouTube is available at www.youtube.com. We used Azure cloud, and that was available at azure.microsoft.com.

What is a Domain Name?

All of those website addresses have one thing in common. They are all meaningful textual names. None of them uses an IP address like our first website. There is a specific term for such meaningful textual names.
We call it a domain name.
There are two main reasons for having a meaningful domain name for the website address.

  1. They are easy to remember.
  2. They represent your business's brand name.

You can easily remember youtube.com or learningjournal.guru. However, memorizing an IP address is quite difficult. And a website name can become a brand name. For example, YouTube is a multi-billion-dollar brand name that Google owns.

What is a Clean URL?

There is another essential characteristic of a professional website URL. It does not include a page name ending in .html. I mean, our website address includes myfirstpage.html. However, none of the other websites that I showed here have such page names. Let me show you some of the other pages of azure.microsoft.com.
Click on any of the links, and they take you to a different page. However, they do not have a dot HTML in any of the page URLs. So, the first thing that we want is to get rid of the page name from our current website address.
I mean, I want my website address to look like this.
http://40.114.67.142/
We call it a Clean URL, or a Pretty URL. Some people also refer to it as a search-engine-friendly URL. The clean URL is a standard approach followed by most professional websites. Such URLs are easier to remember. You do not have to type a page name ending in .html, and they are a little friendlier for search engines. And that's why I want to apply the clean URL to our example website. Great! The next question is this.


How to apply a clean URL?

There are several ways to do that. However, for now, we will use the most straightforward method. Rename the myfirstpage.html to index.html. You can use the following command to do that.
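
On the server, this rename is typically a single `mv` in the website's root directory. The same step is sketched below in Python. The function name `rename_to_index` is hypothetical, and the web root path is left as a parameter because the exact directory depends on how the server was set up.

```python
from pathlib import Path


def rename_to_index(web_root: Path, page_name: str = "myfirstpage.html") -> Path:
    """Rename page_name inside web_root to index.html and return the new path."""
    source = web_root / page_name
    target = web_root / "index.html"
    # Refuse to silently replace an index.html that is already there.
    if target.exists():
        raise FileExistsError(f"{target} already exists; refusing to overwrite it")
    source.rename(target)
    return target
```

For example, `rename_to_index(Path("/var/www/html"))` would rename the page, assuming `/var/www/html` is the web root on your machine.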

Good! Let's test it. Type the following address.
http://40.114.67.142/
It worked. It was that simple. The web server does the magic here. Let's try to understand it.

How does a URL work?

A typical web page URL is composed of four parts.

  1. Scheme or the protocol
  2. Host and the port
  3. Path
  4. Query String

The first part is the protocol. Look at any web page URL, and it will begin with either http or https. These are the two protocols. I will cover more details about them in a separate video. For now, let's just treat them as two different protocols.
The second part is the host and the port number. The host is either an IP address or a domain name. I already talked about domain names, but we will come back to them in a minute. The port number is optional because each protocol has a default port: HTTPS uses port 443 and HTTP uses port 80. So, we usually leave the port number out of a typical web page URL. However, when a website is configured on a different port, you must include the port number in the URL. For now, let us assume that our websites use the default port, so we can omit it.
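
To see these parts in practice, here is a minimal sketch that splits a URL with Python's standard `urllib.parse` module and fills in the default port when the URL omits it. The helper name `describe_url` and the `DEFAULT_PORTS` table are assumptions made for this illustration.

```python
from urllib.parse import urlsplit

# Default ports for the two protocols discussed above.
DEFAULT_PORTS = {"http": 80, "https": 443}


def describe_url(url: str) -> dict:
    """Split a URL into scheme, host, port, path, and query string."""
    parts = urlsplit(url)
    # parts.port is None when the URL does not spell out a port,
    # so fall back to the protocol's default.
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(parts.scheme)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": port,
        "path": parts.path,
        "query": parts.query,
    }
```

Calling `describe_url("http://40.114.67.142/myfirstpage.html")` reports the scheme `http`, the host `40.114.67.142`, the implied port `80`, and the path `/myfirstpage.html`, with an empty query string.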
The next item is the path. The path itself has two parts: a directory name and a file name. Let’s talk about the directory name. I already explained the website’s root directory in an earlier video. A single slash (/) in the URL represents the root directory of the website. If you have subdirectories in the website’s root, you can specify the hierarchical path separated by slashes. For example, the following URL refers to a subdirectory structure inside the website’s root.
https://www.learningjournal.guru/courses/modern-web-development/
Now let’s talk about the file name. When a visitor types a URL up to the directory name and leaves out the file name, the web server assumes that the visitor wants the default file in that directory. So, the web server searches for a default file in the given directory. The default file name is usually index.html, so the server returns index.html to the browser.
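
The lookup described above can be modeled in a few lines. This is a simplified sketch, not how any particular web server implements it. The function name `resolve_request_path` and its `default_file` parameter are hypothetical, and a real server also handles details such as security checks and configuration.

```python
from pathlib import Path


def resolve_request_path(web_root: Path, url_path: str,
                         default_file: str = "index.html") -> Path:
    """Map the path part of a URL to the file the server would look for."""
    # Strip the leading slash so the URL path is taken relative to the web root.
    candidate = web_root / url_path.lstrip("/")
    # A path that ends in a slash (or names a directory) has no file name,
    # so the default file inside that directory is used.
    if url_path.endswith("/") or candidate.is_dir():
        candidate = candidate / default_file
    return candidate
```

With a web root of `/srv/site` (an assumed location), the path `/` maps to `/srv/site/index.html`, while `/myfirstpage.html` maps to that file directly.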
That's the trick. We applied that trick to create a clean URL for the home page of our website. You can use the same method to create a clean URL for other pages of your website.
Let's try that as well. Assume you want to add a new page to your website. You want to place some information about your company on that new page. I have created this about.html page for the same purpose.
Where do I place this page on my web server? If I put it in the root directory of my web server, I can access it using the following URL.
http://40.114.67.142/about.html
However, this is not a clean URL. How can you make it clean? Think about the trick that I just showed you. There are two steps to do that.

  1. Create a directory named about
  2. Move about.html to that directory and rename it to index.html
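
The two steps above can be sketched in Python as follows. The function name `publish_as_clean_url` is hypothetical, and the web root is passed in as a parameter because its location depends on your server setup.

```python
import shutil
from pathlib import Path


def publish_as_clean_url(web_root: Path, page_file: str) -> Path:
    """Move <name>.html in web_root to <name>/index.html and return the new path."""
    source = web_root / page_file
    # Step 1: create a directory named after the page (about.html -> about/).
    page_dir = web_root / source.stem
    page_dir.mkdir(exist_ok=True)
    # Step 2: move the page into that directory under the default file name.
    target = page_dir / "index.html"
    shutil.move(str(source), str(target))
    return target
```

For example, `publish_as_clean_url(Path("/var/www/html"), "about.html")` would leave the page at `about/index.html` inside the web root, assuming `/var/www/html` is where your site lives.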

That's it. Now you can access the about page using the following URL.
http://40.114.67.142/about/
And this one is a clean URL. Amazing! Isn't it? I will talk about the query string in a later video.
We accomplished the first part of giving a search engine friendly URL to our website. The next thing that we want to do is to replace this IP address with a meaningful domain name. We will cover that part in a separate video.
Thank you for watching Learning Journal. Keep learning and keep growing.

