HTML URL Encoding


A URL is another term for a web address.

A URL can be made up of words (like propertutorials.com) or of a numeric IP address (for example, 192.68.20.50).

Most people type names rather than numbers when browsing, because names are easier to remember.


URL - Uniform Resource Locator

Web browsers use URLs to ask for pages from web servers.

A Uniform Resource Locator (URL) identifies a document or other resource on the Internet.

A web address like https://www.propertutorials.com/html follows these syntax rules:

scheme://prefix.domain:port/path/filename

Explanation:

  • scheme - defines the type of Internet service (most common are http and https)
  • prefix - defines a domain prefix (www by convention for web sites; it is optional)
  • domain - defines the Internet domain name (like propertutorials.com)
  • port - defines the port number at the host (default is 80 for http and 443 for https)
  • path - defines a path at the server (if omitted: the root directory of the site)
  • filename - defines the name of a document or resource
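
The parts listed above can be pulled out of a URL programmatically. The following is a minimal sketch using Python's standard urllib.parse module. The example URL, with its explicit port and file name, is made up for illustration, and splitting the host at the first dot is a simplification that only works for simple host names like this one.

```python
from urllib.parse import urlsplit

# Hypothetical example URL; the port and file name are assumptions for illustration.
url = "https://www.propertutorials.com:443/html/index.html"

parts = urlsplit(url)

scheme = parts.scheme      # the type of Internet service
host = parts.hostname      # prefix and domain together
port = parts.port          # port number as an int, or None if not given
path = parts.path          # path plus file name

# Naive split of the host into "prefix" and "domain" at the first dot.
prefix, _, domain = host.partition(".")

# Split the path into its directory part and the file name at the last slash.
directory, _, filename = path.rpartition("/")

print(scheme, prefix, domain, port, directory, filename)
```

When the URL does not include a port, parts.port is None and the client falls back to the default port for the scheme.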

Common URL Schemes

The table below lists some common schemes:

Scheme  Short for                           Used for
http    HyperText Transfer Protocol         Common web pages. Not encrypted
https   HyperText Transfer Protocol Secure  Secure web pages. Encrypted
ftp     File Transfer Protocol              Downloading or uploading files
file    -                                   A file on your computer

URL Encoding

URLs can only be sent over the Internet using the ASCII character set. If a URL contains characters outside this set, those characters must be converted before the URL is sent.

URL encoding converts non-ASCII characters into a form that can be sent over the Internet. ASCII characters that are not allowed in a URL, or that have a special meaning in it, are converted in the same way.

URL encoding replaces each byte of such a character with a "%" followed by two hexadecimal digits.
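
To make the "%" plus hexadecimal rule concrete, here is a hand-written sketch in Python. The function name percent_encode is hypothetical, and the set of characters left unencoded (letters, digits, and - . _ ~) is an assumption chosen for this illustration; real code would normally call urllib.parse.quote instead.

```python
def percent_encode(text, encoding="utf-8"):
    """Percent-encode every byte that is not a letter, digit, or one of - . _ ~"""
    unreserved = (
        b"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        b"abcdefghijklmnopqrstuvwxyz"
        b"0123456789-._~"
    )
    out = []
    # Work on bytes: one character may become several bytes, each encoded separately.
    for byte in text.encode(encoding):
        if byte in unreserved:
            out.append(chr(byte))
        else:
            out.append("%{:02X}".format(byte))
    return "".join(out)


print(percent_encode("£"))      # %C2%A3
print(percent_encode("a b"))    # a%20b
```

The pound sign becomes two escape sequences because UTF-8 stores it as two bytes.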

URLs cannot contain spaces. URL encoding normally replaces a space with %20, or with a plus (+) sign in form data and query strings.
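
The two ways of encoding a space can be compared with Python's standard urllib.parse functions. This is a short sketch; the sample strings and the query parameter name q are made up for illustration.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus, urlencode

text = "hello world"

as_percent = quote(text)        # space becomes %20
as_plus = quote_plus(text)      # space becomes +

# A literal plus sign must itself be encoded so it is not read back as a space.
literal_plus = quote_plus("a+b")

# urlencode builds a query string and uses the plus form for spaces.
query = urlencode({"q": text})

print(as_percent, as_plus, literal_plus, query)
```

Decoding has to match the form that was used: unquote_plus turns "+" back into a space, while unquote leaves a "+" untouched.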


ASCII Encoding Examples

Your browser encodes input according to the character set used in your page.

The default character set in HTML5 is UTF-8.

Character  From Windows-1252  From UTF-8
€          %80                %E2%82%AC
£          %A3                %C2%A3
©          %A9                %C2%A9
®          %AE                %C2%AE
À          %C0                %C3%80
Á          %C1                %C3%81
Â          %C2                %C3%82
Ã          %C3                %C3%83
Ä          %C4                %C3%84
Å          %C5                %C3%85
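
The difference between the two columns comes from the byte values each character set assigns to a character. As a rough sketch, Python's urllib.parse.quote accepts an encoding argument, so a few rows of the table can be reproduced; "cp1252" is Python's codec name for Windows-1252, and the selection of characters is arbitrary.

```python
from urllib.parse import quote

# A few characters taken from the table above.
characters = ["€", "£", "À", "Å"]

rows = []
for ch in characters:
    from_cp1252 = quote(ch, encoding="cp1252")   # one byte per character
    from_utf8 = quote(ch, encoding="utf-8")      # two or three bytes per character
    rows.append((ch, from_cp1252, from_utf8))

for ch, from_cp1252, from_utf8 in rows:
    print(ch, from_cp1252, from_utf8)
```

Because the same character produces different escape sequences, the receiving side has to decode with the same character set that was used for encoding.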