HTML URL Encoding
A URL is another term for a web address.
A URL can be made up of words (like propertutorials.com) or of a numeric Internet Protocol (IP) address (for example, 192.68.20.50).
Most people type names rather than numbers when browsing, because names are easier to remember.
URL - Uniform Resource Locator
Web browsers use URLs to ask for pages from web servers.
A Uniform Resource Locator (URL) is used to address a document or other data on the web.
A web address like https://www.propertutorials.com/html follows these syntax rules:
scheme://prefix.domain:port/path/filename
Explanation:
- scheme - shows the type of Internet service (most common is http or https)
- prefix - shows a domain prefix (default for http is www)
- domain - shows the Internet domain name (like propertutorials.com)
- port - shows the port number at the host (default for http is 80)
- path - shows a path at the server (If omitted: the root directory of the site)
- filename - shows the name of a document or resource
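The parts listed above can be pulled out of a URL programmatically. The following is a minimal sketch using Python's standard `urllib.parse` module. The URL with an explicit port and a `default.html` filename is a hypothetical example, and the splits into prefix/domain and path/filename are naive illustrations, since the standard library does not make those distinctions itself.

```python
from urllib.parse import urlsplit

# Hypothetical URL (explicit port and filename added for illustration).
url = "https://www.propertutorials.com:443/html/default.html"
parts = urlsplit(url)

print(parts.scheme)    # https
print(parts.hostname)  # www.propertutorials.com
print(parts.port)      # 443
print(parts.path)      # /html/default.html

# Naive splits, assumed for this example only: everything before the
# first dot is treated as the prefix, and everything after the last
# slash is treated as the filename.
prefix, _, domain = parts.hostname.partition(".")
directory, _, filename = parts.path.rpartition("/")

print(prefix, domain)       # www propertutorials.com
print(directory, filename)  # /html default.html
```

If the port is omitted from the URL, `parts.port` is `None`, and the client falls back to the default port for the scheme.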
Common URL Schemes
The table below lists some common schemes:
Scheme | Short for | Used for |
---|---|---|
http | HyperText Transfer Protocol | Common web pages. Not encrypted |
https | HyperText Transfer Protocol Secure | Secure web pages. Encrypted |
ftp | File Transfer Protocol | Downloading or uploading files |
file | | A file on your computer |
URL Encoding
URLs can only be sent over the Internet using the ASCII character set. If a URL contains characters outside this set, the URL must be converted before it is sent.
URL encoding converts each non-ASCII character into a form that can be transmitted: a "%" followed by two hexadecimal digits for every byte of the character in the chosen character set.
URLs cannot contain spaces. URL encoding normally replaces a space with a plus (+) sign or with %20.
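As an illustration of the rules above, here is a short sketch using Python's standard `urllib.parse` functions. The sample string is made up for this example. Note that these functions also encode some ASCII characters that have a special meaning in URLs, such as `&`.

```python
from urllib.parse import quote, quote_plus, unquote

# Made-up sample text containing spaces, a reserved character and
# a non-ASCII character.
text = "hello world & café"

# quote() encodes a space as %20 (UTF-8 is the default character set).
encoded = quote(text)
print(encoded)       # hello%20world%20%26%20caf%C3%A9

# quote_plus() encodes a space as a plus sign, as used in form data.
form_encoded = quote_plus(text)
print(form_encoded)  # hello+world+%26+caf%C3%A9

# unquote() reverses percent-encoding.
print(unquote(encoded))  # hello world & café
```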
ASCII Encoding Examples
Your browser encodes input according to the character set used in your page.
The default character set in HTML5 is UTF-8.
Character | From Windows-1252 | From UTF-8 |
---|---|---|
€ | %80 | %E2%82%AC |
£ | %A3 | %C2%A3 |
© | %A9 | %C2%A9 |
® | %AE | %C2%AE |
À | %C0 | %C3%80 |
Á | %C1 | %C3%81 |
Â | %C2 | %C3%82 |
Ã | %C3 | %C3%83 |
Ä | %C4 | %C3%84 |
Å | %C5 | %C3%85 |
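The two columns of the table can be reproduced with a small sketch that percent-encodes the same character under both character sets. It assumes Python's `urllib.parse.quote` and the codec name `cp1252` for Windows-1252; the helper name `encode_both` is hypothetical.

```python
from urllib.parse import quote

def encode_both(ch: str) -> tuple[str, str]:
    """Return the percent-encoding of ch from Windows-1252 and from UTF-8."""
    return quote(ch, encoding="cp1252"), quote(ch, encoding="utf-8")

for ch in "€£©À":
    from_cp1252, from_utf8 = encode_both(ch)
    print(ch, from_cp1252, from_utf8)
# € %80 %E2%82%AC
# £ %A3 %C2%A3
# © %A9 %C2%A9
# À %C0 %C3%80
```

The Windows-1252 form uses one byte per character, while UTF-8 uses two or three bytes for these characters, which is why the encoded strings differ in length.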