What is URL Encoding (Percent Encoding)?
Learn about URL encoding - the process of converting special characters in URLs to a format that can be safely transmitted over the internet.
What is URL Encoding?
URL encoding, also called percent encoding, is a mechanism for encoding information in a Uniform Resource Identifier (URI) using only ASCII characters that are permitted in URLs. Special characters and non-ASCII characters are converted to a format starting with a percent sign (%) followed by hexadecimal digits. This ensures URLs can be safely transmitted across different systems and networks.
Why URL Encoding is Necessary
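URLs can only contain a limited set of ASCII characters, and some of those (such as ? and &) carry structural meaning. Any other character, or a reserved character used as plain data, must be encoded before it goes into a URL.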
Safe vs Unsafe Characters
Not all characters are safe to use directly in URLs:
Safe Characters (no encoding needed):
- Letters: A-Z, a-z
- Digits: 0-9
- Unreserved: - _ . ~
Reserved Characters (special meaning in URLs):
: / ? # [ ] @ ! $ & ' ( ) * + , ; =
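The two sets listed above can be checked with a short sketch using Python's urllib.parse.quote. The sample strings and variable names are illustrative assumptions. Passing safe="" overrides quote's default of leaving "/" unencoded.

```python
from urllib.parse import quote

# Sample of unreserved characters, and the full reserved set listed above
UNRESERVED = "ABCXYZabcxyz0189-_.~"
RESERVED = ":/?#[]@!$&'()*+,;="

# safe="" means no reserved character is exempted from encoding
unreserved_encoded = quote(UNRESERVED, safe="")
reserved_encoded = quote(RESERVED, safe="")

print(unreserved_encoded)  # unchanged: ABCXYZabcxyz0189-_.~
print(reserved_encoded)    # every reserved character becomes %XX
```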
Unsafe Characters (must be encoded):
- Space: %20 (or + in form-encoded query strings)
- Special characters: < > " { } | \ ^ `
- Non-ASCII: é → %C3%A9, 中 → %E4%B8%AD
Problems Without Encoding
Why we need URL encoding:
Without encoding:
Bad: http://example.com/search?q=hello world
Problem: Space breaks URL parsing
Bad: http://example.com/name?user=John&Jane
Problem: & is interpreted as parameter separator
Bad: http://example.com/path?url=http://other.com
Problem: : and / have special meaning
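The ambiguity in the second example above can be reproduced with a standard query-string parser. This is a minimal sketch using Python's urllib.parse, and the variable names are illustrative. keep_blank_values=True is set only so the stray key stays visible in the output.

```python
from urllib.parse import urlsplit, parse_qs, quote

raw = "http://example.com/name?user=John&Jane"
encoded = "http://example.com/name?user=" + quote("John&Jane", safe="")

# Unencoded: the parser sees two parameters, "user" and "Jane"
raw_params = parse_qs(urlsplit(raw).query, keep_blank_values=True)

# Encoded: the parser sees one parameter whose value contains "&"
encoded_params = parse_qs(urlsplit(encoded).query, keep_blank_values=True)

print(raw_params)      # {'user': ['John'], 'Jane': ['']}
print(encoded_params)  # {'user': ['John&Jane']}
```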
With encoding:
Good: http://example.com/search?q=hello%20world
Good: http://example.com/name?user=John%26Jane
Good: http://example.com/path?url=http%3A%2F%2Fother.com
How URL Encoding Works
The encoding process converts characters to percent-encoded format:
Encoding Process
Characters are converted to hexadecimal byte values prefixed with %.
Step-by-step encoding:
1. Take character: space " "
2. Get ASCII/UTF-8 code: 32 (decimal) = 0x20 (hex)
3. Add % prefix: %20
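The three steps above can be sketched in a few lines of Python. The helper name percent_encode_char is hypothetical. It encodes every character it is given, including unreserved ones, so it illustrates the mechanism rather than replacing a library encoder.

```python
def percent_encode_char(ch: str) -> str:
    """Percent-encode one character from its UTF-8 bytes."""
    # Steps 1-2: take the character and get its byte values
    # Step 3: write each byte as % followed by two uppercase hex digits
    return "".join(f"%{byte:02X}" for byte in ch.encode("utf-8"))

print(percent_encode_char(" "))   # %20
print(percent_encode_char("é"))   # %C3%A9
print(percent_encode_char("中"))  # %E4%B8%AD
```

Multi-byte characters produce one %XX group per UTF-8 byte, which is why é becomes two groups and 中 becomes three.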
More examples:
! → ASCII 33 → %21
# → ASCII 35 → %23
$ → ASCII 36 → %24
& → ASCII 38 → %26
= → ASCII 61 → %3D
? → ASCII 63 → %3F
@ → ASCII 64 → %40
UTF-8 multi-byte:
é → UTF-8: C3 A9 → %C3%A9
中 → UTF-8: E4 B8 AD → %E4%B8%AD
Common Encoded Characters
Frequently encountered URL-encoded characters:
Character   Encoded    Usage
---------   --------   -----
Space       %20 or +   Spaces in query strings
!           %21        Exclamation mark
"           %22        Quotes in parameters
#           %23        Fragment identifier (encode when used as data)
$           %24        Dollar sign
&           %26        Query parameter separator
'           %27        Single quote
(           %28        Opening parenthesis
)           %29        Closing parenthesis
*           %2A        Asterisk
+           %2B        Plus sign
,           %2C        Comma
/           %2F        Forward slash (path separator)
:           %3A        Colon (scheme separator)
;           %3B        Semicolon
=           %3D        Equals (key-value separator)
?           %3F        Question mark (query string start)
@           %40        At sign (user info separator)
[           %5B        Opening bracket
]           %5D        Closing bracket
URL Components and Encoding
Different parts of a URL have different encoding rules:
URL Structure
Understanding which parts need encoding:
https://user:pass@example.com:8080/path/to/resource?key=value&foo=bar#section
Components:
- Scheme: https
- User info: user:pass
- Hostname: example.com
- Port: 8080
- Path: /path/to/resource
- Query string: key=value&foo=bar
- Fragment: section
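The same breakdown can be reproduced programmatically. The sketch below uses Python's urllib.parse.urlsplit on the example URL. The variable name parts is illustrative.

```python
from urllib.parse import urlsplit

parts = urlsplit(
    "https://user:pass@example.com:8080/path/to/resource?key=value&foo=bar#section"
)

print(parts.scheme)    # https
print(parts.username)  # user
print(parts.password)  # pass
print(parts.hostname)  # example.com
print(parts.port)      # 8080
print(parts.path)      # /path/to/resource
print(parts.query)     # key=value&foo=bar
print(parts.fragment)  # section
```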
Encoding rules:
- Scheme: No encoding
- User/Pass: Encode : @ /
- Hostname: Use Punycode for internationalized domains
- Port: No encoding (numbers only)
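- Path: Encode everything except unreserved chars (- _ . ~) within each segment; keep / literal only as the segment separator
- Query: Encode keys and values; everything except unreserved chars
- Fragment: Same rules as the query; browsers keep it client-side and do not send it to the server
Query String Encoding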
Special rules for query parameters:
Original:
http://example.com/search?q=hello world&category=news & events
Encoded:
http://example.com/search?q=hello%20world&category=news%20%26%20events
Key-Value pairs:
key1=value with spaces → key1=value%20with%20spaces
key2=hello&goodbye → key2=hello%26goodbye
key3=50% → key3=50%25
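The key-value pairs above can be encoded one piece at a time. This is a minimal sketch with Python's urllib.parse.quote, in which the pairs list and variable names are illustrative. It uses %20 for spaces.

```python
from urllib.parse import quote

pairs = [
    ("key1", "value with spaces"),
    ("key2", "hello&goodbye"),
    ("key3", "50%"),
]

# Encode each key and value on its own, then join with literal "=" and "&"
# so the separators keep their structural meaning.
query = "&".join(
    f"{quote(key, safe='')}={quote(value, safe='')}" for key, value in pairs
)

print(query)
# key1=value%20with%20spaces&key2=hello%26goodbye&key3=50%25
```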
Note: Space can be %20 or + in query strings
q=hello+world (application/x-www-form-urlencoded)
q=hello%20world (standard percent encoding)
URL Encoding in Different Languages
Examples of encoding URLs in popular programming languages:
JavaScript
JavaScript provides multiple encoding functions:
// encodeURI - for full URLs
const url = 'https://example.com/path?q=hello world';
console.log(encodeURI(url));
// https://example.com/path?q=hello%20world
// encodeURIComponent - for URL components (recommended)
const query = 'hello world & stuff';
const encoded = encodeURIComponent(query);
console.log(encoded);
// hello%20world%20%26%20stuff
// Build URL with parameters
const params = new URLSearchParams({
q: 'hello world',
category: 'news & events'
});
const fullUrl = `https://example.com/search?${params}`;
// https://example.com/search?q=hello+world&category=news+%26+events
// Decode
const decoded = decodeURIComponent('hello%20world');
console.log(decoded); // hello world
Python
Python's urllib provides URL encoding:
from urllib.parse import quote, quote_plus, urlencode, unquote
# quote - standard encoding
encoded = quote('hello world & stuff')
print(encoded) # hello%20world%20%26%20stuff
# quote_plus - uses + for spaces
encoded = quote_plus('hello world')
print(encoded) # hello+world
# urlencode - for query parameters
params = {'q': 'hello world', 'category': 'news & events'}
query_string = urlencode(params)
print(query_string) # q=hello+world&category=news+%26+events
# Decode
decoded = unquote('hello%20world')
print(decoded) # hello world
PHP
PHP encoding functions:
<?php
// urlencode - for query parameters (+ for space)
$encoded = urlencode('hello world & stuff');
echo $encoded; // hello+world+%26+stuff
// rawurlencode - RFC 3986 (%20 for space)
$encoded = rawurlencode('hello world & stuff');
echo $encoded; // hello%20world%20%26%20stuff
// http_build_query - for query strings
$params = ['q' => 'hello world', 'cat' => 'news & events'];
$query = http_build_query($params);
echo $query; // q=hello+world&cat=news+%26+events
// Decode
$decoded = urldecode('hello+world');
echo $decoded; // hello world
?>
Common Use Cases
When and where URL encoding is essential:
- Query Parameters: Encoding search terms, filters, and user input
- API Requests: Passing data in GET requests
- Form Submissions: application/x-www-form-urlencoded data
- OAuth/Authentication: Encoding redirect URLs and tokens
- File Paths: URLs containing filenames with special characters
- Internationalization: Encoding non-ASCII characters in URLs
- Social Sharing: Pre-filling share text with special characters
- Email Links: mailto: URLs with subject and body parameters
Practical Examples
Real-world URL encoding scenarios:
Search Query
Encoding search terms:
Original query: "how to use C++ & Python?"
Unencoded (wrong):
http://example.com/search?q=how to use C++ & Python?
Encoded (correct):
http://example.com/search?q=how%20to%20use%20C%2B%2B%20%26%20Python%3F
Alternate (+):
http://example.com/search?q=how+to+use+C%2B%2B+%26+Python%3F
Multiple Parameters
Building complex query strings:
Parameters:
- name: John Doe
- email: john+test@example.com
- message: Hello! How are you?
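One way to assemble these parameters is Python's urllib.parse.urlencode. This is a sketch, not the only approach, and the variable names are illustrative. Passing quote_via=quote produces %20 for spaces; the default (quote_plus) would produce + instead.

```python
from urllib.parse import urlencode, quote

params = {
    "name": "John Doe",
    "email": "john+test@example.com",
    "message": "Hello! How are you?",
}

# urlencode percent-encodes each key and value, then joins them with "&"
contact_url = "http://example.com/contact?" + urlencode(params, quote_via=quote)

print(contact_url)
```

The literal + in the email address has to become %2B, because form decoders read a bare + in a query string as a space.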
Encoded URL:
http://example.com/contact?
name=John%20Doe&
email=john%2Btest%40example.com&
message=Hello%21%20How%20are%20you%3F
Redirect URLs
Encoding URLs as parameters:
Redirect to: https://example.com/dashboard?tab=settings
Login URL:
https://auth.example.com/login?
redirect=https%3A%2F%2Fexample.com%2Fdashboard%3Ftab%3Dsettings
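A sketch of building this login URL in Python, and of recovering the redirect target on the receiving side. The variable names are illustrative. safe="" is what forces : / ? and = inside the nested URL to be encoded.

```python
from urllib.parse import quote, urlsplit, parse_qs

target = "https://example.com/dashboard?tab=settings"

# Encode the whole nested URL as a single parameter value
login_url = "https://auth.example.com/login?redirect=" + quote(target, safe="")
print(login_url)

# The receiving side gets the original URL back after parsing
recovered = parse_qs(urlsplit(login_url).query)["redirect"][0]
print(recovered)  # https://example.com/dashboard?tab=settings
```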
Note: The redirect URL itself is fully encoded!
Best Practices
- Always encode user input before adding to URLs
- Use built-in functions provided by your programming language
- Encode components separately - don't encode the entire URL
- Use encodeURIComponent in JavaScript, not encodeURI for parameters
- Be consistent with space encoding (+ or %20)
- Test with special characters during development
- Consider international characters - use UTF-8 encoding
- Validate after encoding - ensure URLs are well-formed
- Don't double-encode - encode raw data exactly once, at the point where the URL is built
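Two of these practices, encoding components separately and avoiding double encoding, can be illustrated with a short Python sketch. The filename, host, and variable names are hypothetical.

```python
from urllib.parse import quote, urlencode, urlunsplit

# Encode each component on its own, then assemble the URL
segment = "annual report 50%.pdf"            # hypothetical filename
path = "/files/" + quote(segment, safe="")   # one path segment
query = urlencode({"q": "news & events"})    # form-style query (+ for space)
url = urlunsplit(("https", "example.com", path, query, ""))

print(url)
# https://example.com/files/annual%20report%2050%25.pdf?q=news+%26+events

# Double encoding: the % from the first pass is encoded again as %25
once = quote("50%", safe="")   # 50%25
twice = quote(once, safe="")   # 50%2525
print(once, twice)
```

Encoding the assembled url a second time would corrupt it in the same way, turning every existing %XX sequence into %25XX.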
Common Mistakes
- Not encoding user input: Leads to broken URLs and security issues
- Encoding too much: Encoding the entire URL breaks it
- Double encoding: Encoding already-encoded data
- Wrong function: Using encodeURI instead of encodeURIComponent
- Forgetting fragments: Not encoding # in parameter values
- Mixing styles: Inconsistent + vs %20 for spaces
- Encoding when not needed: Over-encoding safe characters reduces readability
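The "forgetting fragments" mistake is easy to reproduce. In this Python sketch the URL and the tag parameter are hypothetical. An unencoded # ends the query string early, and everything after it is treated as the fragment.

```python
from urllib.parse import urlsplit, parse_qs, quote

# Unencoded: "#sharp" is parsed as the fragment, so the value is cut short
raw = urlsplit("http://example.com/search?tag=c#sharp")
print(raw.query)     # tag=c
print(raw.fragment)  # sharp

# Encoded: the full value survives as part of the query
fixed = urlsplit("http://example.com/search?tag=" + quote("c#sharp", safe=""))
fixed_params = parse_qs(fixed.query)
print(fixed.query)   # tag=c%23sharp
print(fixed_params)  # {'tag': ['c#sharp']}
```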
Security Considerations
URL encoding and security:
- Prevent Injection Attacks: Always encode user input to prevent URL manipulation
- Open Redirect Prevention: Validate and encode redirect URLs
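- XSS Prevention: Encoding user input keeps it from breaking out of a URL parameter, but a URL placed in HTML still needs HTML escaping
- Path Traversal: Encoding does not prevent ../ attacks, since %2E%2E%2F decodes back to ../; check paths after decoding
- SQL Injection: URL encoding is not a defense, because the server decodes values before use; use parameterized queries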
- Validate Decoded Data: Always validate after decoding user input
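Validating after decoding might look like the sketch below, which rejects relative paths that escape their base directory. The function name is hypothetical and the check is deliberately narrow: it assumes POSIX-style paths and a single round of decoding, and it does not handle backslashes or symlinks.

```python
import posixpath
from urllib.parse import unquote

def is_safe_relative_path(encoded_path: str) -> bool:
    """Return True if the decoded path stays inside its base directory."""
    decoded = unquote(encoded_path)            # decode first
    normalized = posixpath.normpath(decoded)   # then collapse "a/../b" forms
    return not (
        normalized.startswith("/")             # absolute path
        or normalized == ".."
        or normalized.startswith("../")        # climbs above the base
    )

print(is_safe_relative_path("images/logo.png"))        # True
print(is_safe_relative_path("%2E%2E%2Fetc%2Fpasswd"))  # False
print(is_safe_relative_path("a/%2E%2E/%2E%2E/b"))      # False
```

Checking the raw, still-encoded string for "../" would miss the second and third inputs, which is why the check runs on the decoded value.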
URL Encoding vs Other Encodings
How URL encoding differs from other encoding methods:
- vs Base64: Base64 encodes binary data; URL encoding handles special characters
- vs HTML Entities: HTML entities (&amp;) for HTML; URL encoding for URLs
- vs Unicode Escapes: \u0020 in strings; %20 in URLs
- vs Punycode: Punycode for domain names; percent encoding for path/query
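The differences are easiest to see side by side. This sketch runs the same short string through Python's standard-library encoders. The sample text and hostname are illustrative, and the built-in "idna" codec stands in for Punycode handling of domain names.

```python
import base64
import html
from urllib.parse import quote

text = "a&b c"

url_encoded = quote(text, safe="")   # for URLs
html_escaped = html.escape(text)     # for HTML
b64 = base64.b64encode(text.encode("utf-8")).decode("ascii")  # for binary data

print(url_encoded)   # a%26b%20c
print(html_escaped)  # a&amp;b c
print(b64)

# Domain names use Punycode rather than percent encoding
host = "münchen.example".encode("idna").decode("ascii")
print(host)          # xn--mnchen-3ya.example
```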
Conclusion
URL encoding is fundamental to web development, ensuring that URLs can safely contain any character while remaining compatible with internet standards. Understanding when and how to properly encode URLs prevents bugs, improves security, and ensures your web applications work correctly across different systems and browsers. Always use your programming language's built-in encoding functions and encode user input before including it in URLs.