This will be a service like TinyURL, bit.ly, goo.gl, qlink.me, etc. It will create short aliases that redirect to long URLs.
Difficulty: Easy
- Why do we need URL shortening?
- Requirements and Goals of the System
- Capacity Estimation and Constraints
- System APIs
- Database Design
- Basic System Design and Algorithm
- Data Partitioning and Replication
- Cache
- Load Balancer (LB)
- Purging and DB Cleanup
- Telemetry
- Security and Permissions
URL shortening creates "short links" that redirect to longer URLs.
These short links are effectively aliases and save a ton of space when posting/tweeting/etc. Users are also less likely to type in the wrong characters when using shorter links.
They are also useful to track clicks/user engagement with content.
🚧 ALWAYS CLARIFY REQUIREMENTS AT THE START 🚧
Our system has the following requirements:
Functional Requirements
- Given a URL, our system should generate a unique and shorter version of it. This is called a short link. It should be easily copied and pasted into other applications.
- When users access this link, it will redirect them to the original link.
- Users should be able to optionally choose a custom short link.
- Links should expire after a default timespan, but users should be able to override and specify an expiration time.
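The functional requirements above can be sketched as a tiny in-memory model. This is only an illustration under stated assumptions: the class name `InMemoryShortener`, the 7-character key length, and the one-year default lifetime are hypothetical choices, not values from this design.

```python
import secrets
import string
import time

ALPHABET = string.ascii_letters + string.digits
DEFAULT_TTL_SECONDS = 365 * 24 * 3600  # assumed default lifetime (hypothetical)


class InMemoryShortener:
    """Toy model of the functional requirements; not a production design."""

    def __init__(self):
        self._links = {}  # short key -> (long_url, expires_at)

    def create(self, long_url, custom_key=None, ttl_seconds=DEFAULT_TTL_SECONDS, now=None):
        """Store long_url under a random or user-chosen key and return the key."""
        now = time.time() if now is None else now
        if custom_key is not None:
            # Custom short links must not collide with a live link.
            if self._is_live(custom_key, now):
                raise ValueError("custom key already in use")
            key = custom_key
        else:
            # Draw random keys until one is free.
            key = self._random_key()
            while self._is_live(key, now):
                key = self._random_key()
        self._links[key] = (long_url, now + ttl_seconds)
        return key

    def resolve(self, key, now=None):
        """Return the original URL, or None if the key is unknown or expired."""
        now = time.time() if now is None else now
        if not self._is_live(key, now):
            return None
        return self._links[key][0]

    def _is_live(self, key, now):
        entry = self._links.get(key)
        return entry is not None and entry[1] > now

    @staticmethod
    def _random_key(length=7):
        # secrets draws from a cryptographically strong random source.
        return "".join(secrets.choice(ALPHABET) for _ in range(length))
```

A caller would use `create(long_url)` for a generated link, `create(long_url, custom_key="mylink")` for a custom one, and pass `ttl_seconds` to override the default expiration. The `now` parameter exists only to make expiry easy to exercise in tests.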
Non-Functional Requirements
- The system should be highly available. If the service goes down, every redirect fails.
- URL redirection should happen in real time with minimal latency.
- Shortened links should not be guessable.
Extended Requirements
- Analytics would be nice (e.g. how many times a redirect has happened)
- Our service should be accessible to other services through a REST API
Our system is read heavy since users will create a shortened URL once and it could be read hundreds or thousands of times.
We can assume a read-to-write ratio of 100:1.
If we assume 500M new URLs per month, we can expect 50B redirects per month at that ratio.
We will have about 193 new URLs created per second:
500000000 / (30 * 24 * 3600) = ~193/s
Given the 100:1 read-to-write ratio assumed above, we estimate about 19,300 redirects per second:
193 * 100 = 19,300/s
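The traffic arithmetic above can be reproduced with a short script. This is a sketch using only the assumptions already stated (500M new URLs per month, a 30-day month, a 100:1 read-to-write ratio); the variable names are illustrative.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # 30-day month, as in the estimate above
NEW_URLS_PER_MONTH = 500_000_000
READ_WRITE_RATIO = 100

writes_per_second = NEW_URLS_PER_MONTH / SECONDS_PER_MONTH
reads_per_second = writes_per_second * READ_WRITE_RATIO
redirects_per_month = NEW_URLS_PER_MONTH * READ_WRITE_RATIO

print(round(writes_per_second))  # 193
print(round(reads_per_second))   # 19290
print(redirects_per_month)       # 50000000000
```

Without intermediate rounding the read rate comes out near 19,290/s; the 19,300/s figure results from rounding the write rate to 193 first. Either is fine for capacity planning at this level of precision.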
Our system will also have to store data. Let's assume that we store every URL shortening request for 5 years. Since we expect 500M new URLs every month, the total number of objects to store will be 30 billion:
500M * 5 years * 12 months = 30 billion
If we assume that each stored object will be approximately 500 bytes, we'll need 15 TB of total storage across 5 years:
30 billion * 500 bytes = 15 TB
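The storage estimate can be checked the same way. This sketch assumes decimal units (1 TB = 10^12 bytes) and uses only the figures given above; the variable names are illustrative.

```python
NEW_URLS_PER_MONTH = 500_000_000
RETENTION_YEARS = 5
BYTES_PER_OBJECT = 500  # approximate size of one stored object, as assumed above

total_objects = NEW_URLS_PER_MONTH * 12 * RETENTION_YEARS
total_bytes = total_objects * BYTES_PER_OBJECT
total_terabytes = total_bytes / 10**12  # decimal terabytes

print(total_objects)    # 30000000000
print(total_terabytes)  # 15.0
```

In binary units (1 TiB = 2^40 bytes) the same total is roughly 13.6 TiB, so the choice of unit matters little for a rough estimate like this.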
For bandwidth, we expect about 193 new URLs as write requests per second, with total incoming data of