Platform
- Apache (http://httpd.apache.org/). Apache is the most popular web server in use today because it is free, runs everywhere, performs well, and can be configured to handle most needs.
- Linux (http://www.linux.org/). Linux is a very popular OS in data centers because it is free, runs on a wide range of hardware, has tons of available software, performs well, is easily virtualized, and is flexible. All good attributes when you are starting a web site and hoping to grow with demand. Some popular versions of Linux used in data centers are CentOS, Red Hat, and Ubuntu.
- MySQL (http://www.mysql.com/)
- lighttpd (http://highscalability.com/lighttpd) for video instead of Apache

What's Inside?
The Stats
Recipe for handling rapid growth
while (true)
{
    identify_and_fix_bottlenecks();
    drink();
    sleep();
    notice_new_bottleneck();
}
This loop runs many times a day.
Web Servers
Video Serving
- More disks serving content which means more speed.
- Headroom. If a machine goes down others can take over.
- There are online backups.
- Apache had too much overhead.
- Uses epoll (http://linux.die.net/man/4/epoll) to wait on multiple fds.
- Switched from a single process to a multiple process configuration to handle more connections.
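The single-process, epoll-based pattern described above can be sketched in Python with the stdlib selectors module (DefaultSelector picks epoll on Linux). The handlers and the placeholder HTTP response here are illustrative assumptions, not lighttpd's actual code:

```python
import selectors
import socket

# One selector (epoll on Linux) lets a single process wait on many file
# descriptors at once, instead of one thread/process per connection --
# the overhead that made Apache a poor fit for video serving.
sel = selectors.DefaultSelector()

def accept(server):
    conn, _addr = server.accept()
    conn.setblocking(False)
    sel.register(conn, selectors.EVENT_READ, read)

def read(conn):
    data = conn.recv(4096)
    if data:
        # Placeholder response; a real server would stream the video file.
        conn.sendall(b"HTTP/1.0 200 OK\r\n\r\nhello")
    sel.unregister(conn)
    conn.close()

server = socket.socket()
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
server.bind(("127.0.0.1", 0))   # ephemeral port, for the sketch only
server.listen(128)
server.setblocking(False)
sel.register(server, selectors.EVENT_READ, accept)

def serve_once():
    # One pass of the event loop; a real server loops forever.
    for key, _mask in sel.select(timeout=1):
        key.data(key.fileobj)
```

A production event loop would also handle partial writes and timeouts; the point is that one process multiplexes thousands of sockets through a single epoll descriptor.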
CDN is a system of computers networked together across the Internet that cooperate transparently to deliver content (especially large media content) to end users. The first web content based CDN's were Sandpiper and Skycache followed by Akamai and Digital Island. The first video based CDN was iBEAM Broadcasting.
CDN nodes are deployed in multiple locations, often over multiple backbones. These nodes cooperate with each other to satisfy requests for content by end users, transparently moving content behind the scenes to optimize the delivery process. Optimization can take the form of reducing bandwidth costs, improving end-user performance, or both.
The number of nodes and servers making up a CDN varies, depending on the architecture, some reaching thousands of nodes with tens of thousands of servers.
- CDN (content delivery network, http://en.wikipedia.org/wiki/Content_Delivery_Network): CDNs replicate content in multiple places. There's a better chance of content being closer to the user, with fewer hops, and content will run over a more friendly network.
- CDN machines mostly serve out of memory because the content is so popular there's little thrashing of content into and out of memory.
- Less popular content uses YouTube servers in various colo (http://en.wikipedia.org/wiki/Colocation) sites.
- There's a long tail effect. A video may have a few plays, but lots of videos are being played. Random disk blocks are being accessed.
- Caching doesn't do a lot of good in this scenario, so spending money on more cache may not make sense. This is a very interesting point. If you have a long tail product caching won't always be your performance savior.
- Tune RAID (http://en.wikipedia.org/wiki/RAID) controllers and pay attention to other lower-level issues to help.
- Tune memory on each machine so there's not too much and not too little.
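The long-tail point about caching can be illustrated with a toy simulation (the catalog size, cache size, and 1/rank popularity weights below are assumptions, not YouTube numbers): the same small LRU cache that absorbs nearly every request in a head-heavy workload hits far less often on a long-tail one.

```python
import random
from collections import OrderedDict

def hit_rate(requests, cache_size):
    """Fraction of requests served from a simple LRU cache."""
    cache, hits = OrderedDict(), 0
    for video in requests:
        if video in cache:
            hits += 1
            cache.move_to_end(video)          # mark as recently used
        else:
            cache[video] = True
            if len(cache) > cache_size:
                cache.popitem(last=False)     # evict least recently used
    return hits / len(requests)

rng = random.Random(42)
videos = list(range(100_000))

# Head-heavy workload: most requests go to a handful of popular videos.
popular = videos[:100]
head = [rng.choice(popular) for _ in range(20_000)]

# Long-tail workload: popularity ~ 1/rank, so many requests land on
# rarely-watched videos scattered across the catalog.
weights = [1 / (rank + 1) for rank in range(len(videos))]
tail = rng.choices(videos, weights=weights, k=20_000)

head_rate = hit_rate(head, cache_size=1_000)
tail_rate = hit_rate(tail, cache_size=1_000)
```

The head workload hits almost every time, while the long-tail workload's hit rate is noticeably lower no matter how recently-used the cache contents are, which is why spending more on cache bought little for less popular videos.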
Serving Video Key Points
Serving Thumbnails
- Lots of disk seeks and problems with inode caches and page caches at OS level.
- Ran into the per-directory file limit, particularly with Ext3. Moved to a more hierarchical structure. Recent improvements in the 2.6 kernel may improve Ext3 large-directory handling by up to 100 times, yet storing lots of files in a file system is still not a good idea.
- A high number of requests/sec, as web pages can display 60 thumbnails per page.
- Under such high loads Apache performed badly.
- Used squid (reverse proxy) in front of Apache. This worked for a while, but as load increased performance eventually decreased. Went from 300 requests/second to 20.
- Tried using lighttpd, but with a single thread it stalled. Ran into problems with multi-process mode because each process would keep a separate cache.
- With so many images setting up a new machine took over 24 hours.
- Rebooting a machine took 6-10 hours for the cache to warm up enough to avoid going to disk.
- Started using Google's BigTable (http://labs.google.com/papers/bigtable.html), a distributed data store:
- Avoids the small file problem because it clumps files together.
- Fast, fault tolerant. Assumes it's working on an unreliable network.
- Lower latency because it uses a distributed multilevel cache. This cache works across different colocation sites.
- For more information on BigTable take a look at Google Architecture, GoogleTalk Architecture, and BigTable.
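The "more hierarchical structure" move mentioned above can be sketched as a hashed directory layout; the two-level fan-out and path scheme here are illustrative assumptions, not YouTube's actual layout:

```python
import hashlib
from pathlib import Path

def thumbnail_path(root, video_id):
    """Spread files across a two-level directory tree keyed by a hash,
    so no single directory holds more than a bounded slice of all files
    (sidestepping Ext3-style per-directory limits and giant-directory
    scans)."""
    digest = hashlib.md5(video_id.encode()).hexdigest()
    # e.g. root/ab/cd/<video_id>.jpg -> 256 * 256 = 65,536 buckets
    return Path(root) / digest[:2] / digest[2:4] / f"{video_id}.jpg"

p = thumbnail_path("/thumbs", "dQw4w9WgXcQ")
```

Because the hash is deterministic, any front-end can compute the path without a lookup table, and the buckets stay evenly filled as the file count grows.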
Databases
- Use MySQL to store meta data like users, tags, and descriptions.
- Served data off a monolithic RAID 10 volume with 10 disks.
- Living off credit cards so they leased hardware. When they needed more hardware to handle load it took a few days to order and get delivered.
- They went through a common evolution: single server, went to a single master with multiple read slaves, then partitioned the database, and then settled on a sharding approach.
- Suffered from replica lag. The master is multi-threaded and runs on a large machine so it can handle a lot of work. Slaves are single threaded and usually run on lesser machines and replication is asynchronous, so the slaves can lag significantly behind the master.
- Updates cause cache misses, which go to disk, where slow I/O causes slow replication.
- With a replicating architecture you need to spend a lot of money for incremental bits of write performance.
- One of their solutions was to prioritize traffic by splitting the data into two clusters: a video watch pool and a general cluster. The idea is that people primarily want to watch video, so that function should get the most resources. The social networking features of YouTube are less important, so they can be routed to a less capable cluster.
- Went to database partitioning.
- Split into shards with users assigned to different shards.
- Spreads writes and reads.
- Much better cache locality which means less IO.
- Resulted in a 30% hardware reduction.
- Reduced replica lag to 0.
- Can now scale database almost arbitrarily.
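The user-based sharding described above can be sketched as follows; the shard count, routing function, and in-memory "shards" are assumptions for illustration, since the article doesn't detail YouTube's actual scheme:

```python
import hashlib

NUM_SHARDS = 4  # illustrative; real deployments size this for headroom

def shard_for(user_id):
    """Deterministically map a user to a shard. Hashing (rather than
    user_id % NUM_SHARDS directly) keeps the spread even if IDs are
    assigned in sequential ranges."""
    h = int(hashlib.sha1(str(user_id).encode()).hexdigest(), 16)
    return h % NUM_SHARDS

# Each shard stands in for a separate database server. All of a user's
# rows live on one shard, so single-user queries never fan out, writes
# and reads spread across machines, and each shard's cache stays hot
# for its own users (the cache-locality win noted above).
shards = {n: {} for n in range(NUM_SHARDS)}

def save_video_meta(user_id, video_id, meta):
    shards[shard_for(user_id)].setdefault(user_id, {})[video_id] = meta

def load_video_meta(user_id, video_id):
    return shards[shard_for(user_id)][user_id][video_id]

save_video_meta(42, "v1", {"title": "demo", "tags": ["cat"]})
```

Adding capacity means adding shards and rebalancing which users map where, which is why this layout scales almost arbitrarily compared with a single replicated master.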
Data Center Strategy (http://www.possibility.com/epowiki/Wiki.jsp?page=DatacenterSystemChoiceAnalysis)
- Looks at different metrics to know who is closest.
Lessons Learned
- Shard. Sharding helps to isolate and constrain storage, CPU, memory, and IO. It's not just about getting more write performance. Some advantages are:
  * faster backup
  * faster recovery
  * data can fit into memory
  * data is easier to manage
  * more write bandwidth, because you aren't writing to a single master; in a single-master architecture write bandwidth is throttled
  This technique is used by many large websites, including eBay, Yahoo, LiveJournal, and Flickr.
- Software: DB, caching
- OS: disk I/O
- Hardware: memory, RAID