FOWA Scaling The Lamp Stack Workshop

From dlieberman, 10 months ago

Slides from the workshop "Scaling the LAMP Stack" at the Future of ...

Slideshow transcript

Slide 1: Scaling the LAMP Stack Future of Web Apps October 5, 2007

Slide 2: Introductions

Slide 3: Specific Problems, Challenges and Issues

Slide 4: About this workshop • This is a broad topic • Theory and application • Real-world focus • Interactive (please!)

Slide 5: About web apps and scaling Some different ways of looking at the problem…

Slide 6: Things to think about • Multi-server: locking and concurrency • Running many: keep in mind what’s expensive, sloppy or risky • Code quality • The law of truly large numbers

Slide 7: Elements of Scaling • Split up different tasks • Use more hardware (intelligently) • Partition • Replicate • Cache • Optimize (code and hardware) • Identify and fix weaknesses • Manage

Slide 8: Tools and Components • Apache + PHP • MySQL • File System (local) • Networked File System • Load Balancers • memcached

Slide 9: Contemplating Scaling • Understand what your app does (and how much) • Identify the bottlenecks • Solve near-term problems • Design well, but don’t over-design

Slide 10: Web apps do lots of things Different operations have different scaling issues.

Slide 11: What does your app do? List the high level elements of what your application does. Separate out different functions that will have different scaling issues.

Slide 12: Common things that web apps do • Manage connections/protocols • Deliver static content • Manage sessions • Manage user data • Render dynamic pages • Access external APIs • Process media

Slide 13: Update the list of things your app does • Add anything you missed • Note which items you do in quantity

Slide 14: Easy vs. Difficult Scaling What happens when you add hardware? • Does it work? • Does more hardware = more performance?

Slide 15: Things that break when you scale • State that isn’t properly shared (especially sessions) • Updates/refreshes (caching and replication issues)

Slide 16: Things that don’t improve when you add more servers • Unpartitioned databases • Anything that locks/blocks • Inefficient code, especially big queries

Slide 17: Scaling Each Element • (do easy separations first)

Slide 18: Managing Connections/Protocols • No problem putting on multiple servers • Apache is good o Not too far away out of the box o Moderately tunable • Linux tuning o TCP stack (tune to handle unusual networking needs)

Slide 19: Key Apache Configuration Issues • MaxClients (and ServerLimit, ThreadLimit and ThreadsPerChild) • Avoid using PHP (or other) handler unnecessarily • Use the worker MPM • Maybe MaxRequestsPerChild

Slide 20: Delivering Static Content • Don’t process it unnecessarily o Either cache or use no Apache handlers o Caching can let you treat semi-static content as static • Multiple servers complicates updates, but is otherwise easy

Slide 21: General Discussion: Multi-server, state and sessions Rethinking state for multi-server environments • What is state? • Short-term state (sessions) • Long-term state (application data) • Managing state is usually the hardest part of scaling

Slide 22: What happens with state • Written (created/destroyed/changed) • Read • Stored

Slide 23: Requirements for managing state • Depend on what it is and how it is used • Perfect coherence • Performance of different operations

Slide 24: Ways of scaling state • Replication: make more copies • Partitioning: split up the work • Caching. Make different choices for different state/data elements.

Slide 25: About Load Balancers • What load balancers do o Spread load o Detect server failures o Stickiness/persistence o Acceleration (especially SSL) • Fancy features (including good stickiness) are expensive

Slide 26: Why sticky sessions are not usually good in practice • Servers fail • Corner cases exist

Slide 27: Managing Sessions

Slide 28: Where session data can be stored • Browser cookies • Web server temporary files (not scalable) • App server state • Database • Cache

Slide 29: PHP session management • Default (files) method is not multi-server friendly, and thus not scalable (unless sticky) • Can implement a different back-end easily

Slide 30: Designing a session back-end • Requirements • Data storage options o Cookies only (re-auth, let the browser take care of the logout – but less secure) o Full-featured involves a combination of cookies and database and cache (discussion of session details)
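
A minimal sketch of the "cookies plus database plus cache" option, not the workshop's own code. It assumes a signed session ID in the cookie, with session data in a shared store that every web server can reach. The two dicts stand in for a MySQL table and memcached. SECRET_KEY, SessionStore, sign and verify are hypothetical names chosen for illustration.

```python
import hashlib
import hmac
import secrets

SECRET_KEY = b"change-me"  # hypothetical signing key, shared by all web servers


def _mac(session_id: str) -> str:
    return hmac.new(SECRET_KEY, session_id.encode(), hashlib.sha256).hexdigest()


def sign(session_id: str) -> str:
    """Return 'id.signature' so a forged cookie is rejected before any lookup."""
    return f"{session_id}.{_mac(session_id)}"


def verify(cookie: str):
    """Return the session id if the signature matches, else None."""
    session_id, _, mac = cookie.partition(".")
    if hmac.compare_digest(mac.encode(), _mac(session_id).encode()):
        return session_id
    return None


class SessionStore:
    """Sessions live in a shared database; a shared cache absorbs the reads."""

    def __init__(self, database: dict, cache: dict):
        self.database = database  # stand-in for a database table
        self.cache = cache        # stand-in for a shared cache

    def create(self, data: dict) -> str:
        session_id = secrets.token_hex(16)
        self.database[session_id] = dict(data)
        return sign(session_id)

    def load(self, cookie: str):
        session_id = verify(cookie)
        if session_id is None:
            return None
        if session_id in self.cache:
            return self.cache[session_id]
        data = self.database.get(session_id)
        if data is not None:
            self.cache[session_id] = data  # fill the cache on a miss
        return data

    def destroy(self, cookie: str) -> None:
        session_id = verify(cookie)
        if session_id is not None:
            self.database.pop(session_id, None)
            self.cache.pop(session_id, None)
```

Because nothing is kept on the web server itself, any server behind the load balancer can call load() for any request, so stickiness is not required.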

Slide 31: Managing small user data • Databases are more efficient, flexible and sharable than small files • Frequently-read data should be cached

Slide 32: Managing large user data • NFS has flaws but is almost inevitable • Locking is usually not important, but can be • Performance degradation can be sudden

Slide 33: About NFS • NFS is usually transparent to your app • NFS is easy to implement and gives you multiple-write access • NFS locking is not to be trusted • The Linux NFS client is slow for writes and can do bad things under stress

Slide 34: User data and locking • Names based on hashes often mean no locking is needed • Databases do locking better than file systems do • Locking requires housekeeping
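
One way to read "names based on hashes": if a file's name is derived from its content, two writers can only collide when they are writing identical bytes, so no lock is needed. The sketch below assumes a file system where rename is atomic. store_upload is a hypothetical helper, not something from the slides.

```python
import hashlib
import os
import tempfile


def store_upload(root: str, content: bytes) -> str:
    """Write content under a name derived from its SHA-256 digest."""
    digest = hashlib.sha256(content).hexdigest()
    final_path = os.path.join(root, digest)
    if os.path.exists(final_path):
        return final_path  # identical content is already stored
    # Write to a private temporary file, then rename it into place, so
    # readers never see a half-written file under the final name.
    fd, tmp_path = tempfile.mkstemp(dir=root)
    with os.fdopen(fd, "wb") as handle:
        handle.write(content)
    os.replace(tmp_path, final_path)
    return final_path
```

If two servers race on the same upload, both renames succeed and the second simply replaces the first with the same bytes.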

Slide 35: Disk Storage Hardware • Disk performance can degrade suddenly • If the ratio of access to storage is low, then even slow disk is usually fine • Think about seek times and spindles

Slide 36: Rendering dynamic pages • Depends heavily on application specifics (query, search, process, etc.) • Watch out for: o Onerous queries (create and watch slow query log) o Locking of resources and/or incoherence if state changes o Heavy CPU and memory usage • Cache both elements and complete pages

Slide 37: Processing media • CPU intensive • May be memory intensive • Might be spiky • Might need its own server pool

Slide 38: Hardware • Start simple • Observe performance and respond accordingly • Get lots of memory

Slide 39: Hardware-driven behaviors • Sudden degradation because demand exceeds supply (usually relieved unhappily) • Get behind due to a spike, and recover • Not enough resources for normal optimization

Slide 40: Specific hardware issues • Not enough memory o Severe: paging/swapping o Mild: poor automatic caching; slowness due to fragmentation • Disk seek (very common) • CPU (but might really be memory) • Disk throughput (rare for web apps)

Slide 41: Hardware decisions • SCSI/SAS vs. SATA • Resource ratios • Combining vs. splitting functions • Big vs. little boxes

Slide 42: Techniques • Caching • Partitioning • Replication • Data management middleware • Queuing

Slide 43: Caching • Turn expensive operations into cheap ones • Reduce: o Database reads o Object and page calculation/rendering operations • Cache objects and subobjects • Add memory

Slide 44: Apache Caching • Can be done with zero application modifications • Complete pages/HTTP requests only • Must use Apache 2.2 • Cache is not shared between servers

Slide 45: memcached • Extremely useful • Distributed caching system • Requires new thinking and new coding • Straightforward API
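
The "new thinking and new coding" usually means wrapping each expensive read in a check-the-cache-first pattern and invalidating on write. The sketch below shows that pattern only. FakeCache is a hypothetical in-process stand-in with get/set/delete methods, not the memcached client API, and get_profile and update_profile are illustrative names.

```python
import time


class FakeCache:
    """In-process stand-in for a shared cache, with per-item expiry."""

    def __init__(self):
        self._items = {}

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._items[key]
            return None
        return value

    def set(self, key, value, ttl):
        self._items[key] = (value, time.monotonic() + ttl)

    def delete(self, key):
        self._items.pop(key, None)


def get_profile(cache, user_id, load_from_db, ttl=300):
    """Read through the cache: hit the database only on a miss."""
    key = f"profile:{user_id}"
    profile = cache.get(key)
    if profile is None:
        profile = load_from_db(user_id)
        cache.set(key, profile, ttl)
    return profile


def update_profile(cache, user_id, new_profile, save_to_db):
    """Write to the database first, then drop the stale cache entry."""
    save_to_db(user_id, new_profile)
    cache.delete(f"profile:{user_id}")
```

Deleting on update rather than overwriting keeps the database as the single source of truth; the next read repopulates the cache.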

Slide 46: memcached URLs • Home: http://www.danga.com/memcached/ • Intro: http://www.majordojo.com/2007/03/memcac • PHP documentation: http://www.php.net/memcache

Slide 47: Partitioning • Mostly for data management • Split load onto separate servers/pools • Partition algorithm/mechanism must be lightweight • Partition algorithm must anticipate the future

Slide 48: File Storage Partitioning • Index/database gives the most flexibility • Hash-based is simplest

Slide 49: Database partitioning • You will need to do this, but perhaps later than you think • Index vs. hash-based
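
To make the index vs. hash-based trade-off concrete, here is a rough sketch of both lookups. The shard names, shard_by_hash and ShardIndex are hypothetical. The hash version needs no stored state but reshuffles most keys if the shard count changes. The index version costs a lookup but lets individual users be moved.

```python
import hashlib

SHARDS = ["db0", "db1", "db2", "db3"]  # hypothetical shard names


def shard_by_hash(user_id: int) -> str:
    """Hash-based: the shard is computed from the key, nothing is stored."""
    digest = hashlib.sha256(str(user_id).encode()).digest()
    return SHARDS[int.from_bytes(digest[:4], "big") % len(SHARDS)]


class ShardIndex:
    """Index-based: an explicit key-to-shard table (a dict stands in for it)."""

    def __init__(self, default_shard: str):
        self.default_shard = default_shard
        self.assignments = {}

    def lookup(self, user_id: int) -> str:
        return self.assignments.get(user_id, self.default_shard)

    def move(self, user_id: int, shard: str) -> None:
        # Record the new home; the data copy itself is out of scope here.
        self.assignments[user_id] = shard
```

In practice the index itself is small and read-heavy, which makes it a natural candidate for caching.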

Slide 50: Replication • Used where data is read far more than written • Consider caching first • Also used for failure recovery

Slide 51: Types of Replication • Replication: sync vs. async o Synchronous is not usually scalable o Asynchronous only works with certain kinds of data and use cases, because of coherence issues

Slide 52: Database Replication • Simple but finicky • Asynchronous (but not by much) • Allows big queries and backups to be moved to separate servers
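
A sketch of how an application might route queries once replication is in place, assuming one master and several read replicas. ReplicatedPool and the needs_fresh flag are hypothetical. Because replication is asynchronous, reads that must see a just-completed write are sent to the master.

```python
import itertools


class ReplicatedPool:
    """Send writes to the master and spread reads across replicas."""

    def __init__(self, master, replicas):
        self.master = master
        # Fall back to the master if no replicas are configured.
        self._replicas = itertools.cycle(list(replicas) or [master])

    def for_write(self):
        return self.master

    def for_read(self, needs_fresh: bool = False):
        # A replica may lag slightly behind the master, so callers that
        # cannot tolerate stale data ask for the master explicitly.
        if needs_fresh:
            return self.master
        return next(self._replicas)
```

The same routing makes it easy to dedicate one replica to big reporting queries or backups by keeping it out of the regular read rotation.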

Slide 53: File System Replication • Slow and very asynchronous • Mostly for disaster recovery

Slide 54: Data Management Middleware • Mostly for databases • Can handle partitioning and replication, and do it well • Big investment in coding to the API • Sometimes easier to add functionality to app

Slide 55: Queuing • Save work for later • Useful for less urgent operations, especially messaging • Can be used to wait for a pause, or to separate hardware
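
The idea of saving work for later can be sketched with Python's standard in-process queue. This is only an illustration: a real deployment would more likely use a persistent queue shared between servers, and start_worker is a hypothetical helper.

```python
import queue
import threading


def start_worker(jobs: queue.Queue, handler):
    """Run handler(job) for each queued job until a None sentinel arrives."""

    def run():
        while True:
            job = jobs.get()
            try:
                if job is None:
                    return
                handler(job)
            finally:
                jobs.task_done()

    thread = threading.Thread(target=run, daemon=True)
    thread.start()
    return thread


if __name__ == "__main__":
    outbox = queue.Queue()
    start_worker(outbox, lambda message: print("sending", message))
    # The request handler only enqueues; the slow work happens elsewhere.
    outbox.put("welcome email for user 1")
    outbox.put(None)
    outbox.join()
```

The page that triggers the work returns as soon as put() completes, which is the point of queuing less urgent operations.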

Slide 56: Dealing with lots of hardware (operations) • Automation • Process

Slide 57: Imaging/Provisioning • Be consistent • Use your distro’s automation (Kickstart, AutoYaST, etc.) • Use boring, meaningful hostnames • Make re-imaging easy

Slide 58: Deployment Systems • Content and code replication • Coherence/atomic updates • Managing pieces and processes • Simple scripts are fine • Create audit trail • Include back-out • Think 3AM • Do it!
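
One common way to get an atomic update with a built-in back-out is a symlink swap: each release is unpacked into its own directory, and a "current" link is switched in a single rename. This is a sketch of that technique under assumed names (activate, a releases directory, a link called "current"), not a description of the presenter's own deployment system. It assumes a POSIX file system.

```python
import os


def activate(releases_dir: str, release: str, link_name: str = "current"):
    """Point the link at a release atomically; return the previous release."""
    target = os.path.join(releases_dir, release)
    link = os.path.join(releases_dir, link_name)
    tmp = link + ".tmp"
    previous = None
    if os.path.islink(link):
        previous = os.path.basename(os.readlink(link))
    if os.path.lexists(tmp):
        os.remove(tmp)  # clean up after an interrupted earlier run
    os.symlink(target, tmp)
    # rename() replaces the old link in one step, so requests see either
    # the old release or the new one, never a mixture.
    os.replace(tmp, link)
    return previous
```

Backing out is the same call with the name that the previous activation returned, which is the kind of thing that matters at 3AM.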

Slide 59: Monitoring systems • A pain, but a lifesaver • Start with built-in basics • Add custom checks, especially end-to-end and communication between pieces • Eliminate false alarms (ongoing) • Nagios, usually

Slide 60: Coping with hardware failure • Have extra servers/capacity • Load balancers handle stateless layers • Replication prepares you to handle data layers manually • Use middleware or app-level multiple writes to get true data layer redundancy

Slide 61: Change management • Part automation, part process • Use version control on everything • Stage changes with realistic data • Know how to back out • Consult the right people (internal and/or external)

Slide 62: Efficiency • Access the smallest amount (DB, FS, etc.) • Don’t do complex stuff when simple will suffice

Slide 63: Using the database efficiently • Keep it simple • Know what queries you do • Index every query key • Cache to reduce demand • Check slow query log • Replicate if you need big queries
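
To illustrate "index every query key", the sketch below asks a database how it plans to run a lookup before and after an index is added. It uses SQLite only because it ships with Python; the slides are about MySQL, where EXPLAIN plays the corresponding role. The table, column and index names are made up for the example.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's plan description for a query as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
lookup = "SELECT name FROM users WHERE email = ?"

# Without an index on email, the plan is a scan of the whole table.
before = query_plan(conn, lookup, ("a@example.com",))

conn.execute("CREATE INDEX idx_users_email ON users (email)")

# With the index, the plan becomes a search using idx_users_email.
after = query_plan(conn, lookup, ("a@example.com",))

print(before)
print(after)
```

The exact wording of the plan text varies between SQLite versions, so treat the printed lines as indicative rather than fixed.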

Slide 64: The messy real world

Slide 65: Security and abuse • Mostly same issues, just magnified • You will be a target • Spam (coming and going) • Abuse of file storage

Slide 66: Corner Cases • Murphy’s law enforcement • Watch out for how different user activities relate • Lock data, not functions • Housekeeping
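
"Lock data, not functions" can be read as: take a lock per record being changed rather than one lock around the whole operation, so unrelated users do not queue behind each other. A single-process sketch follows; KeyedLocks is a hypothetical helper, and across multiple servers the same idea would need row-level locking in the database.

```python
import threading
from collections import defaultdict


class KeyedLocks:
    """Hand out one lock per data key instead of one lock per function."""

    def __init__(self):
        self._guard = threading.Lock()  # protects the lock table itself
        self._locks = defaultdict(threading.Lock)

    def lock_for(self, key):
        with self._guard:
            return self._locks[key]


locks = KeyedLocks()


def rename_album(albums: dict, album_id: int, new_name: str) -> None:
    # Only callers touching the same album wait for each other.
    with locks.lock_for(("album", album_id)):
        albums[album_id] = new_name
```

The lock table grows with the number of keys seen, which is one instance of the housekeeping the slide mentions.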

Slide 67: Performance and tuning • Observation and responsiveness are more important than pre-optimizing • Redesign as needed • Collect the data to be able to analyze (both resource utilization and end-user performance)

Slide 68: Miscellaneous Warnings

Slide 69: Files and directories • Most default file system configurations get really slow with lots of files in one directory • Numerical limits on files and subdirectories • Some programs don’t like files over 2GB
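
A common way to avoid huge directories is to fan files out into subdirectories named after a hash of the file name. The sketch below assumes two levels of two hex characters each, which gives up to 65,536 leaf directories. fanout_path and its parameters are hypothetical.

```python
import hashlib
import os


def fanout_path(root: str, filename: str, levels: int = 2, width: int = 2) -> str:
    """Place a file under hash-derived subdirectories of root."""
    digest = hashlib.sha256(filename.encode()).hexdigest()
    # Use successive slices of the digest as directory names.
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return os.path.join(root, *parts, filename)


if __name__ == "__main__":
    print(fanout_path("/data/uploads", "photo.jpg"))
```

Because the path is computed from the name alone, any server can locate a file without consulting an index.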

Slide 70: AJAX • Sequential round trips • Make preloading invisible • UI that waits for too many things

Slide 71: Other topics • Multiple sites • CDNs

Slide 72: Scaling the LAMP Stack Future of Web Apps October, 2007 Daniel Lieberman daniel@bitpusher.com