Slideshow transcript
Slide 1: Scaling the LAMP Stack Future of Web Apps October 5, 2007
Slide 2: Introductions
Slide 3: Specific Problems, Challenges and Issues
Slide 4: About this workshop • This is a broad topic • Theory and application • Real-world focus • Interactive (please!)
Slide 5: About web apps and scaling Some different ways of looking at the problem…
Slide 6: Things to think about • Multi-server: locking and concurrency • Running many: keep in mind what’s expensive, sloppy or risky • Code quality • The law of truly large numbers
Slide 7: Elements of Scaling • Split up different tasks • Use more hardware (intelligently) • Partition • Replicate • Cache • Optimize (code and hardware) • Identify and fix weaknesses • Manage
Slide 8: Tools and Components • Apache + PHP • MySQL • File System (local) • Networked File System • Load Balancers • memcached
Slide 9: Contemplating Scaling • Understand what your app does (and how much) • Identify the bottlenecks • Solve near-term problems • Design well, but don’t over-design
Slide 10: Web apps do lots of things Different operations have different scaling issues.
Slide 11: What does your app do? List the high level elements of what your application does. Separate out different functions that will have different scaling issues.
Slide 12: Common things that web apps do • Manage connections/protocols • Deliver static content • Manage sessions • Manage user data • Render dynamic pages • Access external APIs • Process media
Slide 13: Update the list of things your app does • Add anything you missed • Note which items you do in quantity
Slide 14: Easy vs. Difficult Scaling What happens when you add hardware? • Does it work? • Does more hardware = more performance?
Slide 15: Things that break when you scale • State that isn’t properly shared (especially sessions) • Updates/refreshes (caching and replication issues)
Slide 16: Things that don’t improve when you add more servers • Unpartitioned databases • Anything that locks/blocks • Inefficient code, especially big queries
Slide 17: Scaling Each Element • (do easy separations first)
Slide 18: Managing Connections/Protocols • No problem putting on multiple servers • Apache is good o Defaults are not far off out of the box o Moderately tunable • Linux tuning o TCP stack (tune to handle unusual networking needs)
Slide 19: Key Apache Configuration Issues • MaxClients (and ServerLimit, ThreadLimit and ThreadsPerChild) • Avoid using PHP (or other) handler unnecessarily • Use the worker MPM • Maybe MaxRequestsPerChild
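Sketch for Slide 19: under the worker MPM, the number of threads that MaxClients caps cannot exceed ServerLimit × ThreadsPerChild, and Apache expects MaxClients to be a whole multiple of ThreadsPerChild. The arithmetic below is only an illustration; the function names and the per-thread memory figure are assumptions, so measure your own processes before choosing values.

```python
def worker_mpm_capacity(server_limit, threads_per_child):
    """Upper bound on simultaneous requests under the worker MPM."""
    return server_limit * threads_per_child


def max_clients_for_memory(ram_for_apache_mb, mb_per_thread, threads_per_child):
    """Largest MaxClients that fits the memory budget.

    The result is rounded down to a whole number of child processes,
    because each child always runs threads_per_child threads.
    """
    threads_that_fit = int(ram_for_apache_mb / mb_per_thread)
    whole_children = threads_that_fit // threads_per_child
    return whole_children * threads_per_child


# Example with assumed numbers: 2 GB reserved for Apache, about 4 MB per thread.
suggested = max_clients_for_memory(2048, 4, 25)
```

If the suggested value is larger than worker_mpm_capacity(ServerLimit, ThreadsPerChild), ServerLimit has to be raised as well.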
Slide 20: Delivering Static Content • Don’t process it unnecessarily o Either cache or use no Apache handlers o Caching can let you treat semi-static content as static • Multiple servers complicates updates, but is otherwise easy
Slide 21: General Discussion: Multi-server, state and sessions Rethinking state for multi-server environments • What is state? • Short-term state (sessions) • Long-term state (application data) • Managing state is usually the hardest part of scaling
Slide 22: What happens with state • Written (created/destroyed/changed) • Read • Stored
Slide 23: Requirements for managing state • Depend on what it is and how it is used • Perfect coherence • Performance of different operations
Slide 24: Ways of scaling state • Replication: make more copies • Partitioning: split up the work • Caching Should make different choices for different state/data elements
Slide 25: About Load Balancers • What load balancers do o Spread load o Detect server failures o Stickiness/persistence o Acceleration (especially SSL) • Fancy features (including good stickiness) are expensive
Slide 26: Why sticky sessions are not usually good in practice • Servers fail • Corner cases exist
Slide 27: Managing Sessions
Slide 28: Where session data can be stored • Browser cookies • Web server temporary files (not scalable) • App server state • Database • Cache
Slide 29: PHP session management • Default (files) method is not multi-server friendly, and thus not scalable (unless sticky) • Can implement a different back-end easily
Slide 30: Designing a session back-end • Requirements • Data storage options o Cookies only (re-auth, let the browser take care of the logout – but less secure) o Full-featured involves a combination of cookies and database and cache (discussion of session details)
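Sketch for Slide 30: one possible shape for the "cookies plus database plus cache" option, where the cookie carries only a random session ID, the database holds the durable record, and the cache absorbs most reads. This is an illustration, not the presenter's design; the class name, the TTL and the plain dicts standing in for memcached and MySQL are all assumptions.

```python
import secrets
import time


class SessionStore:
    """Session back-end sketch: any web server can load any session."""

    def __init__(self, cache, database, ttl_seconds=1800):
        self.cache = cache          # stand-in for a shared cache
        self.database = database    # stand-in for a sessions table
        self.ttl = ttl_seconds

    def create(self, user_id):
        session_id = secrets.token_hex(16)  # value to place in the cookie
        record = {"user_id": user_id, "expires": time.time() + self.ttl}
        self.database[session_id] = record  # durable copy
        self.cache[session_id] = record     # fast copy
        return session_id

    def load(self, session_id):
        record = self.cache.get(session_id)
        if record is None:
            # Cache miss: fall back to the database and repopulate the cache.
            record = self.database.get(session_id)
            if record is not None:
                self.cache[session_id] = record
        if record is None or record["expires"] < time.time():
            self.destroy(session_id)
            return None
        return record

    def destroy(self, session_id):
        """Log out: remove the session everywhere."""
        self.cache.pop(session_id, None)
        self.database.pop(session_id, None)
```

Because no state lives on the web server itself, the load balancer does not need sticky sessions for this to work.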
Slide 31: Managing small user data • Databases are more efficient, flexible and sharable than small files • Frequently-read data should be cached
Slide 32: Managing large user data • NFS has flaws but is almost inevitable • Locking is usually not important, but can be • Performance degradation can be sudden
Slide 33: About NFS • NFS is usually transparent to your app • NFS is easy to implement and gives you multiple-write access • NFS locking is not to be trusted • The Linux NFS client is slow for writes and can do bad things under stress
Slide 34: User data and locking • Names based on hashes often mean no locking is needed • Databases do locking better than file systems do • Locking requires housekeeping
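Sketch for Slide 34: one reading of "names based on hashes" is to name each stored file after a hash of its content. Two writers can then only collide if they are writing identical bytes, so no lock is needed. The function name and the choice of SHA-256 are assumptions for illustration.

```python
import hashlib
import os
import tempfile


def store_upload(root, data):
    """Write data under a name derived from its content; return the path."""
    digest = hashlib.sha256(data).hexdigest()
    final_path = os.path.join(root, digest)
    if os.path.exists(final_path):
        return final_path  # identical content is already stored
    # Write to a unique temporary file first, then rename into place,
    # so readers never see a half-written file.
    fd, tmp_path = tempfile.mkstemp(dir=root)
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
    os.replace(tmp_path, final_path)
    return final_path
```

The returned name would normally be recorded in the database next to the user's own filename.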
Slide 35: Disk Storage Hardware • Disk performance can degrade suddenly • If the ratio of access to storage is low, then even slow disk is usually fine • Think about seek times and spindles
Slide 36: Rendering dynamic pages • Depends heavily on application specifics (query, search, process, etc.) • Watch out for: o Onerous queries (create and watch slow query log) o Locking of resources and/or incoherence if state changes o Heavy CPU and memory usage • Cache both elements and complete pages
Slide 37: Processing media • CPU intensive • May be memory intensive • Might be spiky • Might need its own server pool
Slide 38: Hardware • Start simple • Observe performance and respond accordingly • Get lots of memory
Slide 39: Hardware-driven behaviors • Sudden degradation because demand exceeds supply (usually relieved unhappily) • Get behind due to a spike, and recover • Not enough resources for normal optimization
Slide 40: Specific hardware issues • Not enough memory o Severe: paging/swapping o Mild: poor automatic caching; slowness due to fragmentation • Disk seek (very common) • CPU (but might really be memory) • Disk throughput (rare for web apps)
Slide 41: Hardware decisions • SCSI/SAS vs. SATA • Resource ratios • Combining vs. splitting functions • Big vs. little boxes
Slide 42: Techniques • Caching • Partitioning • Replication • Data management middleware • Queuing
Slide 43: Caching • Turn expensive operations into cheap ones • Reduce: o Database reads o Object and page calculation/rendering operations • Cache objects and subobjects • Add memory
Slide 44: Apache Caching • Can be done with zero application modifications • Complete pages/HTTP requests only • Must use Apache 2.2 • Cache is not shared between servers
Slide 45: memcached • Extremely useful • Distributed caching system • Requires new thinking and new coding • Straightforward API
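Sketch for Slides 43-45: the "new thinking" is mostly the read-through pattern, where the app checks the cache first, falls back to the database on a miss, and invalidates on write. TinyCache below is an in-process stand-in written for this example, not a real memcached client; its method names and the helper functions are assumptions.

```python
import time


class TinyCache:
    """In-process stand-in for a shared cache with per-key expiry."""

    def __init__(self):
        self._items = {}

    def get(self, key):
        entry = self._items.get(key)
        if entry is None or entry[1] < time.time():
            return None  # missing or expired
        return entry[0]

    def set(self, key, value, ttl_seconds):
        self._items[key] = (value, time.time() + ttl_seconds)

    def delete(self, key):
        self._items.pop(key, None)


def cached_read(cache, key, load_from_database, ttl_seconds=300):
    """Return the cached value, loading and caching it on a miss."""
    value = cache.get(key)
    if value is None:
        value = load_from_database()  # the expensive operation
        cache.set(key, value, ttl_seconds)
    return value


def write_then_invalidate(cache, key, save_to_database, value):
    """Update the database, then drop the stale cache entry."""
    save_to_database(value)
    cache.delete(key)
```

Invalidating on write rather than updating the cached copy keeps the logic simple; the next read repopulates the entry.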
Slide 46: memcached URLs • Home: http://www.danga.com/memcached/ • Intro: http://www.majordojo.com/2007/03/memcac • PHP documentation: http://www.php.net/memcache
Slide 47: Partitioning • Mostly for data management • Split load onto separate servers/pools • Partition algorithm/mechanism must be lightweight • Partition algorithm must anticipate the future
Slide 48: File Storage Partitioning • Index/database gives the most flexibility • Hash-based is simplest
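Sketch for Slides 47-48: a hash-based scheme that tries to "anticipate the future" by hashing keys into a fixed number of buckets and mapping buckets to servers. Adding a server then means reassigning some buckets instead of rehashing every key. The bucket count, server names and function names are assumptions for illustration.

```python
import hashlib

BUCKET_COUNT = 1024  # assumed value; fixed for the life of the system


def bucket_for(key):
    """Cheap, deterministic mapping from a key to a bucket number."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % BUCKET_COUNT


def build_bucket_map(servers):
    """Spread the buckets evenly over the initial servers."""
    return {b: servers[b % len(servers)] for b in range(BUCKET_COUNT)}


def server_for(key, bucket_map):
    """Find the server that currently owns the key's bucket."""
    return bucket_map[bucket_for(key)]
```

Moving load later is a matter of copying one bucket's files and changing a single entry in the map.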
Slide 49: Database partitioning • You will need to do this, but perhaps later than you think • Index vs. hash-based
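Sketch for Slide 49: the index-based alternative keeps an explicit lookup of which shard owns each user. It costs one extra lookup per request but lets individual users be moved between shards. The class name, shard names and the least-loaded placement rule are assumptions, and the dict stands in for a small lookup table.

```python
class ShardDirectory:
    """Index-based partitioning: an explicit user-to-shard lookup."""

    def __init__(self, shards):
        self.shards = list(shards)
        self.assignments = {}  # stand-in for a lookup table

    def shard_for(self, user_id):
        """Return the user's shard, placing new users on the emptiest one."""
        if user_id not in self.assignments:
            counts = {shard: 0 for shard in self.shards}
            for shard in self.assignments.values():
                counts[shard] += 1
            self.assignments[user_id] = min(self.shards, key=lambda s: counts[s])
        return self.assignments[user_id]

    def move(self, user_id, new_shard):
        """Reassign a user after their rows have been copied over."""
        self.assignments[user_id] = new_shard
```

In practice the lookup itself is a natural candidate for caching, since it is read on nearly every request and rarely changes.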
Slide 50: Replication • Used where data is read far more than written • Consider caching first • Also used for failure recovery
Slide 51: Types of Replication • Replication: sync vs. async o Synchronous is not usually scalable o Asynchronous only works with certain kinds of data and use cases, because of coherence issues
Slide 52: Database Replication • Simple but finicky • Asynchronous (but not by much) • Allows big queries and backups to be moved to separate servers
Slide 53: File System Replication • Slow and very asynchronous • Mostly for disaster recovery
Slide 54: Data Management Middleware • Mostly for databases • Can handle partitioning and replication, and do it well • Big investment in coding to the API • Sometimes easier to add functionality to app
Slide 55: Queuing • Save work for later • Useful for less urgent operations, especially messaging • Can be used to wait for a pause, or to separate hardware
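Sketch for Slide 55: the request handler records the work and returns at once, and a separate worker drains the queue later. The in-memory deque is an assumption standing in for a durable queue such as a database table; the class and method names are illustrative.

```python
from collections import deque


class JobQueue:
    """Save work for later instead of doing it during the request."""

    def __init__(self):
        self._jobs = deque()

    def enqueue(self, func, *args):
        """Called from the request path: cheap, returns immediately."""
        self._jobs.append((func, args))

    def drain(self, max_jobs=None):
        """Called from a worker: run queued jobs in order, return the count."""
        done = 0
        while self._jobs and (max_jobs is None or done < max_jobs):
            func, args = self._jobs.popleft()
            func(*args)
            done += 1
        return done
```

Capping max_jobs per pass lets a worker yield during busy periods and catch up when there is a pause.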
Slide 56: Dealing with lots of hardware (operations) • Automation • Process
Slide 57: Imaging/Provisioning • Be consistent • Use your distro’s automation (Kickstart, AutoYaST, etc.) • Use boring, meaningful hostnames • Make re-imaging easy
Slide 58: Deployment Systems • Content and code replication • Coherence/atomic updates • Managing pieces and processes • Simple scripts are fine • Create audit trail • Include back-out • Think 3AM • Do it!
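Sketch for Slide 58: one simple way to get atomic updates, an audit trail and a back-out is to unpack each release into its own directory and switch a "current" symlink by renaming a new link over the old one. The directory layout, log file name and function name are assumptions for illustration; this relies on POSIX rename semantics.

```python
import os


def activate_release(base_dir, release_name):
    """Point base_dir/current at releases/<release_name>.

    Returns the path of the previously active release (or None),
    which is what a back-out would re-activate.
    """
    target = os.path.join(base_dir, "releases", release_name)
    if not os.path.isdir(target):
        raise FileNotFoundError(target)
    current = os.path.join(base_dir, "current")
    previous = os.path.realpath(current) if os.path.islink(current) else None
    temp_link = current + ".new"
    if os.path.lexists(temp_link):
        os.remove(temp_link)  # left over from an interrupted deploy
    os.symlink(target, temp_link)
    os.replace(temp_link, current)  # rename swaps the link in one step
    with open(os.path.join(base_dir, "deploy.log"), "a") as log:
        log.write(f"activated {release_name} (previous: {previous})\n")
    return previous
```

Backing out is then the same call with the basename of the returned path, which is easy to do correctly at 3AM.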
Slide 59: Monitoring systems • A pain, but a lifesaver • Start with built-in basics • Add custom checks, especially end-to-end and communication between pieces • Eliminate false alarms (ongoing) • Nagios, usually
Slide 60: Coping with hardware failure • Have extra servers/capacity • Load balancers handle stateless layers • Replication prepares you to handle data layers manually • Use middleware or app-level multiple writes to get true data layer redundancy
Slide 61: Change management • Part automation, part process • Use version control on everything • Stage changes with realistic data • Know how to back out • Consult the right people (internal and/or external)
Slide 62: Efficiency • Access the smallest amount (DB, FS, etc.) • Don’t do complex stuff when simple will suffice
Slide 63: Using the database efficiently • Keep it simple • Know what queries you do • Index every query key • Cache to reduce demand • Check slow query log • Replicate if you need big queries
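Sketch for Slide 63: "index every query key" can be checked by asking the database how it plans to run a query. The example uses SQLite from the Python standard library purely so that it is self-contained; that is a substitution, and on MySQL the equivalent check is EXPLAIN. The table, column and index names are assumptions.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)  # column 3 holds the detail text


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")

lookup = "SELECT name FROM users WHERE email = ?"

# Without an index on email, the planner has to scan the whole table.
plan_before = query_plan(conn, lookup, ("a@example.com",))

conn.execute("CREATE INDEX idx_users_email ON users (email)")

# With the index in place, the planner can search by key instead.
plan_after = query_plan(conn, lookup, ("a@example.com",))
```

Running the same check against every query that shows up in the slow query log is a quick way to find missing indexes.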
Slide 64: The messy real world
Slide 65: Security and abuse • Mostly same issues, just magnified • You will be a target • Spam (coming and going) • Abuse of file storage
Slide 66: Corner Cases • Murphy’s law enforcement • Watch out for how different user activities relate • Lock data, not functions • Housekeeping
Slide 67: Performance and tuning • Observation and responsiveness are more important than pre-optimizing • Redesign as needed • Collect the data to be able to analyze (both resource utilization and end-user performance)
Slide 68: Miscellaneous Warnings
Slide 69: Files and directories • Most default file system configurations get really slow with lots of files in one directory • Numerical limits on files and subdirectories • Some programs don’t like files over 2GB
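Sketch for Slide 69: a common way to keep any single directory small is to fan files out into subdirectories named after the leading characters of a hash of the filename. The function name, the use of MD5 and the two-level, two-character layout are assumptions for illustration.

```python
import hashlib
import os


def fanned_out_path(root, filename, levels=2, width=2):
    """Return a path such as root/ab/cd/filename, derived from a hash.

    With two levels of two hex characters there are 65,536 leaf
    directories, so each one stays small even with millions of files.
    """
    digest = hashlib.md5(filename.encode("utf-8")).hexdigest()
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return os.path.join(root, *parts, filename)
```

The same function is used for both writing and reading, so no index is needed to find a file again.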
Slide 70: AJAX • Sequential round trips • Make preloading invisible • UI that waits for too many things
Slide 71: Other topics • Multiple sites • CDNs
Slide 72: Scaling the LAMP Stack Future of Web Apps October, 2007 Daniel Lieberman daniel@bitpusher.com



