Sun GlassFish Web Stack Deep Dive Apache HTTP Server .

Transcription

Sun GlassFish Web Stack Deep Dive
Apache HTTP Server Performance
Jeff Trawick
Sun GlassFish Web Stack, Sun Microsystems

- What role can Apache play in the performance of the overall system
- Alternatives (Lighttpd, Sun Web Server)
- Tuning for capacity and performance
2009 CommunityOne WEST Conference san francisco, ca developers.sun.com/events/communityone

Agenda
- Why Apache
- Key choices to make
- Configuring Apache for capacity
- Configuring Apache for performance
- For more information

Why Apache
- First: Don't choose Apache over other web servers for performance reasons
- Consider also Sun Web Server and Lighttpd
  - Both use fewer system resources per client than Apache
  - Both can provide higher response rates than Apache
  - Both of these, in addition to Apache, are covered by GlassFish Portfolio support subscriptions

Why Apache
If it is bloated and slow, why does it still matter?
- Rich set of features, along with configurability
- Multitudes of plug-in modules available from third parties
- Extensive documentation
  - From the project itself
  - From the community
  - From traditional publishers
- Multiple vendors investing in it
  - Sun, SpringSource, IBM, Red Hat, etc.
- Many people skilled in configuration/tuning
- Analyzed extensively for security flaws
- Long history of addressing security issues in a responsible manner

Why Apache
Oh, and it can perform adequately in most circumstances

Why Apache
From this point forward, we'll assume there's a good reason to use Apache

Key choices to make
- What Apache features can improve overall system performance
- Process model
  - Prefork vs. worker MPM
  - FastCGI vs. in-process script execution

Using Apache to improve application performance
- Not Apache tuning per se
- What Apache features can I use to improve overall application performance
  - Whether or not it makes Apache slower or makes Apache consume more resources

Using Apache to improve application performance
Several common areas:
- Cache
- Compression
- Load-balancing to back-end servers

Using Apache to improve application performance - Cache
- Manipulating cache headers for use by client and network caches
  - If application doesn't generate those
- Apache's cache
  - Request still reaches Apache but is served from static file on disk (VM) instead of forcing application to render again

Adding Expires headers
- If the application doesn't set Expires headers so that the browser (or cache) knows how long the response is valid, Apache can add these via mod_expires
  - Uh, only if the resource won't change for the configured time, or the browser doesn't actually need to see the changed version for the configured time

Adding Expires headers
Here's what it looks like:

    <Location /my-status-app/>
    ExpiresActive On
    # The response is valid for the next hour.  Browsers
    # don't have to request it again.
    ExpiresDefault "access plus 60 minutes"
    </Location>

- Any of Apache's configuration containers can be used for this.

Adding Expires headers
Special concern:
- Once the response is sent, it can affect not just the client that requested it but also network caches.
- If you really really need to start serving an updated version of the resource, some clients will just have to wait.

Apache's cache
- mod_disk_cache is the only actively developed cache bundled with Apache
  - mod_mem_cache probably needs active development
  - Other, simpler caching is of more limited use (e.g., mod_file_cache)
- Typically used for dynamic content
- Also useful when data is stored on slow (network?) disks and the cache can be kept on fast local disks
- You still need cache information from the application (or added by Apache as in the previous mod_expires example).

mod_disk_cache
If you're using it to avoid requests to the application, you can get value from it with complete control over caching (i.e., retain ability to start sending new content immediately):
- Use mod_expires to add the Expires headers so that the response is cacheable (locally)
- Remove this information with mod_headers before the response is sent to the client.
- If you have to start sending new content immediately, it is only cached locally via mod_disk_cache, so remove the cache files with htcacheclean.
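As a rough illustration of the local-cache half of this approach, a minimal mod_disk_cache setup for Apache 2.2 might look like the sketch below. The cache directory and the /my-status-app/ URL prefix are assumptions chosen for the example, and the mod_headers step that strips the Expires header before the response leaves the server is not shown.

```apache
# Sketch only: assumes mod_cache and mod_disk_cache are loaded (Apache 2.2).
# The cache directory and URL prefix below are placeholders.
<IfModule mod_disk_cache.c>
    # Where cached responses are stored; ideally a fast local disk.
    CacheRoot /var/cache/apache2/disk_cache
    # Cache responses for URLs under this prefix.
    CacheEnable disk /my-status-app/
    # Directory layout of the on-disk cache.
    CacheDirLevels 2
    CacheDirLength 1
</IfModule>
```

Responses still need freshness information (for example, from mod_expires) before mod_cache will store them.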

mod_disk_cache – for more ing dynamic content with apache

Using Apache to improve application performance - Compression
- Much of web traffic is highly compressible (HTML, JavaScript, style sheets)
- Spending CPU in the web server to compress output can significantly improve user experience
- Use mod_deflate's DEFLATE filter
  - Simple to configure; check the Apache docs at http://httpd.apache.org/docs/2.2/mod/mod_deflate.html
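A minimal sketch of enabling the DEFLATE filter for the compressible types mentioned above, assuming Apache 2.2 with mod_deflate loaded. The list of MIME types is an illustrative choice, not a recommendation from the talk.

```apache
# Sketch only: compress common text-based responses (Apache 2.2 syntax).
<IfModule mod_deflate.c>
    AddOutputFilterByType DEFLATE text/html text/plain text/css
    AddOutputFilterByType DEFLATE application/javascript application/x-javascript
</IfModule>
```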

Using Apache to horizontally scale applications
- Reverse proxy to the application server (GlassFish, Tomcat, Mongrel, etc.)
- mod_jk is a traditional implementation
  - Well documented for use with Tomcat
  - Use with GlassFish is described in a number of blog articles
- Apache 2.2 brings
  - mod_proxy_ajp (AJP protocol, like mod_jk)
  - mod_proxy_balancer (balancer supports AJP protocol like mod_jk, but also supports traditional HTTP proxy too, for use with Mongrel or anything else)

More on mod_proxy_balancer
Here's a detailed article on using mod_proxy_balancer, plain HTTP proxy, and Mongrel (for rails-withapache-2-2-mod_proxy_balancer-and-mongrel
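For orientation, a reverse-proxy balancer over plain HTTP can be sketched roughly as follows. The cluster name and the two back-end addresses are hypothetical, and the sketch assumes Apache 2.2 with mod_proxy, mod_proxy_http, and mod_proxy_balancer loaded.

```apache
# Sketch only: balance requests across two hypothetical back-end instances.
# Act as a reverse proxy only, never as a forward proxy.
ProxyRequests Off

<Proxy balancer://appcluster>
    # Hypothetical back-ends (e.g., two Mongrel processes on this host).
    BalancerMember http://127.0.0.1:8001
    BalancerMember http://127.0.0.1:8002
</Proxy>

# Send all requests to the balancer.
ProxyPass / balancer://appcluster/
```

For AJP back-ends, the BalancerMember URLs would use the ajp:// scheme (with mod_proxy_ajp loaded) instead of http://.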

Process model
Prefork vs. worker MPM

Prefork MPM
- One single-threaded process per active connection
  - Matches Apache 1.3 processing model
- Most commonly used
  - Prefork is the default or only MPM provided by a number of vendors
- Compatibility with the most modules
  - Recommended for mod_php
  - Required for mod_perl if Perl interpreter doesn't support threads
- Avoids concurrency bugs in plug-in modules, or concurrency-related performance issues in libraries
- More limited damage from vulnerabilities or crashes
  - Complete isolation of client requests

Prefork MPM - drawbacks
- Uses the most resources
  - Requires a surprising amount of swap space on Solaris
  - Larger working set (physical memory)
- Poor utilization of per-process state, such as
  - Retained connections to other servers, such as
    - Java application server
    - LDAP server
  - Process-memory caches
- Requires more difficult shared memory cross-process synchronization to effectively share computed information (e.g., APC)
  - So often not implemented

Worker MPM
- New with Apache 2.0
  - Extensively used in production environments since 2002
- Minimizes memory consumption
  - Unlike prefork, doesn't require surprising amounts of swap space on Solaris
  - Smaller working set
- Can allow effective utilization of retained state without more-difficult shared memory implementation
  - Increasing effectiveness with lower number of child processes

Worker MPM, retained state
- In-memory caches
- Server connections
  - Application server
  - Database connection
  - LDAP server connection
- All can be much better utilized by using worker instead of prefork
  - Ouch! if a connection retained by a prefork process (in hopes of handling a request again soon) consumes significant server resources

Worker MPM - drawbacks
- Occasional issues with releasing resources as child processes exit (more later)
- Maximum exposure to vulnerabilities and other bugs
  - Information for other users could be retrieved
  - If processing for one user triggers a crash, other users are impacted
  - (can be mitigated somewhat by reducing the number of threads per child)

Tuning for capacity
What does this cover?
- How many clients can I support?
- How much of my system can Apache use?

Tuning for capacity
How many clients can I support?
- This is relatively simple (essentially MaxClients).
- (more later on that)

Tuning for capacity
How much of my system can Apache use?
- This is quite important, but Apache puts all the work on the system administrator
- Administrator must set MaxClients low enough to avoid running out of
  - Swap space
  - Physical memory
  yet high enough to serve enough clients.
- Observation required for new Apache or application deployments (mod_status, vmstat, etc.)
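One way to make that observation possible is to expose the mod_status page to local requests only. The following is a sketch in Apache 2.2 access-control syntax; the /server-status path and the localhost-only restriction are choices made for the example.

```apache
# Sketch only: expose the server status page to localhost (Apache 2.2 syntax).
# Assumes mod_status is loaded.
ExtendedStatus On
<Location /server-status>
    SetHandler server-status
    Order deny,allow
    Deny from all
    Allow from 127.0.0.1
</Location>
```

The status page shows how many workers are busy, idle, or held in keep-alive, which can be compared against memory figures from tools such as vmstat while adjusting MaxClients.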

Tuning for capacity
Special Solaris issue
- Solaris allocates swap space based on the virtual memory size of each process, including code that is shared among the Apache processes
- Removing code lowers the swap requirements (times number of child processes)
- Comment out any unnecessary LoadModule directives before working on capacity
- If using in-process PHP (mod_php), disable any unused extensions as well.
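Trimming the module list is just a matter of commenting out directives, as in this sketch. Which modules count as unnecessary depends entirely on the site; the ones disabled here, and the modules/ path, are assumptions for illustration.

```apache
# Sketch only: keep what the site uses, comment out the rest.
LoadModule expires_module modules/mod_expires.so
LoadModule deflate_module modules/mod_deflate.so
LoadModule status_module modules/mod_status.so
# Hypothetically unused on this site, so not loaded:
#LoadModule userdir_module modules/mod_userdir.so
#LoadModule autoindex_module modules/mod_autoindex.so
#LoadModule cgi_module modules/mod_cgi.so
```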

Tuning for capacity - Keep-Alive

    KeepAlive {On|Off}
    KeepAliveTimeout [timeout-in-seconds]

- Because Apache* dedicates a processing thread to connections in keep-alive state, this is a capacity concern in addition to a performance-tuning issue
- Reduce the number of processes (prefork) or threads (worker) required by reducing KeepAliveTimeout to a small value (e.g., 2 seconds)
*The Event MPM handles Keep-Alive state without a dedicated thread. (Event as default MPM for 2.4?)

Tuning for capacity - Keep-Alive

What does this mean?
If we need to increase the number of clients we can handle, and/or reduce system resources:
- Reduce KeepAliveTimeout
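In configuration terms, this could look like the sketch below. The 2-second timeout comes from the talk's own example; the MaxKeepAliveRequests value is simply the stock default, shown for completeness.

```apache
# Sketch only: keep persistent connections, but release idle ones quickly.
KeepAlive On
# Seconds to wait for the next request on an idle persistent connection.
KeepAliveTimeout 2
# Requests allowed per connection (the stock default).
MaxKeepAliveRequests 100
```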

Tuning for capacity
Capacity of Apache vs. capacity of back-end servers
- There's almost never a 1:1 mapping between Apache and a back-end server
  - Either Apache off-loads certain types of requests or load-balances to multiple back-ends
- (usually) Apache will be handling more clients than a particular back-end server
- Be aware that increasing Apache's capacity beyond that of back-end server(s) can use extra system resources without much benefit (under heavy load and/or when the application isn't responding)
- Example trade-off: Browser reports that the site is unavailable, vs. application renders a page indicating that there are too many database connections in use (the latter exacerbating the capacity problem)

Cleaning up Apache child processes (Capacity?)
“Set MaxSpareServers/MaxSpareThreads kind-of low so Apache doesn't use so many processes/memory/whatever”
- What's missing here? Apache essentially “owns” or has reserved enough system resources to run at maximum capacity 24x7.
  - If user load increases or an application slows down, Apache can very rapidly reclaim the resources. In other words, you can't use them for anything else.
- But there's a valuable side-effect to idle process termination that affects other servers:
  - Any retained connections (app server, DB, LDAP) will be cleaned.
  - Very important with prefork, since such connections are so poorly utilized to begin with. (not so important with worker unless ThreadsPerChild is low)

Cleaning up Apache child processes
How?
- MaxSpareServers (prefork), MaxSpareThreads (worker)
  - Reduce process count when load subsides
- MaxRequestsPerChild
  - Special purpose: work around memory or other resource leak
- Graceful restart
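For the prefork case, the first two mechanisms map onto directives like these. The numbers are placeholders for illustration, not tuned values.

```apache
# Sketch only: values are illustrative, not recommendations.
<IfModule mpm_prefork_module>
    # Idle children above this count are terminated as load subsides.
    MaxSpareServers      64
    # Recycle each child after this many requests to bound a resource leak
    # (0 would mean a child is never recycled).
    MaxRequestsPerChild  10000
</IfModule>
# A graceful restart is triggered from the command line (apachectl graceful),
# not from the configuration file.
```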

Worker MPM and child process exit
- Maybe it is helpful for us to terminate excess processes (e.g., to terminate retained back-end connections)
- Worker has a special problem:
  - A child can't exit (i.e., release resources) until the last client connection handled by the process has finished.
  - The occasional hung or very long running request is not uncommon in some environments.
  - Worst case: Because of trying to keep the process count low in the presence of hung requests, you can accumulate a huge number of processes with only 1 or 2 active threads. The bulk of system resources is still used.
- Ouch! You need to use MaxRequestsPerChild to work around a leak, but a small number of requests take a very long time.

Prefork MPM and child process exit
- A child process only handles one client connection (if not idle), so no special issues surrounding child process exit
- (well, third-party modules can create some havoc in their child exit handlers, but I haven't seen that in a while)

Tuning for performance
What does this cover?
- After establishing the big picture (what role does Apache play, what MPM is used), minimize system resources used and/or improve response times.

Tuning for performance – Prefork MPM
StartServers, MinSpareServers, MaxSpareServers
- StartServers can speed up the rate at which Apache can start handling high load (useful if behind a load balancer which can suddenly divert a lot of traffic)
- If StartServers > MinSpareServers (and above MaxSpareServers) and there's no initial burst of traffic, Apache will create a bunch of children at startup then start terminating them (waste)
- If you don't get a burst of traffic initially, don't worry about StartServers (but don't set it real high).

Tuning for performance – Prefork MPM
- Keep in mind the proportions of MinSpareServers and MaxSpareServers to MaxClients
- Some people scale up MaxClients drastically, but don't scale up MinSpareServers/MaxSpareServers.
  - Apache ends up terminating/creating child processes when the server load changes by a very small amount
  - Consider MaxSpareServers 100 and MaxClients 8192
    - If load decreases by just over 1% of the overall max, Apache will terminate child processes (and likely have to create more soon after).

Tuning for performance – Prefork MPM
Is this bad?

    MaxClients 1024
    MinSpareServers 256
    MaxSpareServers 256

- Yes; constant termination/creation of child processes
- Mind the gap (between Min and Max).
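By way of contrast, a prefork configuration that leaves a gap between the two spare-server settings might look like this sketch. The specific numbers are assumptions picked to illustrate the proportions, not measured recommendations.

```apache
# Sketch only: leave room between MinSpareServers and MaxSpareServers
# so that small load changes don't cause process churn.
<IfModule mpm_prefork_module>
    # ServerLimit must be raised (and listed first) to allow MaxClients above 256.
    ServerLimit       1024
    MaxClients        1024
    StartServers        64
    MinSpareServers    128
    MaxSpareServers    384
</IfModule>
```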

Tuning for performance – Prefork MPM
What about ServerLimit?
- Most people can sleep well not knowing the details and simply setting ServerLimit equal to MaxClients.
- ServerLimit is the number of process slots in the Apache scoreboard, which can't be reallocated across graceful restart.
  - Thus, if you want to change MaxClients across graceful restart, you can't make it higher than ServerLimit (and you can't change ServerLimit).

Tuning for performance – Worker MPM
Much the same as prefork, but with an added dimension – threads per child.
Higher ThreadsPerChild:
- Lower overall memory use
- Better utilization of per-process resources like reusable back-end connections
- (Potentially) Higher thread contention
- Higher risk of information leaking from another thread
- More clients impacted if one client triggers a crash

Tuning for performance – Worker MPM
Special file descriptor-based issues with higher ThreadsPerChild
- Traditionally some third-party modules used select() on back-end connections, and blew up with fds >= 1024
  - Limit ThreadsPerChild to 500 or so
    - 500 client connection sockets
    - 500 back-end connection sockets
    - Log files, listening sockets, etc.
- Traditionally some third-party modules or libraries used fopen(), which on Solaris failed unless a free fd under 256 was available.
  - Web Stack's Apache delivery resolves that

Tuning for performance – Worker MPM
Similar to prefork MPM, but “Thread” instead of “Server”
- Pay attention to the proportion of MinSpareThreads and MaxSpareThreads to MaxClients, and to the difference between them
  - MaxClients 8192, MinSpareThreads 96, MaxSpareThreads 192
  - Probably means endless work
- For MaxSpareThreads (cleaning up extra processes), remember worker's special issue when some occasional requests are long-running
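A worker configuration that keeps the spare-thread settings in proportion to MaxClients could be sketched as follows. All of the values are illustrative assumptions; the only hard constraints reflected here are that MaxClients cannot exceed ServerLimit times ThreadsPerChild and that ThreadsPerChild cannot exceed ThreadLimit.

```apache
# Sketch only: 128 child processes x 64 threads each = 8192 clients.
<IfModule mpm_worker_module>
    # Limits come first; they cap the directives that follow.
    ThreadLimit           64
    ServerLimit          128
    ThreadsPerChild       64
    MaxClients          8192
    # Spare-thread bounds scaled with MaxClients, with a wide gap between them.
    MinSpareThreads     1024
    MaxSpareThreads     3072
</IfModule>
```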

Tuning for performance
Other usual suspects
- Avoid DNS lookups
  - HostnameLookups Off
  - Allow/Deny from IP address (instead of hostname)
- Don't force Apache to look for .htaccess files in every directory down to the file
  - AllowOverride None
  - If you must use .htaccess, allow in minimal set of directories.
- Don't load modules you don't need
- Avoid excessive logging
- Don't force Apache to check if parts of the path down to the file are symlinks
  - Use option FollowSymLinks
  - Don't use option SymLinksIfOwnerMatch
  - (what about untrusted users?)
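Several of these suggestions can be combined in one fragment, sketched here under the assumption of a server with no untrusted users. The /var/www/html/app directory is a hypothetical location used only to show a narrowly scoped .htaccess exception.

```apache
# Sketch only: baseline settings for a server without untrusted users.
HostnameLookups Off

<Directory />
    # Follow symlinks without checking each path component.
    Options FollowSymLinks
    # Never look for .htaccess files by default.
    AllowOverride None
</Directory>

# Hypothetical directory where .htaccess is genuinely needed.
<Directory /var/www/html/app>
    AllowOverride FileInfo
</Directory>
```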

Tuning for performance
Concerns with untrusted users managing part of the web space
- You don't want Apache blindly following symlinks
  - /export/home/joeuser/public_html/index.htm -> /etc/xxx
  - Disabling FollowSymLinks will force Apache to use lstat() on each directory/file under the user's control
- They can't control httpd.conf, but maybe they need t
