Saturday, January 22, 2011

Scaling Web Servers for the Non-IT Manager

(c) Authors 2011

Introduction


This lecture will help non-IT managers come to terms with the topic, and help them improve their decision making and interactions with internal and external IT resources. When faced with expanding web traffic, management decisions should be informed, practical and economically justifiable.

The level is also pitched at MSc Marketing students at Strathclyde University. Some of them will actually have an IT background, and it may help expand their thinking into the management issues involved in scaling web server capacity.

An executive summary and discussion workshop are reserved for the end of this lecture, so that you are better informed before assertions are summarised and further discussion is taken offline. Within the lecture, we will delve into the technical solutions, explaining them accurately in plain English where possible, or building upon your knowledge from my own and other learning materials.


Key Management Issues in Coping with Internet Server Traffic


Scalability, in terms of a web site's server capacity, means being able to handle more of the following:
  • Volume of incoming traffic, i.e. simultaneous user sessions: requests for web pages and information, and a seamless, connected flow through web shops;
  • Quantity and intensiveness of internal data handling of these incoming requests;
  • The power and appropriateness of the back-end database (e.g. MySQL, Oracle);
  • Outward serving of requested web pages and 'dynamic' information.
This is very much a topic for general and entrepreneurial management, because it is the main investment in resources for a 'growing' web site in terms of hardware, telecoms connectivity, software installation for scaling, and incremental manual maintenance.

Often the investment in scaling will be many times the cost of the initial web site development.

Management Perspective: Conflicts of Interest - It Pays to Be Informed

The conflict of interest between general or marketing management on one side, and IT vendors, consultancies and internal IT fiefdoms on the other, very often centres on over-engineering solutions to meet "best practice and industry benchmark standards" rather than matching anticipated needs to a solution which can itself go on to scale further if needed. Not to mince words: IT people often have a vested interest in over-delivery, in terms of their top line as a supplier, or their departmental head count and budget as your IT department.

However, the opposite situation may also be true, whereby a loyal IT department or web-host supplier struggles to shoehorn more capacity into a fundamentally insubstantial architecture, and continues to patch-scale and fire-fight, tying up resources in this rather than in the planning and implementation of a new server solution.

DEFINITIONS

What is a Web Server?


A web server is simply a powerful computer with an Ethernet card capable of handling a larger number of users than a PC or local shared office file server would. The other main differences from your ordinary PC or Mac are that it will run a different operating system (OS), most often Linux these days, and completely different software:
  • Apache, for example, for handling requests for web pages and information, and to-and-from e-commerce transactions from web pages;
  • a dynamic language engine, such as PHP, ColdFusion or ASP, which reads incoming GET URLs and POSTed data, and organises the computation and replies to these;
  • SQL database interface software, such as MySQL, which allows languages like PHP to refer to a larger, well-designed database through a more efficient interface than could be achieved directly in, say, PHP.
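
To make the request-handling role concrete, here is a toy sketch using Python's built-in http.server module as a stand-in for Apache. It is purely illustrative, not production software; the page content and port are invented.

```python
# Toy illustration of the request/response role a web server performs.
# Python's built-in http.server stands in for Apache here.
from http.server import BaseHTTPRequestHandler, HTTPServer

class DemoHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # A real server would dispatch to static files or a dynamic
        # language engine; here we return a fixed page for any URL.
        body = b"<html><body>Hello from the demo server</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

def run(port=8080):
    """Serve forever on localhost; call this to try it in a browser."""
    HTTPServer(("localhost", port), DemoHandler).serve_forever()
```

Each incoming GET request triggers one call to do_GET; the scaling question of this lecture is simply what happens when thousands of such calls arrive per second.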

What do we mean by Scaling Web Server Capacity?

When we talk about scaling web servers, we mean increasing the number of simultaneous user sessions possible on the hardware, and tackling any increased complexity of those user sessions. We increase the computing processor power, connectivity (bandwidth, internal server LAN, on-board RAM and bus speed), and local and back-end database memory.

We do this by utilising:

a) more powerful machines (vertical scaling, see below)
b) more machines, most often "clones" (horizontal scaling)
c) better architecture
d) intelligent load balancing
e) software accelerators / short cuts

LECTURE CONTENT

Goals in Scaling to Meet New Demand

In up-scaling to provide capacity, what do we actually want and need to achieve?

1) A fast and continuous user experience: users expect rapid interaction on the internet, and it is vital to maintain continuity in user sessions.

2) Fidelity, redundancy and back-up of data: we need data stored and retrieved accurately, to an acceptable level of currency, with periodic back-up of "dynamic" data records. Data will need to be duplicated across "load-balance clone servers" or stored in a central file server accessed by the clones. Some of this data will need to be prioritised when populating all the cloned site data stores: e.g. user data for user log-in, or important news items, product launches or deletions. If there is a computer failure or downtime, we would want the site to continue functioning: do we need the complexity to maintain continuity in a user session even if a front-end server goes down?

3) A realistic level of investment and cost control in scaling to meet new and anticipated demand: being able to meet a projected capacity, perhaps with a margin to exceed it, without over-engineering a solution beyond a definable capacity.

4) A known "road map" of potential upgrades to system architecture given projected growth or scenario setting for different potential user numbers and computing intensity.


Traffic Load Monitoring and Planning

Internet traffic for a newly launched web site is by and large chaotic: it is a function as much, or even more, of "word of mouth" (tweets, blogs, RSS feeds, top news sites) as it is of online marketing or offline advertising.

Obviously, though, a web site will see traffic peaks around the key user group's most active times on the internet relative to your offering. You may also see peak traffic relating to promotions, ticket sales or fortuitous on/offline PR. It is possible to hire in extra server banks or set up a queuing system for users, but that is not really within the scope of this lecture; rather, this lecture deals with scaling demand over time.

Over time, ignoring any fortuitous "clicky" PR, you have the following to consider in the equation:

1) underlying growth rate (a cumulative, moving-annual-total comparison of year 2 vs year 1 will help you assess this)

2) seasonality (run rate, i.e. ordinary graphing of year 1; data from similar web sites; moving quarterly totals)

3) daily internet time habits

4) planned marketing, promotions and ranking on the key search engines for keywords
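
The moving-annual-total comparison in point 1 can be sketched in a few lines; the monthly visit figures below are invented purely for illustration.

```python
# Sketch of assessing underlying growth: compare the moving annual
# total (MAT) of monthly visit counts, year 2 vs year 1.
# All figures are invented (thousands of visits per month).
monthly_visits = [
    10, 11, 12, 14, 13, 15, 16, 18, 17, 19, 20, 22,  # year 1
    21, 23, 25, 26, 25, 28, 30, 31, 30, 33, 35, 38,  # year 2
]

mat_y1 = sum(monthly_visits[0:12])    # total for months 1-12
mat_y2 = sum(monthly_visits[12:24])   # total for months 13-24
growth = (mat_y2 - mat_y1) / mat_y1   # underlying year-on-year growth

print(f"MAT y1={mat_y1}k  y2={mat_y2}k  growth={growth:.0%}")
```

The MAT smooths out seasonality (point 2), which is why it is a better measure of the underlying trend than comparing any single pair of months.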

As mentioned below, you can choose different strategies to meet forecast demand, or tactics to cope with sudden peaks in traffic. At the outset of a new web venture, you will ideally have a handle on your marketing budget and on the success of previous web ventures, either internal or from a consultancy.

In this way you can balance a desirable budget for hardware and for setting up the server banks against a set peak demand, and consider this in light of break-even and target income per day, i.e. hits converted to sales.

So, practically, to keep the accountants happy, you would want to be able to cope with the back-end SQL requirements for processing your target sales volume from the projected time of break-even up to the point at which you want to achieve target operational profitability. This would be done in light of a physical measure related to capacity planning, probably with reference to sales transactions per peak-time hour. Meanwhile, you may wish to "lighten" the server load from people not in the shopping channel of your web site, i.e. making casual visitors pay a queuing dividend, or just sending them a "server busy" reply.

At this point you would want to engage the concepts of game theory and scenario building, whereby you consider different influential factors and outcomes and thus create plans, or a framework, to meet these scenarios. Producing the variety of possible peak and steady traffic numbers is beyond the scope of this lecture, but you still need to understand how to cope with each likely traffic volume.

Introduction to the Requirement Definition and
Technical Management in Scaling


Eventually, as visits to and interaction with a web site grow, the level of Internet traffic loads all the hardware on a server to a point of saturation: no more users can be coped with, and the user gets either a loading indicator while the browser pane just "hangs", or a blank page, or, if they are lucky, the following:
Error 503 Service Unavailable
The server is currently unavailable (because it is overloaded or down for maintenance). Generally, this is a temporary state.
What is going on in the server? Well, firstly, the Ethernet card has a limited number of transactions per second - if indeed it is specified to match anywhere near the incoming bandwidth, which may itself be insufficient.

Further into the "layers" of the server computer, RAM and near-CPU cache fill up, the ports' bandwidth fills in and out, any back-end database connections get busy or the allowed user-connection limits are reached, and the CPU in the primary server gets overloaded. At this point various queuing issues arise, or users' requests and interactions with the web site simply fail; a "server busy" error message is often generated. Sometimes the entire server will crash or even overheat.

The major choices in hardware are to either scale 'vertically' or 'horizontally':

Vertically: this means buying better-quality machines (for example, with more CPUs on the board and more Layer 2 cache RAM), more powerful Ethernet ports, and "front end" intelligent traffic routers. The latter integrates with the other route, which is to scale horizontally: i.e. installing more machines running in parallel to deliver the web interaction to many more users (or interconnected machines sharing data repositories and user sessions).

The software needed to load balance is another area which, although inexpensive, can have cost implications in personnel resources for implementation, maintenance and life-cycle management. Another, more expensive software route, requiring more expert intervention and ongoing management, is software which optimises server performance, thus reducing CPU/RAM load: these tools work in different ways, as we will learn, and are most relevant to "dynamic" web sites.

Cost Implications of Scaling

The relatively significant costs of scaling must be balanced against the expected income or utility from the web site. A "best practice" technical solution recommended by a consultancy or vendor may be both an over-engineered solution and more expensive than expanding the server and traffic channel "organically", that is to say, incrementally, to meet demand encountered over time.

Vendors (suppliers) and consultants will of course give a very good outline of the cost per thousand user sessions, and of what level of complexity can be handled by a given set-up (system architecture and software). In implementing a scalable web site which anticipates high demand and a high return on investment, using consultancies and installers such as Cisco, Microsoft Partners, IBM or Oracle, to name but a few, can instil a high level of confidence throughout your own organisation.

Internal corporate IT managers will also be able to indicate how much load the current system handles, or how much a "cloned" horizontal scaling (see below) replicating the current server would cost.

Anticipating demand, and therefore traffic load levels, for a telecoms connection can be difficult, given that search-engine optimisation, links on leading web sites, fortuitous search-engine listings or successful on/offline marketing and PR campaigns can all deliver more traffic than allowed for in service-level agreements, and trigger punitive over-capacity charges from the bandwidth telecoms supplier.

Connective bandwidth is one area for cost control and planning, while the actual number and quality of server computers invested in is another, and an area of hidden costs or overly expensive investment.

Hardware will contribute the majority of your costs, but software can rapidly add costs at initial implementation or in ongoing operational resources.


Management Perspective: Take a scenario. You implement a solid hardware-based solution from a leading vendor (supplier). The functionality of the implementation (i.e. the web site's dynamic features) grows beyond the initial specification for hardware capacity, and it becomes desirable to implement different levels of load balancing, caching and acceleration. The original consultancy re-quotes, and this is outwith budget. You engage a small consultancy, who want to install a cutting-edge solution. The solution works at first, but when you alter the web site structure it stops working; the consultancy is bankrupt, and no programmers will certify a fix on the software. You should perhaps have opted for a more tested software solution, and reached a compromise with the consultancy to budget for cooperation with a cheaper, specialist supplier. You would then also have reduced the maintenance issue, as common solutions are often taught in computer science courses or learned on the job.

In terms of working around a limited budget, where the utility of the web site may be reasonably high but income low, or where it is in fact a social enterprise (NPO): lower-value users may not actually mind being told they are queued, or that their data will be processed later, such that the current hardware, or that which can be budgeted for, determines the user experience and service level, rather than the reverse.

Database Resources and Hidden Costs

Most dynamic web sites which handle any large amount of data for many simultaneous users use a "back-end database" server. This will be discussed in more detail in a later lecture, but in terms of scaling it too has direct and hidden costs.

A simple web site with some "non-perishable interactivity", like say log-in or a shopping cart, may well be developed without an actual database, or with a "local" database which is just a file depository on the Apache server, accessed by the web language in use through an SQL interface plug-in, so to speak. For example, in PHP you can install SQLite, which uses simplified commands to deposit and retrieve data in a file on the server. However, this will create bottlenecks if you use one "table" (i.e. one file) for all users, or start to use a lot of memory and CPU time if you have a table per session and per user. With a larger number of data fields (name, address and so on), and with the need to store associated files like JPEGs or documents, it soon becomes highly desirable to employ a separate data machine with a powerful database installed on it: the "back-end" database server.
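
The "local database" idea can be sketched with SQLite, here via Python's sqlite3 module rather than the PHP extension; the table and data are invented for illustration.

```python
# Minimal sketch of a 'local' database: SQLite stores everything in one
# file (or in memory) on the web server itself -- fine for light loads,
# a bottleneck once many sessions hit the same table at once.
import sqlite3

conn = sqlite3.connect(":memory:")  # a file path here would persist to disk
conn.execute("CREATE TABLE carts (user TEXT, item TEXT, qty INTEGER)")
conn.execute("INSERT INTO carts VALUES (?, ?, ?)", ("alice", "book", 2))
conn.execute("INSERT INTO carts VALUES (?, ?, ?)", ("alice", "pen", 5))

rows = conn.execute(
    "SELECT item, qty FROM carts WHERE user = ? ORDER BY item", ("alice",)
).fetchall()
print(rows)  # [('book', 2), ('pen', 5)]
```

The convenience is obvious; the scaling problem is that every query competes for the one server's disk, RAM and CPU, which is what pushes larger sites to a separate back-end database machine.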

There are a lot of benefits in moving over to this approach, but there are two main issues for scaling up to a powerful database engine :

1) Physical: the web server to database ratio, and redundancy of data.

2) Re-engineering web sites to new databases.

For the former, if most users do not access the database, then one back-end server may suffice. However, where the function of the web site is like a banking service or Facebook, nearly all users will want to log in and interact with data in the database. We will return to scaling back-end databases below.


Strategies and Technologies for Implementing Scale
in Web Server Capacity:


Vertical Scaling: Expanding Capacity by Increasing the Quality of Machinery

The first means of scaling the capacity of a server comes without any need to reprogram the Apache/PHP/Linux software environment:

Most simple, 'local' server machines, intended for light loads of say 50 to 100 simultaneous user sessions, have the simple physical capacity to add RAM, ports and a second CPU to the motherboard, or a second CPU motherboard. When this is done, the system auto-detects and installs the new hardware, integrating it seamlessly, with all the higher-level software running immediately. Everything runs faster, so more users can be accommodated, and their page requests and form-response actions go quicker.

Bottlenecks
Motherboards have real-world limits (for example 256 GB of RAM, two 3 GHz CPUs, 64-bit buses): the cycles still top out, though, with many users or with complex scripting and back-end SQL interfacing.

There is always a premium for top-end servers, so it is often cheaper to scale on cheaper systems with double the number of servers. Also, if one breaks down, there is another which is up!

So, instead of spending on fast but expensive machines while still carrying the risk of downtime, you scale horizontally:

Horizontal Scaling: Expanding Capacity by Number of Server Machines

This just means deploying more servers to handle the load, and the most common scenario at the primary server level is that these are, in effect, clones of each other in terms of operating system, dynamic web language engine, related code and data repositories. This makes scaling easy to implement, because reprogramming is basically a very rapid "copy and paste" of all program code, systems and information, thus minimising downtime during the up-scale to further multiples of servers, during minor bug fixing, or indeed during any new web site implementation.

In my experience, however, this has not always been possible, because a reliable workhorse of a single server became obsolete over time, meaning it was not powerful enough to run a new OS and newer dynamic language engine. It may actually be discontinued, so the next machine is a "clone" only to the outside world, and is actually running a new environment. In this case there will be significant duplication of manual tasks in updates and so on, so it may be better to migrate the whole site to a new server which is 'vertically' scaled and can scale further on a horizontal basis. The scenario of rapid obsolescence has now been largely overcome by the very common use of efficient Linux OSs and PHP/ASP systems with their accelerators (see below), allowing a reasonable life span for some server machines.

The additional cost of this lies in designing and managing load balancing between the two: for example, you may have an XML-mediated CSV (comma-separated values) database running to millions of records, which might best be split over the two servers, with a partitioning of users "A to L" to server 1 and "M to Z" to server 2. This requires intelligent routing, which we will come back to.

Session ID now becomes complex: if you load balance over to the other server during a session, the UID session and its related temporary data cache on the original server (e.g. shopping cart, log-in, progress through a form) are not available there, and the user could be lost if a subsequent HTTP request goes to the other server.

Duplication of databases is often not practical if strong real-time continuity is needed, so user behaviour can be looked at: perhaps users view many static HTML pages before they start to interrogate the SQL server, or most traffic never even touches the pages containing these requests.

Even sessions can be kept continuous for individual users across different servers, by saving the session's temporary cache and duplicating the UID in a commonly accessible network folder: this needs to be very quick to access, and is an expensive investment on large server banks. Then there is the issue of redundancy for this important part of the architecture: it takes investment, and uses time and space on the internal bandwidth and CPUs.

Load Sharing:

Load sharing, or load balancing, means distributing the HTTP traffic to different machines in the horizontally scaled server bank. This can be achieved at several levels or "layers", and by using either simple or intelligent traffic direction:

1) Multiple IP addresses per domain name on DNS servers can be the first point of load sharing: on the original DNS request from a browser, the DNS can return several IP addresses for the one web site, and most DNS servers have software which simply rotates through the list of IPs for that top-level domain name sequentially. This gives you low-cost load sharing. There are drawbacks, though. Caching at ISP level (virtual DNS request management) can distort this and other changes in IP address, and the browser itself may cache the IP for a domain name. You could allow for days of overlap between servers, but you then risk unsynchronised database entries. It also takes time for a new IP to populate the world's DNS servers, and then the ISPs' virtual DNS servers. Also, if one server IP goes down, a disproportionate number of users can be affected because of the vDNS: even if the other three machines could handle all the downtime traffic, at night for example, while #4 is down, you still lose 25% of visits no matter what you can serve from the machines still up.
(ipconfig /flushdns will clear the local DNS cache at the command line.)
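
The round-robin rotation described in point 1 can be sketched as follows; the IP addresses are invented placeholders from the documentation range.

```python
# Sketch of DNS-style round-robin load sharing: each lookup hands out
# the next IP in the list for the same domain name. IPs are invented.
from itertools import cycle

SERVER_IPS = ["203.0.113.1", "203.0.113.2", "203.0.113.3", "203.0.113.4"]
_rotation = cycle(SERVER_IPS)

def resolve(domain: str) -> str:
    """Return the next IP for the domain, as a round-robin DNS would."""
    return next(_rotation)

first_four = [resolve("www.example.com") for _ in range(4)]
print(first_four)  # each server receives one of the first four requests
```

Note how the scheme is blind: if one of the four IPs is down, the rotation still hands it out a quarter of the time, which is exactly the drawback described above.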

2) "Layer 5 or 7" balancing: intelligent router load balancing and IP sharing. A device owned by you, a router-server, redirects requests for the web site to a bank of servers. One way is e.g. www1.dominoes.com, www2..., so you can control more actual machines with IPs behind the scenes, and load balance without DNS effects; the use of multiple sub-domain names can be annoying for users if they bookmark the address. Routers like this can act intelligently, parsing the packets quickly and sending them to the most relevant server: e.g. database requests for certain folders, page requests, or setting up cookies/sessions.

These routers can themselves be duplicated: the second router can be reserved for out-of-hours periods or for when the primary goes down, with the IP simply passed over at the Ethernet card. Even session information can be balanced, i.e. duplicated between the two routers in case one goes down, because they are (1) really just routing, transparently, and (2) intelligent enough to route users back to the web server they have already been on. These are called "sticky sessions", and can be cookie-mediated at the intelligent-router level.

Partitioning: at the intelligent router-load balancer, log-ins for user names A to L go to one server, with their data held there or in a back-end SQL server.
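
A hypothetical sketch of the partitioning rule such a router might apply; the server addresses are invented placeholders.

```python
# Sketch of router-level partitioning: users whose name begins A-L are
# routed to server 1, M-Z to server 2. Addresses are invented.
SERVERS = {1: "10.0.0.1", 2: "10.0.0.2"}

def route_for(username: str) -> str:
    """Pick the web/database server responsible for this user."""
    initial = username.strip().upper()[0]
    server_id = 1 if "A" <= initial <= "L" else 2
    return SERVERS[server_id]

print(route_for("alice"))  # 10.0.0.1
print(route_for("mark"))   # 10.0.0.2
```

Because the rule is deterministic, the same user always lands on the same server, which is what keeps their data and session in one place.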

So this has a cost-benefit trade-off for online e-commerce shops. "Fibre Channel", for example, is an expensive, rapid-access network file server and connection; NFS is at the other end, being very cheap, low-maintenance database/source file sharing on Unix/Linux.

RAID arrays of managed hard drives were previously very popular because they could manage information redundancy as a shared file/data repository, which overcame some of the issues of potential machine failure and recovery of data after crashes.

Various software is used to balance at Layer 7 or at the primary web server level: LVS, Piranha (Red Hat Linux), HAProxy (on Ubuntu Linux). A pair of Citrix or Cisco load balancers alone costs approximately 100K USD as an installed hardware solution with optimised Linux/Unix software.

Scaling Issues with "Back End Databases"

When a web site has either a large amount of complexity in the fields it reads and writes to in a back-end database, or many more simultaneous users (and of course when both are the case), then either a web-server-directory database or a single back-end database machine will become insufficient.

To cope with the complexity and demand, planning of the hardware, the internal connectivity and the software optimisation will be required, in order to effect good load balancing which gives users a fast web site experience.

This means you have both the issue of horizontal/vertical back-end scaling AND that of load balancing and data sharing such that users have a continuous experience. One way to ensure the latter is to link the log-in to a route to only one unique back-end server: so all users A to D go to that server. More likely, you date-stamp the user and use an Apache server-side "triage" log-in table, such that the user name is recognised, a date found, and the correct back-end server hooked up for that user, irrespective of which web server they have come in on.

Vendors of branded database engines, and experts in the 'freeware' MySQL, will be able to present scenarios for traffic handling:
  • what type of back-end servers (CPUs, OS, RAM, hard drives, RAID systems)
  • and what physical connectivity (Ethernet cards, ports) will be needed;
  • what level of redundancy and back-up is needed, etc.
The above will be taken into account and costs presented, to determine how many simultaneous user sessions can be handled economically, and how the load balancing weighs up against contiguity of data and user session. As mentioned before, there may be an economic case for simply not allowing more users on at peak times, or for re-iterating low-security, frequently accessed data on the Apache server side in small tables dumped out of the database (see SQL caching).

Software and Reprogramming Costs in Changing Back End

The other major issue, when faced with suggestions by a vendor or internal developer to change the back-end database software, is that the database interaction code in the web site front end will ALL need to be re-engineered. SQL commands vary, the database call-ups in PHP/ASP vary, and the plug-in modules which add Apache-side functionality vary in their commands and utility.

Once again there is a vendor-client conflict-of-interest scenario: the vendor has vested interests as a "partner" firm to a branded database supplier, and these may change over time, from MS to Oracle for example. It may look as though, say, extending your MS SQL Server licensing from your existing ERP system, to replace the previous shareware MySQL back end on your web site, is a sensible integration at little cost. However, the programming costs of the conversion may outweigh the costs of horizontally scaling and maintaining MySQL.

It is nearly always possible to create a flat table on a file server which several different database engines can access. In the worst-case scenario, you can just schedule a data dump of flat files (CSV, PSV, etc.) from MySQL to a file location at scheduled intervals, for the data to be populated over to the main ERP system. This latter case was in fact a very common integration method between earlier "green screen" ERP systems (often referred to as "legacy" systems) and e-commerce front ends in the 1990s and early 2000s: orders would be dumped down into a simple separator-delimited flat file every hour or so, and there would be a manual check to see that they were accurate and not duplicates.

Management Perspective: Development Phase and Uncertainty over Choice of Database Engine and Back End Architecture

When presented with the need to enter a development phase for a new dynamic web project, it may be unclear which back-end system to opt for before the actual functionality and user demand are defined. The costs and time penalties of re-engineering to a different back-end system, one which may better suit the specific functionality and higher user demand, may be unforeseen by other managers.

So when prototyping a web site, it may be best to use a very simple, single-server set-up, with for example just XML data depositories or SQLite. From this perspective, the initial programming will be faster and cheaper, and because there will be only a small user test base, server loads will not be an issue. Furthermore, the simple programming and data tables will be very self-evident to the eventual developers. The code will therefore be easier to replace and supplement with higher database command functionality when you upgrade to a more powerful back-end database.


Performance Enhancers

There are diverse strategies for optimising the performance of a system, in terms of streamlining the number of processor tasks and the route the data takes:

1) Compiled Code (Java/C++ etc. Applets) and Opcode Caching
One approach is to run a compiled-language alternative to a higher-level, non-compiled language on the server. PHP is compiled on the fly by the PHP engine, so there is an extra load on the server processor: requests to a PHP file location which are high-traffic and require more logical data processing could instead be handled by a Java server applet which receives the request in the URL or POST from PHP, interprets it in a fixed parser, and then runs a much quicker compiled Java routine.

The downside of this route is that it requires higher programmer skill to update or rewrite the code, in comparison with a higher-level language like Active Server Pages (ASP) or PHP.

PHP and other dynamic web site language engines do have compilers as accelerators, though. Opcode caching means you can have your cake and eat it, so to speak: the server BOTH parses the PHP, keeping it there for reference or re-iteration, AND compiles the actual program to opcodes for execution during normal operation, so web pages and interactions are handled more directly and the on-the-fly compiling is eliminated. In PHP this means tools like eAccelerator, APC or XCache. The program is only recompiled when new PHP code is loaded to the server file location.
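
The effect of an opcode cache can be illustrated by analogy in Python, which similarly compiles source to bytecode: compile once, keep the compiled object, and re-execute it for every "request" instead of re-parsing each time. This is an analogy to what eAccelerator/APC do for PHP, not the PHP mechanism itself.

```python
# Analogy for opcode caching: parse/compile the page source once, keep
# the compiled object, and re-execute it on each 'request' instead of
# re-compiling every time.
source = "total = sum(range(100))"   # stands in for a PHP page's code

code_obj = compile(source, "<page>", "exec")  # done once, then cached

def handle_request() -> int:
    ns = {}
    exec(code_obj, ns)   # executes cached bytecode; no re-parse/compile
    return ns["total"]

print(handle_request())  # 4950
```

Each call to handle_request skips the compile step entirely; on a busy server this saved parsing work is multiplied by every page hit.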

2) Optimised Web Coding for Dynamic Web Sites

i) Think Traffic! Optimise the HTML

From the very top of the home page's HTML document, you start to incur web traffic loading as a multiplier: bytes x users. Coupled with the latest advances in super-dynamic, self-updating web sites, server load can soon become a headache, and you a victim of your own success.

When you consider the many billion hits Google gets every day, even their URL expressions look to be optimised in terms of pure numbers of bytes, by reducing the ASCII characters to a minimum.

Taking Google further: their pages have always been somewhat graphically sparse, with YouTube now neatly partitioned off. This is intentional! It is their brand, and their saving on server load.

The latest web sites present data in page elements which are fully dynamic: they can update every 5 seconds, or whatever periodicity is set; they request data as soon as the user moves the cursor; and so on. This Ajax/JSON/XML-mediated HTTP traffic, and the server intelligence load if you like, can snowball as users come to your site or start to be more interactive. Furthermore, users have to actually close the web page before, say, a five-second share-price or ticket-price refresher stops making those HTTP hits on the server!

Best practice in this field is generally to be seen at, you guessed it, Google: in the search engine, and especially in the Google Maps API, which serves up a graphics-rich drag-scroll environment to millions. Banks also tend to be clever at cutting down HTTP traffic, and perhaps at protecting their sites, by having succinct coding, minimal byte submission, low graphic content and efficient server-side applications.

Software tools ("minifiers") will rewrite your lengthy HTML/PHP for you, as mentioned, to make it more compact, if a little less understandable and lacking any documentation. Simple things like shorter parameter names or file paths can save on bytes.

So at the outset, the route for a high-volume consumer site, in project managing, testing, debugging, beta versioning, launching and eventually server scaling (i.e. the entire web build and life cycle), will be different from that for a low-demand, high-content business-to-business user service.


2) Information and Query Caching


Flat HTML content files are much quicker for a server to process and serve outwards because they are just files with little or no interactive requirement; they load the CPU very little. For web sites built in a dynamic language, however, it can be inefficient to run PHP every time for a home page which only changes now and again: a request for index.php can instead be answered with a current version of that page, previously generated dynamically, in plain HTML.

In fact entire web sites are organised in this way, so they are only dynamic during scheduled processing rounds, when new submissions and versions from users (as on a notice board) or from sources (an RSS or other XML news feed, or even a back-end SQL database) are actually run through the dynamic language and output as HTML/XHTML.
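A minimal sketch of that serve-from-flat-file idea, in Python rather than PHP for readability; the cache file name and the five-minute regeneration interval are arbitrary choices for illustration:

```python
import os
import time

CACHE = "index.html"   # hypothetical cached copy of the dynamic page
TTL = 300              # regenerate at most every 5 minutes

def render_page() -> str:
    """Stand-in for the expensive dynamic render (PHP, SQL joins, etc.)."""
    return "<html><body>Generated at %d</body></html>" % time.time()

def serve_home() -> str:
    """Serve the cached flat file if it is fresh; otherwise regenerate it once
    and write it out, so the next thousand requests cost almost no CPU."""
    if os.path.exists(CACHE) and time.time() - os.path.getmtime(CACHE) < TTL:
        with open(CACHE) as f:
            return f.read()
    html = render_page()
    with open(CACHE, "w") as f:
        f.write(html)
    return html
```

Only the first request in each interval pays the dynamic-generation cost; all the rest are file reads.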

Economic and quality issues for management mean that there has to be a utility-to-cost relationship: is it acceptable for lower-value users to interact with flat, periodically refreshed files? This approach was used quite frequently with Perl-based dynamic web sites in the late 1990s and early 2000s, but it has drawbacks in the immediacy of data. It is appropriate, say, for a catalogue of products with current prices, but totally inappropriate for the shopping-cart side of buying those same products.

This is, however, an economical approach for interactive user-posting web sites where there is actually little monetary revenue from the hosting: it can be handled by simply informing the user that their post, free advert etc. will appear after the next scheduled processing round.

A further area which can benefit from caching of low-churn, low-perishability content is the SQL requests emanating from PHP code to back-end databases: in the ColdFusion (.cfm) days, for example, even a simple "welcome" message in a fully dynamic home page text area would incur ColdFusion engine CPU usage, back-end data-connectivity delay and SQL-server CPU usage.

SQL Query - Result Caching

The response to a given query is cached: once generated, the information can be stored locally on the web server, thus becoming in effect "static" and avoiding both the connection and the CPU use on the back-end SQL machine. This is actually very easy to implement on many web-server/back-end combinations, with simple commands to store popular SQL queries and their results: a repeated query is recognised as a reiteration and the local data served up, while new queries are sent down the usual PHP/SQL-server route.
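The mechanism can be sketched as follows, using Python with an in-memory SQLite database standing in for the back-end SQL server; the table and query are invented for the example:

```python
import sqlite3

# Stand-in for the remote back-end SQL server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (name TEXT, price REAL)")
conn.execute("INSERT INTO products VALUES ('widget', 9.99)")

query_cache = {}  # query text -> cached result rows

def cached_query(sql: str):
    """Serve a repeated query from the local cache; only a query not seen
    before actually travels to the database and burns its CPU."""
    if sql not in query_cache:
        query_cache[sql] = conn.execute(sql).fetchall()
    return query_cache[sql]

rows = cached_query("SELECT price FROM products WHERE name = 'widget'")
```

The second, third and thousandth identical query cost a dictionary lookup instead of a round trip.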

Further to this, and most famously used by Facebook, is the "memcached" approach, which builds a local in-memory store of frequently requested information, queried on the fly from just that server. If you do not update your profile, for example, your little set of user preferences and your UID sit locally on the server (among millions of others, in this case perhaps!). This is more efficient because the underlying information may be stored across diverse tables in the SQL database, whereas the frequently paired information is really rapidly accessed from this small, simple, aggregated local store.
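A sketch of that look-aside pattern, with a plain Python dict standing in for a real memcached client and an entirely hypothetical user table as the back end; note the cache invalidation when the profile actually changes:

```python
# A plain dict stands in for a memcached client; keys and the user table
# below are hypothetical, invented for this sketch.
cache = {}

def fetch_user(uid, db):
    """Return a user's profile, hitting the back-end database only on a miss."""
    key = "user:%d" % uid
    if key not in cache:
        cache[key] = dict(db[uid])  # stand-in for an expensive multi-table join
    return cache[key]

def update_user(uid, prefs, db):
    """Write through to the back end and invalidate the stale cached copy."""
    db[uid]["prefs"] = prefs
    cache.pop("user:%d" % uid, None)

backend = {42: {"name": "Ann", "prefs": "light"}}  # hypothetical back end
fetch_user(42, backend)   # miss: goes to the database
fetch_user(42, backend)   # hit: served from the local in-memory store
```

Until you update your profile, every page view is served from the local store rather than the SQL machine.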

This can also be used for emergency load balancing or backup of frequently accessed information or UID codes: the small in-memory stores are populated across servers, so you may update on one while a "backup" copy sits on another; if the back end goes down, you have at least some redundancy for simple log-ins, posting and so on.

3) Back End Optimisations on SQL Data Servers
Master-slave: here one master handles the writes to file, and then populates several slave SQL servers which deliver that data outwards, thus balancing read queries over the slaves. Given failure of, or upgrades to, the original machine, the slaves should be able to take over the master role.

Load balancers are then used to manage traffic to the slaves and to send write commands only to the master; they can also help manage the cycles of populating the slaves.
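The routing rule itself is simple enough to sketch; here in Python, with made-up server names and a deliberately naive write-detection test (a real load balancer or database proxy is far more thorough):

```python
import itertools

# Hypothetical server names for this sketch.
MASTER = "db-master"
SLAVES = itertools.cycle(["db-slave-1", "db-slave-2", "db-slave-3"])

def route(sql: str) -> str:
    """Send writes to the master; round-robin reads across the slaves.
    Naive: classifies by the first SQL keyword only."""
    is_write = sql.lstrip().split(None, 1)[0].upper() in {"INSERT", "UPDATE", "DELETE"}
    return MASTER if is_write else next(SLAVES)

route("INSERT INTO orders VALUES (1)")  # -> "db-master"
route("SELECT * FROM orders")           # -> "db-slave-1"
```

Reads, which dominate most web workloads, are thus spread over as many slaves as you care to add.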

RAID arrays of managed hard drives were previously very popular because they could manage information redundancy as a shared file/data repository, which overcame some of the issues of potential machine failure and recovery of data after crashes.

Also you may find that the database tables, navigation and actual hardware are not sufficient, so you would want to scale vertically as load renders your current software/hardware combination inadequate for the traffic and processing encountered or expected.

4) Partitioning Users and Their Data
This can first and foremost be achieved geographically: Facebook UK, say, could store all user profile information for that market, or partition on a regional basis. You could also partition on user name, just alphabetically, or based upon some knowledge of surfing behaviour.
If we therefore know how many current and/or potential users there are for a given partitioning, then we can invest resources specifically in relation to that population "n".
This is most applicable where the "global" information to be served to users is somewhat limited in complexity, while local data relates more closely to the partitioned group. Thus the updates from a "super master" database server are global and scheduled at night time, for example, whereas for that partition the relevant information is virtually live, but not accessible outside that partition.
You could then impose a level of partition screening, or virtual partitioning: if, for example, a user logs in from abroad to their home-country account, all the servers in the world know the simple user ID and which partitioned server to redirect the user to.
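A toy sketch of such deterministic partitioning in Python; the shard names are invented, and a stable hash (CRC32 here, rather than Python's per-process salted `hash()`) is used so that every front-end server computes the same answer without any lookup table:

```python
import zlib

SHARDS = ["eu-1", "eu-2", "us-1", "us-2"]  # hypothetical partitioned servers

def shard_for(user_id: str) -> str:
    """Deterministically map a user ID to its partition via a stable hash,
    so any server in the world can compute the redirect independently."""
    return SHARDS[zlib.crc32(user_id.encode()) % len(SHARDS)]
```

A user logging in from anywhere is routed to the same shard every time; capacity can then be planned per shard against its known population "n".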

Summary of Key Management Issues in Scaling Web Server Capacity

The Key Management issues and decisions centre on the following:

  • Cost-benefit: this can have its own KPIs, for example cost per user, cost per thousand simultaneous transactions, and scaling-related cost-outcome scenarios.
Do we want to offer a very high quality of web experience to a defined number of users? If so, then vertical scaling based upon known traffic is a good solution: to speed up the user session, to integrate processing-hungry operations on the server side, or to allow larger outward downloads of files and scripts to the user's local computer via their browser.

Do we want to prepare for explosive growth in demand? Will we lease machinery at our own location, or from the web host, to cope with demand from a marketing campaign? Do we expect to introduce far more dynamic code which is processor- and internal-bandwidth-intensive? If so then we must consider both effective horizontal hardware scaling and optimisations in performance through software acceleration, caching of data and intelligent routing.

Alternatively, do we expect predictable growth in demand, or a slow rate of new, incremental traffic? If so we would want cost-effective horizontal scaling which can grow organically and pay for itself within defined time and margin parameters. Can we roll out implementation to selective customer groups or geographical markets, and thus plan for a known or estimable capacity? A horizontal solution with partitioning could be a very cost-definable means to plan for a maximum forecast capacity from partitioned user groups.

Scale... and scalability: one issue is to achieve the scale you require for current or anticipated levels of web traffic; the other is to have an architecture which allows for further scaling, in terms of both traffic and the complexity of your web site. Will the web site start to search much larger data sources to offer users a richer experience? Will we roll out our full range of products, old stock, components and consumables onto the web shop? Will we force lower-end users over to web information and shopping? All of these can incur extra CPU, RAM and internal bus (bandwidth) load through volume of users, complexity of code and SQL back-end computing.

As a manager faced with exponential growth in user traffic, plus integration to ERP and legacy data, capacity planning must be an open discussion with IT departments, web hosts and ISPs, with eventually the DNS authorities (Network Associates etc.) being involved to help balance your load.

Risk: do we accept a level of risk in either reducing costs or implementing a more innovative, higher-performing web experience for our target market? Or do we opt for safer technology with generally available personnel skills? Do we over-engineer to industry "best practice", or actually allow for a degree of over-capacity? Do we risk covering only a minimum projected demand with the initial investment, while making a road map for scaling at a later date? Finally, how much do we expose ourselves to the risk of having peculiar systems delivered by either suppliers or internal IT developers?

Location: do you want to continue hosting at a virtual web host? "Will they be able to integrate a shop solution with your ERP order and inventory system?" would be a good place to start asking questions, and you may want to benchmark the performance of other similar solutions they host which connect to ERP at a client: are these slow at peak times? Are their computers suitably secure and redundant, and do they offer enough backup and redundancy for your implementation? How much extra bandwidth cost is incurred? Will a rapid expansion in traffic to your site incur punitive surcharges which you are contractually obliged to cover? It may therefore be preferable to host on your own servers, in terms of quality in e-commerce and security of sensitive data. However, do you actually command the bandwidth and speed, in terms of being near to a fibre-optic "backbone" or "ring"?

The concepts of Cloud Computing and Edge Computing relate to a high level of interaction and redundancy made possible by super fast backbone communications and the sharing of computer capacity in terms of uptime, port traffic-capacity, RAM/Disc memory and processor power. These are a bit beyond the scope of this current lecture series, but are an emergent area which will have value for both large multinational corporates and web-brands, as well as for egalitarian, information based services, like perhaps Wikipedia.

Finally on location: will it be desirable to partition traffic from different geographical areas so as to load balance? How do we avoid cross-domain-policy clashes? Do we in fact want to implement new domain names for different geographical regions so we can serve nearer to each market, or do we need to enter a Google/Facebook-style agreement with the DNS authorities, allowing the IP addresses behind one dot-com to serve users from the nearest regional web server?


Crisis management, a topic for a future post: how and when do we bump users during supernova demand?

(c) Authors 2011

Sources: 1) Wikipedia: links given and all authors recognised.
2) Harvard University, Department of Computer Science, "E-75: Building Dynamic Web Sites", online streamed lectures (Academic Earth), Professor David Malan.

3) VM Ware Blog VM Ware World Headquarters Palo Alto, CA 94304 USA
