
good enough redux...

In this post, I made the case that many internet companies are busy solving problems that aren't in need of solution. I want to differentiate the conclusion of that article from an idea I've expressed elsewhere: that the optimal solution to a given problem over the long term is one that covers the largest possible scope of the problem landscape. Over time, such solutions reduce costs for the implementer. Maintenance and scalability issues tend to kill projects as complexity builds to unmanageable levels, in the form of constant dam plugging and retrofitting for problems that weren't in the original solution scope.

This is a different focus from the ideas in the 'good enough' post, which concerned not the actual solutions, many of which in these online business models are indeed ingenious, but the selection of the problems themselves. The creators of those sites chose the wrong problem to solve. Even if they solved it brilliantly, the existing solutions had already been deemed "good enough" by the bulk of users, making even a brilliant new solution an unprofitable one. In my own work, many of the problems I've sought to solve have been of the type 1 variety detailed in the previous post: incremental improvements to aspects of problems that are already out there. For example, there are currently many dozens of companies providing "frameworks" or platforms for building applications online. What many people don't see are the hidden issues that come back to bite those projects as they begin to experience any level of growth. One in particular that has struck many a web 2.0 company is the problem of scaling.

In software design, scaling is something that in the past wasn't really an issue. When software ran on a single PC, the user behind the keyboard was usually the only one requesting that PC's resources, so all or most of those resources could be devoted to the application currently in focus. Microsoft Word was never designed to scale beyond use of its features by more than one user at a time. This is true of pretty much any stand-alone software you can install and run on your PC: the designers focused on satisfying the needs of a single user at a given time, and that perspective constrains the solutions employed in ways that are at odds with the issues encountered in web software development. Unlike web software, stand-alone software pays no attention to pooling resources in order to improve allocation across multiple users. It may pool resources between user requests, but in most cases that is not needed, since user time is much, much slower than an application's execution time. Software designers who recognize the differences between the two areas are best able to employ best practices and complete designs and implementations faster. Web software, in contrast, has several important differences that must be catered to in the design if it is to be at all successful:

  • Web software by definition is geared toward simultaneous use by multiple users and thus must accommodate that use transparently.
  • Web software should allow individual users to experience the software as if they were the only user, without paying a performance penalty for that experience.
  • Web software should efficiently pool the common functions and services utilized by users, to improve efficiency and scalability.
  • Web software that is well designed for multiple users can still fall short if it cannot extend its efficiencies over a clustered environment. One awesome piece of software on a server that goes down is no use to any connected user, even if that one server could support 50,000 users simultaneously. So distributed web software is critical to scalability.
  • Web software that is distributable can still hit a glass ceiling if the individual nodes are not designed to efficiently throttle and/or route requests between server nodes.
  • At the highest level of scalability, entire branches of nodes separated by geographic region need to efficiently route requests between one another as load increases, allowing servers connected over a WAN to distribute load between regions. If your web software doesn't account for this level of scalability and you hope to ever be a Google or an Amazon, at some point you are going to face this issue.
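The third point above, pooling common resources across users, can be made concrete with a small sketch. This is a toy illustration, not any particular server's API: a fixed set of "connections" is pre-opened and shared by every user's request, rather than opening one per user.

```javascript
// Minimal sketch of resource pooling across users (hypothetical example):
// a fixed number of connections is shared and reused by all requests,
// instead of each user opening and tearing down their own.
class ConnectionPool {
  constructor(size) {
    // Pre-open a fixed number of "connections" shared by every user.
    this.idle = Array.from({ length: size }, (_, id) => ({ id }));
  }
  acquire() {
    // Hand an idle connection to the caller, or null if all are busy.
    return this.idle.pop() || null;
  }
  release(conn) {
    // Return the connection so the next user's request can reuse it.
    this.idle.push(conn);
  }
}
```

Because user "think time" dwarfs execution time, a small pool like this can serve far more users than its size suggests, which is exactly the efficiency stand-alone software never needed.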

So web software engineering presents a completely different world of issues that the architect must juggle to design a solution that works efficiently at different levels of load. There are many approaches; the network design heavily influences the scalability that can be achieved and the amount of work required to achieve it. Over the last 15 years, network architectures for scalable sites have moved from mostly three-tier designs to two-tier ones.

Multi-tier architectures

In my designs I prefer the monolithic two-tier architecture for its simplicity and reduced cost compared to three tiers, and because today a two-tier design allows the serving agent (the web server) to be married to the business logic and processing agent (the code that actually runs the business and mediates transactions with the database), eliminating latency between the two. The proliferation of application server technologies (Java, .NET, PHP) that both process business logic and serve pages has made such designs efficient. They eliminate an entire tier of servers and associated resources and significantly change the scalability constraints. If the web server is married to the application server, intelligent routing can be designed between the nodes that older three-tier designs didn't have: the app server can determine its own resource utilization and decide whether to handle a request or route it to another app server. If that behavior can be modified dynamically, the architecture is flexible under sudden changes in load. This elasticity is very important in a world where DDoS attacks can assert themselves within a matter of seconds; scalable designs should handle such extreme disaster scenarios efficiently, as they aren't as unlikely as some designers would like to believe. Secondarily, elasticity ensures a design that will remain robust as expected growth, or even better, unexpected growth, assails the architecture.
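The routing decision described above can be sketched in a few lines. This is a hedged illustration, not a real product's API; the load figures and threshold are hypothetical numbers a node might gather from its own metrics and from its peers:

```javascript
// Sketch of load-aware routing between app-server nodes (hypothetical):
// a node handles a request itself while under its utilization threshold,
// otherwise it forwards to the least-loaded peer it knows about.
function routeRequest(selfLoad, peerLoads, threshold) {
  // Handle locally while we are under the utilization threshold.
  if (selfLoad < threshold) return "local";
  // Otherwise find the least-loaded peer node.
  let best = null;
  for (const [node, load] of Object.entries(peerLoads)) {
    if (best === null || load < peerLoads[best]) best = node;
  }
  // If every peer is as loaded as we are, handle locally anyway.
  return best !== null && peerLoads[best] < selfLoad ? best : "local";
}
```

Making the threshold adjustable at runtime is what gives the architecture the elasticity discussed above: under a sudden load spike, nodes can start shedding requests to peers within seconds.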

The last few years have seen a flood of web 2.0 companies offering cool technologies that were brainstormed, quickly pitched to acquaintances, then quickly prototyped and pitched to angels or VCs. I see these models as risking disaster as they grow. The first issue with web 2.0 companies that throw together services is their reliance on a single technology foundation to provide their service. One in particular that has formed the basis of a majority of so-called web 2.0 sites is Flash. Flash (created by Macromedia, later acquired by Adobe) was designed to allow client experiences more like desktop experiences; it does this by having the client do more of the rendering work than the server, using a language that lets the client draw simple geometric shapes, fill them, and animate them to produce games, UI elements, and even video (as on YouTube). Flash itself is older than the web 2.0 resurgence, which most say started after the 2001 dot-com bubble burst. Back then it was really slow on clients (all that rendering required some nice horsepower), but with Moore's law still going along and the rise of multi-core processor architectures (not to be confused with the software architectures previously described), clients quickly became able to handle more and more complex Flash with less of a performance hiccup. Only when this was possible did the web 2.0 companies that used Flash start to pop up like weeds. But web 2.0 is not only about Flash; the other side of the trend is manned by the oft-heard AJAX.

AJAX, which stands for Asynchronous JavaScript and XML, is an umbrella term for technologies that again bring the web client experience closer to a desktop experience, but they do it by performing actions that browsers previously performed sequentially with user events in non-sequential (asynchronous) ways. The realization was that if the browser could fetch some data beforehand, in anticipation that the user will want it, it should, to reduce the perceived delay when the request is actually made. The asynchronous call is done using JavaScript, and the heart of AJAX is one particular JavaScript object, the XMLHttpRequest, that I've talked a little about before.
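The anticipatory-fetch idea can be sketched as a small cache of in-flight requests. In a browser the fetch below would be an XMLHttpRequest round trip; here it is a stand-in async function so the pattern itself is visible (the function and key names are hypothetical):

```javascript
// Sketch of anticipatory (asynchronous) fetching: data the user is
// likely to want is requested ahead of time and the pending result is
// cached, so the later "real" request resolves without a new round trip.
const cache = new Map();

async function fetchData(key) {
  // Stand-in for an XMLHttpRequest round trip to the server.
  return `data for ${key}`;
}

function prefetch(key) {
  // Kick off the request in the background and remember the promise.
  if (!cache.has(key)) cache.set(key, fetchData(key));
}

async function get(key) {
  // If prefetch already ran, this reuses the stored promise; if not,
  // it falls back to fetching on demand.
  prefetch(key);
  return cache.get(key);
}
```

Storing the promise rather than the resolved value means a second request for the same key never triggers a duplicate fetch, even while the first is still in flight.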

It turns out that cleverly used AJAX techniques allow very dynamic responses to be created in the browser that formerly could only be done on the desktop (or with Flash or Java). Data tables can be displayed in near real time as items are selected from a list; links can dynamically fetch related definitions, and other cool things. So web 2.0 owes its rise to the dual tools of AJAX and Flash, but there are major differences between the approaches. First, because of its reliance on client-side rendering, Flash needs more of the client's processor horsepower to display its fancy interfaces or produce its video; a Flash application tends to require a lot more horsepower to give an experience similar to the same application running on AJAX.

On the other hand, AJAX cannot do some things as efficiently as Flash. It can't run video, for example, though I am sure someone will figure that out (if they haven't already); the main hindrances are the security restraints placed on the JavaScript that makes AJAX possible. Thus we can see the trend that has developed between the two technologies: if you want audio and video with a responsive interface, you must go with a stand-alone client or use Flash. (Microsoft also has a Flash-like clone called Silverlight, but adoption is currently low, though picking up.) Flash runs as a browser plug-in, so it requires that the plug-in already be installed. That is the case with the latest generation of the most popular browsers (IE, Firefox, Opera, and Apple's Safari), but because Flash is so resource intensive it runs poorly, if at all, on wireless devices. On such devices AJAX interfaces have no problem at all, so long as the browser runs JavaScript, which is now a browser-standard language (not a plug-in). So the developer's choice is to design for Flash or for AJAX, weighing the problem to be solved and the scope of users to be reached. For Flash developers, all those wireless devices that can't run Flash also can't run your whiz-bang Flash application. For AJAX developers, your applications can't really do video or audio natively without some sort of plug-in, so you should focus on problems that don't need native video or audio. Another key distinction is implementation cost. AJAX involves standards-based technologies freely implemented by anyone; not so with Flash, which is an Adobe product. Fans of Flash will say this is a canard, that Flash is open source now, and it is... but running a Flash communication server is not. If you develop Flash communication applications, they must be hosted on a Flash com server, and that you license from one company: Adobe.

This takes us full circle, back to the problem of scalability. For AJAX applications, the standards-based UI and server technologies ensure that developers can freely create their own architectures to get the best possible scalability. Flash developers are limited to the abilities of Flash com servers built by Adobe alone, which has left many Flash-based applications not cost-effectively scalable. What's the use of all that pretty functionality if delivering it to millions of users costs an arm and a leg? YouTube is a notable company that gets around the problem because its business model is ad based: the revenue coming in is huge and allows it to deploy servers as needed to handle the load of video requests coming in every second, but YouTube pays a cartload in license fees to do it. Though there are quite a few other Flash video sites, they are nowhere near as scalable, nor do they have as many users as the big two, Google Video and YouTube.

Comparison of video sharing sites

It is clear from the graphs that even second place is ridiculously behind YouTube. Despite identical back-end technology, the main difference is that YouTube had first-mover status in this segment, which gave it the "eye power" that advertisers just love. That money lets it field the servers needed to serve all that video cost effectively. Since each user gets his own stream of the Flash video and there is no interaction between users, the scalability works if you have enough ad revenue to pay for the servers. A different type of scalability problem appears when we look at another market: chat.

There are many providers of online web chat services. The proliferation of traditional stand-alone IM players (AOL, Yahoo Messenger, ICQ, MSN) in the late 90s gave way to providers of combined login capabilities: Gaim, Trillian, etc. In the web 2.0 age, the logical thing to do was to use Flash to provide chat through a web page. Great idea! New sites popped up offering such services, from Meebo to Userplane and various sites in between. Flash chat is all the rage, but it is not scalable with the same ease as Flash video. The main reason is that unlike video, which plays in a silo-like fashion on each client's request, chat by necessity requires interaction between the users in a room. That requires polling the participants every so often to find out if they are still there, so the other participants know they can continue to chat. The polling must be mediated by the server; it can't be done on the client (well, it can be, but that would be extremely inefficient). The server must manage the state of each room and its participants, poll the users and their messages, and update every user's screen each time any one of them sends a new message. These actions are incredibly resource intensive: the server is constantly calculating who needs the message and who doesn't, who is still online and who is not. Doing this efficiently requires servers with big brains (powerful processors or processor cores) and deep memories (to manage the room data in real time for the thousands of rooms users might be chatting in). For this reason, Flash chat services online are either expensive to run (or license) with good performance, since you are in part paying for bigger brains and deeper memories, or cheap with poor performance. It is a problem space ripe for better solutions... enter AJAX chat.
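The server-side bookkeeping described above can be sketched in miniature. This is a toy illustration, not Flash Media Server's or any vendor's API; the timeout value and method names are hypothetical:

```javascript
// Toy sketch of server-mediated chat-room state: track who is present,
// when they last polled (heartbeat), and which messages each participant
// still needs, pruning anyone who stops polling.
class ChatRoom {
  constructor(timeoutMs) {
    this.timeoutMs = timeoutMs;
    this.participants = new Map(); // name -> { lastSeen, pending: [] }
  }

  join(name, now) {
    this.participants.set(name, { lastSeen: now, pending: [] });
  }

  post(from, text) {
    // Fan the message out to every other participant's pending queue.
    for (const [name, p] of this.participants) {
      if (name !== from) p.pending.push(`${from}: ${text}`);
    }
  }

  poll(name, now) {
    // A poll both fetches pending messages and acts as a heartbeat.
    const p = this.participants.get(name);
    if (!p) return [];
    p.lastSeen = now;
    const msgs = p.pending;
    p.pending = [];
    // Drop anyone who has not polled within the timeout window.
    for (const [other, q] of this.participants) {
      if (now - q.lastSeen > this.timeoutMs) this.participants.delete(other);
    }
    return msgs;
  }
}
```

Even this stripped-down version makes the cost visible: every message is copied per recipient and every poll touches the whole room, which is why thousands of busy rooms demand big brains and deep memories.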

The first major advantage of AJAX chat over Flash chat is cost. AJAX chat is based purely on standards, or standards mixed with proprietary technology; there are no Flash com license fees in sight, since the polling and room management is done by a suite of technologies, some of which are themselves free. It is possible to implement Flash servers without Flash com fees, but those implementations require a technology that either does it for you (which leaves the scalability design to them, not you) or that you build yourself, which necessarily costs time and money. Hence the critical importance of developing a more scalable architecture in the first place, as well as procuring servers with bigger brains and deeper memories. This is what has NOT happened with the Flash chat and AJAX chat options out there: they were all cobbled together in the space of a few months, with massive scalability an afterthought, and that decision is coming back to bite as the numbers continue to balloon. For those that chose Flash for its speed of implementation, a whole swath of mobile devices can't use it, making those devices off limits for their services for the time being. The same polling and room-update problems exist for AJAX chat, but they can be distributed over servers without a licensing penalty. If your site serves only a few thousand users at a time, this is fine, but if it is to scale to millions, the proliferation of servers (in the Flash solution) becomes increasingly cost ineffective, especially if the chat service is a "free," ad-supported one where new users create rooms willy-nilly just for the fun of it. The AJAX solutions would in theory be more cost effective as scalability extends into the millions of users, IF those solutions are designed to scale as described in the first part of this article.
If you look at the current providers, you see directly in their performance a reflection of how much brain time went into developing a scalable architecture. Anyone who has used the Flash chat provider Userplane for even a single chat knows how frustrating it can be when it hiccups and stutters under the strain of updating to reflect new messages and user status. AJAX chat providers like Campfire have responsive interfaces, but it is unknown whether they can scale to millions of users (to be fair, the designers admit their goal was to serve the critical needs of a few clients). Userplane is trying to get its services used in a distributed fashion by licensing them to commercial sites as an embedded component, but I gather its attempts to diversify into paid tiers are required to keep the service cost effective, as well as a value-add that will reap more profit. There is a much larger market here to tap, and it is undetermined whether any current player can access it...

Userplane startup review

We should expect to soon see solutions that address the massive scalability issue, as the market for online collaboration is anticipated to reach nearly 3 billion dollars in revenue for business uses alone by 2010. The company or companies that can provide cost-efficient scaling to very large numbers of clients (including those rapidly multiplying browser-enabled cell phones) will be in the driver's seat to dominate the coming market of rabid, cross-platform, consumer-directed web collaboration.



Adobe Flash

Ajax Chat that won't scale (dog slow even with an empty room)

