This is a different focus from the ideas in the 'good enough' post. That post was concerned not with the actual solutions, which in many of these online business models are indeed ingenious, but with the selection of the problems themselves. There, the issue was that the creators of these sites chose the wrong problem to solve: even if they solved it brilliantly, the existing solutions had already been deemed "good enough" by the bulk of users, making even a brilliant new solution a not very profitable one. In my own work, many of the problems I've sought to solve have been of the type 1 variety detailed in the previous post. They are based on incremental improvements to aspects of problems that are already out there. For example, there are currently dozens of companies providing "frameworks" or platforms for building applications online. What many people don't see are the hidden issues that can come back to bite those projects as they begin to experience any level of growth. One in particular that has struck many a web 2.0 company is the problem of scaling.
In software design, scaling wasn't really an issue in the past. When software ran on a single PC, the user behind the keyboard was usually the only one requesting that PC's resources, so all or most of those resources could be devoted to the application currently in focus. Microsoft Word is not, and never was, designed to scale beyond use by one User at a time. This is true of pretty much any stand-alone software that you can install and run on your PC: the designers focused on satisfying the needs of a single user at a given time. That design perspective constrains the solutions employed, and those solutions can be at odds with the issues encountered in web software development. Unlike web software, stand-alone software pays no attention to pooling resources in order to improve allocation among multiple Users. It may pool resources between one user's requests, but in most cases that is not needed, since User time is much, much slower than a single application's execution time. Software designers who recognize the differences between the two areas are best able to employ best practices and complete designs and implementations faster. Web software, in contrast, has several important differences that must be catered to in the design if it is to be at all successful:
- Web Software by definition is geared toward simultaneous use by multiple users and thus must accommodate that use between users in a transparent fashion.
- Web Software should allow individual Users to see the software as if they are the only User and not have to pay a performance penalty to have the experience.
- Web Software should efficiently pool common aspects of the functions and services utilized by the Users in order to improve efficiency and scalability.
- Web Software that is well designed for multiple users can still be inefficient if it cannot extend its efficiencies over a clustered environment. One awesome piece of software on a server that goes down is of no use to any connected User, even if that one server was able to support 50,000 Users simultaneously. So distributed web software is critical to scalability.
- Web Software that is distributable still can reach a glass ceiling if the individual nodes are not designed to efficiently throttle and/or route requests between server nodes.
- At the highest level of scalability, entire branches of nodes separated by geographical region in many cases need to efficiently route requests between one another as load increases. This allows servers connected over a WAN to distribute load between regions. It gets no better than this, and if you hope to ever be a Google or an Amazon and your web software doesn't take account of this level of scalability, at some point you are going to face this issue.
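The pooling point in the list above can be illustrated with a minimal sketch. It assumes a fixed-size pool of some expensive shared resource; `ResourcePool`, `FakeConnection` and `serve_users` are hypothetical names made up for this example (the fake connection simply stands in for something like a database connection), not part of any particular framework.

```python
import queue
from contextlib import contextmanager


class FakeConnection:
    """Hypothetical stand-in for an expensive resource such as a database connection."""
    _next_id = 0

    def __init__(self):
        FakeConnection._next_id += 1
        self.id = FakeConnection._next_id

    def query(self, user):
        return f"profile for {user} via connection {self.id}"


class ResourcePool:
    """Fixed-size pool: many user requests share a few expensive resources."""

    def __init__(self, factory, size):
        self._idle = queue.Queue(maxsize=size)
        self.created = 0
        for _ in range(size):
            self._idle.put(factory())
            self.created += 1

    @contextmanager
    def borrow(self, timeout=1.0):
        # Raises queue.Empty if every resource stays busy for the whole timeout.
        resource = self._idle.get(timeout=timeout)
        try:
            yield resource
        finally:
            # Always hand the resource back so the next request can reuse it.
            self._idle.put(resource)


def serve_users(pool, users):
    """Serve each user's request with a borrowed, shared connection."""
    results = []
    for user in users:
        with pool.borrow() as conn:
            results.append(conn.query(user))
    return results
```

With a pool of two connections, `serve_users(pool, users)` can serve a hundred users without ever creating a third connection, which is the kind of sharing a single-user desktop program never has to think about.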
So web software engineering is a completely different world of issues that the architect must juggle in order to design a solution that works efficiently at different levels of load. There are many approaches to these issues, and the network design heavily influences both the scalability that can be achieved and the amount of work required to achieve it. Over the last 15 years, network architectures for scalable sites have gone from mostly 3 tier architectures to 2 tier ones.
In my designs I prefer the monolithic two tier architecture for its simplicity and reduced cost compared to 3 tier. Today a two tier design also allows the serving agent (the web server) to be married to the business logic and processing agent (the code that actually runs the business and mediates transactions with the database), eliminating latency issues between the two. The proliferation of application server technologies (Java, .NET, PHP) that both process business logic and serve pages has allowed such designs to emerge as efficient. They eliminate an entire tier of servers and associated resources and significantly change the scalability constraints. If the web server is married to the application server, intelligent routing can be designed between the nodes, something older 3 tier designs didn't have. The app server can be designed to determine its own resource utilization and decide whether it should handle a request or route it to another app server. If that behavior can be modified dynamically, the architecture is flexible in the face of sudden changes in load. This elasticity is very important in a world where DDoS attacks can assert themselves within a matter of seconds. Scalable designs should efficiently handle such extreme disaster scenarios, as they aren't as unlikely as some designers would like to believe. Secondarily, this elasticity ensures a design that will stay robust as expected growth, or even better, unexpected growth, assails the architecture.
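As a rough sketch of the handle-or-route decision described above, and not a description of any particular product, each node below tracks its own utilization and forwards a request to a less loaded peer once it passes a threshold. The `AppNode` class, the use of concurrent request count as the load metric, the 0.8 default threshold and the hop limit are all assumptions made for illustration.

```python
MAX_HOPS = 2  # assumed limit so a request cannot bounce between nodes forever


class AppNode:
    """Combined web/app server node that decides to handle or route each request."""

    def __init__(self, name, capacity, threshold=0.8):
        self.name = name
        self.capacity = capacity    # max concurrent requests (assumed load metric)
        self.threshold = threshold  # can be changed at runtime to adapt to load
        self.active = 0
        self.peers = []

    def utilization(self):
        return self.active / self.capacity

    def handle(self, request, hops=0):
        """Return the name of the node that served the request, or None if shed."""
        busy = self.utilization() >= self.threshold
        if busy and hops < MAX_HOPS and self.peers:
            peer = min(self.peers, key=AppNode.utilization)
            # Only forward when the peer is genuinely less loaded than we are.
            if peer.utilization() < self.utilization():
                return peer.handle(request, hops + 1)
        if self.active >= self.capacity:
            return None  # every option is saturated: shed the request
        self.active += 1
        return self.name
```

Because `threshold` is an ordinary attribute, an operator or a monitoring loop could raise or lower it while the node is running, which is the dynamic behavior the paragraph above argues for. A real node would also decrement `active` when a request completes; that is left out to keep the sketch short.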
The last few years have seen a flood of web 2.0 companies offering cool technologies that were brainstormed, quickly pitched to acquaintances, and then quickly prototyped and pitched to angels or VCs. I see these models as risking disaster as they grow. The first issue with web 2.0 companies that throw together services is their reliance on a single technology foundation to provide their service. One in particular that has formed the basis of a majority of so-called web 2.0 sites is Flash. Flash, now owned by Adobe (which acquired it along with Macromedia in 2005), was created to allow client experiences that were more like desktop experiences. It does this by having the client do more of the rendering work than the server. This is facilitated by a language that allows the client to draw simple geometric shapes, fill them, and animate them to produce games, UI elements and even video (as in YouTube). Flash itself is older than the web 2.0 resurgence, which most say started after the 2001 dot com bubble burst. Back then, though, it was really slow on clients (all that rendering required some nice horsepower). With Moore's law still going along and the rise of multi-core processor architectures (not to be confused with the software architectures previously described), clients were quickly able to run more and more complex Flash with less of a performance hiccup. It was only when this was possible that the web 2.0 companies that used Flash started to pop up all over like weeds. However, web 2.0 is not only about Flash. The other side of the trend is manned by the oft-heard AJAX.
It turns out that cleverly used AJAX techniques allow very dynamic responses to be created in the browser that formerly could only be done on desktops (or with Flash or Java). Data tables can be displayed in near real time as items are selected from a list. Links can be made to dynamically fetch related definitions, among other cool things. So web 2.0 owes its rise to the dual tools of AJAX and Flash, but there are major differences between the two approaches. First, because of its reliance on rendering, Flash requires more of the client's processor horsepower to display its fancy interfaces or produce its video. This means that an application built in Flash tends to require a lot more horsepower to give a similar experience to the same application built with AJAX.
This takes us full circle, back to the problem of scalability. In the case of AJAX applications, the standards-based nature of the UI and server technologies ensures that developers can freely create their own architectures to get the best possible scalability. Flash developers are limited to the scalability of the Flash Communication Server ("Flash com") built by Adobe alone, and this has led to many Flash-based applications not being cost effectively scalable. What's the use of all that pretty functionality if delivering it to millions of users is going to cost an arm and a leg? YouTube is a notable company that gets around the problem through its ad-based business model: the revenue coming in is huge and allows it to deploy Flash com servers as needed to handle the load of video requests coming in every second, but YouTube is paying Adobe a cart load in license fees to do that. Though there are quite a few other Flash video sites, they are nowhere near as scalable, nor do they have as many users, as the big two, "google video" and "youtube".
Comparison of video sharing sites
It is clear from the graphs that even second place is ridiculously far behind YouTube, despite the fact that they have identical back end technology. The main difference is that YouTube had first mover status in this segment, and that gave them the "eye power" that advertisers just love. That money allows them to field the servers needed to serve all that video cost effectively. Since each user gets his own Flash stream and there is no interaction between users, the scalability works as long as you have enough ad revenue to pay for the servers. A different type of scalability problem appears when we look at another market: chat.
There are many providers of online web chat services. The proliferation of traditional stand-alone IM clients (AOL, Yahoo Messenger, ICQ, MSN) in the late 90's gave way to providers of combined login capabilities such as Gaim and Trillian. In the web 2.0 age, the logical thing to do was to use Flash to provide chat online through a web page. Great idea! New sites popped up offering chat in the browser, from Meebo to Userplane and various sites in between. Flash chat is all the rage, but Flash chat does not scale as easily as Flash video.
The main reason is that unlike video, which plays in a silo-like fashion for each client's request, chat by necessity requires interaction between the Users in a room. This requires polling the participants in the room every so often to find out if they are still there, which lets the other participants know that they can continue to chat. The polling must be mediated by the server (it can be done on the client, but that would be extremely inefficient). The server must run a Flash server that manages the state of each room and the participants chatting in it, manages the polling of the users and of their messages, and updates all the Users' screens every time any one of them sends a new message. These actions are all incredibly resource intensive: the server is constantly calculating who needs the message, who doesn't, who is still online and who is not. Doing this efficiently requires servers with big brains (powerful processors or processor cores) and deep memories (to manage the room data in real time for all the thousands of rooms that the Users might be chatting in). For this reason, Flash chat services online are either expensive to run (or to license to users) and offer good performance, since you are in part paying for bigger brains and deeper memories, or they are cheap with poor performance. It is a problem space that is ripe for better solutions...enter AJAX chat.
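Before moving on, the server-side bookkeeping described above can be sketched in a few lines. This is an illustrative toy, not how any of the named services actually work: `ChatServer`, the 30 second presence timeout, and the explicit `now` timestamps (passed in so the example is deterministic) are all assumptions.

```python
class ChatServer:
    """Toy in-memory chat server: presence polling plus per-message fan-out."""

    def __init__(self, presence_timeout=30.0):
        self.presence_timeout = presence_timeout
        self.rooms = {}      # room -> {user: time of last check-in}
        self.inboxes = {}    # (room, user) -> messages awaiting that user's next poll
        self.deliveries = 0  # total message copies queued, a rough measure of work

    def poll(self, room, user, now):
        """A client checks in: refresh its presence and collect pending messages."""
        self.rooms.setdefault(room, {})[user] = now
        return self.inboxes.pop((room, user), [])

    def expire(self, room, now):
        """Drop participants who have not polled within the timeout."""
        members = self.rooms.get(room, {})
        gone = [u for u, seen in members.items() if now - seen > self.presence_timeout]
        for user in gone:
            del members[user]
            self.inboxes.pop((room, user), None)
        return gone

    def send(self, room, sender, text, now):
        """Queue one copy of the message for every other participant still present."""
        self.rooms.setdefault(room, {})[sender] = now
        self.expire(room, now)
        for user in self.rooms[room]:
            if user != sender:
                self.inboxes.setdefault((room, user), []).append((sender, text))
                self.deliveries += 1
```

Even in this toy, every message costs one delivery per other participant, so a room of n users who each send one message generates n(n-1) deliveries, on top of the presence checks. That quadratic growth per room, multiplied across thousands of rooms, is where the big brains and deep memories go.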
The first major advantage of AJAX chat over Flash chat is cost. AJAX chat is based purely on standards, or on standards mixed with proprietary technology. There are no Flash com license fees in sight, since the polling and room management is done by a suite of technologies, some of which are themselves free. It is possible to implement Flash servers without Flash com fees, but those implementations require a technology that either does it for you (which in a sense leaves the scalability design to them and not you) or that you build yourself, which necessarily costs time and money. Hence the critical importance of developing a scalable architecture in the first place, as well as procuring servers with bigger brains and deeper memories. This is what has NOT happened with all the Flash chat and AJAX chat options out there. They were all cobbled together quickly in the space of a few months, with massive scalability an afterthought...but that decision is coming back to bite as the numbers continue to balloon. For those that chose Flash for its speed of implementation, a whole swath of mobile devices can't use it, putting those devices off limits for their services for the time being. With AJAX, the same problems exist with regard to polling and room updates, but they can be distributed over the servers without a licensing penalty. If your site only serves a few thousand Users at a time, either approach is fine. But if it is to scale to millions of users, the proliferation of servers (in the Flash solution) becomes increasingly cost ineffective, especially if the chat service is a "free" ad-supported one where new users will create rooms willy-nilly just for the fun of it. The AJAX solutions would then, in theory, be more cost effective as scalability extends into the millions of users, IF those solutions are designed to scale as described in the first part of the article.
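The cost argument above can be made concrete with a back-of-the-envelope sketch. It assumes licensing is charged per server, and every number in the usage note is a made-up placeholder rather than real pricing or a measured capacity; `monthly_cost` is a hypothetical helper.

```python
import math


def monthly_cost(users, users_per_server, server_cost, license_per_server=0.0):
    """Total monthly cost for enough servers to hold `users` concurrent users.

    All inputs are illustrative assumptions: the per-server capacity, the
    hardware cost and any license fee would have to come from real measurements
    and real price lists.
    """
    servers = math.ceil(users / users_per_server)
    return servers * (server_cost + license_per_server)
```

With placeholder figures of 5,000 users per server and 200 per server per month, a million users needs 200 servers and costs 40,000 without licensing. Adding a placeholder license of 300 per server raises that to 100,000. The absolute numbers mean nothing, but under this model the license premium grows in step with the server count, which is why it hurts most for free, ad-supported services with millions of users.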
If you look at all the current providers, you see directly in their performance a reflection of how much brain time went into developing a scalable architecture. Anyone who has used Flash chat provider Userplane for even a single chat knows how frustrating it can be when it hiccups and stutters under the strain of updating to reflect new messages and user status. AJAX chat providers like Campfire have responsive interfaces, but it is unknown whether they can scale to millions of users (to be fair to them, the designers openly say that their goal was to serve the critical needs of a few clients). Userplane is trying to allow its services to be used in a distributed fashion by licensing them to commercial sites as an embedded component, but I gather their attempts to diversify into paid tier services are a requirement to keep the service cost effective as well as a value add that will reap more profit. There is a much larger market to tap here, and it is undetermined whether any current player can access it...
Userplane startup review
We should expect to soon see solutions that address the massive scalability issue, as the market for online collaboration is anticipated to reach nearly 3 billion dollars in revenue for business uses alone by 2010. The company or companies that can provide the needed solutions for cost efficient scaling to very large numbers of clients (including those incredibly multiplying browser-enabled cell phones) will be in the driver's seat to dominate the coming market of rabid, cross-platform, consumer-directed web collaboration.
Ajax Chat that won't scale (dog slow even with an empty room)