Google threw its hat into the cloud computing ring in April 2008 with the introduction of the Google App Engine (GAE). GAE lets developers create and deploy applications on Google's infrastructure, leveraging the company's extensive expertise in scaling BIG. GAE is still a beta product and, given Google's track record for extended beta periods (Gmail, anyone?), it may stay that way for some time. It's free to create and deploy applications that serve approximately 5 million page views per month. If you're not scared by the beta status, are willing to invest some time learning the "Datastore", and are looking for a platform to develop your next web app, GAE is worth exploring.
GAE is intended to compete with offerings like Amazon's EC2, FlexiScale, and Microsoft's new Azure. However, GAE differs in one very significant way: it's not a virtualized server hosted in the cloud. Rather, it's a tightly walled platform which forces you to develop using Python and the Datastore, with Django templating bundled in. To build an application using GAE you have to give up virtually all of the flexibility that competing services offer. On its surface that doesn't sound like a great deal, but Google makes it a trade-off worth considering. With flexibility comes configuration and management. To pick on EC2 for a second: you'll need to get an OS, web server, web application framework, and database server up and running before you can do any development; you'll have to manage and maintain each of these, worrying about downed services and backups; when it comes time to scale you'll have to ensure that your software architecture is scalable; and you'll need the expertise to scale up your server infrastructure, which is not a trivial feat.
GAE exchanges flexibility (and the headaches that come with it) for one-click deployment, near-zero maintenance and management, and virtually transparent scaling. The promise is that with GAE you'll be able to build highly trafficked web applications without having to worry about designing, managing, and scaling your back-end infrastructure. You'll be able to do this at zero cost for 'development' applications. Once GAE reaches general release, you'll be able to pay for additions to your quotas (more on these later) as well as 'production' status and support.
The Good:
It really is as easy as one-click deployment. If you're familiar with Python and HTML you can go from nothing, no infrastructure and no code, to the foundation for virtually any web application in a matter of hours. In over a decade of building applications for the web I've never had an experience as easy as this.
Google is the best in the business when it comes to building out high-availability, scalable infrastructure. They've also done a tremendous job of abstracting the intricacies of developing for multi-node parallel systems away from developers with BigTable and MapReduce. For the first time, GAE exposes some of this to external developers. It's a beautiful thing.
You'll quickly learn that the Datastore is very different from the RDBMSs you may have worked with in the past. GAE stores data in what Google calls the Datastore, which is built on BigTable, Google's proprietary distributed database. BigTable doesn't run on a single computer or a single cluster; it can span tens of thousands of disks on thousands of servers. You create data models as Python classes. Instances of these classes, called entities, are analogous to database records. Once created, these entities are stored and moved around by BigTable to account for hotspots and increasing storage needs.
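To make the "models are classes, entities are instances" idea concrete, here is a minimal plain-Python sketch. It deliberately does not use the App Engine SDK: the `BlogPost` class, the `_STORE` dictionary, and the `put()` and `get()` helpers are hypothetical stand-ins I made up for illustration, and the real Datastore API will differ in its details.

```python
import itertools
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional

# Hypothetical in-memory stand-in for the Datastore: maps a key to an entity.
_STORE = {}
_next_id = itertools.count(1)


@dataclass
class BlogPost:
    """A data model. Each instance is one entity (roughly, one record)."""
    title: str
    body: str
    created: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    key: Optional[int] = None  # assigned the first time the entity is saved

    def put(self):
        """Save the entity and return its key."""
        if self.key is None:
            self.key = next(_next_id)
        _STORE[self.key] = self
        return self.key


def get(key):
    """Fetch a single entity by key, or None if it does not exist."""
    return _STORE.get(key)


post = BlogPost(title="Hello GAE", body="First post")
post_key = post.put()
print(get(post_key).title)
```

The point of the sketch is the shape of the code: you describe your data as a class, save instances, and fetch them back by key, with no schema or table definition written separately.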
Because of this distributed architecture, what works in a traditional RDBMS like MySQL or MS SQL Server doesn't necessarily translate well to the Datastore. You generally wouldn't think twice about performing a count, sum, or avg operation on records in a traditional RDBMS. Due to the distributed nature of the Datastore, it can be very costly to perform the entity fetching necessary for these. Pre-computation is paramount where possible. Joins are also expensive in the context of distributed systems and, as such, are not handled in a traditional fashion, although relationships between data entities are possible. A basic rule of thumb is that storage is cheap, so a lot of the rules of normalization that you're used to get thrown out the window. Don't feel bad about data duplication.
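As a rough illustration of pre-computation and duplication, the sketch below keeps a running comment count at write time instead of counting at read time, and copies the author's name onto each comment so that displaying it needs no join. Everything here is plain Python with in-memory lists and dictionaries standing in for entities; the names (`add_comment`, `post_stats`, and so on) are hypothetical and not part of any GAE API.

```python
# Hypothetical in-memory stand-ins for two kinds of entity.
comments = []     # one dict per comment entity
post_stats = {}   # post_id -> {"comment_count": int}


def add_comment(post_id, author_name, text):
    """Store a comment and update the precomputed count in the same step."""
    # Denormalized: the author's display name is copied onto the comment,
    # so rendering it needs no lookup against a separate author record.
    comments.append({"post_id": post_id, "author_name": author_name, "text": text})
    stats = post_stats.setdefault(post_id, {"comment_count": 0})
    stats["comment_count"] += 1


def comment_count(post_id):
    """Read the stored count: one lookup, no scan."""
    return post_stats.get(post_id, {"comment_count": 0})["comment_count"]


def comment_count_by_scan(post_id):
    """The RDBMS-style alternative: touch every comment in order to count."""
    return sum(1 for c in comments if c["post_id"] == post_id)


add_comment(1, "alice", "Nice post")
add_comment(1, "bob", "Agreed")
add_comment(2, "alice", "Hmm")
print(comment_count(1), comment_count_by_scan(1))
```

Both functions return the same number, but the first does a constant amount of work however many comments exist. The cost moves to the write path, and a real version would need to make the save and the increment happen together so the count cannot drift.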
The Bad (aka the limitations):
There are a number of limits which Google has placed on use of the GAE. These include:
* No long running processes
* A maximum of 1000 results returned from any query
* Read only access to the file system
* No scheduled activities
* No easy way to perform one off maintenance on your data
* No official support for backing up your data
* A number of quotas including bandwidth, storage, sent email, and CPU utilization
Given these limitations, there are a number of applications which are just not suitable for GAE. For instance, if you require regular maintenance to occur on your data, there is no easy way to achieve this. If you require any activity to occur on a scheduled basis, the only way to make this happen is to 'ping' your application from an external source. And if you want your application to perform some activity like crawling the web or performing server-side image manipulation, you'll quickly run up against the limit on long-running processes.
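One way to live within these limits is to split a maintenance job into small batches and let each external 'ping' process just one of them, remembering where it stopped. The sketch below simulates that in plain Python. The batch size, the `entities` and `progress` dictionaries, and the `query_after` and `handle_ping` functions are all assumptions made for illustration, not GAE APIs.

```python
BATCH_SIZE = 100  # assumed value, kept well under the 1000-result query limit

# Hypothetical stand-in data: entity key -> entity.
entities = {k: {"title": "post %d" % k, "slug": None} for k in range(1, 251)}

# Where the job has got to. In a real app this would itself be stored.
progress = {"last_key": 0, "done": False}


def query_after(last_key, limit):
    """Stand-in for a query: keys greater than last_key, in key order."""
    keys = sorted(k for k in entities if k > last_key)
    return keys[:limit]


def handle_ping():
    """One short request: process a single batch and record where we stopped."""
    if progress["done"]:
        return 0
    keys = query_after(progress["last_key"], BATCH_SIZE)
    for k in keys:
        entities[k]["slug"] = entities[k]["title"].replace(" ", "-")
    if keys:
        progress["last_key"] = keys[-1]
    if len(keys) < BATCH_SIZE:
        progress["done"] = True  # a short batch means nothing is left
    return len(keys)


# An external scheduler (cron on another machine, say) would request the
# handler's URL repeatedly; here a loop plays that role.
pings = 0
while not progress["done"]:
    handle_ping()
    pings += 1
print(pings)
```

Each call does a bounded amount of work, so no single request runs for long and no query asks for more than a batch of results. The price is that the job only advances as fast as something outside the application keeps pinging it.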
My Experience:
I sat down to build a basic blogging application. I had zero familiarity with the platform and only a passing familiarity with Python. It only took a few hours to complete the getting started tutorial and get the blogging app built and deployed. I mentioned earlier how easy I found the process to be, and it bears repeating: it was the easiest application construction that I've ever done. By far.
Conclusion:
The platform isn't perfect, and there are many limitations which might exclude it from being a candidate for your project. It's also amazingly simple, well documented, and easy to use. My biggest fear in deciding to use GAE for a serious project would be getting locked in to the Google platform. You can't just pick up and deploy elsewhere. The Datastore alone ties you to a continued relationship with Google. Then again, if you trust their "Don't be evil" slogan, they might not be such a bad partner to get into bed with. That being said, I wouldn't recommend using the platform for anything besides pet projects until its post-beta costs are known. Still, it was the most enjoyable development experience I've had in a long time. I look forward to future encounters with the GAE.