These are my unedited notes from Simon Guest's talk about Patterns for Cloud Computing at QCon SF 2009.
- "This talk is about Jim, he has many questions about cloud computing…"
- 5 patterns of cloud-based applications
- Definition of cloud computing
- Different models:
- Applications must run on-premises – complete control, upfront capital costs
- Application runs at a hoster – lower capital costs, but pay for fixed capacity even if idle
- Shared, multi-tenant, pay as you go – pay someone for a pool of computing resources that can be applied to a set of applications
- Public Cloud vs. Private Cloud – private cloud useful e.g. for telcos offering this to their customers
- Windows Azure – compute, storage, management, based on 64bit Windows images
- SQL Azure - RDBMS
- .NET Services - service bus and access control
- [ed.: Who thinks of these names, and even more importantly, why doesn't Microsoft fire them?]
- Different models: Infrastructure as a Service (IaaS) vs. Platform as a Service (PaaS) as main paths
- Slide shows that MS offers a higher-level stack than Amazon - EC2 provides instances, Windows Azure model is a platform as a service model
- [Seems to me this is one of the major problems of Azure – it seems neither one nor the other, as I would define PaaS as what GAE does, which is much higher-level than simply a Windows Server]
Pattern #1: Using the Cloud for Scale
- Shows how to scale up a Web app using more machines, load balancer, database partitioning
- A lot of work - a lot of money
- Designed for peak capacity, idle for a lot of time
- Much easier to let cloud vendor handle this dynamically
- Prerequisite for successful scaling in the cloud: having a queue to decouple web tier and backend
- Starbucks [of all possible examples! ;-)] as an example for queueing
- Demo: "PrimeSolvr" (Web 2.0 because it's missing the last "e")
- 3 takeaways: 1) core tenet of cloud computing: ability to scale up/down 2) understand how to communicate between roles and nodes 3) strategy for when to scale up and down
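
[ed.: To make the queue point concrete, here is a minimal sketch of a web tier that only enqueues jobs while a variable number of backend workers drain the queue. It is plain in-process Python, not Azure code; `run_backend`, `handler` and `num_workers` are names I made up for illustration. In the cloud the queue would be a hosted service and the workers separate instances.]

```python
import queue
import threading


def run_backend(jobs, num_workers, handler):
    """Process all jobs with num_workers worker threads and return {job: result}."""
    work = queue.Queue()
    results = queue.Queue()

    def worker():
        while True:
            job = work.get()
            if job is None:  # sentinel: no more work for this worker
                break
            results.put((job, handler(job)))

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()

    # The "web tier" only enqueues; it never talks to a worker directly.
    for job in jobs:
        work.put(job)
    for _ in threads:
        work.put(None)
    for t in threads:
        t.join()

    out = {}
    while not results.empty():
        job, value = results.get()
        out[job] = value
    return out
```

Scaling up or down then only means changing `num_workers`; the producer side stays untouched, which is the decoupling the talk was after.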
Pattern #2: Using the cloud for multi-tenancy
- Simple approach internally: one application per customer - works only for small numbers
- Implications: Schema customizations, UI customizations
- 3 options for data in a multi-tenant environment: 1) share DB between customers 2) each customer gets a separate DB - hard to do on-premise, much easier in the cloud 3) fixed DB schema with customizations on a tenant-by-tenant basis
- Demo: ASP.NET MVC app using the HTTP host name to switch UI and DB Schema
- Takeaways: 1) Consider multi-tenancy first, hard to retrofit 2) Design considerations must include both data and UI specifics 3) Identity as a very important consideration, see MS Patterns and Practices paper on multi-tenancy ID
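
[ed.: The demo itself was ASP.NET MVC; the following is only a rough Python sketch of the same idea, resolving the tenant from the HTTP host name and returning its UI and schema settings. The host names, the `Tenant` fields and `resolve_tenant` are hypothetical and not taken from the demo.]

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tenant:
    name: str
    theme: str      # UI customization
    db_schema: str  # data customization


# Hypothetical tenant registry, keyed by host name.
TENANTS = {
    "contoso.example.com": Tenant("contoso", "blue", "tenant_contoso"),
    "fabrikam.example.com": Tenant("fabrikam", "green", "tenant_fabrikam"),
}


def resolve_tenant(host_header: str) -> Tenant:
    """Map the value of an HTTP Host header to a tenant, ignoring port and case."""
    host = host_header.split(":", 1)[0].strip().lower()
    try:
        return TENANTS[host]
    except KeyError:
        raise LookupError(f"unknown tenant host: {host!r}") from None
```

Every request would call `resolve_tenant` first and then pick the view theme and the DB schema from the result, so data and UI specifics stay in one place.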
Pattern #3: Using the cloud for compute
- Popularized by MapReduce
- Apache Hadoop, Cloudera, Amazon Elastic MapReduce (a Hadoop implementation)
- Typical on-premise solution: very infrastructure-heavy, complex, expensive
- No explicit framework implementation on Azure
- Demo (inspired by MapReduce): Development Fabric (local execution environment), not using virtualization [similar to GAE environment]; next step is upload to Azure staging area, next level production
- Takeaways: MapReduce very visible, although can be hard to initially grasp, learn about existing implementations; MS academic effort: Dryad
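
[ed.: Since MapReduce can be hard to grasp at first, here is the usual word-count toy example as a single-process Python sketch. It shows the map, shuffle and reduce steps only; it is neither Hadoop nor the demo code, and the function names are my own.]

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: collapse each key's list of values into a single count."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(documents):
    mapped = chain.from_iterable(map_phase(d) for d in documents)
    return reduce_phase(shuffle(mapped))
```

In a real framework each `map_phase` call and each key's reduction could run on a different machine, which is where the cloud's compute capacity comes in.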
Pattern #4: Using the cloud for (infinite) storage
- Problem: Affinity between hardware and data
- how does the cloud help? breaks the affinity
- virtualized layer between the data you store and the hardware underneath
- Three ways: blobs, tables, relational
- MS: Azure Blob Storage – REST API (using GET (even range requests) and PUT); PutBlock API to move blocks - transaction build up [must look this up]
- Azure Table Storage (Key/Value pairs)
- Initial relational effort: SQL Server Data Services (MIX 08) - REST API on top of SQL
- Customer reaction: We want to do TDS (MS native DB protocol)
- SQL Data Services (MIX 09), later SQL Azure: TDS (SQL Server) in the Cloud
- Similarity between internal and cloud architecture makes it easier for customers [agreed, even though this might be more of a problem]
- Demo: SQL Azure (http://sql.azure.com); Codeplex sqlazuremw (migration wizard) - migration from local SQL Server DB to the cloud (subset of SQL Server functionality, e.g. restrictions on certain value types, clustered indexes)
- Takeaways: 1) Storage in the cloud may look the same, but breaks the affinity problem 2) Pricing is relevant 3) SQL Azure as a factor for moving to the cloud in the first place
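
[ed.: My understanding of the block idea, which I still need to verify: blocks are uploaded individually and only become visible once a list of block IDs is committed. The class below is an in-memory stand-in that illustrates this assumption plus an inclusive byte-range read; it is not the Azure REST API, and `BlockBlobSketch`, `put_block`, `commit` and `read` are hypothetical names.]

```python
class BlockBlobSketch:
    """In-memory stand-in for a block-style blob; NOT the Azure API."""

    def __init__(self):
        self._staged = {}       # block_id -> bytes, not yet visible to readers
        self._committed = b""   # what readers currently see

    def put_block(self, block_id, data):
        """Stage one block; readers do not see it until commit()."""
        self._staged[block_id] = bytes(data)

    def commit(self, block_ids):
        """Make the staged blocks visible in the given order, all at once."""
        missing = [b for b in block_ids if b not in self._staged]
        if missing:
            raise KeyError(f"unknown blocks: {missing}")
        self._committed = b"".join(self._staged[b] for b in block_ids)
        self._staged.clear()

    def read(self, start=0, end=None):
        """Return committed bytes; start/end mimic an inclusive byte range."""
        if end is None:
            return self._committed[start:]
        return self._committed[start:end + 1]
```

The point of the sketch is that blocks can arrive in any order, or in parallel, and the commit step decides the final layout.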
Pattern #5: Using the cloud for communications
- Classic approach: VAN, now replaced by Internet direct file transfers
- Cloud approach: REST-based queues could be used for communication - not commonly used, problem: need to pass tokens around
- Putting a web facade in front of the queue doesn't work too well either due to firewall problems. HTTP polling is bad [why?]
- MS Solution: .NET Service Bus
- TCP Relay: outbound bi-directional socket, tunneled through the bus and kept alive on both sides. Enables routing of arbitrary protocols across company boundaries
- Alternative: Message Buffer, exposed using AtomPub, supports retrieve, peek, lock
- Takeaways: Be careful consuming REST-based queues because of shared secret
- additional trouble because of REST
- service bus as potential solution
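
[ed.: To remind myself what retrieve, peek and lock mean for a message buffer, here is a small in-memory sketch of those semantics as I understand them. It says nothing about the actual .NET Service Bus or AtomPub interface; the class and all method names are made up for illustration.]

```python
import itertools
from collections import deque


class MessageBufferSketch:
    """Toy message buffer with retrieve, peek and peek-lock semantics."""

    def __init__(self):
        self._messages = deque()
        self._locked = {}                  # lock token -> message
        self._tokens = itertools.count(1)

    def send(self, message):
        self._messages.append(message)

    def peek(self):
        """Look at the next message without removing it."""
        return self._messages[0] if self._messages else None

    def retrieve(self):
        """Remove and return the next message (destructive read)."""
        return self._messages.popleft() if self._messages else None

    def peek_lock(self):
        """Hide the next message from other readers until completed or released."""
        if not self._messages:
            return None
        token = next(self._tokens)
        self._locked[token] = self._messages.popleft()
        return token, self._locked[token]

    def complete(self, token):
        """The consumer finished processing: drop the message for good."""
        del self._locked[token]

    def release(self, token):
        """The consumer gave up: put the message back at the front."""
        self._messages.appendleft(self._locked.pop(token))
```

The lock variant is what makes it safe for a consumer to crash mid-processing, since an unreleased message can be handed out again rather than lost.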
Last question: How can patterns be integrated?
- 1) Sample PHP (!) application running on Windows Azure, ported to GAE and EC2 (as ASP.NET)
- 2) MapReduce spreads load across Amazon, Google, MS
- 3) Store results in SQL Azure database
- 4) Coordinate communication using .NET Service Bus
How many prime numbers between 1 and 10,000,000? 40 jobs of 250,000 numbers
- WPF client app sends off job
- "I'm gonna submit the job and pray"
Spontaneous applause as the demo actually worked
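
[ed.: The job split from the demo can be sketched in a few lines of Python: cut 1..10,000,000 into ranges of 250,000 numbers, count primes per range, and add up the partial counts. This runs the jobs sequentially in one process; the demo distributed them across clouds, and the function names here are my own, not the demo's.]

```python
def job_ranges(upper, job_size):
    """Split 1..upper into inclusive (start, end) ranges of at most job_size numbers."""
    return [(start, min(start + job_size - 1, upper))
            for start in range(1, upper + 1, job_size)]


def is_prime(n):
    """Trial division; slow but fine for a sketch."""
    if n < 2:
        return False
    if n < 4:
        return True
    if n % 2 == 0:
        return False
    i = 3
    while i * i <= n:
        if n % i == 0:
            return False
        i += 2
    return True


def count_primes(start, end):
    """One job: count the primes in the inclusive range start..end."""
    return sum(1 for n in range(start, end + 1) if is_prime(n))


def count_primes_in_jobs(upper, job_size):
    """Sum the per-job counts; each job is independent of the others."""
    return sum(count_primes(s, e) for s, e in job_ranges(upper, job_size))
```

`job_ranges(10_000_000, 250_000)` yields the 40 jobs mentioned above; because the jobs share no state, each one could be handed to a different worker or a different cloud.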
- Make sure you have a clear definition of cloud computing
- explore the 5 usage patterns
- think about the next steps for implementation and migration