These are my unedited notes from Stu Charlton's talk "From Agile Development to Agile Operations" at QCon SF 2009
- Cloud computing changing the game between development and operations
- Suggested design goals for cloud computing
- Integrated approach to application design, development and operations
- Tennis match going on between the dev and ops side
- Performance, scale and availability depend on both design and operational decisions
- You usually can't just tell the platform to scale your app
- The offerings of commercial companies are mostly the result of buying companies that cover either operations or development
- How can agile practices be applied to operations?
- (Nice quote "Mimicking the illusion of working software by building a lot of documents")
- Development values what is built; operations values what does not happen
- Automated build, test, integration - what's the test environment in operations?
- Not really test, rather planning and rehearsal
- Autonomous teams – in operations, there are always a lot of legacy dependencies, need for situational awareness
- Continuous integration - in operations, what's the source code?
- Examples: Why can't two servers communicate? security, server configuration, network configuration, firewall …
- Example: What do I need to scale out? Easy, simply start up more machines … no, not really: impacts on other systems, e.g. security systems, load balancers, monitoring, CMDB, service desk. Architectural issues: stateful or stateless nodes, repartitioning; limiting the scale out
- Example: What is the authoritative reality? What's the difference between the current state and the one I want?
- In operations, transitional states matter a lot more than in development
- What we have now: on demand provisioning of commodity infrastructure and constrained applications
- What we still need to consider: configuration as data and as code; collaboration on design, development and operations
- What funds a project is usually very different from what funds operations
- IT complexity is overwhelming - not sure whether this is accidental or inherent complexity
- Little tooling for collaboration in operations
- Integrated view of operations and design: Different planes – management plane, cloud control plane, application plane
- All of the vendors are working on building a platform for controlling cloud resources
- Key question: what's the source code?
- Bottom-up approach (based on scripts, recipes, runbooks)
- Chef: DSL for describing infrastructure
- Puppet used by Google to standardize all OS X desktops
- Trying to use Maven in operations
- Top-down (modeled viewpoints, enterprise architecture, configuration models)
- UML profiles, MS uses Oslo to describe different viewpoint models
- Configuration models: W3C SML - now that it's been standardized, nobody's using it
- Model-driven Collaborative Application Design
- "All modeling is programming, all programming is debugging" (Neil Gunther)
- Chef is very popular because it's easy; Puppet is declarative, which makes it hard to debug
- Analogy: SQL query plan; tools could derive a plan from a declaratively specified model
- Accounting barriers to Agile operations
- Capex vs. Opex is only partially addressed in reality, as HW is only part of the cost
- Promising approach: Time-driven activity-based costing; activity-based costing was an approach used to make consultants rich in the 80s, but in combination with the time-driven aspect it seems useful
- How to arrive at an integrated approach:
- distributed, autonomous descriptions of the complete configuration
- document-based description as the basis for collaboration
- The way to enable collaboration of autonomous owners is to link configuration pieces via hyperlinks [he is a REST guy, after all]
- Model-driven approach because something is needed that's both data and code
- Problem with data: hard to debug
- Problem with code: hard to see what's in it
- Mentions Lisp as data is code/code is data example – it's been done before
- Elastra approach: "Elastic Modeling Languages" (Open Source licensing): EMML, ECML, EDML - doesn't expect these to become standards, but part of the debate
- Q. Applicable to private clouds? A. Very much so.
- Q. There's a trend of expanding Continuous Integration to Continuous Deployment. Does this apply? A. Modeling does not conflict with an agile approach, small changes could be in production, no need to do things in a monolithic way. Both exist and need to co-exist.
- Q. (rather a comment) one can start with a DSL, validate it, check dependencies etc. – bottom up is not a conflict A. A textual DSL is just a model.
- Q. Would "structure" be a better term than model? A. That would only be part of it. "Model" has many connotations people don't like, which is why people start using "DSL" instead
- Q. Connection of OSS/Telco experience? A. One example is Erlang and Mnesia showing up as a technology in the Cloud space.
- Q. Are there new technologies in the security space? A. Federated ID technologies are getting some traction, e.g. Azure using WS-Federation; SAML and OAuth are both growing. Directories are still the primary way.
- Q. Is there a directory in the cloud? A. Concept of "virtual identity" instead, e.g. OpenID. SAML can be used with some Google apps, some Salesforce.com apps
- Q. As an alternative to complex tooling, can co-locating/integrating developers and operations people help? A. Two approaches: Let's not do ops, let's just have developers do operations. Not good, usually a different value system. Second: Co-locate them and create autonomous teams. Good approach, larger Web shops do this - still a shared service team. Classic scaling problem: lots of interdependencies between teams. Tooling can help. Sometimes you even have to separate teams due to regulatory reasons.
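
To make the notes on current versus desired state and on the SQL query plan analogy more concrete, here is a minimal sketch of deriving a plan from a declaratively specified model. It is not taken from the talk and does not reflect how Elastra, Chef or Puppet work. The function `derive_plan`, the resource names and the configuration keys are all hypothetical, and the ordering rule (create, then update, then delete) is a simplifying assumption where a real tool would follow dependencies.

```python
# Hypothetical sketch: configuration as data, with a plan derived from the
# difference between the current state and the desired state.

def derive_plan(current, desired):
    """Return a list of (action, resource_name) steps that turn current into desired.

    Both arguments map a resource name to its configuration dict.
    Ordering is an assumption made for this sketch: create new resources
    first, then update changed ones, then delete what is no longer wanted.
    """
    creates = [("create", name) for name in sorted(desired) if name not in current]
    updates = [
        ("update", name)
        for name in sorted(desired)
        if name in current and current[name] != desired[name]
    ]
    deletes = [("delete", name) for name in sorted(current) if name not in desired]
    return creates + updates + deletes


# Current state: one web node behind a load balancer, plus a leftover batch node.
current = {
    "web-1": {"role": "web", "size": "small"},
    "lb": {"role": "load-balancer", "backends": ["web-1"]},
    "old-batch": {"role": "batch", "size": "small"},
}

# Desired state: scale out to two web nodes. The load balancer has to change
# as well, which is the "no, not really" part of the scale-out example.
desired = {
    "web-1": {"role": "web", "size": "small"},
    "web-2": {"role": "web", "size": "small"},
    "lb": {"role": "load-balancer", "backends": ["web-1", "web-2"]},
}

plan = derive_plan(current, desired)
for action, name in plan:
    print(action, name)
```

The plan is plain data, so it can be reviewed or rehearsed before anything is executed, which fits the point that operations is about planning and rehearsal rather than testing.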