DevOps is a pretty hot topic today, and if you feel like maybe everyone but you is doing it and you're not even sure what "it" is, this is your blog.
And if DevOps is something you're familiar with, but that legacy desktop app keeps you stuck in a non-DevOps world, this is your blog. We have also captured this topic in a webinar on Microsoft Channel 9.
What IS DevOps anyway?
Like so many labels in our fast-changing tech universe, the term "DevOps" has numerous definitions. Some people see it as the convergence of software development and operations--i.e., first you write the app, then you host the app.
What if you smooshed those two things into one thing?
Perhaps a simpler way to think about it is as a continuation of the trend to remove silos in software engineering. Just as Agile brings product owners (i.e., the "user"), development, and test out of their traditional silos into a unified team, so does DevOps break down the further silo of operations.
A brief history of software
If you think back--WAY back--all applications used to be run by an Operations group in the data center. These were, of course, mainframe apps--first batch programs run overnight and later on-line apps, but still running back in a data center with access via terminals. If you've ever had the dubious privilege of watching someone use a terminal at an airline gate, you know exactly what I'm talking about. Later, smaller computers like the DEC PDP-11 pushed operations out to the departmental level, and still later PCs running Windows changed everything.
It's not exactly news that client-server applications kinda sorta took over the world of computing in the 90s. Those apps tended to be written for Windows and connected via LANs or VPNs back to servers that managed shared resources like databases and authentication. Engineering was siloed--the devs writing the application didn't really care about deployment, scalability, or monitoring--they just wanted to deliver a quality application that was on-time, on-budget, met the user requirements, and could be maintained. It was up to Ops to figure out how to manage server farms, load balancing, failover, network bandwidth, and all the other bits and pieces needed to let those users actually USE the app.
But Marc Benioff and Steve Jobs changed everything.
Marc Benioff invented SaaS, for all intents and purposes. It turned out renting software had lots of financial advantages for lots of people (opex vs capex, among others) and running inside a browser suddenly eliminated OS version and config issues that plagued on-premise packages.
Then Steverino came out with the iPhone, the iPad, and that sleek sexy laptop everybody wanted and before long browser-based apps were where it was at.
By 2010--give or take--on-premise software seemed so, well, 1990s. And now web apps seem to be the default for ISVs, except in those situations where network latency is a problem or the app needs a far richer UX than can be accomplished in a browser. 3D design, Photoshop, and Visual Studio all come to mind.
SaaS is a gas but it gave me gas
The advent of SaaS brought new revenue streams but new troubles as well. Back in the "good old days" you could copy a bunch of CDs and frisbee them out to the customers and go take some time off. SaaS meant you had to actually RUN the software for the customers, and that meant monitoring and scaling and rollbacks and a bunch of other stuff people had to scramble to figure out. It was no longer enough for development to make sure there were no bugs in a test lab environment. How can you tell in advance what will happen if you get a sudden surge of users all hitting your app at once? How can you predict network latency? How do you manage state across a server farm? How do you test something that doesn't actually exist until it goes live?
Here at Mobilize.Net's Intergalactic HQ we suffered through all this pain. Visual Basic Upgrade Companion is on-premise but WebMAP is SaaS and early on we experienced the same problems from having development in one silo and operations in a different one.
We found we needed Infrastructure as Code, for example, since the error-prone process of publishing the full app sometimes didn't work the way we expected. We found that we needed the ability to quickly roll back a release to a Last Known Good version if a post-release bug popped up that we couldn't live with. We realized we had to figure out how to use a live database in a predictable, robust way for both staging and production, without breaking customer data. We learned that Azure was powerful but complex, and that we needed not only to understand our options but also to manage costs and performance.
We learned a bunch of stuff. All of it new and interesting and terribly important. And that led us to DevOps.
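To make the Last Known Good idea a little more concrete, here is a minimal sketch in Python. It is not how our release pipeline actually works: the `ReleaseHistory` class, its method names, and the version strings are all hypothetical, and a real rollback would also have to deal with infrastructure and database state.

```python
class ReleaseHistory:
    """Hypothetical tracker for deployed versions and known-good versions."""

    def __init__(self) -> None:
        self._deployed: list[str] = []    # every deploy, oldest first
        self._known_good: list[str] = []  # versions confirmed healthy

    def deploy(self, version: str) -> None:
        self._deployed.append(version)

    def mark_good(self, version: str) -> None:
        # Only a version that was actually deployed can be marked good.
        if version not in self._deployed:
            raise ValueError(f"{version} was never deployed")
        self._known_good.append(version)

    @property
    def current(self):
        # The most recent deploy is what users are running right now.
        return self._deployed[-1] if self._deployed else None

    def roll_back(self) -> str:
        # Redeploy the most recent version that was marked good.
        if not self._known_good:
            raise RuntimeError("no known-good release to roll back to")
        target = self._known_good[-1]
        self._deployed.append(target)  # a rollback is just another deploy
        return target


history = ReleaseHistory()
history.deploy("1.0")
history.mark_good("1.0")
history.deploy("1.1")              # ships with a bug we can't live with
rolled_back_to = history.roll_back()
```

The point of the sketch is that a rollback is only quick if you recorded which release was good before the bad one went out.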
What can DevOps do for me?
There's no question that DevOps will require change: new approaches, new organization, new processes to implement and sort out.
Is it worth it?
“In this way, a DevOps product backlog is really a set of hypotheses that become experiments in the running software and allow a cycle of continuous feedback.” -- Sam Guckenheimer (Microsoft), "Our Journey to Cloud Cadence"
This is an interesting take on what DevOps can offer. Initially when I thought about our own experience moving into DevOps it was really about efficiency and communications. But Sam has it right: DevOps is a way to continually run real-time experiments with your software and immediately respond to the results of those experiments.
Suppose you have a key feature in your app that monitoring shows is being underutilized. Is it because users haven't noticed it or is it because it isn't wanted? Following on Sam's point above, you can selectively offer some random subset of users a variant that highlights the feature in a better way, injects a wizard into the UX to push them to the feature, or even builds in a survey to get better data. Then, based on the results of that--can we even call it a beta?--you can adapt the UX for the entire user universe.
Hypothesis--experiment--result--change--improvement. Rinse and repeat.
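One common way to pick that "random subset of users" is to hash each user ID into a bucket, so the same user sees the same variant on every visit. The sketch below assumes that approach. The function names, the experiment name "highlight-feature", and the 10% sample size are made up for illustration.

```python
import hashlib


def in_experiment(user_id: str, experiment: str, percent: int) -> bool:
    """Return True if this user falls inside the experiment's sample."""
    # Hash the experiment name together with the user id so each
    # experiment draws a different, but stable, subset of users.
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    bucket = int(digest, 16) % 100  # bucket is in the range 0..99
    return bucket < percent


def pick_variant(user_id: str) -> str:
    # Hypothetical experiment: roughly 10% of users get the wizard
    # that nudges them toward the underused feature.
    if in_experiment(user_id, "highlight-feature", 10):
        return "wizard"
    return "control"
```

Because the assignment depends only on the user ID and the experiment name, you don't need to store who is in which group, and you can compare usage of the feature between the "wizard" and "control" groups once the monitoring data comes in.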
More about operations
A key aspect of DevOps is implementing CI/CD: continuous integration and continuous deployment. Coupled with Agile or Scrum, this means that new features are only a sprint or two away--likewise no bug fix is ever more than a sprint away. CI/CD means the merge/build/test/publish steps are automated. Tools like Docker let devs be far more confident that what works in their dev environment will work in production, because the same container image runs in both. Bigger features can be added sprint-by-sprint using feature flags that let internal testers use functionality before general availability--thus letting big chunks span multiple sprints before you want the world at large to experience them.
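Feature flags can be as simple as a check around the new code path. Here is a minimal sketch of the idea, assuming users carry a set of group names. The `FeatureFlag` class, the "new-reports" flag, and the "internal-testers" group are all hypothetical, and real flag services add things like remote configuration and percentage rollouts.

```python
from dataclasses import dataclass, field


@dataclass
class FeatureFlag:
    """Hypothetical flag: on for everyone, or only for certain groups."""
    name: str
    enabled_for_everyone: bool = False
    allowed_groups: set[str] = field(default_factory=set)

    def is_on(self, user_groups: set[str]) -> bool:
        if self.enabled_for_everyone:
            return True
        # On if the user belongs to at least one allowed group.
        return bool(self.allowed_groups & user_groups)


# The unfinished feature ships in every build but stays hidden
# from everyone except internal testers.
new_reports = FeatureFlag("new-reports", allowed_groups={"internal-testers"})


def render_dashboard(user_groups: set[str]) -> str:
    if new_reports.is_on(user_groups):
        return "dashboard with new reports"
    return "classic dashboard"
```

When the feature is ready for general availability, flipping `enabled_for_everyone` to `True` releases it without a new deployment.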
Another key aspect of DevOps is monitoring. There are about a zillion application performance monitoring tools out there--some used while coding and some for production. Three big ones are New Relic, AppDynamics, and Azure's Application Insights. There is a dizzying array of features and capabilities, but the bottom line is that you can see what your application is doing in real time and you can even model "what if" scenarios with some. For example, with Application Insights, you can find out how much your Azure bill would go up in order to achieve a specific responsiveness goal with XX simultaneous sessions. With some APM systems you can track user paths to see where they are stumbling or having trouble. Gather insights to define the product backlog in a truly agile fashion.
Agile isn't very agile with on-prem
Interesting data point from the Visual Studio gang at Microsoft: they are still limited to quarterly releases of VS. That means you can have a situation where a bug is released in Q1, identified somewhere in Q2, and not fixed in a release until Q3. So the worst-case scenario is a bug sitting in the product for 9 months.
I have Visual Studio--I don't always update it as soon as a release is dropped. Frankly the VS folks don't do a tremendous job of telling me when they have an update, so I might not notice. I'm just muddling along, thinking everything is fine. So just because they drop an update doesn't mean they get user feedback immediately at the start of the next sprint. Which means they may have locked down the sprint's backlog by the time they identify a needed change--which can push it out to the NEXT sprint.
Not. Very. Agile.
With Visual Studio Online (badly named--it's really TFS not VS) they are down to three-week sprints. Three weeks vs three months. More precise planning (it's easier to be accurate predicting the results of three weeks' work vs three months') and more agility.
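The back-of-the-envelope arithmetic looks like this. It assumes the same worst-case pattern as the quarterly example, where a bug spans three release cycles (one to ship it, one to identify it, one to ship the fix), and it treats a quarter as 13 weeks. The function name is made up for illustration.

```python
def worst_case_weeks(cycle_weeks: int, cycles: int = 3) -> int:
    """Weeks a bug can live in the product if it spans `cycles` releases."""
    return cycle_weeks * cycles


QUARTER_WEEKS = 13  # assumption: one quarter is about 13 weeks

quarterly = worst_case_weeks(QUARTER_WEEKS)  # 39 weeks, about 9 months
sprints = worst_case_weeks(3)                # 9 weeks

print(f"Quarterly releases: up to {quarterly} weeks")
print(f"Three-week sprints: up to {sprints} weeks")
```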