The Holy Grail of Application Development
Seagate ships 50 million computer hard drives per quarter, and that makes for a highly complicated set of processes. The ultimate in software development, said Seagate's Steve Katz, would be an integrated approach that incorporates development activities as well as testing, monitoring, provisioning and quality checks and balances.
07/11/11 5:00 AM PT
Welcome to a special BriefingsDirect podcast series. Recently we've explored some major enterprise IT solutions, trends and innovations making news across HP's ecosystem of customers, partners, and developers.
This enterprise case study discussion focuses on Seagate Technology, one of the world's largest manufacturers of rotating storage media hard-drive disks, where the application development teams are spanning the dev-ops divide and exploiting agile development methodologies.
Please now join Steve Katz, manager of software performance and quality at Seagate, an adopter of modern application development techniques like agile, for a discussion moderated by Dana Gardner, Principal Analyst at Interarbor Solutions.
Listen to the podcast (14:05 minutes).
Here are some excerpts:
Steve Katz: Seagate is one of the largest manufacturers of rotating media hard disks and we also are into the solid state [storage media] and hybrids. Last quarter, we shipped about 50 million drives. That continues to grow every quarter.
As you can imagine, with that many products -- and we have a large product line and a large supply chain -- the complexities of making that happen, both from a supply chain perspective and also from a business perspective, are very complicated and get more complicated every day.
The Holy Grail for us would definitely be an integrated approach to doing software development that incorporates the development activities, but also all of the test, monitoring, provisioning and all of the quality checks and balances that we want to have to make sure that our applications meet the needs of our customers.
In the last couple of years, with the explosion with cloud, with the jump to virtual machines (VMs), virtualization of your data center, and also global operations, global development teams, new protocols, and new applications, most of what we do, rather than developing from scratch, is integrate other people's third-party applications to meet our needs. That brings to the table a whole new litany of challenges, because one vendor's Web 2.0 protocol standard is completely different than another vendor's Web 2.0 protocol standard. Those are all challenges.
Also, we're adopting, and have been adopting, more of the agile development techniques, because we can deliver quanta of capability and performance at different intervals. So we can start small, get bigger, and keep adding more functionality. Basically, it lets us deliver more, more quickly, but also gives us the room to grow and be able to adapt to the changing customer needs, because in the market, things change every day.
So for us, our goal has been the ability to get all those things together early in the program and have a way to collaborate and ultimately have the collaboration platform to be able to get all the different stakeholders' views and needs at the very beginning of the program, when it's the cheapest and most effective to do it. We're not there. I don't know if anybody will ever be there, but we've made a lot of efforts and feel like we've made a lot of ground.
The dev-ops perspective has really interested us, and we have been doing some of the early adoption, the early engagement with our customers, in our business projects very early in the game for performance testing.
We get into the project early and we start understanding what the requirements are for performance and don't just cross our fingers and hope for the best down the road, but really put some hard metrics around what the expectations are for performance. What's the transfer function? What's the correlation between performance and the infrastructure that needs to deliver that performance? Finally, what are the customer needs and how do you measure them?
That's been a huge boon for us, because it's helped us script that early in the project and actually look at the unit-level pieces, especially in each different iteration of the agile process. We can break down the performance and do testing to make sure that we've optimized that piece of it to be as good as possible.
Now when you add in the needs for VM provisioning, storage, networking, and databasing, the problem starts to mushroom and get more complex. So, for a long time, we've been big users of HP Quality Center (QC), which is what we use to gather requirements, build test plans, and link those requirements to the test plans ultimately to successful tests and defects. We have traceability from what the need of the customer is to our ability to validate that we deliver that need. And it worked well.
Then, we had the performance testing, which was an add-on to that. And now we have the new ALM 11, which, by the way, marries the QC functionality and Performance Center functionality. They're not two different things any more. It's the same thing, and that's the beauty for us.
That's what we've been preaching and trying to work with our project teams on, to say that it's just a requirement. Any requirement is just a requirement and how we decide to implement, fulfill, and test that is our choice. But, having the QC and performance testing closer together has made a lot of sense for us and allowed us to go faster and cheaper, and end up with something that, in fact, is better.
The number of applications we have in production is in the 300-500 range, but as far as mission critical, probably 30. As far as some things that are on everybody's radar, probably 50 or 60. In Business Service Management (BSM), we monitor about 50 or 60 applications, and we also have the lower-level monitors in place that are looking at infrastructure. Then our data all goes up to the single pane, so we can get visibility into what the problems are.
The number of things we monitor is less important to us than the actual impact that these particular applications have, not only on the customer's experience, but also on our ability to support it. We need to make sure that whatever it is that we do is, first of all, faster. I can't afford to get a report every morning to see what broke in the last 24 hours. I need to know where the fires are today and what's happening now, and then we need to have direct traceability out to the operator.
As soon as something goes wrong, the operator gets the information right away and either we're doing auto-ticketing, or that operator is doing the triage to understand where the root cause is. A lot of that information comes from our dashboards, BSM, and Operations Manager. Then, they know what to do with that issue and who to send it to.
We've subscribed to a number of internal cloud services that are Software as a Service (SaaS) processes and services. For those kinds of things, we need to first make sure it's not us before we go looking to find out what our software service providers are going to do about the problems. And both of our applications, all the BSM and all the dev-ops, have helped us get to that point a little better.
The final piece of the puzzle that we're trying to implement is the newer BSM and how we get that built into the process as well, because that's just another piece of the puzzle.
Dana Gardner: What sort of paybacks are you expecting?
Katz: It's two things for us. One is the better job you do up front, the better job you're going to do in the back end. Things are a lot cheaper and faster, and you can be a whole lot more agile to react to a problem. So the better job we do up front, understand what the requirements are and not just what this application is or what it's supposed to do, but how is it supposed to affect the rest of our infrastructure, how is it supposed to perform under stress, and what are the critical quality, the quality of service, the quality of experience aspects that we need to look at.
Defining that up front helps us to be better and helps us to develop and launch better products. In doing that, we find issues earlier in the process, when it's a lot cheaper to fix them and a lot more effective.
On the back end, we need to be more agile. We need to get information faster and we need to be able to react to that information. So, when there's a problem, we know about it as soon as possible, and we're able to reduce our root-cause analysis and time to resolution.
Gardner: Is integrated ALM helping you move to the cloud and also adopt other IT advancements?
Katz: I look at that like a baseball team. My kids are in Little League right now. We're in the playoffs. When a team does well, you get this momentum. Success really feeds momentum, and we've had a lot of success with the dev-ops, with pulling in ALM performance management and BSM into our application development lifecycle. Just because of the momentum we've got from that, we've got a lot more openness to explore new items, to pull more information into the system, and to get more information into the single pane.
Before we had the success, the philosophy was, "I don't have time to fix this. I don't have time to add new great things." Or, "I've got to go fix what I got." But when you get a little bit of that momentum and you get the successes, there is a lot more openness to it and willingness to see what happens. We've had HP helping us with that. They're helping us to describe what the next phase of the world looks like.