Improving the Initial Mainframe Automation and Creating a “Two Button Push” Process

December 18, 2020 | TATSIANA IHNASHCHENKA

Transformation is a challenge.

You must balance multiple competing forces.

On one side, you have established a powerful vision of the end state you wish to achieve. You must maintain this vision, and do what you can to bring it to life at every step of your transformation.

On the other side, you must realize your vision will take a lot of work, and a lot of time, to achieve. You must create small wins along the way, and maintain momentum to keep the transformation driving forward.

We juggled these two forces with a recent client.

For the past 18+ months, we have moved this client through a massive DevOps transformation of their mainframe application development capability.

Over this time, we have worked hard to keep the right balance between maintaining our vision for what their process will one day look like, and the practical realities of what it will take to get there.

In this article, we will explore one of the first times we really had to work through this tension.

We will discuss how we improved our initial automation, and how we moved it closer to our final vision for what this client’s automations should look like— all while maintaining the momentum we established when we first created it.

The Story So Far…

This is Part 4 of a longer series that will tell the full story of our client’s transformation.

At the start of this series, we introduced the client and their project. This client is a multi-billion-dollar technology company that employs hundreds of thousands of people, and operates a central mainframe. When they found us, they had been following 20+ year old legacy processes to develop applications for their mainframe. They partnered with us to transform their mainframe application development processes, and to bring a modern DevOps approach to this critical area of their business.

You can read the series from the beginning by starting here.

In our most recent article—Part 3 of this series—we began to dig into the actual, hands-on, nuts-and-bolts work of how we transformed this client’s mainframe application development process. We outlined the first automation we completed for them—a “proof of concept” project that automated the way applications were installed on the TEST environment. We explained how we began with a small, imperfect automation to show this client that any automation was possible.

You can read Part 3 here.

In this article, we will walk you through the next steps we took— how we fleshed out and improved this initial automation, and how that reflected our broader strategy for driving this client’s mainframe DevOps transformation.

Improving Our Initial Automation: Core Considerations

Our initial automation had a number of technical issues we needed to resolve. The infrastructure required constant assistance, or it would fail. We couldn’t process multiple deployments at once. Many deployments were not standardized, and still required individualization.

These were real issues, and they would need to be fixed. But we weren’t too worried about them. We knew what they were. We knew how to resolve them. And we saw them as relatively small technical kinks to work out.

Instead of worrying too much about these, we focused on fixing the bigger, more strategic issues lurking within our initial automation.

Here’s what we were really worried about:

  • We needed to expand the initial automation to more groups. The initial automation focused entirely on the needs of the Operations Analysts. Now, we needed to involve other groups, including Developers, Testers, and Business Analysts. Even though they were involved in the deployment process, they were not yet interacting with the pipeline.
  • We needed to make the automation scripts more accessible. In our initial automation, the automation scripts had been placed directly onto the target machine— not in the CI/CD tool. As such, we could only use the scripts on the machine they were placed on, and those scripts could not be used on other machines.
  • We needed to prove our Agile approach actually worked. Our initial automation showed our client a new way of working. We needed to build off that momentum. We needed to demonstrate that we weren’t going to leave a half-finished automation, that we were going to continuously build more and more effective and comprehensive automations, and that Agile could actually work as we said it would.

But more than anything else, we needed to demonstrate to our client what our vision for a fully automated mainframe DevOps process really looked like.

And we could not demonstrate this vision with an incomplete automation.

Our Mainframe DevOps Vision: “One Button Push” Automation

The mainframe DevOps transformation we were leading this client through had one goal— to build a fully automated deployment pipeline.

We sought to build a pipeline that would promote code from the DEVELOPMENT environment to the PRODUCTION environment without any manual steps—or the need for any manual intervention—during the entire process.

When it was completed, this process would only have one control point— a single “button” that would be pushed by the Developer once their code was ready to enter the PRODUCTION environment. This “button” could be in a GUI, or it could be a template script to submit, or it could be a set of actions like “submit script, open the web app, and push the button.”

The exact “button” would not matter. Once the Developer pushed it, and initiated the automation, the pipeline would automatically test, check, and deploy the Developer’s code into PRODUCTION— without any additional actions, or manual interventions, required from anyone else.

This is how we wanted to build all of our automations, throughout this entire project.

But our initial automation did not yet live up to this vision.

It still had many steps that required manual intervention:

  • Code changes were still being sent from one machine to another manually.
  • After these code changes were sent, Developers still needed to ask the Operations Analyst to start the deployment pipeline.
  • The Operations Analysts then had to notify Testers to perform their testing (which the Operations Analysts might or might not participate in).
  • Once testing was complete, the Testers would confirm completion with the Business Analysts.
  • Finally, once testing was completed, Operations Analysts had to manually initiate the pipeline to install the code onto production.

To complete this automation—and demonstrate our vision for the project— we needed to automate these remaining manual steps.

But as we began to survey this initial automation—and the early state of the project as a whole—we realized the client was not quite ready for this complete “one button push” automation yet.

Completing this fully automated pipeline would be too complex. It would require too much work. And it would require too many individual parts to be automated.

We couldn’t do it quickly. If we decided to be purists about creating a “one button push” automation right out of the gate, then we would have to dramatically slow down the project’s progress, and risk killing the momentum we were trying to build upon.

That felt unacceptable to us.

So, for this initial automation, we temporarily set aside the concept of a “one button push” process, and settled on a compromise.

We did not abandon our vision entirely. We just decided to build towards this “one button push” process incrementally, and iteratively, with the ability to deliver progress, benefits, and continuous improvement along the way.

Here’s how we did it.

Our (Temporary) Compromise: “Two Button Push” Automation

Our approach was simple.

Instead of automating everything, all at once, we would instead create each automated step individually, and then gradually connect them over time into one big, complete automation.

To do so, we would leave “blank space” in the pipeline process, representing separate steps that we would eventually automate, but which still required manual intervention for now.

We would consider each of these remaining interventions a “button push”, and, over time, we would work to reduce the number of “button pushes” required to complete the process.

While we still ultimately wanted to reduce the number of “button pushes” down to one, we set the short-term goal of getting them down to two, and effectively creating a “two button push” automation.

The two buttons we decided to leave in the process were:

  • Button One: The Developer would manually submit a script, which would send the code to the TEST environment and automatically launch installation.
  • Button Two: The QA lead would manually test the code change and allow it to be promoted to PRODUCTION.

By achieving this, we would take our initial automation, improve on it, and bring it much closer to our final vision for all automations within this project— and do so without significantly slowing down the project’s progress.

Here’s what we did to create this “two button push” automation.

Identifying the Manual Steps that Remained in the Process

First, we laid out the existing process in detail, and identified each of the specific steps that still required a manual action to move the process forward.

While we already had a rough idea of where the manual “button pushes” were, we now defined these steps in greater detail.

Here is what the process looked like when we started work.

  1. The Developer writes the code.
  2. The Developer finishes the code, and then manually sends it to the TEST environment, leaving a comment on the RTC record to change its status.
  3. During the next review, the RTC Operations Analyst runs an automated process to deploy whatever code changes were set to the “Ready for Test Installation” status.
  4. When this process completes, the Operations Analyst manually adds comments about what actions were performed, and manually informs Testers and Business Analysts that they can perform their tests.
  5. (Optional) If some of the installation actions were not completed automatically, then the Operations Analyst has to fix them manually.
  6. The Tester—with the support of the Operations Analyst—performs testing. If they find any bugs, then the process loops back to Step 1 and repeats itself.
  7. When the testing completes, the Business Analyst defines a time for the code’s promotion to PRODUCTION.
  8. Before the promotion to PRODUCTION occurs, the Operations Analyst needs to manually customize the package, then transfer it, and finally launch the automated deployment process.

After reviewing this existing process, we saw that there were still four to five “buttons” being pushed, representing manual interventions that could be automated.

They were:

  • In Step 2, the code was manually promoted from DEVELOPMENT to TEST to PRODUCTION. (This would also be true if the process had to repeat as in Step 6, but we chose to focus on a straightforward sequence at first.)
  • In Step 3 and Step 4, the Operations Analyst had to manually launch the processes, leave comments, and queue up the next steps.
  • In Step 5, the Operations Analyst had to manually fix installation problems. (While some of these would always be unavoidable, we sought to minimize those occurrences.)
  • In Step 8, the Operations Analyst had to manually customize, transfer, and launch the code’s automated deployment process.

We now had a clear picture of what still needed to be automated, and got to work.

Automating the Remaining Manual Steps

We performed the following work to automate these manual steps, and to bring everyone involved on board with the new way of completing the process.

  • Using UrbanCode Deploy (UCD) functionality that we had previously installed on the mainframe, we created a script that would directly run pipelines from the mainframe itself.
  • We set up a “buztool” utility on the mainframe, which allowed us to transfer z/OS artifacts to the UCD code station.
  • We added RTC integration via its REST API. (Note: The UCD plug-in for RTC doesn’t have all the functionality we needed—such as file attachments, status checks, and custom fields—so we were forced to develop several bash scripts and execute them in our pipeline.)
  • We created a Slack channel, where the client’s users of our pipelines (their Developers and Testers) could report the status of each deployment from the pipeline. We integrated Slack with UCD—through the available Slack plugin—and created simple, automated status notifications for everyone.
  • We documented the deployment process, the RTC statuses’ workflow and fields, and the responsibilities for every role involved in controlling the pipeline.
  • After completing this documentation, we hosted multiple educational sessions—with live demos—to teach the new processes to every team involved.
  • Finally, we created an additional Slack support channel, where our client’s users—from any team that used the pipelines—could ask questions, request support, share ideas, and report bugs.
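To make the RTC gap concrete, here is a minimal sketch of the kind of bash helper this involved: one that leaves a comment on an RTC work item through a REST call. The host name, endpoint path, and payload shape are illustrative assumptions, not the client’s actual configuration, and a DRY_RUN switch prints the request instead of sending it.

```shell
#!/bin/sh
# Sketch of an RTC REST helper (illustrative only). The URL, endpoint
# path, and JSON shape are assumptions, not the client's real setup.

RTC_URL="${RTC_URL:-https://rtc.example.com/ccm}"   # hypothetical host
DRY_RUN="${DRY_RUN:-1}"                             # 1 = print, don't send

rtc_add_comment() {
    workitem="$1"
    text="$2"
    url="${RTC_URL}/oslc/workitems/${workitem}/rtc_cm:comments"
    body="{\"dc:description\": \"${text}\"}"
    if [ "$DRY_RUN" = "1" ]; then
        # Show the request that would be made.
        printf 'POST %s %s\n' "$url" "$body"
    else
        curl -s -u "${RTC_USER}:${RTC_PASS}" \
             -H "Content-Type: application/json" \
             -X POST -d "$body" "$url"
    fi
}

rtc_add_comment 12345 "TEST installation completed successfully"
```

In the real pipeline, similar helpers also covered the file attachments, status checks, and custom fields that the UCD plug-in did not support.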

It was a lot of work. But at the end of this initiative, we created a fully operational “two button push” process.

Here’s what that looked like.

Building the New “Two Button Push” Process

The client’s new process was much simpler, more streamlined, and more fully automated than the process we started working on.

Step 1: The Developer writes their code, and adjusts the template of the so-called “shiplist” .xml file that describes which files contain the code change. The Developer sets the RTC record to the “Ready for Test Installation” status, and then submits the JCL job, all with one command. The script will then send the code to the UCD CodeStation, and begin the deployment process.

Step 2: The automated deployment process will install the code changes onto the TEST system. This process will leave a comment on the RTC record that outlines all activities performed, and change the status to “Ready for Test”—if the installation was successful—or “Development”—if the installation failed.

Step 3: All relevant team members will be notified via Slack regarding the result of the deployment.
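The notifications in Step 3 were produced through the UCD Slack plugin, but at their core they amount to a short JSON message posted to a Slack channel. Here is a rough sketch of what one boils down to; the webhook URL is a placeholder, and DRY_RUN=1 prints the payload rather than sending it.

```shell
#!/bin/sh
# Sketch of a deployment-status notification like the ones the UCD Slack
# plugin posted for us. The webhook URL below is a placeholder.

SLACK_WEBHOOK="${SLACK_WEBHOOK:-https://hooks.slack.com/services/T000/B000/XXXX}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the payload instead of sending it

notify_deploy_status() {
    change_id="$1"
    status="$2"    # e.g. "Ready for Test", or "Development" if it failed
    payload="{\"text\": \"Change ${change_id}: TEST deployment finished with status ${status}\"}"
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$payload"
    else
        # Slack incoming webhooks accept a simple JSON body with a "text" field.
        curl -s -X POST -H "Content-Type: application/json" \
             -d "$payload" "$SLACK_WEBHOOK"
    fi
}

notify_deploy_status 12345 "Ready for Test"
```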

Step 4: The Testers and Business Analysts will test the application, with any technical, manual support they need provided by the Operations Analyst.

Step 5: When the testing is finished successfully, the person responsible will change the status to “Ready for Production Installation”. (The installation time can be shifted, if needed.)

Step 6: When the status is “Ready for Production Installation”, the Operations Analyst will send the code to the PRODUCTION environment, and run the automated deployment process.

Step 7: The code will be automatically deployed to PRODUCTION.
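Putting the steps above together, the Developer’s “Button One” from Step 1 can be pictured as a single wrapper command. This is only a dry-run sketch: the job name, dataset, and printed lines are invented for illustration. The real command ran on the mainframe, updated the RTC record through its REST API, and shipped artifacts to the UCD CodeStation via buztool.

```shell
#!/bin/sh
# Dry-run sketch of the single "Button One" command. All identifiers
# (job name, dataset, messages) are hypothetical, for illustration only.

deploy_to_test() {
    change_id="$1"     # RTC work item for this code change
    shiplist="$2"      # shiplist .xml listing the changed files

    # 1. Flag the change as ready; the real pipeline did this through
    #    RTC's REST API.
    echo "RTC work item ${change_id} -> Ready for Test Installation"

    # 2. Submit the deployment JCL; the job transfers the artifacts named
    #    in the shiplist to the UCD CodeStation and starts the TEST pipeline.
    echo "SUBMIT DEV.DEPLOY.JCL(SHIPJOB) SHIPLIST=${shiplist}"
}

deploy_to_test 12345 shiplist.xml
```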

There you have it.

In case you missed them, the two remaining “buttons” to manually “push” in this process are:

  • Button One: The Developer submits the JCL job.
  • Button Two: The assigned person sets the status to “Ready for Production Installation”, and runs the production deployment process directly from UCD.

The result— a much simpler, more elegant, faster, and more accurate process than this client originally deployed.

Making it Stick: Overcoming Resistance to the New Process

As a whole, the client’s teams were receptive to this change in their process, and quickly saw its benefits.

However, we did face some initial resistance from the Developers involved.

They had good reason to put up a little resistance, and to feel a little skeptical of the new process.

Simply put — they were just used to their old way of working. They did not initially understand how the new process would make everyone’s life easier. And they understandably worried that the new process would actually make their lives harder.

Here’s why.

In the old process, Developers sent their code directly to the pipeline. They felt that if there were any mistakes in their code, then the Operations Analysts would just be able to fix those issues.

But within the new process, Developers had to learn a new way of doing things. They had to adjust their processes and outputs to fit cleanly into the automation’s protocols. If they produced an error in the package they sent—for example, if they gave a data set the wrong name—they would receive a notification, and have to fix it themselves.

Developers saw all of this, and felt it was all just too complicated, too inflexible, and that it would just give them more work.

They saw that they would have to become more careful and detail-oriented about certain things, like how they unified the package to make sure what they submitted was 100% compatible with the new automation process. They also felt they were suddenly responsible for doing some of the Operations Analyst’s job for them.

These were legitimate concerns. And while we knew they weren’t really accurate, we understood why the client’s Developers worried about them, so we provided as much support and education as we could to disarm their worries and overcome their resistance.

  • We created a dedicated Slack channel for their support.
  • We taught them the new standards through direct training.
  • We explained how to organize code changes to work within the automated pipeline.
  • We offered hands-on support anytime they submitted a package and encountered an issue.
  • We documented everything a Developer would need to know about the pipeline, and provided ample education and online assistance.

Each of these steps helped, but ultimately there was only one thing that melted away the last of the resistance— time.

Over time, the Developers learned the new approach. They got better at submitting compliant packages, they experienced fewer frustrating moments, and they saw the process-wide benefits of the new approach come to life for everyone involved.

Delivering Meaningful Process Automation and Improvement

The client had already seen some real benefits from the imperfect and incomplete process we initially delivered.

Those benefits only grew after we further developed that process into a “two button push” automation.

Over the course of this stage of the project:

  1. We dramatically reduced the time, effort, and attention required to complete this process, while increasing the collaboration and transparency of the process. By freeing up this time, the client’s employees were able to focus on more valuable work than manually deploying code.
  2. We reduced the amount of communication required between team members. Developers didn’t need to write long instructions and requests to Operations Analysts to promote code— they just organized the transfer of their finished code in the proper way.
  3. We created a repository of code that was stored on the mainframe, with code changes duplicated on an independent UCD server.
  4. We collected long histories of all deployments and their logs, giving us the ability to analyze changes to installations and their impact (even if they occurred a long time ago).
  5. We documented all actions involving code changes in one place, in a common manner for all teams. These well-organized RTC records helped us navigate deployment histories.

We delivered all of the above in just six weeks, with the first four weeks devoted to creating the automation’s technical infrastructure, and the remaining two weeks focused on education, training, and adoption.

And this was still just an early stage in the larger project we have delivered for this client.

Next Steps: Reviewing The Tools That Made All of This Possible

Before we move on to the next stage of this project’s development, we are going to take a brief pause.

In the next article, we are going to discuss the exact tools that we have tested throughout this project, and explain what we learned about them— which tools helped, which tools were insufficient, and which tools might be best to deliver similar results within your unique context.

If you would like to learn more about this project without waiting, then contact us today and we will talk you through the elements of driving a DevOps transformation for mainframe that are most relevant to your unique context.