Project & Program
Case Study – Department of Agriculture, Water and the Environment
The Department of Agriculture, Water and the Environment (DAWE) had spent several years developing a new system to manage export documentation, and had made multiple attempts at taking it to production, by the time Chalfont Consultants were engaged. A robust approach to service readiness and cutover was required to match the scale of the change being introduced with the new system.
The purpose of the engagement was to provide DAWE and the project team with:
- An understanding of current risks and issues related to system readiness or cutover;
- A plan to mitigate or manage system readiness, management, platform or support problems;
- The design and implementation of service management, including system cutover and system support arrangements;
- A plan to maintain application and technical support alongside continual feature development across multiple shared environments; and
- The implementation of a ‘whole of solution’ approach in relation to the application and dependencies.
Upon initiation of the engagement the following approach was defined:
Review and update Service Management Plans to provide clarity to support teams and business stakeholders in relation to application maintenance, technical support and business support. Existing plans were reviewed, and a new suite of plans was identified to define the processes associated with key service management concepts covering service design, service transition and service operation.
These plans focussed on support, early life support measures, incident and problem management, monitoring and alert processes, service continuity, backup and recovery, and implementation and rollback.
Review solution design and implementation of the application and its dependencies from several angles to understand and document infrastructure details and application system pathways. This provided an understanding of the operating systems currently installed, backup method and frequency, security vulnerabilities, monitoring configurations and system ingress channels and pathways.
System readiness risks were identified along with mitigation plans to address issues such as unsupported or end of life hardware, outdated operating system versions, backup frequency and the potential for data loss. These risks were prioritised and mitigated by upgrading virtual servers and replacing hardware, along with the systems installed on it, to address software currency. Backup frequency was also addressed by remediating the database backup method, moving from daily snapshots to incremental backups every five minutes.
The managed shared environments were reviewed to give development teams clear patterns and workflows to adhere to, minimising the introduction of regressions into the application, dependent components and the shared services. This planning also clarified how to maintain application versions and configuration management across environments so that production stayed aligned with the lower environments.
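The effect of the backup change on potential data loss can be illustrated with a short calculation. This is a minimal sketch, assuming the worst-case data loss window is simply the interval between backups (it ignores backup duration, restore time and any replication); the function name is hypothetical and is not taken from the engagement.

```python
from datetime import timedelta


def worst_case_data_loss(backup_interval: timedelta) -> timedelta:
    """Assumption: a failure just before the next backup loses one full interval of data."""
    return backup_interval


# Intervals described in the case study: daily snapshots vs incremental backups every 5 minutes.
daily = worst_case_data_loss(timedelta(days=1))
incremental = worst_case_data_loss(timedelta(minutes=5))

# Dividing one timedelta by another gives a plain float ratio.
reduction_factor = daily / incremental

print(f"Daily snapshots: up to {daily} of data at risk")
print(f"Five-minute incrementals: up to {incremental} of data at risk")
print(f"Worst-case window reduced by a factor of {reduction_factor:.0f}")
```

Under that assumption, the worst-case exposure falls from 24 hours of data to five minutes.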
The primary methodologies and frameworks utilised for this engagement are listed below along with a summary of how they were implemented.
- Agile - iterative planning and prioritisation, along with inspect and adapt loops, were applied to plan for and adjust to changing resource availability and time constraints;
- ITIL - service design, service transition and service operation frameworks were adopted to provide guardrails for the implementation of readiness and cutover activities; and
- Risk assessment and mitigation - DAWE risk management policies and processes were followed to identify, assess and control risks throughout the engagement. This information was fed into executive briefings, planning cycles and prioritisation.
A large portion of the engagement involved continual risk identification and mitigation. Below are some key risks that were identified and the controls applied to mitigate them.
End of life hardware - Several components within the solution were hosted on hardware that had reached the end of its supported life. This meant that the hardware could not easily be replaced if required, which posed a significant risk to the project if realised. This risk was mitigated through the establishment of an initiative to address software currency, in which hardware was replaced with virtual servers and up-to-date operating systems were installed and configured.
End of life software versions - As with the hardware risk above, several middleware systems and the database upon which the project solution relied were end of life and unsupported. This risk was mitigated through the replacement of virtual servers and the migration and configuration of the systems they hosted.
Unsupported software systems - Some components within the solution utilised open source software that, by its nature, is not supported other than through community involvement. This risk was accepted given how the software was used within the overall solution and because an alternative could be substituted if problems were encountered that could not be easily resolved.
System redundancy - Some key components within the solution were identified as single points of failure with no redundancy measures. This risk was addressed through the establishment of an initiative to design a high availability configuration for these components and then install, configure, test and deploy it through the various environments.
Lack of support process - Some thought and planning had gone into support processes during previous attempts at taking the system to production, but some key aspects of service operation were missing. To address this risk, a holistic approach to service management was planned to address known and possible issues through the service transition and service operation phases.
The primary challenges faced as part of this engagement were time and enablement from service teams.
The initiative began in early December 2020 with a go-live date of 30 April. This left just over four months to plan, socialise, implement and refine service management processes as well as mitigate the risks identified throughout the process. As previously mentioned, agile techniques were adopted to plan, prioritise, inspect and adapt work as we progressed towards the go-live date. This approach allowed us to complete work incrementally and to continually adapt and re-prioritise. The volume of work identified in the initial plan was large, so some work was still outstanding when the system went live. However, because we had focussed on the highest priority and highest risk items first, the remaining work did not derail the system cutover.
Throughout the engagement there were times when it was difficult to get service teams and the managed service provider to assist. This was due to various reasons, including commercial coverage and the priorities of their own work programs. To address this, a RASCI (Responsible, Accountable, Supporting, Consulted, Informed) matrix was developed to clearly articulate the service transition and operation activities associated with the solution and to agree on responsibilities across the organisation. This established a baseline from which further work could be drawn down. Additionally, the challenges relating to respective team priorities meant that adequate lead times had to be provided to allow the work to be completed, and new commercial arrangements had to be established to cover additional work that was not already part of contractual obligations.
Smooth go-live transition - the planning, work prioritisation and implementation of service management enabled a smooth service transition. Some issues and problems were encountered, as no amount of planning could have prevented every issue, but the processes implemented allowed issues to be resolved quickly as they arose.
Extensible plans and processes - the plans and processes implemented through this engagement go above and beyond anything the department had implemented for similar production systems. Several initiatives now underway are starting to adapt the documentation for their own purposes, in the hope that it will allow for a smooth transition and enable a consistent approach across the organisation.
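A RASCI matrix of this kind can be held as a simple mapping and checked for gaps before it is agreed. The sketch below is illustrative only: the activities, team names and assignments are hypothetical, and the rule that each activity needs exactly one Accountable party and at least one Responsible party is a common convention assumed here, not something stated in the engagement.

```python
from collections import Counter

# Role codes: Responsible, Accountable, Supporting, Consulted, Informed.
ROLES = {"R", "A", "S", "C", "I"}


def validate_rasci(matrix: dict[str, dict[str, str]]) -> list[str]:
    """Return a list of problems found; an empty list means the matrix passes."""
    problems = []
    for activity, assignments in matrix.items():
        counts = Counter(assignments.values())
        unknown = sorted(set(counts) - ROLES)
        if unknown:
            problems.append(f"{activity}: unknown role codes {unknown}")
        # Assumed convention: exactly one Accountable party per activity.
        if counts["A"] != 1:
            problems.append(
                f"{activity}: expected exactly one Accountable, found {counts['A']}"
            )
        # Assumed convention: at least one Responsible party per activity.
        if counts["R"] < 1:
            problems.append(f"{activity}: no Responsible party assigned")
    return problems


# Hypothetical example data: activity -> {team: role code}.
matrix = {
    "Incident management": {
        "Service desk": "R",
        "Business owner": "A",
        "Managed service provider": "S",
        "Project team": "C",
    },
    "Backup and recovery": {
        "Managed service provider": "R",
        "Project team": "S",
        "Business owner": "I",
    },
}

problems = validate_rasci(matrix)
for problem in problems:
    print(problem)
```

In this made-up data the second activity has no Accountable party, which is the kind of gap a check like this would surface before responsibilities are signed off.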