An Analysis of the APRA Cloud Computing Services Paper Update

An Analysis of the Australian Prudential Regulation Authority (APRA) Shared Computing Services Paper Update

In Q3 2018, the Australian Prudential Regulation Authority (APRA), an independent statutory authority that supervises institutions across banking, insurance and superannuation and promotes financial system stability in Australia, published an addendum to its original 2015 shared computing services paper. This addendum discusses and extends the themes addressed in the original document.

It is clear when reading this paper, and comparing it to its 2015 predecessor, that the new document sets a much higher bar in terms of APRA’s view of regulated cloud practitioners. It also better defines the expectations it has of entities seeking to build or move services into material outsourcing arrangements with cloud providers.

APRA’s advancement in cloud technology understanding, and its further engagement with stakeholders and platform providers in the Financial Services Industry (FSI) vertical, have a number of benefits, including:

  • APRA has become much more accustomed to working with financial services institutions (FSIs) using public cloud providers for the launch or migration of material services. FSIs embarking on these cloud journeys now have a wealth of precedents, industry case studies, awareness and experience to draw upon as well as conversations and consultations with the regulator.
  • As the regulator has advanced its understanding, cloud providers themselves have listened to FSIs (and their regulators) regarding their unique challenges. As a result, two major shifts are evident in this paper that significantly reduce the regulatory and technical ambiguity existing in prior APRA documents:
    • The regulator has done a much better job of defining its expectations of regulated entities and how they should conceptualise cloud strategies, develop executive cloud governance and implement a controls ecosystem able to maintain control effectiveness as consumption of these services scales across hundreds of products and teams.
    • Cloud providers have observed the challenges of both the regulator and their clients with balancing the need for effective controls and innovation. As a result, the major cloud providers have substantially uplifted their platform control features relevant to FSIs, and provided resources and advisory papers as to how these can be used to meet control objectives and regulatory obligations.

General Document Guidance and Callouts

General Takeaways

APRA seeks to address the weaknesses it has observed since its initial cloud usage publication as the appetite and sophistication for cloud workload deployments and use cases have blossomed. An important theme in the paper is that APRA rightly considers cloud outsourcing risk as the confluence of both usage practices and workload/data.

APRA notes this evolving risk landscape as a cause for continued focus based on industry usage trends it has observed across its jurisdiction, including:

  • Increasing appetite for heightened inherent risk activities using cloud computing services since the “easy wins” have been achieved.
  • FSIs have moved beyond low inherent risk asset migration and are moving rapidly forward with heightened inherent risk workloads. Some advanced FSIs are considering their first waves of “system of record” migration.
  • Utilisation of high value/impact PaaS services such as Artificial Intelligence (AI) and Machine Learning (ML) that can process huge volumes of protected and confidential data has increased concern around data loss.
  • As the appetite to consume exponentially growing features across cloud platforms increases, combined with large cultural shifts to DevOps practices in FSIs, the regulator has been concerned with cloud practice ‘creep’ whereby existing governance and controls procedures have not been able to scale consistently across the enterprise (constituting major complexity challenges for the C-suite and boards of FSIs).

Risk Identification and APRA Engagement

In the latest iteration, APRA has clarified its workload, risk and impact classification framework.

This three-tiered framework for regulated parties provides clear terms and classifications as to how regulated entities should conceptualise risk and conduct appropriate internal governance, controls implementation, disclosure and liaison activities when considering cloud computing services for material business activities. Importantly, the commentary provided by APRA around this framework outlines clear boundaries and the undertakings required for activities and workloads falling into each classification.

Risk Categorisation and APRA Guidance

Low inherent risk
  • Does not involve off-shoring.
  • APRA would not expect an APRA-regulated entity to consult with APRA prior to entering into the arrangement.

Heightened inherent risk
  • Service failure would lead to diminished ability to service material obligations.
  • APRA would expect to be consulted after the APRA-regulated entity’s internal governance process is completed.

Extreme inherent risk
  • APRA encourages earlier engagement as these arrangements will be subjected to a higher level of scrutiny.
  • APRA’s satisfaction is required prior to entering into the arrangement.
  • The entity must demonstrate that it understands the risks associated with the arrangement, that its risk management and risk mitigation techniques are sufficiently strong, and that appropriate, sustainable control procedures and governance are in place.

The newly clarified risk framework and associated disclosure and engagement expectations will provide regulated entities with greater clarity and confidence when initiating programs of work to migrate workloads and services to cloud providers.
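
The three tiers and their engagement expectations lend themselves to a simple lookup that a cloud governance team could keep alongside its workload register. The following is a minimal sketch only: the tier assigned to each workload is assumed to come from the entity's own internal governance process, and the names `InherentRisk`, `Engagement`, `engagement_for` and `workloads_needing_consultation`, as well as the example workloads, are hypothetical. The wording of each expectation is paraphrased from the guidance summarised above, not quoted from APRA.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List


class InherentRisk(Enum):
    LOW = "low"
    HEIGHTENED = "heightened"
    EXTREME = "extreme"


@dataclass(frozen=True)
class Engagement:
    consult_apra: bool  # whether APRA expects to be consulted at all
    when: str           # paraphrase of the expected timing


# Paraphrased from the guidance summarised in this post (not APRA's wording).
ENGAGEMENT: Dict[InherentRisk, Engagement] = {
    InherentRisk.LOW: Engagement(False, "no prior consultation expected"),
    InherentRisk.HEIGHTENED: Engagement(True, "after internal governance is completed"),
    InherentRisk.EXTREME: Engagement(True, "early, before entering the arrangement"),
}


def engagement_for(tier: InherentRisk) -> Engagement:
    """Return the engagement expectation for a given inherent risk tier."""
    return ENGAGEMENT[tier]


def workloads_needing_consultation(register: Dict[str, InherentRisk]) -> List[str]:
    """List workloads (alphabetically) whose tier implies consulting APRA."""
    return sorted(
        name for name, tier in register.items() if engagement_for(tier).consult_apra
    )


if __name__ == "__main__":
    # Hypothetical workload register; tiers are assigned by internal governance.
    register = {
        "marketing-site": InherentRisk.LOW,
        "payments-gateway": InherentRisk.HEIGHTENED,
        "core-ledger": InherentRisk.EXTREME,
    }
    for name in workloads_needing_consultation(register):
        print(name, "->", engagement_for(register[name]).when)
```

Keeping the mapping as data rather than branching logic means the expectations can be reviewed by risk and compliance staff without reading code.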

    Control Effectiveness at Scale and Organisational Change

    Consistent weaknesses that both the regulator and Sourced are observing across the FSI and risk-averse enterprise landscape are centred around the traceability of strategic drivers used for controlling cloud services, including:

    • Demonstrable business benefits and strategic drivers for migrating workloads to cloud.
    • Appropriate target state architecture, operating models and control management procedures for cloud-based operations at scale. This also includes traceability for how target states will assist the regulated entity in realising its stated benefits goal.
    • Failure to consider, plan and test acceptable business continuity recovery from plausible cloud provider failures. This includes such items as financial failure, catastrophic fault or sabotage, etc.
    • APRA has specifically called out that it is expecting to see more considered thought from regulated entities regarding the nature in which they are orchestrating and consuming cloud services. Its perspective is that different methods of abstraction can comprise different levels of risk exposure. In APRA’s view, enterprises have already demonstrated an uneven understanding of how these different layers of abstraction could result in different inherent risks to their compliance obligations when combined with their workloads and data.
    • A strong theme in both APRA information papers and borne out by what Sourced has observed in both FSI and other verticals is that enterprises are getting better at conceptualising risk vectors and designing controls to achieve a desirable residual risk exposure. However, they continue to struggle to scale, operate and iterate controls beyond “Day 1” or moving from project mode to BAU mode. Enterprises and their partners need to invest significantly more effort into ensuring ongoing ownership, management, operability, currency and iteration of controls as threat vectors and risk dynamics change.
    • As organisations scale cloud usage, a common theme regarding their controls posture is creep between teams in the ongoing iteration and management of key controls.
    • Further to this, enterprises need to place a greater focus on deeper real-world simulations of risk vectors and control breach scenarios. There was a common trend of observed weakness in the rigour and ongoing frequency of realistic control testing and response simulation. Further investment in realistic risk and security game days and SecDevOps practices is required in many instances.
    • A substantial portion of enterprises are still treating DevOps teams and security as isolated functions rather than blending security practitioners into platform and product teams.

    Sourced Perspectives

    Strategy and Governance
    • Cloud transformation and migration projects need to expend greater effort on the granular definition of risks and their impacts. APRA has observed weaknesses in the form of loosely defined risks or failure to effectively outline their impact on regulated workloads.
    • Cloud transformation programs need the right balance of workloads and masthead projects with realistic benefits and risk profiles – racing to cloud can be a risky endeavour and there is no compression algorithm for experience. Our most successful clients plan transformation programs that see increasingly important workloads moved to cloud as control maturity and organisational understanding improves.
    • Across Sourced’s global operations, we often see clients base their entire strategic cloud business case on cost savings or avoidance. Using this reasoning alone could risk the regulator, at least in Australia, requiring more strategic justifications to be satisfied with a cloud migration program (especially in high or extreme inherent risk categorisations). Regulated enterprises need to thoroughly consider their cloud business cases and priorities to demonstrate tangible strategic value such as risk reduction, time to market, scale, innovation and business/technology simplification.
    • Whilst a significant portion of this blog has focussed on effectively mapping and controlling risk, we also observe that enterprises are often too conservative in their cloud programs and struggle to get traction because they are unable to make effective risk-weighted decisions based on use case and workload. Enterprises that consider only worst-case impact and start too small risk sabotaging their program at the starting line.
    • Governance models / steering committees should not be siloed to technology as has often been the historic case with IT infrastructure programs.
    • Steering committees or tribe leadership groups in SAFe should have cloud transformation decision-making frameworks that define which stakeholders have decision rights in respective domains. This will have a material bearing on the controls outcome of the program when the appropriate decision transparency is in place.
    • It is important that all regulated entities consider ensuring that cloud program charters and benefit realisation strategies are closely coupled to the IT strategy/business case for cloud. These charters should clearly demonstrate consistency and traceability to these drivers to satisfy the regulator.

    Organisational Change
    • Sourced consultants commonly observe that some enterprises show weakness in understanding operating model change implications when adopting cloud services (i.e. procedural change, competency and capability change, risk in associated transition/change). This results in what could be described as degraded competence and capability to respond to risk events.
    • Quite often during large enterprise-grade cloud transformation programs, we observe clients transitioning away from traditional siloed enterprise IT roles into the cross-functional “everything as code” cloud world. These clients are usually willing participants in this transition but bring with them “point in time” thinking about how risks evolve and what is required to control them. Regulated enterprises need to ensure that they have appropriately experienced staff and partners to guide teams in cloud native controls design and operationalisation.

    Emerging Opportunities

    The last six months have seen an enormous flurry of feature innovation and announcements by the “big three” public cloud providers. Several announcements especially stand out as potential game changers for regulated FSIs in 2019 in terms of how these features can simplify and accelerate their cloud transformation programs whilst maintaining regulatory obligations.

    Back to the Data Centre

    Amazon Web Services (AWS) and Google have joined Microsoft in announcing hybrid cloud services that can be hosted in the data centre with the benefit of public cloud APIs and orchestration experience. In the context of regulated FSIs that are daunted by the scale of controls and organisational change required to move material workloads to the cloud, these services could be a powerful enabler allowing cloud adoption with a lower barrier to entry, given that all services are “behind the firewall” and would require less controls transformation to get up and running:

    Amazon Web Services
    AWS Outposts

    Allows customers to use either VMware or AWS cloud native orchestration and operational practices on fully managed infrastructure pods which can be dropped into existing client data centres.

    Additional details on which AWS services will be available on Outposts are still pending.

    Microsoft Azure
    Azure Stack

    Microsoft was the original cloud services provider to offer an on-premise hybrid cloud solution.

    Azure Stack must be run on racks in the client data centre that are provided by an accredited Azure Stack partner, and it is capable of offering a substantial set of Azure IaaS & PaaS services.

    Google Cloud Platform
    GKE On-Prem

    A game changer for FSIs wanting to operate large container platforms without the headache and overheads of large on-premise cluster implementations and who lack the confidence in controls maturity to go the whole way to a cloud native implementation for material deployments.

    Google Kubernetes Engine (GKE) on-premise offers the ability to utilise the GCP management pane to manage Kubernetes clusters running in the client data centre.

    The management and operations of these clusters are radically simplified by Google offering OS and run-time security hardening. Underlying cluster management is handled by Google.

    An added advantage is that GKE on-premise deployments are fully portable to Kubernetes infrastructure running on GCP or other managed Kubernetes platforms.

    Data Driven & Machine Learning Cloud Native Controls

    Traditionally, a great deal of both on-premise and cloud security analytics have been driven by teams analysing known threat vectors and developing rules for the capture and alerting of security events. This approach has been highly restrictive due to the security skills required, the expense and time required to develop these rules and the limitation of threat intelligence that an individual organisation may have access to.

    Cloud providers in recent years have made large strides in developing solutions that solve these customer problems by using their scale, service-oriented architecture and threat intelligence to:

    • Develop and assimilate huge datasets which can be consumed by various security machine learning algorithms.
    • Provide machine learning algorithms and training to correlate security events of interest based on known threat intelligence datasets as well as evolving threats.
    • Package these non-bespoke data-driven algorithms at scale so that they can be consumed by customers for a fee.

    Examples include:

    Amazon Web Services
    GuardDuty, Security Hub & Macie

    Amazon GuardDuty is an easily enabled service that interrogates multiple streams of critical event logs in AWS to correlate security and compliance events of interest.

    GuardDuty behaves like a PaaS service in that, once turned on, it requires no ongoing management overhead in return for usage fees.

    GuardDuty event flags can then be used in whatever manner the organisation sees fit, including:

    • AWS Lambda functions can be developed to create event response workflows using the GuardDuty event metadata
    • GuardDuty event metadata can be ingested into an enterprise data visualisation tool of choice for dashboarding and alerting to DevOps and security teams
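
As an illustration of the first option, the sketch below shows what a small Lambda handler for GuardDuty findings might look like. It assumes findings are delivered to the function by an Amazon EventBridge (CloudWatch Events) rule, in which case the finding sits under the event's `detail` key with fields such as `id`, `type`, `severity`, `accountId` and `region`. The severity thresholds and the action names (`page-security-on-call`, `open-ticket`, `log-only`) are hypothetical placeholders, and the handler only decides on and logs an action rather than calling any downstream system.

```python
import json

# Hypothetical thresholds; tune these to the organisation's own risk appetite.
HIGH_SEVERITY = 7.0
MEDIUM_SEVERITY = 4.0


def route_finding(detail: dict) -> str:
    """Pick a (hypothetical) response path from a finding's severity score."""
    severity = float(detail.get("severity", 0))
    if severity >= HIGH_SEVERITY:
        return "page-security-on-call"
    if severity >= MEDIUM_SEVERITY:
        return "open-ticket"
    return "log-only"


def lambda_handler(event, context):
    """Entry point, assuming an EventBridge rule that matches GuardDuty findings."""
    if event.get("detail-type") != "GuardDuty Finding":
        return {"action": "ignored"}

    detail = event.get("detail", {})
    result = {
        "action": route_finding(detail),
        "finding_id": detail.get("id"),
        "finding_type": detail.get("type"),
        "account": detail.get("accountId"),
        "region": detail.get("region"),
    }
    # Structured log line that a dashboarding or alerting tool could ingest
    # from the function's logs.
    print(json.dumps(result))
    return result
```

Separating `route_finding` from the handler keeps the routing decision easy to unit test and to review with security teams, independently of how the event reaches the function.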

    Amazon Macie is a machine learning powered service which can be easily activated to scan the data storage services in AWS accounts, starting with S3.

    It interrogates these services to identify, classify and secure the data in these environments.

    Alerts and remedial actions can also be automated via Macie.

    AWS Security Hub is a centralised portal with aggregation and dashboards for the rollup of the various compliance and breach alerts across the AWS ecosystem (GuardDuty, Macie, Inspector) into a centralised space where teams can easily disseminate this data into actionable priorities.

    Microsoft Azure
    Security Center & Sentinel

    Azure Security Center is a detective control tool built by Microsoft as a managed service. Azure Security Center allows customers of Microsoft’s platform to:

    • Centrally manage policies across subscriptions
    • Report on the compliance state of subscriptions
    • Understand compliance over time

    Because Azure Security Center is built by Microsoft, it has a deep understanding of the Azure platform and can offer operational guidance directly to platform operators on the technical steps required to remediate a compliance breach.

    Azure Sentinel is a newer service from Microsoft representing their first foray into Security Information and Event Management (SIEM) products.

    Microsoft’s entry into this market is interesting, as it removes the burden of having to deploy and manage otherwise expensive and complicated products to collect and manage the data required to drive a SIEM platform. Sentinel, backed by Azure, effectively offers the elastic infrastructure that compute- and storage-intensive SIEM products need.

    Microsoft touts Sentinel as an agnostic SIEM product, and it is certainly an interesting product to observe as it matures.

    Google Cloud Platform
    Data Loss Prevention (DLP) API

    Google’s Data Loss Prevention (DLP) API is a “no setup / no fuss” service by Google that provides data scrubbing and redaction services for structured and unstructured data.

    Being entirely stateless and API driven, this provides a very interesting preventative control mechanism for consumers. The DLP API, used alongside other services, allows enterprises or organisations with sensitive data handling requirements to hash, redact or encrypt certain fields before persisting them. The result is removal of Personally Identifiable Information (PII) and hence a reduction in sensitivity of the data at rest.
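
The pattern of de-identifying fields before persistence can be sketched locally without calling any cloud service. The example below is not the DLP API and does not reproduce its request format; it is a plain Python stand-in, using keyed hashing (HMAC-SHA256) and a fixed redaction marker, to show the shape of the control. The function names, field names and key are hypothetical, and a real deployment would source the key from a managed secret store.

```python
import hashlib
import hmac

REDACTION_MARKER = "[REDACTED]"


def pseudonymise(value: str, key: bytes) -> str:
    """Keyed hash: the same input and key always give the same token."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def deidentify(record: dict, hash_fields: set, redact_fields: set, key: bytes) -> dict:
    """Return a copy of the record that is safer to persist.

    Fields in redact_fields are replaced outright; fields in hash_fields are
    replaced by a keyed hash so records can still be joined on them.
    """
    cleaned = {}
    for name, value in record.items():
        if name in redact_fields:
            cleaned[name] = REDACTION_MARKER
        elif name in hash_fields:
            cleaned[name] = pseudonymise(str(value), key)
        else:
            cleaned[name] = value
    return cleaned


if __name__ == "__main__":
    # Hypothetical record and key, for illustration only.
    record = {"customer_id": "C-1001", "name": "Jane Citizen", "balance": 250}
    safe = deidentify(
        record,
        hash_fields={"customer_id"},
        redact_fields={"name"},
        key=b"example-key-only",
    )
    print(safe)
```

Hashing identifiers with a key, rather than redacting them, is what keeps the data useful for analytics: aggregate and join operations still work, while the original values are not stored.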

    Given that several analytics use cases can be performed with this data removed, the DLP API essentially democratises the data, encouraging analytics and machine learning within the enterprise.

    Palo Alto Networks
    Redlock

    Recently acquired by Palo Alto Networks, Redlock is unique in the cloud security and compliance value proposition.

    Redlock is a SaaS product that can be configured with custom rules to interrogate accounts in all three cloud platforms, trigger alerts to operations teams and identify corrective actions. For enterprises pursuing a multi-cloud strategy, it is a powerful single-tool value proposition.

    We have worked with Redlock in large FSI enterprises with mature cloud practices to develop automated detect and correct frameworks (mode 3 cloud) across multiple cloud providers using a combination of Redlock alert signatures and corrective automation driven by serverless workflows.

    Final thoughts

    Sourced is pleased to see APRA’s continued advancement in cloud technology understanding and engagement with stakeholders and platform providers in the FSI vertical. It aligns with our own best practices and almost decade-long experience working with FSIs globally. When Sourced works with FSIs and other large, risk averse enterprises, it employs a holistic approach to assisting clients on their cloud transformation journeys.

    Through this approach, called Cloud at Scale™, we assimilate cloud native thinking, security, regulatory compliance, processes and organisational change, tools and techniques, developer experience, training and operational readiness to devise scalable strategies for achieving sustainable enterprise-wide cloud adoption.

    To learn more about the pitfalls and best practices for adopting cloud, please see our whitepaper: “Building the Core Foundations for Cloud at Scale”.
