How to build a HIPAA compliant cloud infrastructure - Part II

June 24, 2020
Weekly Shorts are topics we discuss in our weekly remote meeting related to recent work we have done with our customers

This is part two of a series on building HIPAA-compliant infrastructure in the cloud. If you missed it, start with part one.

AWS provides a generous, well-thought-out boilerplate for a HIPAA-compliant architecture. It's provided as a set of CloudFormation templates ready to deploy right off the shelf.

While you can take the template as is and deploy it into your AWS account, there are several key points to understand. It’s important to know what it generates and how it handles the environment lifecycle.

The HIPAA-compliant quickstart architecture by AWS consists of several major components:

A bird's-eye view of the entire infrastructure map, showing the communication between related components. Note the peering connection between the two VPCs: it can be removed to disconnect them in the event of a security breach. The layout also leaves room to deploy additional VPCs later, using the same group of proxies for external access.
  • VPC: A virtual private cloud represents an isolated network and provides the “perimeter” for the deployment of cloud resources. A VPC is usually divided into several subnets, typically half of them public and half private. A private subnet has no public access: aside from the bastion host, only neighboring resources can reach what runs inside it. Public subnets are the exposed part of the network and host only the gateways to the infrastructure. In our case, that means a load balancer, which routes users’ requests, and the VPN through which internal users access the backend entities. The template provides a basic structure for two VPCs. The management VPC holds the bastion host but can also be used for services such as a VPN or a CI server. The primary VPC is production, where the application resources live.
The production VPC layout, showing segmentation into separate subnets across different availability zones (for high availability), with a second degree of isolation across multiple private subnets. One subnet is dedicated to the database, a second to the application(s), and a third to gateways, load balancers, and proxies. This last subnet is the only one exposed to the outside world.
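The segmentation described above can be sketched with Python's standard `ipaddress` module. This is only an illustration of the idea, not the template's actual addressing: the CIDR range, zone names, and tier names are hypothetical.

```python
import ipaddress


def plan_subnets(vpc_cidr, zones, tiers, new_prefix=24):
    """Assign one non-overlapping subnet to every (tier, zone) pair."""
    network = ipaddress.ip_network(vpc_cidr)
    blocks = network.subnets(new_prefix=new_prefix)  # generator of /24 blocks
    plan = {}
    for tier in tiers:
        for zone in zones:
            plan[(tier, zone)] = next(blocks)
    return plan


# Hypothetical values, chosen only for the example.
PLAN = plan_subnets(
    "10.0.0.0/16",
    zones=["us-east-1a", "us-east-1b"],
    tiers=["public", "app", "database"],
)

for (tier, zone), block in PLAN.items():
    print(f"{tier:<9} {zone}  {block}")
```

Each tier gets one subnet per availability zone, so losing a zone never removes an entire tier.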
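  • Database (RDS): The Amazon RDS database is configured to use encryption at rest by default. RDS also supports SSL-encrypted connections: the database instance holds the server certificate and its private key, while AWS publishes the matching public certificates. It's the user's responsibility to use the public certificate in the application's data layer so the client can verify the server it connects to, which protects against man-in-the-middle (MITM) attacks. The public certificates to use in your application are published in the Amazon RDS documentation.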

Note: To stay compliant, using a secure SSL connection is mandatory. Implementation is as simple as downloading the public certificate and configuring it in whatever data interaction layer you use.
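As a minimal sketch of what "configuring the public certificate" can look like, the snippet below builds a strict TLS context with Python's standard `ssl` module. It assumes your database driver accepts an `ssl.SSLContext` (many do, but check yours); the function name and the bundle path in the comment are hypothetical.

```python
import ssl
from typing import Optional


def build_db_tls_context(ca_bundle_path: Optional[str] = None) -> ssl.SSLContext:
    """Return a TLS context that verifies the database server's certificate.

    ca_bundle_path is assumed to point at the certificate bundle downloaded
    from AWS, e.g. "/etc/ssl/rds-ca-bundle.pem" (hypothetical path).
    With None, the system trust store is used instead.
    """
    context = ssl.create_default_context(cafile=ca_bundle_path)
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    # Both settings are already the defaults; they are repeated here so that
    # nobody relaxes them by accident.
    context.check_hostname = True
    context.verify_mode = ssl.CERT_REQUIRED
    return context


CONTEXT = build_db_tls_context()
print(CONTEXT.verify_mode, CONTEXT.check_hostname)
```

Certificate and hostname verification are what actually defeat a man-in-the-middle; encryption without verification does not.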

  • Bastion: The bastion host (a.k.a. jump box) is a proxy server that allows tunneled connections to backend resources. While access to the bastion is secured with an encrypted access key, it doesn't employ any other security measures. A VPN, by contrast, can provide user-based access control and structured auditing. As far as HIPAA is concerned, a secured bastion is enough, but in general a bastion is not considered sufficiently secure. A better alternative is OpenVPN. Below, you can find more information on setting up a VPN as part of your HIPAA environment.

Note: Access to the bastion is guarded by the encrypted key. Yet it remains a single point of failure that is exposed to the network. In case the bastion is ever accessed by malicious entities, make sure it is used only as a tunnel and does not keep the keys of backend servers on its disk. One very good practice is using SSH agent forwarding.
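One way to apply the agent-forwarding advice is an SSH client configuration that enables forwarding for the bastion only. The helper below just renders such a stanza as text; the host name, user, and key path are hypothetical placeholders.

```python
def render_ssh_config(bastion_host, bastion_user, key_path, alias="bastion"):
    """Render an ~/.ssh/config fragment that forwards the agent to one host only."""
    lines = [
        f"Host {alias}",
        f"    HostName {bastion_host}",
        f"    User {bastion_user}",
        f"    IdentityFile {key_path}",
        "    ForwardAgent yes",
        "",
        # ssh uses the first value it finds for each option, so this
        # catch-all must come after the bastion entry.
        "Host *",
        "    ForwardAgent no",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical values for illustration.
CONFIG = render_ssh_config("bastion.example.com", "ec2-user", "~/.ssh/bastion_key")
print(CONFIG)
```

With this in place, private keys stay on the workstation and the bastion only relays authentication requests to the local agent.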

  • Config: This is the core of compliance when it comes to ongoing monitoring: it ensures that the infrastructure and its users adhere to preset rules. Each rule breach triggers an alarm. The alarm creates a notification event on SNS, which in turn notifies the user.

Rules include:

  • Required tags on EC2 instances
  • Required tags on volumes
  • Check that there are no unrestricted ports on security groups
  • Make sure only approved AMIs are used
  • Make sure CloudTrail is enabled
  • Check that SSH access is not enabled on security groups that are in use
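To make the security group rules in this list more concrete, here is a rough sketch of the kind of check they perform. It is not how AWS Config implements its managed rules; it simply inspects a dictionary shaped like the EC2 API's description of a security group. The sample group and its IDs are made up.

```python
OPEN_CIDRS = {"0.0.0.0/0", "::/0"}


def _open_ranges(permission):
    """Return the world-open CIDR blocks referenced by one ingress permission."""
    v4 = [r.get("CidrIp") for r in permission.get("IpRanges", [])]
    v6 = [r.get("CidrIpv6") for r in permission.get("Ipv6Ranges", [])]
    return [cidr for cidr in v4 + v6 if cidr in OPEN_CIDRS]


def _covers_port(permission, port):
    """True if the permission allows TCP traffic on the given port."""
    protocol = permission.get("IpProtocol")
    if protocol == "-1":  # "-1" means all protocols and all ports
        return True
    if protocol != "tcp":
        return False
    return permission.get("FromPort", 0) <= port <= permission.get("ToPort", -1)


def is_open_to_world(security_group, port):
    """True if any ingress rule exposes the port to the whole internet."""
    return any(
        _covers_port(permission, port) and _open_ranges(permission)
        for permission in security_group.get("IpPermissions", [])
    )


# Made-up sample in the shape returned by the EC2 API.
SAMPLE_GROUP = {
    "GroupId": "sg-0123456789abcdef0",
    "IpPermissions": [
        {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
         "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
        {"IpProtocol": "tcp", "FromPort": 5432, "ToPort": 5432,
         "IpRanges": [{"CidrIp": "10.0.4.0/24"}]},
    ],
}

print(is_open_to_world(SAMPLE_GROUP, 22))
print(is_open_to_world(SAMPLE_GROUP, 5432))
```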
  • CloudTrail: CloudTrail provides complete tracking of actions and API calls in the account. These can be broken down by service, user, time frame and more. When something goes wrong, CloudTrail is the go-to tool for learning what happened and mitigating the damage. The deployment of CloudTrail components can be found under the logging template. It consists of S3 bucket settings, roles, and policies. The template also describes CloudWatch log groups where logs can be followed in real time.
  • CloudWatch: Used for monitoring, alerting and aggregating logs and metrics. In the context of HIPAA, the CloudWatch components include a set of log groups and alarms. The alarms provide a set of checks that complement the account's Config rules.

Alarms include:

  • Alarms when an API call is made to create, update or delete an ACL
  • Alarms when an API call is made to create, update or delete a security group
  • Root user activity detected
  • Multiple unauthorized actions or logins attempted
  • IAM configuration changes detected
  • Warning: New IAM access key was created
  • Warning: Changes to CloudTrail log configuration detected
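The alarms in this list are driven by CloudTrail records. The sketch below shows, in plain Python, how a single record could be matched against several of those categories. It is an illustration only: the template's real alarms are CloudWatch metric filters, the label names are invented, and "ACL" is assumed here to mean a network ACL.

```python
# API call names as they appear in the eventName field of CloudTrail records.
NETWORK_ACL_CALLS = {
    "CreateNetworkAcl", "CreateNetworkAclEntry", "DeleteNetworkAcl",
    "DeleteNetworkAclEntry", "ReplaceNetworkAclEntry",
}
SECURITY_GROUP_CALLS = {
    "CreateSecurityGroup", "DeleteSecurityGroup",
    "AuthorizeSecurityGroupIngress", "AuthorizeSecurityGroupEgress",
    "RevokeSecurityGroupIngress", "RevokeSecurityGroupEgress",
}
CLOUDTRAIL_CALLS = {"CreateTrail", "UpdateTrail", "DeleteTrail",
                    "StartLogging", "StopLogging"}


def classify_event(event):
    """Return the (invented) alarm labels that one CloudTrail record would raise."""
    name = event.get("eventName", "")
    error = event.get("errorCode", "")
    labels = []
    if name in NETWORK_ACL_CALLS:
        labels.append("acl-change")
    if name in SECURITY_GROUP_CALLS:
        labels.append("security-group-change")
    if event.get("userIdentity", {}).get("type") == "Root":
        labels.append("root-activity")
    if "UnauthorizedOperation" in error or "AccessDenied" in error:
        labels.append("unauthorized")
    if name == "CreateAccessKey":
        labels.append("new-access-key")
    if name in CLOUDTRAIL_CALLS:
        labels.append("cloudtrail-change")
    return labels


# A made-up record: the root user switching off trail logging.
print(classify_event({"eventName": "StopLogging", "userIdentity": {"type": "Root"}}))
```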
  • CloudFront - S3: Assumes you already have a Hosted Zone registered with Amazon Route 53, and an ACM certificate ready. This template creates an Amazon Route 53 DNS record, an S3 bucket, and a CloudFront distribution. Do note the prerequisite of an ACM certificate, the ARN of which will be used to create a cache distribution.
  • IAM: Creates a better hierarchy for managing identity and access control in the account. Among other things, the deployment creates an admin group and a sysadmins group, with matching roles and policies that adhere to the principle of least privilege. It’s important to stress that these are only basic implementations. The user is expected to follow through and grant limited privileges to applications, whether K8s pods, ECS services, Elastic Beanstalk containers or anything else.
  • EC2: This is the essence of the entire environment, although in the template EC2 only plays the role of a placeholder. Whether your application is a container, a multi-service cluster or a serverless function, this is where it comes in. The EC2 component is a naive instance under an auto-scaling group and a load balancer, both of which are attached to the instance and manage it. If the instance is terminated, the auto-scaling group launches a new one. If the instance crosses a certain load threshold, new instances are added to share the load.
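The scaling behavior described in the EC2 item can be expressed in CloudFormation in several ways. The sketch below generates a target-tracking policy resource as JSON. Target tracking is my choice for the example and may differ from what the Quick Start template actually uses; the logical ID and the 60% target are hypothetical.

```python
import json


def cpu_target_tracking_policy(asg_logical_id, target_cpu_percent):
    """Build a CloudFormation resource that keeps average CPU near a target."""
    return {
        "Type": "AWS::AutoScaling::ScalingPolicy",
        "Properties": {
            # Ref points at the auto-scaling group defined elsewhere in the stack.
            "AutoScalingGroupName": {"Ref": asg_logical_id},
            "PolicyType": "TargetTrackingScaling",
            "TargetTrackingConfiguration": {
                "PredefinedMetricSpecification": {
                    "PredefinedMetricType": "ASGAverageCPUUtilization"
                },
                "TargetValue": target_cpu_percent,
            },
        },
    }


# Hypothetical logical ID and threshold.
POLICY = cpu_target_tracking_policy("AppAutoScalingGroup", 60.0)
print(json.dumps({"Resources": {"CpuScalingPolicy": POLICY}}, indent=2))
```

Generating the fragment from code is optional; the same resource can be written by hand in the template.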


What's missing?

Unfortunately, the template only takes you so far in deploying a working environment. Key components are still missing:

  • VPN - A VPN per se is not part of the implementation, simply because of the many different options on the market. A bastion is one way of proxying into an environment, but it’s much easier to compromise or tamper with. A VPN product should be added alongside the bastion host for real access control.

TIP:
OpenVPN is offered for free in the AWS Marketplace. Its free tier includes two simultaneously connected users. Another good open-source option is Algo VPN.

  • Encryption - SSL certificates and protocol termination - A prerequisite for the environment is an ACM certificate for CloudFront to use in its distributions. Once a certificate is in place, encrypted traffic can be routed through the CDN to the load balancer, and from there requests proceed to the application. It’s recommended to keep traffic encrypted throughout. This means the load balancer listens only on port 443 (HTTPS) and forwards traffic to the application over HTTPS as well. In that case, it's the developers' responsibility to receive and handle encrypted requests. Data layer requests to the database should also use SSL to encrypt the traffic.
  • Application deployment - The above environment provides the infrastructure to which the application is deployed. To prevent human error, both for security reasons and for maximum availability, it’s recommended to use a CI/CD solution. There are many out there: Travis CI, Drone.io, and AWS’s CodePipeline, to name a few.
  • Monitoring dashboards - These usually depend on the technology used for application deployment. System metrics, however, can be monitored right from the start. These metrics are already collected by CloudWatch and can be shown on a CloudWatch dashboard. It’s recommended to gather a few basic metrics and present them so that the team gains immediate visibility. The system’s metrics can teach us a lot, for example whether a new deployment is creating unusual CPU load, or a peak in memory that can help identify leaks. When application bugs affect performance, how soon they are noticed is crucial. A simple metrics dashboard can save hours, days and even months of circling back to find issues.
  • Logs collection - After the application deploys, logs start flowing. If they are not collected, they are lost when the instance is rotated. The template does not support log collection out of the box. The nature of the application and its deployment should be determined first; then a matching log collection system can be chosen. AWS provides solutions for native services like ECS, Lambda and the like: an out-of-the-box integration with CloudWatch Logs, where logs can be read and filtered.
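For the monitoring dashboards item above, a basic system-metrics dashboard can be described as a JSON document. The sketch below builds one with two widgets for an auto-scaling group. The group name and region are hypothetical, and only metrics that EC2 reports without extra setup are used (memory, for instance, requires the CloudWatch agent).

```python
import json


def metric_widget(title, metric_name, asg_name, region, x, y):
    """One graph widget showing an EC2 metric averaged over an auto-scaling group."""
    return {
        "type": "metric",
        "x": x, "y": y, "width": 12, "height": 6,
        "properties": {
            "title": title,
            "region": region,
            "stat": "Average",
            "period": 300,  # seconds per data point
            "metrics": [["AWS/EC2", metric_name, "AutoScalingGroupName", asg_name]],
        },
    }


def dashboard_body(asg_name, region):
    """Return the dashboard definition as a JSON string."""
    widgets = [
        metric_widget("CPU utilization", "CPUUtilization", asg_name, region, 0, 0),
        metric_widget("Network out", "NetworkOut", asg_name, region, 12, 0),
    ]
    return json.dumps({"widgets": widgets})


# Hypothetical group name and region.
BODY = dashboard_body("app-asg", "us-east-1")
print(BODY)
```

The resulting string is what a CloudWatch dashboard expects as its body, whether it is supplied through CloudFormation, the CLI, or an SDK.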


Application deployment considerations

The given template creates a “naive” environment, where a server sits under an auto-scaling group and a load balancer that control scale and routing respectively. This means the application would be deployed manually on the server. The most straightforward automated process, given this implementation, is generating an AMI (machine image) that holds an application version and updating the auto-scaling group to deploy the new one. However, this is as tedious to apply as it is to describe, so a better option is a platform that wraps the application and handles these steps for you.
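To show why the AMI route is tedious, here is a sketch of the steps using the AWS SDK for Python. Several assumptions apply: the function receives already-created boto3 `ec2` and `autoscaling` clients (boto3 itself is not imported here), the auto-scaling group uses a launch template pinned to `$Latest` (the template may use a different mechanism), and every identifier passed in is yours to supply.

```python
def roll_out_new_image(ec2, autoscaling, instance_id, launch_template_id,
                       asg_name, version_label):
    """Bake an AMI from a prepared instance and roll it through the group."""
    # 1. Snapshot the prepared instance into a new AMI.
    #    Note: by default this reboots the source instance.
    image_id = ec2.create_image(
        InstanceId=instance_id, Name=f"app-{version_label}"
    )["ImageId"]

    # 2. The AMI is not usable right away; block until it is available.
    ec2.get_waiter("image_available").wait(ImageIds=[image_id])

    # 3. Add a launch template version that differs only in its image.
    template_version = ec2.create_launch_template_version(
        LaunchTemplateId=launch_template_id,
        SourceVersion="$Latest",
        LaunchTemplateData={"ImageId": image_id},
    )["LaunchTemplateVersion"]["VersionNumber"]

    # 4. Replace running instances gradually, keeping most of them healthy.
    refresh_id = autoscaling.start_instance_refresh(
        AutoScalingGroupName=asg_name,
        Preferences={"MinHealthyPercentage": 90},
    )["InstanceRefreshId"]

    return {
        "image_id": image_id,
        "template_version": template_version,
        "refresh_id": refresh_id,
    }
```

Four API calls, a wait, and no built-in rollback: this is the work a deployment platform takes off your hands.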

Know your options:

  • A single application process - When an application is not containerized, a great option for application versioning and rolling deployments is Elastic Beanstalk. Another option is Docker Compose, which can serve as a small orchestrated environment with limited deployment features such as rolling updates, logging and more. Though it is a great tool for local work, Docker Compose is not advisable for long-term production workloads.
  • A single container - If the application is already running in a container (if it isn’t, containerizing it is highly recommended for various reasons), the options within AWS grow significantly: Elastic Beanstalk, Docker Compose on EC2, and ECS Fargate are all reasonable solutions. My personal preference is ECS with Fargate, a service that removes the need to manage servers and provides a “serverless” platform for containers. As such, all that’s required is an ECS cluster and a service defined as Fargate. With ECS wrapping the application, deployments become as easy as a single command, and scale is virtually infinite because the management of infrastructure is in the hands of AWS. It is worth reading up on further, but do note: regardless of the choice, it’s highly recommended to integrate the solution into the CloudFormation templates for future disaster recovery and environment duplication.
  • A serverless function - Probably the future of software deployment in the cloud; serverless can be a fantastic option, as it removes even the need for a container. With AWS’s serverless platform, Lambda, functions can be deployed as straightforward plain code in different languages. Important note: while serverless has its benefits, when the product grows the application grows with it, and the management of different functions and services can become quite cumbersome. One should consider the use of a framework such as serverless.com for monitoring, security, and management.
  • Container cluster - If you already have different containerized services, consider a container orchestrator such as the increasingly popular K8s, or AWS’s solution, ECS. Fargate deployments can aid both of these options, transferring the burden of infrastructure management from you to AWS. I have noticed from experience that while K8s is great for certain products, ECS simplifies the majority of the configuration and attention-requiring components of K8s. After all, ECS already handles some of the largest workloads on the internet, including most of Netflix.
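For the ECS with Fargate option mentioned above, the core artifact is a task definition. The sketch below assembles a minimal one as a Python dictionary, with the `awslogs` driver wired in so that container logs land in CloudWatch Logs. The account ID, image URI, role ARN, and names are all hypothetical, and the smallest CPU and memory sizes are used only to keep the example short.

```python
import json


def fargate_task_definition(family, image, execution_role_arn, log_group,
                            region, port=8080):
    """Build a minimal ECS task definition that can run on Fargate."""
    return {
        "family": family,
        "networkMode": "awsvpc",  # the only network mode Fargate supports
        "requiresCompatibilities": ["FARGATE"],
        "cpu": "256",
        "memory": "512",
        # Role ECS uses to pull the image and write the logs.
        "executionRoleArn": execution_role_arn,
        "containerDefinitions": [
            {
                "name": family,
                "image": image,
                "essential": True,
                "portMappings": [{"containerPort": port, "protocol": "tcp"}],
                "logConfiguration": {
                    "logDriver": "awslogs",
                    "options": {
                        "awslogs-group": log_group,
                        "awslogs-region": region,
                        "awslogs-stream-prefix": family,
                    },
                },
            }
        ],
    }


# Hypothetical identifiers for illustration.
TASK = fargate_task_definition(
    family="patient-api",
    image="123456789012.dkr.ecr.us-east-1.amazonaws.com/patient-api:1.0.0",
    execution_role_arn="arn:aws:iam::123456789012:role/ecsTaskExecutionRole",
    log_group="/ecs/patient-api",
    region="us-east-1",
)
print(json.dumps(TASK, indent=2))
```

The same structure maps onto a CloudFormation task definition resource, which keeps the deployment inside the templates as recommended earlier.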

Thanks for reading. HIPAA is a major concern for a large portion of our clients at ProdOps. Our clients put their trust in us because of our extensive expertise in the field. Hopefully, the information in this series of posts will help you in the process.

Links

Architecting for HIPAA Security and Compliance on Amazon Web Services

http://www.hhs.gov/ocr/privacy/

Omer Hamerman
Senior Software Operations Architect
Omer is an experienced software operations engineer and open source contributor. He is always willing to go the extra mile to help our clients improve their software delivery. Omer gets the job done quickly, and is clear-cut and sharp throughout, delivering almost any job on the spot. When he’s not helping our clients achieve scalable and resilient infrastructure, you’ll find him rock climbing and bouldering. He is passionate about beautiful code, cybersecurity and doing things right the first time. He is a keen writer of blog posts and a speaker at meetups.