In the first part of this series I talked about the selection of tools for my PKI project. If you haven't read that one yet, I suggest that you do it now and come back to this post afterwards, just so you're familiar with the tools I'll be talking about.
In this post, I'm going to discuss the general design considerations that we have to account for and the architecture choices I've made for my PKI. While these may not suit your requirements/needs/secret desires, they are a good place to start.
General Design Considerations
When you want to create a robust PKI for an enterprise, what are the design considerations you need to account for? When you think about it, they are no different from any other mission-critical app: Confidentiality; Availability; Integrity. Oh look. It's the CIA triad... funny how that works.
Designing for the CIA … triad
What is the single most important thing that needs to stay confidential in an enterprise (or any) PKI? CA private keys, especially the Root. So when creating our overall design, we want to use an offline Root CA that is stored securely. We will use OpenSSL to encrypt the CA private keys with long, randomly generated passphrases that are split between two distinct groups of authorized individuals. We'll also keep a record of all the certificates issued by the CAs in a Certificate Database, in the event we need to revoke a certificate due to its private key being compromised. We also want those records available even if the primary database server goes down, so we'll make the Cert-DB sit on an HA database cluster using streaming replication. The Issuing CAs themselves should be redundant so that we can still issue certificates if the primary CA is unavailable for some reason. We create this redundancy across multiple data centers if we can.
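To make the passphrase handling a little more concrete, here is a minimal sketch of generating a long random passphrase and splitting it into two parts, one for each group of custodians. The length, the alphabet, and the "split down the middle" approach are my assumptions for illustration; all the design calls for is that the passphrase is long, random, and split between two groups.

```python
import secrets
import string

# Assumed parameters: the design only calls for "long" and "randomly generated".
ALPHABET = string.ascii_letters + string.digits
PASSPHRASE_LENGTH = 64


def generate_passphrase(length: int = PASSPHRASE_LENGTH) -> str:
    """Return a random passphrase drawn from the OS CSPRNG."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def split_passphrase(passphrase: str) -> tuple[str, str]:
    """Split the passphrase into two halves, one per custodian group."""
    middle = len(passphrase) // 2
    return passphrase[:middle], passphrase[middle:]


if __name__ == "__main__":
    passphrase = generate_passphrase()
    half_a, half_b = split_passphrase(passphrase)
    # Only report lengths here; the halves themselves should never hit a log.
    print(f"Group A holds {len(half_a)} characters")
    print(f"Group B holds {len(half_b)} characters")
```

With a simple split like this, each group already knows half of the passphrase, so make the whole thing long enough that the half a group doesn't hold is still strong on its own. With the values assumed above, each half is 32 characters from a 62-character alphabet, or roughly 190 bits.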
Splitting the workload
Another thing we want to consider when outlining our PKI design is how we are going to segment or group our certificate requests across Issuing CAs. This is going to depend a lot on how your organization is structured, and what your goals are for your PKI, but for a smaller organization a good general starting point that works well with CFSSL is to have three major groups:
- Server and server-like infrastructure
- Client infrastructure
- Network infrastructure
Each of these groups is described a little more below, but the thought behind grouping like this is that we can separate both by general certificate type and by the potential workload for the relevant Issuing CAs. Network devices might need web server type certificates, but as a category they are generally static, unlike the server category, which in a busy environment can have lots of requests going on all the time. Two more potential environments you might consider adding are Staging and Development, depending on the requirements and practices in your IT shop.
In the server and server-like category, I include any servers, applications, or services that are in use in the enterprise. This includes the hosts (virtual or physical) like Windows or Linux, applications like Nginx or PostgreSQL, or services like Kafka using a Java keystore.
In the client category, we are talking about user machines/workstations, user accounts, mobile devices, etc. It will include certificates for email protection, IPSec user authentication, and similar functions.
Finally, the network infrastructure category includes certificates for network devices such as routers, switches, firewalls, and load-balancers (and their web apps if applicable), IPSec tunnels, IPSec endpoints, etc.
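One way to picture the grouping is as a simple lookup from "what kind of thing is asking for a certificate" to the group of Issuing CAs that should handle it. The sketch below is only an illustration: the subject type names are hypothetical labels I made up from the examples above, and this is not something CFSSL provides out of the box.

```python
from enum import Enum


class CAGroup(Enum):
    SERVER = "server"
    CLIENT = "client"
    NETWORK = "network"


# Hypothetical subject types, loosely based on the examples in each category.
SUBJECT_TYPE_TO_GROUP = {
    "linux-host": CAGroup.SERVER,
    "windows-host": CAGroup.SERVER,
    "nginx": CAGroup.SERVER,
    "postgresql": CAGroup.SERVER,
    "workstation": CAGroup.CLIENT,
    "user-account": CAGroup.CLIENT,
    "mobile-device": CAGroup.CLIENT,
    "router": CAGroup.NETWORK,
    "switch": CAGroup.NETWORK,
    "firewall": CAGroup.NETWORK,
    "load-balancer": CAGroup.NETWORK,
}


def group_for(subject_type: str) -> CAGroup:
    """Return the CA group responsible for a given subject type."""
    try:
        return SUBJECT_TYPE_TO_GROUP[subject_type]
    except KeyError:
        # Refuse to guess: an unknown type should be classified by a human.
        raise ValueError(f"no CA group defined for {subject_type!r}") from None


if __name__ == "__main__":
    for subject in ("nginx", "mobile-device", "firewall"):
        print(f"{subject} -> {group_for(subject).value}")
```

Failing loudly on an unknown type is a deliberate choice in this sketch: a request that doesn't fit any group is a sign the grouping needs another look, not something to route by default.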
Now that we have our general design considerations in mind, let's talk about the actual architecture.
This first figure shows how we are laying out the general PKI architecture. We're using an offline, physical Root CA (grey) and four Intermediate CAs (blue). These Intermediate CAs are virtual servers that remain powered off when not in use. The four Intermediates include the three categories I discussed above, and one for the Stage environment. Each of the Intermediates has two Issuing CAs (green). Each of those Issuing CAs has a Master (yellow) and a Slave (light grey) Certificate Database. Finally, there is an Authority Information Access (AIA) server which is used to serve up certificates and CRLs for all of the CAs.
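If you'd rather see that hierarchy as data than as a picture, here is a rough sketch that builds the same tree: one Root, four Intermediates, two Issuing CAs per Intermediate, and a Master/Slave database pair per Issuing CA. All of the names are placeholders I invented for the example, and the AIA server is left out to keep it short.

```python
from dataclasses import dataclass, field


@dataclass
class CertDB:
    role: str  # "master" or "slave"


@dataclass
class IssuingCA:
    name: str
    databases: list[CertDB] = field(default_factory=list)


@dataclass
class IntermediateCA:
    name: str
    issuing_cas: list[IssuingCA] = field(default_factory=list)


@dataclass
class RootCA:
    name: str
    intermediates: list[IntermediateCA] = field(default_factory=list)


def build_hierarchy() -> RootCA:
    """Build the CA tree described above, using placeholder names."""
    root = RootCA("root-ca")
    for category in ("server", "client", "network", "stage"):
        intermediate = IntermediateCA(f"{category}-intermediate-ca")
        for index in (1, 2):
            issuing = IssuingCA(
                f"{category}-issuing-ca-{index}",
                [CertDB("master"), CertDB("slave")],
            )
            intermediate.issuing_cas.append(issuing)
        root.intermediates.append(intermediate)
    return root


def summarize(root: RootCA) -> tuple[int, int, int]:
    """Return counts of (intermediates, issuing CAs, database nodes)."""
    issuing = [ca for i in root.intermediates for ca in i.issuing_cas]
    databases = sum(len(ca.databases) for ca in issuing)
    return len(root.intermediates), len(issuing), databases


if __name__ == "__main__":
    intermediates, issuing, databases = summarize(build_hierarchy())
    print(f"{intermediates} intermediates, {issuing} issuing CAs, "
          f"{databases} database nodes")
    # prints "4 intermediates, 8 issuing CAs, 16 database nodes"
```

Counting it out like this is a handy sanity check on how many hosts you are actually signing up to build and maintain.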
The second figure here shows the distribution of the Issuing CAs and their certificate databases across conceptual data centers. In my plan here I am accounting for a Primary and Secondary services-type data center. The main goal in the design presented here is to provide for redundancy and resiliency when a data center is disconnected from the main enterprise network (due to meteor strike, raptor infestation, idiot service providers, etc.). We want to be able to keep signing certificates and serving CRLs and OCSP responses at all times. Your availability needs might be lower or higher than what I've described here, so adjust accordingly.
Of course, when you are writing your detailed architectural plans, you'll also need to include the appropriate information for each of the Certificate Authorities, including:
- Common Name
- DNS name (FQDN)
- AIA endpoint URL
- CRL endpoint URL
- OCSP endpoint URL
- Certificate Database information
  - Master and Slave hostnames and IPs
  - Database information
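The checklist above maps nicely onto a small record per CA. The following is a sketch of what that might look like, with a basic sanity check on the URLs and IPs. The field names, the `example.com` hostnames, and the `192.0.2.x` documentation addresses are all placeholders, and I'm assuming "Database information" boils down to a database name here; yours may need more.

```python
from dataclasses import dataclass
from ipaddress import ip_address
from typing import Optional
from urllib.parse import urlparse


@dataclass(frozen=True)
class CertDatabaseInfo:
    master_host: str
    master_ip: str
    slave_host: str
    slave_ip: str
    database_name: str  # assumption: stands in for "Database information"


@dataclass(frozen=True)
class CAProfile:
    common_name: str
    fqdn: str
    aia_url: str
    crl_url: str
    ocsp_url: str
    cert_db: Optional[CertDatabaseInfo] = None  # only Issuing CAs have one


def validate(profile: CAProfile) -> list[str]:
    """Return a list of problems found in the profile (empty if none)."""
    problems = []
    urls = (("AIA", profile.aia_url), ("CRL", profile.crl_url),
            ("OCSP", profile.ocsp_url))
    for label, url in urls:
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append(f"{label} URL is not an HTTP(S) URL: {url!r}")
    if profile.cert_db is not None:
        for ip in (profile.cert_db.master_ip, profile.cert_db.slave_ip):
            try:
                ip_address(ip)
            except ValueError:
                problems.append(f"invalid database IP address: {ip!r}")
    return problems


# Placeholder values only; none of these hosts are real.
EXAMPLE = CAProfile(
    common_name="Example Server Issuing CA 1",
    fqdn="server-ca1.example.com",
    aia_url="http://aia.example.com/server-ca1.crt",
    crl_url="http://aia.example.com/server-ca1.crl",
    ocsp_url="http://ocsp.example.com/server-ca1",
    cert_db=CertDatabaseInfo(
        master_host="certdb1a.example.com", master_ip="192.0.2.10",
        slave_host="certdb1b.example.com", slave_ip="192.0.2.11",
        database_name="server_ca1",
    ),
)

if __name__ == "__main__":
    print(validate(EXAMPLE) or "profile looks complete")
```

Writing these records down before building anything means the endpoint URLs are settled before they get baked into issued certificates, where they are painful to change.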
With all of that decided, we'll be able to start getting into the nitty gritty of building out the infrastructure. My next post will cover setting up tooling, building CFSSL, and getting the PostgreSQL HA clusters set up. Stay tuned!