Introduction
The Enhanced Factor Stack (EFK) is a software-defined data center (SDDC) architecture that combines the simplicity of a software-only approach with the performance and scalability of a hardware-based system. The EFK design supports next-generation applications, especially in the financial services, insurance, and healthcare industries, where performance and reliability are of the utmost importance.
The EFK architecture includes three key components: 1) a scalable and flexible data center platform; 2) a file-based storage management (FSSM) system; and 3) a set of virtual machines (VMs) that serve as containers for workloads.
This article will introduce you to key concepts and terminology related to the EFK stack, as well as provide you with some practical examples of how this architecture can be implemented. We will begin by defining the key terms related to data centers and SDDCs, and provide you with an overview of the EFK stack design.
Terminology Relating to Data Centers and SDDCs
Data Center
A data center (DC) is a facility that houses computer servers and associated networking equipment. These devices are networked together to form the data center network (DCN).
The DCN allows data to be exchanged between the individual devices in the center. This data can be structured, such as database tables or spreadsheets, or unstructured, such as PDF documents and MP3 files. For organizations that process large volumes of data (e.g., big data workloads), a DC is a cost-effective solution that provides high availability and performance.
Software Defined Data Center (SDDC)
To keep pace with these demands, organizations are turning to software-defined data center (SDDC) designs that leverage virtualization to provide a flexible and scalable infrastructure for their computing needs. With SDDC designs, the responsibility for provisioning, monitoring, and maintaining data center resources falls to software applications rather than people.
As in a traditional data center, the network and security infrastructure in an SDDC is still provided by the organization and managed by its administrators, but the devices within the data center are defined by software running on top of a hypervisor (also known as a virtual machine monitor).
Scalable
A scalable system is one that can be expanded, upgraded, or modified to fit additional needs or demands. In the world of computing, scalability often means the ability to increase the number of storage devices, servers, or network interfaces, any of which can increase capacity or throughput.
When it comes to the EFK stack, the system can be expanded to include additional nodes (servers) and/or storage devices, as well as the network interfaces that connect all of these components.
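As a rough illustration of this kind of scale-out, the sketch below models adding nodes to a pool. Note that `NodePool` and its methods are hypothetical names invented for illustration, not part of any published EFK API:

```python
# Illustrative sketch only: 'NodePool' and its methods are hypothetical
# names, not part of any published EFK API.
class NodePool:
    """Models horizontal scale-out of an EFK-style services layer."""

    def __init__(self, nodes=3, storage_per_node_tb=10):
        self.nodes = nodes
        self.storage_per_node_tb = storage_per_node_tb

    def add_nodes(self, count):
        # Scaling out: total capacity grows with each node added.
        self.nodes += count

    @property
    def total_storage_tb(self):
        return self.nodes * self.storage_per_node_tb

pool = NodePool(nodes=3)
pool.add_nodes(2)   # expand the system with two additional nodes
```

The point of the sketch is simply that capacity scales linearly with node count; a real system would also rebalance workloads across the new nodes.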
Flexible
Flexibility is the ability to configure systems in a way that enables them to perform a variety of tasks, meet a variety of needs, or adapt to a variety of environments. When it comes to the EFK stack, administrators can configure the system to meet the specific needs of the organization by choosing the appropriate hardware and software configurations, as well as the deployment environment (e.g., public or private cloud).
Performance-Based
Organizations that operate in the financial services, insurance, and healthcare industries value performance-based metrics, such as response time, uptime, and reliability, as much as cost-based metrics like price per unit of capacity or cost per gigabyte.
In the world of computing, performance is often described in terms of the time it takes to complete a specific task, resource utilization (e.g., wait times and CPU usage), and output (i.e., the information produced, processed, or stored as a result of performing the task).
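The first of these measures, task completion time, can be made concrete with a small timing helper. The `timed` function below is an illustrative sketch using only the Python standard library:

```python
import time

def timed(task, *args):
    """Run a task and report its completion time, a common performance metric."""
    start = time.perf_counter()
    result = task(*args)          # the task's output is one performance dimension
    elapsed = time.perf_counter() - start  # completion time is another
    return result, elapsed

# Example task: producing output (a sorted list) from input data.
result, elapsed = timed(sorted, [3, 1, 2])
```

`time.perf_counter()` is used rather than `time.time()` because it is a monotonic, high-resolution clock intended for interval measurement.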
When it comes to the EFK stack, users can choose how to measure performance: the system exposes Prometheus gauge metrics that provide detailed performance data, including Latency Exceeded Percentages (LEPs) for all deployed components.
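The article names the LEP metric but does not define it, so the sketch below assumes a plausible definition: the percentage of requests whose latency exceeded a threshold. The function name and formula are illustrative assumptions, and in a real deployment the resulting value could be published through a Prometheus gauge:

```python
def latency_exceeded_pct(latencies_ms, threshold_ms):
    """Share of requests whose latency exceeded the threshold, as a percentage.

    Assumed definition: the article names the LEP metric but does not
    define it, so this formula is only an illustration.
    """
    if not latencies_ms:
        return 0.0
    exceeded = sum(1 for latency in latencies_ms if latency > threshold_ms)
    return 100.0 * exceeded / len(latencies_ms)

# Two of these five sampled latencies exceed the 100 ms threshold.
lep = latency_exceeded_pct([12, 48, 103, 250, 75], threshold_ms=100)
```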
Overview of the EFK Stack Design
The central component of the EFK stack is a scalable and flexible data center platform, which we will refer to as the EFK services layer, or just the services layer for simplicity. The services layer is responsible for providing the virtual machines (VMs) that operate as containers for workloads, as well as other components, such as the file-based storage management (FSSM) system.
The services layer provides a reliable and secure foundation that allows administrators to deploy and manage additional components, such as virtual networks, security groups, and volumes. It is able to do this in a highly-available and automated manner using native Kubernetes clustering technology.
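As a sketch of what such Kubernetes-based, highly-available deployment might look like, the manifest below is a minimal Kubernetes Deployment expressed as a Python dictionary. The `efk-services` name and container image are hypothetical, and Kubernetes accepts JSON as well as YAML for manifests:

```python
import json

# Hypothetical manifest: 'efk-services' and its labels/image are illustrative
# names, not taken from any real EFK distribution.
deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "efk-services"},
    "spec": {
        "replicas": 3,  # multiple replicas give high availability
        "selector": {"matchLabels": {"app": "efk-services"}},
        "template": {
            "metadata": {"labels": {"app": "efk-services"}},
            "spec": {
                "containers": [
                    {"name": "services-layer", "image": "efk/services:stable"}
                ]
            },
        },
    },
}

# Kubernetes accepts JSON manifests as well as YAML.
manifest = json.dumps(deployment, indent=2)
```

With a manifest like this, the cluster's controllers keep three replicas running automatically, restarting or rescheduling them on failure.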
The Virtual Machines (VMs) That Make up the EFK Services Layer
Primary VMs
A primary VM is the core component of the EFK services layer. It provides a standard operating system, such as Linux, and applications, such as a web server or an email server, on top of which other applications and services run. A primary VM can also be configured to supply this operating system and application baseline to other VMs in the system.
In addition to operating systems and applications, a primary VM is responsible for providing security credentials, installing software, and acting as a file repository whose contents can be accessed by other VMs in the system.
Image-Based VMs (IAMVs)
An image-based VM (IAMV) is a VM that is configured from a standard operating system image, or an operating system and application combination, created and stored in a repository such as a centralized storage area network (SAN).
The advantage of storing VM images on a SAN is that all of the VMs deployed from an image are identical, and the file-based storage management (FSSM) system can give administrators a single, familiar interface for creating, browsing, and deploying image-based VMs. Centralizing images on a SAN also reduces configuration errors, since every VM can be set up by a single operator from a single image, lowering the chance of human error.
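The idea that every VM deployed from a stored image starts out identical can be sketched as follows; the catalog structure and function names here are hypothetical, invented for illustration:

```python
import copy

# Illustrative sketch: the catalog structure and names are hypothetical,
# standing in for images stored on a centralized SAN.
IMAGE_CATALOG = {
    "linux-web-v1": {"os": "linux", "apps": ["nginx"], "disk_gb": 40},
}

def deploy_from_image(image_name, vm_name):
    """Clone a stored image so every deployed VM starts out identical."""
    template = IMAGE_CATALOG[image_name]
    vm = copy.deepcopy(template)  # each VM gets its own copy of the template
    vm["name"] = vm_name
    return vm

# Two VMs deployed from the same image share an identical configuration.
vm_a = deploy_from_image("linux-web-v1", "web-01")
vm_b = deploy_from_image("linux-web-v1", "web-02")
```

Because configuration lives in one template rather than being re-entered per VM, there is a single place to review for errors.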
Tertiary VMs
A tertiary VM, also known as a node VM or helper VM, is a VM that operates in conjunction with a primary VM and provides additional functionality, such as: 1) the ability to process workloads more efficiently; 2) the ability to provide a familiar and secure environment for users; and 3) the ability to provide additional security controls.
For example, a tertiary VM could be used to process credit card payments, or it could be set up to perform backups. In these examples, the tertiary VM adds value by increasing efficiency and reliability while also enabling administrators to better secure their data center. The ability to scale out and integrate these helper VMs effectively and safely is a key advantage of the EFK architecture.
When you are first getting started with the EFK stack, it is recommended that you use only the primary VMs; all of the other VMs in the system can be considered helper VMs that exist to assist the primary VMs in delivering their value.
The EFK services layer discussed in the next section provides the ability to deploy additional components, including virtual networks, security groups, and volumes, as well as the standard operating systems and applications that these components require.
Key Components of the EFK Services Layer
The EFK services layer incorporates the following components:
File-Based Storage Management (FSSM)
A file-based storage management (FSSM) system integrates the data center's storage devices, or persistent storage, with a file system, such as network-attached storage (NAS) or direct-attached storage (DAS) connected to a server. This allows large amounts of data to be stored, and lets users create multiple volumes, or partitions, within the storage space and access them through the file system, just like a traditional hard drive.
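A minimal model of carving volumes out of a shared pool might look like the sketch below; `StoragePool` and its methods are a hypothetical illustration, not a real FSSM API:

```python
# Illustrative sketch: 'StoragePool' is a hypothetical model of an FSSM-style
# system, not a real EFK or FSSM API.
class StoragePool:
    def __init__(self, capacity_gb):
        self.capacity_gb = capacity_gb
        self.volumes = {}

    @property
    def free_gb(self):
        return self.capacity_gb - sum(self.volumes.values())

    def create_volume(self, name, size_gb):
        """Carve a named volume (partition) out of the pool's free space."""
        if size_gb > self.free_gb:
            raise ValueError("not enough free space in the pool")
        self.volumes[name] = size_gb
        return name

# Partition a 1 TB pool into two volumes, leaving the rest free.
pool = StoragePool(capacity_gb=1000)
pool.create_volume("home", 200)
pool.create_volume("backups", 300)
```

Each named volume would then be exposed through the file system like an ordinary drive, which is the access model the FSSM description above implies.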