Cloud Native Computing Foundation: The Architecture of Harbor Explained
Harbor is an open-source container registry that enables secure and efficient management of container images and Helm charts. Designed for cloud-native applications, it is an essential component of the Cloud Native Computing Foundation (CNCF) ecosystem. As a CNCF project, Harbor follows cloud-native best practices, providing security, scalability, and smooth integration with Kubernetes and other CNCF projects.
In this blog post, we will explore the architecture of Harbor, specifically highlighting its key components, functionalities, and how they interact in a cloud-native environment.
Overview of Harbor in the Cloud Native Computing Foundation Ecosystem
Harbor’s architecture can be broken down into several key components, each serving a distinct purpose. These components are categorized as follows:
- Consumers: This includes all clients and client interfaces that interact with Harbor.
- Fundamental Services: These are the core functionalities that are part of the Harbor project, along with essential third-party projects.
- Data Access Layer: This layer consists of various data stores that Harbor utilizes.
- Identity Providers: These are external authentication provider extensions that Harbor can integrate with.
- Scan Providers: This includes external image CVE scanner extensions.
- Replicated Registry Providers: These are external image replication extensions.
Key Components of Harbor
1. Harbor Core
Harbor Core hosts several modules that provide the registry's essential functionality:
- API Management: Facilitates interaction with Harbor through RESTful APIs.
- Authentication and Authorization: Manages user access and permissions.
- Interfacing Glues: Includes pluggable image replication providers, image scanners, and image signature providers.
- Multitenancy Capabilities: Supports multiple teams using the same Harbor instance with isolated configurations.
- Configuration Management: Manages settings and configurations for Harbor.
- Artifact Management: Handles the storage and retrieval of container images and Helm charts.
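These core modules deploy as a Kubernetes deployment resource named my-harbor-core, which is exposed as a Kubernetes service resource of the same name. The my-harbor prefix in this and the following resource names comes from the Helm release name, so it will differ if you install Harbor under another release name.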
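To make the API Management module more concrete, here is a minimal sketch that builds (but does not send) a request to list projects through Harbor's v2.0 REST API. The host harbor.example.com, the credentials, and the helper name build_list_projects_request are placeholders chosen for illustration, not values from a real deployment.

```python
import base64
import urllib.parse
import urllib.request


def build_list_projects_request(base_url, username, password, page_size=10):
    """Build a GET request for Harbor's /api/v2.0/projects endpoint.

    The request is only constructed here; pass it to
    urllib.request.urlopen to send it to a running Harbor instance.
    """
    query = urllib.parse.urlencode({"page": 1, "page_size": page_size})
    url = f"{base_url.rstrip('/')}/api/v2.0/projects?{query}"
    # HTTP basic authentication, encoded as "user:password" in base64.
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Basic {token}", "Accept": "application/json"},
        method="GET",
    )


# Placeholder host and credentials, for illustration only.
request = build_list_projects_request("https://harbor.example.com", "admin", "change-me")
print(request.get_method(), request.full_url)
```

Because the function only returns a Request object, you can inspect the URL and headers before deciding to send anything to a live registry.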
2. Harbor Job Service
The Job Service is Harbor’s asynchronous task execution engine. It exposes REST APIs for other components to submit job requests, such as image scanning. This service is deployed as a Kubernetes deployment resource named my-harbor-jobservice.
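The idea of submitting work and letting background workers complete it can be sketched in a few lines. This is a generic in-process illustration of asynchronous job execution, not Harbor's actual implementation; run_jobs and fake_scan are hypothetical names, and the "scan" is just a stand-in that returns a string.

```python
import queue
import threading


def run_jobs(jobs, worker_count=2):
    """Run submitted jobs on background workers and collect their results."""
    pending = queue.Queue()
    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            item = pending.get()
            if item is None:  # sentinel: no more work for this worker
                pending.task_done()
                return
            job_id, func, args = item
            outcome = func(*args)
            with lock:
                results[job_id] = outcome
            pending.task_done()

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for t in threads:
        t.start()
    for job in jobs:
        pending.put(job)
    for _ in threads:
        pending.put(None)  # one sentinel per worker
    pending.join()
    for t in threads:
        t.join()
    return results


def fake_scan(image):
    # Stand-in for a real scan job; it only returns a status string.
    return f"scanned {image}"


results = run_jobs([
    ("job-1", fake_scan, ("library/nginx:1.25",)),
    ("job-2", fake_scan, ("library/redis:7",)),
])
print(results["job-1"])
```

The caller hands over the jobs and gets results back once the workers finish, which mirrors how other Harbor components delegate long-running tasks instead of executing them inline.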
3. Harbor Portal
The Harbor Portal is the graphical user interface (GUI) for Harbor, which enables users to perform registry and administrative tasks. The portal interacts with the other Harbor components via REST APIs, so anything done in the GUI can also be automated through those APIs. The portal deploys as a Kubernetes deployment resource named my-harbor-portal.
4. Harbor Registry
Based on the open-source project Distribution, the Harbor Registry wraps functionalities to pack, ship, store, and deliver content. It implements standards defined by the OCI Distribution Specification, serving as the core library for image registry operations. This component deploys as a Kubernetes deployment resource named my-harbor-registry.
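The OCI Distribution Specification addresses manifests and blobs under a /v2/ URL prefix and identifies blobs by content digest. The sketch below shows that addressing scheme; the registry host and repository name are placeholders, and the helper function names are hypothetical.

```python
import hashlib


def blob_digest(content: bytes) -> str:
    """Return the sha256 content digest used to identify a blob."""
    return "sha256:" + hashlib.sha256(content).hexdigest()


def manifest_url(registry: str, repository: str, reference: str) -> str:
    """URL of a manifest, where reference is a tag or a digest."""
    return f"{registry.rstrip('/')}/v2/{repository}/manifests/{reference}"


def blob_url(registry: str, repository: str, digest: str) -> str:
    """URL of a blob (for example a layer or image config) by digest."""
    return f"{registry.rstrip('/')}/v2/{repository}/blobs/{digest}"


# Placeholder registry and repository, for illustration only.
registry = "https://harbor.example.com"
layer = b"example layer bytes"
print(manifest_url(registry, "library/nginx", "1.25"))
print(blob_url(registry, "library/nginx", blob_digest(layer)))
```

Because a blob's address is derived from its content, two images that share a layer can share the stored blob as well.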
5. PostgreSQL Database
The PostgreSQL database serves as the main database for Harbor, storing all required configurations and metadata. It includes data related to projects, users, policies, scanners, charts, and images. This database is deployed as a stateful set on Kubernetes, named my-harbor-postgresql.
6. Redis Cache
The Redis cache is used as a key-value store to cache metadata required by the Job Service. It is also deployed as a stateful set on Kubernetes, named my-harbor-redis-master.
7. Trivy Scanner
Trivy is the default image CVE scanner integrated with Harbor. It scans operating system layers and language-specific packages in images for known vulnerabilities. The scanning results include detailed reports with CVE metadata, severity levels, and fixed versions if available. This component is deployed as my-harbor-trivy.
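To show how such a report might be consumed, here is a small sketch that counts findings by severity and lists the ones that have a fix available. The findings list uses a simplified, hypothetical structure with placeholder CVE identifiers; it is not the exact report format produced by Trivy or Harbor.

```python
from collections import Counter

# Hypothetical, simplified findings with placeholder identifiers.
findings = [
    {"id": "CVE-0000-0001", "package": "openssl", "severity": "High", "fixed_version": "3.0.13"},
    {"id": "CVE-0000-0002", "package": "zlib", "severity": "Medium", "fixed_version": ""},
    {"id": "CVE-0000-0003", "package": "libxml2", "severity": "High", "fixed_version": "2.12.5"},
]


def summarize(findings):
    """Return (counts per severity, ids of findings that have a fixed version)."""
    counts = Counter(f["severity"] for f in findings)
    fixable = [f["id"] for f in findings if f.get("fixed_version")]
    return dict(counts), fixable


counts, fixable = summarize(findings)
print(counts)
print(fixable)
```

A summary like this is usually the first thing a team looks at: how many findings per severity, and which of them can be resolved by upgrading a package.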
8. Harbor Chart Museum
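Harbor stores Helm charts through the Chart Museum, an open-source project for Helm chart repositories. This component deploys as a Kubernetes deployment resource named my-harbor-chartmuseum. Note that newer Harbor releases have removed Chart Museum and store Helm charts as OCI artifacts in the registry itself, so this component may be absent from your deployment.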
Data Flow in Harbor
The components described above cooperate in the following sequence when users work with images:
1. User Authentication: Users authenticate through the Harbor Portal, which communicates with Identity Providers for access control.
2. Image Upload/Download: Users can upload or download images through the API or GUI, which interacts with the Harbor Registry.
3. Image Scanning: Upon image upload, the Job Service triggers the Trivy Scanner to scan the image for vulnerabilities.
4. Metadata Storage: Harbor stores all relevant metadata, including user actions and image details, in the PostgreSQL database.
5. Caching: Harbor caches frequently accessed data in Redis to boost performance.
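The caching step follows a familiar pattern: look in the cache first and fall back to the database on a miss. The sketch below illustrates that cache-aside pattern with in-memory stand-ins for Redis and PostgreSQL; the class names are hypothetical, and this is not a description of Harbor's internal code.

```python
class MetadataStore:
    """In-memory stand-in for the PostgreSQL metadata database."""

    def __init__(self, rows):
        self.rows = dict(rows)
        self.reads = 0  # counts how often the "database" is queried

    def get(self, key):
        self.reads += 1
        return self.rows.get(key)


class CachedLookup:
    """Cache-aside lookup: check the cache first, then the database."""

    def __init__(self, store):
        self.store = store
        self.cache = {}  # stand-in for Redis

    def get(self, key):
        if key in self.cache:
            return self.cache[key]
        value = self.store.get(key)
        if value is not None:
            self.cache[key] = value  # populate the cache for next time
        return value


store = MetadataStore({"library/nginx:1.25": {"scan_status": "Success"}})
lookup = CachedLookup(store)
lookup.get("library/nginx:1.25")  # miss: reads the database
lookup.get("library/nginx:1.25")  # hit: served from the cache
print(store.reads)
```

The second lookup never reaches the database, which is where the performance gain for frequently accessed data comes from.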
Security in Cloud Native Computing Foundation’s Harbor
Harbor incorporates several security features to ensure the integrity and safety of container images:
1. Image Scanning: Each image is scanned for vulnerabilities, and detailed reports are generated. Users can configure policies to restrict the use of images with high-severity vulnerabilities.
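2. Image Signing: Harbor supports integration with Notary, allowing images to be digitally signed for authenticity. Combined with a policy that blocks unsigned images, this helps ensure that only verified images are deployed in production environments. Note that newer Harbor releases have dropped Notary in favor of signing tools such as Cosign.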
3. Role-Based Access Control (RBAC): Harbor provides robust RBAC capabilities, allowing administrators to define user roles and permissions for different projects and resources.
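Two of these features boil down to simple checks: a severity threshold that blocks vulnerable images, and a role-to-permission lookup. The sketch below models both. The severity ordering, role names, and permission sets are simplified assumptions made for illustration and do not reproduce Harbor's exact permission matrix.

```python
# Assumed ordering from least to most severe.
SEVERITY_ORDER = ["None", "Low", "Medium", "High", "Critical"]

# Hypothetical, simplified role-to-permission mapping.
ROLE_PERMISSIONS = {
    "guest": {"pull"},
    "developer": {"pull", "push"},
    "maintainer": {"pull", "push", "scan", "delete"},
    "project_admin": {"pull", "push", "scan", "delete", "manage_members"},
}


def is_pull_allowed(image_severities, threshold="High"):
    """Return False if any finding is at or above the blocking threshold."""
    limit = SEVERITY_ORDER.index(threshold)
    return all(SEVERITY_ORDER.index(s) < limit for s in image_severities)


def can(role, action):
    """Return True if the role includes the requested action."""
    return action in ROLE_PERMISSIONS.get(role, set())


print(is_pull_allowed(["Low", "Medium"]))
print(is_pull_allowed(["Low", "Critical"]))
print(can("developer", "push"), can("guest", "push"))
```

In practice both checks are configured per project, so one team can enforce a strict threshold while another stays more permissive on the same Harbor instance.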
Extensibility
Harbor fits well into the Cloud Native Computing Foundation ecosystem in part because its functionality can be extended through external tools:
- External Scanners: Users can replace the default Trivy scanner with other supported scanners like Clair, Anchore, or Aqua.
- Authentication Providers: Harbor supports integration with external authentication systems such as LDAP and OIDC, allowing for centralized user management.
- Replication and Caching: Harbor can replicate images to and from other registries and act as a proxy cache, so users can pull images that originate from external sources without accessing those sources directly.
Deployment and Configuration
Deploying Harbor typically involves using Helm charts to simplify the installation process on Kubernetes. The following steps outline the general process:
1. Prerequisites: Ensure that a Kubernetes cluster is set up with the necessary tools installed, including Docker, Helm, and kubectl.
2. Helm Chart Installation: Add the Bitnami Helm repository and install Harbor using Helm commands. This will deploy all necessary components as Kubernetes resources.
3. Configuration: After installation, configure Harbor through the GUI or REST APIs, setting up projects, user roles, and image scanning policies.
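The installation step can be scripted. The sketch below assembles the Helm commands as argument lists and prints them; it does not execute anything. The release name my-harbor matches the resource names used in this post, while the harbor namespace and the Bitnami repository URL are assumptions you should verify against the chart's current documentation.

```python
import shlex


def harbor_install_commands(release="my-harbor", namespace="harbor"):
    """Return the Helm commands for installing the Bitnami Harbor chart."""
    return [
        # Assumed repository URL; check the chart documentation before use.
        ["helm", "repo", "add", "bitnami", "https://charts.bitnami.com/bitnami"],
        ["helm", "repo", "update"],
        [
            "helm", "install", release, "bitnami/harbor",
            "--namespace", namespace, "--create-namespace",
        ],
    ]


# Print the commands; run each with subprocess.run(cmd, check=True) if desired.
for cmd in harbor_install_commands():
    print(shlex.join(cmd))
```

Keeping the commands as argument lists avoids shell quoting problems and makes it easy to change the release name, which in turn changes the my-harbor prefix of the deployed resources.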
Conclusion
Harbor is a cloud-native container registry that fits naturally into the CNCF ecosystem, offering security, scalability, and extensibility. This post walked through Harbor's architecture and components, highlighting its modular design, security features, and extension points for managing container images and Helm charts.
With image scanning, signing, role-based access control (RBAC), and replication across registries, Harbor simplifies container image management while supporting security and compliance requirements. As a CNCF project, it works well alongside Kubernetes, Helm, and other cloud-native tools, making it a strong choice for modern DevOps workflows.