Kubernetes Archives - NGINX

Announcing NGINX Gateway Fabric Release 1.2.0

Akash Ananthanarayanan of F5 — Fri, 29 Mar 2024 14:30:24 +0000

We are thrilled to share the latest news on NGINX Gateway Fabric, which is our conformant implementation of the Kubernetes Gateway API. We recently updated it to version 1.2.0, with several exciting new features and improvements. This release focuses on enhancing the platform’s capabilities and ensuring it meets our users’ demands. We have included F5 NGINX Plus support and expanded our API surface to cover the most demanded use cases. We believe these enhancements will create a better experience for all our users and help them achieve their goals more efficiently.

Figure 1: NGINX Gateway Fabric’s design and architecture overview

NGINX Gateway Fabric 1.2.0 at a glance:

NGINX Plus Support – NGINX Gateway Fabric now supports NGINX Plus for the data plane, which offers additional stability and higher resource utilization, metrics, and observability dashboards.
BackendTLSPolicy – TLS verification allows NGINX Gateway Fabric to confirm the identity of the backend application, protecting against potential hijacking of the connection by malicious applications. Additionally, TLS encrypts traffic within the cluster, ensuring secure communication between the client and the backend application.
URLRewrite – NGINX Gateway Fabric now supports URL rewrites in Route objects. With this feature, you can easily modify the original request URL and redirect it to a more appropriate destination. That way, as your backend applications undergo API changes, you can keep the APIs you expose to your clients consistent.
Product Telemetry – With product telemetry now present in NGINX Gateway Fabric, we can help further improve operational efficiency of your infrastructure by learning about how you use the product in your environment. Also, we are planning to share these insights regularly with the community during our meetings.

We’ll take a deeper look at the new features below.

What’s New in NGINX Gateway Fabric 1.2.0?

NGINX Plus Support

NGINX Gateway Fabric version 1.2.0 has been released with support for NGINX Plus, providing users with many new benefits. With the new upgrade, users can now leverage the advanced features of NGINX Plus in their deployments including additional Prometheus metrics, dynamic upstream reloads, and the NGINX Plus dashboard.

This upgrade also allows you the option to get support directly from NGINX for your environment.

Additional Prometheus Metrics

While using NGINX Plus as your data plane, additional advanced metrics will be exported alongside the metrics you would normally get with NGINX Open Source. Some highlights include metrics around http requests, streams, connections, and many more. For the full list, you can check NGINX’s Prometheus exporter for a convenient list, but note that the exporter is not strictly required for NGINX Gateway Fabric.

With any installation of Prometheus or Prometheus compatible scraper, you can scrape these metrics into your observability stack and build dashboards and alerts using one consistent layer within your architecture. Prometheus metrics are automatically available in the NGINX Gateway Fabric through HTTP Port 9113. You can also change the default port by updating the Pod template.

If you are looking for a simple setup, you can visit our GitHub page for more information on how to deploy and configure Prometheus to start collecting. Alternatively, if you are just looking to view the metrics and skip the setup, you can use the NGINX Plus dashboard, explained in the next section.

After installing Prometheus in your cluster, you can access its dashboard by running port-forwarding in the background.

kubectl -n monitoring port-forward svc/prometheus-server 9090:80

Figure 2: Prometheus Graph showing NGINX Gateway Fabric connections accepted

The above setup will work even if you are using the default NGINX Open Source as your data plane as well! However, you will not see any of the additional metrics that NGINX Plus provides. As the size and scope of your cluster grows, we recommend looking at how NGINX Plus metrics can help quickly resolve your capacity planning issues, incidents, and even backend application faults.

Dynamic Upstream Reloads

Dynamic upstream reloads, enabled by NGINX Gateway Fabric automatically when installed with NGINX Plus, allow NGINX Gateway Fabric to make updates to NGINX configurations without a NGINX reload.

Traditionally, when a NGINX reload occurs, the existing connections are handled by the old worker processes while the newly configured workers handle new ones. When all the old connections are complete, the old workers are stopped, and NGINX continues with only the newly configured workers. In this way, configuration changes are handled gracefully even in NGINX Open Source.

However, when NGINX is under high load, maintaining both old and new workers can create a resource overhead that may cause problems, especially if trying to run NGINX Gateway Fabric as lean as possible. The dynamic upstream reloads featured in NGINX Plus bypass this problem by providing an API endpoint for configuration changes that NGINX Gateway Fabric will use automatically if present, reducing the need for extra resource overhead to handle old and new workers during the reload process.

As you begin to make changes more often to NGINX Gateway Fabric, reloads will occur more frequently. If you are curious how often or when reloads occur in your current installation of NGF, you can look at the Prometheus metric nginx_gateway_fabric_nginx_reloads_total. For a full, deep dive into the problem, check out Nick Shadrin’s article here!

Here’s an example of the metric in an environment with two deployments of NGINX Gateway Fabric in the Prometheus dashboard:

Figure 3: Prometheus graph showing the NGINX Gateway Fabric reloads total

NGINX Plus Dashboard

As previously mentioned, if you are looking for a quick way to view NGINX Plus metrics without a Prometheus installation or observability stack, the NGINX Plus dashboard gives you real-time monitoring of performance metrics you can use to troubleshoot incidents and keep an eye on resource capacity.

The dashboard gives you different views for all metrics NGINX Plus provides right away and is easily accessible on an internal port. If you would like to take a quick look for yourself as to what the dashboard capabilities look like, check out our dashboard demo site at demo.nginx.com.

To access the NGINX Plus dashboard on your NGINX Gateway Fabric installation, you can forward connections to Port 8765 on your local machine via port forwarding:

kubectl port-forward -n nginx-gateway 8765:8765

Next, open your preferred browser and type http://localhost:8765/dashboard.html in the address bar.

Figure 4: NGINX Plus Dashboard overview

BackendTLSPolicy

This release now comes with the much-awaited support for the BackendTLSPolicy. The BackendTLSPolicy introduces encrypted TLS communication between NGINX Gateway Fabric and the application, greatly enhancing the communication channel’s security. Here’s an example that shows how to apply the policy by specifying settings such as TLS ciphers and protocols when validating server certificates against a trusted certificate authority (CA).

The BackendTLSPolicy enables users to secure their traffic between NGF and their backends. You can also set the minimum TLS version and cipher suites. This protects against malicious applications hijacking the connection and encrypts the traffic within the cluster.

To configure backend TLS termination, first create a ConfigMap with the CA certification you want to use. For help with managing internal Kubernetes certificates, check out this guide.


kind: ConfigMap
apiVersion: v1
metadata:
  name: backend-cert
data:
  ca.crt: 
         < -----BEGIN CERTIFICATE-----
	   -----END CERTIFICATE-----
          >

Next, we create the BackendTLSPolicy, which targets our secure-app Service and refers to the ConfigMap created in the previous step:


apiVersion: gateway.networking.k8s.io/v1alpha2
kind: BackendTLSPolicy
metadata:
  name: backend-tls
spec:
  targetRef:
    group: ''
    kind: Service
    name: secure-app
    namespace: default
  tls:
    caCertRefs:
    - name: backend-cert
      group: ''
      kind: ConfigMap
    hostname: secure-app.example.com

URLRewrite

With a URLRewrite filter, you can modify the original URL of an incoming request and redirect it to a different URL with zero performance impact. This is particularly useful when your backend applications change their exposed API, but you want to maintain backwards compatibility for your existing clients. You can also use this feature to expose a consistent API URL to your clients while redirecting the requests to different applications with different API URLs, providing an “experience” API that combines the functionality of several different APIs for your clients’ convenience and performance.

To get started, let’s create a gateway for the NGINX gateway fabric. This will enable us to define HTTP listeners and configure the Port 80 for optimal performance.


apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: cafe
spec:
  gatewayClassName: nginx
  listeners:
  - name: http
    port: 80
    protocol: HTTP

Let’s create an HTTPRoute resource and configure request filters to rewrite any requests for /coffee to /beans. We can also provide a /latte endpoint that is rewritten to include the /latte prefix for the backend to handle (“/latte/126” becomes “/126”).


apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: coffee
spec:
  parentRefs:
  - name: cafe
    sectionName: http
  hostnames:
  - "cafe.example.com"
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /coffee
    filters:
    - type: URLRewrite
      urlRewrite:
        path:
          type: ReplaceFullPath
          replaceFullPath: /beans
    backendRefs:
    - name: coffee
      port: 80
  - matches:
    - path:
        type: PathPrefix
        value: /latte
    filters:
    - type: URLRewrite
      urlRewrite:
        path:
          type: ReplacePrefixMatch
          replacePrefixMatch: /
    backendRefs:
    - name: coffee
      port: 80

The HTTP rewrite feature helps ensure flexibility between the endpoints on the client side and how they are mapped with the backend. It also allows traffic redirection from one URL to another, which is particularly helpful when migrating content to a new website or API traffic.

Although NGINX Gateway Fabric supports path-based rewrites, it currently does not support path-based redirects. Let us know if this is a feature you need for your environment.

Product Telemetry

We have decided to include product telemetry as a mechanism to passively collect feedback as a part of the 1.2 release. This feature will collect a variety of metrics from your environment and send them to our data collection platform every 24 hours. No PII is collected, and you can see the full list of what is collected here.

We are committed to providing complete transparency around our telemetry functionality. While we will document every field we collect, and you can validate what we collect by our code, you always have the option to disable it completely. We are planning to regularly review interesting observations based on the statistics we collect with the community in our community meetings, so make sure to drop by!

Resources

For the complete changelog for NGINX Gateway Fabric 1.2.0, see the Release Notes. To try NGINX Gateway Fabric for Kubernetes with NGINX Plus, start your free 30-day trial today or contact us to discuss your use cases.

If you would like to get involved, see what is coming next, or see the source code for NGINX Gateway Fabric, check out our repository on GitHub!

We have bi-weekly community meetings on Mondays at 9AM Pacific/5PM GMT. The meeting link, updates, agenda, and notes are on the NGINX Gateway Fabric Meeting Calendar. Links are also always available from our GitHub readme.

The post Announcing NGINX Gateway Fabric Release 1.2.0 appeared first on NGINX.

Scale, Secure, and Monitor AI/ML Workloads in Kubernetes with Ingress Controllers

Ilya Krutov of F5 — Thu, 22 Feb 2024 20:09:02 +0000

AI and machine learning (AI/ML) workloads are revolutionizing how businesses operate and innovate. Kubernetes, the de facto standard for container orchestration and management, is the platform of choice for powering scalable large language model (LLM) workloads and inference models across hybrid, multi-cloud environments.

In Kubernetes, Ingress controllers play a vital role in delivering and securing containerized applications. Deployed at the edge of a Kubernetes cluster, they serve as the central point of handling communications between users and applications.

In this blog, we explore how Ingress controllers and F5 NGINX Connectivity Stack for Kubernetes can help simplify and streamline model serving, experimentation, monitoring, and security for AI/ML workloads.

Deploying AI/ML Models in Production at Scale

When deploying AI/ML models at scale, out-of-the-box Kubernetes features and capabilities can help you:

Accelerate and simplify the AI/ML application release life cycle.
Enable AI/ML workload portability across different environments.
Improve compute resource utilization efficiency and economics.
Deliver scalability and achieve production readiness.
Optimize the environment to meet business SLAs.

At the same time, organizations might face challenges with serving, experimenting, monitoring, and securing AI/ML models in production at scale:

Increasing complexity and tool sprawl makes it difficult for organizations to configure, operate, manage, automate, and troubleshoot Kubernetes environments on-premises, in the cloud, and at the edge.
Poor user experiences because of connection timeouts and errors due to dynamic events, such as pod failures and restarts, auto-scaling, and extremely high request rates.
Performance degradation, downtime, and slower and harder troubleshooting in complex Kubernetes environments due to aggregated reporting and lack of granular, real-time, and historical metrics.
Significant risk of exposure to cybersecurity threats in hybrid, multi-cloud Kubernetes environments because traditional security models are not designed to protect loosely coupled distributed applications.

Enterprise-class Ingress controllers like F5 NGINX Ingress Controller can help address these challenges. By leveraging one tool that combines Ingress controller, load balancer, and API gateway capabilities, you can achieve better uptime, protection, and visibility at scale – no matter where you run Kubernetes. In addition, it reduces complexity and operational cost.

NGINX Ingress Controller can also be tightly integrated with an industry-leading Layer 7 app protection technology from F5 that helps mitigate OWASP Top 10 cyberthreats for LLM Applications and defends AI/ML workloads from DoS attacks.

Benefits of Ingress Controllers for AI/ML Workloads

Ingress controllers can simplify and streamline deploying and running AI/ML workloads in production through the following capabilities:

Model serving – Deliver apps non-disruptively with Kubernetes-native load balancing, auto-scaling, rate limiting, and dynamic reconfiguration features.
Model experimentation – Implement blue-green and canary deployments, and A/B testing to roll out new versions and upgrades without downtime.
Model monitoring – Collect, represent, and analyze model metrics to gain better insight into app health and performance.
Model security – Configure user identity, authentication, authorization, role-based access control, and encryption capabilities to protect apps from cybersecurity threats.

NGINX Connectivity Stack for Kubernetes includes NGINX Ingress Controller and F5 NGINX App Protect to provide fast, reliable, and secure communications between Kubernetes clusters running AI/ML applications and their users – on-premises and in the cloud. It helps simplify and streamline model serving, experimentation, monitoring, and security across any Kubernetes environment, enhancing capabilities of cloud provider and pre-packaged Kubernetes offerings with higher degree of protection, availability, and observability at scale.

Get Started with NGINX Connectivity Stack for Kubernetes

NGINX offers a comprehensive set of tools and building blocks to meet your needs and enhance security, scalability, and visibility of your Kubernetes platform.

You can get started today by requesting a free 30-day trial of Connectivity Stack for Kubernetes.

The post Scale, Secure, and Monitor AI/ML Workloads in Kubernetes with Ingress Controllers appeared first on NGINX.

Dynamic A/B Kubernetes Multi-Cluster Load Balancing and Security Controls with NGINX Plus

Chris Akker of F5 — Thu, 15 Feb 2024 16:00:56 +0000

You’re a modern Platform Ops or DevOps engineer. You use a library of open source (and maybe some commercial) tools to test, deploy, and manage new apps and containers for your Dev team. You’ve chosen Kubernetes to run these containers and pods in development, test, staging, and production environments. You’ve bought into the architectures and concepts of microservices and, for the most part, it works pretty well. However, you’ve encountered a few speed bumps along this journey.

For instance, as you build and roll out new clusters, services, and applications, how do you easily integrate or migrate these new resources into production without dropping any traffic? Traditional networking appliances require reloads or reboots when implementing configuration changes to DNS records, load balancers, firewalls, and proxies. These adjustments are not reconfigurable without causing downtime because a “service outage” or “maintenance window” is required to update DNS, load balancer, and firewall rules. More often than not, you have to submit a dreaded service ticket and wait for another team to approve and make the changes.

Maintenance windows can drive your team into a ditch, stall application delivery, and make you declare, “There must be a better way to manage traffic!” So, let’s explore a solution that gets you back in the fast lane.

Active-Active Multi-Cluster Load Balancing

If you have multiple Kubernetes clusters, it’s ideal to route traffic to both clusters at the same time. An even better option is to perform A/B, canary, or blue-green traffic splitting and send a small percentage of your traffic as a test. To do this, you can use NGINX Plus with ngx_http_split_clients_module.

The HTTP Split Clients module is written by NGINX Open Source and allows the ratio of requests to be distributed based on a key. In this use case, the clusters are the “upstreams” of NGINX. So, as the client requests arrive, the traffic is split between two clusters. The key that is used to determine the client request is any available NGINX client $variable. That said, to control this for every request, use the $request_id variable, which is a unique number assigned by NGINX to every incoming request.

To configure the split ratios, determine which percentages you’d like to go to each cluster. In this example, we use K8s Cluster1 as a “large cluster” for production and Cluster2 as a “small cluster” for pre-production testing. If you had a small cluster for staging, you could use a 90:10 ratio and test 10% of your traffic on the small cluster to ensure everything is working before you roll out new changes to the large cluster. If that sounds too risky, you can change the ratio to 95:5. Truthfully, you can pick any ratio you’d like from 0 to 100%.

For most real-time production traffic, you likely want a 50:50 ratio where your two clusters are of equal size. But you can easily provide other ratios, based on the cluster size or other details. You can easily set the ratio to 0:100 (or 100:0) and upgrade, patch, repair, or even replace an entire cluster with no downtime. Let NGINX split_clients route the requests to the live cluster while you address issues on the other.


# Nginx Multi Cluster Load Balancing
# HTTP Split Clients Configuration for Cluster1:Cluster2 ratios
# Provide 100, 99, 50, 1, 0% ratios  (add/change as needed)
# Based on
# https://www.nginx.com/blog/dynamic-a-b-testing-with-nginx-plus/
# Chris Akker – Jan 2024
#
 
split_clients $request_id $split100 {
   * cluster1-cafe;                     # All traffic to cluster1
   } 

split_clients $request_id $split99 {
   99% cluster1-cafe;                   # 99% cluster1, 1% cluster2
   * cluster2-cafe;
   } 
 
split_clients $request_id $split50 { 
   50% cluster1-cafe;                   # 50% cluster1, 50% cluster2
   * cluster2-cafe;
   }
    
split_clients $request_id $split1 { 
   1.0% cluster1-cafe;                  # 1% to cluster1, 99% to cluster2
   * cluster2-cafe;
   }

split_clients $request_id $split0 { 
   * cluster2-cafe;                     # All traffic to cluster2
   }
 
# Choose which cluster upstream based on the ratio
 
map $split_level $upstream { 
   100 $split100; 
   99 $split99; 
   50 $split50; 
   1.0 $split1; 
   0 $split0;
   default $split50;
}

You can add or edit the configuration above to match the ratios that you need (e.g., 90:10, 80:20, 60:40, and so on).

Note: NGINX also has a Split Clients module for TCP connections in the stream context, which can be used for non-HTTP traffic. This splits the traffic based on new TCP connections, instead of HTTP requests.

NGINX Plus Key-Value Store

The next feature you can use is the NGINX Plus key-value store. This is a key-value object in an NGINX shared memory zone that can be used for many different data storage use cases. Here, we use it to store the split ratio value mentioned in the section above. NGINX Plus allows you to change any key-value record without reloading NGINX. This enables you to change this split value with an API call, creating the dynamic split function.

Based on our example, it looks like this:

{“cafe.example.com”:90}

This KeyVal record reads: The Key is the “cafe.example.com” hostname The Value is “90” for the split ratio

Instead of hard-coding the split ratio in the NGINX configuration files, you can instead use the key-value memory. This eliminates the NGINX reload required to change a static split value in NGINX.

In this example, NGINX is configured to use 90:10 for the split ratio with the large Cluster1 for the 90% and the small Cluster2 for the remaining 10%. Because this is a key-value record, you can change this ratio using the NGINX Plus API dynamically with no configuration reloads! The Split Clients module will use this new ratio value as soon as you change it, on the very next request.

Create the KV Record, start with a 50/50 ratio:

Add a new record to the KeyValue store, by sending an API command to NGINX Plus:

curl -iX POST -d '{"cafe.example.com":50}' http://nginxlb:9000/api/8/http/keyvals/split

Change the KV Record, change to the 90/10 ratio:

Change the KeyVal Split Ratio to 90, using an HTTP PATCH Method to update the KeyVal record in memory:

curl -iX PATCH -d '{"cafe.example.com":90}' http://nginxlb:9000/api/8/http/keyvals/split

Next, the pre-production testing team verifies the new application code is ready, you deploy it to the large Cluster1, and change the ratio to 100%. This immediately sends all the traffic to Cluster1 and your new application is “live” without any disruption to traffic, no service outages, no maintenance windows, reboots, reloads, or lots of tickets. It only takes one API call to change this split ratio at the time of your choosing.

Of course, being that easy to move from 90% to 100% means you have an easy way to change the ratio from 100:0 to 50:50 (or even 0:100). So, you can have a hot backup cluster or can scale your clusters horizontally with new resources. At full throttle, you can even completely build a new cluster with the latest software, hardware, and software patches – deploying the application and migrating the traffic over a period of time without dropping a single connection!

Use Cases

Using the HTTP Split Clients module with the dynamic key-value store can deliver the following use cases:

Active-active load balancing – For load balancing to multiple clusters.
Active-passive load balancing – For load balancing to primary, backup, and DR clusters and applications.
A/B, blue-green, and canary testing – Used with new Kubernetes applications.
Horizontal cluster scaling – Adds more cluster resources and changes the ratio when you’re ready.
Hitless cluster upgrades – Ability to use one cluster while you upgrade, patch, or repair the other cluster.
Instant failover – If one cluster has a serious issue, you can change the ratio to use your other cluster.

Configuration Examples

Here is an example of the key-value configuration:

# Define Key Value store, backup state file, timeout, and enable sync
 
keyval_zone zone=split:1m state=/var/lib/nginx/state/split.keyval timeout=365d sync;

keyval $host $split_level zone=split;

And this is an example of the cafe.example.com application configuration:

# Define server and location blocks for cafe.example.com, with TLS

server {
   listen 443 ssl;
   server_name cafe.example.com; 

   status_zone https://cafe.example.com;
      
   ssl_certificate /etc/ssl/nginx/cafe.example.com.crt; 
   ssl_certificate_key /etc/ssl/nginx/cafe.example.com.key;
   
   location / {
   status_zone /;
   
   proxy_set_header Host $host;
   proxy_http_version 1.1;
   proxy_set_header "Connection" "";
   proxy_pass https://$upstream;   # traffic split to upstream blocks
   
   }

# Define 2 upstream blocks – one for each cluster
# Servers managed dynamically by NLK, state file backup

# Cluster1 upstreams
 
upstream cluster1-cafe {
   zone cluster1-cafe 256k;
   least_time last_byte;
   keepalive 16;
   #servers managed by NLK Controller
   state /var/lib/nginx/state/cluster1-cafe.state; 
}
 
# Cluster2 upstreams
 
upstream cluster2-cafe {
   zone cluster2-cafe 256k;
   least_time last_byte;
   keepalive 16;
   #servers managed by NLK Controller
   state /var/lib/nginx/state/cluster2-cafe.state; 
}

The upstream server IP:ports are managed by NGINX Loadbalancer for Kubernetes, a new controller that also uses the NGINX Plus API to configure NGINX Plus dynamically. Details are in the next section.

Let’s take a look at the HTTP split traffic over time with Grafana, a popular monitoring and visualization tool. You use the NGINX Prometheus Exporter (based on njs) to export all of your NGINX Plus metrics, which are then collected and graphed by Grafana. Details for configuring Prometheus and Grafana can be found here.

There are four upstreams servers in the graph: Two for Cluster1 and two for Cluster2. We use an HTTP load generation tool to create HTTP requests and send them to NGINX Plus.

In the three graphs below, you can see the split ratio is at 50:50 at the beginning of the graph.

Then, the ratio changes to 10:90 at 12:56:30.

Then it changes to 90:10 at 13:00:00.

You can find working configurations of Prometheus and Grafana on the NGINX Loadbalancer for Kubernetes GitHub repository.

Dynamic HTTP Upstreams: NGINX Loadbalancer for Kubernetes

You can change the static NGINX Upstream configuration to dynamic cluster upstreams using the NGINX Plus API and the NGINX Loadbalancer for Kubernetes controller. This free project is a Kubernetes controller that watches NGINX Ingress Controller and automatically updates an external NGINX Plus instance configured for TCP/HTTP load balancing. It’s very straightforward in design and simple to install and operate. With this solution in place, you can implement TCP/HTTP load balancing in Kubernetes environments, ensuring new apps and services are immediately detected and available for traffic – with no reload required.

Architecture and Flow

NGINX Loadbalancer for Kubernetes sits inside a Kubernetes cluster. It is registered with Kubernetes to watch the NGINX Ingress Controller (nginx-ingress) Service. When there is a change to the Ingress controller(s), NGINX Loadbalancer for Kubernetes collects the Worker Ips and the NodePort TCP port numbers, then sends the IP:ports to NGINX Plus via the NGINX Plus API.

The NGINX upstream servers are updated with no reload required, and NGINX Plus load balances traffic to the correct upstream servers and Kubernetes NodePorts. Additional NGINX Plus instances can be added to achieve high availability.

A Snapshot of NGINX Loadbalancer for Kubernetes in Action

In the screenshot below, there are two windows that demonstrate NGINX Loadbalancer for Kubernetes deployed and doing its job:

Service Type – LoadBalancer for nginx-ingress
External IP – Connects to the NGINX Plus servers
Ports – NodePort maps to 443:30158 with matching NGINX upstream servers (as shown in the NGINX Plus real-time dashboard)
Logs – Indicates NGINX Loadbalancer for Kubernetes is successfully sending data to NGINX Plus

Note: In this example, the Kubernetes worker nodes are 10.1.1.8 and 10.1.1.10

Adding NGINX Plus Security Features

As more and more applications running in Kubernetes are exposed to the open internet, security becomes necessary. Fortunately, NGINX Plus has enterprise-class security features that can be used to create a layered, defense-in-depth architecture.

With NGINX Plus in front of your clusters and performing the split_clients function, why not leverage that presence and add some beneficial security features? Here are a few of the NGINX Plus features that could be used to enhance security, with links and references to other documentation that can be used to configure, test, and deploy them.

IP block/allow lists – Controls access based on source IP address. You can find a configuration guide on using the NGINX Plus key-value store for dynamic IP block/allow lists using the Plus API on the Dynamic Denylisting of IP Addresses page in the NGINX Docs.
Rate limiting – Controls the number of TCP connections, HTTP requests, and bandwidth usage for applications, API endpoints, and media downloads. For more information on rate limiting with NGINX, check out the Limiting Access to Proxied HTTP Resources in the NGINX Docs and the blog posts Rate Limiting with NGINX and NGINX Plus and Dynamic Bandwidth Limits Using the NGINX Plus Key-Value Store.
GeoIP2 dynamic module – Gathers location metadata based on Source IP address. Learn more about this on the Restricting Access by Geographical Location page in the NGINX Docs.
JWT and OpenID Connect (OIDC) module – Controls user access with industry standard protocols. More information on this can be found on the Setting up JWT Authentication page in the NGINX Docs.
NGINX App Protect – A lightweight, low latency web application firewall that protects against existing and evolving threats.

Get Started Today

If you’re frustrated with networking challenges at the edge of your Kubernetes cluster, consider trying out this NGINX multi-cluster Solution. Take the NGINX Loadbalancer for Kubernetes software for a test drive and let us know what you think. The source code is open source (under the Apache 2.0 license) and all installation instructions are available on GitHub.

To provide feedback, drop us a comment in the repo or message us in the NGINX Community Slack.

The post Dynamic A/B Kubernetes Multi-Cluster Load Balancing and Security Controls with NGINX Plus appeared first on NGINX.

A Quick Guide to Scaling AI/ML Workloads on Kubernetes

Ilya Krutov of F5 — Thu, 11 Jan 2024 16:00:02 +0000

When running artificial intelligence (AI) and machine learning (ML) model training and inference on Kubernetes, dynamic scaling up and down becomes a critical element. In addition to requiring high-bandwidth storage and networking to ingest data, AI model training also needs substantial (and expensive) compute, mostly from GPUs or other specialized processors. Even when leveraging pre-trained models, tasks like model serving and fine-tuning in production are still more compute-intensive than most enterprise workloads.

Cloud-native Kubernetes is designed for rapid scalability – up and down. It’s also designed to deliver more agility and cost-efficient resource usage for dynamic workloads across hybrid, multi-cloud environments.

In this blog, we cover the three most common ways to scale AI/ML workloads on Kubernetes so you can achieve optimal performance, cost savings, and adaptability for dynamic scaling in diverse environments.

Three Scaling Modalities for AI/ML Workloads on Kubernetes

The three common ways Kubernetes scales a workload are with the Horizontal Pod Autoscaler (HPA), Vertical Pod Autoscaler (VPA), and Cluster Autoscaler.

Here is a breakdown of those three methods:

HPA – The equivalent of adding instances or pod replicas to an application, giving it more scale, capacity, and throughput.
VPA – The equivalent of resizing a pod to give it higher capacity with greater compute and memory.
Cluster Autoscaler – Automatically increases or decreases the number of nodes in a Kubernetes cluster depending on the current resource demand for the pods.

Each modality has its benefits for model training and inferencing, which you can explore in the use cases below.

HPA Use Cases

In many cases, distributed AI model training and inference workloads can scale horizontally (i.e., adding more pods to speed up the training process or request handling). This enables the workloads benefit from HPA, which can scale out the number of pods based on metrics like CPU and memory usage, or even custom and external metrics relevant to the workload. In scenarios where the workload varies over time, HPA can dynamically adjust the number of pods to ensure optimal resource utilization.

Another aspect of horizontally scaling AI workloads in Kubernetes is load balancing. To ensure optimal performance and timely request processing, incoming requests need to be distributed across multiple instances or pods. This is why one of the ideal tools that can be used in conjunction with HPA is an Ingress controller.

VPA Use Cases

AI model training tasks are often resource-intensive, requiring significant CPU, GPU, and memory resources. VPA can adjust these resource allocations dynamically. This helps ensure that each pod has enough resources to efficiently handle the training workload and that all assigned pods have sufficient compute capacity to perform calculations. In addition, memory requirements can fluctuate significantly during the training of large models. VPA can help prevent out-of-memory errors by increasing the memory allocation as needed.

While it’s technically possible to use both HPA and VPA together, it requires careful configuration to avoid conflicts, as they might try to scale the same workload in different ways (i.e., horizontally versus vertically). It’s essential to clearly define the boundaries for each autoscaler, ensuring they complement rather than conflict with each other. An emerging approach is to use both with different scopes – for instance, HPA for scaling across multiple pods based on workload and VPA for fine-tuning the resource allocation of each pod within the limits set by HPA.

Cluster Autoscaler Use Cases

Cluster Autoscaler can help dynamically adjust the overall pool of compute, storage, and networking infrastructure resources available cluster-wide to meet the demands of AI /ML workloads. By adjusting the number of nodes in a cluster based on current demands, an organization can load balance at the macro level. This is necessary to ensure optimal performance as AI/ML workloads can demand significant computational resources unpredictably.

HPA, VPA, and Cluster Autoscaler Each Have a Role

In summary, these are the three ways that Kubernetes autoscaling works and benefits AI workloads:

HPA scales AI model serving endpoints that need to handle varying request rates.
VPA optimizes resource allocation for AI/ML workloads and ensures each pod has enough resources for efficient processing without over-provisioning.
Cluster Autoscaler adds nodes to a cluster to ensure it can accommodate resource-intensive AI jobs or removes nodes when the compute demands are low.

HPA, VPA and Cluster Autoscaler complement each other in managing AI/ML workloads in Kubernetes. Cluster Autoscaler ensures there are enough nodes to meet workload demands, HPA efficiently distributes workloads across multiple pods, and VPA optimizes the resource allocation of these pods. Together, they provide a comprehensive scaling and resource management solution for AI/ML applications in Kubernetes environments.

Visit our Power and Protect Your AI Journey page to learn more on how F5 and NGINX can help deliver, secure, and optimize your AI/ML workloads.

The post A Quick Guide to Scaling AI/ML Workloads on Kubernetes appeared first on NGINX.

Watch: NGINX Gateway Fabric at KubeCon North America 2023

Mike Stefaniak of F5 — Thu, 30 Nov 2023 21:15:23 +0000

This year at KubeCon North America 2023, we were thrilled to share the first version of NGINX Gateway Fabric. Amidst the sea of exciting new tech, the conference served as the ideal stage for unveiling our implementation of the Kubernetes Gateway API.

Booth attendees were excited to learn about our unified app delivery fabric approach to managing app and API connectivity in Kubernetes. NGINX Gateway Fabric is a conformant implementation of Kubernetes Gateway API specifications that provides fast, reliable, and secure Kubernetes app and API connectivity leveraging one of the most widely used data planes in the world – NGINX.

As always, F5 DevCentral was there covering the action. Here is the moment we got into talking about NGINX Gateway Fabric:

Another hot topic at KubeCon this year was multi-cluster configuration. As organizations adopt distributed architectures like Kubernetes, multi-cluster plays a crucial role for scalability and availability. One of the options to achieve multi-cluster setup is adding NGINX Plus in front of your Kubernetes clusters. Leveraging the cloud native, easy-to-use features of NGINX Plus, including its reverse proxy, load balancing, and API gateway capabilities, users can enhance performance, availability, and security of their multi-cluster Kubernetes environment. Stay tuned for more info on this topic soon!

How to Try NGINX Gateway Fabric

If you’d like to get started with our new Kubernetes implementation, visit the NGINX Gateway Fabric project on GitHub to get involved:

Try the implementation in your lab
Test and provide feedback
Join the project as a contributor

The post Watch: NGINX Gateway Fabric at KubeCon North America 2023 appeared first on NGINX.

Automate TCP Load Balancing to On-Premises Kubernetes Services with NGINX

Chris Akker of F5 — Tue, 22 Aug 2023 15:00:07 +0000

You are a modern app developer. You use a collection of open source and maybe some commercial tools to write, test, deploy, and manage new apps and containers. You’ve chosen Kubernetes to run these containers and pods in development, test, staging, and production environments. You’ve bought into the architectures and concepts of microservices, the Cloud Native Computing Foundation, and other modern industry standards.

On this journey, you’ve discovered that Kubernetes is indeed powerful. But you’ve probably also been surprised at how difficult, inflexible, and frustrating it can be. Implementing and coordinating changes and updates to routers, firewalls, load balancers and other network devices can become overwhelming – especially in your own data center! It’s enough to bring a developer to tears.

How you handle these challenges has a lot to do with where and how you run Kubernetes (as a managed service or on premises). This article addresses TCP load balancing, a key area where deployment choices impact ease of use.

TCP Load Balancing with Managed Kubernetes (a.k.a. the Easy Option)

If you use a managed service like a public cloud provider for Kubernetes, much of that tedious networking stuff is handled for you. With just one command (kubectl apply -f loadbalancer.yaml), the Service type LoadBalancer gives you a Public IP, DNS record, and TCP load balancer. For example, you could configure Amazon Elastic Load Balancer to distribute traffic to pods containing NGINX Ingress Controller and, using this command, have no worries when the backends change. It’s so easy, we bet you take it for granted!

TCP Load Balancing with On-Premises Kubernetes (a.k.a. the Hard Option)

With on-premises clusters, it’s a totally different scenario. You or your networking peers must provide the networking pieces. You might wonder, “Why is getting users to my Kubernetes apps so difficult?” The answer is simple but a bit shocking: The Service type LoadBalancer, the front door to your cluster, doesn’t actually exist.

To expose your apps and Services outside the cluster, your network team probably requires tickets, approvals, procedures, and perhaps even security reviews – all before they reconfigure their equipment. Or you might need to do everything yourself, slowing the pace of application delivery to a crawl. Even worse, you dare not make changes to any Kubernetes Services, for if the NodePort changes, the traffic could get blocked! And we all know how much users like getting 500 errors. Your boss probably likes it even less.

A Better Solution for On-Premises TCP Load Balancing: NGINX Loadbalancer for Kubernetes

You can turn the “hard option” into the “easy option” with our new project: NGINX Loadbalancer for Kubernetes. This free project is a Kubernetes controller that watches NGINX Ingress Controller and automatically updates an external NGINX Plus instance configured for load balancing. Being very straightforward in design, it’s simple to install and operate. With this solution in place, you can implement TCP load balancing in on-premises environments, ensuring new apps and services are immediately detected and available for traffic – with no need to get hands on.

Architecture and Flow

NGINX Loadbalancer for Kubernetes sits inside a Kubernetes cluster. It is registered with Kubernetes to watch the nginx-ingress Service (NGINX Ingress Controller). When there is a change to the backends, NGINX Loadbalancer for Kubernetes collects the Worker IPs and the NodePort TCP port numbers, then sends the IP:ports to NGINX Plus via the NGINX Plus API. The NGINX upstream servers are updated with no reload required, and NGINX Plus load balances traffic to the correct upstream servers and Kubernetes NodePorts. Additional NGINX Plus instances can be added to achieve high availability.

A Snapshot of NGINX Loadbalancer for Kubernetes in Action

In the screenshot below, there are two windows that demonstrate NGINX Loadbalancer for Kubernetes deployed and doing its job:

Service Type – LoadBalancer (for nginx-ingress)
External IP – Connects to the NGINX Plus servers
Ports – NodePort maps to 443:30158 with matching NGINX upstream servers (as shown in the NGINX Plus real-time dashboard)
Logs – Indicates NGINX Loadbalancer for Kubernetes is successfully sending data to NGINX Plus

Note: In this example, the Kubernetes worker nodes are 10.1.1.8 and 10.1.1.10

Get Started Today

If you’re frustrated with networking challenges at the edge of your Kubernetes cluster, take the project for a spin and let us know what you think. The source code for NGINX Loadbalancer for Kubernetes is open source (under the Apache 2.0 license) with all installation instructions available on GitHub.

To provide feedback, drop us a comment in the repo or message us in the NGINX Community Slack.

The post Automate TCP Load Balancing to On-Premises Kubernetes Services with NGINX appeared first on NGINX.

Shaping the Future of Kubernetes Application Connectivity with F5 NGINX

Ilya Krutov of F5 — Thu, 08 Jun 2023 15:02:27 +0000

Application connectivity in Kubernetes can be extremely complex, especially when you deploy hundreds – or even thousands – of containers across various cloud environments, including on-premises, public, private, or hybrid and multi-cloud. At NGINX, we firmly believe that integrating a unified approach to manage connectivity to, from, and within a Kubernetes cluster can dramatically simplify and streamline operations for development, infrastructure, platform engineering, and security teams.

In this blog, we want to share some reflections and thoughts on how NGINX created one of the most popular Ingress controllers today, and ways we plan continue delivering the best-in-class capabilities to manage Kubernetes app connectivity in the future.

Also, don’t miss a chance to chat with our engineers and architects to discover the latest cool and exciting projects that NGINX is working on and see these technologies in action. NGINX, a part of F5, is proud to be a Platinum Sponsor of KubeCon North America 2023, and we hope to see you there! Come meet us at the NGINX booth to discuss how we can help enhance security, scalability, and observability of your Kubernetes platform.

Before anything, we want to note the importance of putting the customer first. NGINX does so by looking at each customer’s specific scenario and use cases, goals they aim to achieve, and challenges they might encounter on their journey. Then, we develop a solution leveraging our technology innovations that helps the customer achieve those goals and address any challenges in the most efficient way.

Ingress Controller

In 2017, we released the first version of NGINX Ingress Controller to answer the demand for enterprise-class Kubernetes-native app delivery. NGINX Ingress Controller helps improve user experience with load balancing, SSL termination, URI rewrites, session persistence, JWT authentication, and other key application delivery features. It is built on the most popular data plan in the world – NGINX – and leverages the Kubernetes Ingress API.

After its release, NGINX Ingress Controller gained immediate traction due to its ease of deployment and configuration, low resource utilization (even under heavy loads), and fast and reliable operations.

As our journey advanced, we reached limitations with the Ingress object in the Kubernetes API, such as support for protocols other than HTTP and the inability to attach customized request-handling policies like security policy. Due to these limitations, we introduced Custom Resource Definitions (CRDs) to enhance NGINX Ingress Controller capabilities and enable advanced use cases for our customers.

NGINX Ingress Controller provides the CRDs VirtualServer, VirtualServerRoute, TransportServer, and Policy to enhance performance, resilience, uptime, and security, along with observability for the API gateway, load balancer, and Ingress functionality at the edge of a Kubernetes cluster. In support of frequent app releases, these NGINX CRDs also enable role-oriented self-service governance across multi-tenant development and operations teams.

With our most recent release at the time of writing (version 3.1), we added JWT authorization and introduced Deep Service Insight to help customers monitor status of their apps behind NGINX Ingress Controller. This helps implement advanced failover scenarios (e.g., from on-premises to cloud ). Many other features are planned in the roadmap, so stay tuned for the new releases.

Learn more about how you can reduce complexity, increase uptime, and provide better insights into app health and performance at scale on the NGINX Ingress Controller web page.

Service Mesh

In 2020, we continued our Kubernetes app connectivity journey by introducing NGINX Service Mesh, a purpose-built, developer-friendly, lightweight yet comprehensive solution to power a variety of service-to-service connectivity use cases, including security and visibility, within the Kubernetes cluster.

NGINX Service Mesh and NGINX Ingress Controller leverage the same data plane technology and can be tightly and seamlessly integrated for unified connectivity to, from, and within a cluster.

Prior to the latest release (version 2.0), NGINX Service Mesh used SMI specifications and a bespoke API server to deliver service-to-service connectivity within a Kubernetes cluster. With version 2.0, we decided to deprecate the SMI resources and replace them by mimicking the resources from Gateway API for Mesh Management and Administration (GAMMA). With this approach, we ensure unified north-south and east-west connectivity that leverages the same CRD types, simplifying and streamlining configuration and operations.

NGINX Service Mesh is available as a free download from GitHub.

Gateway API

The Gateway API is an open source project intended to improve and standardize app and service networking in Kubernetes. Managed by the Kubernetes community, the Gateway API specification evolved from the Kubernetes Ingress API to solve limitations of the Ingress resource in production environments. These limitations include defining fine-grained policies for request processing and delegating control over configuration across multiple teams and roles. It’s an exciting project – and since the Gateway API’s introduction, NGINX has been an active participant.

That said, we intentionally didn’t want to include the Gateway API specifications in NGINX Ingress Controller because it already has a robust set of CRDs that cover a diverse variety of use cases, and some of those use cases are the same ones the Gateway API is intended to address.

In 2021, we decided to spin off a separate new project that covers all aspects of Kubernetes connectivity with the Gateway API: NGINX Kubernetes Gateway.

We decided to start our NGINX Kubernetes Gateway project, rather than just using NGINX Ingress Controller, for these reasons:

To ensure product stability, reliability, and production readiness (we didn’t want to include beta-level specs into a mature, enterprise-class Ingress controller).
To deliver comprehensive, vendor-agnostic configuration interoperability for Gateway API resources without mixing them with vendor-specific CRDs.
To experiment with data and control plane architectural choices and decisions with the goal to provide easy-to-use, fast, reliable, and secure Kubernetes connectivity that is future-proof.

In addition, the Gateway API formed a GAMMA subgroup to research and define capabilities and resources of the Gateway API specifications for service mesh use cases. Here at NGINX, we see the long-term future of unified north-south and east-west Kubernetes connectivity in the Gateway API and heading in this direction.

The Gateway API is truly a collaborative effort across vendors and projects – all working together to build something better for Kubernetes users, based on experience and expertise, common touchpoints, and joint decisions. There will always be room for individual implementations to innovate and for data planes to shine. With NGINX Kubernetes Gateway, we continue working on native NGINX implementation of the Gateway API, and we encourage you to join us in shaping the future of Kubernetes app connectivity.

Ways you can get involved in NGINX Kubernetes Gateway include:

Join the project as a contributor
Try the implementation in your lab
Test and provide feedback

To join the project, visit NGINX Kubernetes Gateway on GitHub.

Even with this evolution of the Kubernetes Ingress API, NGINX Ingress Controller is not going anywhere and will stay here for the foreseeable future. We’ll continue to invest into and develop our proven and mature technology to satisfy both current and future customer needs and help users who need to manage app connectivity at the edge of a Kubernetes cluster.

Get Started Today

To learn more about how you can simplify application delivery with NGINX Kubernetes solutions, visit the Connectivity Stack for Kubernetes web page.

The post Shaping the Future of Kubernetes Application Connectivity with F5 NGINX appeared first on NGINX.

The Mission-Critical Patient-Care Use Case That Became a Kubernetes Odyssey

Jenn Gile of F5 — Wed, 17 May 2023 15:00:51 +0000

Downtime can lead to serious consequences.

These words are truer for companies in the medical technology field than in most other industries – in their case, the "serious consequences" can literally include death. We recently had the chance to dissect the tech stack of a company that’s seeking to transform medical record keeping from pen-and-paper to secure digital data that is accessible anytime, and anywhere, in the world. These data range from patient information to care directives, biological markers, medical analytics, historical records, and everything else shared between healthcare teams.

From the outset, the company has sought to address a seemingly simple question: “How can we help care workers easily record data in real time?” As the company has grown, however, the need to scale and make data constantly available has made solving that challenge increasingly complex. Here we describe how the company’s tech journey has led them to adopt Kubernetes and NGINX Ingress Controller.

Tech Stack at a Glance

OS – Linux
Container orchestration – Microsoft Azure Kubernetes Service
Networking – Kubernetes, NGINX Ingress Controller based on NGINX Plus
Software development language/framework – .Net
Monitoring, observability, and Alerting – Prometheus
Monitoring dashboard – Grafana
Logging – Grafana Loki
Database – Azure SQL Service
Application server – Azure App Service, .Net
Messaging and streaming– Azure Event Hubs
Caching – Redis
Security – NGINX App Protect
Location and infrastructure – Two availability zones, 2 Kubernetes clusters, 15–20 nodes, 60–100 pods
DevOps – Azure DevOps Services

Here’s a look at where NGINX fits into their architecture:

The Problem with Paper

Capturing patient status and care information at regular intervals is a core duty for healthcare personnel. Traditionally, they have recorded patient information on paper, or more recently on laptop or tablet. There are a couple serious downsides:

Healthcare workers may interact dozens of patients per day, so it’s usually not practical to write detailed notes while providing care. As a result, workers end up writing their notes at the end of their shift. At that point, mental and physical fatigue make it tempting to record only generic comments.
The workers must also depend on their memory of details about patient behavior. Inaccuracies might mask patterns that facilitate diagnosis of larger health issues if documented correctly and consistently over time.
Paper records can’t easily be shared among departments within a single department, let alone with other entities like EMTs, emergency room staff, and insurance companies. The situation isn’t much better with laptops or tablets if they’re not connected to a central data store or the cloud.

To address these challenges, the company created a simplified data recording system that provides shortcuts for accessing patient information and recording common events like dispensing medication. This ease of access and use makes it possible to record patient interactions in real time as they happen.

All data is stored in cloud systems maintained by the company, and the app integrates with other electronic medical records systems to provide a comprehensive longitudinal view of resident behaviors. This helps caregivers provide better continuity of care, creates a secure historical record, and can be easily shared with other healthcare software systems.

Physicians and other specialists also use the platform when admitting or otherwise engaging with patients. There’s a record of preferences and personal needs that travel with the patient to any facility. These can be used to help patients feel comfortable in a new setting, which improve outcomes like recovery time.

There are strict legal requirements about how long companies must store patient data. The company’s developers have built the software to offer extremely high availability with uptime SLAs that are much better than those of generic cloud applications. Keeping an ambulance waiting because a patient’s file won’t load isn’t an option.

The Voyage from the Garage to the Cloud to Kubernetes

Like many startups, the company initially saved money by running the first proof-of-concept application on a server in a co-founder’s home. Once it became clear the idea had legs, the company moved its infrastructure to the cloud rather than manage hardware in a data center. Being a Microsoft shop, they chose Azure. The initial architecture ran applications on traditional virtual machines (VMs) in Azure App Service, a managed application delivery service that runs Microsoft’s IIS web server. For data storage and retrieval, the company opted to use Microsoft’s SQL Server running in a VM as a managed application.

After several years running in the cloud, the company was growing quickly and experiencing scaling pains. It needed to scale infinitely, and horizontally rather than vertically because the latter is slow and expensive with VMs. This requirement led rather naturally to containerization and Kubernetes as a possible solution. A further point in favor of containerization was that the company’s developers need to ship updates to the application and infrastructure frequently, without risking outages. With patient notes being constantly added across multiple time zones, there is no natural downtime to push changes to production without the risk of customers immediately being affected by glitches.

A logical starting point for the company was Microsoft’s managed Kubernetes offering, Azure Kubernetes Services (AKS). The team researched Kubernetes best practices and realized they needed an Ingress controller running in front of their Kubernetes clusters to effectively manage traffic and applications running in nodes and pods on AKS.

Traffic Routing Must Be Flexible Yet Precise

The team tested AKS’s default Ingress controller, but found its traffic-routing features simply could not deliver updates to the company’s customers in the required manner. When it comes to patient care, there’s no room for ambiguity or conflicting information – it’s unacceptable for one care worker to see an orange flag and another a red flag for the same event, for example. Hence, all users in a given organization must use the same version of the app. This presents a big challenge when it comes to upgrades. There’s no natural time to transition a customer to a new version, so the company needed a way to use rules at the server and network level to route different customers to different app versions.

To achieve this, the company runs the same backend platform for all users in an organization and does not offer multi-tenancy with segmentation at the infrastructure layer within the organization. With Kubernetes, it is possible to split traffic using virtual network routes and cookies on browsers along with detailed traffic rules. However, the company’s technical team found that AKS’s default Ingress controller can split traffic only on a percentage basis, not with rules that operate at level of customer organization or individual user as required.

In its basic configuration, the NGINX Ingress Controller based on NGINX Open Source has the same limitation, so the company decided to pivot to the more advanced NGINX Ingress Controller based on NGINX Plus, an enterprise-grade product which supports granular traffic control. Finding recommendations from NGINX Ingress Controller from Microsoft and the Kubernetes community based on the high level of flexibility and control helped solidify the choice. The configuration better supports the company’s need for pod management (as opposed to classic traffic management), ensuring that pods are running in the appropriate zones and traffic is routed to those services. Sometimes traffic is being routed internally but in most use cases, it is routed back out through NGINX Ingress Controller for observability reasons.

Here Be Dragons: Monitoring, Observability and Application Performance

With NGINX Ingress Controller, the technical team has complete control over the developer and end user experience. Once users log in and establish a session, they can immediately be routed to a new version or reverted back to an older one. Patches can be pushed simultaneously and nearly instantaneously to all users in an organization. The software isn’t reliant on DNS propagation or updates on networking across the cloud platform.

NGINX Ingress Controller also meets the company’s requirement for granular and continuous monitoring. Application performance is extremely important in healthcare. Latency or downtime can hamper successful clinical care, especially in life-or-death situations. After the move to Kubernetes, customers started reporting downtime that the company hadn’t noticed. The company soon discovered the source of the problem: Azure App Service relies on sampled data. Sampling is fine for averages and broad trends, but it completely misses things like rejected requests and missing resources. Nor does it show the usage spikes that commonly occur every half hour as care givers check in and log patient data. The company was getting only an incomplete picture of latency, error sources, bad requests, and unavailable service.

The problems didn’t stop there. By default Azure App Service preserves stored data for only a month – far short of the dozens of years mandated by laws in many countries. Expanding the data store as required for longer preservation was prohibitively expensive. In addition, the Azure solution cannot see inside of the Kubernetes networking stack. NGINX Ingress Controller can monitor both infrastructure and application parameters as it handles Layer 4 and Layer 7 traffic.

For performance monitoring and observability, the company chose a Prometheus time-series database attached to a Grafana visualization engine and dashboard. Integration with Prometheus and Grafana is pre-baked into the NGINX data and control plane; the technical team had to make only a small configuration change to direct all traffic through the Prometheus and Grafana servers. The information was also routed into a Grafana Loki logging database to make it easier to analyze logs and give the software team more control over data over time.

This configuration also future-proofs against incidents requiring extremely frequent and high-volume data sampling for troubleshooting and fixing bugs. Addressing these types of incidents might be costly with the application monitoring systems provided by most large cloud companies, but the cost and overhead of Prometheus, Grafana, and Loki in this use case are minimal. All three are stable open source products which generally require little more than patching after initial tuning.

Stay the Course: A Focus on High Availability and Security

The company has always had a dual focus, on security to protect one of the most sensitive types of data there is, and on high availability to ensure the app is available whenever it’s needed. In the shift to Kubernetes, they made a few changes to augment both capacities.

For the highest availability, the technical team deploys an active-active, multi-zone, and multi-geo distributed infrastructure design for complete redundancy with no single point of failure. The team maintains N+2 active-active infrastructure with dual Kubernetes clusters in two different geographies. Within each geography, the software spans multiple data centers to reduce downtime risk, providing coverage in case of any failures at any layer in the infrastructure. Affinity and anti-affinity rules can instantly reroute users and traffic to up-and-running pods to prevent service interruptions.

For security, the team deploys a web application firewall (WAF) to guard against bad requests and malicious actors. Protection against the OWASP Top 10 is table stakes provided by most WAFs. As they created the app, the team researched a number of WAFs including the native Azure WAF and ModSecurity. In the end, the team chose NGINX App Protect with its inline WAF and distributed denial-of-service (DDoS) protection.

A big advantage of NGINX App Protect is its colocation with NGINX Ingress Controller, which both eliminates a point of redundancy and reduces latency. Other WAFs must be placed outside of the Kubernetes environment, contributing to latency and cost. Even miniscule delays (say 1 millisecond extra per request) add up quickly over time.

Surprise Side Quest: No Downtime for Developers

Having completed the transition to AKS for most of its application and networking infrastructure, the company has also realized significant improvements to its developer experience (DevEx). Developers now almost always spot problems before customers notice any issues themselves. Since the switch, the volume of support calls about errors is down about 80%!

The company’s security and application-performance teams have a detailed Grafana dashboard and unified alerting, eliminating the need to check multiple systems or implement triggers for warning texts and calls coming from different processes. The development and DevOps teams can now ship code and infrastructure updates daily or even multiple times per day and use extremely granular blue-green patterns. Formerly, they were shipping updates once or twice per week and having to time there for low-usage windows, a stressful proposition. Now, code is shipped when ready and the developers can monitor the impact directly by observing application behavior.

The results are positive all around – an increase in software development velocity, improvement in developer morale, and more lives saved.

The post The Mission-Critical Patient-Care Use Case That Became a Kubernetes Odyssey appeared first on NGINX.

Building a Docker Image of NGINX Plus with NGINX Agent for Kubernetes

Fabrizio Fiorucci of F5 — Tue, 18 Apr 2023 23:39:45 +0000

F5 NGINX Management Suite is a family of modules for managing the NGINX data plane from a single pane of glass. By simplifying management of NGINX Open Source and NGINX Plus instances, NGINX Management Suite simplifies your processes for scaling, securing, and monitoring applications and APIs.

You need to install the NGINX Agent on each NGINX instance you want to manage from NGINX Management Suite, to enable communication with the control plane and remote configuration management.

For NGINX instances running on bare metal or a virtual machine (VM), we provide installation instructions in our documentation. In this post we show how to build a Docker image for NGINX Plus and NGINX Agent, to broaden the reach of NGINX Management Suite to NGINX Plus instances deployed in Kubernetes or other microservices infrastructures.

There are three build options, depending on what you want to include in the resulting Docker image:

NGINX Plus and NGINX Agent only
NGINX Plus, NGINX Agent, and NGINX App Protect WAF
NGINX Plus, NGINX Agent, and support for the NGINX Management Suite API Connectivity Manager developer portal (for NGINX Plus instances running as an API gateway)

[Editor – This post was updated in April 2023 to clarify the instructions, and add the ACM_DEVPORTAL field, in Step 1 of Running the Docker Image in Kubernetes.]

Prerequisites

We provide a GitHub repository of the resources you need to create a Docker image of NGINX Plus and NGINX Agent, with support for version 2.8.0 and later of the Instance Manager module from NGINX Management Suite.

To build the Docker image, you need:

A Linux host (bare metal or VM)
Docker 20.10+
A private registry to which you can push the target Docker image
A running NGINX Management Suite instance with Instance Manager, and API Connectivity Manager if you want to leverage support for the developer portal
A subscription (or 30-day free trial) for NGINX Plus and optionally NGINX App Protect

To run the Docker image, you need:

A running Kubernetes cluster
kubectl with access to the Kubernetes cluster

Building the Docker Image

Follow these instructions to build the Docker image.

Clone the GitHub repository:

$ git clone https://github.com/nginxinc/NGINX-Demos 
Cloning into 'NGINX-Demos'... 
remote: Enumerating objects: 126, done. 
remote: Counting objects: 100% (126/126), done. 
remote: Compressing objects: 100% (85/85), done. 
remote: Total 126 (delta 61), reused 102 (delta 37), pack-reused 0 
Receiving objects: 100% (126/126), 20.44 KiB | 1.02 MiB/s, done. 
Resolving deltas: 100% (61/61), done.

Change to the build directory:
```
$ cd NGINX-Demos/nginx-agent-docker/
```
Run docker ps to verify that Docker is running and then run the build.sh script to include the desired software in the Docker image. The base options are:
- ‑C – Name of the NGINX Plus license certificate file (nginx-repo.crt in the sample commands below)
- ‑K – Name of the NGINX Plus license key file (nginx-repo.key in the sample commands below)
- ‑t – The registry and target image in the form
  
  /:
  
  (registry.ff.lan:31005/nginx-plus-with-agent:2.7.0 in the sample commands below)
- ‑n – Base URL of your NGINX Management Suite instance (https://nim.f5.ff.lan in the sample commands below)
The additional options are:
- ‑d – Add data‑plane support for the developer portal when using NGINX API Connectivity Manager
- ‑w – Add NGINX App Protect WAF
Here are the commands for the different combinations of software:
- NGINX Plus and NGINX Agent:
```
$ ./scripts/build.sh -C nginx-repo.crt -K nginx-repo.key \
-t registry.ff.lan:31005/nginx-plus-with-agent:2.7.0 \
-n https://nim.f5.ff.lan
```
- NGINX Plus, NGINX Agent, and NGINX App Protect WAF (add the ‑w option):
```
$ ./scripts/build.sh -C nginx-repo.crt -K nginx-repo.key \
-t registry.ff.lan:31005/nginx-plus-with-agent:2.7.0 -w \
-n https://nim.f5.ff.lan
```
- NGINX Plus, NGINX Agent, and developer portal support (add the ‑d option):
```
$ ./scripts/build.sh -C nginx-repo.crt -K nginx-repo.key \ 
-t registry.ff.lan:31005/nginx-plus-with-agent:2.7.0 -d \ 
-n https://nim.f5.ff.lan
```
Here’s a sample trace of the build for a basic image. The Build complete message at the end indicates a successful build.
```
$ ./scripts/build.sh -C nginx-repo.crt -K nginx-repo.key -t registry.ff.lan:31005/nginx-plus-with-agent:2.7.0 -n https://nim.f5.ff.lan 
=> Target docker image is nginx-plus-with-agent:2.7.0 
[+] Building 415.1s (10/10) FINISHED 
=> [internal] load build definition from Dockerfile
=> transferring dockerfile: 38B
=> [internal] load .dockerignore 
=> transferring context: 2B 
=> [internal] load metadata for docker.io/library/centos:7
=> [auth] library/centos:pull token for registry-1.docker.io
=> CACHED [1/4] FROM docker.io/library /centos:7@sha256:be65f488b7764ad3638f236b7b515b3678369a5124c47b8d32916d6487418ea4
=> [internal] load build context 
=> transferring context: 69B 
=> [2/4] RUN yum -y update  && yum install -y wget ca-certificates epel-release curl  && mkdir -p /deployment /etc/ssl/nginx  && bash -c 'curl -k $NMS_URL/install/nginx-agent | sh' && echo "A  299.1s 
=> [3/4] COPY ./container/start.sh /deployment/
=> [4/4] RUN --mount=type=secret,id=nginx-crt,dst=/etc/ssl/nginx/nginx-repo.crt  --mount=type=secret,id=nginx-key,dst=/etc/ssl/nginx/nginx-repo.key  set -x  && chmod +x /deployment/start.sh &  102.4s  
=> exporting to image 
=> exporting layers 
=> writing image sha256:9246de4af659596a290b078e6443a19b8988ca77f36ab90af3b67c03d27068ff 
=> naming to registry.ff.lan:31005/nginx-plus-with-agent:2.7.0 
=> Build complete for registry.ff.lan:31005/nginx-plus-with-agent:2.7.0
```
Running the Docker Image in Kubernetes

Follow these instructions to prepare the Deployment manifest and start NGINX Plus with NGINX Agent on Kubernetes.
1. Using your preferred text editor, open manifests/1.nginx-with-agent.yaml and make the following changes (the code snippets show the default values that you can or must change, highlighted in orange):
  - In the spec.template.spec.containers section, replace the default image name (your.registry.tld/nginx-with-nim2-agent:tag) with the Docker image name you specified with the ‑t option in Step 3 of Building the Docker Image (in our case, registry.ff.lan:31005/nginx-plus-with-agent:2.7.0):
```
spec:
  ...
  template:
    ...    
    spec:
      containers:
      - name: nginx-nim
        image: your.registry.tld/nginx-with-nim2-agent:tag
```
  - In the spec.template.spec.containers.env section, make these substitutions in the value field for each indicated name:
    - NIM_HOST – (Required) Replace the default (nginx-nim2.nginx-nim2) with the FQDN or IP address of your NGINX Management Suite instance (in our case nim2.f5.ff.lan).
    - NIM_GRPC_PORT – (Optional) Replace the default (443) with a different port number for gRPC traffic.
    - NIM_INSTANCEGROUP – (Optional) Replace the default (lab) with the instance group to which the NGINX Plus instance belongs.
    - NIM_TAGS – (Optional) Replace the default (preprod,devops) with a comma‑delimited list of tags for the NGINX Plus instance.
```
spec:
  ...
  template:
    ...    
  spec:
    containers:
      ...
      env:
        - name: NIM_HOST
        ...
          value: "nginx-nim2.nginx-nim2"
        - name: NIM_GRPC_PORT
          value: "443"
        - name: NIM_INSTANCEGROUP
          value: "lab"
        - name: NIM_TAGS
          value: "preprod,devops"
```
  - Also in the spec.template.spec.containers.env section, uncomment these name‑value field pairs if the indicated condition applies:
    - NIM_WAF and NIM_WAF_PRECOMPILED_POLICIES – NGINX App Protect WAF is included in the image (you included the -w option in Step 3 of Building the Docker Image), so the value is "true".
    - ACM_DEVPORTAL – Support for the App Connectivity Manager developer portal is included in the image (you included the -d option in Step 3 of Building the Docker Image), so the value is "true".
```
spec:
  ...
  template:
    ...    
  spec:
    containers:
      ...
      env:
        - name: NIM_HOST
        ...
        #- name: NAP_WAF
        #  value: "true"
        #- name: NAP_WAF_PRECOMPILED_POLICIES
        #  value: "true"
        ...
        #- name: ACM_DEVPORTAL
        #  value: "true"
```
2. Run the nginxwithAgentStart.sh script as indicated to apply the manifest and start two pods (as specified by the replicas: 2 instruction in the spec section of the manifest), each with NGINX Plus and NGINX Agent:
```
$ ./scripts/nginxWithAgentStart.sh start
$ ./scripts/nginxWithAgentStart.sh stop
```
3. Verify that two pods are now running: each pod runs an NGINX Plus instance and an NGINX Agent to communicate with the NGINX Management Suite control plane.
```
$ kubectl get pods -n nim-test  
NAME                        READY  STATUS   RESTARTS  AGE 
nginx-nim-7f77c8bdc9-hkkck  1/1    Running  0         1m 
nginx-nim-7f77c8bdc9-p2s94  1/1    Running  0         1m
```
4. Access the NGINX Instance Manager GUI in NGINX Management Suite and verify that two NGINX Plus instances are running with status Online. In this example, NGINX App Protect WAF is not enabled.
Get Started

To try out the NGINX solutions discussed in this post, start a 30-day free trial today or contact us to discuss your use cases:
- NGINX Management Suite
- NGINX Plus (includes NGINX App Protect WAF)
Download NGINX Agent – it’s free and open source.

The post Building a Docker Image of NGINX Plus with NGINX Agent for Kubernetes appeared first on NGINX.

Making Better Decisions with Deep Service Insight from NGINX Ingress Controller

Akash Ananthanarayanan of F5 — Thu, 06 Apr 2023 13:15:13 +0000

We released version 3.0 of NGINX Ingress Controller in January 2023 with a host of significant new features and enhanced functionality. One new feature we believe you’ll find particularly valuable is Deep Service Insight, available with the NGINX Plus edition of NGINX Ingress Controller.

Deep Service Insight addresses a limitation that hinders optimal functioning when a routing decision system such as a load balancer sits in front of one or more Kubernetes clusters – namely, that the system has no access to information about the health of individual services running in the clusters behind the Ingress controller. This prevents it from routing traffic only to clusters with healthy services, which potentially exposes your users to outages and errors like 404 and 500.

Deep Service Insight eliminates that problem by exposing the health status of backend service pods (as collected by NGINX Ingress Controller) at a dedicated endpoint where your systems can access and use it for better routing decisions.

In this post we take an in‑depth look at the problem solved by Deep Service Insight, explain how it works in some common use cases, and show how to configure it.

Why Deep Service Insight?

The standard Kubernetes liveness, readiness, and startup probes give you some information about the backend services running in your clusters, but not enough for the kind of insight you need to make better routing decisions all the way up your stack. Lacking the right information becomes even more problematic as your Kubernetes deployments grow in complexity and your business requirements for uninterrupted uptime become more pressing.

A common approach to improving uptime as you scale your Kubernetes environment is to deploy load balancers, DNS managers, and other automated decision systems in front of your clusters. However, because of how Ingress controllers work, a load balancer sitting in front of a Kubernetes cluster normally has no access to status information about the services behind the Ingress controller in the cluster – it can verify only that the Ingress controller pods themselves are healthy and accepting traffic.

NGINX Ingress Controller, on the other hand, does have information about service health. It already monitors the health of the upstream pods in a cluster by sending periodic passive health checks for HTTP, TCP, UDP, and gRPC services, monitoring request responsiveness, and tracking successful response codes and other metrics. It uses this information to decide how to distribute traffic across your services’ pods to provide a consistent and predictable user experience. Normally, NGINX Ingress Controller is performing all this magic silently in the background, and you might never think twice about what’s happening under the hood. Deep Service Insight “surfaces” this valuable information so you can use it more effectively at other layers of your stack.

How Does Deep Service Insight Work?

Deep Service Insight is available for services you deploy using the NGINX VirtualServer and TransportServer custom resources (for HTTP and TCP/UDP respectively). Deep Service Insight uses the NGINX Plus API to share NGINX Ingress Controller’s view of the individual pods in a backend service at a dedicated endpoint unique to Deep Service Insight:

For VirtualServer – : /probe/
For TransportServer – : /probe/ts/

where

belongs to NGINX Ingress Controller
is the Deep Service Insight port number (9114 by default)
is the domain name of the service as defined in the spec.host field of the VirtualServer resource
is the name of the service as defined in the spec.upstreams.service field in the TransportServer resource

The output includes two types of information:

An HTTP status code for the hostname or service name:
- 200 OK – At least one pod is healthy
- 418 I’m a teapot – No pods are healthy
- 404 Not Found – There are no pods matching the specified hostname or service name
Three counters for the specified hostname or service name:
- Total number of service instances (pods)
- Number of pods in the Up (healthy) state
- Number of pods in the Unhealthy state

Here’s an example where all three pods for a service are healthy:

HTTP/1.1 200 OK
Content-Type: application/json; charset=utf-8
Date: Day, DD Mon YYYY hh:mm:ss TZ
Content-Length: 32
{"Total":3,"Up":3,"Unhealthy":0}

For more details, see the NGINX Ingress Controller documentation.

You can further customize the criteria that NGINX Ingress Controller uses to decide a pod is healthy by configuring active health checks. You can configure the path and port to which the health check is sent, the number of failed checks that must occur within a specified time period for a pod to be considered unhealthy, the expected status code, timeouts for connecting or receiving a response, and more. Include the Upstream.Healthcheck field in the VirtualServer or TransportServer resource.

Sample Use Cases for Deep Service Insight

One use case where Deep Service Insight is particularly valuable is when a load balancer is routing traffic to a service that’s running in two clusters, say for high availability. Within each cluster, NGINX Ingress Controller tracks the health of upstream pods as described above. When you enable Deep Service Insight, information about the number of healthy and unhealthy upstream pods is also exposed on a dedicated endpoint. Your routing decision system can access the endpoint and use the information to divert application traffic away from unhealthy pods in favor of healthy ones.

The diagram illustrates how Deep Service Insight works in this scenario.

You can also take advantage of Deep Service Insight when performing maintenance on a cluster in a high‑availability scenario. Simply scale the number of pods for a service down to zero in the cluster where you’re doing maintenance. The lack of healthy pods shows up automatically at the Deep Service Insight endpoint and your routing decision system uses that information to send traffic to the healthy pods in the other cluster. You effectively get automatic failover without having to change configuration on either NGINX Ingress Controller or the system, and your customers never experience a service interruption.

Enabling Deep Service Insight

To enable Deep Service Insight, include the -enable-service-insight command‑line argument in the Kubernetes manifest, or set the serviceInsight.create parameter to true if using Helm.

There are two optional arguments which you can include to tune the endpoint for your environment:

-service-insight-listen-port – Change the Deep Service Insight port number from the default, 9114 ( is an integer in the range 1024–65535). The Helm equivalent is the serviceInsight.port parameter.
-service-insight-tls-string – A Kubernetes secret (TLS certificate and key) for TLS termination of the Deep Service Insight endpoint ( is a character string with format /). The Helm equivalent is the serviceInsight.secret parameter.

Example: Enable Deep Service Insight for the Cafe Application

To see Deep Service Insight in action, you can enable it for the Cafe application often used as an example in the NGINX Ingress Controller documentation.

Install the NGINX Plus edition of NGINX Ingress Controller with support for NGINX custom resources and enabling Deep Service Insight:
- If using Helm, set the serviceInsight.create parameter to true.
- If using a Kubernetes manifest (Deployment or DaemonSet), include the -enable-service-insight argument in the manifest file.

Verify that NGINX Ingress Controller is running:

$ kubectl get pods -n nginx-ingress
NAME                                          READY ...
ingress-plus-nginx-ingress-6db8dc5c6d-cb5hp   1/1   ...  

    ...  STATUS   RESTARTS   AGE
    ...  Running   0          9d

Deploy the Cafe application according to the instructions in the README.
Verify that the NGINX VirtualServer custom resource is deployed for the Cafe application (the IP address is omitted for legibility):
```
$ kubectl get vs 
NAME   STATE   HOST               IP    PORTS      AGE
cafe   Valid   cafe.example.com   ...   [80,443]   7h1m
```

Verify that there are three upstream pods for the Cafe service running at cafe.example.com:

$ kubectl get pods 
NAME                     READY   STATUS    RESTARTS   AGE
coffee-87cf76b96-5b85h   1/1     Running   0          7h39m
coffee-87cf76b96-lqjrp   1/1     Running   0          7h39m
tea-55bc9d5586-9z26v     1/1     Running   0          111m

Access the Deep Service Insight endpoint:

$ curl -i :9114/probe/cafe.example.com

The 200 OK response code indicates that the service is ready to accept traffic (at least one pod is healthy). In this case all three pods are in the Up state.

HTTP/1.1 200 OK
Content-Type: application/json; charset=utf-8
Date: Day, DD Mon YYYY hh:mm:ss TZ
Content-Length: 32
{"Total":3,"Up":3,"Unhealthy":0}

The 418 I’m a teapot status code indicates that the service is unavailable (all pods are unhealthy).

HTTP/1.1 418 I'm a teapot
Content-Type: application/json; charset=utf-8
Date: Day, DD Mon YYYY hh:mm:ss TZ
Content-Length: 32
{"Total":3,"Up":0,"Unhealthy":3}

The 404 Not Found status code indicates that there is no service running at the specified hostname.

HTTP/1.1 404 Not Found
Date: Day, DD Mon YYYY hh:mm:ss TZ
Content-Length: 0

Resources

For the complete changelog for NGINX Ingress Controller release 3.0.0, see the Release Notes.

To try NGINX Ingress Controller with NGINX Plus and NGINX App Protect, start your 30-day free trial today or contact us to discuss your use cases.

The post Making Better Decisions with Deep Service Insight from NGINX Ingress Controller appeared first on NGINX.