Tech Archives - NGINX
https://www.nginx.com/category/tech/

Scale, Secure, and Monitor AI/ML Workloads in Kubernetes with Ingress Controllers
https://www.nginx.com/blog/scale-secure-and-monitor-ai-ml-workloads-in-kubernetes-with-ingress-controllers/ (Thu, 22 Feb 2024 20:09:02 +0000)

AI and machine learning (AI/ML) workloads are revolutionizing how businesses operate and innovate. Kubernetes, the de facto standard for container orchestration and management, is the platform of choice for powering scalable large language model (LLM) workloads and inference models across hybrid, multi-cloud environments.

In Kubernetes, Ingress controllers play a vital role in delivering and securing containerized applications. Deployed at the edge of a Kubernetes cluster, they serve as the central point of handling communications between users and applications.

In this blog, we explore how Ingress controllers and F5 NGINX Connectivity Stack for Kubernetes can help simplify and streamline model serving, experimentation, monitoring, and security for AI/ML workloads.

Deploying AI/ML Models in Production at Scale

When deploying AI/ML models at scale, out-of-the-box Kubernetes features and capabilities can help you:

  • Accelerate and simplify the AI/ML application release life cycle.
  • Enable AI/ML workload portability across different environments.
  • Improve compute resource utilization efficiency and economics.
  • Deliver scalability and achieve production readiness.
  • Optimize the environment to meet business SLAs.

At the same time, organizations might face challenges with serving, experimenting, monitoring, and securing AI/ML models in production at scale:

  • Increasing complexity and tool sprawl make it difficult for organizations to configure, operate, manage, automate, and troubleshoot Kubernetes environments on-premises, in the cloud, and at the edge.
  • Poor user experiences because of connection timeouts and errors due to dynamic events, such as pod failures and restarts, auto-scaling, and extremely high request rates.
  • Performance degradation, downtime, and slower and harder troubleshooting in complex Kubernetes environments due to aggregated reporting and lack of granular, real-time, and historical metrics.
  • Significant risk of exposure to cybersecurity threats in hybrid, multi-cloud Kubernetes environments because traditional security models are not designed to protect loosely coupled distributed applications.

Enterprise-class Ingress controllers like F5 NGINX Ingress Controller can help address these challenges. By leveraging one tool that combines Ingress controller, load balancer, and API gateway capabilities, you can achieve better uptime, protection, and visibility at scale – no matter where you run Kubernetes. This approach also reduces complexity and operational cost.

Diagram of NGINX Ingress Controller ecosystem

NGINX Ingress Controller can also be tightly integrated with an industry-leading Layer 7 app protection technology from F5 that helps mitigate OWASP Top 10 cyberthreats for LLM Applications and defends AI/ML workloads from DoS attacks.

Benefits of Ingress Controllers for AI/ML Workloads

Ingress controllers can simplify and streamline deploying and running AI/ML workloads in production through the following capabilities:

  • Model serving – Deliver apps non-disruptively with Kubernetes-native load balancing, auto-scaling, rate limiting, and dynamic reconfiguration features.
  • Model experimentation – Implement blue-green and canary deployments, and A/B testing to roll out new versions and upgrades without downtime (see the traffic-splitting sketch after this list).
  • Model monitoring – Collect, represent, and analyze model metrics to gain better insight into app health and performance.
  • Model security – Configure user identity, authentication, authorization, role-based access control, and encryption capabilities to protect apps from cybersecurity threats.
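
As one illustration of the model experimentation capability, the sketch below uses the NGINX Ingress Controller VirtualServer resource to split traffic between two model versions. The hostname, service names, and weights are hypothetical placeholders, not values from this post.

apiVersion: k8s.nginx.org/v1
kind: VirtualServer
metadata:
  name: model-api
spec:
  host: model.example.com              # hypothetical hostname
  upstreams:
    - name: model-v1
      service: model-v1-svc            # current model version
      port: 80
    - name: model-v2
      service: model-v2-svc            # candidate model version
      port: 80
  routes:
    - path: /
      splits:
        - weight: 90                   # 90% of requests to the current version
          action:
            pass: model-v1
        - weight: 10                   # 10% canary traffic to the candidate
          action:
            pass: model-v2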

NGINX Connectivity Stack for Kubernetes includes NGINX Ingress Controller and F5 NGINX App Protect to provide fast, reliable, and secure communications between Kubernetes clusters running AI/ML applications and their users – on-premises and in the cloud. It helps simplify and streamline model serving, experimentation, monitoring, and security across any Kubernetes environment, enhancing the capabilities of cloud provider and pre-packaged Kubernetes offerings with a higher degree of protection, availability, and observability at scale.

Get Started with NGINX Connectivity Stack for Kubernetes

NGINX offers a comprehensive set of tools and building blocks to meet your needs and enhance security, scalability, and visibility of your Kubernetes platform.

You can get started today by requesting a free 30-day trial of Connectivity Stack for Kubernetes.

Dynamic A/B Kubernetes Multi-Cluster Load Balancing and Security Controls with NGINX Plus
https://www.nginx.com/blog/dynamic-a-b-kubernetes-multi-cluster-load-balancing-and-security-controls-with-nginx-plus/ (Thu, 15 Feb 2024 16:00:56 +0000)

You’re a modern Platform Ops or DevOps engineer. You use a library of open source (and maybe some commercial) tools to test, deploy, and manage new apps and containers for your Dev team. You’ve chosen Kubernetes to run these containers and pods in development, test, staging, and production environments. You’ve bought into the architectures and concepts of microservices and, for the most part, it works pretty well. However, you’ve encountered a few speed bumps along this journey.

For instance, as you build and roll out new clusters, services, and applications, how do you easily integrate or migrate these new resources into production without dropping any traffic? Traditional networking appliances require reloads or reboots when implementing configuration changes to DNS records, load balancers, firewalls, and proxies. These adjustments are not reconfigurable without causing downtime because a “service outage” or “maintenance window” is required to update DNS, load balancer, and firewall rules. More often than not, you have to submit a dreaded service ticket and wait for another team to approve and make the changes.

Maintenance windows can drive your team into a ditch, stall application delivery, and make you declare, “There must be a better way to manage traffic!” So, let’s explore a solution that gets you back in the fast lane.

Active-Active Multi-Cluster Load Balancing

If you have multiple Kubernetes clusters, it’s ideal to route traffic to both clusters at the same time. An even better option is to perform A/B, canary, or blue-green traffic splitting and send a small percentage of your traffic as a test. To do this, you can use NGINX Plus with ngx_http_split_clients_module.

K8s with NGINX Plus diagram

The HTTP Split Clients module is part of NGINX Open Source and distributes requests between upstreams according to a ratio derived from a key. In this use case, the clusters are the “upstreams” of NGINX, so as client requests arrive, the traffic is split between the two clusters. The key can be any available NGINX client $variable; to control the split for every request, use the $request_id variable, a unique number that NGINX assigns to each incoming request.

To configure the split ratios, determine which percentages you’d like to go to each cluster. In this example, we use K8s Cluster1 as a “large cluster” for production and Cluster2 as a “small cluster” for pre-production testing. If you had a small cluster for staging, you could use a 90:10 ratio and test 10% of your traffic on the small cluster to ensure everything is working before you roll out new changes to the large cluster. If that sounds too risky, you can change the ratio to 95:5. Truthfully, you can pick any ratio you’d like from 0 to 100%.

For most real-time production traffic, you likely want a 50:50 ratio where your two clusters are of equal size. But you can easily provide other ratios, based on the cluster size or other details. You can easily set the ratio to 0:100 (or 100:0) and upgrade, patch, repair, or even replace an entire cluster with no downtime. Let NGINX split_clients route the requests to the live cluster while you address issues on the other.


# NGINX Multi-Cluster Load Balancing
# HTTP Split Clients Configuration for Cluster1:Cluster2 ratios
# Provide 100, 99, 50, 1, 0% ratios  (add/change as needed)
# Based on
# https://www.nginx.com/blog/dynamic-a-b-testing-with-nginx-plus/
# Chris Akker – Jan 2024
#
 
split_clients $request_id $split100 {
   * cluster1-cafe;                     # All traffic to cluster1
   } 

split_clients $request_id $split99 {
   99% cluster1-cafe;                   # 99% cluster1, 1% cluster2
   * cluster2-cafe;
   } 
 
split_clients $request_id $split50 { 
   50% cluster1-cafe;                   # 50% cluster1, 50% cluster2
   * cluster2-cafe;
   }
    
split_clients $request_id $split1 { 
   1.0% cluster1-cafe;                  # 1% to cluster1, 99% to cluster2
   * cluster2-cafe;
   }

split_clients $request_id $split0 { 
   * cluster2-cafe;                     # All traffic to cluster2
   }
 
# Choose which cluster upstream based on the ratio
 
map $split_level $upstream { 
   100 $split100; 
   99 $split99; 
   50 $split50; 
   1.0 $split1; 
   0 $split0;
   default $split50;
}

You can add or edit the configuration above to match the ratios that you need (e.g., 90:10, 80:20, 60:40, and so on).

Note: NGINX also has a Split Clients module for TCP connections in the stream context, which can be used for non-HTTP traffic. This splits the traffic based on new TCP connections, instead of HTTP requests.
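
For reference, a minimal stream-context sketch might look like the following. The listen port and NodePort values are hypothetical; the worker node IPs reuse the ones shown later in this post.

stream {
    # One upstream per cluster (hypothetical NodePort 30600 on each worker)
    upstream cluster1-cafe-tcp {
        server 10.1.1.8:30600;
    }
    upstream cluster2-cafe-tcp {
        server 10.1.1.10:30600;
    }

    # Split on the client address -- $request_id is not available for TCP connections
    split_clients $remote_addr $tcp_split {
        90%    cluster1-cafe-tcp;      # 90% of new connections to cluster1
        *      cluster2-cafe-tcp;      # remainder to cluster2
    }

    server {
        listen 5432;                   # hypothetical TCP service port
        proxy_pass $tcp_split;
    }
}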

NGINX Plus Key-Value Store

The next feature you can use is the NGINX Plus key-value store. This is a key-value object in an NGINX shared memory zone that can be used for many different data storage use cases. Here, we use it to store the split ratio value mentioned in the section above. NGINX Plus allows you to change any key-value record without reloading NGINX. This enables you to change this split value with an API call, creating the dynamic split function.

Based on our example, it looks like this:

{"cafe.example.com":90}

This KeyVal record reads:

  • Key – the “cafe.example.com” hostname
  • Value – “90” for the split ratio

Instead of hard-coding the split ratio in the NGINX configuration files, you can instead use the key-value memory. This eliminates the NGINX reload required to change a static split value in NGINX.

In this example, NGINX is configured to use 90:10 for the split ratio with the large Cluster1 for the 90% and the small Cluster2 for the remaining 10%. Because this is a key-value record, you can change this ratio using the NGINX Plus API dynamically with no configuration reloads! The Split Clients module will use this new ratio value as soon as you change it, on the very next request.

Create the KV record, starting with a 50:50 ratio:

Add a new record to the KeyValue store, by sending an API command to NGINX Plus:

curl -iX POST -d '{"cafe.example.com":50}' http://nginxlb:9000/api/8/http/keyvals/split

Change the KV record to a 90:10 ratio:

Change the KeyVal Split Ratio to 90, using an HTTP PATCH Method to update the KeyVal record in memory:

curl -iX PATCH -d '{"cafe.example.com":90}' http://nginxlb:9000/api/8/http/keyvals/split

Next, once the pre-production testing team verifies the new application code is ready, you deploy it to the large Cluster1 and change the ratio to 100%. This immediately sends all of the traffic to Cluster1, and your new application is “live” without any disruption to traffic – no service outages, maintenance windows, reboots, reloads, or piles of tickets. It only takes one API call to change this split ratio at the time of your choosing.
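
For example, promoting all traffic to Cluster1 reuses the same PATCH pattern shown above:

curl -iX PATCH -d '{"cafe.example.com":100}' http://nginxlb:9000/api/8/http/keyvals/split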

Of course, being that easy to move from 90% to 100% means you have an easy way to change the ratio from 100:0 to 50:50 (or even 0:100). So, you can have a hot backup cluster or scale your clusters horizontally with new resources. At full throttle, you can even build a completely new cluster with the latest software, hardware, and patches – deploying the application and migrating the traffic over a period of time without dropping a single connection!

Use Cases

Using the HTTP Split Clients module with the dynamic key-value store can deliver the following use cases:

  • Active-active load balancing – For load balancing to multiple clusters.
  • Active-passive load balancing – For load balancing to primary, backup, and DR clusters and applications.
  • A/B, blue-green, and canary testing – Used with new Kubernetes applications.
  • Horizontal cluster scaling – Adds more cluster resources and changes the ratio when you’re ready.
  • Hitless cluster upgrades – Ability to use one cluster while you upgrade, patch, or repair the other cluster.
  • Instant failover – If one cluster has a serious issue, you can change the ratio to use your other cluster.

Configuration Examples

Here is an example of the key-value configuration:

# Define Key Value store, backup state file, timeout, and enable sync
 
keyval_zone zone=split:1m state=/var/lib/nginx/state/split.keyval timeout=365d sync;

keyval $host $split_level zone=split;

And this is an example of the cafe.example.com application configuration:

# Define server and location blocks for cafe.example.com, with TLS

server {
   listen 443 ssl;
   server_name cafe.example.com;

   status_zone https://cafe.example.com;

   ssl_certificate /etc/ssl/nginx/cafe.example.com.crt;
   ssl_certificate_key /etc/ssl/nginx/cafe.example.com.key;

   location / {
      status_zone /;

      proxy_set_header Host $host;
      proxy_http_version 1.1;
      proxy_set_header "Connection" "";
      proxy_pass https://$upstream;   # traffic split to upstream blocks
   }
}

# Define 2 upstream blocks – one for each cluster
# Servers managed dynamically by NLK, state file backup

# Cluster1 upstreams
 
upstream cluster1-cafe {
   zone cluster1-cafe 256k;
   least_time last_byte;
   keepalive 16;
   #servers managed by NLK Controller
   state /var/lib/nginx/state/cluster1-cafe.state; 
}
 
# Cluster2 upstreams
 
upstream cluster2-cafe {
   zone cluster2-cafe 256k;
   least_time last_byte;
   keepalive 16;
   #servers managed by NLK Controller
   state /var/lib/nginx/state/cluster2-cafe.state; 
}

The upstream server IP:ports are managed by NGINX Loadbalancer for Kubernetes, a new controller that also uses the NGINX Plus API to configure NGINX Plus dynamically. Details are in the next section.

Let’s take a look at the HTTP split traffic over time with Grafana, a popular monitoring and visualization tool. You use the NGINX Prometheus Exporter (based on njs) to export all of your NGINX Plus metrics, which are then collected by Prometheus and graphed by Grafana. Details for configuring Prometheus and Grafana can be found here.
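
As a sketch, a minimal Prometheus scrape configuration for this setup might look like the following; the target address and port are assumptions based on a typical exporter deployment and are not taken from this post.

# prometheus.yml (sketch)
scrape_configs:
  - job_name: nginx-plus
    static_configs:
      - targets: ["nginxlb:9113"]   # hypothetical exporter address and port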

There are four upstream servers in the graph: two for Cluster1 and two for Cluster2. We use an HTTP load generation tool to create HTTP requests and send them to NGINX Plus.

In the three graphs below, you can see the split ratio is at 50:50 at the beginning of the graph.

LB Upstream Requests diagram

Then, the ratio changes to 10:90 at 12:56:30.

LB Upstream Requests diagram

Then it changes to 90:10 at 13:00:00.

LB Upstream Requests diagram

You can find working configurations of Prometheus and Grafana on the NGINX Loadbalancer for Kubernetes GitHub repository.

Dynamic HTTP Upstreams: NGINX Loadbalancer for Kubernetes

You can change the static NGINX Upstream configuration to dynamic cluster upstreams using the NGINX Plus API and the NGINX Loadbalancer for Kubernetes controller. This free project is a Kubernetes controller that watches NGINX Ingress Controller and automatically updates an external NGINX Plus instance configured for TCP/HTTP load balancing. It’s very straightforward in design and simple to install and operate. With this solution in place, you can implement TCP/HTTP load balancing in Kubernetes environments, ensuring new apps and services are immediately detected and available for traffic – with no reload required.

Architecture and Flow

NGINX Loadbalancer for Kubernetes sits inside a Kubernetes cluster. It is registered with Kubernetes to watch the NGINX Ingress Controller (nginx-ingress) Service. When there is a change to the Ingress controller(s), NGINX Loadbalancer for Kubernetes collects the worker node IPs and the NodePort TCP port numbers, then sends the IP:port pairs to NGINX Plus via the NGINX Plus API.
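
For orientation, here is a minimal sketch of the kind of nginx-ingress Service being watched. The namespace, selector, and port values are assumptions; the exact spec depends on how your NGINX Ingress Controller is installed.

apiVersion: v1
kind: Service
metadata:
  name: nginx-ingress
  namespace: nginx-ingress             # assumed namespace
spec:
  type: LoadBalancer
  ports:
    - name: https
      port: 443
      targetPort: 443
      protocol: TCP
      # Kubernetes assigns a NodePort for this port; NGINX Loadbalancer for
      # Kubernetes reports the worker IP:NodePort pairs to NGINX Plus.
  selector:
    app: nginx-ingress                 # assumed pod label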

The NGINX upstream servers are updated with no reload required, and NGINX Plus load balances traffic to the correct upstream servers and Kubernetes NodePorts. Additional NGINX Plus instances can be added to achieve high availability.

Diagram of NGINX Loadbalancer in action

A Snapshot of NGINX Loadbalancer for Kubernetes in Action

In the screenshot below, there are two windows that demonstrate NGINX Loadbalancer for Kubernetes deployed and doing its job:

  1. Service Type – LoadBalancer for nginx-ingress
  2. External IP – Connects to the NGINX Plus servers
  3. Ports – NodePort maps to 443:30158 with matching NGINX upstream servers (as shown in the NGINX Plus real-time dashboard)
  4. Logs – Indicates NGINX Loadbalancer for Kubernetes is successfully sending data to NGINX Plus

NGINX Plus window

Note: In this example, the Kubernetes worker nodes are 10.1.1.8 and 10.1.1.10.

Adding NGINX Plus Security Features

As more and more applications running in Kubernetes are exposed to the open internet, security becomes necessary. Fortunately, NGINX Plus has enterprise-class security features that can be used to create a layered, defense-in-depth architecture.

With NGINX Plus in front of your clusters and performing the split_clients function, why not leverage that presence and add some beneficial security features? Here are a few of the NGINX Plus features that could be used to enhance security, with links and references to other documentation that can be used to configure, test, and deploy them.

Get Started Today

If you’re frustrated with networking challenges at the edge of your Kubernetes cluster, consider trying out this NGINX multi-cluster solution. Take the NGINX Loadbalancer for Kubernetes software for a test drive and let us know what you think. The source code is open source (under the Apache 2.0 license) and all installation instructions are available on GitHub.

To provide feedback, drop us a comment in the repo or message us in the NGINX Community Slack.

Updating NGINX for the Vulnerabilities in the HTTP/3 Module
https://www.nginx.com/blog/updating-nginx-for-the-vulnerabilities-in-the-http-3-module/ (Wed, 14 Feb 2024 15:15:40 +0000)

Today, we are releasing updates to NGINX Plus, NGINX Open Source, and NGINX Open Source subscription in response to the internally discovered vulnerabilities in the HTTP/3 module ngx_http_v3_module. These vulnerabilities were discovered based on two bug reports in NGINX Open Source (trac #2585 and trac #2586). Note that this module is not enabled by default and is documented as experimental.

The vulnerabilities have been registered in the Common Vulnerabilities and Exposures (CVE) database and the F5 Security Incident Response Team (F5 SIRT) has assigned scores to them using the Common Vulnerability Scoring System (CVSS v3.1) scale.

The following vulnerabilities in the HTTP/3 module apply to NGINX Plus, NGINX Open Source subscription, and NGINX Open Source.

CVE-2024-24989: The patch for this vulnerability is included in following software versions:

  • NGINX Plus R31 P1
  • NGINX Open Source subscription R6 P1
  • NGINX Open Source mainline version 1.25.4. (The latest NGINX Open Source stable version 1.24.0 is not affected.)

CVE-2024-24990: The patch for this vulnerability is included in following software versions:

  • NGINX Plus R30 P2
  • NGINX Plus R31 P1
  • NGINX Open Source subscription R5 P2
  • NGINX Open Source subscription R6 P1
  • NGINX Open Source mainline version 1.25.4. (The latest NGINX Open Source stable version 1.24.0 is not affected.)

You are impacted if you are running NGINX Plus R30 or R31, NGINX Open Source subscription packages R5 or R6, or NGINX Open Source mainline version 1.25.3 or earlier. We strongly recommend that you upgrade your NGINX software to the latest version.
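
As a rough guide, the upgrade is typically a standard package update, assuming the official NGINX Plus repository is already configured on the host (see the Admin Guide linked below for your platform):

# Debian / Ubuntu
sudo apt update && sudo apt install -y nginx-plus

# RHEL / CentOS / Oracle Linux
sudo yum upgrade -y nginx-plus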

For NGINX Plus upgrade instructions, see Upgrading NGINX Plus in the NGINX Plus Admin Guide. NGINX Plus customers can also contact our support team for assistance at https://my.f5.com/.

NGINX’s Continued Commitment to Securing Users in Action
https://www.nginx.com/blog/nginx-continued-commitment-to-securing-users-in-action/ (Wed, 14 Feb 2024 15:15:34 +0000)

F5 NGINX is committed to a secure software lifecycle, including design, development, and testing optimized to find security concerns before release. While we prioritize threat modeling, secure coding, training, and testing, vulnerabilities do occasionally occur.

Last month, a member of the NGINX Open Source community reported two bugs in the HTTP/3 module that caused a crash in NGINX Open Source. We determined that a bad actor could cause a denial-of-service attack on NGINX instances by sending specially crafted HTTP/3 requests. For this reason, NGINX just announced two vulnerabilities: CVE-2024-24989 and CVE-2024-24990.

The vulnerabilities have been registered in the Common Vulnerabilities and Exposures (CVE) database, and the F5 Security Incident Response Team (F5 SIRT) has assigned them scores using the Common Vulnerability Scoring System (CVSS v3.1) scale.

Upon release, the QUIC and HTTP/3 features in NGINX were considered experimental. Historically, we did not issue CVEs for experimental features and instead would patch the relevant code and release it as part of a standard release. For commercial customers of NGINX Plus, the previous two versions would be patched and released to customers. We felt that not issuing a similar patch for NGINX Open Source would be a disservice to our community. Additionally, quietly fixing the issue in the open source branch would have exposed the vulnerability publicly while leaving users without a patched binary.

Our decision to release a patch for both NGINX Open Source and NGINX Plus is rooted in doing what is right – to deliver highly secure software for our customers and community. Furthermore, we’re making a commitment to document and release a clear policy for how future security vulnerabilities will be addressed in a timely and transparent manner.

Meetup Recap: NGINX’s Commitments to the Open Source Community
https://www.nginx.com/blog/meetup-recap-nginxs-commitments-to-the-open-source-community/ (Wed, 14 Feb 2024 01:13:11 +0000)

Last week, we hosted the NGINX community’s first San Jose, California meetup since the outbreak of COVID-19. It was great to see our Bay Area open source community in person and hear from our presenters.

After an introduction by F5 NGINX General Manager Shawn Wormke, NGINX CTO and Co-Founder Maxim Konovalov detailed NGINX’s history – from the project’s “dark ages” through recent events. Building on that history, we looked to the future. Specifically, Principal Engineer Oscar Spencer and Principal Technical Product Manager Timo Stark covered the exciting new technology WebAssembly and how it can be used to solve complex problems securely and efficiently. Timo also gave us an overview of NGINX JavaScript (njs), breaking down its architecture and demonstrating ways it can solve many of today’s intricate application scenarios.

Above all, the highlight of the meetup was our renewed, shared set of commitments to NGINX’s open source community.

Our goal at NGINX is to continue to be an open source standard, similar to OpenSSL and Linux. Our open source projects are sponsored by F5 and, up until now, have been largely supported by paid employees of F5 with limited contributions from the community. While this has served our projects well, we believe that long-term success hinges on engaging a much larger and diverse community of contributors. Growing our open source community ensures that the best ideas are driving innovation, as we strive to solve complex problems with modern applications.

To achieve this goal, we are making the following commitments that will guarantee the longevity, transparency, and impact of our open source projects:

  • We will be open, consistent, transparent, and fair in our acceptance of contributions.
  • We will continue to enhance and open source new projects that move technology forward.
  • We will continue to offer projects under OSI-approved software licenses.
  • We will not remove and commercialize existing projects or features.
  • We will not impose limits on the use of our projects.

With these commitments, we hope that our projects will gain more community contributions, eventually leading to maintainers and core members outside of F5.

However, these commitments do present a pivotal change to our ways of working. For many of our projects that have a small number of contributors, this change will be straightforward. For our flagship NGINX proxy, with its long history and track record of excellence, these changes will take some careful planning. We want to be sensitive to this by ensuring plenty of notice to our community, so they may adopt and adjust to these changes with little to no disruption.

We are very excited about these commitments and their positive impact on our community. We’re also looking forward to opportunities for more meetups in the future! In the meantime, stay tuned for additional information and detailed timelines on this transition at nginx.org.

Tutorial: Configure OpenTelemetry for Your Applications Using NGINX
https://www.nginx.com/blog/tutorial-configure-opentelemetry-for-your-applications-using-nginx/ (Thu, 18 Jan 2024 18:23:47 +0000)

If you’re looking for a tool to trace web applications and infrastructure more effectively, OpenTelemetry might be just what you need. By instrumenting your NGINX server with the existing OpenTelemetry NGINX community module you can collect metrics, traces, and logs and gain better visibility into the health of your server. This, in turn, enables you to troubleshoot issues and optimize your web applications for better performance. However, this existing community module can also slow down your server’s response times due to the performance overhead it requires for tracing. This process can also consume additional resources, increasing CPU and memory usage. Furthermore, setting up and configuring the module can be a hassle.

NGINX has recently developed a native OpenTelemetry module, ngx_otel_module, which revolutionizes the tracing of request processing performance. The module utilizes telemetry calls to monitor application requests and responses, enabling enhanced tracking capabilities. The module can be conveniently set up and configured within the NGINX configuration files, making it highly user-friendly. This new module caters to the needs of both NGINX OSS and NGINX Plus users. It supports W3C context propagation and OTLP/gRPC export protocol, rendering it a comprehensive solution for optimizing performance.

The NGINX-native OpenTelemetry module is a dynamic module and doesn’t require any additional packaging with NGINX Plus. It works alongside other NGINX Plus features, such as the API and key-value store modules, to provide a complete solution for monitoring and optimizing the performance of your NGINX Plus instance. By using ngx_otel_module, you can gain valuable insights into your web application’s performance and take steps to improve it. We highly recommend exploring ngx_otel_module to discover how it can help you achieve better results.

Note: You can head over to our GitHub page for detailed instructions on how to install ngx_otel_module and get started.

Tutorial Overview

In this blog, you can follow a step-by-step guide on configuring OpenTelemetry in NGINX Plus and using the Jaeger tool to collect and visualize traces. OpenTelemetry is a powerful tool that offers a comprehensive view of a request’s path, including valuable information such as latency, request details, and response data. This can be incredibly useful in optimizing performance and identifying potential issues. To simplify things, we have set up the OpenTelemetry module, application, and Jaeger all in one instance, which you can see in the diagram below.

Open Telemetry Module diagram
Figure 1: NGINX OpenTelemetry architecture overview

Follow the steps in these sections to complete the tutorial:

  • Prerequisites
  • Deploy NGINX Plus and Install the OpenTelemetry Module
  • Deploy Jaeger and the echo Application
  • Configure OpenTelemetry in NGINX for Tracing
  • Test the Configuration

Prerequisites

  • A Linux/Unix environment, or any compatible environment
  • A NGINX Plus subscription
  • Basic familiarity with the Linux command line and JavaScript
  • Docker
  • Node.js 19.x or later
  • Curl

Deploy NGINX Plus and Install the OpenTelemetry Module

Selecting an appropriate environment is crucial for successfully deploying an NGINX instance. This tutorial will walk you through deploying NGINX Plus and installing the NGINX dynamic modules.

  1. Install NGINX Plus on a supported operating system.
  2. Install ngx_otel_module. Add the load_module directive to your main NGINX configuration file to activate OpenTelemetry:

     load_module modules/ngx_otel_module.so;

  3. Test and reload NGINX to enable the module:

     nginx -t && nginx -s reload

Deploy Jaeger and the echo Application

There are various options available to view traces. This tutorial uses Jaeger to collect and analyze OpenTelemetry data. Jaeger provides an efficient and user-friendly interface to collect and visualize tracing data. After data collection, you will deploy mendhak/http-https-echo, a simple Docker application that echoes the request attributes back to the client in JSON format.

  1. Use docker-compose to deploy Jaeger and the http-echo application. You can create a docker-compose file by copying the configuration below and saving it in a directory of your choice.

    
    version: '3'
    
    services:
      jaeger:
        image: jaegertracing/all-in-one:1.41
        container_name: jaeger
        ports:
          - "16686:16686"
          - "4317:4317"
          - "4318:4318"
        environment:
          COLLECTOR_OTLP_ENABLED: true
    
      http-echo:
        image: mendhak/http-https-echo
        environment:
            - HTTP_PORT=8888
            - HTTPS_PORT=9999
        ports:
            - "4500:8888" 
            - "8443:9999"
    
  2. To install the Jaeger all-in-one tracing and the http-echo application, run this command:

     docker-compose up -d

  3. Run the docker ps -a command to verify that the containers are running.

    $docker ps -a
    CONTAINER ID   IMAGE                           COMMAND                  CREATED        STATUS
    PORTS                                                                                                                                                                   NAMES
    
    5cb7763439f8   jaegertracing/all-in-one:1.41   "/go/bin/all-in-one-…"   30 hours ago   Up 30 hours   5775/udp, 5778/tcp, 14250/tcp, 0.0.0.0:4317-4318->4317-4318/tcp, :::4317-4318->4317-4318/tcp, 0.0.0.0:16686->16686/tcp, :::16686->16686/tcp, 6831-6832/udp, 14268/tcp   jaeger
    
    e55d9c00a158   mendhak/http-https-echo         "docker-entrypoint.s…"   11 days ago    Up 30 hours   8080/tcp, 8443/tcp, 0.0.0.0:8080->8888/tcp, :::8080->8888/tcp, 0.0.0.0:8443->9999/tcp, :::8443->9999/tcp                                                                ubuntu-http-echo-1
    

    You can now access Jaeger by simply typing in the http://localhost:16686 endpoint in your browser. Note that you might not be able to see any system trace data right away as it is currently being sent to the console. But don’t worry! We can quickly resolve this by exporting the traces in the OpenTelemetry Protocol (OTLP) format. You’ll learn to do this in the next section when we configure NGINX to send the traces to Jaeger.

Configure OpenTelemetry in NGINX for Tracing

This section will show you step-by-step how to set up the OpenTelemetry directive in NGINX Plus using a key-value store. This powerful configuration enables precise monitoring and analysis of traffic, allowing you to optimize your application’s performance. By the end of this section, you will have a solid understanding of utilizing the NGINX OpenTelemetry module to track your application’s performance.

Setting up and configuring telemetry collection is a breeze with NGINX configuration files. With ngx_otel_module, users can access a robust, protocol-aware tracing tool that can help to quickly identify and resolve issues in applications. This module is a valuable addition to your application development and management toolset and will help you enhance the performance of your applications. To learn more about other OpenTelemetry sample configurations, please refer to the ngx_otel_module documentation.

OpenTelemetry Directives and Variables

NGINX has new directives that can help you achieve an even more optimized OpenTelemetry deployment, tailored to your specific needs. These directives were designed to enhance your application’s performance and make it more efficient than ever.

Module Directives:

  1. otel_exporter – Sets the parameters for OpenTelemetry data, including the endpoint, interval, batch size, and batch count. These parameters are crucial for the successful export of data and must be defined accurately.
  2. otel_service_name – Sets the service name attribute for your OpenTelemetry resource to improve organization and tracking.
  3. otel_trace – Enables or disables OpenTelemetry tracing. Tracing can now be toggled by specifying a variable, which offers flexibility in managing your tracing settings.
  4. otel_span_name – The name of the OpenTelemetry span is set as the location name for a request by default. It’s worth noting that the name is customizable and can include variables as required.

Configuration Examples

Here are examples of ways you can configure OpenTelemetry in NGINX using the NGINX Plus key-value store. The NGINX Plus key-value store module enables dynamic configuration of the OpenTelemetry span name and other OpenTelemetry attributes, thereby streamlining the process of tracing and debugging.

This is an example of dynamically enabling OpenTelemetry tracing by using a key-value store:


http {
      keyval_zone zone=name:64k;                   # shared memory zone backing the key-value store
      keyval "otel.trace" $trace_switch zone=name;

      server {
          location / {
              otel_trace $trace_switch;
              otel_trace_context inject;
              proxy_pass http://backend;
          }

          location /api {
              api write=on;
          } 
      }
  }

Next, here’s an example of dynamically disabling OpenTelemetry tracing by using a key-value store:


location /api {
              api write=off;
          } 

Here is an example NGINX OpenTelemetry span attribute configuration:


user  nginx;
worker_processes  auto;
load_module modules/ngx_otel_module.so;
error_log /var/log/nginx/error.log debug;
pid   /var/run/nginx.pid;


events {
    worker_connections  1024;
}

http {
    keyval "otel.span.attr" $trace_attr zone=demo;
    keyval_zone zone=demo:64k  state=/var/lib/nginx/state/demo.keyval;

    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';
    include       mime.types;
    default_type  application/json;
    upstream echo {
        server localhost:4500;
        zone echo 64k;
    }
    otel_service_name nginx;
    otel_exporter {
           endpoint localhost:4317;
        }

    server {
       listen       4000;
       otel_trace on;
       otel_span_name otel;
       location /city {
            proxy_set_header   "Connection" "" ;
            proxy_set_header Host $host;
            otel_span_attr demo $trace_attr;
            otel_trace_context inject;
            proxy_pass http://echo;
       }
       location /api {
           api write=on;
       }
       location = /dashboard.html {
        root /usr/share/nginx/html;
    }
       
  }

}

To save the configuration and reload NGINX, run this command:

nginx -s reload

Lastly, here is how to add a span attribute via the NGINX Plus API:


curl -X POST -d '{"otel.span.attr": "<span attribute name>"}' http://localhost:4000/api/6/http/keyvals/<zone name>

Test the Configuration

Now, you can test your configuration by following the steps below.

  1. To generate the trace data, start by opening your terminal window. Next, type in this command to create the data:

    $ curl -i localhost:4000/city

    The output will look like this:

                          
    HTTP/1.1 200 OK
    Server: nginx/1.25.3
    Date: Wed, 29 Nov 2023 20:25:04 GMT
    Content-Type: application/json; charset=utf-8
    Content-Length: 483
    Connection: keep-alive
    X-Powered-By: Express
    ETag: W/"1e3-2FytbGLEVpb4LkS9Xt+KkoKVW2I"
    
    {
    "path": "/city",
    "headers": {
    "host": "localhost",
    "connection": "close",
    "user-agent": "curl/7.81.0",
    "accept": "*/*",
    "traceparent": "00-66ddaa021b1e36b938b0a05fc31cab4a-182d5a6805fef596-00"
    },
    "method": "GET",
    "body": "",
    "fresh": false,
    "hostname": "localhost",
    "ip": "::ffff:172.18.0.1",
    "ips": [],
    "protocol": "http",
    "query": {},
    "subdomains": [],
    "xhr": false,
    "os": {
    "hostname": "e55d9c00a158"
    },
    "connection": {}
    
  2. Now you want to ensure that the OTLP exporter is functioning correctly and that you can gain access to the trace. Start by opening a browser and accessing the Jaeger UI at http://localhost:16686. Once the page loads, click on the Search button, located in the title bar. From there, select the service that starts with NGINX from the drop-down menu in the Service field. Then select the operation named Otel from the drop-down menu called Operation. To make it easier to identify any issues, click on the Find Traces button to visualize the trace.
     Figure 2: Jaeger dashboard

  3. To access a more detailed and comprehensive analysis of a specific trace, click on one of the individual traces available. This will provide you with valuable insights into the trace you have selected. In the trace below, you can review both the OpenTelemetry directive span attribute and the non-directive attributes of the trace, allowing you to better understand the data at hand.

     Figure 3: Detailed analysis of the OpenTelemetry trace

    Under Tags you can see the following attributes:

    • demo – OTel – OpenTelemetry span attribute name
    • http.status_code – 200 – Indicates a successful request
    • otel.library.name – nginx – OpenTelemetry service name

Conclusion

NGINX now has built-in support for OpenTelemetry, a significant development for tracing requests and responses in complex application environments. This feature streamlines the process and ensures seamless integration, making it much easier for developers to monitor and optimize their applications.

Although the OpenTracing module that was introduced in NGINX Plus R18 is now deprecated and will be removed starting from NGINX Plus R34, it will still be available in all NGINX Plus releases until then. However, it’s recommended to use the OpenTelemetry module, which was introduced in NGINX Plus R29.

If you’re new to NGINX Plus, you can start your 30-day free trial today or contact us to discuss your use cases.

A Quick Guide to Scaling AI/ML Workloads on Kubernetes
https://www.nginx.com/blog/a-quick-guide-to-scaling-ai-ml-workloads-on-kubernetes/ (Thu, 11 Jan 2024 16:00:02 +0000)

When running artificial intelligence (AI) and machine learning (ML) model training and inference on Kubernetes, dynamic scaling up and down becomes a critical element. In addition to requiring high-bandwidth storage and networking to ingest data, AI model training also needs substantial (and expensive) compute, mostly from GPUs or other specialized processors. Even when leveraging pre-trained models, tasks like model serving and fine-tuning in production are still more compute-intensive than most enterprise workloads.

Cloud-native Kubernetes is designed for rapid scalability – up and down. It’s also designed to deliver more agility and cost-efficient resource usage for dynamic workloads across hybrid, multi-cloud environments.

In this blog, we cover the three most common ways to scale AI/ML workloads on Kubernetes so you can achieve optimal performance, cost savings, and adaptability for dynamic scaling in diverse environments.

Three Scaling Modalities for AI/ML Workloads on Kubernetes

The three common ways Kubernetes scales a workload are with the Horizontal Pod Autoscaler (HPA), Vertical Pod Autoscaler (VPA), and Cluster Autoscaler.

Here is a breakdown of those three methods:

  • HPA – The equivalent of adding instances or pod replicas to an application, giving it more scale, capacity, and throughput.
  • VPA – The equivalent of resizing a pod to give it higher capacity with greater compute and memory.
  • Cluster Autoscaler – Automatically increases or decreases the number of nodes in a Kubernetes cluster depending on the current resource demand for the pods.

Each modality has its benefits for model training and inferencing, which you can explore in the use cases below.

HPA Use Cases

In many cases, distributed AI model training and inference workloads can scale horizontally (i.e., adding more pods to speed up the training process or request handling). This enables the workloads to benefit from HPA, which can scale out the number of pods based on metrics like CPU and memory usage, or even custom and external metrics relevant to the workload. In scenarios where the workload varies over time, HPA can dynamically adjust the number of pods to ensure optimal resource utilization.
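
For illustration, a minimal HPA manifest for a hypothetical inference Deployment might look like this; the Deployment name, replica bounds, and CPU target are assumptions.

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: model-inference-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: model-inference            # hypothetical inference Deployment
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70     # scale out when average CPU passes 70%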

Another aspect of horizontally scaling AI workloads in Kubernetes is load balancing. To ensure optimal performance and timely request processing, incoming requests need to be distributed across multiple instances or pods. This is why one of the ideal tools that can be used in conjunction with HPA is an Ingress controller.

Kubernetes with Ingress Controller diagram

VPA Use Cases

AI model training tasks are often resource-intensive, requiring significant CPU, GPU, and memory resources. VPA can adjust these resource allocations dynamically. This helps ensure that each pod has enough resources to efficiently handle the training workload and that all assigned pods have sufficient compute capacity to perform calculations. In addition, memory requirements can fluctuate significantly during the training of large models. VPA can help prevent out-of-memory errors by increasing the memory allocation as needed.
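
As a sketch, a VPA manifest for a hypothetical training Deployment could look like the following; it assumes the VPA components are installed in the cluster, and the names and resource bounds are placeholders.

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: model-training-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: model-training             # hypothetical training Deployment
  updatePolicy:
    updateMode: "Auto"               # VPA applies recommendations by recreating pods
  resourcePolicy:
    containerPolicies:
      - containerName: "*"
        minAllowed:
          cpu: "1"
          memory: 2Gi
        maxAllowed:
          cpu: "8"
          memory: 32Gi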

While it’s technically possible to use both HPA and VPA together, it requires careful configuration to avoid conflicts, as they might try to scale the same workload in different ways (i.e., horizontally versus vertically). It’s essential to clearly define the boundaries for each autoscaler, ensuring they complement rather than conflict with each other. An emerging approach is to use both with different scopes – for instance, HPA for scaling across multiple pods based on workload and VPA for fine-tuning the resource allocation of each pod within the limits set by HPA.

Cluster Autoscaler Use Cases

Cluster Autoscaler can help dynamically adjust the overall pool of compute, storage, and networking infrastructure resources available cluster-wide to meet the demands of AI/ML workloads. By adjusting the number of nodes in a cluster based on current demand, an organization can balance load at the macro level. This is necessary to ensure optimal performance, as AI/ML workloads can demand significant computational resources unpredictably.

HPA, VPA, and Cluster Autoscaler Each Have a Role

In summary, these are the three ways that Kubernetes autoscaling works and benefits AI workloads:

  • HPA scales AI model serving endpoints that need to handle varying request rates.
  • VPA optimizes resource allocation for AI/ML workloads and ensures each pod has enough resources for efficient processing without over-provisioning.
  • Cluster Autoscaler adds nodes to a cluster to ensure it can accommodate resource-intensive AI jobs or removes nodes when the compute demands are low.

HPA, VPA, and Cluster Autoscaler complement each other in managing AI/ML workloads in Kubernetes. Cluster Autoscaler ensures there are enough nodes to meet workload demands, HPA efficiently distributes workloads across multiple pods, and VPA optimizes the resource allocation of these pods. Together, they provide a comprehensive scaling and resource management solution for AI/ML applications in Kubernetes environments.

Visit our Power and Protect Your AI Journey page to learn more on how F5 and NGINX can help deliver, secure, and optimize your AI/ML workloads.

Managing Your NGINX Configurations with GitHub
https://www.nginx.com/blog/managing-your-nginx-configurations-with-github/ (Tue, 05 Dec 2023 23:45:01 +0000)

Software settings, options, and configurations exist in many different forms. At one end of the spectrum are the slick, responsive GUIs meant to be intuitive while providing guardrails against invalid states. On the other end: text files. While text files are lauded by both engineers and DevOps teams for their clarity, automation potential, and minimal usage requirements (you likely have a few open terminal windows, right?), there are clear trade-offs to configuring software with them. For example, try to find a Linux user who hasn’t managed to crash a software package by misconfiguring a text file.

As the only all-in-one software proxy, load balancer, web server, and API gateway, NGINX is a key component of modern internet infrastructure – an infrastructure that is, in most cases, based on operating systems underpinned by the Linux kernel. To conform to this ecosystem and the professionals supporting it, NGINX relies heavily on text-based configurations.

The F5 NGINX Instance Manager module has long been the go-to for NGINX-related config orchestration. It provides advanced capabilities for remote, batch configuration management through its intuitive user interface and API, with supporting documentation and guardrails to boot. However, individual NGINX configuration files strongly resemble code and software teams already have an amazing tool for managing code: Git. Git provides developers and operations teams a slew of features geared towards managing text file workflows.

Instance Manager’s new integration with Git and other external systems enables features like version control, decentralized contribution, approval workflows, and team coordination. Platforms like GitHub and GitLab extend this feature set to continuous integration/continuous deployment (CI/CD), collaboration, issue tracking, and other valuable functions through a web-based user interface.

In this blog post we illustrate how GitHub can be used to manage NGINX configurations, automatically pushing them to instances whenever a change is made.

Instance Manager API

Instance Manager provides a rich set of REST APIs that complement its web user interface. A key benefit to the API is its ability to update configuration files of data plane instances under management. Recently, we extended this functionality by enabling the ability to tie configuration updates to a specific version of the file. When managed in a Git repository, configurations can be tagged with a Git commit hash. Additionally, we implemented a new state within Instance Manager for externally managed configs that warns would-be file editors that configurations are under external management.

GitHub Actions

GitHub allows developers to create custom deployment pipelines in their repositories using a feature called Actions. A user may choose to define their own actions or invoke existing scripts via a YAML definition. These pipelines can be triggered in a variety of ways, like when a repository is updated with a commit or a pull request merge.

In this blog’s example, we lean on out-of-the-box GitHub Actions and Linux commands. You will learn to use them to update GitHub-managed NGINX configuration files on your NGINX instances via the Instance Manager API. To start, follow these steps on GitHub Docs to create a new YAML for running Actions in your repository.

Setting Actions Secrets

Instance Manager supports various forms of authentication. In the example, we use the Basic Authentication method, though we recommend provisioning OIDC authentication in production environments.

Rather than storing Instance Manager credentials in the repository, GitHub allows you to define secrets as repository variables. These variables are accessible by the Actions environment but hidden in logs. Follow these steps to store Instance Manager username and password keys as secrets so you can use them to authenticate your API calls.

Here, we’ve named these secrets NMS_USERNAME and NMS_PASSWORD.

Screenshot of a tech platform displaying repository secrets options

Setting Actions Variables

Similarly, rather than defining constant variables in your YAML, it can be helpful to factor them out for management in the GitHub user interface. On the Variables page, you can find instructions on how to define variables that span all repository Actions. In the example, we use this opportunity to define the Instance Manager FQDN or IP (NMS_HOSTNAME), identifier of the system NGINX is running on (SYSTEM_ID), and identifier of the specific NGINX instance to be updated (INSTANCE_ID).

Screenshot of a tech platform displaying repository variables options

Note: We’ve set these variables to simplify our example, but you may choose to manage Instance Manager, System, and NGINX identifying information in other ways. For instance, you may opt to create directories in your repository containing configurations specific to different instances and name these directories with system or instance IDs. You could then modify your YAML or Action script to read directory names and update configuration files on the corresponding instance.

Anatomy of the Instance Manager REST API Call for Updating NGINX Configurations

The Instance Manager configuration update REST API call requires several key components to work. Your YAML will need to define each of these parameters and package them into the API call in the proper format.

In the example, we use the API call for updating a single instance. However, it’s also possible to configure an instance group within Instance Manager. Doing so enables you to update all instances in a group whenever a new configuration is pushed from GitHub. For more information, please see our How-To Guide on publishing configurations.

Below is a breakdown of Instance Manager’s configuration update REST API call:


'https://{INSTANCE MANAGER HOSTNAME}/api/platform/v1/systems/{SYSTEM ID}/instances/{INSTANCE ID}/config'
--header "accept: application/json"
--header "Authorization: Basic {LOGIN CREDENTIALS}"
--header 'Content-Type: application/json'
--data '{
    "configFiles": {
        "rootDir": "/etc/nginx",
        "files": [{
            "contents": "{NGINX CONFIGURATION FILE}",
            "name": "{PATH TO CONFIGURATION FILE ON SYSTEM}"
        }]
    },
    "externalIdType": "{SOURCE OF CONFIGS}",
    "externalId": "{COMMIT HASH}",
    "updateTime": "{TIMESTAMP}"
}'

URI Parameters

  • System identifier – A unique key identifying the system that NGINX is running on. In our example, this data is defined in the GitHub Actions Variables interface.
  • NGINX instance identifier – A unique key identifying a specific NGINX instance on the system. In our example, this data is defined in the GitHub Actions Variables interface.

Header Parameters

  • Authorization – The authorization method and Base64 encoded credentials that Instance Manager uses to authenticate the API call. In the example, you will use Basic Authorization with credentials data coming from the secrets set in the repository.

JSON Data Parameters

  • configFiles – The Base64 encoded contents of configuration files being updated, along with their locations on the system running NGINX. You will also need to provide a path to the root directory of your NGINX configuration files, most commonly configured as: /etc/nginx.
  • externalIdType – A label that identifies the source of the configuration file to Instance Manager. Possible values are git or other. In our example, we hardcode this parameter to git.
  • externalId – The Git commit Secure Hash Algorithm (SHA) that identifies the change to the configuration files.
  • updateTime – A string containing the current date and time in %Y-%m-%dT%H:%M:%SZ format.

Base64 Encoding

To accommodate Instance Manager’s API specification, you must transform certain data by encoding it into Base64 format. While there isn’t a native way to accomplish this with existing GitHub Actions, we can rely on Linux tools, accessible from our YAML.

Instance Manager Credentials

Start by referencing the Instance Manager login credentials that were defined earlier as secrets and concatenating them. Then, convert the string to Base64, echo it as a variable (NMS_LOGIN), and append the result to the workflow’s environment file (referenced by the predefined GITHUB_ENV variable) so it is accessible to subsequent steps run by the Actions runner.

run: echo "NMS_LOGIN=`echo -n "${{ secrets.NMS_USERNAME }}:${{ secrets.NMS_PASSWORD }}" | base64`" >> $GITHUB_ENV
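
If the API later rejects your credentials, it can help to reproduce the encoding locally and confirm there is no trailing newline (the -n flag matters). For example, with the placeholder credentials admin:secret:

echo -n "admin:secret" | base64
# YWRtaW46c2VjcmV0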

Timestamp

The Instance Manager API requires that a specifically formatted timestamp be sent with certain API payloads. You can construct the timestamp in that format using the Linux date command. As in the previous example, append the constructed string (NMS_TIMESTAMP) to the environment file.

run: echo "NMS_TIMESTAMP=`date -u +"%Y-%m-%dT%H:%M:%SZ"`" >> $GITHUB_ENV

NGINX Configurations

Next, add the NGINX configurations that you plan to manage into the repository. There are many ways of adding files to a GitHub repository. For more information, follow this guide in GitHub’s documentation. To follow our example, you can create a directory structure in your GitHub repository that mirrors that of the instance.
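
For instance, if the instance serves its configuration from /etc/nginx, the repository might use a layout like the hypothetical one below. The directory name nginx-server and the file set are illustrative; they match the paths referenced by the workflow snippets in this example.

tree nginx-server
# nginx-server
# └── etc
#     └── nginx
#         ├── conf.d
#         │   ├── default.conf
#         │   └── ssl.conf
#         ├── mime.types
#         └── nginx.conf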

The YAML entry below reads the configuration file from your repository, encodes its contents to Base64 and adds the result to an environment variable, as before.

run: echo "NGINX_CONF_CONFIG_FILE=`cat nginx-server/etc/nginx/nginx.conf | base64 -w 0`" >> $GITHUB_ENV

In our example, we repeat this for every configuration file in our GitHub repository.

Putting It All Together

Finally, you can use GitHub’s sample reference implementation to piece together what you’ve learned into a working YAML file. As defined in the file, all associated GitHub Actions scripts will run whenever a user updates the repository through a commit or pull request. The final entry in the YAML will run a curl command that will make the appropriate API call, containing the necessary data for Instance Manager to update all the related configuration files.

Note: Use the multi-line run entry (run: |) in your YAML for the curl command. The literal block scalar tells the YAML parser to treat the entire command, including the colons “:” in its parameters, as plain text.


name: Managing NGINX configs with GitHub and GitHub Actions 
# Controls when the workflow will run 
on: 
  # Triggers the workflow on push or pull request events but only for the "main" branch 
  push: 
    branches: [ "main" ] 
  pull_request: 
    branches: [ "main" ] 
  
  # Allows you to run this workflow manually from the Actions tab 
  workflow_dispatch: 

# A workflow run is made up of one or more jobs that can run sequentially or in parallel 
jobs: 
  # This workflow contains a single job called "build" 
  build: 
    # The type of runner that the job will run on 
    runs-on: ubuntu-latest 

    # Steps represent a sequence of tasks that will be executed as part of the job 
    steps: 
      # Checks-out your repository under $GITHUB_WORKSPACE, so your job can access it 
      - uses: actions/checkout@v4 

      - name: Set environment variable for NMS API login credentials 
        run: echo "NMS_LOGIN=`echo -n "${{ secrets.NMS_USERNAME }}:${{ secrets.NMS_PASSWORD }}" | base64`" >> $GITHUB_ENV 

      - name: Set environment variable for NMS API timestamp 
        run: echo "NMS_TIMESTAMP=`date -u +"%Y-%m-%dT%H:%M:%SZ"`" >> $GITHUB_ENV 

      - name: Set environment variable for base64 encoded config file 
        run: echo "NGINX_CONF_CONFIG_FILE=`cat app-sfo-01/etc/nginx/nginx.conf | base64 -w 0`" >> $GITHUB_ENV 

      - name: Set environment variable for base64 encoded config file 
        run: echo "MIME_TYPES_CONFIG_FILE=`cat app-sfo-01/etc/nginx/mime.types | base64 -w 0`" >> $GITHUB_ENV 

      - name: Set environment variable for base64 encoded config file 
        run: echo "DEFAULT_CONF_CONFIG_FILE=`cat app-sfo-01/etc/nginx/conf.d/default.conf | base64 -w 0`" >> $GITHUB_ENV 

      - name: Set environment variable for base64 encoded config file 
        run: echo "SSL_CONF_CONFIG_FILE=`cat app-sfo-01/etc/nginx/conf.d/ssl.conf | base64 -w 0`" >> $GITHUB_ENV 

      - name: API call to Instance Manager 
        run: | 
          curl --location 'https://${{ vars.NMS_HOSTNAME }}/api/platform/v1/systems/${{ vars.SYSTEM_ID }}/instances/${{ vars.INSTANCE_ID }}/config' --header "accept: application/json" --header "Authorization: Basic ${{ env.NMS_LOGIN }}" --header 'Content-Type: application/json' --data '{"configFiles": {"rootDir": "/etc/nginx","files": [{"contents": "${{ env.NGINX_CONF_CONFIG_FILE }}","name": "/etc/nginx/nginx.conf"},{"contents": "${{ env.MIME_TYPES_CONFIG_FILE }}","name": "/etc/nginx/mime.types"},{"contents": "${{ env.DEFAULT_CONF_CONFIG_FILE }}","name": "/etc/nginx/conf.d/default.conf"},{"contents": "${{ env.SSL_CONF_CONFIG_FILE }}","name": "/etc/nginx/conf.d/ssl.conf"}]},"externalIdType": "git","externalId": "${{ github.sha }}","updateTime": "${{ env.NMS_TIMESTAMP }}"}' 

NGINX Reference Implementation

Executing a configuration update API call after a file has changed can be achieved in different ways. While GitHub Actions is the most convenient method for those who use GitHub, it will not work for GitLab or standalone Git implementations. To address these use cases, we’ve developed a companion shell script reference implementation that can be triggered from the command line or invoked from custom scripts.
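
As a rough illustration of what such a script needs to do, the minimal sketch below performs the same configuration-update call outside of GitHub Actions. It is not the NGINX reference implementation; the hostname, identifiers, and file paths are placeholders, and the credentials are read from environment variables.

#!/usr/bin/env bash
# Minimal sketch: push one NGINX config file to Instance Manager from a Git working copy.
set -euo pipefail

NMS_HOSTNAME="nim.example.com"        # Instance Manager FQDN or IP (placeholder)
SYSTEM_ID="<system-id>"               # System identifier (placeholder)
INSTANCE_ID="<instance-id>"           # NGINX instance identifier (placeholder)

NMS_LOGIN=$(echo -n "${NMS_USERNAME}:${NMS_PASSWORD}" | base64)   # Credentials supplied via the environment
NMS_TIMESTAMP=$(date -u +"%Y-%m-%dT%H:%M:%SZ")
COMMIT_SHA=$(git rev-parse HEAD)                                  # Identifies the change, like github.sha
NGINX_CONF=$(base64 -w 0 nginx-server/etc/nginx/nginx.conf)       # Placeholder path in the working copy

curl --location "https://${NMS_HOSTNAME}/api/platform/v1/systems/${SYSTEM_ID}/instances/${INSTANCE_ID}/config" \
  --header "accept: application/json" \
  --header "Authorization: Basic ${NMS_LOGIN}" \
  --header "Content-Type: application/json" \
  --data "{\"configFiles\": {\"rootDir\": \"/etc/nginx\", \"files\": [{\"contents\": \"${NGINX_CONF}\", \"name\": \"/etc/nginx/nginx.conf\"}]}, \"externalIdType\": \"git\", \"externalId\": \"${COMMIT_SHA}\", \"updateTime\": \"${NMS_TIMESTAMP}\"}"

A script along these lines could be invoked from a Git hook or a GitLab CI job to achieve the same effect as the GitHub Actions workflow above.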

In conclusion, our new extension to the Instance Manager API provides a powerful tool for managing configuration updates, rollbacks, and version histories in a modern, decentralized way. Coupling the extension with a third-party text file and code management platform like GitHub enables additional workflow, CI/CD, collaboration, and issue tracking features through an intuitive web-based user interface.

We’d love to hear your thoughts! Give it a try and let us know what you think in the comments or by joining our NGINX Community Slack channel.

The post Managing Your NGINX Configurations with GitHub appeared first on NGINX.

]]>
How NGINX Gateway Fabric Implements Complex Routing Rules https://www.nginx.com/blog/how-nginx-gateway-fabric-implements-complex-routing-rules/ Thu, 02 Nov 2023 15:00:17 +0000 https://www.nginx.com/?p=72768 NGINX Gateway Fabric is an implementation of the Kubernetes Gateway API specification that uses NGINX as the data plane. It handles Gateway API resources such as GatewayClass, Gateway, ReferenceGrant, and HTTPRoute to configure NGINX as an HTTP load balancer that exposes applications running in Kubernetes to outside of the cluster. In this blog post, we [...]

Read More...

The post How NGINX Gateway Fabric Implements Complex Routing Rules appeared first on NGINX.

]]>
NGINX Gateway Fabric is an implementation of the Kubernetes Gateway API specification that uses NGINX as the data plane. It handles Gateway API resources such as GatewayClass, Gateway, ReferenceGrant, and HTTPRoute to configure NGINX as an HTTP load balancer that exposes applications running in Kubernetes to outside of the cluster.

In this blog post, we explore how NGINX Gateway Fabric uses the NGINX JavaScript scripting language (njs) to simplify an implementation of HTTP request matching based on a request’s headers, query parameters, and method.

Before we dive into NGINX JavaScript, let’s go over how NGINX Gateway Fabric configures the data plane.

Configuring NGINX from Gateway API Resources Using Go Templates

To configure the NGINX data plane, we generate configuration files based on the Gateway API resources created in the Kubernetes cluster. These files are generated from Go templates. To generate the files, we process the Gateway API resources, translate them into data structures that represent NGINX configuration, and then execute the NGINX configuration templates by applying them to the NGINX data structures. The NGINX data structures contain fields that map to NGINX directives.

For the majority of cases, this works very well. Most fields in the Gateway API resources can be easily translated into NGINX directives. Take, for example, traffic splitting. In the Gateway API, traffic splitting is configured by listing multiple Services and their weights in the backendRefs field of an HTTPRouteRule.

This configuration snippet splits 50% of the traffic to service-v1 and the other 50% to service-v2:


backendRefs: 
- name: service-v1 
   port: 80 
   weight: 50 
- name: service-v2 
   port: 80 
   weight: 50 

Since traffic splitting is natively supported by the NGINX HTTP split clients module, it is straightforward to convert this to an NGINX configuration using a template.

The generated configuration would look like this:


split_clients $request_id $variant { 
    50% upstream-service-v1; 
    50% upstream-service-v2; 
}  

In cases like traffic splitting, Go templates are simple yet powerful tools that enable you to generate an NGINX configuration that reflects the traffic rules that the user configured through the Gateway API resources.

However, we found that more complex routing rules defined in the Gateway API specification could not easily be mapped to NGINX directives using Go templates, and we needed a higher-level language to evaluate these rules. That’s when we turned to NGINX JavaScript.

What Is NGINX JavaScript?

NGINX JavaScript is a general-purpose scripting framework for NGINX and NGINX Plus that’s implemented as a Stream and HTTP NGINX module. The NGINX JavaScript module allows you to extend NGINX’s configuration syntax with njs code, a subset of the JavaScript language designed to be a modern, fast, and robust high-level scripting language tailored for the NGINX runtime. Unlike standard JavaScript, which is primarily intended for web browsers, njs is a server-side language. This approach was taken to meet the requirements of server-side code execution and to integrate with NGINX’s request-processing architecture.

There are many use cases for njs (including response filtering, diagnostic logging, and joining subrequests) but this blog specifically explores how NGINX Gateway Fabric uses njs to perform HTTP request matching.

HTTP Request Matching

Before we dive into the NGINX JavaScript solution, let’s talk about the Gateway API feature being implemented.

HTTP request matching is the process of matching requests to routing rules based on certain conditions (matches) – e.g., the headers, query parameters, and/or method of the request. The Gateway API allows you to specify a set of HTTPRouteRules that will result in client requests being sent to specific backends based on the matches defined in the rules.

For example, if you have two versions of your application running on Kubernetes and you want to route requests with the header version:v2 to version 2 of your application and all other requests to version 1, you can achieve this with the following routing rules:


rules: 
  - matches: 
      - path: 
          type: PathPrefix 
          value: / 
    backendRefs: 
      - name: v1-app 
        port: 80 
  - matches: 
      - path: 
          type: PathPrefix 
          value: / 
        headers: 
          - name: version 
            value: v2 
    backendRefs: 
      - name: v2-app 
        port: 80 

Now, say you also want to send traffic with the query parameter TEST=v2 to version 2 of your application. You can add another rule that matches that query parameter:


- matches: 
  - path: 
      type: PathPrefix 
      value: / 
    queryParams: 
      - name: TEST 
        value: v2 

These are the three routing rules defined in the example above:

  1. Matches requests with path / and routes them to the backend v1-app.
  2. Matches requests with path / and the header version:v2 and routes them to the backend v2-app.
  3. Matches requests with path / and the query parameter TEST=v2 and routes them to the backend v2-app.
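
Once NGINX Gateway Fabric is serving these rules, you could sanity-check them with requests like the ones below. This is a sketch only: the hostname cafe.example.com and how it resolves to the gateway address are hypothetical, and the paths assume the rules above.

curl http://cafe.example.com/                     # matches rule 1, served by v1-app
curl -H "version: v2" http://cafe.example.com/    # matches rule 2, served by v2-app
curl "http://cafe.example.com/?TEST=v2"           # matches rule 3, served by v2-app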

NGINX Gateway Fabric must process these routing rules and configure NGINX to route requests accordingly. In the next section, we will use NGINX JavaScript to handle this routing.

The NGINX JavaScript Solution

To determine where to route a request when matches are defined, we wrote a location handler function in njs – named redirect – which redirects requests to an internal location block based on the request’s headers, arguments, and method.

Let’s look at the NGINX configuration generated by NGINX Gateway Fabric for the three routing rules defined above.

Note: this config has been simplified for the purpose of this blog.


# nginx.conf 
load_module /usr/lib/nginx/modules/ngx_http_js_module.so; # load NGINX JavaScript Module 
events {}  
http {  
    js_import /usr/lib/nginx/modules/httpmatches.js; # Import the njs script 
    server {  
        listen 80; 
        location /_rule1 {  
            internal; # Internal location block that corresponds to rule 1 
            proxy_pass http://upstream-v1-app$request_uri;  
         }  
        location /_rule2 {  
            internal; # Internal location block that corresponds to rule 2 
            proxy_pass http://upstream-v2-app$request_uri; 
        } 
        location /_rule3 {  
            internal; # Internal location block that corresponds to rule 3 
            proxy_pass http://upstream-v2-app$request_uri; 
        } 
        location / {  
            # This is the location block that handles the client requests to the path / 
           set $http_matches "[{\"redirectPath\":\"/_rule2\",\"headers\":[\"version:v2\"]},{\"redirectPath\":\"/_rule3\",\"params\":[\"TEST=v2\"]},{\"redirectPath\":\"/_rule1\",\"any\":true}]"; 
             js_content httpmatches.redirect; # Executes redirect njs function 
        } 
     }  
} 

The js_import directive is used to specify the file that contains the redirect function and the js_content directive is used to execute the redirect function.

The redirect function depends on the http_matches variable. The http_matches variable contains a JSON-encoded list of the matches defined in the routing rules. Each JSON match holds the required headers, query parameters, and method, as well as the redirectPath, which is the path the request is redirected to when it satisfies that match. Every redirectPath must correspond to an internal location block.

Let’s take a closer look at each JSON match in the http_matches variable (shown in the same order as the routing rules above):

  1. {"redirectPath":"/_rule1","any":true} – The “any” boolean means that all requests match this rule and should be redirected to the internal location block with the path /_rule1.
  2. {"redirectPath":"/_rule2","headers":["version:v2"]} – Requests that have the header version:v2 match this rule and should be redirected to the internal location block with the path /_rule2.
  3. {"redirectPath":"/_rule3","params":["TEST=v2"]} – Requests that have the query parameter TEST=v2 match this rule and should be redirected to the internal location block with the path /_rule3.

One last thing to note about the http_matches variable is that the order of the matches matters. The redirect function will accept the first match that the request satisfies. NGINX Gateway Fabric will sort the matches according to the algorithm defined by the Gateway API to make sure the correct match is chosen.

Now let’s look at the JavaScript code for the redirect function (the full code can be found here):


// httpmatches.js 
function redirect(r) { 
  let matches; 

  try { 
    matches = extractMatchesFromRequest(r); 
  } catch (e) { 
    r.error(e.message); 
    r.return(HTTP_CODES.internalServerError); 
    return; 
  } 

  // Matches is a list of http matches in order of precedence. 
  // We will accept the first match that the request satisfies. 
  // If there's a match, redirect request to internal location block. 
  // If an exception occurs, return 500. 
  // If no matches are found, return 404. 
  let match; 
  try { 
    match = findWinningMatch(r, matches); 
  } catch (e) { 
    r.error(e.message); 
    r.return(HTTP_CODES.internalServerError); 
    return; 
  } 

  if (!match) { 
    r.return(HTTP_CODES.notFound); 
    return; 
  } 

  if (!match.redirectPath) { 
    r.error( 
      `cannot redirect the request; the match ${JSON.stringify( 
        match, 
      )} does not have a redirectPath set`, 
    ); 
    r.return(HTTP_CODES.internalServerError); 
    return; 
  } 

  r.internalRedirect(match.redirectPath); 
} 

The redirect function accepts the NGINX HTTP request object as an argument and extracts the http_matches variable from it. It then finds the winning match by comparing the request’s attributes (found on the request object) to the list of matches and internally redirects the request to the winning match’s redirect path.

Why Use NGINX JavaScript?

While it’s possible to implement HTTP request matching using Go templates to generate an NGINX configuration, it’s not straightforward when compared to simpler use cases like traffic splitting. Unlike the split_clients directive, there’s no native way to compare a request’s attributes to a list of matches in a low-level NGINX configuration.

We chose to use njs for HTTP request matching in NGINX Gateway Fabric for these reasons:

  • Simplicity – Makes complex HTTP request matching easy to implement, enhancing code readability and development efficiency.
  • Debugging – Simplifies debugging by allowing descriptive error messages, speeding up issue resolution.
  • Unit Testing – Code can be thoroughly unit tested, ensuring robust and reliable functionality.
  • Extensibility – High-level scripting nature enables easy extension and modification, accommodating evolving project needs without complex manual configuration changes.
  • Performance – Purpose-built for NGINX and designed to be fast.

Next Steps

If you are interested in our implementation of the Gateway API using the NGINX data plane, visit our NGINX Gateway Fabric project on GitHub to get involved:

  • Join the project as a contributor
  • Try the implementation in your lab
  • Test and provide feedback

And if you are interested in chatting about this project and other NGINX projects, stop by the NGINX booth at KubeCon North America 2023. NGINX, part of F5, is proud to be a Platinum Sponsor of KubeCon NA, and we hope to see you there!

To learn more about njs, check out additional examples or read this blog.

The post How NGINX Gateway Fabric Implements Complex Routing Rules appeared first on NGINX.

]]>
Configure NGINX Plus for SAML SSO with Microsoft Entra ID https://www.nginx.com/blog/configure-nginx-plus-for-saml-sso-with-microsoft-entra-id/ Tue, 31 Oct 2023 15:00:29 +0000 https://www.nginx.com/?p=72755 To enhance security and improve user experience, F5 NGINX Plus (R29+) now has support for Security Assertion Markup Language (SAML). A well-established protocol that provides single sign-on (SSO) to web applications, SAML enables an identity provider (IdP) to authenticate users for access to a resource and then passes that information to a service provider (SP) [...]

Read More...

The post Configure NGINX Plus for SAML SSO with Microsoft Entra ID appeared first on NGINX.

]]>
To enhance security and improve user experience, F5 NGINX Plus (R29+) now has support for Security Assertion Markup Language (SAML). A well-established protocol that provides single sign-on (SSO) to web applications, SAML enables an identity provider (IdP) to authenticate users for access to a resource and then passes that information to a service provider (SP) for authorization.

In this blog post, we cover step-by-step how to integrate NGINX with Microsoft Entra ID, formerly known as Azure Active Directory (Azure AD), using a web application that does not natively support SAML. We also cover how to implement SSO for the application and integrate it with the Microsoft Entra ID ecosystem. By following the tutorial, you’ll additionally learn how NGINX can extract claims from a SAML assertion (including UPN, first name, last name, and group memberships) and then pass them to the application via HTTP headers.

The tutorial includes three steps:

  1. Configuring Microsoft Entra ID as an IdP
  2. Configuring SAML settings and NGINX Plus as a reverse proxy
  3. Testing the configuration

To complete this tutorial, you need:

  • NGINX Plus (R29+), which you can get as a free 30-day trial
  • A free or enterprise Microsoft Entra ID account
  • A valid SSL/TLS certificate installed on the NGINX Plus server (this tutorial uses dev.sports.com.crt and dev.sports.com.key)
  • The public certificate demonginx.cer, downloaded from the IdP, which is used to verify the SAML assertions

Note: This tutorial does not apply to NGINX Open Source deployments because the key-value store is exclusive to NGINX Plus.

Using NGINX Plus as a SAML Service Provider

In this setup, NGINX Plus acts as a SAML SP and can participate in an SSO implementation with a SAML IdP, which communicates indirectly with NGINX Plus via the User Agent.

The diagram below illustrates the SSO process flow, with SP initiation and POST bindings for request and response. It is critical to again note that this communication channel is not direct and is managed through the User Agent.

Figure 1: SAML SP-Initiated SSO with POST bindings for AuthnRequest and Response

Step 1: Configure Microsoft Entra ID as an Identity Provider

To access your Microsoft Entra ID management portal, sign in and navigate to the left-hand panel. Select Microsoft Entra ID and then click on the directory’s title that requires SSO configuration. Once selected, choose Enterprise applications.


Figure 2: Choosing Enterprise applications in the management portal

To create an application, click the New application button at the top of the portal. In this example, we created an application called demonginx.

Figure 3: Creating a new application in Microsoft Entra ID

After you’re redirected to the newly created application Overview, go to Getting Started via the left menu and click Single sign-on under Manage. Then, select SAML as the single sign-on method.

Figure 4: Using the SSO section to start the SAML configuration

To set up SSO in your enterprise application, you need to register NGINX Plus as an SP within Microsoft Entra ID. To do this, click the pencil icon next to Edit in Basic SAML Configuration, as seen in Figure 5.

Add the following values then click Save:

  • Identifier (Entity ID) – https://dev.sports.com
  • Reply URL (Assertion Consumer Service URL) – https://dev.sports.com/saml/acs
  • Sign on URL – https://dev.sports.com
  • Logout URL (Optional) – https://dev.sports.com/saml/sls

The use of verification certificates is optional. When enabling this setting, two configuration options in NGINX must be addressed:

  1. To verify the signature with a public key, you need to set $saml_sp_sign_authn to true. This instructs the SP to sign the AuthnRequest sent to the IdP.
  2. Provide the path to the private key that will be used for this signature by configuring the $saml_sp_signing_key. Make sure to upload the corresponding public key certificate to Microsoft Entra ID for signature verification.

Note: In this demo, attributes and claims have been modified, and new SAML attributes are added. These SAML attributes are sent by the IdP. Ensure that your NGINX configuration is set up to properly receive and process these attributes. You can check and adjust related settings in the NGINX GitHub repo.

Download the IdP Certificate (Raw) from Microsoft Entra ID and save it to your NGINX Plus instance.

Figure 5: Downloading the IdP Certificate (Raw) from Microsoft Entra ID

Figure 6: Adding a new user or group

In Microsoft Entra ID, you can grant access to your SSO-enabled company applications by adding or assigning users and groups.

On the left-hand menu, click Users and groups and then the top button Add user/group.

Step 2: Configure SAML Settings and NGINX Plus as a Reverse Proxy

Ensure you have the necessary certificates before configuring files in your NGINX Plus SP:

  • Certificates for terminating TLS session (dev.sports.com.crt and dev.sports.com.key)
  • Certificate downloaded from Microsoft Entra ID for IdP signing verification (demonginx.cer)

Note: The IdP verification certificate needs to be in SPKI format.

To begin this step, download the IdP certificate from Microsoft Entra ID for signing verification. Then, convert PEM to DER format:

openssl x509 -in demonginx.cer -outform DER -out demonginx.der

In case you want to verify SAML SP assertions, it’s recommended to use public/private keys that are different from the ones used for TLS termination.

Extract the public key certificate in SPKI format:

openssl x509 -inform DER -in demonginx.der -pubkey -noout > demonginx.spki
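
Optionally, you can sanity-check the extracted key before referencing it in the NGINX configuration. Inspecting it with openssl should print the key details rather than an error:

openssl pkey -pubin -in demonginx.spki -text -noout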

Edit the frontend.conf file to update these items:

  • ssl_certificate – Update to include the TLS certificate path.
  • ssl_certificate_key – Update to include the TLS private key path.

In a production deployment, you can use different backend destinations based on your business requirements. In this example, the backend provides a customized response:

“Welcome to Application page\n My objectid is $http_objectid\n My email is $http_mail\n”;

We have modified the attributes and claims in Microsoft Entra ID by adding new claims for the user’s mail and objectid. These updates enable you to provide a more personalized and tailored response to your application, resulting in an improved user experience.

Figure 7: Modified attributes and claims in Microsoft Entra ID

The next step is to configure NGINX, which will proxy traffic to the backend application. In this demo, the backend SAML application is publicly available at https://dev.sports.com.

Edit your frontend.conf file:


# This is file frontend.conf 
# This is the backend application we are protecting with SAML SSO 
upstream my_backend { 
    zone my_backend 64k; 
    server dev.sports.com; 
} 

# Custom log format to include the 'NameID' subject in the REMOTE_USER field 
log_format saml_sso '$remote_addr - $saml_name_id [$time_local] "$request" "$host" ' 
                    '$status $body_bytes_sent "$http_referer" ' 
                    '"$http_user_agent" "$http_x_forwarded_for"'; 

# The frontend server - reverse proxy with SAML SSO authentication 
# 
server { 
    # Functional locations implementing SAML SSO support 
    include conf.d/saml_sp.server_conf; 
 

    # Reduce severity level as required 
    error_log /var/log/nginx/error.log debug; 
    listen 443 ssl; 
    ssl_certificate     /home/ubuntu/dev.sports.com.crt; 
    ssl_certificate_key  /home/ubuntu/dev.sports.com.key; 
    ssl_session_cache shared:SSL:5m; 
 

    location / { 
        # When a user is not authenticated (i.e., the "saml_access_granted." 
        # variable is not set to "1"), an HTTP 401 Unauthorized error is 
        # returned, which is handled by the @do_samlsp_flow named location. 
        error_page 401 = @do_samlsp_flow; 

        if ($saml_access_granted != "1") { 
            return 401; 
        } 

        # Successfully authenticated users are proxied to the backend, 
        # with the NameID attribute passed as an HTTP header        
        proxy_set_header mail $saml_attrib_mail;  # Microsoft Entra ID's user.mail 
        proxy_set_header objectid $saml_attrib_objectid; # Microsoft Entra ID's objectid 
        access_log /var/log/nginx/access.log saml_sso; 
        proxy_pass http://my_backend; 
        proxy_set_header Host dev.sports.com; 
        return 200 "Welcome to Application page\n My objectid is $http_objectid\n My email is $http_mail\n"; 
        default_type text/plain; 

   } 
} 
# vim: syntax=nginx         

For the attributes saml_attrib_mail and saml_attrib_objectid to be reflected in the NGINX configuration, update the key-value store part of saml_sp_configuration.conf as follows:


keyval_zone zone=saml_attrib_mail:1M     state=/var/lib/nginx/state/saml_attrib_mail.json       timeout=1h; 
keyval      $cookie_auth_token $saml_attrib_mail     zone=saml_attrib_mail; 

keyval_zone zone=saml_attrib_objectid:1M state=/var/lib/nginx/state/saml_attrib_objectid.json   timeout=1h; 
keyval      $cookie_auth_token $saml_attrib_objectid zone=saml_attrib_objectid; 

Next, configure the SAML SSO configuration file. This file contains the primary configurations for the SP and IdP. To customize it according to your specific SP and IdP setup, you need to adjust the multiple map{} blocks included in the file.

The following list describes the variables within saml_sp_configuration.conf:

  • saml_sp_entity_id – The URL used by the users to access the application.
  • saml_sp_acs_url – The URL used by the service provider to receive and process the SAML response, extract the user’s identity, and then grant or deny access to the requested resource based on the provided information.
  • saml_sp_sign_authn – Specifies if the SAML request from SP to IdP should be signed or not. The signature is done using the SP signing key, and you need to upload the associated certificate to the IdP to verify the signature.
  • saml_sp_signing_key – The signing key that is used to sign the SAML request from SP to IdP. Make sure to upload the associated certificate to the IdP to verify the signature.
  • saml_idp_entity_id – The identity that is used to define the IdP.
  • saml_idp_sso_url – The IdP endpoint to which the SP sends the SAML assertion request to initiate the authentication request.
  • saml_idp_verification_certificate – The certificate used to verify signed SAML assertions received from the IdP. The certificate is provided by the IdP and needs to be in SPKI format.
  • saml_sp_slo_url – The SP endpoint that the IdP sends the SAML LogoutRequest to (when initiating a logout process) or the LogoutResponse to (when confirming the logout).
  • saml_sp_sign_slo – Specifies if the logout SAML is to be signed by the SP or not.
  • saml_idp_slo_url – The IdP endpoint that the SP sends the LogoutRequest to (when initiating a logout process) or LogoutResponse to (when confirming the logout).
  • saml_sp_want_signed_slo – Specifies if the SAML SP wants the SAML logout response or request from the IdP to be signed or not.

The code below shows only the values edited for this use case in saml_sp_configuration.conf.

Note: Make sure the remaining parts of the configuration file still appear in the file (e.g., the key-value stores). Also ensure that you properly adjust the variables within the saml_sp_configuration.conf file based on your deployment.

 
# SAML SSO configuration 

map $host $saml_sp_entity_id { 
    # Unique identifier that identifies the SP to the IdP. 
    # Must be URL or URN. 
    default "https://dev.sports.com"; 
} 

map $host $saml_sp_acs_url { 
    # The ACS URL, an endpoint on the SP where the IdP  
    # will redirect to with its authentication response. 
    # Must match the ACS location defined in the "saml_sp.server_conf" file. 
    default "https://dev.sports.com/saml/acs"; 
} 

map $host $saml_sp_request_binding { 
    # Refers to the method by which an authentication request is sent from 
    # the SP to an IdP during the Single Sign-On (SSO) process. 
    # Only HTTP-POST or HTTP-Redirect methods are allowed. 
    default 'HTTP-POST'; 
} 

map $host $saml_sp_sign_authn { 
    # Whether the SP should sign the AuthnRequest sent to the IdP. 
    default "false"; 
} 

map $host $saml_sp_decryption_key { 
    # Specifies the private key that the SP uses to decrypt encrypted assertion 
    # or NameID from the IdP. 
    default ""; 
} 

map $host $saml_sp_force_authn { 
    # Whether the SP should force re-authentication of the user by the IdP. 
    default "false"; 
} 

map $host $saml_sp_nameid_format { 
    # Indicates the desired format of the name identifier in the SAML assertion 
    # generated by the IdP. Check section 8.3 of the SAML 2.0 Core specification 
    # (http://docs.oasis-open.org/security/saml/v2.0/saml-core-2.0-os.pdf) 
    # for the list of allowed NameID Formats. 
    default "urn:oasis:names:tc:SAML:1.1:nameid-format:unspecified"; 
} 

map $host $saml_sp_relay_state { 
    # Relative or absolute URL the SP should redirect to 
    # after successful sign on. 
    default ""; 
} 

map $host $saml_sp_want_signed_response { 
    # Whether the SP wants the SAML Response from the IdP 
    # to be digitally signed. 
    default "false"; 
} 

map $host $saml_sp_want_signed_assertion { 
    # Whether the SP wants the SAML Assertion from the IdP 
    # to be digitally signed. 
    default "true"; 
} 

map $host $saml_sp_want_encrypted_assertion { 
    # Whether the SP wants the SAML Assertion from the IdP 
    # to be encrypted. 
    default "false"; 
} 

map $host $saml_idp_entity_id { 
    # Unique identifier that identifies the IdP to the SP. 
    # Must be URL or URN. 
    default "https://sts.windows.net/8807dced-9637-4205-a520-423077750c60/"; 
} 

map $host $saml_idp_sso_url { 
    # IdP endpoint that the SP will send the SAML AuthnRequest to initiate 
    # an authentication process. 
    default "https://login.microsoftonline.com/8807dced-9637-4205-a520-423077750c60/saml2"; 
} 

map $host $saml_idp_verification_certificate { 
    # Certificate file that will be used to verify the digital signature 
    # on the SAML Response, LogoutRequest or LogoutResponse received from IdP. 
    # Must be public key in PKCS#1 format. See documentation on how to convert 
    # X.509 PEM to DER format. 
    default "/etc/nginx/conf.d/demonginx.spki"; 
} 

######### Single Logout (SLO) ######### 

map $host $saml_sp_slo_url { 
    # SP endpoint that the IdP will send the SAML LogoutRequest to initiate 
    # a logout process or LogoutResponse to confirm the logout. 
    default "https://dev.sports.com/saml/sls"; 
} 

map $host $saml_sp_slo_binding { 
    # Refers to the method by which a LogoutRequest or LogoutResponse 
    # is sent from the SP to an IdP during the Single Logout (SLO) process. 
    # Only HTTP-POST or HTTP-Redirect methods are allowed. 
    default 'HTTP-POST'; 
} 

map $host $saml_sp_sign_slo { 
    # Whether the SP must sign the LogoutRequest or LogoutResponse 
    # sent to the IdP. 
    default "false"; 
} 

map $host $saml_idp_slo_url { 
    # IdP endpoint that the SP will send the LogoutRequest to initiate 
    # a logout process or LogoutResponse to confirm the logout. 
    # If not set, the SAML Single Logout (SLO) feature is DISABLED and 
    # requests to the 'logout' location will result in the termination 
    # of the user session and a redirect to the logout landing page. 
    default "https://login.microsoftonline.com/8807dced-9637-4205-a520-423077750c60/saml2"; 
} 

map $host $saml_sp_want_signed_slo { 
    # Whether the SP wants the SAML LogoutRequest or LogoutResponse from the IdP 
    # to be digitally signed. 
    default "true"; 
} 

map $host $saml_logout_landing_page { 
    # Where to redirect user after requesting /logout location. This can be 
    # replaced with a custom logout page, or complete URL. 
    default "/_logout"; # Built-in, simple logout page 
} 

map $proto $saml_cookie_flags { 
    http  "Path=/; SameSite=lax;"; # For HTTP/plaintext testing 
    https "Path=/; SameSite=lax; HttpOnly; Secure;"; # Production recommendation 
} 

map $http_x_forwarded_port $redirect_base { 
    ""      $proto://$host:$server_port; 
    default $proto://$host:$http_x_forwarded_port; 
} 

map $http_x_forwarded_proto $proto { 
    ""      $scheme; 
    default $http_x_forwarded_proto; 
} 
# ADVANCED CONFIGURATION BELOW THIS LINE 
# Additional advanced configuration (server context) in saml_sp.server_conf 

######### Shared memory zones that keep the SAML-related key-value databases 

# Zone for storing AuthnRequest and LogoutRequest message identifiers (ID) 
# to prevent replay attacks. (REQUIRED) 
# Timeout determines how long the SP waits for a response from the IDP, 
# i.e. how long the user authentication process can take. 
keyval_zone zone=saml_request_id:1M                 state=/var/lib/nginx/state/saml_request_id.json                  timeout=5m; 

# Zone for storing SAML Response message identifiers (ID) to prevent replay attacks. (REQUIRED) 
# Timeout determines how long the SP keeps IDs to prevent reuse. 
keyval_zone zone=saml_response_id:1M                state=/var/lib/nginx/state/saml_response_id.json                 timeout=1h; 

# Zone for storing SAML session access information. (REQUIRED) 
# Timeout determines how long the SP keeps session access decision (the session lifetime). 
keyval_zone zone=saml_session_access:1M             state=/var/lib/nginx/state/saml_session_access.json              timeout=1h; 

# Zone for storing SAML NameID values. (REQUIRED) 
# Timeout determines how long the SP keeps NameID values. Must be equal to session lifetime. 
keyval_zone zone=saml_name_id:1M                    state=/var/lib/nginx/state/saml_name_id.json                     timeout=1h; 

# Zone for storing SAML NameID format values. (REQUIRED) 
# Timeout determines how long the SP keeps NameID format values. Must be equal to session lifetime. 
keyval_zone zone=saml_name_id_format:1M             state=/var/lib/nginx/state/saml_name_id_format.json              timeout=1h; 

# Zone for storing SAML SessionIndex values. (REQUIRED) 
# Timeout determines how long the SP keeps SessionIndex values. Must be equal to session lifetime. 
keyval_zone zone=saml_session_index:1M              state=/var/lib/nginx/state/saml_session_index.json               timeout=1h; 
  
# Zone for storing SAML AuthnContextClassRef values. (REQUIRED) 
# Timeout determines how long the SP keeps AuthnContextClassRef values. Must be equal to session lifetime. 
keyval_zone zone=saml_authn_context_class_ref:1M    state=/var/lib/nginx/state/saml_authn_context_class_ref.json     timeout=1h; 

# Zones for storing SAML attributes values. (OPTIONAL) 
# Timeout determines how long the SP keeps attributes values. Must be equal to session lifetime. 
keyval_zone zone=saml_attrib_uid:1M                 state=/var/lib/nginx/state/saml_attrib_uid.json                  timeout=1h; 
keyval_zone zone=saml_attrib_name:1M                state=/var/lib/nginx/state/saml_attrib_name.json                 timeout=1h; 
keyval_zone zone=saml_attrib_memberOf:1M            state=/var/lib/nginx/state/saml_attrib_memberOf.json             timeout=1h; 

######### SAML-related variables whose value is looked up by the key (session cookie) in the key-value database. 

# Required: 
keyval $saml_request_id     $saml_request_redeemed          zone=saml_request_id;               # SAML Request ID 
keyval $saml_response_id    $saml_response_redeemed         zone=saml_response_id;              # SAML Response ID 
keyval $cookie_auth_token   $saml_access_granted            zone=saml_session_access;           # SAML Access decision 
keyval $cookie_auth_token   $saml_name_id                   zone=saml_name_id;                  # SAML NameID 
keyval $cookie_auth_token   $saml_name_id_format            zone=saml_name_id_format;           # SAML NameIDFormat 
keyval $cookie_auth_token   $saml_session_index             zone=saml_session_index;            # SAML SessionIndex 
keyval $cookie_auth_token   $saml_authn_context_class_ref   zone=saml_authn_context_class_ref;  # SAML AuthnContextClassRef 

# Optional: 
keyval $cookie_auth_token   $saml_attrib_uid                zone=saml_attrib_uid; 
keyval $cookie_auth_token   $saml_attrib_name               zone=saml_attrib_name; 
keyval $cookie_auth_token   $saml_attrib_memberOf           zone=saml_attrib_memberOf; 
 

keyval_zone    zone=saml_attrib_mail:1M                state=/var/lib/nginx/state/saml_attrib_mail.json   timeout=1h; 
keyval         $cookie_auth_token $saml_attrib_mail    zone=saml_attrib_mail; 
  
keyval $cookie_auth_token   $saml_attrib_objectid           zone=saml_attrib_objectid; 
keyval_zone zone=saml_attrib_objectid:1M            state=/var/lib/nginx/state/saml_attrib_objectid.json             timeout=1h; 
  

######### Imports a module that implements SAML SSO and SLO functionality 
js_import samlsp from conf.d/saml_sp.js; 
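
With both files in place, it is a good idea to validate the configuration and reload NGINX Plus before testing. A minimal sketch, assuming NGINX Plus is managed directly with the nginx binary (adjust if your deployment uses systemd):

sudo nginx -t          # Check frontend.conf and saml_sp_configuration.conf for syntax errors
sudo nginx -s reload   # Apply the new configuration without dropping connections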

Step 3: Testing the Configuration

Two parts are required to test the configuration:

  1. Verifying the SAML flow
  2. Testing the SP-initiated logout functionality

Verifying the SAML Flow

After configuring the SAML SP using NGINX Plus and the IdP using Microsoft Entra ID, it is crucial to validate the SAML flow. This validation process ensures that user authentication through the IdP is successful and that access to SP-protected resources is granted.

To verify the SP-initiated SAML flow, open your preferred browser and type https://dev.sports.com in the address bar. This directs you to the IdP login page.

Figure 8: The IdP login page

On the login page, enter the credentials of a user who is configured in the IdP. The IdP authenticates the user upon submission.

Figure 9: Entering the configured user’s credentials

The user will be granted access to the previously requested protected resource upon successfully establishing a session. Subsequently, that resource will be displayed in the user’s browser.

Figure 10: The successfully loaded application page

Valuable information about the SAML flow can be obtained by checking the SP and IdP logs. On the SP side (NGINX Plus), ensure the auth_token cookies are set correctly. On the IdP side (Microsoft Entra ID), ensure that the authentication process completes without errors and that the SAML assertion is sent to the SP.

The NGINX access.log should look like this:


127.0.0.1 - - [14/Aug/2023:21:25:49 +0000] "GET / HTTP/1.0" 200 127 "https://login.microsoftonline.com/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.1 Safari/605.1.15" "-" 

99.187.244.63 - Akash Ananthanarayanan [14/Aug/2023:21:25:49 +0000] "GET / HTTP/1.1" "dev.sports.com" 200 127 "https://login.microsoftonline.com/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.1 Safari/605.1.15" "-"

While the NGINX debug.log looks like this:


2023/08/14 21:25:49 [info] 27513#27513: *399 js: SAML SP success, creating session _d4db9b93c415ee7b4e057a4bb195df6cd0be7e4d 
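
To watch both log streams while you step through the login flow, you can tail them in a separate terminal (the paths assume the default log locations used in the configuration above):

sudo tail -f /var/log/nginx/access.log /var/log/nginx/error.log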

Testing the SP-initiated Logout Functionality

SAML Single Logout (SLO) lets users log out of all involved IdPs and SPs with one action. NGINX Plus supports SP-initiated and IdP-initiated logout scenarios, enhancing security and user experience in SSO environments. In this example, we use an SP-initiated logout scenario.

Figure 11: SAML SP-Initiated SLO with POST/redirect bindings for LogoutRequest and LogoutResponse

After authenticating your session, log out by accessing the logout URL configured in your SP. For example, if you have set up https://dev.sports.com/logout as the logout URL in NGINX Plus, enter that URL in your browser’s address bar.

Figure 12: Successfully logging out of the session

To ensure a secure logout, the SP must initiate a SAML request that is then verified and processed by the IdP. This action effectively terminates the user’s session, and the IdP will then send a SAML response to redirect the user’s browser back to the SP.

Conclusion

Congratulations! NGINX Plus can now serve as a SAML SP, providing another layer of security and convenience to the authentication process. This new capability is a significant step forward for NGINX Plus, making it a more robust and versatile solution for organizations prioritizing security and efficiency. 

Learn More About Using SAML with NGINX Plus

You can begin using SAML with NGINX Plus today by starting a 30-day free trial of NGINX Plus. We hope you find it useful and welcome your feedback.

More information about NGINX Plus with SAML is available in the resources below.

The post Configure NGINX Plus for SAML SSO with Microsoft Entra ID appeared first on NGINX.

]]>