How I investigated connection hogs on Kubernetes

Hi

My name is Dominique Dumont, freelance DevOps engineer in Grenoble, France.

My goal is to share my experience regarding a production issue that occurred last week: my client complained that the application was very slow and sometimes showed 5xx errors. The production service is hosted on a Kubernetes cluster on Azure and uses a MongoDB database on ScaleGrid.

I reproduced the issue on my side and found that the API calls were randomly failing due to timeouts on server side.

The server logs showed some MongoDB disconnections and reconnections and some timeouts on MongoDB connections, but gave no clue as to why some connections to the MongoDB server were failing.

Since there was no clue in the cluster logs, I looked at the ScaleGrid monitoring. There were about 2500 connections on MongoDB (screenshot: 2022-07-19-scalegrid-connection-leak.png). That seemed quite a lot given the low traffic at that time, but not necessarily a problem.

Then, I went to the Azure console, and I got the first hint about the origin of the problem: the SNAT ports were exhausted on some nodes of the cluster (screenshot: 2022-07-28_no-more-free-snat.png).

SNAT ports are involved in connections from the cluster to the outside world, i.e. to our MongoDB server, and are quite limited: only 1024 SNAT ports are available per node. This was consistent with the number of used connections on MongoDB.

OK, then the number of used connections on MongoDB was a real problem.

The next question was: which pods and how many connections ?

First I had to filter out the pods that did not use MongoDB. Fortunately, all our pods have labels so I could list all pods using MongoDB:

$ kubectl -n prod get pods -l db=mongo | wc -l
236

Hmm, still quite a lot.

The next problem was to find which pods used too many MongoDB connections. Unfortunately, the logs mentioned that a connection to MongoDB was opened, but that did not give a clue on how many were used.

Netstat is not installed on the pods, and cannot be installed since the pods are not running as root (which is a good idea for security reasons).

Then, my Debian Developer experience kicked in and I remembered that the /proc file system on Linux gives a lot of information on consumed kernel resources, including resources consumed by each process.

The trick is to know the PID of the process using the connections.

In our case, the Dockerfiles are written so that the main NodeJS process of a pod has PID 1, so the command to list the connections of a pod is:

$ kubectl -n prod exec redacted-pod-name-69875496f8-8bj4f -- cat /proc/1/net/tcp
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode                                                     
   0: AC00F00A:C9FA C2906714:6989 01 00000000:00000000 02:00000DA9 00000000  1001        0 376439162 2 0000000000000000 21 4 0 10 -1                 
   1: AC00F00A:CA00 C2906714:6989 01 00000000:00000000 02:00000E76 00000000  1001        0 376439811 2 0000000000000000 21 4 0 10 -1                 
   2: AC00F00A:8ED0 C2906714:6989 01 00000000:00000000 02:000004DA 00000000  1001        0 445806350 2 0000000000000000 21 4 30 10 -1                
   3: AC00F00A:CA02 C2906714:6989 01 00000000:00000000 02:000000DD 00000000  1001        0 376439812 2 0000000000000000 21 4 0 10 -1                 
   4: AC00F00A:C9FE C2906714:6989 01 00000000:00000000 02:00000DA9 00000000  1001        0 376439810 2 0000000000000000 21 4 0 10 -1                 
   5: AC00F00A:8760 C2906714:6989 01 00000000:00000000 02:00000810 00000000  1001        0 375803096 2 0000000000000000 21 4 0 10 -1                 
   6: AC00F00A:C9FC C2906714:6989 01 00000000:00000000 02:00000DA9 00000000  1001        0 376439809 2 0000000000000000 21 4 0 10 -1                 
   7: AC00F00A:C56C C2906714:6989 01 00000000:00000000 02:00000DA9 00000000  1001        0 376167298 2 0000000000000000 21 4 0 10 -1                 
   8: AC00F00A:883C C2906714:6989 01 00000000:00000000 02:00000734 00000000  1001        0 375823415 2 0000000000000000 21 4 30 10 -1 

OK, that’s less appealing than netstat output. The trick is that rem_address and port are expressed in hexadecimal. A quick calculation (0x6989 = 6×4096 + 9×256 + 8×16 + 9) confirms that port 0x6989 is indeed port 27017, which is the listening port of the MongoDB server.
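
To make the hexadecimal decoding concrete, here is a minimal Python sketch (it was not part of the original investigation). It assumes a little-endian host such as amd64, where /proc/net/tcp prints the IPv4 address with its bytes reversed. The function name decode_endpoint is made up for this example.

```python
import socket
import struct


def decode_endpoint(hex_endpoint):
    """Decode an 'ADDRESS:PORT' field from /proc/net/tcp (IPv4 only).

    Assumption: the kernel runs on a little-endian machine, so the
    32-bit address is printed with its bytes in reverse order.
    """
    addr_hex, port_hex = hex_endpoint.split(":")
    # Re-pack the integer as little-endian bytes to recover network order.
    ip = socket.inet_ntoa(struct.pack("<I", int(addr_hex, 16)))
    return ip, int(port_hex, 16)


# Remote endpoint taken from the output above.
print(decode_endpoint("C2906714:6989"))  # ('20.103.144.194', 27017)
```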

So the number of opened MongoDB connections is given by:

$ kubectl -n prod exec redacted-pod-name-69875496f8-8bj4f -- cat /proc/1/net/tcp | grep :6989 | wc -l
9
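
Note that the grep above matches ":6989" anywhere in a line, so it would also count a socket whose local port happens to be 0x6989, whatever its state. A stricter count can be sketched in Python, assuming the content of /proc/1/net/tcp was captured as text (for instance from the kubectl exec command above). The helper name count_remote_port is hypothetical; state 01 is the kernel's code for an established TCP connection.

```python
def count_remote_port(proc_net_tcp, port):
    """Count established connections whose remote port is `port`.

    `proc_net_tcp` is the raw text of /proc/<pid>/net/tcp.
    """
    count = 0
    for line in proc_net_tcp.splitlines():
        fields = line.split()
        # Data lines start with an index such as '0:'; the header does not.
        if len(fields) < 4 or not fields[0].endswith(":"):
            continue
        remote_port = int(fields[2].rsplit(":", 1)[1], 16)
        if remote_port == port and fields[3] == "01":
            count += 1
    return count
```

With the nine lines shown earlier as input, count_remote_port(text, 27017) would return the same value as the grep pipeline, since every line has remote port 0x6989 and state 01.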

What’s next ?

The ideal solution would be to fix the NodeJS code to handle correctly the termination of the connections, but that would have taken too long to develop.

So I’ve written a small Perl script to:

  • list the pods using MongoDB using kubectl -n prod get pods -l db=mongo
  • find the pods using more than 10 connections using the kubectl exec command shown above
  • compute the deployment name of these pods (which was possible given the naming convention used with our pods and deployments)
  • restart the deployment of these pods with a kubectl rollout restart deployment command

Why restart a deployment instead of simply deleting the gluttonous pods? I wanted to avoid downtime if all pods of a deployment were to be killed. There’s no downtime when applying rollout restart command on deployments.
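
The selection logic of such a script can be sketched as follows. This is a Python illustration, not the Perl script mentioned above. It assumes the usual Kubernetes pod naming scheme (deployment name, then ReplicaSet hash, then a random suffix) and that the per-pod connection counts were already collected. The function names and the pod names in the usage note are hypothetical.

```python
NAMESPACE = "prod"  # namespace used in the kubectl commands above
THRESHOLD = 10      # pods with more connections than this are restarted


def deployment_of(pod_name):
    """Drop the ReplicaSet hash and the pod suffix from a pod name."""
    return pod_name.rsplit("-", 2)[0]


def deployments_to_restart(connections_per_pod, threshold=THRESHOLD):
    """Return the sorted names of deployments owning a greedy pod."""
    return sorted(
        {
            deployment_of(pod)
            for pod, count in connections_per_pod.items()
            if count > threshold
        }
    )


def restart_commands(connections_per_pod, namespace=NAMESPACE):
    """Build one 'kubectl rollout restart' command per deployment."""
    return [
        ["kubectl", "-n", namespace, "rollout", "restart", "deployment", name]
        for name in deployments_to_restart(connections_per_pod)
    ]
```

Each returned command is a plain argument list, so it could be handed to subprocess.run(cmd, check=True). Using a set means a deployment with several greedy pods is restarted only once.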

This script is now run regularly until the connections issue is fixed for good in NodeJS code. Thanks to this script, there’s no need to rush a code modification.

All in all, working around this connection issue was made somewhat easier thanks to:

  • the monitoring tools provided by the hosting services.
  • a good knowledge of Linux internals
  • consistent labels on our pods
  • the naming conventions used for our Kubernetes artifacts

Important bug fix for OpenSsh cme config editor

The new release of Config::Model::OpenSsh fixes a bug that impacted experienced users: the order of Host or Match sections is now preserved when writing back the ~/.ssh/config file.

Why does this matter ?

Well, the beginning of the ssh_config man page mentions that “For each parameter, the first obtained value will be used.” and “Since the first obtained value for each parameter is used, more host-specific declarations should be given near the beginning of the file, and general defaults at the end.”

Looks like I missed these statements when I designed the model for OpenSsh configuration: the Host section was written back in a neat, but wrong, alphabetical order.

This does not matter except when there is an overlap between the specifications of the Host (or Match) sections, like in the example below:

Host foo.company.com
Port 22

Host *.company.com
Port 10022

With this example, ssh connection to “foo.company.com” is done using port 22 and connection to “bar.company.com” with port 10022.

If the Host sections are written back in reverse order:

Host *.company.com
Port 10022

Host foo.company.com
Port 22

Then, ssh would be happy to use the first matching section for “foo.company.com”, i.e. “*.company.com”, and would use the wrong port (10022).
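
The effect of section order can be reproduced with a small Python sketch. It is a simplification: it only handles the Port parameter, and it uses fnmatch as a stand-in for ssh's own pattern matching. The function name resolve_port is made up for this example.

```python
from fnmatch import fnmatchcase


def resolve_port(host, sections, default=22):
    """Return the Port of the first Host section matching `host`.

    `sections` is a list of (pattern, port) pairs in file order,
    mimicking the 'first obtained value wins' rule of ssh_config.
    """
    for pattern, port in sections:
        if fnmatchcase(host, pattern):
            return port
    return default


good_order = [("foo.company.com", 22), ("*.company.com", 10022)]
bad_order = list(reversed(good_order))

print(resolve_port("foo.company.com", good_order))  # 22
print(resolve_port("foo.company.com", bad_order))   # 10022
```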

This is now fixed with Config::Model::OpenSsh 2.8.4.3 which is available on CPAN and in Debian/experimental.

While I was at it, I’ve also updated the “Managing OpenSsh configuration with cme” wiki page.

All the best

An improved GUI for cme and Config::Model

I’ve finally found the time to improve the GUI of my pet project: cme (aka Config::Model).

Several years ago, I stumbled on a usability problem with the GUI. Some configurations (like OpenSsh or Systemd) feature a lot of configuration parameters. This means that the GUI displays all these parameters, so finding a specific parameter might be challenging:

To work around this problem, I added a Filter widget in 2018 which did more or less the job, but it suffered from several bugs which made its behavior confusing.

This is now fixed. The Filter widget is now working in a more consistent way.

In the example below, I’ve typed “IdentityFile” (1) in the Filter widget to show the identityFile used for various hosts (2):

Which is quite good, but some hosts use the default identity file so no value shows up in the GUI. You can then click on the “hide empty value” checkbox to show only the hosts that use a specific identity file:

I hope that this new behavior of the Filter box will make this project more useful.

The improved GUI was released with Config::Model::TkUI 1.374. This new version is available on CPAN and on Debian/experimental. It will be released on Debian/unstable once the next Debian version is out.

All the best

Security gotcha with log collection on Azure Kubernetes cluster.

Azure Kubernetes Service provides a nice way to set up a Kubernetes
cluster in the cloud. It’s quite practical as AKS is set up by default
with a rich monitoring and reporting environment. By default, all
container logs are collected, CPU and disk data are gathered. 👍

I used AKS to set up a cluster for my first client as a
freelance. Everything was nice until my client asked me why log
collection was as expensive as the compute resources. 💸

Ouch… 🤦

My first reflex was to reduce the amount of logs produced by all our
containers, i.e. start logging at warn level instead of info
level. This reduced the amount of logs quite a lot.

But this did not reduce the cost of collecting logs, which looks
to be a common issue.

Thanks to the documentation provided by Microsoft, I was able to find
that the ContainerInventory data table was responsible for more than 60%
of our logging costs.

What is ContainerInventory ? It’s a facility to monitor the content
of all environment variables from all containers.

Wait… What ? ⚠

Should we be worried about our database credentials which are, legacy
oblige, stored in environment variables ?

Unfortunately, the query shown below confirmed that, yes, we should:
the logs aggregated by Azure contain the database credentials of my
client.

ContainerInventory
| where TimeGenerated > ago(1h)

Having credentials collected in logs is lackluster from a security
point of view. 🙄

And we don’t need it because our environment variables do not change.

Well, it’s now time to fix these issues. 🛠

We’re going to:

  1. disable the collection of environment variables in Azure, which
    will reduce cost and plug the potential credential leak
  2. renew all DB credentials, because the previous credentials can be
    considered as compromised (The renewal of our DB passwords is quite
    easy with the script I provided to my client)
  3. pass credentials with files instead of environment variables.
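
As an illustration of the third point, here is a minimal Python sketch of reading a credential from a file. It assumes the secret is mounted into the container as a file (for instance from a Kubernetes Secret volume). The function name and the path in the usage note are hypothetical, and the services discussed here are written in NodeJS, so this only shows the idea.

```python
from pathlib import Path


def read_secret(path):
    """Return the content of a secret file without its trailing newline."""
    return Path(path).read_text(encoding="utf-8").rstrip("\n")
```

An application would then call something like read_secret("/run/secrets/mongo/password") at startup instead of reading an environment variable, so the value never shows up in the container environment.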

In summary, the service provided by Azure is still nice, but beware of
the default configuration which may contain surprises.

I’m a freelance, available for hire. The https://code-straight.fr site
describes how I can help your projects.

All the best

How to run CEWE photo creator on Debian

Hi

This post describes how I debugged an issue with a piece of proprietary software. I hope this will give you some hints on how to proceed should you face a similar issue. If you’re in a hurry, you can read the TL;DR; version at the end.

After the summer vacations, I decided to offer a photo-book to my mother. I searched for an open-source solution but the printed results were lackluster.

Unfortunately, the only possible solution was to use a professional service. Some of these services offer a web application to create photo books, but this is painful to use on a slow DSL line. Other services provide a program named CEWE. This proprietary program can be downloaded for Windows, Mac and, lo and behold: Linux !

The download goes quite fast as the downloaded program is a Perl script that does the actual download. I would have preferred a proper Debian package, but at least Linux amd64 is supported.

Once installed, the CEWE program is available as an executable and a bunch of shared libraries.

This program works quite well to create a photo album. I won’t go into the details there.

I ran into trouble when trying to connect the application to the service site to order the photo-book: the connection fails with a cryptic message “error code 10000”.

Commercial support was not much help as they insisted that I check my proxy settings. I downloaded CEWE again from another photo service. The new CEWE installation gave me the same error. This showed that the issue was on my side and not on the server’s side.

Given that the error occurred quite fast when trying to connect, I guessed that the connection setup was going south. Since the URL shown in the installation script began with https, I had to check for SSL issues.

I checked certificate issues: curl had no problem connecting to the server mentioned in the Perl script. Wireshark showed that the connection to the server was reset by the server quite fast. I wondered which version of SSL was used by CEWE and ran ldd. To my surprise, I found that ldd did not list libssl. Something weird was going on: SSL was required but CEWE was not linked to libssl…

I used another trick: explore all the menus of the application. This was a good move as I found a checkbox to enable debug report in CEWE in “Options -> paramètres -> Service” menu (that may be “options-> parameters -> support” in English CEWE). When set, debug traces are also shown on standard output of CEWE.

And, somewhere in the debug traces, I found:

W (2018-10-30T18:36:37.143) [ 0] ==> QSslSocket: cannot resolve SSLv3_client_method <==

So CEWE was looking for SSL symbols even though ldd did not require libssl…

I guessed that CEWE was using dlopen to open the ssl library. But which file was opened by dlopen ?

Most likely, the guys who wrote the call to dlopen did not want to handle file names with so version (i.e. like libssl.so.1.0.2), and added code to open directly libssl.so. This file is provided by libssl-dev package, which was already installed on my system.
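
The same kind of check can be reproduced outside CEWE. The sketch below uses Python's ctypes, which goes through dlopen on Linux, to load a library by name and test whether a symbol can be resolved. It is only an illustration of the mechanism, not what CEWE or Qt actually runs, and the helper name has_symbol is made up.

```python
import ctypes


def has_symbol(library, symbol):
    """Tell whether `symbol` can be resolved in `library`.

    Returns None when the library itself cannot be loaded,
    True or False otherwise.
    """
    try:
        lib = ctypes.CDLL(library)  # dlopen() under the hood on Linux
    except OSError:
        return None
    # Attribute lookup triggers symbol resolution in the loaded library.
    return hasattr(lib, symbol)
```

On the system described here, has_symbol("libssl.so", "SSLv3_client_method") would be the call to try: a False result would match the "cannot resolve" trace shown above.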

But wait, CEWE was probably written for Debian stable with an older libssl. I tried libssl1.0-dev… which conflicts with libssl-dev. Oh well, I can live with that for a while…

And that was it ! With libssl1.0-dev installed, CEWE was able to connect to the photo service web site without problems.

So here’s the TL;DR; version. To run CEWE on Debian, run:

sudo apt install libssl1.0-dev

Last but not least, here are some suggestions for CEWE:

  • use libssl1.1, as libssl1.0 is deprecated and will be removed from Debian
  • place the debug checkbox in “System” widget. This widget was the first I opened when I began troubleshooting. “Service” does not mean much to me. Having this checkbox in both “Service” and “System” widgets would not harm

All the best

[ Edit: I first blamed CEWE for loading libssl in a non-standard way. libssl is actually loaded by QtNetwork. Depending on the way Qt is built, SSL is either disabled (-no-openssl option), loaded by dlopen (default) or loaded with dynamic linking (-openssl-linked). The way Qt is built is CEWE choice. Thanks Uli Schlachter for the heads-up]

New Software::LicenseMoreUtils Perl module

Hello

The Debian project has rather strict requirements regarding package licenses. One of these requirements is to provide a copyright file mentioning the license of the files included in a Debian package.

Debian also recommends providing this copyright information in a machine readable format that contains the whole text of the license(s) or a summary pointing to a pre-defined location on the file system (see this example).

cme and Config::Model::Dpkg::Copyright help with this task using the Software::License module. But this module lacks the following features to properly support the requirements of Debian packaging:

  • license summary
  • support for clause like “GPL version 2 or (at your option) any later version”

Long story short, I’ve written Software::LicenseMoreUtils to provide these missing features. This module is a wrapper around Software::License and has the same API.

Adding license summaries for Debian only requires updating this YAML file.

This module was written for Debian while keeping other distros in mind. Debian derivatives like Ubuntu or Mint are supported. Adding license summaries for other Linux distributions is straightforward. Please submit a bug or a PR to add support for other distributions.

For more details, please see:

 

All the best

Shutter, a nice Perl application, may be removed from Debian

Hello

Debian is moving away from Gnome2::VFS. This obsolete module will be removed from the next release of Debian.

Unfortunately, Shutter, a very nice Gtk2 screenshot application, depends on Gnome2::VFS, which means that Shutter will be removed from Debian unless this dependency is removed from Shutter. This would be a shame as Shutter is one of the best screenshot tools available on Linux and one of the best looking Perl applications. And its popularity is still growing.

Shutter also provides a way to edit screenshots, for instance to mask confidential data. This graphical editor is based on Goo::Canvas which is already gone from Debian.

To be kept on Debian, Shutter must be updated:

  • to use Gnome GIO instead of Gnome2::VFS
  • to use GooCanvas2 instead of Goo::Canvas
  • maybe, to be ported to Gtk3 (that’s less urgent)

I’ve done some work to port Shutter to GIO, but I need to face reality: maintaining cme is taking most of my free time and I don’t have the time to overhaul Shutter.

To view or clone the code, you can either:

See also the bug reports about Shutter problems on Ubuntu bug tracker

I hope this blog will help find someone to maintain Shutter…

All the best.

cme: some read-write backend features are being deprecated

Hello

Config::Model and cme read and write configuration data with a set of “backend” classes, like Config::Model::Backend::IniFile. These classes are managed by Config::Model::BackendMgr.

Well, that’s the simplified view. Actually, the backend manager can handle several different backends to read and write data: read backends are tried until one of them succeeds in reading configuration data. And the write backend can be different from the read backend, thus offering the possibility to migrate from one format to another. This feature came at the beginning of the project, back in 2005. It felt like a good idea to let users migrate from one data format to another.

More than 10 years later, this feature has never been used and is handled by a bunch of messy code that hampers further evolution of the backend classes.

So, without further ado, I’m going to deprecate the following features in order to simplify the backend manager:

  • The “custom” backend that can be easily replaced with a more standard backend based on Config::Model::Backend::Any. This feature has been deprecated with Config::Model 2.107
  • The possibility to specify more than one backend. Soon, only the first read backend will be taken into account. This will simplify the declaration of backends. The “read_config” parameter, which is currently a list of backend specifications, will become a single backend specification. The command cme meta edit will handle the migration of existing models to the new scheme.
  • the “write_config” parameter will be removed.

Unless someone objects, actual removal of these features will be done in the next few months, after a quite short deprecation period.

All the best

New with cme: a GUI to configure Systemd services

Hello

Systemd is powerful, but creating a new service is a task that requires creating several files in non-obvious locations (like /etc/systemd/system or ~/.local/share/systemd/user/). Each file features 2 or more sections (e.g. [Unit], [Service]). And each section supports a lot of parameters.

Creating such Systemd configuration files can be seen as a daunting task for beginners.

The cme project aims to make this task easier by providing a GUI that:

  • shows all existing services in a single screen
  • shows all possible sections and parameters with their documentation
  • validates the content of each parameter (if possible)

For instance, on my laptop, the command cme edit systemd-user shows 2 custom services (“free-imap-tunnel@” and “gmail-imap-tunnel@”) with:

cme_edit_systemd_001

The GUI above shows the units for my custom systemd files:

$ ls ~/.config/systemd/user/
free-imap-tunnel@.service
free-imap-tunnel.socket
gmail-imap-tunnel@.service
gmail-imap-tunnel.socket
sockets.target.wants

and the units installed by Debian packages:

$ find /usr/lib/systemd/user/ -maxdepth 1 \
  '(' -name '*.service' -o -name '*.socket' ')' \
  -printf '%f\n' |sort |head -15
at-spi-dbus-bus.service
colord-session.service
dbus.service
dbus.socket
dirmngr.service
dirmngr.socket
glib-pacrunner.service
gpg-agent-browser.socket
gpg-agent-extra.socket
gpg-agent.service
gpg-agent.socket
gpg-agent-ssh.socket
obex.service
pulseaudio.service
pulseaudio.socket

The screenshot above shows the content of the service defined by the following file:

$ cat ~/.config/systemd/user/free-imap-tunnel@.service
[Unit]
Description=Tunnel IMAPS connections to Free with Systemd

[Service]
StandardInput=socket
# no need to install corkscrew
ExecStart=-/usr/bin/socat - PROXY:127.0.0.1:imap.free.fr:993,proxyport=8888

Note that empty parameters are not shown because the “hide empty value” checkbox on top right is enabled.

Likewise, cme can edit system files in the same way as user files, with sudo cme edit systemd:

cme_edit_systemd_001

For more details on how to use the GUI to edit systemd files, please see:

Using a GUI may not be your cup of tea. cme can also be used as a validation tool. Let’s add a parameter with an excessive value to my service:

$ echo "CPUShares = 1000000" >> ~/.local/share/systemd/user/free-imap-tunnel@.service

And check the file with cme:

$ cme check systemd-user 
cme: using Systemd model
loading data
Configuration item 'service:"free-imap-tunnel@" Service CPUShares' has a wrong value:
        value 1000000 > max limit 262144
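
To give an idea of what such a range check involves, here is a minimal Python sketch that reads a unit file with configparser and compares one integer parameter against an upper bound. It does not reflect how cme or its Systemd model is implemented. The function name check_max is made up, and the limit is simply passed in by the caller (262144 is the value reported by cme above).

```python
import configparser


def check_max(unit_text, section, key, limit):
    """Return an error message if `key` exceeds `limit`, else None."""
    # Unit files may repeat keys and contain '%' specifiers, hence
    # strict=False and no interpolation.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # keep the case of keys such as CPUShares
    parser.read_string(unit_text)
    raw = parser.get(section, key, fallback=None)
    if raw is None:
        return None
    value = int(raw)
    if value > limit:
        return f"value {value} > max limit {limit}"
    return None
```

Called with the content of the service file above, the section "Service", the key "CPUShares" and the limit 262144, this returns "value 1000000 > max limit 262144".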

OK, let’s fix this with cme. The wrong value can either be deleted:

$ cme modify systemd-user 'service:"free-imap-tunnel@" Service CPUShares~'
cme: using Systemd model

Changes applied to systemd-user configuration:
- service:"free-imap-tunnel@" Service CPUShares: '1000000' -> ''

Or modified:

$ cme modify systemd-user 'service:"free-imap-tunnel@" Service CPUShares=2048'
cme: using Systemd model

Changes applied to systemd-user configuration:
- service:"free-imap-tunnel@" Service CPUShares: '1000000' -> '2048'

You can also view the specification of a service using cme:

$ cme dump systemd-user 'service:"free-imap-tunnel@"'
---
Service:
  CPUShares: 2048
  ExecStart:
    - '-/usr/bin/socat -  PROXY:127.0.0.1:imap.free.fr:993,proxyport=8888'
  StandardInput: socket
Unit:
  Description: Tunnel IMAPS connections to Free with Systemd

The output above matches the content of the service configuration file:

$ cat ~/.local/share/systemd/user/free-imap-tunnel@.service
## This file was written by cme command.
## You can run 'cme edit systemd-user' to modify this file.
## You may also modify the content of this file with your favorite editor.

[Unit]
Description=Tunnel IMAPS connections to Free with Systemd

[Service]
StartupCPUWeight=100
CPUShares=2048
StartupCPUShares=1024
StandardInput=socket
# no need to install corkscrew now
ExecStart=-/usr/bin/socat -  PROXY:127.0.0.1:imap.free.fr:993,proxyport=8888

Last but not least, you can use cme shell if you want an interactive UI but cannot use a graphical interface:

$ cme shell systemd-user 
cme: using Systemd model
 >:$ cd service:"free-imap-tunnel@"  Service  
 >: service:"free-imap-tunnel@" Service $ ll -nz Exec*
name      │ type │ value
──────────┼──────┼───────────────────────────────────────────────────────────────────
ExecStart │ list │ -/usr/bin/socat -  PROXY:127.0.0.1:imap.free.fr:993,proxyport=8888

 >: service:"free-imap-tunnel@" Service $ ll -nz
name             │ type    │ value
─────────────────┼─────────┼───────────────────────────────────────────────────────────────────
StartupCPUWeight │ integer │ 100
CPUShares        │ integer │ 2048
StartupCPUShares │ integer │ 1024
StandardInput    │ enum    │ socket
ExecStart        │ list    │ -/usr/bin/socat -  PROXY:127.0.0.1:imap.free.fr:993,proxyport=8888

 >: service:"free-imap-tunnel@" Service $ set CPUShares=1024
 >: service:"free-imap-tunnel@" Service $ ll -nz CPUShares 
name      │ type    │ value
──────────┼─────────┼──────
CPUShares │ integer │ 1024

 >: service:"free-imap-tunnel@" Service $ quit


Changes applied to systemd-user configuration:
- service:"free-imap-tunnel@" Service CPUShares: '2048' -> '1024'

write back data before exit ? (Y/n)

Currently, only service, socket and timer units are supported. Please create a bug report on GitHub if you need more.

Installation instructions are detailed at the beginning of Managing Systemd configuration with cme wiki page.

Like all software, cme probably has bugs. Please report any issue you might have with it.

For more information:

All in all, systemd is quite complex to set up. I hope I made it a little bit easier to deal with.

All the best

New dzil command to install author dependencies as Debian packages

Hello

Dist::Zilla is a great tool to limit tedious tasks while working on Perl modules. For instance, dzil provides tools like dzil authordeps or dzil listdeps to list dependencies.
This list of Perl modules can then be installed with cpanm:

dzil authordeps --missing | cpanm
dzil listdeps --missing | cpanm

On a Debian system, one may prefer to install Perl modules using Debian packages. Installing build dependencies can be done with apt build-dep, but apt does not handle Dist::Zilla author dependencies.

The new authordebs Dist::Zilla sub-command was written to fill this gap. When run in a directory containing the source of a Perl module that uses Dist::Zilla, dzil authordebs lists the Debian packages required to run the dzil command. You can also run dzil authordebs --install to install author dependencies (using sudo under the hood).

See:

On Debian, authordebs is provided by libdist-zilla-app-command-authordebs-perl
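
The package name above follows the usual Debian naming convention for Perl modules: "lib" prefix, lower-cased module name with "::" replaced by "-", and "-perl" suffix. The Python sketch below illustrates that convention only. It is not how the authordebs command looks up packages, some Debian packages do not follow the convention, and the function name is made up.

```python
def debian_package_name(module):
    """Guess the Debian package name of a Perl module by convention."""
    return "lib" + module.replace("::", "-").lower() + "-perl"


print(debian_package_name("Dist::Zilla::App::Command::authordebs"))
# libdist-zilla-app-command-authordebs-perl
```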

All the best