Commit b823537

Merge pull request #19 from uswitch/AIRSHIP-3913-move-node-problem-detector
Airship 3913 move node problem detector
2 parents ef120c0 + e22cf33 commit b823537

5 files changed

Lines changed: 47 additions & 22 deletions

.drone.yml

Lines changed: 0 additions & 18 deletions
This file was deleted.

.github/rvu/labels.yaml

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+service.rvu.co.uk/brand: airship

.github/workflows/push.yaml

Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
+name: push
+on: push
+permissions:
+  contents: read
+  id-token: write
+jobs:
+  build:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - name: Login to Quay.io
+        uses: docker/login-action@v3
+        with:
+          registry: quay.io
+          username: ${{ secrets.QUAY_USERNAME }}
+          password: ${{ secrets.QUAY_PASSWORD }}
+      - id: meta
+        uses: docker/metadata-action@v5
+        with:
+          images: quay.io/uswitch/node-problem-detector
+          tags: type=sha,prefix=,format=long
+      - uses: docker/build-push-action@v6
+        with:
+          context: .
+          labels: ${{ steps.meta.outputs.labels }}
+          push: true
+          tags: ${{ steps.meta.outputs.tags }}
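
With `tags: type=sha,prefix=,format=long`, the metadata action should tag the image with the full commit SHA and no `sha-` prefix. A minimal sketch of how a consumer might derive the matching image reference for a commit; `image_ref` is a hypothetical helper name, not part of this commit:

```shell
# Hypothetical helper: build the image reference the push workflow
# is expected to publish for a given commit.
image_ref() {
  sha="$1"
  # format=long implies the full 40-character lowercase hex commit SHA.
  case "$sha" in
    *[!0-9a-f]*)
      echo "expected a lowercase hex commit SHA" >&2
      return 1
      ;;
  esac
  if [ "${#sha}" -ne 40 ]; then
    echo "expected a full 40-character commit SHA" >&2
    return 1
  fi
  # prefix= (empty) means the tag is the bare SHA.
  printf 'quay.io/uswitch/node-problem-detector:%s\n' "$sha"
}

# Usage, inside a checkout of the repository:
#   docker pull "$(image_ref "$(git rev-parse HEAD)")"
```

Validating the SHA up front keeps a short or abbreviated hash from silently producing a tag that was never pushed.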

README.md

Lines changed: 16 additions & 1 deletion
@@ -1 +1,16 @@
-Adding our own scripts to https://github.com/kubernetes/node-problem-detector
+# Node Problem Detector custom scripts
+
+Adding our own scripts to https://github.com/kubernetes/node-problem-detector and sharing them in case you find them handy for your use cases.
+
+
+The script details can be found in `/config/plugin/`, but in short they are:
+* `launch-config-drift`: checks whether your instance's launch template has diverged from your ASG's launch template
+* `spot-termination`: uses the `meta-data/spot/instance-action` endpoint to check for an EC2 Spot Instance interruption notice
+* `local-dns-resolver`: checks the response status value received (if any) from the local DNS resolver IP
+* `upstream-dns-resolver`: checks whether we receive an IPv4 address for a given A record
+* `uptime`: every 5 seconds, checks whether the system uptime since its last restart is acceptable (for us, the threshold is 604800 seconds)
+
+
+## Notes
+*July 2024 -* The custom `node problem detector` image is now stored in the `uswitch/node-problem-detector` repository on Quay.
+<br>
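
The plugin scripts themselves are not part of this diff, so the following is only a sketch of how two of the checks listed in the README could map their inputs to exit codes. It assumes the node-problem-detector custom plugin convention (0 = OK, 1 = not OK, 2 = unknown), assumes an uptime at or above the threshold counts as a problem, and relies on the instance metadata service answering 404 for `spot/instance-action` when no interruption is scheduled. The function names are hypothetical.

```shell
# Exit codes, assumed to follow the custom plugin convention.
OK=0
NONOK=1
UNKNOWN=2

# Threshold from the README: 604800 seconds (7 days).
UPTIME_THRESHOLD="${UPTIME_THRESHOLD:-604800}"

# uptime_status SECONDS
# Returns OK below the threshold, NONOK at or above it,
# and UNKNOWN when the input is not a whole number.
uptime_status() {
  seconds="$1"
  case "$seconds" in
    ''|*[!0-9]*) return "$UNKNOWN" ;;
  esac
  if [ "$seconds" -lt "$UPTIME_THRESHOLD" ]; then
    return "$OK"
  fi
  return "$NONOK"
}

# read_uptime_seconds
# Prints whole seconds since boot, taken from /proc/uptime (Linux only).
read_uptime_seconds() {
  cut -d. -f1 /proc/uptime
}

# spot_status HTTP_CODE
# Maps the HTTP status of a request to meta-data/spot/instance-action:
# 404 means no interruption notice, 200 means one is pending.
spot_status() {
  case "$1" in
    404) return "$OK" ;;
    200) return "$NONOK" ;;
    *)   return "$UNKNOWN" ;;
  esac
}

# Usage in a plugin script:
#   uptime_status "$(read_uptime_seconds)"; exit $?
```

Keeping the decision logic in small functions that take plain values makes the thresholds easy to exercise without a real node or metadata endpoint.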

config/plugin/launch_config_drift.sh

Lines changed: 3 additions & 3 deletions
@@ -16,7 +16,7 @@ then
 fi
 
 instance="$(echo "${instances}" | jq '.AutoScalingInstances[0]')"
-instance_launch_config="$(echo "${instance}" | jq -r .LaunchConfigurationName)"
+instance_launch_config="$(echo "${instance}" | jq -r .LaunchTemplate.LaunchTemplateName)"
 instance_asg="$(echo "${instance}" | jq -r .AutoScalingGroupName)"
 
 asgs="$(aws autoscaling describe-auto-scaling-groups --auto-scaling-group-names ${instance_asg})"
@@ -26,11 +26,11 @@ then
   exit $UNKNOWN
 fi
 
-asg_launch_config="$(echo "${asgs}" | jq -r '.AutoScalingGroups[0].LaunchConfigurationName')"
+asg_launch_config="$(echo "${asgs}" | jq -r '.AutoScalingGroups[0].MixedInstancesPolicy.LaunchTemplate.LaunchTemplateSpecification.LaunchTemplateName')"
 
 if [ "${instance_launch_config}" = "${asg_launch_config}" ]
 then
   exit $OK
 else
   exit $NONOK
-fi
+fi
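
One thing the visible hunks do not show is what happens when a field is absent: `jq -r` prints the literal string `null` for a missing key, for example on an Auto Scaling group that has no `MixedInstancesPolicy`. The sketch below shows one way the final comparison could guard against that. It is an illustration only; `is_missing` and `compare_launch_templates` are hypothetical names, and the exit code values are assumed to match the `$OK`, `$NONOK` and `$UNKNOWN` variables used by the script.

```shell
# Assumed values for the script's exit code variables.
OK=0
NONOK=1
UNKNOWN=2

# is_missing VALUE
# True when jq -r produced nothing usable: empty output or the string "null".
is_missing() {
  [ -z "$1" ] || [ "$1" = "null" ]
}

# compare_launch_templates INSTANCE_TEMPLATE ASG_TEMPLATE
# Returns UNKNOWN if either name is missing, OK if the names match,
# and NONOK if they differ.
compare_launch_templates() {
  instance_name="$1"
  asg_name="$2"
  if is_missing "$instance_name" || is_missing "$asg_name"; then
    return "$UNKNOWN"
  fi
  if [ "$instance_name" = "$asg_name" ]; then
    return "$OK"
  fi
  return "$NONOK"
}

# Usage at the end of the plugin script:
#   compare_launch_templates "${instance_launch_config}" "${asg_launch_config}"
#   exit $?
```

Without such a guard, two missing fields would both read as `null` and compare as equal, which would report OK for a node whose template could not be determined.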
