Manage & Operate NetBox

Backup, restore, and perform health checks and maintenance on your NetBox instance

NetBox Backup Guide

Introduction

The NetBox backup process provides regular, secure backups. In summary, the Ansible playbook does the following:

  1. Manages retention based on your preferences, e.g. deletes local backups older than 14 days.
  2. Backs up and encrypts the media directory.
  3. Backs up and encrypts the netbox PostgreSQL database.
  4. Captures the current state of NetBox from the /api/status endpoint and saves the output in a JSON file, which is compared against the target system's state during a restore.
  5. Optionally transfers the three backup files to a secure remote server over SFTP.
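
As a rough illustration of the naming and encryption in steps 2 and 3, the shell sketch below builds a timestamp in the backup filename format and encrypts a single file. It is not the playbook itself: the function names are hypothetical, and it assumes the backup uses openssl enc with AES-256-CBC and a password, mirroring the decrypt command given in the Restoring from Backup section.

```shell
# Hypothetical helpers; the real work is done by the Ansible playbook.

# Print a timestamp in the mm_dd_yyyy_HH_MM format used in backup filenames.
backup_stamp() {
    date +%m_%d_%Y_%H_%M
}

# Encrypt "$1" into "$1.enc", reading the password from the ENCRYPTION_KEY
# environment variable. The cipher options are an assumption chosen to match
# the documented decrypt command (openssl enc -aes-256-cbc -d).
encrypt_file() {
    openssl enc -aes-256-cbc -salt -in "$1" -out "$1.enc" -pass env:ENCRYPTION_KEY
}
```

For example, after exporting ENCRYPTION_KEY, calling encrypt_file on "netbox_media_$(backup_stamp).tar.gz" would produce a matching .tar.gz.enc file next to it.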

The NetBox process is stopped during the backup so that no changes are made to the database while it runs, which protects data integrity. The outage is typically less than a minute, but take it into account if you have anything connecting to NetBox, e.g. automation tools. Netos Pod (Airflow) has a retry mechanism that handles broken connectivity to NetBox.

image.png

Cron Scheduling

The cron scheduler in Semaphore can be configured to back up the database at regular intervals, for example, at 02:30 every day.

image.png

Note that there is a bug in the Ansible Semaphore UI that causes the same task to run many times. The solution is to toggle the "Show cron format" button and use UNIX cron formatting, as shown here.
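
For reference, a daily 02:30 run in standard five-field UNIX cron syntax is 30 2 * * * (minute, hour, day of month, month, day of week). The snippet below is a small sketch for sanity-checking an expression before pasting it into Semaphore. The helper name is hypothetical, and it only counts fields; it does not validate the values.

```shell
# Daily at 02:30: minute hour day-of-month month day-of-week
CRON_EXPR='30 2 * * *'

# Hypothetical helper: succeed only if the expression has exactly five
# whitespace-separated fields.
cron_has_five_fields() {
    [ "$(printf '%s\n' "$1" | awk '{ print NF }')" -eq 5 ]
}
```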

On the server, you can see the daily 02:00 encrypted backups in the red boxes. Below that, you can see an example of the JSON status that was captured when the backup was executed, so you know the exact version of NetBox and installed plugins for a restore.

image.png

Backup File Rotation

You can set the retention period for backup files stored in /netos/backups/netbox in the NetBox Backup Settings Semaphore environment.

image.png

Use the exact lowercase values days or weeks for the unit; capitalised values such as Days or Weeks are not accepted.
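
The retention step can be pictured as a find over the backup directory. The sketch below is an assumption about how a setting such as 14 days or 2 weeks might translate into a deletion; the function names are hypothetical and the playbook's actual task may differ.

```shell
# Convert a retention value and unit ("days" or "weeks", lowercase) to days.
retention_days() {
    case "$2" in
        days)  echo "$1" ;;
        weeks) echo $(( $1 * 7 )) ;;
        *)     echo "unsupported unit: $2" >&2; return 1 ;;
    esac
}

# Delete NetBox backup files directly inside directory "$1" whose age in
# whole days exceeds "$2". Other files in the directory are left alone.
prune_backups() {
    find "$1" -maxdepth 1 -type f -name 'netbox_*' -mtime +"$2" -delete
}
```

Rejecting anything other than the two lowercase units mirrors the requirement above and avoids silently applying a wrong retention period.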

Remote SFTP

To enable remote SFTP, change the No value in the SFTP_ENABLED variable to Yes, and set the SFTP_HOST/USER/PASS values accordingly.

image.png

Restoring from Backup

The NetBox Restore Process is one way to restore. If you want to manually decrypt the enc files on your local workstation, use the following command, choosing an output filename that ends in .sql.gz for the database backup or .tar.gz for the media backup.

openssl enc -aes-256-cbc -d -in BACKUP_FILE.enc -out OUTPUT_FILE

You will need the password set in the NetBox Backup Settings Semaphore environment variable ENCRYPTION_KEY. For example, by running this command and entering the password, we decrypt the NetBox media directory:

openssl enc -aes-256-cbc -d -in netbox_media_09_25_2024_02_00.tar.gz.enc -out netbox_media_09_25_2024_02_00.tar.gz
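
After decrypting, it is worth confirming that the archive is intact before relying on it. The sketch below wraps the same openssl command and then tests the gzip stream. The function name is hypothetical, and passing the password through the ENCRYPTION_KEY environment variable instead of the interactive prompt is an assumption made for scripting convenience.

```shell
# Decrypt "$1" (a .enc backup) to "$2", then verify the gzip stream.
# The password is read from the ENCRYPTION_KEY environment variable.
# Returns non-zero if decryption fails or the output is not valid gzip,
# which is what a wrong password typically produces.
decrypt_backup() {
    openssl enc -aes-256-cbc -d -in "$1" -out "$2" -pass env:ENCRYPTION_KEY &&
        gzip -t "$2"
}
```

For the media archive, running tar -tzf on the decrypted file afterwards lists its contents without extracting anything.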

NetBox Development Snapshots

Use Backups for Labs

The Netos team use the tools outlined in these guides to deploy, restore, develop, and manage dozens of NetBox development environments, typically with many changes and deployments each day.

The original backup below used the standard timestamp format mm_dd_yyyy_HH_MM. However, we renamed it so we can snapshot different useful development environments pre-populated with data. We can then have many backups and images deployed to different NetBox instances on Semaphore to test and stage.

image.png

As long as you replace only the timestamp portion of the filename with your own text, you can set that text in the RESTORE_FILE_STAMP variable in the NetBox Backup Settings Semaphore environment. Here are some examples:

Before                                      After
netbox_db_10_01_2024_15_24.sql.gz.enc       netbox_db_01_Data_Feeds_and_Excel.sql.gz.enc
netbox_media_10_01_2024_15_24.tar.gz.enc    netbox_media_01_Data_Feeds_and_Excel.tar.gz.enc
netbox_status_10_01_2024_15_24.json         netbox_status_01_Data_Feeds_and_Excel.json
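
Renaming all three files by hand is easy to get wrong, so a small helper can keep them in step. This is a minimal sketch with a hypothetical function name; it assumes the three files sit in one directory and follow the naming pattern shown above.

```shell
# Rename the three files of one backup in directory "$1" from stamp "$2"
# to label "$3". Only the stamp portion of each filename changes.
label_snapshot() {
    mv "$1/netbox_db_$2.sql.gz.enc"    "$1/netbox_db_$3.sql.gz.enc" &&
    mv "$1/netbox_media_$2.tar.gz.enc" "$1/netbox_media_$3.tar.gz.enc" &&
    mv "$1/netbox_status_$2.json"      "$1/netbox_status_$3.json"
}
```

Calling it with the backup directory, 10_01_2024_15_24, and 01_Data_Feeds_and_Excel would produce the names in the After column; 01_Data_Feeds_and_Excel is then the value to use for RESTORE_FILE_STAMP. Swap mv for cp if you want to keep the timestamped originals.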

image.png

Additionally, the generated timestamp is displayed during the backup process, for example, 10_01_2024_18_35. You could use this value, as per the screenshot above, to quickly roll back.

image.png

NetBox Restoration Guide

Introduction

The restore process works by taking a backup generated by the NetBox backup process, decrypting it, and restoring it. See the NetBox Development Snapshots section in the backup guide for an approach to using this feature for dev/test.

The ENCRYPTION_KEY value in the NetBox Backup Settings Semaphore environment is used in both the backup and restore processes. Keep track of it: if you change it on the target system, all future backups will use the new key, while existing backups can only be decrypted with the key that was set when they were created.

Configure the Restore

The NetBox backup process will create three timestamped (mm_dd_yyyy_HH_MM) files like these. The enc files are encrypted using the ENCRYPTION_KEY variable.

netbox_db_03_18_2024_01_00.sql.gz.enc
netbox_media_03_18_2024_01_00.tar.gz.enc
netbox_status_03_18_2024_01_00.json

You can manually decrypt the files using the command openssl enc -aes-256-cbc -d -in BACKUP_FILE.enc -out OUTPUT_FILE, where OUTPUT_FILE ends in .sql.gz for the database backup or .tar.gz for the media backup.

The restore file stamp (the RESTORE_FILE_STAMP variable) is set in the NetBox Backup Settings environment and is the timestamp shared by all three files. Set this to match the timestamp of the files that you place in the /netos/netbox/backups directory.
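
Before starting a restore, you may want to confirm that all three files for the chosen stamp are actually in place. The following is a minimal sketch with a hypothetical function name, assuming the filename pattern shown above.

```shell
# Succeed only if all three backup files for stamp "$2" exist in
# directory "$1". Reports the first missing file on stderr.
restore_files_present() {
    for f in "netbox_db_$2.sql.gz.enc" "netbox_media_$2.tar.gz.enc" "netbox_status_$2.json"; do
        if [ ! -f "$1/$f" ]; then
            echo "missing: $1/$f" >&2
            return 1
        fi
    done
}
```

For example, restore_files_present /netos/netbox/backups 03_18_2024_01_00 succeeds only when the three files listed earlier are present.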

image.png

Plugin Version Validation Logic

By default, when the restore runs it checks the NetBox version and plugin versions of the target system by comparing the /api/status output captured at backup time with the versions reported by the target system.

This is an example of the relevant JSON: 

    "netbox-version": "3.7.8",
    "plugins": {
        "netos": "1.4.6",
        "netos_fabric": "1.3.3",
        "netos_model_builder": "1.2.37",
        "netos_reporting": "1.3.16"
    }
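
The comparison can be approximated by hand with jq and diff. This sketch is not the playbook's logic: the function names are hypothetical, it assumes jq is installed, and it assumes both inputs are complete JSON documents containing the netbox-version and plugins keys shown above.

```shell
# Print only the version-relevant keys of a status file, with sorted keys
# so that two files can be compared line by line.
status_versions() {
    jq -S '{"netbox-version": ."netbox-version", plugins: .plugins}' "$1"
}

# Compare the backup's status file "$1" with the target's status file "$2".
# Prints a unified diff and returns non-zero on any mismatch.
compare_status() {
    a=$(mktemp) && b=$(mktemp) || return 2
    status_versions "$1" > "$a"
    status_versions "$2" > "$b"
    diff -u "$a" "$b"
    rc=$?
    rm -f "$a" "$b"
    return "$rc"
}
```

The target's status file could be produced by saving the response of the target's /api/status endpoint to a file first, for instance with curl; whether that request needs an API token depends on how your instance is configured.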

In the event of a mismatch, you will see a warning like this that details the discrepancies between the backup and target system.

image.png

To disable this check, change BYPASS_PLUGIN_CHECK to Yes. Disabling may be acceptable when restoring to a slightly different target system, for example, where the Netos Insights plugin has moved from version 1.2.24 to 1.2.26, a patch-level change.

image.png

When BYPASS_PLUGIN_CHECK is set to Yes the restore process will look like this:

image.png

Any NetBox Open Source Plugins or Netos Enterprise NetBox Plugins installed using Semaphore will be named according to their version and saved in /netos/working-dir/netbox-plugins and /netos/working-dir/netos-plugins. The /netos/working-dir directory is backed up each day, assuming you enable the schedule in Semaphore.

NetBox Health Checks & Maintenance

Introduction

To keep your NetBox instance operating smoothly, you can run housekeeping and maintenance scripts.

NetBox Housekeeping Script

NetBox includes a housekeeping management command that should be run nightly. The tasks it handles are described in the official documentation: netbox/docs/administration/housekeeping.md at develop · netbox-community/netbox (github.com)

image.png

As per the NetBox developers' recommendation, you can schedule the housekeeping script to run each day by enabling the scheduled job in Semaphore.
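
If you ever need to trigger housekeeping by hand, the underlying call is a Django management command. The sketch below assumes a standard /opt/netbox install that uses its own virtual environment and a NetBox release that still ships the housekeeping command; the wrapper function and the two variable names are hypothetical.

```shell
# Paths assume a standard /opt/netbox install with its virtual environment;
# override NETBOX_PYTHON or NETBOX_MANAGE if your layout differs.
NETBOX_PYTHON=${NETBOX_PYTHON:-/opt/netbox/venv/bin/python}
NETBOX_MANAGE=${NETBOX_MANAGE:-/opt/netbox/netbox/manage.py}

# Run NetBox's housekeeping management command once.
run_housekeeping() {
    "$NETBOX_PYTHON" "$NETBOX_MANAGE" housekeeping
}
```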

image.png

NetBox Health Check

The NetBox Health Check script runs some useful commands to quickly capture the state of your NetBox instance, for example, checking processes are running, checking system logs, etc.

systemctl is-active netbox
systemctl is-active netbox-rq
python3 /opt/netbox/netbox/manage.py check
grep -i 'error' /var/log/nginx/netbox.error.log || true
grep -i 'warning' /var/log/nginx/netbox.error.log || true
journalctl -u netbox --since '24 hours ago' -n 250 | grep -i 'error' || true
journalctl -u netbox-rq --since '24 hours ago' -n 250 | grep -i 'error' || true
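
If you want to run the pass/fail style checks from the list above yourself and get a one-line summary per check, a small wrapper is enough. This is a sketch rather than the Health Check script itself; run_check and the failures counter are hypothetical names.

```shell
# Run one check, print a labelled PASS/FAIL line, and keep going on failure.
# Usage: run_check "label" command [args...]
run_check() {
    label=$1
    shift
    if "$@" >/dev/null 2>&1; then
        echo "PASS  $label"
    else
        echo "FAIL  $label"
        failures=$((failures + 1))
    fi
}

failures=0
# Example usage with the service and Django checks from this section:
# run_check "netbox service"    systemctl is-active netbox
# run_check "netbox-rq service" systemctl is-active netbox-rq
# run_check "django check"      python3 /opt/netbox/netbox/manage.py check
# [ "$failures" -eq 0 ]   # non-zero exit status if any check failed
```

The log searches are left out of the wrapper on purpose: their value is in the matching lines they print, not in an exit status.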

When you run the script, you'll see output like this:

image.png

In this scenario, we could quickly see that there was an error when starting NetBox, which pointed us in the right direction for troubleshooting a syntax issue in settings.py (using the commands detailed above).

image.png