# Diagnostic Collector for Apache Cassandra™, DSE™, HCD™, …

A script for collecting a diagnostic snapshot (support bundle) from each node in a Cassandra-based cluster.

# Users: Running the Collector against your Cluster

Download the latest `ds-collector.GENERIC-*.tar.gz` release from the [releases page](https://github.com/datastax/diagnostic-collection/releases).

Extract this `ds-collector*.tar.gz` tarball onto a bastion or jumpbox that has access to the nodes in the cluster.

Instructions for running the Collector are found in [`ds-collector/README.md`](https://github.com/datastax/diagnostic-collection/blob/master/ds-collector/README.md). If you hit any issues, please also read [`ds-collector/TROUBLESHOOTING.md`](https://github.com/datastax/diagnostic-collection/blob/master/ds-collector/TROUBLESHOOTING.md). These instructions are also bundled into the built collector tarball.

# Developers: Building from source code

The code for the collector script is in the _ds-collector/_ directory. The top-level directory contains the `Makefile` for developers wishing to build the ds-collector bundle themselves. The _ds-collector/_ code gets built into a `ds-collector*.tar.gz` tarball.

## Pre-configuring the Collector Configuration

When building the collector, it can be instructed to pre-configure `collector.conf` by setting the following variables:

```bash
# If the target cluster is DataStax Enterprise, set is_dse=true; otherwise Apache Cassandra is assumed.
export is_dse=true

# If the target cluster is running in docker containers, set is_docker=true. The script will then issue commands via docker and not ssh.
export is_docker=true

# If the target cluster is running on k8s, set is_k8s=true. The script will then issue commands via kubectl and not ssh.
export is_k8s=true
```

If no variables are set, the collector will be pre-configured to assume Apache Cassandra running on hosts that can be accessed via SSH.

## Building the Collector

Build the collector using the following make command syntax. You will need make and Docker.

```bash
# The ISSUE variable is typically a JIRA ID, but can be any unique label
export ISSUE=

make
```

This generates a _.tar.gz_ tarball with the `issueId` set in the packaged configuration file. The archive will be named in the format `ds-collector.$ISSUE.tar.gz`.

## Building the Collector with automatic S3 upload ability

If the collector is built with the following variables defined, all collected diagnostic snapshots will be encrypted and uploaded to a specific AWS S3 bucket. Encryption uses a one-off encryption key that is created locally at build time.

```bash
export ISSUE=

# AWS key and secret for the S3 bucket, where the diagnostic snapshots will be uploaded to
export COLLECTOR_S3_BUCKET=yourBucket
export COLLECTOR_S3_AWS_KEY=yourKey
export COLLECTOR_S3_AWS_SECRET=yourSecret

make
```

To use this feature you will also need the aws-cli and openssl installed on your local machine.
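Before building with S3 upload enabled, it can be worth confirming the extra prerequisites are actually on your `PATH`. The check below is a minimal sketch, not part of the collector itself:

```bash
# Verify the build prerequisites for the S3 upload feature.
# Prints each tool's path, or reports it as missing.
for tool in make docker aws openssl; do
  command -v "$tool" || echo "missing: $tool" >&2
done
```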
Building with these variables set generates a .tar.gz tarball as described above, additionally with the AWS credentials set in the packaged configuration file and the bucket name set within the ds-collector script.

In addition to the _.tar.gz_ tarball, an encryption key is now generated. The encryption key must be placed in the same directory as the extracted collector tarball for it to execute. If the tarball is being sent to someone else, it is recommended to send the encryption key via a different (and preferably secured) medium.

## Storing Encryption keys within the AWS Secrets Manager

The collector build process also supports storing and retrieving keys from the AWS Secrets Manager. To use this feature, two additional environment variables must be provided before the script is run.

```bash
export ISSUE=

# AWS key and secret for the S3 bucket, where the diagnostic snapshots will be uploaded to
export COLLECTOR_S3_BUCKET=yourBucket
export COLLECTOR_S3_AWS_KEY=yourKey
export COLLECTOR_S3_AWS_SECRET=yourSecret

# AWS key and secret for Secrets Manager, where the one-off build-specific encryption key will be stored
export COLLECTOR_SECRETSMANAGER_KEY=anotherKey
export COLLECTOR_SECRETSMANAGER_SECRET=anotherSecret

make
```

When the collector is built, it will also upload the generated encryption key to the Secrets Manager, as defined by the `COLLECTOR_SECRETSMANAGER_*` variables.

Please be careful with the encryption keys. They should only be stored in a secure vault (such as the AWS Secrets Manager), and kept on the jumpbox or bastion only while the collector script is being executed. The encryption key ensures the diagnostic snapshots are secured when transferred over the network and stored in the AWS S3 bucket.
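If you later need to fetch a stored key back out of Secrets Manager onto a bastion, the standard aws-cli call looks like the sketch below. The secret id `ds-collector/$ISSUE` and the output file name are hypothetical placeholders, not the collector's actual naming scheme; use whatever identifier the build reported when it stored the key.

```bash
# Hypothetical sketch: retrieve a stored encryption key with the aws-cli.
# The secret id "ds-collector/$ISSUE" is an assumed placeholder.
export AWS_ACCESS_KEY_ID=anotherKey
export AWS_SECRET_ACCESS_KEY=anotherSecret

aws secretsmanager get-secret-value \
  --secret-id "ds-collector/$ISSUE" \
  --query SecretString \
  --output text > collector.key
```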