# build-tooling
**Repository Path**: mirrors_databricks/build-tooling
## Basic Information
- **Project Name**: build-tooling
- **Description**: Databricks Education department's curriculum build tool chain
- **Primary Language**: Unknown
- **License**: Apache-2.0
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 0
- **Forks**: 0
- **Created**: 2020-09-24
- **Last Updated**: 2025-12-06
## README
# build-tools
This directory contains source for the various build tools used during
curriculum development within the Education department at Databricks.
* `bdc`: *B*uild *D*atabricks *C*ourse: This is the main build tool. See
the `bdc` [README](bdc/README.md) for full details.
* `gendbc`: Create Databricks DBC files from the command line. See
the `gendbc` [README](gendbc/README.md) for full details.
* `master_parse`: The master notebook parse tool and module. See the
`master_parse` [README](master_parse/README.md) for full details.
* `course`: An optional curriculum workflow management tool that sits on top
of `bdc`. There's no README for `course`. Just install it (or symlink to
it), and run `course help`.
Unless you're actually developing the build tools, you'll probably never run
`master_parse` or `gendbc` manually; `bdc` will do that for you.
## Prerequisites
* Ensure that you have a Python 2 environment (preferably, an activated virtual
environment).
* Ensure that you have a Java 7 or Java 8 JDK and that `java` is in your path.
Java 9 is _not supported._
* `gendbc` will be installed in `$HOME/local/bin`. Make sure `$HOME/local/bin`
is in your path, or your builds will fail.
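The prerequisites above can be checked before you install anything. The following is a minimal sketch, not part of the build tools: the function names (`java_major`, `path_contains`, `installed_java_version`, `check_prerequisites`) are hypothetical, and it assumes only that `java -version` prints a quoted version string such as `"1.8.0_181"`.

```python
import os
import re
import subprocess
import sys


def java_major(version_output):
    """Extract the major Java version from `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first, second = match.group(1), match.group(2)
    # Java 8 and earlier report "1.8.0_181"; Java 9 and later report "9.0.1".
    if first == "1" and second is not None:
        return int(second)
    return int(first)


def path_contains(directory, path_env):
    """Return True if `directory` is one of the entries of a PATH-style string."""
    wanted = os.path.normpath(directory)
    return any(os.path.normpath(entry) == wanted
               for entry in path_env.split(os.pathsep) if entry)


def installed_java_version():
    """Run `java -version` and return its output, or None if `java` is missing."""
    try:
        proc = subprocess.Popen(["java", "-version"],
                                stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    except OSError:
        return None
    out, err = proc.communicate()
    # `java -version` writes its report to stderr, not stdout.
    return (err or out).decode("utf-8", "replace")


def check_prerequisites(python_major, java_output, path_env, home):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    if python_major != 2:
        problems.append("Python 2 is required, found Python %d" % python_major)
    major = java_major(java_output) if java_output else None
    if major not in (7, 8):
        problems.append("a Java 7 or Java 8 JDK is required, found %s" % major)
    local_bin = os.path.join(home, "local", "bin")
    if not path_contains(local_bin, path_env):
        problems.append("%s is not in PATH" % local_bin)
    return problems


if __name__ == "__main__":
    issues = check_prerequisites(sys.version_info[0], installed_java_version(),
                                 os.environ.get("PATH", ""),
                                 os.path.expanduser("~"))
    for issue in issues:
        print("WARNING: %s" % issue)
```

Run it with the same `python` you intend to build with. It prints one warning per unmet prerequisite and nothing if all three checks pass.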
## Quick Links
* [Installing or updating the build tools](#installing-the-build-tools)
* [Using a Docker-based Build Environment](#using-docker)
* [Creating the virtual python environment](#virtual-python-environment)
* [`bdc` Documentation](bdc/README.md), which includes documentation of the build
file format
* [`master_parse` Documentation](master_parse/README.md), which tells you all
the cool things the master parser supports within your notebooks.
## Installing the Build Tools
### Using Docker
One of the simplest ways to set your build environment up is to use Docker.
See the [README](docker/README.md) in the `docker` directory for details on
creating and updating a Docker-based build tool environment.
### Installing the Build Tools Manually
#### Virtual Python Environment
_bdc_ is currently limited to Python 2.
While it is possible to build the courseware by installing the necessary
software in the system-installed (or Homebrew-installed) Python, **it is not
recommended**. It's much better to run the build from a dedicated Python
virtual environment. This document describes how to do that. If you want to
use the system version of Python, you're on your own (because it's
riskier).
#### Install `pip`
You'll have to install `pip` if it isn't already installed. First,
download `get-pip.py`.
Once you have `get-pip.py`, install `pip`.
* If you're on Linux, run this command: `sudo /usr/bin/python get-pip.py`
* If you're on a Mac and _not_ using Homebrew: `sudo /usr/bin/python get-pip.py`
* If you're on a Mac and using a Homebrew-installed Python: `/usr/local/bin/python get-pip.py`
* If you're on Windows and you used the standard installer: `C:\Python27\python get-pip.py`
#### Install `virtualenv`
* Linux: `sudo pip install virtualenv`
* Mac and not using Homebrew: `sudo pip install virtualenv`
* Mac with Homebrew-installed Python: `/usr/local/bin/pip install virtualenv`
* Windows: `C:\Python27\Scripts\pip install virtualenv`
##### Create a virtual environment
Create a virtual Python environment for the build. You can call it anything
you want, and you can create it anywhere you want. Let's assume you'll
call it `dbbuild` and put it in your home directory. Here's how to create
the virtual environment.
From a command window, assuming you're in your home directory:
* Linux or Mac: `virtualenv dbbuild`
* Windows: `C:\Python27\Scripts\virtualenv dbbuild`
##### Activate the virtual environment
Once you have the virtual Python environment installed, you'll need to
activate it. **You have to activate the environment any time you create a
new command window.**
(For complete details on using `virtualenv`, see the `virtualenv` documentation.)
* Linux or Mac: `. $HOME/dbbuild/bin/activate`
* Windows: `dbbuild\Scripts\activate.bat`
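Because activation is easy to forget in a new command window, it can help to confirm that the environment is active before installing or building. The following is a minimal sketch; the function names are hypothetical. It relies on two things: the `activate` script sets the `VIRTUAL_ENV` environment variable, and a virtual environment's interpreter either defines `sys.real_prefix` (older `virtualenv` releases) or has a `sys.base_prefix` that differs from `sys.prefix`.

```python
import os
import sys


def active_env_name(environ):
    """Return the name of the environment set by `activate`, or None."""
    env_dir = environ.get("VIRTUAL_ENV")
    if not env_dir:
        return None
    return os.path.basename(os.path.normpath(env_dir))


def interpreter_is_isolated():
    """Return True if this interpreter itself runs from a virtual environment."""
    # Older virtualenv releases set sys.real_prefix; newer ones make
    # sys.base_prefix differ from sys.prefix.
    if hasattr(sys, "real_prefix"):
        return True
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


if __name__ == "__main__":
    name = active_env_name(os.environ)
    if name is None or not interpreter_is_isolated():
        print("No virtual environment is active; run the activate script first.")
    else:
        print("Active virtual environment: %s" % name)
```

With the `dbbuild` environment from the example above activated, the script should report `dbbuild` as the active environment.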
#### Installing the Tools
##### The `course` tool
If you're using `course`, which helps you automate your workflow, start by
installing that tool. The easiest solution:
* Choose a directory that is already in your path (e.g., `$HOME/bin`,
`/usr/local/bin`)
* `cd` to that directory
* Create a symbolic link to `course` in that directory:
```
$ ln -s /path/to/repos/build-tooling/course .
```
##### Installing the rest of the build tools with `course`
If you're using `course`, you can just type:
```
course install-tools
```
to install and update the build tools. It will also install `databricks-cli`
for you.
**NOTE**: `course install-tools` does _not_ work for Docker-based
installations. See [Using Docker](#using-docker) if you're using a Docker-based
setup.
##### Installing the build tools manually
If you have never installed the tools in your virtual Python environment, run
this command:
```
pip install git+https://github.com/databricks-edu/build-tooling
```
If you have installed the tools before, run:
```
pip install --upgrade git+https://github.com/databricks-edu/build-tooling
```
This installation script will install:
* `bdc`
* `master_parse`
* `gendbc`
* `databricks-cli`
It'll take a few minutes, but it will download and install all four pieces.
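After the installation finishes, you may want to confirm that the tools are reachable from your shell. The following is a minimal sketch, not part of the build tools: `find_on_path` and `missing_tools` are hypothetical helpers, and the command name `databricks` is an assumption based on the command that `databricks-cli` normally installs. It does not handle Windows executable suffixes such as `.exe`.

```python
import os
import sys

# Command names assumed to be installed by the steps above.
TOOLS = ["bdc", "gendbc", "master_parse", "databricks"]


def find_on_path(name, path_env):
    """Return the full path of executable `name` in a PATH-style string, or None."""
    for directory in path_env.split(os.pathsep):
        if not directory:
            continue
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


def missing_tools(names, path_env):
    """Return the subset of `names` that cannot be found in `path_env`."""
    return [name for name in names if find_on_path(name, path_env) is None]


if __name__ == "__main__":
    for tool in missing_tools(TOOLS, os.environ.get("PATH", "")):
        sys.stderr.write("not found on PATH: %s\n" % tool)
```

If only `gendbc` is reported as missing, check that `$HOME/local/bin` is in your path, since that is where `gendbc` is installed.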
## NOTICE
* This software is copyright © 2017-2018 Databricks, Inc., and is released under
the [Apache License, version 2.0](https://www.apache.org/licenses/). See
`LICENSE.txt` for details.
* Databricks cannot support this software for you. We use it internally,
and we have released it as open source, for use by those who are
interested in building similar kinds of Databricks notebook-based
curriculum. But this software does not constitute an official Databricks
product, and it is subject to change without notice.