Skip to main content

Docker Configuration

In the previous steps you generated three sets of configuration files: a PostgreSQL schema, an Elasticsearch mapping, and Arranger configs. This section covers what to update in docker-compose.yml to wire those files into the platform for your dataset, then how to apply the changes.

Step 1: Update for Your Dataset

When you swap in a different dataset, a few values must change within the docker-compose.yml. The table below identifies each location:

VariableServiceUpdate to
ES_INDEX_0_NAMEsetupyour index name (hockey_players-index)
ES_INDEX_0_TEMPLATE_NAMEsetupyour index name (hockey_players-index)
ES_INDEX_0_ALIAS_NAMEsetupyour alias name (hockey_players_centric)
NEXT_PUBLIC_ARRANGER_DATATABLE_1_INDEXstageyour alias name (hockey_players_centric)

The most common mistake is updating the alias in one place but not the others. If Arranger throws a GraphQL schema error on startup, check that ES_INDEX_0_ALIAS_NAME, esIndex in base.json, and NEXT_PUBLIC_ARRANGER_DATATABLE_1_INDEX all carry the same value.

info

The index pattern (e.g. hockey_players-*) determines which future indices automatically inherit the mapping template. The index name (e.g. hockey_player-index) is the concrete index created at startup. The alias (e.g. hockey_players_centric) is what Arranger and Stage actually query. Aliases become more valuable when managing or migrating multiple indices, the distinction will be less apparent in this workshop.

Step 2: Apply the Changes

Since you previously ran make demo, the environment contains demo data that needs to be cleared before loading your own. Run a full reset first:

make reset
Windows (PowerShell)
.\run.ps1 reset

This wipes all Elasticsearch and PostgreSQL data, stops all containers, and returns the environment to a clean state. Then bring the platform back up with:

make platform
Windows (PowerShell)
.\run.ps1 platform
info

For future configuration changes (once your own data is loaded), make restart is sufficient — it reloads configs without wiping data. Only use make reset when you need to start from scratch.

Troubleshooting

If services don't start correctly after changes:

# Check container logs
docker logs setup
docker logs postgres
docker logs arranger-datatable1
docker logs stage

# Verify PostgreSQL is healthy
docker exec postgres pg_isready -U admin

# Verify Elasticsearch is healthy
curl -u elastic:myelasticpassword http://localhost:9200/_cluster/health?pretty

# Full reset (caution: deletes all data)
make reset
Windows (PowerShell) — full reset
.\run.ps1 reset

Checkpoint

Before proceeding, confirm:

  1. You updated docker-compose.yml with your dataset's index name, alias, and field values
  2. You ran make reset followed by make platform to clear demo data and start fresh
  3. The portal is accessible at http://localhost:3000

Stuck? Run docker ps to check which containers are running. If a container exited, run docker logs <container-name> to see why.

Next: With the infrastructure configured, let's load data into the portal.

Reference: Service Configuration Details

The following sections explain how each service in docker-compose.yml is wired up. This is supplemental — you don't need to modify these during the workshop, but it's useful context if you're adapting the platform for your own deployment.

Setup

The setup service runs initialization scripts that create PostgreSQL tables and Elasticsearch indices from your generated config files:

# PostgreSQL Configuration
POSTGRES_CONFIGS_DIR: setup/configs/postgresConfigs
POSTGRES_HOST: postgres
POSTGRES_PORT: 5432
POSTGRES_DB: ${POSTGRES_DB:-overtureDb} # update if changing the database name
POSTGRES_USER: ${POSTGRES_USER:-admin} # update if changing credentials
POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:-admin123}

# Elasticsearch Index Configuration
ES_INDEX_COUNT: 1

ES_INDEX_0_NAME: hockey_player-index # must match the index name in your mapping file
ES_INDEX_0_TEMPLATE_FILE: setup/configs/elasticsearchConfigs/datatable1-mapping.json # path to your mapping
ES_INDEX_0_TEMPLATE_NAME: hockey_players-index
ES_INDEX_0_ALIAS_NAME: hockey_players_centric # must match the alias in your mapping and base.json
VariableWhat it controls
POSTGRES_DB/USER/PASSWORDDatabase credentials, must be consistent across all services that connect to PostgreSQL
ES_INDEX_0_NAMEThe Elasticsearch index name, corresponds to the index name entered in the Config Generator
ES_INDEX_0_TEMPLATE_FILEPath to your generated mapping JSON, tells setup where to find the index template
ES_INDEX_0_ALIAS_NAMEThe alias Arranger queries, must match aliases in your mapping and esIndex in base.json
info

The setup container is backed by shell scripts under setup/scripts/ that handle sequencing, health checks, and initialization signalling. You do not need to modify these during the workshop, they run automatically with make platform or make demo.

tip

Multiple datasets are supported, each gets its own index block following the same ES_INDEX_1_* pattern. This is beyond the scope of this workshop, but reach out via contact@overture.bio if you'd like guidance on multi-dataset setups afterwards.

PostgreSQL

PostgreSQL stores your data persistently. The setup service auto-discovers and executes all .sql files from setup/configs/postgresConfigs/ to create tables:

postgres:
image: postgres:15-alpine
restart: unless-stopped
environment:
POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:-admin123} # update if changing credentials
POSTGRES_USER: ${POSTGRES_USER:-admin} # update if changing credentials
POSTGRES_DB: ${POSTGRES_DB:-overtureDb} # update if changing the database name
volumes:
- postgres-data:/var/lib/postgresql/data
VariableWhat it controls
POSTGRES_PASSWORDDatabase password, must match the value set in the setup service
POSTGRES_USERDatabase user, must match the value set in the setup service
POSTGRES_DBDatabase name, must match the value set in the setup service and your SQL schema

The postgres-data volume ensures your data persists across container restarts. Credentials use ${VARIABLE:-default} syntax, Docker Compose reads actual values from the .env file if it exists, falling back to the workshop defaults. For production, create a .env file with strong passwords (see .env.example).

tip

Adding a second dataset only requires a new SQL file in setup/configs/postgresConfigs/, the setup service discovers and executes all .sql files automatically, no changes to docker-compose.yml are needed. This is beyond the scope of this workshop, but reach out via contact@overture.bio if you'd like guidance afterwards.

Arranger

Each data table requires its own Arranger service instance. The volume mount connects the Arranger configuration files you generated to the running container, and the environment variables tell it how to reach Elasticsearch:

arranger-datatable1:
image: ghcr.io/overture-stack/arranger-server:4919f736
container_name: arranger-datatable1
restart: unless-stopped
volumes:
- ./setup/configs/arrangerConfigs/datatable1:/app/apps/search-server/configs
environment:
ES_HOST: http://elasticsearch:9200
ES_USER: ${ES_USER:-elastic} # update if changing Elasticsearch credentials
ES_PASS: ${ES_PASSWORD:-myelasticpassword} # update if changing Elasticsearch credentials
ES_ARRANGER_SET_INDEX: datatable1_arranger_set # must be unique per Arranger instance
PORT: 5050
SettingWhat it controls
Volume mount pathWhich Arranger config directory is loaded, the datatable1 segment matches your table name
ES_USER / ES_PASSCredentials used to connect to Elasticsearch
ES_ARRANGER_SET_INDEXInternal Arranger bookmarks index, must be unique per Arranger instance
tip

Adding a second Arranger instance requires a new service block with a unique port, container name, config directory, and ES_ARRANGER_SET_INDEX. This is beyond the scope of this workshop, but reach out via contact@overture.bio if you'd like guidance afterwards.

Stage

Stage connects to Arranger via environment variables. These settings control the portal's identity and tell Stage which Arranger instance to query for each data table:

stage:
restart: unless-stopped
environment:
NEXT_PUBLIC_LAB_NAME: ${NEXT_PUBLIC_LAB_NAME:-My Data Portal} # portal display name
NEXT_PUBLIC_ADMIN_EMAIL: ${NEXT_PUBLIC_ADMIN_EMAIL:-admin@example.org} # contact email shown in the portal

NEXT_PUBLIC_ARRANGER_DATATABLE_1_API: http://arranger-datatable1:5050 # must match Arranger service name and port
NEXT_PUBLIC_ARRANGER_DATATABLE_1_DOCUMENT_TYPE: records
NEXT_PUBLIC_ARRANGER_DATATABLE_1_INDEX: hockey_players_centric # must match ES_INDEX_0_ALIAS_NAME in setup
NEXT_PUBLIC_DATATABLE_1_EXPORT_ROW_ID_FIELD: submission_metadata.submission_id

NEXTAUTH_SECRET: ${NEXTAUTH_SECRET:-your-secure-secret-here}
VariableWhat it controls
NEXT_PUBLIC_LAB_NAMEThe display name shown in the portal header
NEXT_PUBLIC_ADMIN_EMAILContact email shown in the portal footer
NEXT_PUBLIC_ARRANGER_DATATABLE_1_APIThe URL Stage uses to reach Arranger, must match the service name and port
NEXT_PUBLIC_ARRANGER_DATATABLE_1_INDEXThe Elasticsearch alias Stage queries, must match ES_INDEX_0_ALIAS_NAME in setup
NEXT_PUBLIC_DATATABLE_1_EXPORT_ROW_ID_FIELDThe field used as the unique row identifier for TSV export
tip

Stage natively supports up to 5 data table connections following the DATATABLE_1, DATATABLE_2 naming pattern. Adding a second connection is beyond the scope of this workshop, but reach out via contact@overture.bio if you'd like guidance afterwards.