Customizing the Data Portal
Arranger simplifies GraphQL queries over Elasticsearch indices with its front-end library of reusable search components. The primary configurable components for this guide are the left-hand search facets and the central data table seen below.
All configurations for these components are made through four configuration files: base.json, extended.json, table.json and facets.json. We will cover each in the following sections.
Viewing Elasticsearch Documents
An index in Elasticsearch is a collection of related documents with similar characteristics.
Elasticvue offers a convenient and user-friendly interface for managing and exploring your Elasticsearch data. With elasticvue, you can:
- Easily visualize and search through indexed documents.
- Quickly access and interact with JSON documents.
- Simplify the management and troubleshooting of Elasticsearch indices.
To install elasticvue, follow these steps:
- Search for elasticvue in your browser's extension catalogue (e.g. Chrome Web Store, Firefox Add-ons, Microsoft Edge Add-ons).
- Click on "Add to Chrome" (or "Add to Firefox") to install the extension.
- Open elasticvue and enter your Elasticsearch URL. For the Overture Quickstart, this will be http://localhost:9200.
- Select basic authentication and enter the default username elastic and password myelasticpassword.
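If you would like to confirm the connection details from a script before opening elasticvue, the following is a minimal Python sketch that builds the same basic-authentication request. It assumes the Quickstart defaults listed above. `_cat/indices` is a standard Elasticsearch endpoint; the helper names are illustrative and not part of Overture.

```python
import base64
import json
import urllib.request

ES_URL = "http://localhost:9200"   # Quickstart default from the steps above
USERNAME = "elastic"               # default username
PASSWORD = "myelasticpassword"     # default password


def basic_auth_header(username: str, password: str) -> str:
    """Return the value of an HTTP Basic Authorization header."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def build_indices_request(base_url: str, username: str, password: str) -> urllib.request.Request:
    """Build (but do not send) a request that lists indices as JSON."""
    request = urllib.request.Request(f"{base_url}/_cat/indices?format=json")
    request.add_header("Authorization", basic_auth_header(username, password))
    return request


def list_index_names(request: urllib.request.Request) -> list:
    """Send the request and return index names; needs a running Elasticsearch."""
    with urllib.request.urlopen(request, timeout=5) as response:
        return [row["index"] for row in json.load(response)]
```

With the Quickstart running, `list_index_names(build_indices_request(ES_URL, USERNAME, PASSWORD))` should return the index names that elasticvue also displays.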
Using ElasticVue
From the elasticvue dashboard's top navigation, select search.
This page displays all indexed Elasticsearch documents created by Maestro from published Song analyses and used by Arranger. Clicking any of the _index rows will give you a direct view of the JSON documents that populate the index.
Being able to view the JSON documents within your Elasticsearch instance is useful when writing your Arranger configuration files, because the field names in those files must match the structure of these documents.
Arranger Configurations
Base Configuration
The base.json file contains only two fields, documentType and index:
{
  "documentType": "file",
  "index": "overture-quickstart-index"
}
- The index field specifies the name of the Elasticsearch index, in this example overture-quickstart-index.
- documentType informs Arranger of the mapping type being used by Maestro, either analysis centric or file centric.
Learn More: For more information on index mappings and index centricity, see our administration guide covering index mappings.
Extended Configuration
The extended.json configuration file defines all the fields and display names you wish to populate your front-end portal with. Below, we have provided a simplified list taken from our QuickStart extended.json configuration:
{
  "extended": [
    {
      "displayName": "Object Id",
      "fieldName": "object_id"
    },
    {
      "displayName": "Analysis Id",
      "fieldName": "analysis.analysis_id"
    },
    {
      "displayName": "Treatment Duration (Days)",
      "fieldName": "analysis.donor.primaryDiagnosis.treatment.treatmentDuration"
    }
  ]
}
- The displayName field sets how each field is labelled in the front-end UI when it is used within the search facets or the table.
- The fieldName values are written as represented within your Elasticsearch documents:
  - Object ID is found at the root of the Elasticsearch documents, so its fieldName is simply object_id.
  - Analysis ID is a nested element found inside the analysis field. We denote nesting by adding a period (.), making the appropriate fieldName analysis.analysis_id.
  - The treatmentDuration field is nested more deeply than the other two fields outlined above. The same rule applies, however: by working backwards and adding a . for each nested element, we end up with analysis.donor.primaryDiagnosis.treatment.treatmentDuration.
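The dotted naming rule above can be sketched in Python as a small helper that walks a document and lists every candidate fieldName. This is an illustration only: the sample document is a simplified, hypothetical stand-in for a real Quickstart document, and `list_field_names` is not part of Arranger.

```python
def list_field_names(document: dict, prefix: str = "") -> list:
    """Return dotted field names for every leaf value in a nested document."""
    names = []
    for key, value in document.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Nested object: add a period and keep descending.
            names.extend(list_field_names(value, path))
        elif isinstance(value, list) and value and isinstance(value[0], dict):
            # Array of objects: inspect the first element only (a simplification).
            names.extend(list_field_names(value[0], path))
        else:
            names.append(path)
    return names


# Hypothetical, simplified document; the values are placeholders.
sample_document = {
    "object_id": "example-object",
    "analysis": {
        "analysis_id": "example-analysis",
        "donor": {
            "primaryDiagnosis": {"treatment": {"treatmentDuration": 30}}
        },
    },
}

for name in list_field_names(sample_document):
    print(name)
```

Running the helper against a document copied from elasticvue gives a quick list of names to paste into extended.json.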
Table Configuration
The table.json file configures the columns displayed in the data table. These configurations specify which fields are shown, their visibility, and their sortability.
{
  "table": {
    "columns": [
      {
        "canChangeShow": true,
        "fieldName": "object_id",
        "show": false,
        "sortable": true
      },
      {
        "canChangeShow": true,
        "fieldName": "analysis.analysis_id",
        "show": false,
        "sortable": true
      },
      {
        "canChangeShow": true,
        "fieldName": "analysis.collaborator.name",
        "jsonPath": "$.analysis.collaborator.hits.edges[*].node.name",
        "query": "analysis { collaborator { hits { edges { node { name } } } } }",
        "show": true,
        "sortable": true
      }
    ]
  }
}
Basic fields
- canChangeShow is a boolean indicating whether the user can toggle the visibility of the column. Set it to true if you want users to be able to show or hide this column using the columns dropdown menu. Set it to false if the visibility of this column should remain fixed.
- fieldName is the same field name described above. These values are written as represented within your Elasticsearch documents.
- show is a boolean indicating whether the column is visible by default. Set it to true if you want the column to be visible when the table is first loaded. Set it to false if you want the column to be hidden by default.
- sortable is a boolean indicating whether the column can be sorted. Set it to true if you want users to be able to sort the table by this column. Set it to false if sorting should not be allowed for this column.
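As a sanity check before restarting the portal, the basic fields can be linted with a few lines of Python. This is a sketch under assumptions: treating all four fields as required, and flagging a column that is hidden and cannot be toggled, are choices made for this example rather than documented Arranger rules.

```python
BOOLEAN_FIELDS = ("canChangeShow", "show", "sortable")


def check_column(column: dict) -> list:
    """Return a list of human-readable problems found in one table column."""
    problems = []
    field_name = column.get("fieldName")
    if not isinstance(field_name, str) or not field_name:
        problems.append("fieldName must be a non-empty string")
    for key in BOOLEAN_FIELDS:
        if not isinstance(column.get(key), bool):
            problems.append(f"{key} must be true or false")
    # A column that is hidden by default and cannot be toggled never appears.
    if column.get("show") is False and column.get("canChangeShow") is False:
        problems.append("column is hidden and users cannot show it")
    return problems


column = {
    "canChangeShow": True,
    "fieldName": "object_id",
    "show": False,
    "sortable": True,
}
print(check_column(column))
```

To check a whole file, load table.json with `json.load` and call `check_column` on each entry of `table.columns`.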
jsonPath
The jsonPath field specifies the JSON path used to extract nested data from Elasticsearch documents. This field defines the path to data nested within arrays.
For example, suppose we have an Elasticsearch document structured like this:
{
  "analysis": {
    "collaborator": [
      {
        "contactEmail": "susannorton@micr.ca",
        "name": "MICR"
      }
    ]
  }
}
If we want to extract the name field from the collaborator array within the analysis object, our jsonPath for this field would be:
$.analysis.collaborator.hits.edges[*].node.name
- $. designates the root of our Elasticsearch documents
- analysis.collaborator is the key for our desired nested object within the root
- hits.edges[*].node specifies that we're accessing an array ([*] translates to "all elements" in the array)
- name specifies the desired field we want to extract from our Elasticsearch documents
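To make the path segments concrete, here is a minimal Python sketch that evaluates this kind of path. It supports only `$`, `.key` and `[*]`, so it is not a full JSONPath implementation and not Arranger's own resolver. It also assumes the path is applied to data in which the array is wrapped in hits, edges and node, as the path itself and the query field suggest; the `extract` helper and the sample row are hypothetical.

```python
import re


def extract(path: str, data) -> list:
    """Evaluate a path made of '$', '.key' and '[*]' segments against data."""
    if not path.startswith("$"):
        raise ValueError("path must start with $")
    current = [data]
    # Each match is either a key name or the [*] wildcard.
    for key, wildcard in re.findall(r"\.([A-Za-z_][A-Za-z0-9_]*)|(\[\*\])", path[1:]):
        if wildcard:
            # [*] expands every list into its individual elements.
            current = [item for value in current if isinstance(value, list) for item in value]
        else:
            current = [value[key] for value in current if isinstance(value, dict) and key in value]
    return current


# Hypothetical row with the collaborator array wrapped in hits/edges/node.
row = {
    "analysis": {
        "collaborator": {
            "hits": {"edges": [{"node": {"name": "MICR"}}]}
        }
    }
}

print(extract("$.analysis.collaborator.hits.edges[*].node.name", row))
```

Because `[*]` expands every element of the array, a row with several collaborators yields one name per collaborator.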
query
The query field defines the GraphQL query needed to retrieve the nested data. This follows a similar structure to our JSON path but is written in GraphQL syntax:
{
  analysis {
    collaborator {
      hits {
        edges {
          node {
            name
          }
        }
      }
    }
  }
}
When flattened, this matches the configuration shown in our example above:
"analysis { collaborator { hits { edges { node { name } } } } }",
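Since the jsonPath and query values follow the same pattern, both can be derived from a dotted fieldName once you know which segments are arrays. The Python sketch below shows one way to do that. The helper names and the `array_fields` parameter are hypothetical, and the rule that every array is wrapped in hits, edges and node is inferred from the example above rather than taken from Arranger's documentation.

```python
def nested_query(field_name: str, array_fields: set) -> str:
    """Build the flattened GraphQL selection for a dotted field name."""
    parts = field_name.split(".")
    tokens, seen, opened = [], [], 0
    for part in parts[:-1]:
        seen.append(part)
        tokens.append(part + " {")
        opened += 1
        if ".".join(seen) in array_fields:
            # Arrays are wrapped in hits > edges > node.
            tokens.append("hits { edges { node {")
            opened += 3
    tokens.append(parts[-1])
    tokens.extend("}" for _ in range(opened))
    return " ".join(tokens)


def nested_json_path(field_name: str, array_fields: set) -> str:
    """Build the matching jsonPath for the same dotted field name."""
    path, seen = "$", []
    for part in field_name.split("."):
        seen.append(part)
        path += "." + part
        if ".".join(seen) in array_fields:
            path += ".hits.edges[*].node"
    return path


arrays = {"analysis.collaborator"}
print(nested_query("analysis.collaborator.name", arrays))
print(nested_json_path("analysis.collaborator.name", arrays))
```

For the collaborator example, the two helpers reproduce the query and jsonPath strings used in the table configuration.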
If you want to gain hands-on experience making these queries and exploring GraphQL, we recommend accessing the Arranger GraphQL server using our Quickstart from http://localhost:5050/graphql. For those preferring to use the most up-to-date GraphQL Playground UI, you can access it from http://localhost:5050/graphql/hellogql (appending any string to the URL will take you there).
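The same endpoint can also be queried from a script. The Python sketch below wraps a column's query fragment in a full GraphQL request. Note the assumptions: the root field is taken to match the documentType from base.json (file), and the outer hits, edges and node wrapper with a `first` argument is an assumed query shape that you should confirm in the GraphQL UI before relying on it. The helper names are illustrative.

```python
import json
import urllib.request

GRAPHQL_URL = "http://localhost:5050/graphql"  # Quickstart URL from the text above


def build_query(document_type: str, selection: str, first: int = 5) -> str:
    """Wrap a field selection in an assumed top-level query shape."""
    return (
        f"query {{ {document_type} {{ hits(first: {first}) "
        f"{{ edges {{ node {{ {selection} }} }} }} }} }}"
    )


def build_request(query: str, url: str = GRAPHQL_URL) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request carrying the query."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run(request: urllib.request.Request) -> dict:
    """Send the request; requires the Quickstart services to be running."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


query = build_query("file", "analysis { collaborator { hits { edges { node { name } } } } }")
print(query)
```

With the Quickstart running, `run(build_request(query))` returns the parsed JSON response, which you can compare against the jsonPath configured for the column.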
Facet Configuration
The facets.json file defines how aggregations (also known as facets in Elasticsearch) are configured for data exploration and filtering.
{
  "facets": {
    "aggregations": [
      {
        "active": true,
        "fieldName": "file_type",
        "show": true
      },
      {
        "active": true,
        "fieldName": "analysis__collaborator__name",
        "show": true
      }
    ]
  }
}
- active indicates whether this aggregation is active or enabled (true)
- fieldName is the field used for aggregation. This means Elasticsearch will aggregate data based on the different values found in the defined field. For the file_type field, this translates into a facet with the options of filtering by three file types: VCF, BAM and CRAM
- show indicates whether to display this aggregation in the user interface (true)
Facets.json Syntax: One caveat of the facets.json file is the notation used for fieldNames. Here we use double underscores (__) rather than a period (.) for nested elements, for example analysis__collaborator__name instead of analysis.collaborator.name.
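Because the two notations differ only in the separator, a facet entry can be generated from the dotted fieldName used in the other files. The following Python sketch illustrates the conversion; the helper names are hypothetical and the defaults simply mirror the example configuration above.

```python
import json


def to_facet_field_name(field_name: str) -> str:
    """Convert dotted notation to the double-underscore notation of facets.json."""
    return field_name.replace(".", "__")


def facet_entry(field_name: str, active: bool = True, show: bool = True) -> dict:
    """Build one aggregation entry from a dotted field name."""
    return {
        "active": active,
        "fieldName": to_facet_field_name(field_name),
        "show": show,
    }


facets = {
    "facets": {
        "aggregations": [
            facet_entry("file_type"),
            facet_entry("analysis.collaborator.name"),
        ]
    }
}
print(json.dumps(facets, indent=2))
```

Generating facet entries this way keeps facets.json consistent with the names already defined in extended.json and table.json.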