Node.js: Minifying JSON documents

I often need to minimise the schema and table config files used to configure Apache Pinot so that they take up less space on the page. After doing this by hand for ages, I came across the json-stringify-pretty-compact library, which speeds up the process.

We can install it like this:

npm install json-stringify-pretty-compact

I then use the following script, which reads JSON from standard input and prints the compacted version:

minify.mjs
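import pretty from 'json-stringify-pretty-compact';

let inputData = '';

// Decode stdin as UTF-8 so multi-byte characters split across chunks stay intact.
process.stdin.setEncoding('utf8');

// Accumulate the input as it arrives (adding this listener starts the stream flowing).
process.stdin.on('data', (chunk) => {
    inputData += chunk;
});

// Once stdin closes, parse the JSON and print the compact version.
process.stdin.on('end', () => {
    const value = JSON.parse(inputData);
    console.log(pretty(value));
});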
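
For comparison, the built-in JSON.stringify only gives us two extremes: everything squashed onto a single line, or every key on its own line. The snippet below is a small illustration using a single field spec (the `fieldSpec` variable is just an example value, not part of the script above):

```javascript
// A single field spec, as it would look after JSON.parse.
const fieldSpec = { name: 'runId', dataType: 'STRING' };

// No indentation argument: one line, no whitespace at all.
const minified = JSON.stringify(fieldSpec);
console.log(minified);

// Indentation of 2: every key gets its own line.
const expanded = JSON.stringify(fieldSpec, null, 2);
console.log(expanded);
```

The library sits between these two: short objects stay on one line, but with readable spacing, and longer ones are still expanded.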

Imagine we then have the following file:

config/schema.json
{
    "schemaName": "parkrun",
    "primaryKeyColumns": ["competitorId"],
    "dimensionFieldSpecs": [
      {
        "name": "runId",
        "dataType": "STRING"
      },
      {
        "name": "eventId",
        "dataType": "STRING"
      },
      {
        "name": "competitorId",
        "dataType": "LONG"
      },
      {
        "name": "rawTime",
        "dataType": "INT"
      },

      {
        "name": "lat",
        "dataType": "DOUBLE"
      },
      {
        "name": "lon",
        "dataType": "DOUBLE"
      },
      {
        "name": "location",
        "dataType": "BYTES"
      },
      {
        "name": "course",
        "dataType": "STRING"
      }
    ],
    "metricFieldSpecs": [
      {
        "name": "distance",
        "dataType": "DOUBLE"
      }
    ],
    "dateTimeFieldSpecs": [{
      "name": "timestamp",
      "dataType": "TIMESTAMP",
      "format" : "1:MILLISECONDS:EPOCH",
      "granularity": "1:MILLISECONDS"
    }]
  }

Each field spec is spread over four lines, which takes up a lot of unnecessary space, so let’s get our script to sort that out:

cat config/schema.json | node minify.mjs
Output
{
  "schemaName": "parkrun",
  "primaryKeyColumns": ["competitorId"],
  "dimensionFieldSpecs": [
    {"name": "runId", "dataType": "STRING"},
    {"name": "eventId", "dataType": "STRING"},
    {"name": "competitorId", "dataType": "LONG"},
    {"name": "rawTime", "dataType": "INT"},
    {"name": "lat", "dataType": "DOUBLE"},
    {"name": "lon", "dataType": "DOUBLE"},
    {"name": "location", "dataType": "BYTES"},
    {"name": "course", "dataType": "STRING"}
  ],
  "metricFieldSpecs": [{"name": "distance", "dataType": "DOUBLE"}],
  "dateTimeFieldSpecs": [
    {
      "name": "timestamp",
      "dataType": "TIMESTAMP",
      "format": "1:MILLISECONDS:EPOCH",
      "granularity": "1:MILLISECONDS"
    }
  ]
}
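
If you're wondering why the short field specs end up on one line while the dateTimeFieldSpecs entry stays expanded, the rough idea is: put a value on a single line if it fits within a maximum width, otherwise expand it and try again for each child. Below is a minimal sketch of that idea. It is not the library's actual code: `formatCompact` and `renderOneLine` are hypothetical names, the width check ignores the key prefix and trailing comma, and it assumes the value came from JSON.parse.

```javascript
// Simplified sketch of "pretty but compact" formatting. NOT the library's
// source; formatCompact and renderOneLine are hypothetical names.
// Assumes `value` came from JSON.parse (no undefined, functions, or cycles).

// Render any JSON value on a single line, with spaces after ':' and ','.
function renderOneLine(value) {
  if (value === null || typeof value !== 'object') return JSON.stringify(value);
  if (Array.isArray(value)) return `[${value.map(renderOneLine).join(', ')}]`;
  const pairs = Object.entries(value).map(
    ([key, item]) => `${JSON.stringify(key)}: ${renderOneLine(item)}`
  );
  return `{${pairs.join(', ')}}`;
}

function formatCompact(value, { indent = 2, maxLength = 80 } = {}, depth = 0) {
  const flat = renderOneLine(value);
  const isContainer = value !== null && typeof value === 'object';
  // Scalars, and containers that fit in the available width, stay on one line.
  if (!isContainer || depth * indent + flat.length <= maxLength) return flat;

  // Otherwise expand this level and apply the same rule to each child.
  const options = { indent, maxLength };
  const isArray = Array.isArray(value);
  const children = isArray
    ? value.map((item) => formatCompact(item, options, depth + 1))
    : Object.entries(value).map(
        ([key, item]) =>
          `${JSON.stringify(key)}: ${formatCompact(item, options, depth + 1)}`
      );
  const innerPad = ' '.repeat((depth + 1) * indent);
  const outerPad = ' '.repeat(depth * indent);
  const body = children.map((child) => innerPad + child).join(',\n');
  return `${isArray ? '[' : '{'}\n${body}\n${outerPad}${isArray ? ']' : '}'}`;
}

// A cut-down version of the schema from this post.
const schema = {
  metricFieldSpecs: [{ name: 'distance', dataType: 'DOUBLE' }],
  dateTimeFieldSpecs: [{
    name: 'timestamp',
    dataType: 'TIMESTAMP',
    format: '1:MILLISECONDS:EPOCH',
    granularity: '1:MILLISECONDS',
  }],
};
console.log(formatCompact(schema));
```

With a width of 80, the metric field spec fits on one line but the timestamp spec does not, so it gets expanded. As far as I know, the library exposes similar `indent` and `maxLength` options if the defaults don't suit your files.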