· pinot

Apache Pinot: Inserts from SQL - Unable to get tasks states map - No task is generated for table

I recently wrote a post on the StarTre blog describing the inserts from SQL feature that was added in Apache Pinot 0.11, and while writing it I came across some interesting exceptions due to configuration mistakes I’d made. In this post we’re going to describe one of those exceptions.

To recap, I was trying to ingest a bunch of JSON files from an S3 bucket using the following SQL query:

INSERT INTO "events"
FROM FILE 's3://marks-st-cloud-bucket/events/*.json'
OPTION(
  taskName=myTask-s3,
  input.fs.className=org.apache.pinot.plugin.filesystem.S3PinotFS,
  input.fs.prop.accessKey=AKIARCOCT6DWLUB7F77Z,
  input.fs.prop.secretKey=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,
  input.fs.prop.region=eu-west-2
);
Note

Don’t worry, those credentials were deactivated and deleted several days ago.

When I ran this query against a Pinot cluster that contained a controller, broker, and server, I got the following exception:

[
  {
    "message": "QueryExecutionError:\norg.apache.commons.httpclient.HttpException: Unable to get tasks states map. Error code 400, Error message: {\"code\":400,\"error\":\"No task is generated for table: events, with task type: SegmentGenerationAndPushTask\"}\n\tat org.apache.pinot.common.minion.MinionClient.executeTask(MinionClient.java:123)\n\tat org.apache.pinot.core.query.executor.sql.SqlQueryExecutor.executeDMLStatement(SqlQueryExecutor.java:102)\n\tat org.apache.pinot.controller.api.resources.PinotQueryResource.executeSqlQuery(PinotQueryResource.java:145)\n\tat org.apache.pinot.controller.api.resources.PinotQueryResource.handlePostSql(PinotQueryResource.java:103)",
    "errorCode": 200
  }
]

My mistake here was that I didn’t have a minion in the cluster. The ingestion job is run by the minion component, so without one of those this feature doesn’t work!

An update (30th June 2023)

Today I learned that you can get this error even if you do have a minion configured. The scenario that results in this error is if no files are found for ingestion.

This might happen if you have an invalid glob expression in the includeFileNamePattern: For example, the following throws the exception:

SET taskName = 'events-task7';
SET input.fs.className = 'org.apache.pinot.spi.filesystem.LocalPinotFS';
SET includeFileNamePattern='glob:customers.csv';
INSERT INTO customers
FROM FILE 'file:///input/';

We can fix the query by adding **/ at the beginning:

SET taskName = 'events-task7';
SET input.fs.className = 'org.apache.pinot.spi.filesystem.LocalPinotFS';
SET includeFileNamePattern='glob:**/customers.csv';
INSERT INTO customers
FROM FILE 'file:///input/';
  • LinkedIn
  • Tumblr
  • Reddit
  • Google+
  • Pinterest
  • Pocket