Job Manager Service – Samples

Running a Model

Input Parameters

modelId: Defines the ID of the model, which must be accessible by the tenant using Model Management Service.

Supported Models
Currently, only Apache Zeppelin notebooks are supported.
configurationId: Defines the Environment Configuration created using the Predictive Learning services. The Job Manager instantiates the environment and runs the model on it.
inputFolderId: Defines the folder holding the input data, which must be accessible by the tenant using Data Exchange Service.
outputFolderId: Defines the output folder, which must already be created and be accessible by the tenant using Data Exchange Service.
maximumExecutionTimeInSeconds: Defines the maximum allowed execution time of the job. Only the actual execution time of the execution engine (Apache Zeppelin) is taken into account.

Starting the Job

Request:

POST /api/jobmanager/v3/jobs
X-XSRF-TOKEN: `<xsrf_token>`

The input parameters are defined in the body:

{
  "modelId": "<modelId>",
  "configurationId": "<configurationId>",
  "inputFolderId": "<inputFolderId>",
  "outputFolderId": "<outputFolderId>",
  "maximumExecutionTimeInSeconds":"7200"
}

Sample response:

{
  "id": "<job_execution_id>",
  "modelId": "<modelId>",
  "environmentId": "<environmentId>",
  "message": "",
  "status": "SUBMITTED",
  "creationDate": "2018-10-01T12:00:00.001Z",
  "createdBy": "<your_tenant_id>",
  "inputFolderId": "<inputFolderId>",
  "outputFolderId": "<outputFolderId>",
  "configurationId": "<configurationId>",
  "maximumExecutionTimeInSeconds":"7200"
}

The <job_execution_id> is required for monitoring the job using the Job Manager.
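
The request above can be sketched in Python as follows. This is a minimal illustration rather than an official client: the function names `build_start_job_request` and `job_id_from_response` are hypothetical, the gateway base URL is passed in by the caller, and any authentication beyond the `X-XSRF-TOKEN` header shown above is assumed to be handled elsewhere. The request is only built, not sent.

```python
import json
import urllib.request


def build_start_job_request(base_url, xsrf_token, model_id, configuration_id,
                            input_folder_id, output_folder_id,
                            max_seconds=7200):
    """Build (but do not send) the POST request that starts a job.

    base_url is the gateway URL of your environment (an assumption of this
    sketch; it is not defined in this section).
    """
    body = {
        "modelId": model_id,
        "configurationId": configuration_id,
        "inputFolderId": input_folder_id,
        "outputFolderId": output_folder_id,
        # The sample body above passes this value as a string.
        "maximumExecutionTimeInSeconds": str(max_seconds),
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/jobmanager/v3/jobs",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-XSRF-TOKEN": xsrf_token,
        },
        method="POST",
    )


def job_id_from_response(response_text):
    """Extract the job execution ID from the JSON response body."""
    return json.loads(response_text)["id"]
```

Passing the returned object to `urllib.request.urlopen` would send the request; the response body can then be handed to `job_id_from_response` to obtain the ID needed for monitoring.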

Monitoring Job Execution

The current status of a job is requested using:

GET /api/jobmanager/v3/jobs/<job_execution_id>

Sample response:

{
  "id": "<job_execution_id>",
  "modelId": "<modelId>",
  "environmentId": "<environmentId>",
  "message": "Started notebook execution.",
  "status": "RUNNING",
  "creationDate": "2018-10-01T02:00:00.001Z",
  "createdBy": "<your_tenant_id>",
  "inputFolderId": "<inputFolderId>",
  "outputFolderId": "<outputFolderId>",
  "configurationId": "<configurationId>",
  "maximumExecutionTimeInSeconds":"7200"
}

The following statuses are available:

SUBMITTED
STARTING
RUNNING
STOPPING
FAILED
SUCCEEDED
STOPPED

The status is FAILED if the job execution cannot be completed or an error occurred during the execution. This also applies if only a single paragraph within a Zeppelin notebook fails and all other steps succeed. The Job Manager attempts to continue a job execution if the execution workflow allows skipping individual steps.
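
Monitoring typically means polling the status endpoint until the job stops changing state. The sketch below shows one way to do this. It assumes that FAILED, SUCCEEDED and STOPPED are the final statuses (this section does not label them as such) and uses the status spelling from the sample responses. The function name `wait_for_job` and the caller-supplied `fetch_job` callable are hypothetical; `fetch_job` is expected to perform the GET request and return the parsed JSON.

```python
import time

# Statuses treated as final. This grouping is an assumption of the sketch;
# the service documentation only lists the statuses without classifying them.
TERMINAL_STATUSES = {"FAILED", "SUCCEEDED", "STOPPED"}


def wait_for_job(fetch_job, job_id, poll_seconds=30.0, timeout_seconds=7200.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_job(job_id) until the job reaches a terminal status.

    fetch_job is a caller-supplied function that returns the parsed JSON of
    GET /api/jobmanager/v3/jobs/<job_execution_id> as a dict.
    sleep and clock are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout_seconds
    while True:
        job = fetch_job(job_id)
        if job["status"] in TERMINAL_STATUSES:
            return job
        if clock() >= deadline:
            raise TimeoutError(
                f"job {job_id} still {job['status']} after timeout")
        sleep(poll_seconds)
```

The client-side timeout is independent of `maximumExecutionTimeInSeconds`, which only counts the execution time of the engine; a caller may therefore want a timeout somewhat larger than that value.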

Retrieving the List of Jobs

A list of all available jobs is retrieved using the following request:

GET /api/jobmanager/v3/jobs

By default, the Job Manager divides the result into pages of 100 entries and returns the first page. The query parameter <pageNumber> selects the page to be returned, and the query parameter <pageSize> changes the number of entries per page.

Response:

{
 "content": [
  {
     "jobId": "<job_execution_id>",
     "modelId": "<modelId>",
     "environmentId": "<environmentId>",
     "message": "message",
     "status": "SUCCEEDED",
     "creationDate": "2018-10-01T12:00:00.001Z",
     "createdBy": "TenantId",
     "inputFolderId": "<inputFolderId>",
     "outputFolderId": "<outputFolderId>",
     "configurationId": "<configurationId>",
     "maximumExecutionTimeInSeconds":"7200"
  },
  {
     "jobId": "<job_execution_id>",
     "modelId": "<modelId>",
     "environmentId": "<environmentId>",
     "message": "Unable to import model into Zeppelin[I/O error on GET request for \"https://gateway.{region}.{MindSphere-domain}/api/modelmanagement/v3/models/<model_id>/versions/last\": Server returned HTTP response code: 504 for URL: https://gateway.{region}.{MindSphere-domain}/api/modelmanagement/v3/models/<model_id>/versions/last; nested exception is java.io.IOException: Server returned HTTP response code: 504 for URL: https://gateway.{region}.{MindSphere-domain}/api/modelmanagement/v3/models/Model_id/versions/last] Error while stopping environment",
     "status": "STOPPING",
     "creationDate": "2018-10-01T12:00:00.001Z",
     "createdBy": "TenantId",
     "inputFolderId": "<inputFolderId>",
     "outputFolderId": "<outputFolderId>",
     "configurationId": "<configurationId>",
     "maximumExecutionTimeInSeconds":"7200"
  },
  {
     "jobId": "<job_execution_id>",
     "modelId": "<modelId>",
     "environmentId": "<environmentId>",
     "message": "Failed to start environment [An environment has already been started for the configuration]",
     "status": "FAILED",
     "creationDate": "2018-10-02T12:00:00.001Z",
     "createdBy": "TenantId",
     "inputFolderId": "<inputFolderId>",
     "outputFolderId": "<outputFolderId>",
     "configurationId": "<configurationId>",
     "maximumExecutionTimeInSeconds":"7200"
  }
  ],
 "totalPages": 1,
 "totalElements": 3,
 "last": true,
 "size": 20,
 "number": 0,
 "numberOfElements": 3,
 "first": true,
 "sort": null
}

The second and third entries illustrate that the chain of messages is returned to the user if a fatal error is encountered and the execution logic cannot recover. The Job Manager attempts to stop the used instances while preserving any outputs that might have been produced.
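
To collect all jobs across pages, a client can request successive pages until the response reports `"last": true`. The following sketch illustrates this under two assumptions: page numbers start at 0 (as suggested by `"number": 0` in the sample response), and the caller supplies a hypothetical `fetch_page` function that performs the GET request and returns the parsed JSON. The helper names `jobs_page_url` and `iter_all_jobs` are likewise illustrative.

```python
from urllib.parse import urlencode


def jobs_page_url(base_url, page_number=0, page_size=100):
    """Build the job list URL with the pageNumber and pageSize parameters."""
    query = urlencode({"pageNumber": page_number, "pageSize": page_size})
    return f"{base_url.rstrip('/')}/api/jobmanager/v3/jobs?{query}"


def iter_all_jobs(fetch_page, page_size=100):
    """Yield every job by walking the pages until "last" is true.

    fetch_page(page_number, page_size) is caller-supplied and returns the
    parsed JSON of one page as a dict. Zero-based page numbering is an
    assumption based on the sample response.
    """
    page_number = 0
    while True:
        page = fetch_page(page_number, page_size)
        yield from page["content"]
        # Stop if the service marks this page as the last one; a missing
        # "last" field is treated as the end to avoid looping forever.
        if page.get("last", True):
            break
        page_number += 1
```

A caller could, for example, filter the yielded entries by `status` to list only the FAILED jobs together with their `message` chains.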