Job Manager Service – Samples
Running a Model
Input Parameters
modelId
: Defines the ID of the model, which must be accessible by the tenant using Model Management Service.
Supported Models
Currently, only Apache Zeppelin notebooks are supported.
configurationId
: Defines the Environment Configuration created using the Predictive Learning services. The Job Manager instantiates the environment and runs the model on it.
inputFolderId
: Defines the folder holding the input data, which must be accessible by the tenant using Data Exchange Service.
outputFolderId
: Defines the output folder, which must already be created and be accessible by the tenant using Data Exchange Service.
maximumExecutionTimeInSeconds
: Defines the maximum allowed execution time of the job. Only the actual execution time of the execution engine (Apache Zeppelin) is taken into account.
Starting the Job
Request:
POST /api/jobmanager/v3/jobs
X-XSRF-TOKEN: `<xsrf_token>`
The input parameters are defined in the body:
{
"modelId": "<modelId>",
"configurationId": "<configurationId>",
"inputFolderId": "<inputFolderId>",
"outputFolderId": "<outputFolderId>",
"maximumExecutionTimeInSeconds":"7200"
}
Sample response:
{
"id": "<job_execution_id>",
"modelId": "<modelId>",
"environmentId": "<environmentId>",
"message": "",
"status": "SUBMITTED",
"creationDate": "2018-10-01T12:00:00.001Z",
"createdBy": "<your_tenant_id>",
"inputFolderId": "<inputFolderId>",
"outputFolderId": "<outputFolderId>",
"configurationId": "<configurationId>",
"maximumExecutionTimeInSeconds":"7200"
}
The <job_execution_id> is required for monitoring the job using the Job Manager.
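The request above can be assembled in Python as follows. This is a minimal sketch using only the standard library, not an official client: the gateway host in the usage note is a placeholder, the `Content-Type` header is an assumption (the sample only shows `X-XSRF-TOKEN`), and the helper names `build_start_job_request` and `extract_job_id` are hypothetical. Authentication beyond the XSRF token is not covered here.

```python
import json
import urllib.request


def build_start_job_request(base_url, xsrf_token, model_id, configuration_id,
                            input_folder_id, output_folder_id,
                            max_seconds=7200):
    """Build (but do not send) the POST request that starts a job."""
    body = {
        "modelId": model_id,
        "configurationId": configuration_id,
        "inputFolderId": input_folder_id,
        "outputFolderId": output_folder_id,
        # The samples in this document pass this value as a string.
        "maximumExecutionTimeInSeconds": str(max_seconds),
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/jobmanager/v3/jobs",
        data=json.dumps(body).encode("utf-8"),
        headers={
            # Assumption: the service expects a JSON content type.
            "Content-Type": "application/json",
            "X-XSRF-TOKEN": xsrf_token,
        },
        method="POST",
    )


def extract_job_id(response_text):
    """Return the job execution ID from the JSON response body."""
    return json.loads(response_text)["id"]
```

A request built this way could be sent with `urllib.request.urlopen(request)` and the response body passed to `extract_job_id`, for example with a base URL of the form `https://gateway.{region}.{MindSphere-domain}`. Building the request separately from sending it keeps the body and headers easy to inspect before any call is made.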
Monitoring Job Execution
The current status of a job is requested using:
GET /api/jobmanager/v3/jobs/<job_execution_id>
Sample response:
{
"id": "<job_execution_id>",
"modelId": "<modelId>",
"environmentId": "<environmentId>",
"message": "Started notebook execution.",
"status": "RUNNING",
"creationDate": "2018-10-01T02:00:00.001Z",
"createdBy": "<your_tenant_id>",
"inputFolderId": "<inputFolderId>",
"outputFolderId": "<outputFolderId>",
"configurationId": "<configurationId>",
"maximumExecutionTimeInSeconds":"7200"
}
The following statuses are available:
SUBMITTED
STARTING
RUNNING
STOPPING
FAILED
SUCCEEDED
STOPPED
The status is FAILED if the job execution cannot be completed or an error occurred during the execution. This also applies if only a single paragraph within a Zeppelin notebook fails and all other steps succeed. The Job Manager attempts to continue a job execution if the execution workflow allows skipping individual steps.
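Monitoring typically means repeating the status request until the job finishes. The sketch below shows one way to do that. It assumes that SUCCEEDED, FAILED and STOPPED are the final statuses, which this document does not state explicitly. `wait_for_job` and its `fetch_job` argument are hypothetical names; `fetch_job` stands for any callable that performs the GET request and returns the parsed JSON response.

```python
import time

# Assumption: these three statuses are final. The document lists the
# statuses but does not say which ones are terminal.
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "STOPPED"}


def wait_for_job(fetch_job, job_id, interval_seconds=30.0,
                 timeout_seconds=7200.0, sleep=time.sleep,
                 clock=time.monotonic):
    """Poll fetch_job(job_id) until the job reaches a terminal status.

    fetch_job is a hypothetical callable returning the parsed JSON
    response of GET /api/jobmanager/v3/jobs/<job_execution_id>.
    """
    deadline = clock() + timeout_seconds
    while True:
        job = fetch_job(job_id)
        if job["status"] in TERMINAL_STATUSES:
            return job
        if clock() >= deadline:
            raise TimeoutError(
                f"Job {job_id} still {job['status']} after {timeout_seconds}s"
            )
        sleep(interval_seconds)
```

The returned dictionary still carries the `message` field, so a caller can check `job["status"] == "FAILED"` and report the message. The `sleep` and `clock` parameters exist only so the loop can be exercised without real waiting.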
Retrieving the List of Jobs
A list of all available jobs is retrieved using the following request:
GET /api/jobmanager/v3/jobs
By default, the Job Manager divides the result into pages of 100 entries and returns the first page. The query parameter <pageNumber> selects the page to be returned, and the query parameter <pageSize> changes the number of entries per page.
Response:
{
"content": [
{
"jobId": "<job_execution_id>",
"modelId": "<modelId>",
"environmentId": "<environmentId>",
"message": "message",
"status": "SUCCEEDED",
"creationDate": "2018-10-01T12:00:00.001Z",
"createdBy": "TenantId",
"inputFolderId": "<inputFolderId>",
"outputFolderId": "<outputFolderId>",
"configurationId": "<configurationId>",
"maximumExecutionTimeInSeconds":"7200"
},
{
"jobId": "<job_execution_id>",
"modelId": "<modelId>",
"environmentId": "<environmentId>",
"message": "Unable to import model into Zeppelin[I/O error on GET request for \"https://gateway.{region}.{MindSphere-domain}/api/modelmanagement/v3/models/<model_id>/versions/last\": Server returned HTTP response code: 504 for URL: https://gateway.{region}.{MindSphere-domain}/api/modelmanagement/v3/models/<model_id>/versions/last; nested exception is java.io.IOException: Server returned HTTP response code: 504 for URL: https://gateway.{region}.{MindSphere-domain}/api/modelmanagement/v3/models/<model_id>/versions/last] Error while stopping environment",
"status": "STOPPING",
"creationDate": "2018-10-01T12:00:00.001Z",
"createdBy": "TenantId",
"inputFolderId": "<inputFolderId>",
"outputFolderId": "<outputFolderId>",
"configurationId": "<configurationId>",
"maximumExecutionTimeInSeconds":"7200"
},
{
"jobId": "<job_execution_id>",
"modelId": "<modelId>",
"environmentId": "<environmentId>",
"message": "Failed to start environment [An environment has already been started for the configuration]",
"status": "FAILED",
"creationDate": "2018-10-02T12:00:00.001Z",
"createdBy": "TenantId",
"inputFolderId": "<inputFolderId>",
"outputFolderId": "<outputFolderId>",
"configurationId": "<configurationId>",
"maximumExecutionTimeInSeconds":"7200"
}
],
"totalPages": 1,
"totalElements": 3,
"last": true,
"size": 20,
"number": 0,
"numberOfElements": 3,
"first": true,
"sort": null
}
The second and third entries illustrate that the chain of messages is returned to the user if a fatal error is encountered and the execution logic cannot recover. The Job Manager attempts to stop the used instances while preserving any outputs that might have been produced.
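To work through a paged job list and pick out the error messages, a client can follow the `last` flag of each response. The sketch below makes several assumptions: the query parameters are literally named `pageNumber` and `pageSize`, page numbering starts at 0 (as `"number": 0` in the sample suggests), and list entries carry their ID in `jobId` as shown in the sample. All function names are hypothetical, and `fetch_page` stands for any callable that requests one page and returns the parsed JSON.

```python
from urllib.parse import urlencode


def jobs_page_url(base_url, page_number=0, page_size=100):
    """Build the list URL with the pageNumber and pageSize query parameters."""
    query = urlencode({"pageNumber": page_number, "pageSize": page_size})
    return f"{base_url.rstrip('/')}/api/jobmanager/v3/jobs?{query}"


def iter_jobs(fetch_page):
    """Yield every job, following pages until a response reports last=true.

    fetch_page is a hypothetical callable that takes a page number and
    returns the parsed JSON of one list response.
    """
    page_number = 0  # Assumption: zero-based page numbering.
    while True:
        page = fetch_page(page_number)
        yield from page["content"]
        # Stop if the flag is missing, to avoid looping forever.
        if page.get("last", True):
            break
        page_number += 1


def failure_messages(jobs):
    """Map job ID to message for all jobs whose status is FAILED."""
    return {job["jobId"]: job["message"] for job in jobs
            if job["status"] == "FAILED"}
```

Applied to the sample response, `failure_messages` would return only the third entry, since the second entry is still in status STOPPING.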