Batch Recommendations

Use an asynchronous batch workflow to get recommendations from large datasets that do not require real-time updates.

For instance, you might create a batch inference job to get product recommendations for all users on an email list, or to get item-to-item similarities (SIMS) across an inventory. To start a batch inference job, call CreateBatchInferenceJob.

To get batch recommendations, the IAM role that invokes the CreateBatchInferenceJob operation must have read permission on your input Amazon S3 bucket and write permission on your output Amazon S3 bucket.
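As a rough sketch of the permissions described above, the following builds an IAM policy document as a Python dictionary. The bucket names, the helper name build_batch_s3_policy, and the choice of S3 actions (s3:GetObject and s3:ListBucket for reading, s3:PutObject for writing) are assumptions for illustration, not an official minimum set.

```python
import json


def build_batch_s3_policy(input_bucket, output_bucket):
    """Return an IAM policy document (as a dict) that grants read access to
    the input bucket and write access to the output bucket.

    The action list is an assumption for illustration only.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Read access to the bucket that holds the input JSON file
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:ListBucket"],
                "Resource": [
                    f"arn:aws:s3:::{input_bucket}",
                    f"arn:aws:s3:::{input_bucket}/*",
                ],
            },
            {
                # Write access to the bucket that receives the output file
                "Effect": "Allow",
                "Action": ["s3:PutObject"],
                "Resource": [f"arn:aws:s3:::{output_bucket}/*"],
            },
        ],
    }


# Hypothetical bucket names
policy = build_batch_s3_policy("my-input-bucket", "my-output-bucket")
print(json.dumps(policy, indent=2))
```

The printed JSON could be attached to the role passed as roleArn, after adjusting it to your own buckets and security requirements.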

Getting Batch Recommendations using Python

Use the following code to get batch recommendations with the AWS SDK for Python (Boto3). The example includes the optional itemExplorationConfig hyperparameters, which configure exploration for solution versions trained with the USER_PERSONALIZATION recipe. For more information, see User-Personalization Recipe.

The operation reads an input JSON file from an Amazon S3 bucket and places an output JSON file (input-file-name.out) in an Amazon S3 bucket.

The first item in the response file is considered by Amazon Personalize to be of most interest to the user.

import boto3

# Client for the Amazon Personalize API
personalize_rec = boto3.client(service_name='personalize')

response = personalize_rec.create_batch_inference_job(
    solutionVersionArn="Solution version ARN",
    jobName="Batch job name",
    roleArn="IAM role ARN",
    batchInferenceJobConfig={
        # optional USER_PERSONALIZATION recipe hyperparameters
        "itemExplorationConfig": {
            "explorationWeight": "0.3",
            "explorationItemAgeCutOff": "30"
        }
    },
    jobInput={"s3DataSource": {"path": "S3 input path"}},
    jobOutput={"s3DataDestination": {"path": "S3 output path"}}
)

# The response contains the ARN of the new batch inference job
print(response['batchInferenceJobArn'])

The command returns the ARN for the batch job (the batchInferenceJobArn).

A batch job might take a while to complete. You can check a job’s status by calling DescribeBatchInferenceJob and passing the batchInferenceJobArn as the input parameter. You can also list all Amazon Personalize batch inference jobs in your AWS environment by calling ListBatchInferenceJobs.
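The status check described above can be wrapped in a simple polling loop. This is a minimal sketch: the helper name wait_for_batch_job, the polling interval, and the poll limit are assumptions, and the client argument is expected to be a Boto3 Personalize client. In Boto3, the job ARN is passed as batchInferenceJobArn, and the sketch treats ACTIVE and CREATE FAILED as the terminal statuses.

```python
import time


def wait_for_batch_job(client, job_arn, poll_seconds=60, max_polls=120):
    """Poll DescribeBatchInferenceJob until the job reaches a terminal status.

    `client` is expected to be a Boto3 Personalize client, for example
    boto3.client('personalize'). Returns the final status string.
    """
    for _ in range(max_polls):
        response = client.describe_batch_inference_job(
            batchInferenceJobArn=job_arn
        )
        status = response["batchInferenceJob"]["status"]
        # ACTIVE means the job finished; CREATE FAILED means it did not
        if status in ("ACTIVE", "CREATE FAILED"):
            return status
        time.sleep(poll_seconds)
    raise TimeoutError(f"Job {job_arn} did not finish after {max_polls} polls")
```

With a real client, this might be called as wait_for_batch_job(boto3.client('personalize'), job_arn), where job_arn is the ARN returned when the job was created.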

Input and Output JSON Examples

The CreateBatchInferenceJob operation uses a solution version to make recommendations based on data provided in an input JSON file. The result is then written as a JSON file to an Amazon S3 bucket. The following sections show correctly formatted JSON input and output examples for each recipe type.

User-Personalization recipes

[ Input ]

{"userId": "4638"}
{"userId": "663"}
{"userId": "3384"}
...

The input file contains one JSON object per line. Do not put commas ( , ) between the objects.
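An input file in this layout can be produced by serializing each record separately and joining the results with newlines. The following is a minimal sketch; the helper name to_json_lines is hypothetical, and uploading the resulting text to Amazon S3 is left out.

```python
import json


def to_json_lines(records):
    """Serialize each record as one JSON object per line, with no commas
    or enclosing array between the objects."""
    return "\n".join(json.dumps(record) for record in records)


user_ids = ["4638", "663", "3384"]
body = to_json_lines({"userId": uid} for uid in user_ids)
print(body)
```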

[ Output ]

{"input": {"userId": "4638"}, "output": {"recommendedItems": ["296", "1", "260", "318"], "scores": [0.0009785, 0.000976, 0.0008851]}}
{"input": {"userId": "663"}, "output": {"recommendedItems": ["1393", "3793", "2701", "3826"], "scores": [0.00008149, 0.00007025, 0.000652]}}
{"input": {"userId": "3384"}, "output": {"recommendedItems": ["8368", "5989", "40815", "48780"], "scores": [0.003015, 0.00154, 0.00142]}}
...

Personalize-Ranking

[ Input ]

{"userId": "891", "itemList": ["27", "886", "101"]}
{"userId": "445", "itemList": ["527", "55", "901"]}
{"userId": "71", "itemList": ["29", "351", "199"]}
...

[ Output ]

{"input": {"userId": "891", "itemList": ["27", "886", "101"]}, "output": {"recommendedItems": ["27", "101", "886"], "scores": [0.48421, 0.28133, 0.23446]}}
{"input": {"userId": "445", "itemList": ["527", "55", "901"]}, "output": {"recommendedItems": ["901", "527", "55"], "scores": [0.46972, 0.31011, 0.22017]}}
{"input": {"userId": "71", "itemList": ["29", "351", "199"]}, "output": {"recommendedItems": ["351", "29", "199"], "scores": [0.68937, 0.24829, 0.06232]}}
...

SIMS

[ Input ]

{"itemId": "105"}
{"itemId": "106"}
{"itemId": "441"}
...

[ Output ]

{"input": {"itemId": "105"}, "output": {"recommendedItems": ["106", "107", "49"]}}
{"input": {"itemId": "106"}, "output": {"recommendedItems": ["105", "107", "49"]}}
{"input": {"itemId": "441"}, "output": {"recommendedItems": ["2", "442", "435"]}}
...
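
Once an output file has been downloaded, it can be read line by line. The sketch below assumes that each line is a single JSON object with an "input" key and an "output" key, and that "scores", when present, sits inside "output"; the helper name parse_batch_output is hypothetical. Because recommendations are ordered by relevance, the first entry in recommendedItems is the top recommendation for each input.

```python
import json


def parse_batch_output(text):
    """Parse batch output text (one JSON object per line) into a list of
    dicts with the original input, the recommended items, and the scores."""
    results = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        output = record.get("output") or {}
        results.append({
            "input": record["input"],
            "items": output.get("recommendedItems", []),
            # None when the line has no scores, as in the SIMS example
            "scores": output.get("scores"),
        })
    return results


sample = "\n".join([
    '{"input": {"itemId": "105"}, "output": {"recommendedItems": ["106", "107", "49"]}}',
    '{"input": {"itemId": "106"}, "output": {"recommendedItems": ["105", "107", "49"]}}',
])
for row in parse_batch_output(sample):
    top = row["items"][0] if row["items"] else None
    print(row["input"], "->", top)
```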