# nova-custom-eval-sdk

**Repository Path**: mirrors_aws/nova-custom-eval-sdk

## Basic Information

- **Project Name**: nova-custom-eval-sdk
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: Apache-2.0
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2025-10-15
- **Last Updated**: 2026-03-28

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# LLM Eval Kit

A Python SDK for creating custom evaluation metrics for LLM evaluation on SageMaker training jobs, with built-in Pydantic validation.

For the official integration with SageMaker training jobs, see the [official AWS SageMaker documentation](https://docs.aws.amazon.com/sagemaker/latest/dg/nova-model-evaluation.html).

## Installation

```
git clone https://github.com/aws/llm-eval-kit.git
cd llm-eval-kit
pip install .
```

## Architecture

The SDK provides:

- **Pydantic Validation**: Automatic input/output validation using Pydantic models
- **PreProcessor**: Input data transformation with validation
- **PostProcessor**: Output data formatting with validation
- **Decorators**: Simplified processor creation (`@preprocess`, `@postprocess`)
- **Lambda Handler Builder**: Easy Lambda function creation
- **Exception Handling**: Custom error types with validation feedback

## Quick Start

### Complete Example

See `example/run_example.py` for a complete working example that runs locally.

### Run in AWS Lambda

To use the SDK in AWS Lambda, create a Lambda function (follow this [guide](https://docs.aws.amazon.com/lambda/latest/dg/getting-started.html)) and upload `llm-eval-kit` as a Lambda layer. The [GitHub releases](https://github.com/aws/llm-eval-kit/releases) page should include a pre-built `llm-eval-kit-layer.zip` file. Use the command below to publish it as a custom Lambda layer:
```
aws lambda publish-layer-version \
    --layer-name llm-eval-kit-layer \
    --zip-file fileb://llm-eval-kit-layer.zip \
    --compatible-runtimes python3.12 python3.11 python3.10 python3.9
```

Add this layer to your Lambda function as a custom layer, along with the required AWS layer `AWSLambdaPowertoolsPythonV3-python312-arm64` (needed for the Pydantic dependency). Then update your Lambda code with:

```python
from llm_eval_kit.processors.decorators import preprocess, postprocess
from llm_eval_kit.lambda_handler import build_lambda_handler


@preprocess
def preprocessor(event: dict, context) -> dict:
    # Pass the system prompt, user prompt, and gold answer through unchanged
    data = event.get('data', {})
    return {
        "statusCode": 200,
        "body": {
            "system": data.get("system"),
            "prompt": data.get("prompt", ""),
            "gold": data.get("gold", "")
        }
    }


@postprocess
def postprocessor(event: dict, context) -> dict:
    # data is already validated and extracted from event
    data = event.get('data', {})
    inference_output = data.get('inference_output', '')
    gold = data.get('gold', '')

    metrics = []
    # 0.0 on a case-insensitive exact match, 1.0 otherwise
    inverted_accuracy = 0.0 if inference_output.lower() == gold.lower() else 1.0
    metrics.append({
        "metric": "inverted_accuracy_custom",
        "value": inverted_accuracy
    })

    # Add more metrics here

    return {
        "statusCode": 200,
        "body": metrics
    }


# Build Lambda handler
lambda_handler = build_lambda_handler(
    preprocessor=preprocessor,
    postprocessor=postprocessor
)
```

## Input/Output Validation

The SDK automatically validates the following payloads:

### Preprocessing Input

```json
{
  "process_type": "preprocess",
  "data": {
    "prompt": "what can you do?",
    "gold": "Hello! How can I help you today?",
    "system": "You are a helpful assistant"
  }
}
```

### Postprocessing Input

```json
{
  "process_type": "postprocess",
  "data": [
    {
      "prompt": "what can you do",
      "inference_output": "Hello! How can I help you today?",
      "gold": "Hello! How can I help you today?"
    }
  ]
}
```

## Testing

```bash
# Run all tests
python -m pytest -v

# Run example
python example/run_example.py
```

## Development

```bash
# Install in development mode
pip install -e .

# Run tests with coverage
python -m pytest tests/ --cov=llm_eval_kit
```

## Contributing

See [CONTRIBUTING](CONTRIBUTING.md#security-issue-notifications) for more information.

## License

This project is licensed under the Apache-2.0 License.
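
## Appendix: Local Metric Sketch

The metric logic from the Lambda example can be tried locally before deploying. The sketch below is plain Python and does not import `llm-eval-kit`. The helper names `inverted_accuracy` and `score_postprocess_event` are hypothetical and not part of the SDK. It assumes the list-shaped `data` field shown under Postprocessing Input, and it does not reproduce the SDK's Pydantic validation.

```python
import json


# Hypothetical helper for local experimentation; not part of llm-eval-kit.
def inverted_accuracy(inference_output: str, gold: str) -> float:
    """Return 0.0 on a case-insensitive exact match, 1.0 otherwise."""
    return 0.0 if inference_output.lower() == gold.lower() else 1.0


# Hypothetical helper; assumes "data" is a list of records, as in the
# Postprocessing Input example.
def score_postprocess_event(event: dict) -> list:
    """Compute one metric entry per record in a postprocess-style event."""
    if event.get("process_type") != "postprocess":
        raise ValueError("expected process_type == 'postprocess'")
    return [
        {
            "metric": "inverted_accuracy_custom",
            "value": inverted_accuracy(
                record.get("inference_output", ""),
                record.get("gold", ""),
            ),
        }
        for record in event.get("data", [])
    ]


# Sample payload copied from the Postprocessing Input section.
sample_event = {
    "process_type": "postprocess",
    "data": [
        {
            "prompt": "what can you do",
            "inference_output": "Hello! How can I help you today?",
            "gold": "Hello! How can I help you today?",
        }
    ],
}

if __name__ == "__main__":
    print(json.dumps(score_postprocess_event(sample_event), indent=2))
```

Keeping the metric in a small pure function like this makes it easy to unit test on its own, and the same function body can then be called from a `@postprocess`-decorated handler.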