# huggingface-sagemaker-snowflake-example

**Repository Path**: mirrors_huggingface/huggingface-sagemaker-snowflake-example

- **Default Branch**: main
- **Created**: 2022-03-17
- **Last Updated**: 2025-12-13

## README

# Tutorial: Use Hugging Face Transformers with Snowflake External Functions

This repository contains code and instructions for integrating Hugging Face Transformers with Snowflake using External Functions. Below you can find an architectural overview of the solution.

![architecture](assets/architecture.png)

# Tutorial

## 0. Prerequisites

1. A running Snowflake warehouse. Get started [here](https://signup.snowflake.com/?utm_cta=trial-en-www-homepage-top-right-nav-ss-evg&_ga=2.4253299.1747282503.1647350425-2028784425.1644849379).
2. A database with data, e.g. [tweet_data](tweet_data.csv)
   * [YT: Load CSV data to create a new table in Snowflake](https://www.youtube.com/watch?v=GfCBhZK3X7w&ab_channel=KahanDataSolutions)

# 1. Deploy Hugging Face endpoint with Amazon API Gateway on Amazon SageMaker

TODO: Add API Gateway policy

We are going to use AWS CDK to deploy your Hugging Face Transformers model to Amazon SageMaker and to create the Amazon API Gateway that connects Snowflake to our SageMaker endpoint.

Install the required CDK dependencies. Make sure you have the [cdk](https://docs.aws.amazon.com/cdk/latest/guide/getting_started.html#getting_started_install) installed.

```bash
pip3 install -r aws-infrastructure/requirements.txt
```

Change directory into `aws-infrastructure/`:

```bash
cd aws-infrastructure/
```

[Bootstrap](https://docs.aws.amazon.com/cdk/latest/guide/bootstrapping.html) your application in the cloud.
```bash
cdk bootstrap \
   -c model="distilbert-base-uncased-finetuned-sst-2-english" \
   -c task="text-classification"
```

Deploy your Hugging Face Transformers model to Amazon SageMaker:

```bash
cdk deploy \
   -c model="distilbert-base-uncased-finetuned-sst-2-english" \
   -c task="text-classification"
```

Test your endpoint with `curl`:

```bash
curl --request POST \
  --url {HuggingfaceSagemakerEndpoint.hfapigwEndpointE75D67B4} \
  --header 'Content-Type: application/json' \
  --data '{
	"inputs": "Hugging Face, the winner of VentureBeat’s Innovation in Natural Language Process/Understanding Award for 2021, is looking to level the playing field. The team, launched by Clément Delangue and Julien Chaumond in 2016, was recognized for its work in democratizing NLP, the global market value for which is expected to hit $35.1 billion by 2026. This week, Google’s former head of Ethical AI Margaret Mitchell joined the team."
}'
```

You should see the following response: `[{"label":"POSITIVE","score":0.9970797896385193}]`

# 2. Create API Integration in Snowflake

Open a new Worksheet in the Snowflake Web Console and create a new API Integration. For this we need our API Gateway endpoint and the ARN of the `snowflake_role`. Change the values in the snippet below and then execute it.

```sql
CREATE OR REPLACE API INTEGRATION huggingface
  API_PROVIDER = aws_api_gateway
  API_AWS_ROLE_ARN = 'arn:aws:iam::{YOUR-ACCOUNT-ID}:role/snowflake_role'
  API_ALLOWED_PREFIXES = ('{HuggingfaceSagemakerEndpoint.hfapigwEndpointE75D67B4}')
  ENABLED = TRUE
;
```

![create-api-integration](assets/create-api-integration.png)

# 3. Update IAM role (different CDK) project

Before we can create and use our external function, we need to authorize Snowflake to assume our `snowflake_role` so that it can access our API Gateway. To do this we need to extract the `API_AWS_IAM_USER_ARN` and `API_AWS_EXTERNAL_ID` from our Snowflake API integration.
To get them, run the following snippet in the Snowflake web console:

```sql
describe integration huggingface;
```

Then copy the `API_AWS_IAM_USER_ARN` and `API_AWS_EXTERNAL_ID`.

![api-integration-description](assets/api-integration-description.png)

To authorize Snowflake we need to manually adjust the trust relationship for our `snowflake_role`. Go to the IAM service in the AWS Management Console. Search for the `snowflake_role` and click the `Edit trust policy` button on the "Trust Relationships" tab.

![trust-relationships](assets/trust-relationships.png)

Replace `API_AWS_IAM_USER_ARN` and `API_AWS_EXTERNAL_ID` in the snippet below with your values and click "update policy".

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "{API_AWS_IAM_USER_ARN}"
      },
      "Action": "sts:AssumeRole",
      "Condition": {"StringEquals": {"sts:ExternalId": "{API_AWS_EXTERNAL_ID}"}}
    }
  ]
}
```

# 4. Create External Function

After we have enabled the trust relationship between Snowflake and our `snowflake_role`, we can create our external function. Replace the `{HuggingfaceSagemakerEndpoint.hfapigwEndpointE75D67B4}` value with your API Gateway endpoint and then execute the following snippet in Snowflake.

```sql
CREATE OR REPLACE external function huggingface_function(v varchar)
    returns variant
    api_integration = huggingface
    as '{HuggingfaceSagemakerEndpoint.hfapigwEndpointE75D67B4}';
```

![create-external-function](assets/create-external-function.png)

# 5. Run External function on data

Now we can use our external function to run our model on our data. Replace `HUGGINGFACE_TEST.PUBLIC.TWEETS` and `inputs` with your table and column.

```sql
select huggingface_function(inputs) from HUGGINGFACE_TEST.PUBLIC.TWEETS limit 100
```

The result looks similar to this:

![invocation](assets/invocation.png)

# Resources

* [Snowflake: External Functions YT](https://www.youtube.com/watch?v=qangh4oM_zs&ab_channel=SnowflakeInc.)
* [Snowflake: External Functions Docs](https://docs.snowflake.com/en/sql-reference/external-functions-creating-aws-ui.html)
* [Snowflake: API Gateway policy](https://docs.snowflake.com/en/sql-reference/external-functions-creating-aws-common-api-integration-proxy-link.html)
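
# Appendix: Generating the trust policy from step 3

The trust policy in step 3 is filled in by hand, which makes it easy to break the JSON or swap the two values. As an optional alternative, the sketch below builds the same policy document in Python and prints it so it can be pasted into the "Trust Relationships" editor. It is not part of this repository: the function name `build_trust_policy` and the sample ARN and external ID are hypothetical placeholders, and the real values come from `describe integration huggingface;`.

```python
import json


def build_trust_policy(iam_user_arn: str, external_id: str) -> dict:
    """Return the step 3 trust policy with both placeholders filled in.

    Hypothetical helper for illustration; the structure mirrors the
    JSON snippet shown in step 3 of this tutorial.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # API_AWS_IAM_USER_ARN from `describe integration huggingface;`
                "Principal": {"AWS": iam_user_arn},
                "Action": "sts:AssumeRole",
                # API_AWS_EXTERNAL_ID from the same command
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder values (assumptions); replace them with your own.
    policy = build_trust_policy(
        iam_user_arn="arn:aws:iam::123456789012:user/example-user",
        external_id="EXAMPLE_EXTERNAL_ID",
    )
    # json.dumps guarantees well-formed JSON for the console editor.
    print(json.dumps(policy, indent=2))
```

Building the policy as a dictionary and serializing it with `json.dumps` avoids quoting mistakes that are easy to make when editing the JSON manually.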