
Tutorial: Use Hugging Face Transformers with Snowflake External Functions

This repository contains code and instructions on how to integrate Hugging Face Transformers with Snowflake using External Functions. Below you can find an architectural overview of the solution.

[Figure: architecture overview]

Tutorial

0. Prerequisites

  1. A running Snowflake warehouse (see the Snowflake getting-started documentation).
  2. A database with data, e.g. tweet_data

1. Deploy Hugging Face endpoint with Amazon API Gateway on Amazon SageMaker

TODO: Add API Gateway policy

We are going to use the AWS CDK to deploy a Hugging Face Transformers model to Amazon SageMaker and to create the Amazon API Gateway that connects Snowflake to our SageMaker endpoint.

Install the required CDK dependencies. Make sure you have the CDK installed.

pip3 install -r aws-infrastructure/requirements.txt

Change directory into aws-infrastructure/

cd aws-infrastructure/

Bootstrap your application in the cloud.

cdk bootstrap \
   -c model="distilbert-base-uncased-finetuned-sst-2-english" \
   -c task="text-classification"

Deploy your Hugging Face Transformers model to Amazon SageMaker.

cdk deploy \
   -c model="distilbert-base-uncased-finetuned-sst-2-english" \
   -c task="text-classification"

Test your endpoint with curl:

curl --request POST \
  --url {HuggingfaceSagemakerEndpoint.hfapigwEndpointE75D67B4} \
  --header 'Content-Type: application/json' \
  --data '{
	"inputs": "Hugging Face, the winner of VentureBeat’s Innovation in Natural Language Process/Understanding Award for 2021, is looking to level the playing field. The team, launched by Clément Delangue and Julien Chaumond in 2016, was recognized for its work in democratizing NLP, the global market value for which is expected to hit $35.1 billion by 2026. This week, Google’s former head of Ethical AI Margaret Mitchell joined the team."
}'

You should see the following response: [{"label":"POSITIVE","score":0.9970797896385193}]
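
The same request can also be sent from Python. The following is a minimal sketch using only the standard library, assuming the endpoint accepts the JSON body shown in the curl call and returns a list of label/score objects like the response above. The function names `build_request`, `top_label`, and `classify` are illustrative and not part of this repository.

```python
import json
import urllib.request
from typing import Tuple


def build_request(url: str, text: str) -> urllib.request.Request:
    """Build the same POST request that the curl command above sends."""
    body = json.dumps({"inputs": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def top_label(response_body: str) -> Tuple[str, float]:
    """Return the highest-scoring (label, score) pair from a response
    shaped like [{"label": "...", "score": 0.99}]."""
    predictions = json.loads(response_body)
    best = max(predictions, key=lambda p: p["score"])
    return best["label"], best["score"]


def classify(url: str, text: str) -> Tuple[str, float]:
    """Send the text to the endpoint and return the top prediction.
    Requires network access and a deployed endpoint."""
    with urllib.request.urlopen(build_request(url, text)) as resp:
        return top_label(resp.read().decode("utf-8"))
```

To try it, call `classify(url, "some text")` with your own API Gateway endpoint URL in place of the `{HuggingfaceSagemakerEndpoint.hfapigwEndpointE75D67B4}` placeholder.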

2. Create API Integration in Snowflake

Open a new Worksheet in the Snowflake Web Console and create a new API Integration. For this we need our API Gateway endpoint and the ARN of the snowflake_role. Change the values in the snippet below and then execute it.

CREATE OR REPLACE API INTEGRATION huggingface
    API_PROVIDER = aws_api_gateway
    API_AWS_ROLE_ARN = 'arn:aws:iam::{YOUR-ACCOUNT-ID}:role/snowflake_role'
    API_ALLOWED_PREFIXES = ('{HuggingfaceSagemakerEndpoint.hfapigwEndpointE75D67B4}')
    ENABLED = TRUE;

[Figure: creating the API integration in the Snowflake worksheet]
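
If you prefer to fill in the placeholders programmatically rather than by hand, the statement can be rendered from your account ID and endpoint URL. This is a small sketch, not part of the repository: the helper name `render_api_integration` and its defaults are hypothetical, and the role name `snowflake_role` is taken from the snippet above.

```python
import re


def render_api_integration(account_id: str, endpoint_url: str,
                           name: str = "huggingface",
                           role_name: str = "snowflake_role") -> str:
    """Render the CREATE API INTEGRATION statement with real values."""
    # AWS account IDs are always 12 digits; reject anything else early.
    if not re.fullmatch(r"\d{12}", account_id):
        raise ValueError("AWS account ID must be exactly 12 digits")
    # Double any single quotes so the URL stays inside the SQL string literal.
    safe_url = endpoint_url.replace("'", "''")
    return (
        f"CREATE OR REPLACE API INTEGRATION {name}\n"
        f"    API_PROVIDER = aws_api_gateway\n"
        f"    API_AWS_ROLE_ARN = 'arn:aws:iam::{account_id}:role/{role_name}'\n"
        f"    API_ALLOWED_PREFIXES = ('{safe_url}')\n"
        f"    ENABLED = TRUE;"
    )
```

The returned string can be pasted into a Snowflake worksheet and executed as usual.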

3. Update IAM role (different CDK project)

Before we can create and use our external function, we need to authorize Snowflake to assume our snowflake_role so that it can access our API Gateway. To do this, we need to extract the API_AWS_IAM_USER_ARN and API_AWS_EXTERNAL_ID from our Snowflake API integration.

To get these values, run the following snippet in the Snowflake web console:

describe integration huggingface;

Then copy the API_AWS_IAM_USER_ARN and API_AWS_EXTERNAL_ID.

[Figure: output of describe integration showing API_AWS_IAM_USER_ARN and API_AWS_EXTERNAL_ID]

To authorize Snowflake, we need to manually adjust the trust relationship for our snowflake_role. Go to the IAM service in the AWS Management Console, search for the snowflake_role, and click the "Edit trust policy" button on the "Trust relationships" tab.

[Figure: "Trust relationships" tab of the snowflake_role in the IAM console]

Replace {API_AWS_IAM_USER_ARN} and {API_AWS_EXTERNAL_ID} in the snippet below with your values and click "Update policy".

{
	"Version": "2012-10-17",
	"Statement": [
		{
			"Effect": "Allow",
			"Principal": {
				"AWS": "{API_AWS_IAM_USER_ARN}"
			},
			"Action": "sts:AssumeRole",
			"Condition": {"StringEquals": {"sts:ExternalId": "{API_AWS_EXTERNAL_ID}"}}
		}
	]
}
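
The policy document can also be generated from the two values copied out of Snowflake, which avoids quoting mistakes when editing JSON by hand. A minimal sketch follows; the helper name `build_trust_policy` is hypothetical, and the structure simply mirrors the snippet above.

```python
import json


def build_trust_policy(iam_user_arn: str, external_id: str) -> str:
    """Return the trust policy JSON for the snowflake_role.

    iam_user_arn -- the API_AWS_IAM_USER_ARN value from Snowflake
    external_id  -- the API_AWS_EXTERNAL_ID value from Snowflake
    """
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": iam_user_arn},
                "Action": "sts:AssumeRole",
                # Only allow role assumption when Snowflake presents this external ID.
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }
    return json.dumps(policy, indent=2)
```

The resulting string can be pasted into the trust policy editor. If you would rather not use the console, it could in principle be applied with the IAM `update_assume_role_policy` call of an AWS SDK, though this tutorial only covers the manual route.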

4. Create External Function

After we have enabled the trust relationship between Snowflake and our snowflake_role, we can create our external function. Replace the {HuggingfaceSagemakerEndpoint.hfapigwEndpointE75D67B4} value with your API Gateway endpoint and then execute the following snippet in Snowflake.

CREATE OR REPLACE external function huggingface_function(v varchar)
    returns variant
    api_integration = huggingface
    as '{HuggingfaceSagemakerEndpoint.hfapigwEndpointE75D67B4}';

[Figure: creating the external function in the Snowflake worksheet]
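
For background, Snowflake external functions exchange rows with the remote service as JSON batches: each request body has the shape `{"data": [[row_number, argument], ...]}`, and the response must contain one `[row_number, result]` pair per input row. This README does not show how the repository translates between that format and the `{"inputs": ...}` body used in the curl test, so the following is only an illustrative sketch. The function names, and the assumption that the model returns one prediction per input text, are assumptions and not taken from the repository.

```python
import json
from typing import Any, Dict, List, Tuple


def snowflake_to_model(body: str) -> Tuple[List[int], Dict[str, List[str]]]:
    """Split a Snowflake batch into row numbers and a model payload."""
    rows = json.loads(body)["data"]
    row_numbers = [row[0] for row in rows]
    texts = [row[1] for row in rows]
    # Assumed payload shape: a list of texts under the "inputs" key.
    return row_numbers, {"inputs": texts}


def model_to_snowflake(row_numbers: List[int], predictions: List[Any]) -> str:
    """Pair each prediction with its row number in Snowflake's response format."""
    if len(row_numbers) != len(predictions):
        raise ValueError("expected exactly one prediction per input row")
    return json.dumps({"data": [[n, p] for n, p in zip(row_numbers, predictions)]})
```

Keeping the row numbers alongside the predictions matters because Snowflake uses them to match each result back to the row that produced it.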

5. Run the External Function on data

Now we can use our external function to run our model on our data. Replace HUGGINGFACE_TEST.PUBLIC.TWEETS and inputs with your own table and column.

select huggingface_function(inputs) from HUGGINGFACE_TEST.PUBLIC.TWEETS limit 100;

The result looks similar to this:

[Figure: query results of the external function invocation]

Resources
