# Paddle-DALL-E

**Repository Path**: AgentMaker/Paddle-DALL-E

## Basic Information

- **Project Name**: Paddle-DALL-E
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: Apache-2.0
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2021-05-11
- **Last Updated**: 2021-05-11

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# Paddle-DALL-E
![GitHub forks](https://img.shields.io/github/forks/AgentMaker/Paddle-DALL-E)
![GitHub Repo stars](https://img.shields.io/github/stars/AgentMaker/Paddle-DALL-E)
![GitHub release (latest by date including pre-releases)](https://img.shields.io/github/v/release/AgentMaker/Paddle-DALL-E?include_prereleases)
![GitHub](https://img.shields.io/github/license/AgentMaker/Paddle-DALL-E)  
A PaddlePaddle implementation of OpenAI's DALL-E. [[Original repo]](https://github.com/openai/DALL-E)

This implementation currently includes only the dVAE part, so it cannot generate images from text.

## Install Package
* Install with pip:
```shell
$ pip install paddledalle==1.0.0 -i https://pypi.python.org/pypi 
```
* Install from a wheel package: [[Release packages]](https://github.com/AgentMaker/Paddle-DALL-E/releases)
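
After installing, it can help to confirm that the modules used in the Quick Start are importable. Note that the Quick Start imports the package as `dall_e`, while the pip distribution is named `paddledalle`. The following is a minimal sketch using only the standard library; the helper name `missing_modules` is hypothetical and not part of this package.

```python
import importlib.util


def missing_modules(names=("paddle", "dall_e", "PIL")):
    """Return the top-level module names from `names` that cannot be found."""
    # find_spec returns None when a top-level module is not installed.
    return [n for n in names if importlib.util.find_spec(n) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("paddle, dall_e and PIL are importable")
```

An empty list means the Quick Start imports should resolve; it does not verify that the pretrained weights can be downloaded.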

## Quick Start
```python
import paddle
import paddle.nn.functional as F
import paddle.vision.transforms as T
import paddle.vision.transforms.functional as TF

from PIL import Image
from dall_e import load_model, map_pixels, unmap_pixels

target_image_size = 256

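def preprocess(img):
    # PIL reports size as (width, height); s is the shorter side.
    s = min(img.size)

    if s < target_image_size:
        raise ValueError(f'min dim for image {s} < {target_image_size}')

    # Scale so the shorter side equals target_image_size.
    # TF.resize expects the size as (height, width).
    r = target_image_size / s
    s = (round(r * img.size[1]), round(r * img.size[0]))
    img = TF.resize(img, s, interpolation='lanczos')
    img = TF.center_crop(img, output_size=2 * [target_image_size])
    # Convert to a CHW float tensor, add a batch axis, then remap the pixel range.
    img = paddle.unsqueeze(T.ToTensor()(img), 0)
    return map_pixels(img)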

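# Load the pretrained dVAE encoder and decoder.
enc = load_model('encoder', pretrained=True)
dec = load_model('decoder', pretrained=True)

# Replace '1000x-1.jpg' with the path to your own image.
# convert('RGB') guards against grayscale or RGBA inputs.
img = Image.open('1000x-1.jpg').convert('RGB')
x = preprocess(img)

# Encode: take the most likely codebook token at each spatial position,
# then one-hot encode and move the vocabulary axis to the channel axis (NCHW).
z_logits = enc(x)
z = paddle.argmax(z_logits, axis=1)
z = F.one_hot(z, num_classes=enc.vocab_size).transpose((0, 3, 1, 2))

# Decode: use the first three output channels and map them back to pixel values.
x_stats = dec(z)
x_rec = unmap_pixels(F.sigmoid(x_stats[:, :3]))

# Convert the reconstruction to an HWC uint8 array and display it.
out = (x_rec[0].transpose((1, 2, 0))*255.).astype('uint8').numpy()
out = Image.fromarray(out)
out.show()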
```
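
The `preprocess` function above resizes the image so its shorter side equals `target_image_size` and then center-crops a square. The sketch below reproduces only that arithmetic in plain Python so the resulting sizes can be checked without loading an image. The function name `resize_and_crop_geometry` is hypothetical, and the crop offsets are an approximation of what `TF.center_crop` computes; the library's rounding may differ by a pixel.

```python
def resize_and_crop_geometry(width, height, target=256):
    """Return ((new_h, new_w), (left, top, right, bottom)) for resize + center crop."""
    shorter = min(width, height)
    if shorter < target:
        raise ValueError(f'min dim for image {shorter} < {target}')

    # Scale factor that brings the shorter side to `target` pixels.
    r = target / shorter
    new_h, new_w = round(r * height), round(r * width)

    # Center a target x target window inside the resized image (assumed rounding).
    left = int(round((new_w - target) / 2.0))
    top = int(round((new_h - target) / 2.0))
    return (new_h, new_w), (left, top, left + target, top + target)


if __name__ == "__main__":
    # A 1024x512 (width x height) image is halved, then cropped horizontally.
    print(resize_and_crop_geometry(1024, 512))  # ((256, 512), (128, 0, 384, 256))
```

Images whose shorter side is below `target` raise `ValueError`, matching the check in `preprocess`.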