Model Overview

Data-efficient Image Transformer (DeiT).

Reference

Training data-efficient image transformers & distillation through attention

The original ViT models required multiple weeks of training on expensive infrastructure and relied on external data. DeiT (data-efficient image transformers) are image classification transformers that train more efficiently, requiring far less data and far fewer computing resources than the original ViT models.

Installation

Keras and KerasHub can be installed with:

pip install -U -q keras-hub
pip install -U -q keras

JAX, TensorFlow, and Torch come preinstalled in Kaggle Notebooks. For instructions on installing them in another environment, see the Keras Getting Started page.
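Since Keras 3 can run on any of the three backends, it may help to pick one explicitly. A minimal sketch, assuming Keras 3 and that the chosen backend is already installed; the choice of "jax" here is only an example:

```python
import os

# Keras 3 reads KERAS_BACKEND once, at the first `import keras`,
# so set it before importing keras or keras_hub.
# Accepted values: "jax", "tensorflow", "torch".
os.environ["KERAS_BACKEND"] = "jax"

# import keras_hub  # imported after this point, it would use the JAX backend
```

Changing the variable after Keras has been imported has no effect on the running process; restart the notebook kernel to switch backends.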

Presets

The following model checkpoints are provided by the Keras team. Weights have been ported from https://huggingface.co. Full code examples for each are available below.

Preset name	Parameters	Description
deit_tiny_distilled_patch16_224_imagenet	5.52M	DeiT-T16 model pre-trained on the ImageNet 1k dataset with image resolution of 224x224
deit_small_distilled_patch16_224_imagenet	21.66M	DeiT-S16 model pre-trained on the ImageNet 1k dataset with image resolution of 224x224
deit_base_distilled_patch16_224_imagenet	85.80M	DeiT-B16 model pre-trained on the ImageNet 1k dataset with image resolution of 224x224
deit_base_distilled_patch16_384_imagenet	86.09M	DeiT-B16 model pre-trained on the ImageNet 1k dataset with image resolution of 384x384
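As a minimal sketch of how one of these presets might be loaded for inference, the snippet below assumes that `keras_hub.models.ImageClassifier.from_preset` resolves the preset names in the installed KerasHub version (if it does not, the Hugging Face handle form `hf://keras/<preset name>` may be needed instead). The names `PRESET_IMAGE_SIZE`, `load_classifier`, and `classify_random_batch` are illustrative helpers, not part of KerasHub. The input is random noise, so the predictions are meaningless and no output is shown.

```python
# Input resolution for each preset, copied from the table above.
PRESET_IMAGE_SIZE = {
    "deit_tiny_distilled_patch16_224_imagenet": 224,
    "deit_small_distilled_patch16_224_imagenet": 224,
    "deit_base_distilled_patch16_224_imagenet": 224,
    "deit_base_distilled_patch16_384_imagenet": 384,
}


def load_classifier(preset):
    """Load a pretrained classifier; weights are downloaded on first use."""
    # Imported lazily so this module can be loaded without KerasHub installed.
    import keras_hub

    return keras_hub.models.ImageClassifier.from_preset(preset)


def classify_random_batch(preset, batch_size=2):
    """Run the classifier on a batch of random images (illustrative only)."""
    import numpy as np

    size = PRESET_IMAGE_SIZE[preset]
    # Raw pixel values in [0, 255]; the preset's preprocessing is assumed
    # to handle resizing and rescaling.
    images = np.random.uniform(0, 255, size=(batch_size, size, size, 3))
    classifier = load_classifier(preset)
    return classifier.predict(images.astype("float32"))
```

Calling `classify_random_batch("deit_tiny_distilled_patch16_224_imagenet")` would download the smallest checkpoint and return one row of class scores per image.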


Paper: Training data-efficient image transformers & distillation through attention (arXiv 2012.12877, published Dec 23, 2020)
