release notes
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
Published 4/13/2023
Minor release. Contains breaking changes.
The LLaMA model was proposed in LLaMA: Open and Efficient Foundation Language Models. It is a collection of foundation language models ranging from 7B to 65B parameters. You can request access to the weights, then use the conversion script to generate a checkpoint compatible with Hugging Face.
Pix2Struct is a pretrained image-to-text model for purely visual language understanding, which can be finetuned on tasks containing visually-situated language. Pix2Struct has been fine-tuned on various tasks and datasets, ranging from image captioning and visual question answering (VQA) over different inputs (books, charts, science diagrams) to captioning UI components, and others.
transformers by @younesbelkada in #22528
MEGA proposes a new approach to self-attention: each encoder layer has a multi-headed exponential moving average in addition to a single head of standard dot-product attention, giving the attention mechanism stronger positional biases. This allows MEGA to perform competitively with Transformers on standard benchmarks, including LRA, while having significantly fewer parameters. MEGA's compute efficiency also allows it to scale to very long sequences, making it an attractive option for long-document NLP tasks.
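For intuition, the moving-average component can be sketched as follows. This is a toy illustration only, assuming a simple scalar damping factor; the real MEGA layer uses a learned, multi-dimensional damped EMA, and the function below is not the library implementation.

```python
import numpy as np

def ema_smooth(x, alpha):
    """Exponential moving average over the sequence dimension.

    x: (seq_len, d_model) array of token embeddings.
    alpha: damping factor in (0, 1).

    Each position mixes the current input with a running average of all
    earlier positions, which injects a recency/positional bias into the
    representation before attention is applied.
    """
    out = np.zeros_like(x)
    h = np.zeros(x.shape[1])
    for t in range(x.shape[0]):
        h = alpha * x[t] + (1 - alpha) * h  # damped running average
        out[t] = h
    return out
```

Because the recursion only carries a fixed-size state, the cost is linear in sequence length, which is the property that lets this style of model scale to very long documents.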
The model is an optimized GPT2 model with support for Multi-Query Attention.
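A minimal sketch of the idea behind Multi-Query Attention (hypothetical toy code, not the model's actual implementation): all query heads attend against a single shared key/value head, which shrinks the key/value cache by a factor of the head count during inference.

```python
import numpy as np

def multi_query_attention(q, k, v):
    """Toy multi-query attention: many query heads, one shared K/V head.

    q: (n_heads, seq, d_head) per-head queries.
    k, v: (seq, d_head) single key/value projections shared by all heads.
    """
    scores = q @ k.T / np.sqrt(k.shape[-1])           # (n_heads, seq, seq)
    weights = np.exp(scores - scores.max(-1, keepdims=True))
    weights /= weights.sum(-1, keepdims=True)         # softmax over keys
    return weights @ v                                # (n_heads, seq, d_head)
```

In standard multi-head attention, `k` and `v` would each carry an `n_heads` dimension; dropping it is the whole trick.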
The mixture of experts version of the NLLB release has been added to the library.
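For intuition, a mixture-of-experts layer routes each token to a small subset of expert feed-forward networks, so capacity grows without every token paying for every parameter. The sketch below uses top-1 (Switch-style) routing for brevity; it is illustrative only and does not mirror the actual NLLB-MoE router.

```python
import numpy as np

def top1_moe(x, gate_w, experts):
    """Toy top-1 (Switch-style) mixture-of-experts layer.

    x: (tokens, d) inputs; gate_w: (d, n_experts) router weights;
    experts: list of (d, d) expert weight matrices. Each token runs
    through only the expert with the highest gate probability, scaled
    by that probability.
    """
    logits = x @ gate_w                               # (tokens, n_experts)
    probs = np.exp(logits - logits.max(-1, keepdims=True))
    probs /= probs.sum(-1, keepdims=True)             # router softmax
    choice = probs.argmax(-1)                         # winning expert per token
    out = np.zeros_like(x)
    for i, e in enumerate(choice):
        out[i] = probs[i, e] * (x[i] @ experts[e])
    return out, choice
```

Only the chosen expert's weights touch each token, which is why MoE models can hold far more parameters than a dense model with the same per-token compute.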
NLLB-MoE Adds the moe model by @ArthurZucker in #22024
[bnb] Let's make serialization of int8 models possible by @younesbelkada in #22177
You can now push 8bit models and/or load 8bit models directly from the Hub, save memory, and load your 8bit models faster! An example repo here
Notes from the PR:
The BLIP image processor incorrectly passed the dimensions to resize in the order (width, height). This has been reordered to be correct.
In most cases this won't have an effect, as the default height and width are the same. However, it is not backwards compatible for custom configurations with different height and width settings, or for direct calls to the resize method with different height and width values.
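The pitfall is easy to reproduce with a toy nearest-neighbour resize (a hypothetical helper, not the BLIP processor code) that expects (height, width): passing (width, height) silently produces transposed output dimensions, which is exactly why the bug was invisible for square defaults.

```python
import numpy as np

def resize_nearest(img, size):
    """Toy nearest-neighbour resize. `size` is (height, width),
    matching the usual array-layout convention."""
    h, w = size
    rows = np.arange(h) * img.shape[0] // h   # source row per output row
    cols = np.arange(w) * img.shape[1] // w   # source col per output col
    return img[rows][:, cols]

img = np.zeros((32, 64, 3), dtype=np.uint8)   # 32 pixels tall, 64 wide
ok = resize_nearest(img, (16, 48))            # correct: (height, width)
bad = resize_nearest(img, (48, 16))           # bug: (width, height) swapped
```

`ok` has shape (16, 48, 3) as intended, while `bad` comes out (48, 16, 3): no error is raised, the image is just resized to the wrong dimensions.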
The main issue was the placement of the prefix and suffix tokens of the NLLB tokenizer.
Previous behaviour:
>>> from transformers import NllbTokenizer
>>> tokenizer = NllbTokenizer.from_pretrained("facebook/nllb-200-distilled-600M")
>>> tokenizer("How was your day?").input_ids
[13374, 1398, 4260, 4039, 248130, 2, 256047]
>>> # 2: '</s>'
>>> # 256047 : 'eng_Latn'
New behaviour:
>>> from transformers import NllbTokenizer
>>> tokenizer = NllbTokenizer.from_pretrained("facebook/nllb-200-distilled-600M")
>>> tokenizer("How was your day?").input_ids
[256047, 13374, 1398, 4260, 4039, 248130, 2]
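The change can be sketched with a toy helper (illustrative only, not the tokenizer's real code): the language code moves from after the end-of-sequence token to the front of the sequence.

```python
def build_nllb_inputs(token_ids, lang_id, eos_id, legacy=False):
    """Toy sketch of the prefix/suffix fix.

    legacy=True reproduces the old layout, where the language code was
    appended after EOS; the new layout prepends it before the tokens.
    """
    if legacy:
        return token_ids + [eos_id, lang_id]   # old: tokens ... </s> eng_Latn
    return [lang_id] + token_ids + [eos_id]    # new: eng_Latn tokens ... </s>
```

With the example ids above (`2` is `'</s>'` and `256047` is `'eng_Latn'`), the helper reproduces both layouts shown in the snippets.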
In case you have pipelines that were relying on the old behavior, here is how you would enable it once again:
>>> from transformers import NllbTokenizer
>>> tokenizer = NllbTokenizer.from_pretrained("facebook/nllb-200-distilled-600M", legacy_behaviour=True)
[NLLB Tokenizer] Fix the prefix tokens 🚨🚨🚨 by @ArthurZucker in #22313
The BLIP model is now available in TensorFlow.
This PR adds the possibility to export TF generate with a TF-native tokenizer: the full pipeline in a single TF graph.
A new task guide has been added, focusing on depth estimation.
- MgpstrModelIntegrationTest by @ydshieh in #22195
- [XGLM] Add accelerate support for XGLM by @younesbelkada in #22207
- dash==2.8.1 for now for daily CI by @ydshieh in #22227
- dash==2.8.1 for now for daily CI" by @ydshieh in #22233
- TFCvtModel by @gcuder in #22267
- max_memory for device_map strategies by @sgugger in #22311
- generate(synced_gpus=True, ...) by @stas00 in #22242
- [MBart] Add accelerate support for MBart by @younesbelkada in #22309
- torch<1.10 by @stas00 in #22370
- cmake dependencies in CI by @gante in #22383
- [bnb] Force requires_grad to be False by @younesbelkada in #22396
- causal_mask is created directly on device by @jeffra in #22378
- [bnb] fix bnb failing test by @younesbelkada in #22439
- [Generate] Add conditional generation for multimodal models by @younesbelkada in #22424
- [Pix2Struct] Fix slow test by @younesbelkada in #22448
- model_type update for auto mapping by @ArthurZucker in #22470
- max_position_embeddings by @gante in #22471
- eos_token_id < 0 checks in generate() from ValueError to warning by @lewtun in #22472
- Wav2Vec2ProcessorWithLM doc example by @ydshieh in #22474
- TextIteratorStreamer (streamer for gradio) by @gante in #22501
- [Trainer] Force is_model_parallel when model is loaded in multiple GPUs using accelerate by @younesbelkada in #22532
- [T5] Enable naive Pipeline Parallelism training for T5 by @younesbelkada in #22535
- distutils usage by @XuehaiPan in #22531
- pyproject.toml by @XuehaiPan in #22539
- [bnb] Fix typo by @younesbelkada in #22556
- _no_split_modules for Whisper model by @pacman100 in #22486
- TextIteratorStreamer timeout by @gante in #22576
- accelerate_tests mark warnings by @gante in #22585
- _toctree.yml by @wonhyeongseo in #22581
- pipeline_model_mapping systematically by @ydshieh in #22180
- [bnb] 8bit models should not be converted to DDP by @younesbelkada in #22628
- [Blip] Fix slow tests and doctests with correct values by @younesbelkada in #22632
- autoclass_tutorial to Korean and Fix the typo of quicktour by @gabrielwithappy in #22533
- MegaModel CI by @ydshieh in #22652
- pipeline_tutorial.mdx to Korean by @wonhyeongseo in #22508
- MarkupLM tests' expected values by @ydshieh in #22667
- torch.distributed group initialization for torch_neuron disabled when optimum-neuron is installed by @michaelbenayoun in #22728

The following contributors have made significant changes to the library over the last release: