pipeline() function

The pipeline() function connects a model with its necessary preprocessing and postprocessing steps, allowing us to directly input any text and get an intelligible answer. Some of the tasks available through the pipeline() function are feature extraction, named entity recognition, sentiment analysis, summarization and text generation.

```python
from transformers import pipeline

# Loads a default sentiment-analysis model along with its tokenizer
classifier = pipeline("sentiment-analysis")

# sample_text_sequence_1 and sample_text_sequence_2 stand for any raw input strings;
# the call returns one dict per input with a "label" and a "score"
classifier([
    sample_text_sequence_1,
    sample_text_sequence_2,
])
```
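The same one-call pattern covers the other tasks listed above. As a further, hedged illustration, a named entity recognition pipeline could look like the sketch below; the example sentence is made up here:

```python
from transformers import pipeline

# aggregation_strategy="simple" merges sub-word tokens into whole entity spans
ner = pipeline("ner", aggregation_strategy="simple")
ner("Hugging Face is based in New York City.")
# -> a list of dicts, e.g. {"entity_group": "ORG", "word": "Hugging Face", "score": ...}
```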
bert-base-uncased is a checkpoint: a specific set of trained weights for a given model architecture.

from_pretrained() method

The from_pretrained() method loads a pretrained transformer model from a checkpoint (its counterpart, save_pretrained(), writes one back to disk). The AutoTokenizer, AutoProcessor and AutoModel classes allow one to load tokenizers, processors and models respectively for any model architecture.

```python
from transformers import AutoModel, AutoTokenizer

# Any model identifier from the Hugging Face Hub, or a local directory, works here
checkpoint = 'pretrained-model-you-want'

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModel.from_pretrained(checkpoint)
```
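To show how the two objects fit together, here is a minimal sketch; the distilbert-base-uncased checkpoint and the example sentence are placeholder choices, not taken from the notes above. The tokenizer turns raw text into tensors, and the model maps them to hidden states:

```python
import torch
from transformers import AutoModel, AutoTokenizer

checkpoint = "distilbert-base-uncased"  # placeholder encoder checkpoint
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModel.from_pretrained(checkpoint)

# Tokenize raw text into PyTorch tensors (input_ids, attention_mask)
inputs = tokenizer("Transformers make NLP easy.", return_tensors="pt")

# Forward pass; a bare AutoModel returns hidden states rather than task-specific logits
with torch.no_grad():
    outputs = model(**inputs)

print(outputs.last_hidden_state.shape)  # (batch_size, sequence_length, hidden_size)
```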
Decoder models employ various strategies in next-token generation, which can be adjusted by the user. These include n-gram penalties, which prevent token sequences of length n from repeating; sampling, which chooses the next token at random from among a collection of likely next tokens; and temperature, which adjusts the predictability of the randomly selected next token, with higher temperatures producing less predictable output.
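In the Transformers library these knobs correspond to arguments of the generate() method. A hedged sketch follows; the gpt2 checkpoint, the prompt, and the specific parameter values are illustrative assumptions rather than recommendations:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "gpt2"  # placeholder decoder checkpoint
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

inputs = tokenizer("Once upon a time", return_tensors="pt")

# no_repeat_ngram_size=3 -> no 3-token sequence may appear twice (n-gram penalty)
# do_sample=True, top_k=50 -> sample the next token from the 50 most likely candidates
# temperature -> values above 1 flatten the distribution, giving less predictable output
output_ids = model.generate(
    **inputs,
    max_new_tokens=40,
    no_repeat_ngram_size=3,
    do_sample=True,
    top_k=50,
    temperature=0.9,
)

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```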
When a decoder model simply selects the most probable next token at each step, it’s called “greedy search.” When the model looks several tokens further into the potential output and chooses the most probable multi-token sequence from among several candidates, it’s known as “beam search.”
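Both search modes map onto the same generate() method. A minimal self-contained sketch, again using gpt2 as a placeholder checkpoint:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "gpt2"  # placeholder decoder checkpoint
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)
inputs = tokenizer("Once upon a time", return_tensors="pt")

# Greedy search: at each step keep only the single most probable next token
greedy_ids = model.generate(**inputs, max_new_tokens=40, do_sample=False)

# Beam search: track the 5 most probable candidate sequences and return the best overall
beam_ids = model.generate(**inputs, max_new_tokens=40, num_beams=5, do_sample=False)

print(tokenizer.decode(greedy_ids[0], skip_special_tokens=True))
print(tokenizer.decode(beam_ids[0], skip_special_tokens=True))
```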