Multi-layer Neural Networks consist of an input layer, several hidden layers, and an output layer.
Each node in a hidden layer is essentially a perceptron: each one computes a weighted sum of its inputs and passes the result through an activation function.
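As a sketch of that idea, a single hidden-layer node can be written as a weighted sum followed by an activation (the input values, weights, and bias below are illustrative):

```python
import numpy as np

def perceptron_node(inputs, weights, bias):
    # Weighted sum of inputs plus bias, passed through a step activation
    weighted_sum = np.dot(inputs, weights) + bias
    return 1 if weighted_sum > 0 else 0

# Illustrative values: 3 inputs feeding one hidden node
inputs = np.array([0.5, -1.0, 2.0])
weights = np.array([0.4, 0.3, 0.9])
bias = -0.5
print(perceptron_node(inputs, weights, bias))  # -> 1
```

In a full network the step function is usually replaced by a differentiable activation such as ReLU so the node can be trained with gradient descent.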
Convolutional Neural Networks (CNNs) excel at image tasks through specialized layers such as convolutional and pooling layers.
CNNs are the backbone for many vision applications like image classification.
import torch.nn as nn
import torch.nn.functional as F

class SimpleCNN(nn.Module):
    def __init__(self):
        super(SimpleCNN, self).__init__()
        # Convolutional layer: 3 input channels, 12 filters, 3x3 kernel
        self.conv1 = nn.Conv2d(3, 12, kernel_size=3, padding=1)
        # Fully connected layers
        self.fc1 = nn.Linear(12 * 16 * 16, 64)
        self.fc2 = nn.Linear(64, 10)  # 10 output classes

    def forward(self, x):
        # Apply convolution and ReLU activation
        x = F.relu(self.conv1(x))
        # Apply max pooling (2x2)
        x = F.max_pool2d(x, 2)
        # Flatten for fully connected layer
        x = x.view(x.size(0), -1)
        # Pass through fully connected layers
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return x
Recurrent Neural Networks (RNNs) are a type of deep learning model for sequential data: a recurrent connection links each step to the next, so individual units can incorporate information from earlier in the sequence.
The key component of an RNN is its hidden state, which retains information from previous steps and is used to predict the next output in the sequence.
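A minimal sketch of this recurrence in PyTorch, where `nn.RNN` carries a hidden state across the steps of each input sequence (the layer sizes below are illustrative):

```python
import torch
import torch.nn as nn

# Illustrative sizes: 5 features per step, hidden state of size 8
rnn = nn.RNN(input_size=5, hidden_size=8, batch_first=True)

# Batch of 2 sequences, each 4 steps long
x = torch.randn(2, 4, 5)
output, hidden = rnn(x)

print(output.shape)  # hidden state at every step: torch.Size([2, 4, 8])
print(hidden.shape)  # final hidden state: torch.Size([1, 2, 8])
```

The `output` tensor exposes the hidden state at every step, while `hidden` is the final state that summarizes the whole sequence.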
Optimizing a model's deployment involves measuring metrics such as parameter count, inference latency, and GPU memory consumption. Understanding these metrics helps tailor model performance to meet the requirements of a specific environment.
# Simulated example of model profiling in Python
model_params = 123456      # Hypothetical number of parameters
inference_latency = 0.005  # Latency in seconds
gpu_memory = 2048          # Memory usage in MB

print(f"Model Parameters: {model_params}")
print(f"Inference Latency: {inference_latency} seconds")
print(f"GPU Memory Usage: {gpu_memory} MB")
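In practice, the parameter count can be measured directly from the model rather than hard-coded; a sketch using PyTorch (the small two-layer model here is illustrative):

```python
import torch.nn as nn

# Illustrative model: two fully connected layers
model = nn.Sequential(nn.Linear(100, 64), nn.Linear(64, 10))

# Count all trainable parameters
param_count = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"Model Parameters: {param_count}")  # (100*64 + 64) + (64*10 + 10) = 7114
```

Latency and GPU memory can likewise be measured empirically, e.g. by timing forward passes and checking `torch.cuda.max_memory_allocated()` on a GPU machine.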
Understand the specific preprocessing steps for different neural network architectures. CNNs require pixel normalization, transformers use tokenization with special tokens, and RNNs need sequence formatting. Each method prepares data uniquely for optimal model performance.
import torch
import torch.nn as nn
from transformers import BertTokenizer

# CNN preprocessing: normalize pixel values to [0, 1]
image = ...  # Some image data with 0-255 pixel values
normalized_image = image / 255.0

# Transformer preprocessing: tokenization with special tokens
text = "A sample text for input."
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
input_tokens = tokenizer.encode_plus(text, add_special_tokens=True)

# RNN preprocessing: pad variable-length sequences to equal length
sequences = [[1, 2, 3, 4], [5, 6]]
formatted_sequences = nn.utils.rnn.pad_sequence(
    [torch.tensor(seq) for seq in sequences], batch_first=True
)
Activation functions, like ReLU, GELU, and Swish, are key to letting a neural network model complex behaviors. They add non-linearity, enabling networks to learn intricate patterns. Try experimenting with them to see different learning outcomes.
import numpy as np

# Define ReLU function
def relu(x):
    return np.maximum(0, x)

# Define GELU function (tanh approximation)
def gelu(x):
    return 0.5 * x * (1 + np.tanh(np.sqrt(2 / np.pi) * (x + 0.044715 * x**3)))

# Define Swish function: x * sigmoid(x)
def swish(x):
    return x / (1 + np.exp(-x))

# Sample input
data = np.array([-1, 0, 1, 2])
print("ReLU:", relu(data))
print("GELU:", gelu(data))
print("Swish:", swish(data))
Batch Normalization in Python

Normalization layers like Batch Normalization and Layer Normalization help stabilize neural network training. They scale and shift activations, making gradients flow more smoothly and reducing internal covariate shift. This process often results in faster convergence and improved model generalization.
import torch
import torch.nn as nn

# Define a batch normalization layer with specific dimensions
batch_norm_layer = nn.BatchNorm1d(num_features=10)

# Random input tensor: a batch of 2 samples with 10 features each
x = torch.randn(2, 10)

# Apply batch normalization
normalized_output = batch_norm_layer(x)
print(normalized_output)
Tokenization is the process of breaking down a text into individual units called tokens.
Tokenization strategies include word-based, subword-based, and character-based approaches:
text = '''Vanity and pride are different things'''

# word-based tokenization
words = ['Vanity', 'and', 'pride', 'are', 'different', 'things']

# subword-based tokenization
subwords = ['Van', 'ity', 'and', 'pri', 'de', 'are', 'differ', 'ent', 'thing', 's']

# character-based tokenization
characters = ['V', 'a', 'n', 'i', 't', 'y', ' ', 'a', 'n', 'd', ' ', 'p', 'r', 'i', 'd', 'e', ' ', 'a', 'r', 'e', ' ', 'd', 'i', 'f', 'f', 'e', 'r', 'e', 'n', 't', ' ', 't', 'h', 'i', 'n', 'g', 's']
Word embeddings are key to natural language processing. Each embedding is a vector of real numbers representing a specific word, and contextual information about that word is encoded in the vector's values.
A basic English word embedding model can be loaded in Python using the spaCy library. This allows access to embeddings for English words.
import spacy

nlp = spacy.load('en_core_web_md')  # a medium English pipeline that includes word vectors
Call the model with the desired word as an argument and access the .vector attribute:
nlp('peace').vector
The result would be:
[5.2907305, -4.20267, 1.6989858, -1.422668, -1.500128, ...]
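Because related words end up with nearby vectors, a common way to use embeddings is cosine similarity between them; a sketch with NumPy (the short vectors below are made up for illustration, real spaCy vectors have hundreds of dimensions):

```python
import numpy as np

def cosine_similarity(a, b):
    # Cosine of the angle between two embedding vectors
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Hypothetical embedding vectors for two related words
vec_peace = np.array([5.29, -4.20, 1.70, -1.42, -1.50])
vec_calm = np.array([4.80, -3.90, 2.10, -1.10, -1.80])

print(cosine_similarity(vec_peace, vec_calm))  # close to 1 for similar words
```

Values near 1 indicate similar directions (and often similar meanings), while values near 0 indicate unrelated words.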