7.11. Recurrent Neural Network

  • MLP/NN: deal with general data that has no particular structure

  • CNN: capture the spatial structure of data (image analysis)

  • RNN: capture the sequential structure of data (sentences, stock prices)

Example: “I like eating apples.” and “Apple is a company.” If we want to distinguish the different meanings of the word “apple”, we need to take the nearby words into consideration.

How does an RNN work?
The hidden layer remembers the information carried by the previous hidden state \(h_{t-1}\) and combines it with the current input \(X_t\) (see the unfolded diagram in Recurrent_neural_network_unfold.svg):

\[
O_t = g(W h_t), \qquad h_t = f(U X_t + V h_{t-1})
\]

  • \(X_t\): input vector

  • \(h_t\): hidden layer vector

  • \(O_t\): output vector

  • \(W,U,V\): parameter matrices

Example: Assume we have trained an RNN with 2 hidden nodes, where every entry of \(W\), \(U\), \(V\) equals 0.5 (each is a \(2\times 2\) matrix of 0.5s), \(f\) and \(g\) are identity functions, \(h_0=(0,0)'\), and the input sequence is \((1,1)',(1,2)',\ldots\)
\(h_1=UX_1+Vh_0=(0.5\cdot 1+0.5\cdot 1,\;0.5\cdot 1+0.5\cdot 1)=(1,1)\), \(O_1=Wh_1=(1,1)\)
\(h_2=UX_2+Vh_1=(0.5\cdot 1+0.5\cdot 2,\;0.5\cdot 1+0.5\cdot 2)+(1,1)=(2.5,2.5)\), \(O_2=Wh_2=(2.5,2.5)\)
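
This recursion can be checked with a few lines of NumPy; the snippet below is a minimal sketch of the hand calculation above, with \(f\) and \(g\) taken as the identity:

import numpy as np

# every entry of W, U, V is 0.5, as in the example above
W = U = V = np.full((2, 2), 0.5)
h = np.zeros(2)                        # h_0 = (0, 0)'
for x in [np.array([1., 1.]), np.array([1., 2.])]:
    h = U @ x + V @ h                  # h_t = f(U X_t + V h_{t-1}), f = identity
    o = W @ h                          # O_t = g(W h_t), g = identity
    print(h, o)                        # [1. 1.] [1. 1.]  then  [2.5 2.5] [2.5 2.5]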

LSTM: “long short-term memory”, a commonly used RNN architecture.

from numpy import array
from keras.preprocessing.text import one_hot
from keras.preprocessing.sequence import pad_sequences
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Flatten
from keras.layers import Embedding
# define documents
docs = ['Well done!',
        'Good work',
        'Great effort',
        'nice work',
        'Excellent!',
        'Weak',
        'Poor effort!',
        'not good',
        'poor work',
        'Could have done better.']
# define class labels
labels = array([1,1,1,1,1,0,0,0,0,0])

# integer encode the documents
vocab_size = 50
encoded_docs = [one_hot(d, vocab_size) for d in docs]
print(encoded_docs)
[[16, 1], [46, 17], [5, 10], [35, 17], [27], [2], [4, 10], [40, 46], [4, 17], [26, 41, 1, 6]]
# standalone illustration of the Embedding layer API (vocab size 200, 32-dim vectors,
# input length 50); this layer is not used in the model below
e = Embedding(200, 32, input_length=50)
# pad documents to a max length of 4 words
max_length = 4
padded_docs = pad_sequences(encoded_docs, maxlen=max_length, padding='post')
print(padded_docs)
# define the model
model = Sequential()
model.add(Embedding(vocab_size, 8, input_length=max_length))
model.add(Flatten())
model.add(Dense(1, activation='sigmoid'))
# compile the model
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
# summarize the model
print(model.summary())
# fit the model
model.fit(padded_docs, labels, epochs=50, verbose=0)
# evaluate the model
loss, accuracy = model.evaluate(padded_docs, labels, verbose=0)
print('Accuracy: %f' % (accuracy*100))
[[16  1  0  0]
 [46 17  0  0]
 [ 5 10  0  0]
 [35 17  0  0]
 [27  0  0  0]
 [ 2  0  0  0]
 [ 4 10  0  0]
 [40 46  0  0]
 [ 4 17  0  0]
 [26 41  1  6]]
Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 embedding_1 (Embedding)     (None, 4, 8)              400       
                                                                 
 flatten_1 (Flatten)         (None, 32)                0         
                                                                 
 dense_2 (Dense)             (None, 1)                 33        
                                                                 
=================================================================
Total params: 433
Trainable params: 433
Non-trainable params: 0
_________________________________________________________________
None
Accuracy: 89.999998
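
The model above only embeds each word, flattens the embeddings, and feeds them to a dense layer. A minimal sketch of how an LSTM layer could be used instead is shown below; it reuses padded_docs, labels, vocab_size, and max_length from above, while the layer size (16 LSTM units) and the name lstm_model are illustrative choices, not part of the original example.

from keras.layers import LSTM

lstm_model = Sequential()
lstm_model.add(Embedding(vocab_size, 8, input_length=max_length))
lstm_model.add(LSTM(16))                        # reads the embedded sequence step by step
lstm_model.add(Dense(1, activation='sigmoid'))  # binary label output
lstm_model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
lstm_model.fit(padded_docs, labels, epochs=50, verbose=0)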

7.12. Convolutional Neural Network

7.12.1. Filter

A filter, also known as a kernel, is a small matrix in a CNN that slides over the input and extracts local features from the data.
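
The sliding operation can be written out directly; the snippet below is a minimal NumPy sketch, in which the \(5\times 5\) toy image and the vertical-edge kernel are made up for illustration:

import numpy as np

image = np.arange(25, dtype=float).reshape(5, 5)   # toy 5x5 "image"
kernel = np.array([[1., 0., -1.],
                   [1., 0., -1.],
                   [1., 0., -1.]])                 # illustrative vertical-edge filter

out = np.zeros((3, 3))                             # output is (5 - 3 + 1) x (5 - 3 + 1)
for i in range(3):
    for j in range(3):
        out[i, j] = np.sum(image[i:i+3, j:j+3] * kernel)
print(out)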

7.12.2. Basic concepts

  • Padding: addition of (typically) 0-valued pixels on the borders of an image, so that the filter can also be applied at the edges (see the output-size formula after this list)

  • Pooling: reduce the dimensions of the data by combining the outputs of a group of neurons in the previous layer into a single neuron in the next layer

  • Channels: the number of filters in a layer; each filter produces one output channel
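
For an \(n\times n\) input, an \(f\times f\) filter, padding \(p\), and stride \(s\), each output channel has size \(\lfloor (n-f+2p)/s\rfloor+1\). For example, in the model below a \(3\times 3\) filter with no padding and stride 1 maps a \(32\times 32\) image to \(30\times 30\), and a \(2\times 2\) max pooling then halves it to \(15\times 15\).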

import tensorflow as tf

from tensorflow.keras import datasets, layers, models
import matplotlib.pyplot as plt
(train_images, train_labels), (test_images, test_labels) = datasets.cifar10.load_data()

# Normalize pixel values to be between 0 and 1
train_images, test_images = train_images / 255.0, test_images / 255.0
Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz
170500096/170498071 [==============================] - 123s 1us/step
170508288/170498071 [==============================] - 123s 1us/step
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.summary()
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv2d (Conv2D)             (None, 30, 30, 32)        896       
                                                                 
 max_pooling2d (MaxPooling2D  (None, 15, 15, 32)       0         
 )                                                               
                                                                 
 conv2d_1 (Conv2D)           (None, 13, 13, 64)        18496     
                                                                 
 max_pooling2d_1 (MaxPooling  (None, 6, 6, 64)         0         
 2D)                                                             
                                                                 
 conv2d_2 (Conv2D)           (None, 4, 4, 64)          36928     
                                                                 
=================================================================
Total params: 56,320
Trainable params: 56,320
Non-trainable params: 0
_________________________________________________________________

The input data is a \(32\times 32\times 3\) tensor, and the layers are:

  • Conv layer: \(3\times 3\) filter, channels=32

  • Pooling layer: \(2\times 2\) max pooling

  • Conv layer: \(3\times 3\) filter, channels=64

  • Pooling layer: \(2\times 2\) max pooling

  • Conv layer: \(3\times 3\) filter, channels=64
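
The parameter counts in the summary can be verified by hand: the first convolutional layer has \((3\times 3\times 3+1)\times 32=896\) parameters (a \(3\times 3\) filter over 3 input channels plus one bias, for each of the 32 filters), the second has \((3\times 3\times 32+1)\times 64=18496\), the third has \((3\times 3\times 64+1)\times 64=36928\), and the pooling layers have no parameters.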

model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10))
model.summary()
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv2d (Conv2D)             (None, 30, 30, 32)        896       
                                                                 
 max_pooling2d (MaxPooling2D  (None, 15, 15, 32)       0         
 )                                                               
                                                                 
 conv2d_1 (Conv2D)           (None, 13, 13, 64)        18496     
                                                                 
 max_pooling2d_1 (MaxPooling  (None, 6, 6, 64)         0         
 2D)                                                             
                                                                 
 conv2d_2 (Conv2D)           (None, 4, 4, 64)          36928     
                                                                 
 flatten (Flatten)           (None, 1024)              0         
                                                                 
 dense (Dense)               (None, 64)                65600     
                                                                 
 dense_1 (Dense)             (None, 10)                650       
                                                                 
=================================================================
Total params: 122,570
Trainable params: 122,570
Non-trainable params: 0
_________________________________________________________________

To complete the model, the \(4\times 4\times 64\) feature maps are flattened into a 1-dimensional vector of length 1024, and the last (dense) layer outputs 10 scores, one per class, to make the classification.
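
Again the parameter counts follow directly: the first dense layer has \((1024+1)\times 64=65600\) parameters and the output layer has \((64+1)\times 10=650\).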

See the picture here

model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

history = model.fit(train_images, train_labels, epochs=10, 
                    validation_data=(test_images, test_labels))
Epoch 1/10
1563/1563 [==============================] - 59s 37ms/step - loss: 1.5502 - accuracy: 0.4321 - val_loss: 1.3032 - val_accuracy: 0.5331
Epoch 2/10
1563/1563 [==============================] - 58s 37ms/step - loss: 1.2001 - accuracy: 0.5765 - val_loss: 1.1816 - val_accuracy: 0.5806
Epoch 3/10
1563/1563 [==============================] - 54s 34ms/step - loss: 1.0436 - accuracy: 0.6316 - val_loss: 1.0422 - val_accuracy: 0.6347
Epoch 4/10
1563/1563 [==============================] - 54s 35ms/step - loss: 0.9299 - accuracy: 0.6737 - val_loss: 0.9405 - val_accuracy: 0.6712
Epoch 5/10
1563/1563 [==============================] - 55s 35ms/step - loss: 0.8501 - accuracy: 0.7027 - val_loss: 0.9336 - val_accuracy: 0.6752
Epoch 6/10
1563/1563 [==============================] - 55s 35ms/step - loss: 0.7862 - accuracy: 0.7249 - val_loss: 0.8879 - val_accuracy: 0.6898
Epoch 7/10
1563/1563 [==============================] - 55s 35ms/step - loss: 0.7354 - accuracy: 0.7414 - val_loss: 0.8518 - val_accuracy: 0.7038
Epoch 8/10
1563/1563 [==============================] - 57s 36ms/step - loss: 0.6873 - accuracy: 0.7593 - val_loss: 0.8349 - val_accuracy: 0.7101
Epoch 9/10
1563/1563 [==============================] - 56s 36ms/step - loss: 0.6497 - accuracy: 0.7722 - val_loss: 0.8684 - val_accuracy: 0.7094
Epoch 10/10
1563/1563 [==============================] - 56s 36ms/step - loss: 0.6069 - accuracy: 0.7865 - val_loss: 0.8711 - val_accuracy: 0.7054
test_loss, test_acc = model.evaluate(test_images, test_labels, verbose=0)
print(test_acc)
0.7053999900817871
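
The history object returned by fit records the metrics for each epoch; a short sketch for visualizing them, using the matplotlib.pyplot module already imported above as plt:

plt.plot(history.history['accuracy'], label='train accuracy')
plt.plot(history.history['val_accuracy'], label='validation accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.show()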