Classify Videos Using Deep Learning in MATLAB

Michael Sheinfeild
Feb 14, 2021


This post is based on the MATLAB examples. I wanted to classify video sequences but ran into out-of-memory errors, so I combined two examples: one loops over the videos, extracts features, and saves them to disk; the other trains a network on batches of those stored features.

Sample classes: "brush hair" and "cartwheel"

Compute and Store Features

Because of the memory limit, I reduced each 1024-element feature vector by taking the maximum over every 8 consecutive values, leaving 128 features per frame.

netCNN = googlenet;

To view the network architecture we use:

analyzeNetwork(netCNN)

Part of GoogLeNet

We use layer 167, named "pool5-7x7_s1", whose activations are vectors of size 1024:

layerName = "pool5-7x7_s1";

To list all files, use:

[files,labels] = hmdb51Files(dataFolder);
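hmdb51Files is a helper function from the MATLAB example: it lists the video files and takes each file's parent folder name as its label. Roughly, it could look like this (a sketch of the idea, not the exact helper code):

```matlab
function [files, labels] = hmdb51Files(dataFolder)
    % List all .avi files under dataFolder; each file's parent
    % folder name is taken as its class label.
    listing = dir(fullfile(dataFolder, "**", "*.avi"));
    files = fullfile(string({listing.folder}), string({listing.name}))';
    [~, folderNames] = cellfun(@fileparts, {listing.folder}, 'UniformOutput', false);
    labels = categorical(string(folderNames))';
end
```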

Labels

Randomization

numObservations = numel(labels);
N = floor(0.9 * numObservations);
idx = randperm(numObservations);
idxTrain = idx(1:N);
labelsTrain = labels(idxTrain);
idxValidation = idx(N+1:end);
labelsValidation = labels(idxValidation);

It is better to save these index groups to disk, so the same train/validation split can be reused across runs.
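One way to persist the split (my suggestion, not part of the original example) is to save the indices and labels next to the features:

```matlab
% Save the random split so the same train/validation groups
% can be reloaded instead of re-randomized on every run.
save('split.mat','idx','idxTrain','idxValidation', ...
     'labelsTrain','labelsValidation');

% In a later session, restore the same split with:
% load('split.mat');
```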

Loop Over the Videos

The decim function is taken from the MATLAB File Exchange. It reduces memory use by decimating each frame's feature vector from 1024 down to 128 values:

https://in.mathworks.com/matlabcentral/fileexchange/54099-fast-data-decimation?s_tid=srchtitle
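If you prefer not to download the File Exchange function, the max-decimation used here (1024 values reduced by a factor of 8 to 128) can be written in a couple of lines; this is my own equivalent sketch, not the File Exchange code:

```matlab
% Max-decimate a row vector x by factor r: take the maximum of
% each non-overlapping group of r consecutive values.
x = rand(1,1024);                     % e.g. one frame's GoogLeNet feature vector
r = 8;
xt = max(reshape(x, r, []), [], 1);   % 1 x 128 result
```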

Set a limit on the number of frames read per video:

frame_limit = 400;

For each video, run the frames through GoogLeNet, decimate the features, and save them to the train or validation folder (look carefully at where memory is cleared):

for i = 1:numel(files)

    video = readVideo(files(i),frame_limit);
    if isempty(video)
        continue;
    end
    video = centerCrop(video,inputSize);             % 224 x 224 x 3 x nframes
    sequences = activations(netCNN,video,layerName); % 1 x 1 x 1024 x nframes
    clear video;
    Xorg = squeeze(sequences);                       % 1024 x nframes
    X = [];
    for m = 1:size(Xorg,2)
        xt = decim(Xorg(:,m).',8,'max');  % reduce the vector by taking maxima
        X(:,m) = xt;
    end

    foldTrainCur = fullfile(trainFolder,string(labels(i)));
    if ~exist(foldTrainCur,'dir')
        mkdir(foldTrainCur);
    end
    foldValidCur = fullfile(validationFolder,string(labels(i)));
    if ~exist(foldValidCur,'dir')
        mkdir(foldValidCur);
    end

    if ismember(i,idxTrain)
        save(fullfile(foldTrainCur,num2str(i)),"X","-v7.3");
    else
        save(fullfile(foldValidCur,num2str(i)),"X","-v7.3");
    end

    clear X;
    clear Xorg;
    clear sequences;
end
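The readVideo and centerCrop helpers come from MATLAB's "Classify Videos Using Deep Learning" example (readVideo extended here with a frame limit). Roughly, they do the following; treat this as a sketch rather than the exact example code:

```matlab
function video = readVideo(filename, frameLimit)
    % Read up to frameLimit frames into an H x W x 3 x N array.
    vr = VideoReader(filename);
    i = 0;
    video = [];
    while hasFrame(vr) && i < frameLimit
        i = i + 1;
        video(:,:,:,i) = readFrame(vr);
    end
end

function videoResized = centerCrop(video, inputSize)
    % Crop the larger spatial dimension to a centered square,
    % then resize to the network input size (224 x 224 for GoogLeNet).
    sz = size(video);
    if sz(1) < sz(2)
        idx = floor((sz(2) - sz(1))/2);
        video = video(:, idx+1:idx+sz(1), :, :);
    elseif sz(2) < sz(1)
        idx = floor((sz(1) - sz(2))/2);
        video = video(idx+1:idx+sz(2), :, :, :);
    end
    videoResized = imresize(video, inputSize(1:2));
end
```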

LSTM Part

Now that a feature vector sequence is stored for each video, we build the LSTM network.

numFeatures = size(X,1); % 128

numClasses = numel(categories(labelsTrain));%13

layers = [
    sequenceInputLayer(numFeatures,'Name','sequence')
    bilstmLayer(2000,'OutputMode','last','Name','bilstm')
    dropoutLayer(0.5,'Name','drop')
    fullyConnectedLayer(numClasses,'Name','fc')
    softmaxLayer('Name','softmax')
    classificationLayer('Name','classification')];

Training

The fun part! Let's do it.


options = trainingOptions('adam', ...
    'ExecutionEnvironment','gpu', ...
    'MaxEpochs',15, ...
    'MiniBatchSize',miniBatchSize, ...
    'GradientThreshold',1, ...
    'Verbose',0, ...
    'Plots','training-progress');

I used the GPU since training on the CPU was far too slow. Of course, the learning rate can be modified so that it changes during training.
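For example, a piecewise schedule that drops the learning rate during training could look like this (the initial rate, drop factor, and drop period are illustrative values, not what I actually used):

```matlab
options = trainingOptions('adam', ...
    'ExecutionEnvironment','gpu', ...
    'MaxEpochs',15, ...
    'MiniBatchSize',miniBatchSize, ...
    'InitialLearnRate',1e-3, ...
    'LearnRateSchedule','piecewise', ...  % reduce the rate during training
    'LearnRateDropFactor',0.5, ...        % halve the rate...
    'LearnRateDropPeriod',5, ...          % ...every 5 epochs
    'GradientThreshold',1, ...
    'Verbose',0, ...
    'Plots','training-progress');
```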

Set the mini-batch size according to your memory limits; I used:

miniBatchSize = 16;

dsTrain = sequenceDatastore(trainFolder);
dsTrain.MiniBatchSize = miniBatchSize;

net = trainNetwork(dsTrain,layers,options);

Training progress

The run wasn't the best, but there was still some progress across epochs, along with many fluctuations, probably because of the small batch size and fixed step size. If the learning rate were reduced over the epochs, training would probably be more stable.

The test:

dsTest = sequenceDatastore(validationFolder);
dsTest.MiniBatchSize = miniBatchSize;

Test accuracy:

YPred = classify(net,dsTest,'MiniBatchSize',miniBatchSize);
YTest = dsTest.Labels;
acc = sum(YPred == YTest)./numel(YTest) % 0.61

[m,order] = confusionmat(YTest,YPred)
figure
cm = confusionchart(m,order);

Test Confusion Matrix

Train accuracy

YPred = classify(net,dsTrain,'MiniBatchSize',miniBatchSize);
YTrain = dsTrain.Labels;
acc = sum(YPred == YTrain)./numel(YTrain) % 0.62

[m,order] = confusionmat(YTrain,YPred)
figure
cm = confusionchart(m,order);

Train confusion matrix

Summary

This post presented an approach to video sequence classification using MATLAB with a GPU.

Features were computed with GoogLeNet (Inception) and reduced from 1024 to 128 values per frame because of memory limits, then fed to an LSTM network.

The accuracy on both train and test was about 61%, which isn't great. With more memory I could have kept more of the CNN features, which would likely have improved the results; training the LSTM for more epochs and decaying the learning rate during training should also help.

not the end
