Deepfake detection challenge from R


Introduction

Working with video datasets, particularly for detecting AI-generated fake objects, is challenging because it depends on proper frame selection and face detection. To approach this challenge from R, one can make use of the capabilities offered by OpenCV, magick, and keras.

Our approach consists of the following consecutive steps:

  • read all of the videos
  • capture and extract images from the videos
  • detect faces in the extracted images
  • crop the faces
  • build an image classification model with Keras
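The first step, reading all of the videos, starts with collecting their file paths. Below is a minimal base-R sketch, assuming the videos sit in a single folder; the folder name `train_sample_videos` and the helper `list_videos` are hypothetical and not part of the original post.

```r
# Hypothetical helper: collect the paths of all mp4 files in a folder.
list_videos <- function(dir) {
  # pattern is a regular expression; ignore.case also matches ".MP4"
  sort(list.files(dir, pattern = "\\.mp4$", full.names = TRUE, ignore.case = TRUE))
}

# Example with a temporary folder holding empty placeholder files
demo_dir <- file.path(tempdir(), "train_sample_videos")
dir.create(demo_dir, showWarnings = FALSE)
invisible(file.create(file.path(demo_dir, c("a.mp4", "b.mp4", "notes.txt"))))
videos <- list_videos(demo_dir)
basename(videos)
```

Each returned path could then be passed to the video reader one at a time, which keeps memory use bounded compared with loading every video up front.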

Let’s quickly introduce the non-deep-learning libraries we’re using. OpenCV is a computer vision library; in this post it is used to detect faces.

On the other hand, magick is an open-source image-processing library that helps to read video datasets and extract useful features from them:

  • Read video files
  • Extract images per second from the video
  • Crop the faces from the images

Before we go into a detailed explanation, readers should know that there is no need to copy and paste the code chunks: at the end of the post there is a link to a Google Colab notebook with GPU acceleration. This kernel allows everyone to run the code and reproduce the same results.

Data exploration

The dataset that we’re going to analyze is provided by AWS, Facebook, Microsoft, the Partnership on AI’s Media Integrity Steering Committee, and various academics.

It contains both real and AI-generated fake videos. The total size is over 470 GB. However, a 4 GB sample dataset is available separately.

The videos in the folders are in mp4 format and have various lengths. Our task is to determine the number of images to capture per second of video. We usually took 1-3 fps for every video.

Note: Set fps to NULL if you want to extract all frames.

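library(magick)  # also provides the %>% pipe used below

# read the video, capturing 2 frames per second
video = magick::image_read_video("aagfhgtpmv.mp4",fps = 2)
vid_1 = video[[1]]
vid_1 = magick::image_read(vid_1) %>% image_resize('1000x1000')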
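To get a feel for how the fps setting drives the size of the extracted dataset, here is a rough base-R estimate. It is only a sketch: the helper `frames_captured` and the 10-second duration are made up for illustration, and the actual frame count returned by the video reader may differ slightly.

```r
# Hypothetical helper: approximate number of frames captured from one video.
# duration_sec: video length in seconds; fps: frames captured per second.
frames_captured <- function(duration_sec, fps) {
  stopifnot(duration_sec >= 0, all(fps > 0))
  floor(duration_sec * fps)
}

# A 10-second video sampled at 1, 2, and 3 frames per second
n_frames <- frames_captured(duration_sec = 10, fps = 1:3)
n_frames
```

Multiplying such an estimate by the number of videos gives a quick idea of how many images the face detection step will have to process.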

We saw just the first frame. What about the rest of them?

Looking at the gif, one can observe that some fakes are very easy to tell apart, but a small fraction looks pretty realistic. This is another challenge during data preparation.

Face detection

First, face locations need to be determined via bounding boxes, using OpenCV. Then, magick is used to automatically extract them from all images.

# get face locations and calculate bounding boxes
library(opencv)
library(magick)
unconf <- ocv_read('frame_1.jpg')
faces <- ocv_face(unconf)
facemask <- ocv_facemask(unconf)
df = attr(facemask, 'faces')
rectX = (df$x - df$radius)
rectY = (df$y - df$radius)
x = (df$x + df$radius)
y = (df$y + df$radius)

# draw the box with a red dashed line
imh  = image_draw(image_read('frame_1.jpg'))
rect(rectX, rectY, x, y, border = "red",
     lty = "dashed", lwd = 2)
dev.off()
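The bounding-box arithmetic above (centre minus and plus radius) can be isolated in a small function. The sketch below additionally clamps the box to the image borders, which the original code does not do; the helper `face_bbox` and the example detections are hypothetical.

```r
# Hypothetical helper: square bounding box around a face centre, clamped to the image.
# cx, cy: face centre in pixels; r: radius; img_w, img_h: image size in pixels.
face_bbox <- function(cx, cy, r, img_w, img_h) {
  data.frame(
    left   = pmax(cx - r, 0),
    top    = pmax(cy - r, 0),
    right  = pmin(cx + r, img_w),
    bottom = pmin(cy + r, img_h)
  )
}

# Two made-up detections in a 200x100 image; the second one touches the border
boxes <- face_bbox(cx = c(90, 190), cy = c(58, 20), r = c(24, 30),
                   img_w = 200, img_h = 100)
boxes
```

Clamping is a design choice: it avoids requesting pixels outside the frame when a face sits near the edge, at the cost of a box that is no longer square.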

Once the face locations are found, it is very easy to extract them all.

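# crop with a fixed geometry string: width x height + x offset + y offset
edited = image_crop(imh, "49x49+66+34")
# or build the same kind of string from the detected bounding box
edited = image_crop(imh, paste(x-rectX+1,'x',x-rectX+1,'+',rectX, '+',rectY,sep = ''))
edited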
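The crop step relies on a geometry string of the form width x height + x offset + y offset. As a sketch, the string construction can be wrapped in a vectorised helper so that several faces produce one string each; `crop_geometry` and the coordinates are hypothetical.

```r
# Hypothetical helper: build "WIDTHxHEIGHT+X+Y" crop strings from box corners.
crop_geometry <- function(left, top, right, bottom) {
  sprintf("%dx%d+%d+%d",
          as.integer(right - left + 1), as.integer(bottom - top + 1),
          as.integer(left), as.integer(top))
}

# Two made-up boxes; the first reproduces the fixed string used in the post
geoms <- crop_geometry(left = c(66, 160), top = c(34, 0),
                       right = c(114, 200), bottom = c(82, 50))
geoms
```

With more than one face in a frame, each string would be passed to the crop function separately, for example inside `lapply()`.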

Deep learning model

After dataset preparation, it’s time to build a deep learning model with Keras. We can quickly place all the images into folders and, using image generators, feed the faces to a pre-trained Keras model.

library(keras)

train_dir = 'fakes_reals'
width = 150L
height = 150L
epochs = 10

# augment the training images and rescale pixel values to [0, 1]
train_datagen = image_data_generator(
  rescale = 1/255,
  rotation_range = 40,
  width_shift_range = 0.2,
  height_shift_range = 0.2,
  shear_range = 0.2,
  zoom_range = 0.2,
  horizontal_flip = TRUE,
  fill_mode = "nearest",
  validation_split=0.2
)


train_generator <- flow_images_from_directory(
  train_dir,                  
  train_datagen,             
  target_size = c(width,height), 
  batch_size = 10,
  class_mode = "binary"
)

# Build the model ---------------------------------------------------------

conv_base <- application_vgg16(
  weights = "imagenet",
  include_top = FALSE,
  input_shape = c(width, height, 3)
)

model <- keras_model_sequential() %>% 
  conv_base %>% 
  layer_flatten() %>% 
  layer_dense(units = 256, activation = "relu") %>% 
  layer_dense(units = 1, activation = "sigmoid")

# note: recent keras releases name the `lr` argument `learning_rate`
model %>% compile(
  loss = "binary_crossentropy",
  optimizer = optimizer_rmsprop(lr = 2e-5),
  metrics = c("accuracy")
)

# note: recent keras releases accept generators directly in fit()
history <- model %>% fit_generator(
  train_generator,
  steps_per_epoch = ceiling(train_generator$samples/train_generator$batch_size),
  epochs = epochs
)
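The steps_per_epoch value is the number of batches needed to pass over every training image once, rounded up so that a final partial batch is not dropped. A small worked example, with a made-up image count and a hypothetical helper `steps_for`:

```r
# Hypothetical helper: number of batches needed to see every image once.
steps_for <- function(n_samples, batch_size) {
  as.integer(ceiling(n_samples / batch_size))
}

# 1,005 face images in batches of 10: 100 full batches plus one batch of 5
steps <- steps_for(n_samples = 1005, batch_size = 10)
steps
```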

Reproduce in a Notebook

Conclusion

This post shows how to do video classification from R. The steps were:

  • Read videos and extract images from the dataset
  • Apply OpenCV to detect faces
  • Extract faces via bounding boxes
  • Build a deep learning model

However, readers should know that implementing the following steps may drastically improve model performance:

  • extract all of the frames from the video files
  • load different pre-trained weights, or use different pre-trained models
  • use another technology to detect faces – e.g., “MTCNN face detector”

Feel free to try these options on the Deepfake detection challenge and share your results in the comments section!

Thanks for reading!

Corrections

If you see mistakes or want to suggest changes, please create an issue on the source repository.

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY 4.0. Source code is available at https://github.com/henry090/Deepfake-from-R, unless otherwise noted. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: “Figure from …”.

Citation

For attribution, please cite this work as

Abdullayev (2020, Aug. 18). Posit AI Blog: Deepfake detection challenge from R. Retrieved from https://blogs.rstudio.com/tensorflow/posts/2020-08-18-deepfake/

BibTeX citation

@misc{abdullayev2020deepfake,
  author = {Abdullayev, Turgut},
  title = {Posit AI Blog: Deepfake detection challenge from R},
  url = {https://blogs.rstudio.com/tensorflow/posts/2020-08-18-deepfake/},
  year = {2020}
}
