MangaGAN: generation of high-quality manga images from photos

MangaGAN promo

The authors propose an interesting approach to the problem of generating manga faces from photos.

The key contributions of the article:

  • the MangaGAN-BL dataset, collected from frames of the Bleach manga, containing 109 noses, 179 mouths, and 106 manga faces with landmarks;
  • a GAN-based framework for unpaired photo-to-manga translation.

The authors argue that current state-of-the-art approaches cannot produce good manga faces, for several reasons. One is that manga artists use different drawing styles for different facial features, and these patterns are difficult for a single neural network to learn.

That is why the authors suggest the following architecture:

MangaGAN architecture

Conceptually, the framework consists of three parts:

  • The top branch detects facial features and transfers the style of each feature independently via a dedicated pretrained GAN. In total, four GANs were trained (for the mouth, eyes, hair, and nose).
  • The bottom branch transfers the facial landmarks.
  • The image synthesis module combines both branches into the final result.
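The three parts above can be sketched as a simple pipeline. This is a toy structural illustration only; every function and name here is a stand-in for the authors' components, not their actual code:

```python
# Toy sketch of the two-branch framework; all helpers are hypothetical stand-ins.

def detect_feature(photo, name):
    # Stand-in for facial-feature detection: "crop" the named region.
    return f"{name}-region({photo})"

def detect_landmarks(photo):
    # Stand-in for a facial-landmark detector.
    return f"landmarks({photo})"

def manga_gan_pipeline(photo, feature_gans, landmark_net, synthesize):
    # Top branch: translate each facial feature independently with its
    # own pretrained GAN (mouth, eyes, hair, nose).
    manga_features = {name: gan(detect_feature(photo, name))
                      for name, gan in feature_gans.items()}
    # Bottom branch: transfer photo landmarks to manga-style geometry.
    manga_landmarks = landmark_net(detect_landmarks(photo))
    # Synthesis module: combine both branches into the final image.
    return synthesize(manga_features, manga_landmarks)

# Usage with identity "networks" standing in for the trained GANs:
feature_gans = {name: (lambda x: f"manga({x})")
                for name in ("mouth", "eye", "hair", "nose")}
result = manga_gan_pipeline(
    "photo",
    feature_gans,
    landmark_net=lambda lm: f"manga({lm})",
    synthesize=lambda feats, lms: (feats, lms),
)
```

The point of the structure is that each feature GAN sees only its own cropped region, which lets each one specialize in that feature's drawing style.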

To train the eye-transferring GAN, the authors used:

  • an adversarial loss, adopting the stable least-squares (LSGAN) formulation:

    $\mathcal{L}_{adv} = \mathbb{E}_{y}\big[(D(y)-1)^2\big] + \mathbb{E}_{x}\big[D(G(x))^2\big]$

  • a cycle-consistency loss to constrain the mapping solution between the input and output domains:

    $\mathcal{L}_{cyc} = \mathbb{E}_{x}\big[\lVert F(G(x))-x \rVert_1\big] + \mathbb{E}_{y}\big[\lVert G(F(y))-y \rVert_1\big]$
  • a structural smoothing loss to encourage the networks to produce manga with smooth stroke lines:

    structural smoothing loss formula

Example of trained model output:

eye regions samples generated by trained model

For facial landmark transferring, the authors trained a CycleGAN with adversarial and cycle-consistency losses.

Synthesis of the resulting image is performed via the Piecewise Cubic Hermite Interpolating Polynomial (PCHIP) method.
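As a quick illustration of PCHIP itself (with made-up toy landmark coordinates, not the authors' data), SciPy's `PchipInterpolator` fits a shape-preserving piecewise cubic through sparse points; unlike a plain cubic spline, it does not overshoot the data, which is why it suits smooth contour fitting:

```python
import numpy as np
from scipy.interpolate import PchipInterpolator

# Toy landmark coordinates along one contour (made-up values).
xs = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
ys = np.array([0.0, 0.8, 1.0, 0.8, 0.0])

# PCHIP is monotone between knots, so the interpolated curve stays
# within the range of the landmark values (no overshoot).
contour = PchipInterpolator(xs, ys)
dense = contour(np.linspace(0.0, 4.0, 81))
```

Evaluating `contour` on a dense grid yields a smooth curve passing exactly through the given landmark points.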


comparison of the suggested model with other cross-domain translation methods

This figure compares the suggested model with state-of-the-art models. The last column corresponds to the proposed model, and it does indeed appear to generate better images.
