MangaGAN: generation of high-quality manga images from photos

The authors propose an interesting approach to the problem of generating manga faces from photos.

Key contributions of the article:

  • the authors collected the MangaGAN-BL dataset, which contains 109 noses, 179 mouths, and 106 manga faces with landmarks, all taken from frames of the Bleach manga;
  • a GAN-based framework for unpaired photo-to-manga translation.

The authors claim that current state-of-the-art approaches cannot produce good manga faces for several reasons. One of them is that manga artists use distinct drawing styles for different facial features, and such patterns are difficult for a single neural network to learn.

That is why the authors propose the following architecture:

MangaGAN architecture

Broadly, the framework consists of three parts (a minimal code sketch follows the list):

  • The top branch detects the facial features and transfers the style of each feature independently via a dedicated pre-trained GAN; in total, four GANs were trained (for the mouth, eyes, hair, and nose).
  • The bottom branch transfers the facial landmarks (i.e., the face geometry).
  • An image synthesis module combines the outputs of both branches into the final manga face.
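
To make the two-branch structure concrete, here is a minimal PyTorch-style sketch of how the branches could be composed. The class and attribute names are hypothetical (this is not the authors' code), and the non-neural synthesis step is left out:

```python
import torch
import torch.nn as nn

class MangaGANPipeline(nn.Module):
    """Hypothetical composition of the two branches (not the authors' code)."""

    def __init__(self, local_gans: dict, landmark_net: nn.Module):
        super().__init__()
        # Top branch: one pre-trained local generator per facial feature
        # (e.g., keys "eye", "nose", "mouth", "hair").
        self.local_gans = nn.ModuleDict(local_gans)
        # Bottom branch: maps photo landmarks to manga-style landmarks.
        self.landmark_net = landmark_net

    def forward(self, feature_crops: dict, landmarks: torch.Tensor):
        # Translate the appearance of each facial feature independently.
        manga_feats = {name: self.local_gans[name](crop)
                       for name, crop in feature_crops.items()}
        # Translate the facial geometry.
        manga_landmarks = self.landmark_net(landmarks)
        # The synthesis module (not shown) would place manga_feats onto
        # manga_landmarks and join them into the final manga face.
        return manga_feats, manga_landmarks
```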

For training the eye-transferring GAN, the authors used:

  • adversarial loss, adopting the stable least-squares (LSGAN) form:

$$\mathcal{L}_{adv}(G, D_Y) = \mathbb{E}_{y \sim p(y)}\big[(D_Y(y) - 1)^2\big] + \mathbb{E}_{x \sim p(x)}\big[D_Y(G(x))^2\big]$$

  • cycle consistency loss:

$$\mathcal{L}_{cyc}(G, F) = \mathbb{E}_{x \sim p(x)}\big[\lVert F(G(x)) - x \rVert_1\big] + \mathbb{E}_{y \sim p(y)}\big[\lVert G(F(y)) - y \rVert_1\big]$$

  • structural smoothing loss, which encourages the networks to produce manga with smooth stroke lines:

structural smoothing loss formula
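
A minimal PyTorch sketch of the first two losses may help. It assumes `G` and `F` are the two generators (photo-to-manga and back) and `D_y` is the discriminator on the manga domain; these are the generic LSGAN/CycleGAN forms, not the authors' exact implementation:

```python
import torch

def lsgan_losses(D_y, G, x, y):
    """Least-squares adversarial losses for generator G: photo -> manga."""
    fake_y = G(x)
    # Discriminator: push scores of real manga to 1, generated manga to 0.
    d_loss = ((D_y(y) - 1) ** 2).mean() + (D_y(fake_y.detach()) ** 2).mean()
    # Generator: fool the discriminator, pushing its scores toward 1.
    g_loss = ((D_y(fake_y) - 1) ** 2).mean()
    return d_loss, g_loss

def cycle_loss(G, F, x, y):
    """Cycle consistency: translating there and back should recover the input."""
    return (F(G(x)) - x).abs().mean() + (G(F(y)) - y).abs().mean()
```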

Example of the trained model's output:

eye-region samples generated by the trained model

For facial landmark transfer, the authors trained a CycleGAN-style network with adversarial and cycle consistency losses.

The resulting image is synthesized via the Piecewise Cubic Hermite Interpolating Polynomial (PCHIP) method.
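
For intuition, here is a small SciPy example of PCHIP interpolation through a handful of hypothetical contour points (the coordinates are made up, not from the paper). PCHIP is shape-preserving, so the fitted curve does not overshoot between landmarks, which keeps stroke lines clean:

```python
import numpy as np
from scipy.interpolate import PchipInterpolator

# Hypothetical (x, y) landmark coordinates along, e.g., a jawline contour.
xs = np.array([0.0, 1.0, 2.5, 4.0, 5.0])
ys = np.array([0.0, 0.8, 1.0, 0.6, 0.0])

# PCHIP yields a C1, monotonicity-preserving curve through the landmarks.
curve = PchipInterpolator(xs, ys)
dense_x = np.linspace(xs[0], xs[-1], 200)
dense_y = curve(dense_x)  # smooth outline for drawing the face contour
```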

Results:

comparison of suggested model with other cross-domain translation methods

The figure above compares the suggested model with state-of-the-art cross-domain translation methods. The last column corresponds to the proposed model, and it does indeed appear to generate better manga faces.
