Learning to Clean: A GAN Perspective
Monika Sharma, Abhishek Verma, and Lovekesh Vig
TCS Research, New Delhi, India
{monika.sharma1 | verma.abhishek7 | lovekesh.vig}@tcs.com

arXiv:1901.11382v1 [cs.CV] 28 Jan 2019

Abstract. In the big data era, the impetus to digitize the vast reservoirs of data trapped in unstructured scanned documents such as invoices, bank documents, courier receipts and contracts has gained fresh momentum. The scanning process often introduces artifacts such as salt-and-pepper / background noise, blur due to camera motion or shake, watermarks, coffee stains, wrinkles, or faded text. These artifacts pose many readability challenges to current text recognition algorithms and significantly degrade their performance. Existing learning-based denoising techniques require a dataset comprising noisy documents paired with cleaned versions of the same documents. In such scenarios, a model can be trained to generate clean documents from noisy versions. However, very often in the real world such a paired dataset is not available, and all we have for training our denoising model are unpaired sets of noisy and clean images. This paper explores the use of Generative Adversarial Networks (GANs) to generate denoised versions of noisy documents. In particular, where paired information is available, we formulate the problem as an image-to-image translation task, i.e., translating a document from the noisy domain (background noise, blur, fading, watermarks) to a target clean document. However, in the absence of paired images for training, we employ CycleGAN, which is known to learn a mapping between the distribution of noisy images and the distribution of denoised images from unpaired data, to achieve image-to-image translation for cleaning noisy documents. We compare the performance of CycleGAN trained on unpaired images for document cleaning tasks with a Conditional GAN trained on paired data from the same dataset. Experiments were performed on a public document dataset on which different types of noise were artificially induced; the results demonstrate that CycleGAN learns a more robust mapping from the space of noisy to clean documents.

Keywords: Document Cleaning Suite · CycleGAN · Unpaired Data · Deblurring · Denoising · Defading · Watermark Removal

1 Introduction

The advent of Industry 4.0 calls for the digitization of every aspect of industry, which includes automation of business processes, business analytics and the phasing out of manually driven processes. While business processes have evolved to store large volumes of scanned digital copies of paper documents, the information in many of these documents still needs to be extracted via text recognition techniques. While capturing these images via camera or scanner, artifacts tend to creep into the images, such as
background noise and blurred or faded text. In some scenarios, companies insert a watermark in their documents, which poses readability issues after scanning. Text recognition engines often suffer from the low quality of scanned documents: they are not able to read the documents properly and hence fail to correctly digitize the information the documents contain. In this paper, we attempt to denoise documents before they are sent to a text recognition network for reading, and propose a document cleaning suite based on generative adversarial training. This suite is trained for background noise removal, deblurring, watermark removal and defading, and learns a mapping from the distribution of noisy documents to the distribution of clean documents.

Background noise removal is the process of removing background noise, such as uneven contrast, see-through effects, interfering strokes, and background spots, from documents. Background noise is a problem for OCR performance because it is difficult to differentiate the text from the background [3], [5], [14], [9]. Deblurring is the process of removing blur from an image. Blur is a distortion of the image caused by factors such as camera shake or improper focus, which decreases the readability of the text in the document image and hence deteriorates OCR performance. Recent works on deblurring have focused on estimating blur kernels using techniques such as GANs [6], CNNs [10], dictionary-based priors [12], sparsity-inducing priors [15] and hybrid non-convex regularizers [24]. Watermark removal aims at removing the watermark from an image while preserving the text. Watermarks are low-intensity images printed on photographs and books in order to prevent copying of the material, but after scanning they hinder reading the text of interest in documents. Inpainting techniques [20], [23] are used in the literature to recover the original image after detecting watermarks statistically. Defading is the process of recovering text that has lightened or faded over time, which usually happens in old books and documents; this too is detrimental to OCR performance.

To remove all these artifacts that degrade the quality of documents and hinder readability, we formulate the document cleaning process as an image-to-image translation task, at which Generative Adversarial Networks (GANs) [6] are known to give excellent performance. However, given the limited availability of paired data, i.e., noisy documents and their corresponding cleaned versions, we propose to train CycleGAN [26] on unpaired datasets of noisy documents. We train CycleGAN for denoising / background noise removal, deblurring, watermark removal and defading. CycleGAN eliminates the need for a one-to-one mapping between images of the source and target domains via a two-step transformation of the source image: the source image is first mapped to an image in the target domain and then mapped back to the source again.

We evaluate the performance of our document cleaning suite on synthetic and publicly available datasets and compare it against state-of-the-art methods. We use Kaggle's document dataset for denoising / background noise removal [4] and the BMVC document deblurring dataset [7], which are publicly available online. No document dataset exists online for watermark removal or defading.
Therefore, we have synthetically generated document datasets for the watermark removal and defading tasks, and have made them public for the benefit of the research community. Overall, our contributions in this paper are as follows:
– We propose a Document Cleaning Suite that is capable of cleaning documents via denoising / background noise removal, deblurring, watermark removal and defading to improve readability.
– We propose the application of CycleGAN [26] for translating a document from a noisy document distribution (e.g., with background noise, blur, watermarks or fading) to a clean document distribution in situations where paired data is scarce.
– We synthetically created document datasets for watermark removal and defading by inserting logos as watermarks and applying fading operations, respectively, to documents from the Google News dataset [1].
– We evaluate CycleGAN for background noise removal, deblurring, watermark removal and defading on the publicly available Kaggle document dataset [4], the BMVC deblurring document dataset [7], and the synthetically created watermarked and faded document datasets, respectively.

The remainder of the paper is organized as follows. Section 2 reviews related work. Section 3 introduces CycleGAN and explains its architecture. Section 4 provides details of the datasets, training and evaluation metric used, and discusses experimental results and comparisons that evaluate the effectiveness of CycleGAN for cleaning noisy documents. Section 5 concludes the paper.

2 Related Work

The Generative Adversarial Network (GAN) [6] is an idea that has taken deep learning by storm. It employs adversarial training, which essentially means pitting two neural networks against each other: a generator and a discriminator, where the former aims at producing data that are indistinguishable from real data while the latter tries to distinguish between real and fake data. The process eventually yields a generator with the ability to perform a plethora of tasks efficiently, such as image-to-image generation. Other notable applications where GANs have established their supremacy are representation learning, image editing, art generation and music generation [2], [19], [22], [13], [21].
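For reference, adversarial training as introduced in [6] optimizes the following two-player minimax objective, where $G$ is the generator, $D$ is the discriminator, and $z$ is a noise vector drawn from a prior $p_z$:

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{data}(x)}\left[\log D(x)\right] + \mathbb{E}_{z \sim p_z(z)}\left[\log\left(1 - D(G(z))\right)\right]$$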
Image-to-image translation is the task of mapping images in a source domain to images in a target domain, such as converting sketches into photographs or grayscale images into color images. The aim is to generate the target distribution given the source distribution. Prior work in the field, such as the Conditional GAN [17], conditions the image produced by the generator on the input, which allows for optimal translations. However, such GANs require a one-to-one mapping of images between the source and target domains, i.e., a paired dataset. In the case of documents, it is not always possible to have a clean document corresponding to each noisy document. This persuaded us to explore unpaired image-to-image translation methods, e.g., DualGAN [25], which uses dual learning, and CycleGAN [26], which makes use of a cycle-consistency loss to achieve unpaired image-to-image translation.

In this paper, we propose to apply CycleGAN to the document cleaning task. It has two pairs of generators and discriminators. One pair focuses on converting the source domain to the target domain, while the other focuses on converting the target domain back to the source domain. This bi-directional conversion enables a cycle-consistency loss, which ensures the effective conversion of an image from source to target and then back to source again. The transitivity enforced by the cycle-consistency loss allows CycleGAN to perform well on unpaired image-to-image translation.

Existing methods for removing background noise from document images comprise binarization and thresholding techniques, fuzzy logic, histogram, morphology and genetic-algorithm based methods [3], [5]. An automatic method for color noise estimation from a single image using a Noise Level Function (NLF) and a Gaussian Conditional Random Field (GCRF) based removal technique was proposed in [14] for producing a clean image from a noisy input. Javed et al. [9] employ a technique for removing background and punch-hole noise from handwritten Urdu text. To our knowledge, deep learning had not previously been applied in the literature to removing noise from document images.

There is a substantial body of work on deblurring of images. For example, DeblurGAN [11] uses conditional GANs to deblur images, and [18] uses a multi-scale CNN to create an end-to-end system for deblurring. Ljubenovic et al. [16] proposed a class-adapted, dictionary-based prior for the image. There is also a method based on a sparsity-inducing prior on the blurring filter, which allows for deblurring images containing different classes of content, such as faces and text, when they co-occur in a document [15]. A non-convex regularization method was developed by Yao et al. [24], which leveraged non-convex sparsity constraints on image gradients and blur kernels to improve the kernel estimation accuracy. [10] uses a CNN to classify the image into one of several degradative sub-spaces, and the corresponding blur kernel is then used for deblurring.

Few attempts have been made in the past to remove watermarks from images. The authors of [20] proposed using image inpainting to recover the original image. The method developed by Xu et al. [23] detects the watermark using statistical methods and subsequently removes it using image inpainting. To the best of our knowledge, there is no prior work on defading of document images.

3 CycleGAN

CycleGAN [26] has shown its worth in scenarios where there is a paucity of paired data, i.e., images in the source domain with corresponding images in the target domain. This property of CycleGAN, namely working without a one-to-one mapping between the input and target domains while still learning such image-to-image translations, persuades us to use it for our document cleaning suite, where clean documents corresponding to noisy documents are always in limited supply. To ensure that meaningful transformations are learned from an unpaired dataset, CycleGAN uses a cycle-consistency loss, which requires that if an image is transformed from the source distribution to the target distribution and back again to the source distribution, we should recover a sample from the source distribution. This loss is incorporated in CycleGAN by using two generators and two discriminators, as shown in Figure 1. The first generator GB maps an image from the noisy domain A (IA) to an output image in the target clean domain B (OB). To make sure that there exists a meaningful relation between IA and OB, the generators must learn features that can be used to map OB back to the original noisy input domain. This reverse transformation is carried out by the second generator GA, which takes OB as input and converts it back into an image CA in the noisy domain.

Fig. 1. Overview of CycleGAN. It consists of two generators, GB and GA, which map noisy images to clean images and clean images to noisy images, respectively, using a cycle-consistency loss [26]. It also contains two discriminators, DA and DB, which act as adversaries and reject images generated by the generators.

A similar process of transformation is carried out for converting images in the clean domain B to the noisy domain A. As is evident in Figure 1, each discriminator takes two inputs: an original image in the source domain and a generated image produced by a generator. The task of the discriminator is to distinguish between them, so that the discriminator can defeat the generator by rejecting the images it generates. While competing against the discriminator so that it stops rejecting its images, the generator learns to produce images very close to the original input images.
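The cycle described above can be made concrete with a short sketch. Below is a minimal PyTorch rendering of the cycle-consistency term, assuming generators g_B (noisy to clean) and g_A (clean to noisy) in the roles described above; the function name and the weight lam are ours, not the paper's:

```python
import torch

def cycle_consistency_loss(g_A, g_B, real_A, real_B, lam=10.0):
    """L1 cycle loss from CycleGAN [26]: A -> B -> A and B -> A -> B
    should reproduce the original images. g_B maps noisy (A) to clean (B),
    g_A maps clean (B) back to noisy (A); lam weights the term."""
    fake_B = g_B(real_A)   # noisy -> clean
    rec_A = g_A(fake_B)    # clean -> noisy (reconstruction of real_A)
    fake_A = g_A(real_B)   # clean -> noisy
    rec_B = g_B(fake_A)    # noisy -> clean (reconstruction of real_B)
    loss = torch.mean(torch.abs(rec_A - real_A)) + \
           torch.mean(torch.abs(rec_B - real_B))
    return lam * loss
```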
We use the same network architecture as proposed for CycleGAN in [26]. The generator network consists of two convolutional layers of stride 2, several residual blocks, and two fractionally-strided (transposed) convolutional layers that upsample back to the input resolution. The discriminator network uses 70 × 70 PatchGANs [8] to classify the 70 × 70 overlapping patches of an image as real or fake.
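A sketch of such a 70 × 70 PatchGAN discriminator is shown below; the layer widths follow the public CycleGAN reference implementation and should be treated as an assumption rather than the exact configuration used here:

```python
import torch.nn as nn

def patchgan_discriminator(in_channels=3):
    """70x70 PatchGAN [8]: a stack of 4x4 convolutions whose final
    1-channel output map scores overlapping 70x70 patches as real/fake."""
    def block(c_in, c_out, stride=2, norm=True):
        layers = [nn.Conv2d(c_in, c_out, kernel_size=4, stride=stride, padding=1)]
        if norm:
            layers.append(nn.InstanceNorm2d(c_out))
        layers.append(nn.LeakyReLU(0.2, inplace=True))
        return layers

    return nn.Sequential(
        *block(in_channels, 64, norm=False),  # no normalization on first layer
        *block(64, 128),
        *block(128, 256),
        *block(256, 512, stride=1),
        nn.Conv2d(512, 1, kernel_size=4, stride=1, padding=1),  # patch scores
    )
```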
4 Experimental Results and Discussion

This section is divided into the following subsections: Section 4.1 provides details of the datasets used for the document cleaning suite. In Section 4.2, we elaborate on the training details of our experiments. Next, we give the performance evaluation metric in Section 4.3. Subsequently, Section 4.4 discusses the results obtained from the experiments we conducted and provides a comparison with the baseline model, i.e., the Conditional GAN [17].

Table 1. Performance comparison of Conditional GAN and CycleGAN based on PSNR (in dB)

  Task                  Conditional GAN   CycleGAN
  Background removal    27.624            31.774
  Deblurring            19.195            30.293
  Watermark removal     29.736            34.404
  Defading              28.157            34.403

Fig. 2. Plot comparing the PSNR of images produced by CycleGAN [26] and Conditional GAN [17] on the test set of the deblurring document dataset [7]. The test set consists of 16 sets of 100 documents each, where each set is blurred with one of the 16 blur kernels used for creating the training dataset.

4.1 Dataset Details

We used four separate document datasets, one each for background noise removal, deblurring, watermark removal and defading. Their details are given below:

– Kaggle Document Denoising Dataset: This document denoising dataset hosted by Kaggle [4] consists of noisy documents with noise in various forms, such as coffee stains, faded sun spots, dog-eared pages and wrinkles. We use this dataset for training and evaluating CycleGAN for removing background noise from document images. We used a training set of 144 noisy documents to train CycleGAN and tested the trained network on a test set of 72 document images.
– Document Deblurring Dataset: We used the synthetic document deblurring dataset [7] available online for training CycleGAN to deblur blurred documents. This dataset was created by taking documents from the CiteSeerX repository and applying various geometric transformations and two types of blur, i.e., motion blur and de-focus blur, to make the degradation look realistic. We used only a randomly sampled subset of 2000 documents from this dataset for training CycleGAN. For evaluation, the dataset provides a test set consisting of 16 sets of 100 documents, with each set blurred by one of the 16 blur kernels used for creating the training data.
– Watermark Removal Dataset: As there exists no publicly available dataset of watermarked document images, we generated our own synthetic watermark removal dataset. To create it, we first obtained text documents from the Google News dataset [1] and approximately 21 logos from the Internet to insert as watermarks. We then pasted the logos onto the documents, making the logos transparent with varying alpha-channel values. We varied the position, size and transparency of each logo to introduce randomness and make the watermarked documents realistic (a sketch of this procedure follows the list). A training set of 2000 images and a test set of 200 images from this synthetic dataset were used for our experiments.
– Document Defading Dataset: Similar to the watermark removal dataset, we artificially generated faded documents from the Google News dataset [1] by applying various dilation operations to the document images. Here again, the train and test sets consisted of 2000 and 200 document images, respectively, for training and evaluating the performance of CycleGAN for defading.
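The watermark insertion and fading procedures described above could look roughly as follows; the alpha range, logo scale and kernel size are illustrative assumptions, not the exact values used to build the datasets:

```python
import random
import cv2
import numpy as np
from PIL import Image

def add_watermark(doc_path, logo_path, out_path):
    """Paste a semi-transparent logo at a random position and size,
    mimicking the synthetic watermark dataset described above."""
    doc = Image.open(doc_path).convert("RGBA")
    logo = Image.open(logo_path).convert("RGBA")
    scale = random.uniform(0.2, 0.5)  # random logo size relative to the page
    logo = logo.resize((int(doc.width * scale), int(doc.height * scale)))
    factor = random.uniform(0.15, 0.4)  # random transparency factor
    alpha = logo.getchannel("A").point(lambda p: int(p * factor))
    logo.putalpha(alpha)
    x = random.randint(0, doc.width - logo.width)   # random position
    y = random.randint(0, doc.height - logo.height)
    doc.alpha_composite(logo, (x, y))
    doc.convert("RGB").save(out_path)

def fade_document(doc_path, out_path, kernel_size=3, iterations=1):
    """Dilate dark text on a light background; dilation grows the bright
    regions, thinning and lightening the strokes, which approximates
    the fading procedure described above."""
    img = cv2.imread(doc_path, cv2.IMREAD_GRAYSCALE)
    kernel = np.ones((kernel_size, kernel_size), np.uint8)
    faded = cv2.dilate(img, kernel, iterations=iterations)
    cv2.imwrite(out_path, faded)
```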
4.2 Training Details

We use the same training procedure as adopted for CycleGAN in [26]. A least-squares loss is used to train the network, as this loss is more stable and produces better-quality images. We update the discriminators using a history of generated images rather than only the ones produced by the latest generators, to reduce model oscillation. We use the Adam optimizer with a learning rate of 0.0002 and momentum of 0.5 for training CycleGAN on noisy images of size 200 × 200. The network is trained for 12, 30, 12 and 8 epochs for background noise removal, deblurring, watermark removal and defading, respectively.

For the Conditional GAN [17], we use a kernel size of 3 × 3 with stride 1 and zero-padding of 1 for all convolutional and deconvolutional layers of the generator network. In the discriminator network, the first three convolutional and deconvolutional layers are composed of 4 × 4 kernels with stride 2 and zero-padding of 1, while the last two layers use 4 × 4 kernels with stride 1. The network is trained on input images of size 200 × 200 using the Adam optimizer with a learning rate of 2 × 10⁻³. We use 6.6 × 10⁻³ and 1 as the weights of the adversarial loss and perceptual loss, respectively. The network is trained for 5 epochs for each of the document cleaning tasks, i.e., background noise removal, deblurring, watermark removal and defading.
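A minimal sketch of the optimizer setup and least-squares discriminator loss described above, assuming the generators g_A, g_B and discriminators d_A, d_B are defined elsewhere (e.g., as in the earlier sketches):

```python
import itertools
import torch

def make_optimizers(g_A, g_B, d_A, d_B, lr=0.0002, beta1=0.5):
    """Adam optimizers with the Section 4.2 settings; beta1 is the
    momentum term. Inputs are assumed to be 200 x 200 images."""
    opt_G = torch.optim.Adam(itertools.chain(g_A.parameters(), g_B.parameters()),
                             lr=lr, betas=(beta1, 0.999))
    opt_D = torch.optim.Adam(itertools.chain(d_A.parameters(), d_B.parameters()),
                             lr=lr, betas=(beta1, 0.999))
    return opt_G, opt_D

def d_loss(d, real, fake):
    """Least-squares (LSGAN) discriminator loss: real patches are pushed
    toward 1 and fake patches toward 0. `fake` should be drawn from a
    history buffer of past generator outputs to reduce oscillation,
    as described above."""
    mse = torch.nn.MSELoss()
    pred_real, pred_fake = d(real), d(fake.detach())
    return 0.5 * (mse(pred_real, torch.ones_like(pred_real)) +
                  mse(pred_fake, torch.zeros_like(pred_fake)))
```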
4.3 Evaluation Metric

We evaluate the performance of CycleGAN using the Peak Signal-to-Noise Ratio (PSNR)¹ as an image quality metric. PSNR is defined as the ratio between the maximum possible power of a signal and the power of the distorting noise that degrades the quality of its representation, and is usually expressed in terms of the mean squared error (MSE). Given a denoised image D of size m × n and its corresponding reference image I of the same size, PSNR is given as follows:

$$PSNR = 20 \times \log_{10}\left(\frac{Max_D}{\sqrt{MSE}}\right) \qquad (1)$$

where Max_D represents the maximum possible pixel intensity of the image (255 for 8-bit images). The higher the PSNR value, the better the image quality.

¹ Peak Signal-to-Noise Ratio: http://www.ni.com/white-paper/13306/en/
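Equation (1) translates directly into code; a minimal NumPy sketch (ours, for illustration):

```python
import numpy as np

def psnr(denoised, reference, max_val=255.0):
    """PSNR in dB between a denoised image and its reference, following
    Eq. (1); max_val is the maximum possible intensity (255 for 8-bit
    images). Higher is better."""
    diff = denoised.astype(np.float64) - reference.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 20.0 * np.log10(max_val / np.sqrt(mse))
```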
Fig. 3. Examples of noisy images (top row) and their corresponding cleaned images (bottom row) produced by CycleGAN on the Kaggle Document Denoising Dataset [4].

4.4 Results

We now present the results obtained on the document datasets using CycleGAN for document cleaning. Table 1 compares Conditional GAN and CycleGAN on the denoising, deblurring, watermark removal and defading tasks. As Table 1 shows, CycleGAN beats Conditional GAN on all of these document cleaning tasks. Row 1 of Table 1 gives the mean PSNR values for background noise removal: CycleGAN obtains a higher PSNR of 31.774 dB compared to Conditional GAN's 27.624 dB on the Kaggle Document Denoising dataset [4]. Similarly, CycleGAN's PSNR (30.293 dB) is better than Conditional GAN's (19.195 dB) on the deblurring dataset [7]. We also plot the per-kernel PSNR comparison for the deblurring test set in Figure 2, which again shows the superiority of CycleGAN over Conditional GAN. Rows 3 and 4 give the PSNR values for the watermark removal and defading tasks; here again, CycleGAN yields better image quality. We also show sample clean images produced by CycleGAN for all four tasks, i.e., background noise removal, deblurring, watermark removal and defading, in Figures 3, 4, 5 and 6, respectively.

Fig. 4. Results of CycleGAN on the deblurring document dataset [7]. The top row shows blurred images and the bottom row shows their corresponding deblurred images.

Fig. 5. Samples of watermarked images (first row) and their respective cleaned images (second row) produced by CycleGAN.

Fig. 6. Example faded images (top row) and the corresponding defaded images with recovered text (bottom row) produced by CycleGAN.

5 Conclusion

In this paper, we proposed and developed a Document Cleaning Suite based on CycleGAN that performs various document cleaning tasks: background noise removal, deblurring, watermark removal and defading. Very often it is difficult to obtain a clean image corresponding to a noisy image, and simulating noise for training image-to-image translators does not adequately generalize to the real world. Instead, we trained a model to learn the mapping from an input distribution of images to an output distribution while preserving the essence of each image. We used CycleGAN because it has been shown to provide good results in such domain adaptation scenarios, where the availability of paired datasets, i.e., noisy images and their corresponding cleaned versions, is limited. We demonstrated the effectiveness of CycleGAN on publicly available and synthetic document datasets, and the results show that it can clean up a variety of noise effectively.

References

1. Google News dataset. In: EMNLP 2011 Sixth Workshop on Statistical Machine Translation (2011), http://www.statmt.org/wmt11/translation-task.html#download
2. Chen, X., Duan, Y., Houthooft, R., Schulman, J., Sutskever, I., Abbeel, P.: InfoGAN: Interpretable representation learning by information maximizing generative adversarial nets. CoRR abs/1606.03657 (2016), http://arxiv.org/abs/1606.03657
3. Farahmand, A., Sarrafzadeh, A., Shanbehzadeh, J.: Document image noises and removal methods. In: Proceedings of the International MultiConference of Engineers and Computer Scientists 2013, vol. 1 (2013), http://www.iaeng.org/publication/IMECS2013/IMECS2013_pp436-440.pdf
4. Frank, A.: UCI Machine Learning Repository. Irvine, CA: University of California, School of Information and Computer Science (2010), http://archive.ics.uci.edu/ml
5. Ganbold, G.: History document image background noise and removal methods. International Journal of Knowledge Content Development and Technology 5 (2015), http://ijkcdt.net/xml/05531/05531.pdf
6. Goodfellow, I.J., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., Bengio, Y.: Generative adversarial networks. ArXiv e-prints (Jun 2014)
7. Hradiš, M., Kotera, J., Zemčík, P., Šroubek, F.: Convolutional neural networks for direct text deblurring. In: Proceedings of BMVC 2015. The British Machine Vision Association and
Society for Pattern Recognition (2015), http://www.fit.vutbr.cz/research/view_pub.php?id=10922
8. Isola, P., Zhu, J., Zhou, T., Efros, A.A.: Image-to-image translation with conditional adversarial networks. CoRR abs/1611.07004 (2016), http://arxiv.org/abs/1611.07004
9. Javed, S.T., Fasihi, M.M., Khan, A., Ashraf, U.: Background and punch-hole noise removal from handwritten Urdu text. In: 2017 International Multi-topic Conference (INMIC). pp. 1–6 (Nov 2017). https://doi.org/10.1109/INMIC.2017.8289451
10. Jiao, J., Sun, J., Satoshi, N.: A convolutional neural network based two-stage document deblurring. In: 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR). vol. 01, pp. 703–707 (Nov 2017). https://doi.org/10.1109/ICDAR.2017.120
11. Kupyn, O., Budzan, V., Mykhailych, M., Mishkin, D., Matas, J.: DeblurGAN: Blind motion deblurring using conditional adversarial networks. CoRR abs/1711.07064 (2017), http://arxiv.org/abs/1711.07064
12. Li, H., Zhang, Y., Zhang, H., Zhu, Y., Sun, J.: Blind image deblurring based on sparse prior of dictionary pair. In: Proceedings of the 21st International Conference on Pattern Recognition (ICPR 2012). pp. 3054–3057 (Nov 2012)
13. Lin, D., Fu, K., Wang, Y., Xu, G., Sun, X.: MARTA GANs: Unsupervised representation learning for remote sensing image classification. IEEE Geoscience and Remote Sensing Letters 14(11), 2092–2096 (Nov 2017). https://doi.org/10.1109/LGRS.2017.2752750
14. Liu, C., Szeliski, R., Kang, S.B., Zitnick, C.L., Freeman, W.T.: Automatic estimation and removal of noise from a single image. IEEE Transactions on Pattern Analysis and Machine Intelligence (2006), http://people.csail.mit.edu/celiu/denoise/denoise_pami.pdf
15. Liu, R.W., Li, Y., Liu, Y., Duan, J., Xu, T., Liu, J.: Single-image blind deblurring with hybrid sparsity regularization. In: 2017 20th International Conference on Information Fusion (Fusion). pp. 1–8 (July 2017). https://doi.org/10.23919/ICIF.2017.8009659
16. Ljubenovic, M., Zhuang, L., Figueiredo, M.A.T.: Class-adapted blind deblurring of document images. In: 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR). vol. 01, pp. 721–726 (Nov 2017). https://doi.org/10.1109/ICDAR.2017.123
17. Mirza, M., Osindero, S.: Conditional generative adversarial nets. CoRR abs/1411.1784 (2014), http://arxiv.org/abs/1411.1784
18. Nah, S., Kim, T.H., Lee, K.M.: Deep multi-scale convolutional neural network for dynamic scene deblurring. CoRR abs/1612.02177 (2016), http://arxiv.org/abs/1612.02177
19. Peng, Y., Qi, J., Yuan, Y.: CM-GANs: Cross-modal generative adversarial networks for common representation learning. CoRR abs/1710.05106 (2017), http://arxiv.org/abs/1710.05106
20. Qin, C., He, Z., Yao, H., Cao, F., Gao, L.: Visible watermark removal scheme based on reversible data hiding and image inpainting. Signal Processing: Image Communication 60, 160–172 (2018). https://doi.org/10.1016/j.image.2017.10.003
21. Radford, A., Metz, L., Chintala, S.: Unsupervised representation learning with deep convolutional generative adversarial networks. CoRR abs/1511.06434 (2015), http://arxiv.org/abs/1511.06434
22. Wang, L., Gao, C., Yang, L., Zhao, Y., Zuo, W., Meng, D.: PM-GANs: Discriminative representation learning for action recognition using partial-modalities. CoRR abs/1804.06248 (2018), http://arxiv.org/abs/1804.06248
23. Xu, C., Lu, Y., Zhou, Y.: An automatic visible watermark removal technique using image inpainting algorithms. In: 2017 4th International Conference on Systems and Informatics (ICSAI). pp. 1152–1157 (Nov 2017). https://doi.org/10.1109/ICSAI.2017.8248459
24. Yao, Q., Kwok, J.T.: Efficient learning with a family of nonconvex regularizers by redistributing nonconvexity. ArXiv e-prints (Jun 2016)
25. Yi, Z., Zhang, H., Tan, P., Gong, M.: DualGAN: Unsupervised dual learning for image-to-image translation. CoRR abs/1704.02510 (2017), http://arxiv.org/abs/1704.02510
26. Zhu, J., Park, T., Isola, P., Efros, A.A.: Unpaired image-to-image translation using cycle-consistent adversarial networks. CoRR abs/1703.10593 (2017), http://arxiv.org/abs/1703.10593