GAN Fashion Photo Shoot: Garment to Model Images Using Conditional GANs
Costa M. Colbert, Chief Scientist, MAD Street Den Inc.
Live model photography is expensive
• Brands and retailers on average spend $100-500 per look. Lower per-look prices do not include hair, makeup, and styling
• Shooting capacity is limited
  • 35-40 looks per day with hair & makeup
  • 60-70 looks per day without hair & makeup
• Bulk of the cost includes:
  • Models' time (at least $1,200 day rate)
  • Photographer's time
  • Digital tech & post production
  • Hair, makeup, styling
• Cost does not usually include:
  • Pulling samples
  • Transporting samples to photo studio
  • Photo studio & equipment
  • Time to cast models & hire photographers & stylists
  • Time of internal teams involved in a photo shoot process
  • Reshoots due to items not selling with a current image (3-5% of items)
Generative Adversarial Network
• Training dataset provides real samples x
• Samples z are drawn from a prior distribution, e.g. N(0,1)
• Generator G(z) approximates a sample from x
• Discriminator D(x) decides if a sample is from x (real / fake)
Also use L1 reconstruction loss term: abs(G(z)-x)
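A minimal sketch of how the adversarial term and the L1 reconstruction term could be combined into one generator loss. The slides do not give the exact form of the adversarial term or the relative weight of the two terms, so the log-loss form and the `l1_weight` parameter are assumptions made for illustration.

```python
import numpy as np

def generator_loss(d_fake_prob, generated, target, l1_weight=1.0):
    """Adversarial term plus weighted L1 reconstruction term.

    d_fake_prob: discriminator's probability that each generated sample is real.
    generated:   generator output G(z).
    target:      real sample x.
    l1_weight:   hypothetical weighting factor, not taken from the slides.
    """
    eps = 1e-12  # avoids log(0)
    adversarial = -np.mean(np.log(d_fake_prob + eps))
    l1 = np.mean(np.abs(generated - target))  # abs(G(z) - x), averaged over pixels
    return adversarial + l1_weight * l1

# Usage with stand-in arrays: a fully fooled discriminator and a constant pixel error of 1.
loss = generator_loss(np.ones(4), np.zeros((3, 4, 4)), np.ones((3, 4, 4)), l1_weight=2.0)
```

The L1 term pulls the output toward the target image pixel by pixel, while the adversarial term rewards outputs the discriminator accepts as real.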
Conditional Generative Adversarial Network
• Training dataset provides real samples x ~ X
• Samples z from prior distribution, e.g., garments, pose, other labels
• Generator CNN G(z) approximates a sample from x
• Discriminator CNN D(x, garment) decides if a sample is from x (real or fake?), also requiring the correct garment
Conditional GAN - discriminator (Patch GAN, Isola et al. 2016)
• Input is Model Image concatenated to Garment Image
• Real/Fake is determined by observing patches of limited extent
• Discriminator CNN: 6 CNN layers
• Convolution, Instance Normalization, Dropout
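A rough sketch of the two ideas on this slide: the discriminator input is the model image concatenated with the garment image, and the real/fake decision comes from per-patch outputs. The image size, the patch grid size, and the use of a simple average to combine patch outputs (as in Isola et al.) are assumptions; the slides do not state them, and the CNN itself is omitted.

```python
import numpy as np

def discriminator_input(model_image, garment_image):
    """Concatenate model and garment images along the channel axis; each is (C, H, W)."""
    return np.concatenate([model_image, garment_image], axis=0)

def patch_score(patch_logits):
    """Turn per-patch logits into probabilities and average them into one score."""
    probs = 1.0 / (1.0 + np.exp(-patch_logits))
    return float(probs.mean())

# Hypothetical sizes: 3-channel 256x192 images and an 8x6 grid of patch outputs.
model_image = np.zeros((3, 256, 192))
garment_image = np.zeros((3, 256, 192))
x = discriminator_input(model_image, garment_image)  # 6 channels
score = patch_score(np.zeros((8, 6)))                # stand-in for the CNN's patch logits
```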
Hmm... maybe that global discriminator term wasn't such a bad idea.
Conditional GAN - generator (Encoder / Decoder)
• Inputs: Garment Image, Pose
• Encoder: 6 CNN layers (Convolution, Instance Normalization, Dropout)
• Latent Vector: 4x3x512
• Decoder: 6 CNN layers (Deconvolution/Unpool, Instance Normalization)
• Output: Fashion Model Image
• Adam Optimizer, GTX1080ti, 2-4 GB
Pose Interpolation
GAN generator - latent vector
Encoder -> Latent Vector (512x4x3) -> Decoder
latent vector - Interpolation
• Enc(X1garment) -> LV1, Dec(LV1) -> X1
• Enc(X2garment) -> LV2, Dec(LV2) -> X2
• Dec(F(LV1,LV2)) -> XFi,n
• Fi,n(x,y) = x + (y-x)*i/n
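The interpolation formula Fi,n(x,y) = x + (y-x)*i/n can be sketched directly in NumPy. The random arrays below are stand-ins for encoder outputs, the function name is hypothetical, and letting i run from 0 to n inclusive is an assumption (the slide does not state the range). Each resulting vector would then be passed to the decoder.

```python
import numpy as np

def interpolate_latents(lv1, lv2, n):
    """Return the vectors lv1 + (lv2 - lv1) * i / n for i = 0..n."""
    return [lv1 + (lv2 - lv1) * i / n for i in range(n + 1)]

rng = np.random.default_rng(0)
lv1 = rng.standard_normal((512, 4, 3))  # stand-in for Enc(X1garment)
lv2 = rng.standard_normal((512, 4, 3))  # stand-in for Enc(X2garment)

frames = interpolate_latents(lv1, lv2, n=8)
# frames[0] is lv1, frames[-1] is lv2, frames[4] is the midpoint.
```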
Latent Variable Interpolation: Shoes, Neckline, Hemline
Latent Variable Interpolation: Hemline
Latent Variable Interpolation: Color, Background. Note sleeves.
latent vector - modify values
• Enc(X1garment) -> LV1 (Latent Vector 512x4x3), Dec(LV1) -> X1
• Dec(F(LV1,i)) -> XF,i
Principal Component Analysis (PCA)
• PCA is a dimension-reduction tool that can reduce a large set of variables to a small set that still contains most of the information in the large set.
• PCA transforms a number of (possibly) correlated variables into a (smaller) number of uncorrelated variables (principal components).
• PCA determines the new dimensions on the basis of variance.
Principal Component Analysis (PCA)
• Use PCA to go from 512x4x3 (~6k) dimensions to 100.
• LV6k -> PCA -> LV100
• Choose an entry, scale by +/- 10
• LV100 -> Inv(PCA) -> LV6k -> Dec -> XF,i
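The pipeline above can be sketched with a plain SVD-based PCA in NumPy. This is an illustration under stated assumptions: the latent vectors are random stand-ins, the function names are hypothetical, and "scale by +/- 10" is read here as multiplying the chosen PCA coordinate by a factor, which is only one possible interpretation of the slide.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in data: 300 latent vectors flattened to 512*4*3 = 6144 dimensions.
latents = rng.standard_normal((300, 512 * 4 * 3))

# Fit PCA: centre the data; rows of vt are principal directions sorted by variance.
mean = latents.mean(axis=0)
_, _, vt = np.linalg.svd(latents - mean, full_matrices=False)
components = vt[:100]  # keep the top 100 directions, shape (100, 6144)

def to_pca(lv6k):
    """LV6k -> LV100."""
    return (lv6k - mean) @ components.T

def from_pca(lv100):
    """LV100 -> LV6k (inverse PCA)."""
    return lv100 @ components + mean

def modify_component(lv6k, index, factor):
    """Scale one PCA coordinate, then map back to the decoder's input shape."""
    lv100 = to_pca(lv6k)
    lv100[index] *= factor  # assumption: "scale by +/- 10" means multiply
    return from_pca(lv100).reshape(512, 4, 3)

modified = modify_component(latents[0], index=0, factor=10.0)  # would be fed to Dec
```

Editing in the 100-dimensional PCA space means each change moves the latent vector along a direction of high variance in the data, which is why single entries can correspond to visible attributes.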
PCA Latent Variable Interpolation: Skin color, Model build
PCA Latent Variable Interpolation: Shoes
Conclusions and Future Work
• Conditional GANs are well-suited for image generation in well-defined domains.
• Results are good enough for the casual observer not to notice.
• GANs have many "moving parts," but we are getting better at using them.
• More work is needed on accessories, choosing specific shoes, handbags, etc. This requires more thought on implementing conditioning labels.
A big thanks to Preferred Networks
Thank you!!
support@madstreetden.com