Authors:
(1) Kedan Li, University of Illinois at Urbana-Champaign;
(2) Min Jin Chong, University of Illinois at Urbana-Champaign;
(3) Jingen Liu, JD AI Research;
(4) David Forsyth, University of Illinois at Urbana-Champaign.

Table of Links
Abstract and Intro
Related Work
Proposed Method
Experiments
Conclusions and References

5. Conclusions

In this paper, we propose two general modifications to the virtual try-on framework: (a) carefully choosing the product-model pair for transfer using a shape embedding, and (b) combining multiple coordinated warps using inpainting. Our results show that both modifications lead to significant improvements in generation quality. Qualitative examples demonstrate our ability to accurately preserve garment details. As a result, shoppers find it difficult to distinguish between real and synthesized model images, as shown by our user study.

References

Alp Guler, R., Neverova, N., Kokkinos, I.: DensePose: Dense human pose estimation in the wild. In: CVPR (2018)
Ayush, K., Jandial, S., Chopra, A., Krishnamurthy, B.: Powering virtual try-on via auxiliary human segmentation learning. In: ICCV Workshops (2019)
Belongie, S., Malik, J., Puzicha, J.: Shape matching and object recognition using shape contexts. PAMI (2002)
Bogo, F., Kanazawa, A., Lassner, C., Gehler, P., Romero, J., Black, M.J.: Keep it SMPL: Automatic estimation of 3D human pose and shape from a single image. In: ECCV (2016)
Brock, A., Donahue, J., Simonyan, K.: Large scale GAN training for high fidelity natural image synthesis. arXiv preprint arXiv:1809.11096 (2018)
Chen, L.C., Zhu, Y., Papandreou, G., Schroff, F., Adam, H.: Encoder-decoder with atrous separable convolution for semantic image segmentation. In: ECCV (2018)
Chen, M., Qin, Y., Qi, L., Sun, Y.: Improving fashion landmark detection by dual attention feature enhancement. In: ICCV Workshops (2019)
Chen, W., Wang, H., Li, Y., Su, H., Wang, Z., Tu, C., Lischinski, D., Cohen-Or, D., Chen, B.: Synthesizing training images for boosting human 3D pose estimation (2015)
Chong, M.J., Forsyth, D.: Effectively unbiased FID and Inception Score and where to find them. arXiv preprint arXiv:1911.07023 (2019)
Danerek, R., Dibra, E., Oztireli, A.C., Ziegler, R., Gross, M.H.: DeepGarment: 3D garment shape estimation from a single image. Comput. Graph. Forum (2017)
Dong, H., Liang, X., Gong, K., Lai, H., Zhu, J., Yin, J.: Soft-gated warping-GAN for pose-guided person image synthesis. In: NeurIPS (2018)
Dong, H., Liang, X., Wang, B., Lai, H., Zhu, J., Yin, J.: Towards multi-pose guided virtual try-on network. In: ICCV (2019)
Grigor'ev, A.K., Sevastopolsky, A., Vakhitov, A., Lempitsky, V.S.: Coordinate-based texture inpainting for pose-guided human image generation. In: CVPR (2019)
Guan, P., Reiss, L., Hirshberg, D., Weiss, A., Black, M.: DRAPE: Dressing any person. ACM Transactions on Graphics (2012)
Han, X., Hu, X., Huang, W., Scott, M.R.: ClothFlow: A flow-based model for clothed person generation. In: ICCV (2019)
Han, X., Wu, Z., Huang, W., Scott, M.R., Davis, L.S.: Compatible and diverse fashion image inpainting (2019)
Han, X., Wu, Z., Wu, Z., Yu, R., Davis, L.S.: VITON: An image-based virtual try-on network. In: CVPR (2018)
Heusel, M., Ramsauer, H., Unterthiner, T., Nessler, B., Hochreiter, S.: GANs trained by a two time-scale update rule converge to a local Nash equilibrium. In: NeurIPS, pp. 6626–6637 (2017)
Hsiao, W.L., Grauman, K.: Dressing for diverse body shapes. ArXiv (2019)
Hsiao, W.L., Katsman, I., Wu, C.Y., Parikh, D., Grauman, K.: Fashion++: Minimal edits for outfit improvement. In: ICCV (2019)
Hsieh, C.W., Chen, C.Y., Chou, C.L., Shuai, H.H., Liu, J., Cheng, W.H.: FashionOn: Semantic-guided image-based virtual try-on with detailed human and clothing information. In: MM '19 (2019)
Lee, H., Lee, R., Kang, M., Cho, M., Park, G.: LA-VITON: A network for looking-attractive virtual try-on. In: ICCV Workshops (2019)
Jaderberg, M., Simonyan, K., Zisserman, A., Kavukcuoglu, K.: Spatial transformer networks. In: NeurIPS (2015)
Jandial, S., Chopra, A., Ayush, K., Hemani, M., Kumar, A., Krishnamurthy, B.: SieveNet: A unified framework for robust image-based virtual try-on. In: WACV (2020)
Jeong, M.H., Han, D.H., Ko, H.S.: Garment capture from a photograph. Journal of Visualization and Computer Animation (2015)
Ji, D., Kwon, J., McFarland, M., Savarese, S.: Deep view morphing. In: CVPR (2017)
Kanazawa, A., Black, M.J., Jacobs, D.W., Malik, J.: End-to-end recovery of human shape and pose. In: CVPR (2018)
Kanazawa, A., Jacobs, D., Chandraker, M.: WarpNet: Weakly supervised matching for single-view reconstruction. In: CVPR (2016)
Karras, T., Laine, S., Aila, T.: A style-based generator architecture for generative adversarial networks. In: CVPR, pp. 4401–4410 (2019)
Lin, C.H., Yumer, E., Wang, O., Shechtman, E., Lucey, S.: ST-GAN: Spatial transformer generative adversarial networks for image compositing. In: CVPR (2018)
Liu, G., Reda, F.A., Shih, K.J., Wang, T.C., Tao, A., Catanzaro, B.: Image inpainting for irregular holes using partial convolutions. In: ECCV (2018)
Liu, K.H., Chen, T.Y., Chen, C.S.: MVC: A dataset for view-invariant clothing retrieval and attribute prediction. In: ICMR (2016)
Liu, Z., Luo, P., Qiu, S., Wang, X., Tang, X.: DeepFashion: Powering robust clothes recognition and retrieval with rich annotations. In: CVPR (2016)
McKinsey: State of the fashion industry 2019 (2019)
Natsume, R., Saito, S., Huang, Z., Chen, W., Ma, C., Li, H., Morishima, S.: SiCloPe: Silhouette-based clothed people. In: CVPR (2019)
Neverova, N., Güler, R.A., Kokkinos, I.: Dense pose transfer. In: ECCV (2018)
Raffiee, A.H., Sollami, M.: GarmentGAN: Photo-realistic adversarial fashion transfer (2020)
Raj, A., Sangkloy, P., Chang, H., Hays, J., Ceylan, D., Lu, J.: SwapNet: Image based garment transfer. In: ECCV (2018)
Rocco, I., Arandjelović, R., Sivic, J.: Convolutional neural network architecture for geometric matching. In: CVPR (2017)
Saito, S., Huang, Z., Natsume, R., Morishima, S., Kanazawa, A., Li, H.: PIFu: Pixel-aligned implicit function for high-resolution clothed human digitization. In: ICCV (2019)
Schroff, F., Kalenichenko, D., Philbin, J.: FaceNet: A unified embedding for face recognition and clustering. In: CVPR (2015)
Song, D., Li, T., Mao, Z., Liu, A.: SP-VITON: Shape-preserving image-based virtual try-on network. Multimedia Tools and Applications (2019)
Suzuki, S., Abe, K.: Topological structural analysis of digitized binary images by border following. Computer Vision, Graphics, and Image Processing (1985)
Vaccaro, K., Agarwalla, T., Shivakumar, S., Kumar, R.: Designing the future of personal fashion. In: CHI (2018)
Wang, B., Zheng, H., Liang, X., Chen, Y., Lin, L.: Toward characteristic-preserving image-based virtual try-on network. In: ECCV (2018)
Wang, J., Zhang, W., Liu, W.H., Mei, T.: Down to the last detail: Virtual try-on with detail carving. ArXiv (2019)
Wu, Z., Lin, G., Tao, Q., Cai, J.: M2E-Try On Net: Fashion from model to everyone. In: MM '19 (2018)
Yang, C., Lu, X., Lin, Z., Shechtman, E., Wang, O., Li, H.: High-resolution image inpainting using multi-scale neural patch synthesis. In: CVPR (2017)
Yu, J., Lin, Z., Yang, J., Shen, X., Lu, X., Huang, T.S.: Free-form image inpainting with gated convolution. In: ICCV (2019)
Yu, J., Lin, Z.L., Yang, J., Shen, X., Lu, X., Huang, T.S.: Generative image inpainting with contextual attention. In: CVPR (2018)
Yu, L., Zhong, Y., Wang, X.: Inpainting-based virtual try-on network for selective garment transfer. IEEE Access (2019)
Yu, R., Wang, X., Xie, X.: VTNFP: An image-based virtual try-on network with body and clothing feature preservation. In: ICCV (2019)
Zhang, H., Goodfellow, I., Metaxas, D., Odena, A.: Self-attention generative adversarial networks. arXiv preprint arXiv:1805.08318 (2018)
Zheng, N., Song, X., Chen, Z., Hu, L., Cao, D., Nie, L.: Virtually trying on new clothing with arbitrary poses. In: MM '19 (2019)
Zheng, S., Yang, F., Kiapour, M.H., Piramuthu, R.: ModaNet: A large-scale street fashion dataset with polygon annotations. In: ACM Multimedia (2018)
Zhu, S., Fidler, S., Urtasun, R., Lin, D., Chen, C.L.: Be your own Prada: Fashion synthesis with structural coherence. In: CVPR (2017)

This paper is available on arxiv under CC BY-NC-SA 4.0 DEED license.