# Texture-Preserving Multimodal Fashion Image Editing with Diffusion Models

<div style="display: flex; justify-content: center; align-items: center;">
<a href='https://huggingface.co/zibingo/TP-MGD' style="margin: 0 2px;">
<img src='https://img.shields.io/badge/Hugging Face-ckpts-orange?style=flat&logo=HuggingFace&logoColor=orange' alt='huggingface'>
</a>
<a href="https://github.com/zibingo/TP-MGD" style="margin: 0 2px;">
<img src='https://img.shields.io/badge/GitHub-Repo-blue?style=flat&logo=GitHub' alt='GitHub'>
</a>
</div>

## 🎯 Overview

TP-MGD is a method for texture-preserving multimodal fashion image editing with diffusion models. It enables high-quality fashion image generation and editing through a lightweight architecture while preserving fine-grained texture details.

<div align="center">
<img src="assets/sample_by_model.jpg" width="100%" height="100%"/>
</div>

## ✅ TODO

- [x] Release training code
- [x] Release inference code
- [x] Release processed datasets
- [x] Release checkpoints to Hugging Face
- [x] Create comprehensive documentation

## 🚀 Quick Start

### Installation

```bash
git clone https://github.com/zibingo/TP-MGD.git
cd TP-MGD
```

**Requirements:**

- Python 3.9+
- PyTorch >= 2.5.0
- CUDA >= 12.4

```bash
pip install diffusers accelerate transformers opencv-python einops wandb open_clip_torch
```
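
After installing, it can be worth confirming that everything resolved before moving on. A minimal stdlib-only sketch (the package list mirrors the `pip install` line above; note that some pip names differ from their import names):

```python
# Quick dependency check: reports which required packages are missing
# from the current environment (standard library only).
from importlib.util import find_spec

def missing_packages(module_names):
    """Return the subset of module_names that cannot be imported."""
    return [name for name in module_names if find_spec(name) is None]

if __name__ == "__main__":
    # Import names for the packages installed above: opencv-python
    # imports as cv2, open_clip_torch imports as open_clip.
    required = ["torch", "diffusers", "accelerate", "transformers",
                "cv2", "einops", "wandb", "open_clip"]
    missing = missing_packages(required)
    print("missing:", missing or "none")
```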

### Download Pre-trained Models

```bash
wget https://huggingface.co/h94/IP-Adapter/resolve/main/models/ip-adapter_sd15.bin
```

## 📊 Dataset Setup

### VITON-HD Dataset

1. **Download VITON-HD**: Get the original dataset from [VITON-HD](https://github.com/shadow2496/VITON-HD)
2. **Download MGD multimodal data**: Get additional data from [MGD](https://github.com/aimagelab/multimodal-garment-designer)
3. **Download preprocessed textures**:

```bash
wget https://huggingface.co/zibingo/TP-MGD/resolve/main/vitonhd-texture.zip
```

4. **Configuration:** Set the `dataroot_path` in the YAML files under the `configs/` directory.

**Directory Structure:**
```
├── captions.json (from MGD)
├── test/
│ ├── agnostic-mask/
│ ├── agnostic-v3.2/
│ ├── cloth/
│ ├── cloth-mask/
│ ├── cloth-texture/ (from Ours)
│ ├── im_sketch/ (from MGD)
│ ├── im_sketch_unpaired/ (from MGD)
│ ├── image/
│ ├── image-densepose/
│ ├── image-parse-agnostic-v3.2/
│ ├── image-parse-v3/
│ ├── openpose_img/
│ └── openpose_json/
├── test_pairs.txt
├── train/
│ ├── agnostic-mask/
│ ├── agnostic-v3.2/
│ ├── cloth/
│ ├── cloth-mask/
│ ├── cloth-texture/ (from Ours)
│ ├── gt_cloth_warped_mask/
│ ├── im_sketch/ (from MGD)
│ ├── image/
│ ├── image-densepose/
│ ├── image-parse-agnostic-v3.2/
│ ├── image-parse-v3/
│ ├── openpose_img/
│ └── openpose_json/
└── train_pairs.txt
```
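
Since the dataset is assembled from three separate downloads, path mistakes are easy to make. A small sketch to sanity-check the layout before training, with folder names taken from the tree above (`dataroot` is whatever you set as `dataroot_path` in the config):

```python
from pathlib import Path

# Subfolders expected under <dataroot>/test and <dataroot>/train, per the
# VITON-HD tree above. train/ additionally has gt_cloth_warped_mask/ and
# lacks im_sketch_unpaired/.
SHARED = ["agnostic-mask", "agnostic-v3.2", "cloth", "cloth-mask",
          "cloth-texture", "im_sketch", "image", "image-densepose",
          "image-parse-agnostic-v3.2", "image-parse-v3",
          "openpose_img", "openpose_json"]

def missing_dirs(dataroot):
    """Return a sorted list of expected sub-directories that do not exist."""
    root = Path(dataroot)
    expected = [root / "test" / d for d in SHARED + ["im_sketch_unpaired"]]
    expected += [root / "train" / d for d in SHARED + ["gt_cloth_warped_mask"]]
    return sorted(str(p) for p in expected if not p.is_dir())
```

An empty result means the layout matches; otherwise the returned paths point at the missing pieces.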

### DressCode Dataset

1. **Download DressCode**: Get the original dataset from [DressCode](https://github.com/aimagelab/dress-code)
2. **Download MGD multimodal data**: Get additional data from [MGD](https://github.com/aimagelab/multimodal-garment-designer)
3. **Download preprocessed textures**:

```bash
wget https://huggingface.co/zibingo/TP-MGD/resolve/main/dresscode-texture.zip
```

4. **Configuration:** Set the `dataroot_path` in the YAML files under the `configs/` directory.

**Directory Structure:**
```
├── dresses/
│ ├── dense/
│ ├── dresses_cloth-texture/ (from Ours)
│ ├── im_sketch/ (from MGD)
│ ├── im_sketch_unpaired/ (from MGD)
│ ├── images/
│ ├── keypoints/
│ ├── label_maps/
│ ├── test_pairs_paired.txt
│ ├── test_pairs_unpaired.txt
│ └── train_pairs.txt
├── lower_body/
│ ├── dense/
│ ├── im_sketch/ (from MGD)
│ ├── im_sketch_unpaired/ (from MGD)
│ ├── images/
│ ├── keypoints/
│ ├── label_maps/
│ ├── lower_body_cloth-texture/ (from Ours)
│ ├── test_pairs_paired.txt
│ ├── test_pairs_unpaired.txt
│ └── train_pairs.txt
├── upper_body/
│ ├── dense/
│ ├── im_sketch/ (from MGD)
│ ├── im_sketch_unpaired/ (from MGD)
│ ├── images/
│ ├── keypoints/
│ ├── label_maps/
│ ├── test_pairs_paired.txt
│ ├── test_pairs_unpaired.txt
│ ├── train_pairs.txt
│ └── upper_body_cloth-texture/ (from Ours)
├── coarse_captions.json (from MGD)
├── fine_captions.json (from MGD)
├── multigarment_test_triplets.txt
├── readme.txt
├── test_pairs_paired.txt
├── test_pairs_unpaired.txt
├── test_stitch_map/ (from MGD)
└── train_pairs.txt
```
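
Both datasets are driven by the `*_pairs*.txt` files, which in the usual VITON-HD/DressCode convention list one person/garment pair per whitespace-separated line. A small loader sketch, assuming that format:

```python
def load_pairs(path):
    """Parse a pairs file: one 'person garment' pair per line,
    whitespace-separated. Blank or malformed lines are skipped."""
    pairs = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.split()
            if len(parts) >= 2:
                pairs.append((parts[0], parts[1]))
    return pairs
```

Swapping `test_pairs_paired.txt` for `test_pairs_unpaired.txt` is what switches evaluation between paired and unpaired settings.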

## 🚀 Usage

### Training

**Single GPU:**

```bash
python train_vitonhd.py
python train_dresscode.py
```

**Multi-GPU:**

```bash
CUDA_VISIBLE_DEVICES=0,1 accelerate launch train_vitonhd.py
CUDA_VISIBLE_DEVICES=0,1 accelerate launch train_dresscode.py
```

### Inference

1. **Download pre-trained weights** from [Hugging Face](https://huggingface.co/zibingo/TP-MGD/tree/main) and place them in the `checkpoints/` directory
2. **Update configuration**: Modify the `resume_state` parameter in the YAML files under the `configs/` directory to point to your checkpoint directory

**Single GPU:**

```bash
python inference_vitonhd.py
python inference_dresscode.py
```

**Multi-GPU:**

```bash
CUDA_VISIBLE_DEVICES=0,1 accelerate launch inference_vitonhd.py
CUDA_VISIBLE_DEVICES=0,1 accelerate launch inference_dresscode.py
```

## 📁 Project Structure

```
TP-MGD/
├── configs/                 # Configuration files
├── checkpoints/             # Pre-trained model weights
├── assets/                  # Sample images
├── train_vitonhd.py         # VITON-HD training script
├── train_dresscode.py       # DressCode training script
├── inference_vitonhd.py     # VITON-HD inference script
├── inference_dresscode.py   # DressCode inference script
├── datasets.py              # Dataset loading utilities
└── attention_processor.py   # Custom attention mechanisms
```

## 🔧 Configuration

Key configuration parameters in `configs/*.yaml`:

- `dataroot_path`: Path to your dataset
- `resume_state`: Path to a checkpoint, used for inference or for resuming training

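Both parameters are set per-dataset, so they need editing in each YAML file you use. A minimal sketch for rewriting such a value in place, assuming the flat top-level `key: value` style of these options (for nested YAML, prefer a real YAML library):

```python
import re

def set_config_value(text, key, value):
    """Replace the value of a top-level `key: value` line in YAML-style
    config text; append the line if the key is absent."""
    pattern = re.compile(rf"^{re.escape(key)}\s*:.*$", re.MULTILINE)
    replacement = f"{key}: {value}"
    if pattern.search(text):
        return pattern.sub(replacement, text)
    return text.rstrip("\n") + "\n" + replacement + "\n"
```

For example, `set_config_value(cfg_text, "dataroot_path", "/data/vitonhd")` updates only that line and leaves the rest of the config untouched.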
## 🙏 Acknowledgments

- Our code is based on [Diffusers](https://github.com/huggingface/diffusers)
- We use [Stable Diffusion v1.5 inpainting](https://huggingface.co/runwayml/stable-diffusion-inpainting) as the base model
- Thanks to [VITON-HD](https://github.com/shadow2496/VITON-HD), [DressCode](https://github.com/aimagelab/dress-code), and [MGD](https://github.com/aimagelab/multimodal-garment-designer) for providing the public datasets

## 📞 Contact

For questions and support, please open an issue on GitHub or contact the authors.

---

**⭐ If you find this project helpful, please give it a star!**