Domain Expansion of Image Generators

Adobe Research · Tel-Aviv University · CMU

Starting from a pretrained generator modeling a single domain, such as the human face, we expand it to model hundreds of additional novel domains.


Abstract

Can one inject new concepts into an already trained generative model, while respecting its existing structure and knowledge? We propose a new task - domain expansion - to address this. Given a pretrained generator and novel (but related) domains, we expand the generator to jointly model all domains, old and new, harmoniously. First, we note the generator contains a meaningful, pretrained latent space. Is it possible to minimally perturb this hard-earned representation, while maximally representing the new domains? Interestingly, we find that the latent space offers unused, dormant directions, which do not affect the output. This provides an opportunity: By repurposing these directions, we can represent new domains without perturbing the original representation. In fact, we find that pretrained generators have the capacity to add several - even hundreds - of new domains! Using our expansion method, one expanded model can supersede numerous domain-specific models, without expanding the model size. Additionally, a single expanded generator natively supports smooth transitions between domains, as well as composition of domains.


How does it work?

We start by performing an orthogonal decomposition of the model's latent space and identifying hundreds of low-magnitude directions, which we call dormant. Traversing a dormant direction has no perceptible effect on the output, while the remaining directions are sufficient to represent the original domain. We call the subspace spanned by these remaining directions the "base subspace", and we pick one dormant direction for every domain we wish to add. For each new domain, we translate the base subspace along the domain's dormant direction, defining a parallel subspace we call the "repurposed subspace". To capture the new domain, we apply an off-the-shelf domain adaptation method, modified to operate only on latent codes from the repurposed subspace. A regularization loss is applied on the base subspace to ensure that the original domain is preserved. Because the subspaces are parallel and the latent space is disentangled, the factors of variation from the original domain carry over to the new domains.
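As a rough illustration of the idea (not the released implementation), the sketch below assumes the orthogonal basis comes from PCA over sampled W-space codes, a GANSpace-style choice; the names `mapping`, the base/dormant split `k`, and the offset `scale` are all illustrative assumptions.

```python
import torch

def find_latent_basis(mapping, n_samples=100_000, z_dim=512):
    """Return the W-space mean, an orthogonal basis (rows), and per-direction
    variances, sorted from most to least active."""
    with torch.no_grad():
        z = torch.randn(n_samples, z_dim)
        w = mapping(z)                                  # (n_samples, w_dim)
    mu = w.mean(dim=0, keepdim=True)
    _, s, vt = torch.linalg.svd(w - mu, full_matrices=False)
    variances = s ** 2 / (n_samples - 1)
    return mu.squeeze(0), vt, variances

mu, basis, variances = find_latent_basis(mapping)

# High-variance directions span the "base subspace"; the low-magnitude
# remainder is dormant and available for repurposing (split is illustrative).
k = 400
base_dirs, dormant_dirs = basis[:k], basis[k:]

def project_to_base(w):
    """Drop components along dormant directions, keeping the base-subspace part."""
    coeffs = (w - mu) @ base_dirs.T
    return mu + coeffs @ base_dirs

def to_repurposed(w, domain_idx, scale=20.0):
    """Translate a base-subspace code along the dormant direction assigned to
    `domain_idx`, landing in that domain's repurposed subspace."""
    return project_to_base(w) + scale * dormant_dirs[domain_idx]

# During expansion, the adaptation loss would only see codes from
# to_repurposed(w, i), while a regularization loss on project_to_base(w)
# keeps the original domain intact.
```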


Continuous Control

The repurposed latent directions now control gradual transitions between the original and new domains: each new domain is represented just like the factors of variation that already existed in the original generator. The repurposed directions remain meaningful even beyond the segment between the two subspaces; extrapolating past the repurposed subspace exaggerates the new domain, while extrapolating in the opposite direction expresses the inverse relationship between the domains.
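A traversal along one repurposed direction can be rendered as in the sketch below; `G`, `w`, `dormant_dirs`, and `scale` continue the hypothetical names from the sketch above, and the step count and extrapolation range are arbitrary.

```python
import torch

def traverse_domain(G, w, direction, scale, n_steps=7, extrapolate=0.5):
    """Render a strip from the original domain (alpha = 0) to the new domain
    (alpha = scale), plus a mild extrapolation past it."""
    alphas = torch.linspace(0.0, (1.0 + extrapolate) * scale, n_steps)
    with torch.no_grad():
        frames = [G(w + a * direction) for a in alphas]
    return torch.cat(frames)            # stack images for a side-by-side strip

# E.g., gradually move a face toward the domain assigned to direction 3;
# negative alphas would instead extrapolate toward the "opposite" of that domain.
strip = traverse_domain(G, w, dormant_dirs[3], scale=20.0)
```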


On the repurposed subspace

Images generated from the repurposed subspace itself are almost identical to those generated by a generator that was fully adapted to the new domain. The top two blocks display images generated by two distinct generators, adapted with StyleGAN-NADA and MyStyle, respectively. The bottom two blocks are generated from a single generator expanded with 105 domains!


Domain Composition

Since the domains are represented as disentangled linear directions, we can trivially compose them with other domains, creating hybrids, or with other factors of variation that exist in the source domain. This works regardless of the domain adaptation method used to learn the new domain! Starting from a latent code that generates the top-left image, we traverse along two latent directions. For each direction, we note the associated domain, its ordinal number in the latent space's basis, and the training method used to learn the domain.
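As a small illustration, composition amounts to adding several repurposed offsets at once; the sketch reuses the hypothetical names from above, and the domain indices and weights are illustrative.

```python
import torch

# Compose two expanded domains by moving along both of their repurposed
# directions simultaneously.
scale = 20.0                                              # same illustrative offset as above
dir_a, dir_b = dormant_dirs[10], dormant_dirs[25]

with torch.no_grad():
    hybrid  = G(w + scale * dir_a + scale * dir_b)        # both domains fully applied
    partial = G(w + 0.4 * scale * dir_a + scale * dir_b)  # weighted blend of the two
```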


Citation

@inproceedings{nitzan2023domain,
  author    = {Nitzan, Yotam and Gharbi, Micha{\"e}l and Zhang, Richard and Park, Taesung and Zhu, Jun-Yan and Cohen-Or, Daniel and Shechtman, Eli},
  title     = {Domain Expansion of Image Generators},
  booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  year      = {2023},
}