
I've been modifying an LSTM GAN model to use a transformer in the encoder and it seems to do much worse, or at least the training dynamics are very different. Transformers perform great when they work, but in my experience it seems to be a lot harder to get them to work. Can anyone corroborate that, or is it likely that I'm doing something fundamentally wrong? To be clear, I'm not implementing the transformer myself but using PyTorch's Transformer classes as drop-in replacements for my LSTM-based encoder and decoder. I've been trying lots of variations of the hyperparameters, position-encoding methods, etc., but it always either doesn't train at all (generator/discriminator divergence) or it produces blurry images. (The "prenet" and "postnet" are the same as in my reference model, so I find this surprising.) Really frustrating when all the latest results say this should work amazingly well.
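
For concreteness, the kind of swap I mean looks roughly like this (a simplified sketch with made-up dimensions; the actual prenet/postnet and GAN training loop are omitted, and the names here are just placeholders):

    import math
    import torch
    import torch.nn as nn

    class SinusoidalPositionalEncoding(nn.Module):
        # Fixed sin/cos position encoding added to the input features.
        def __init__(self, d_model, max_len=5000):
            super().__init__()
            pos = torch.arange(max_len).unsqueeze(1)
            div = torch.exp(torch.arange(0, d_model, 2) * (-math.log(10000.0) / d_model))
            pe = torch.zeros(max_len, d_model)
            pe[:, 0::2] = torch.sin(pos * div)
            pe[:, 1::2] = torch.cos(pos * div)
            self.register_buffer("pe", pe)

        def forward(self, x):  # x: (batch, seq_len, d_model)
            return x + self.pe[: x.size(1)]

    class TransformerEncoderBlock(nn.Module):
        # Hypothetical drop-in for an LSTM encoder: same (batch, seq, features)
        # in and out, so the surrounding prenet/postnet stay unchanged.
        def __init__(self, d_model=256, nhead=4, num_layers=4):
            super().__init__()
            self.pos = SinusoidalPositionalEncoding(d_model)
            layer = nn.TransformerEncoderLayer(
                d_model=d_model, nhead=nhead, dim_feedforward=4 * d_model,
                dropout=0.1, batch_first=True)
            self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)

        def forward(self, x):
            return self.encoder(self.pos(x))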

Tons of articles like this on "how transformers work", very few on "tips for getting transformers to work in practice."



I'm mostly working on fairly simple image segmentation tasks but in my experience just replacing convolutional layers with attention layers + position embeddings works well. Using convolutional embeddings before the transformer encoder also helps.
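
Something along these lines (a rough sketch, not my actual code; the patch size, image size, and dimensions are placeholders):

    import torch
    import torch.nn as nn

    class ConvEmbedTransformer(nn.Module):
        # A strided conv turns the image into a grid of patch embeddings,
        # learned position embeddings are added, and a standard transformer
        # encoder runs over the flattened patches.
        def __init__(self, in_ch=3, d_model=256, patch=8, img_size=128,
                     nhead=4, num_layers=6):
            super().__init__()
            n_patches = (img_size // patch) ** 2
            self.patch_embed = nn.Conv2d(in_ch, d_model, kernel_size=patch, stride=patch)
            self.pos_embed = nn.Parameter(torch.zeros(1, n_patches, d_model))
            layer = nn.TransformerEncoderLayer(
                d_model=d_model, nhead=nhead, dim_feedforward=4 * d_model,
                dropout=0.1, batch_first=True)
            self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)

        def forward(self, x):                 # x: (batch, in_ch, H, W)
            x = self.patch_embed(x)           # (batch, d_model, H/patch, W/patch)
            x = x.flatten(2).transpose(1, 2)  # (batch, n_patches, d_model)
            return self.encoder(x + self.pos_embed)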

It still takes a lot more epochs to train, though, so you might have to decrease the learning rate of your discriminator by a lot.
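
I.e. something like separate optimizers with a much smaller step size for the discriminator (the networks and numbers below are just stand-ins so the snippet runs):

    import torch
    import torch.nn as nn

    # Stand-ins for the real generator/discriminator.
    generator = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 784))
    discriminator = nn.Sequential(nn.Linear(784, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1))

    # The point: the discriminator gets a much lower learning rate than the
    # generator so it doesn't overpower the slower-training transformer.
    g_opt = torch.optim.Adam(generator.parameters(), lr=2e-4, betas=(0.5, 0.999))
    d_opt = torch.optim.Adam(discriminator.parameters(), lr=2e-5, betas=(0.5, 0.999))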


I admit I do run out of patience when it's been running for quite a while and seems to be really far behind where my LSTM solution was at the equivalent number of iterations. I often stop, adjust things, and try again, when maybe it just needs to run longer. I will try that, thanks.



