Finetuning in a distributed way with questionable network would be lot more energy/cost inefficient than doing it with a single node or a well connected cluster. Also, you can finetune 70b model for million tokens for $2 in lambda cloud or <$10 in replicate.