I would advise against folding it into clusters, and instead have each cardset b...

I would advise against folding it into clusters, and instead have each cardset be its own thing. What you have right now is a cardset that I'd label "General" -- abstract the backend so you can create different sets of cards ("Food", "Fashion", etc). If manually curate cardsets, you won't even need to worry about clustering. Yes, it's a shame you spent time on it, but no clustering will ever beat manually creating sets of cards.

In terms of producing a "total match score" with a user, you compute a match score for each cardset that both users have, then use a simple normalized linear combination to get the total.

If users A and B have cardsets X Y and Z in common, you would produce similarity scores "S" for S(A,B,X), S(A,B,Y), and S(A,B,Z). Then, you use the weights that user A selects for each cardset (W(A), W(B), W(C)), normalize such that they add up to 1 but maintain their ratios, and compute total similarity of A matched to B as: W(A) * S(A) + W(B) * S(B) + W(C) * S(C).

As long as you have pre-computed the scores between all user-cardset pairs (your scaling pain point), computing match scores even with weights is trivial and fast.