> So, the problem we'd like to solve is what set of 250 or so cards has the highest predictive value in that similar answers lead to the best interpersonal matches?
I think ideally you should somehow allow users to indicate what kind of "interpersonal" similarity they are looking for. Perhaps right now I want to find intellectually similar people -- but maybe later I want to find people with the same sense of humor. Hell, maybe I just want to find people interested in eating the same food.
Here's an abstraction that I think could be a killer feature, and could solve a lot of "problems" around matching, privacy, etc: Cardsets.
A cardset would be a set of cards with its own theme. My answers to the cardsets can be hidden/shown at will. I can view similar users based on responses to a single cardset, and my "overall matches" would be a weighted similarity across all cardsets, where I can toggle how important each cardset is.
So now imagine the following cardsets:
- Politics. Swipe left for "bad idea", right for "good idea".
- Humor. Swipe left for "not funny", right for "funny".
- Food. Swipe left for "disgusting", right for "delicious".
- Aesthetics. Swipe left for "ugly", right for "beautiful".
- Hobbies.
- Activities I want to try.
- Places I want to go.
- Fashion, technology, programming, ...etc...
I can now find people that have similar fashion taste to me. Or that want to travel to the same places as me. Or that like the same food as me. There's also the nice side effect that I don't have to swipe on cardsets I don't care about, and that there's a lot of swiping I could do, which is potentially fun. Sky's the limit here!
Furthermore, if you want to get "advanced", add a way for me to find "custom matches", by which I weigh the importance of each cardset. Perhaps I want to find people that match based on "Politics" and "Humor" ... so I'd set those as having high weight and the others lower.
You now have a channel by which you can add endless amounts of content, and can iteratively improve your matching and engagement. Adding a new cardset opens up an entirely new set of matches to all of your users, and gives them something to swipe.
I think this concept is extremely powerful and valuable, and opens up endless avenues for future growth. Imagine brands being able to fill out cardsets for "aesthetics", or "food", etc. Now you can match users to brands they care about.
Lastly, to light a fire under your ass, if you don't commit to doing this, I will. The more I think about it, the more I think this is something that could catch on, especially if you let users create cardsets and add other viral features (eg: ability to send a link to someone to fill out a cardset to find out how much they match me).
Another fantastic reply! I really need to get this to take off already so I can hire you before you make your own and out-compete me.
The more I read this reply, the more I agree with it, and I think it may actually not be that difficult of a change. Card sets could be integrated into the existing cluster concept and I could just give users the ability to choose which sets (clusters) they swipe on and the weighting that they apply to each. They could also decide which clusters should be factored into their similarity matching. I _think_ this will all work with the existing CUBE concept, which is exciting, because many other proposed solutions by others didn't fit nicely within that mathematical structure.
You've honestly given me a lot to think about and I think I see a better way forward now. Your insight really increased my mood because I think you've discovered something very important that I am likely going to be spending quite a bit of time on in the next coming weeks and months.
I would advise against folding it into clusters, and instead have each cardset be its own thing. What you have right now is a cardset that I'd label "General" -- abstract the backend so you can create different sets of cards ("Food", "Fashion", etc). If you manually curate cardsets, you won't even need to worry about clustering. Yes, it's a shame you spent time on it, but no clustering will ever beat manually creating sets of cards.
In terms of producing a "total match score" with a user, you compute a match score for each cardset that both users have, then use a simple normalized linear combination to get the total.
If users A and B have cardsets X, Y, and Z in common, you would produce similarity scores S(A,B,X), S(A,B,Y), and S(A,B,Z). Then you take the weights that user A selects for each cardset (W(X), W(Y), W(Z)), normalize them so that they add up to 1 but maintain their ratios, and compute the total similarity of A matched to B as: W(X) * S(A,B,X) + W(Y) * S(A,B,Y) + W(Z) * S(A,B,Z).
As long as you have pre-computed the scores for every pair of users on each cardset (your scaling pain point), computing match scores even with weights is trivial and fast.
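A minimal sketch of that combination, assuming the per-cardset scores are already computed and live in a plain dict. The function name, the dict layout, and the sample numbers are made up for illustration; they are not from your codebase.

```python
from typing import Dict


def total_similarity(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted match score of user A against user B.

    scores:  cardset name -> precomputed similarity of A and B on that cardset
             (only cardsets both users have answered)
    weights: cardset name -> importance chosen by user A (any non-negative number)
    """
    # Only shared cardsets that A gave a positive weight can contribute.
    shared = [c for c in scores if weights.get(c, 0.0) > 0.0]
    weight_sum = sum(weights[c] for c in shared)
    if weight_sum == 0.0:
        return 0.0  # nothing to compare on
    # Dividing by the sum makes the weights add up to 1 while keeping their ratios.
    return sum(weights[c] / weight_sum * scores[c] for c in shared)


# Hypothetical numbers: A cares three times as much about Politics as Humor.
scores = {"Politics": 0.9, "Humor": 0.5, "Food": 0.1}
weights = {"Politics": 3, "Humor": 1, "Food": 0}
print(total_similarity(scores, weights))
```

Note that the score is not symmetric: B may weight the same cardsets differently, so "A matched to B" and "B matched to A" can differ.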
> You've honestly given me a lot to think about and I think I see a better way forward now. Your insight really increased my mood because I think you've discovered something very important that I am likely going to be spending quite a bit of time on in the next coming weeks and months.
Happy to hear. I've been working on my own project for close to a year and am close to launch, so I think I understand where you are coming from.
It is practically impossible to view your product as someone unfamiliar with it would. So, that leaves you asking for feedback. Next, it is really difficult to distill user feedback (such as found on this thread) into things you should actually work on. Is a comment just a vocal minority complaining or an indication that some concept should be changed? I think you're doing a good job taking feedback to heart and I'm really rooting for you.
> Another fantastic reply! I really need to get this to take off already so I can hire you before you make your own and out-compete me.
If/when it takes off, just make me an adviser and shoot a couple percentage points my way!
Firstly, congrats on the launch. Designing and implementing a new concept from scratch, by yourself, is definitely fraught with difficulties. It's hard to know if what you're designing is "right" and if you're implementing "the right way". Sticking with it for an entire year and actually launching is definitely something you should feel good about.
I think this is a great idea, and for a first version the execution is actually quite good. Once I figured it out, I enjoyed using it and flew through all of the cards. I looked at a lot of matches and even sent a few messages. Overall, I can see myself coming back for random chats from time to time.
Here is my feedback, in order of what I perceive to be most important.
1. It was not at all obvious to me how to swipe. I tried swiping the card but it didn't move. So I clicked the arrows. Since I "agreed" with the first 4 or so cards, I did not realize the arrows were to go back and forth. (In fact, I would not have ever expected back and forward to be an option). Then I realized that the "help button" thing was actually how to slide, and I got it from there.
I suggest you make the slider thing look "slideable" and obvious that left is disagree (red), right is agree (green). Right now, it definitely is easy to see it as a "help" button.
I also suggest you make the card navigation arrows not ambiguous -- I think it is fair to say people will assume they mean "agree" and "disagree" rather than "next" and "prev". Either provide context so that it's obvious what they do, or use words "next", "prev".
Here is a crappy mock-up that hopefully shows what I mean: https://jsbin.com/wekuzebasu/ - It shows progress, makes it obvious what next/prev do, and (for me at least) makes it much more obvious what the call to action is and how to operate it. (You could replace "|||" with "<->", or show arrows next to the slider circle for even more emphasis).
2. I really love the "percent" agree/disagree. I think that's way better than just "yes/no" and it greatly increases my confidence that I'll get better results.
3. I think your initial card set is actually quite good. There are a few where it is ambiguous what "left" and "right" mean. Agree / Disagree? Like / Dislike? Support / Don't support? Care / Don't care?
4. If I were you, it would be very easy for me to worry about spam/abuse/gaming of this. But, honestly, your biggest problem will be overcoming the network effect to get a critical mass of users to make this better. If for an average user there are more similar people, and they are closer, it becomes much more compelling for more users to join -- leading to a virtuous cycle. If you want this to catch on, your #1 concern should be getting the most users in the least amount of time.
Handling spam/abuse is a problem you want to have, and I'd put it off until it's necessary and/or until it's clear it is slowing user growth.
5. I've read all of your comments, and I think you're making some very good decisions and are putting a lot of thought into this. I hope you stick with it.
6. The clustering is good and does provide some level of "privacy" -- though for my "similar" people I often do not see any clusters at all. Anyway, this seems like the type of app where I want people to know my opinion on stuff, and if there's a question for which I want my opinion to be unknown, I'll just skip it.
-----
Now a question regarding the matching. Is there a difference between "unanswered", "skipped", "0% (as opposed to -X% or +X%)" ? Consider these cases:
1) I don't care about a card. This is distinct from being "0%" -- I don't want this card counted towards producing my matches. If someone is 100% or -100% on it, don't penalize the similarity.
2) I am neutral about a card, but I do care if somebody is +/- on it. Do consider this in similarity.
3) I have not answered a card. This should not count towards similarity.
4) I previously had an opinion on a card, but now I don't care. How do I "reset" it?
Depending on how this is handled, you'll need to change the UI to make it more clear. Eg, you may need a "skip" button.
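To make the distinction concrete, here is a rough sketch of a similarity that treats "no opinion" differently from "neutral". It is not how the site works today; the function name, the [-1.0, 1.0] answer scale, and the mean-absolute-difference formula are all assumptions picked for illustration.

```python
from typing import Dict, Optional

# card id -> answer in [-1.0, 1.0]; None means skipped / "don't care".
# A card that is missing from the dict has simply never been answered.
Answers = Dict[str, Optional[float]]


def similarity(a: Answers, b: Answers) -> Optional[float]:
    """Return a similarity in [0, 1], or None if no card was answered by both users."""
    diffs = []
    for card, va in a.items():
        vb = b.get(card)  # None if b skipped the card or never saw it
        if va is None or vb is None:
            continue  # cases 1 and 3: the card neither helps nor hurts the match
        # Case 2: an explicit 0.0 is a real answer and is compared like any other.
        diffs.append(abs(va - vb) / 2.0)  # 0 = identical, 1 = opposite ends
    if not diffs:
        return None
    return 1.0 - sum(diffs) / len(diffs)


me = {"c1": 1.0, "c2": 0.0, "c3": None}                  # c3: I don't care
other = {"c1": 1.0, "c2": -1.0, "c3": -1.0, "c4": 1.0}   # c4: I never answered
print(similarity(me, other))
```

Under this scheme case 4 ("reset") is just setting the stored answer back to None, which is what a "skip" button would do.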
Wow, this is an incredible reply. Thanks so much for such an in-depth analysis and compliments!
I actually really like your mock-up and really appreciate that you took the time to make it. Hopefully you don't mind if I take quite a bit of inspiration from it, because I definitely think it looks much better than my current one.
Yeah, I agree that the network effect is one of the biggest problems. Retention seems very tricky with a concept like this because while posting it to reddit or HN can result in many registrations, few people stick around because there isn't much to do on the site because there are so few users at the moment. It honestly does make motivation pretty difficult, but I am indeed going to stick with it regardless.
With regard to your last 4 numbered points, currently 0% == not answered == neutral, and this is mostly due to technical limitations. Your default similarity cube is just (0.0, 0.0, 0.0, 0.0, 0.0, ..., 0.0) (50 zeroes), and voting on cards adjusts this accordingly. So if you answer 100% to every card in the first cluster and answer no other cards, then your CUBE would look like (1.0, 0.0, 0.0, 0.0, 0.0, ..., 0.0).
I was thinking about a potential way to solve this, but as of right now due to the constraints of CUBE similarity matching, this is the current solution.
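To illustrate why the three states collapse into one, here is a toy version of such a vector. The real CUBE update rule is not spelled out in this thread, so the per-cluster averaging below is only an assumption that happens to reproduce the two examples above; the names are hypothetical.

```python
from typing import Dict, List


def build_vector(clusters: List[List[str]], answers: Dict[str, float]) -> List[float]:
    """One coordinate per cluster: the mean answer over that cluster's cards.

    Assumption: an unanswered card contributes 0.0, exactly like a neutral answer.
    """
    return [
        sum(answers.get(card, 0.0) for card in cards) / len(cards) if cards else 0.0
        for cards in clusters
    ]


# Three tiny clusters instead of 50, to keep the output readable.
clusters = [["c1", "c2"], ["c3", "c4"], ["c5"]]
print(build_vector(clusters, {}))                        # new user: all zeros
print(build_vector(clusters, {"c1": 1.0, "c2": 1.0}))    # 100% on the first cluster only
print(build_vector(clusters, {"c5": 0.0}))               # explicit neutral: same as a new user
```

Once an answer has been folded into a coordinate like this, there is no way to tell afterwards whether a 0.0 came from a skip, a neutral vote, or no vote at all.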
Regarding the 0.0/default/skipped -- see my other response about "cardsets", which the more I think about it, the more I am convinced will solve nearly all of your issues:
- you don't have to worry about the skipped vs neutral problem (as it is fair to expect a user to answer all cards in a cardset).
- you don't have to worry about clustering. a cardset is a "cluster".
- you can work around technical limitations by having each cardset be its own cube
- you boost retention by being able to have a library of dozens of cardsets (rather than 250 cards). limitless swiping, and swiping on things that your users will find interesting because they are selecting cardsets that are important to them.
- you can make your site viral by allowing users to send a link where the receiver can (anonymously) swipe all the cards of a cardset and see their "similarity" to the link sender. (Then you can ask them to register to be able to send links themselves).
- future paths for revenue are endless
I think you are 100% right here. Wow, thanks so much. This really is amazing. I'm going to begin work on this soon (just need to get some sleep first!)
Also there is another aspect to cardsets that might be worth thinking about, which is that they are like "polling" done in a fun and easy way (swiping on pictures) -- the fact that you can also show people that answered similarly to me doesn't need to be the focus.
Having polls like this can help you grow the site. For example, imagine I make a cardset for "Beatles albums" where I put in each Beatles album, and shoot the link out to, say, r/Beatles or whatever. Your site can let people swipe, then show the aggregate % for each of the cards, thus ranking them.
Of course... having viewed this really fun and quick poll, random pollers might wish to make their own new polls and spread your site.
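The aggregation for such a poll is tiny. A sketch, assuming each response is stored as a mapping from card name to "swiped right or not" (the storage format and function name are made up):

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def rank_cards(responses: List[Dict[str, bool]]) -> List[Tuple[str, float]]:
    """Return (card, percent of right-swipes), sorted from most to least liked."""
    right = defaultdict(int)
    total = defaultdict(int)
    for response in responses:
        for card, swiped_right in response.items():
            total[card] += 1
            right[card] += int(swiped_right)
    # Percentages are taken over people who actually swiped that card.
    percents = [(card, 100.0 * right[card] / total[card]) for card in total]
    return sorted(percents, key=lambda item: item[1], reverse=True)


responses = [
    {"Abbey Road": True, "Revolver": True},
    {"Abbey Road": True, "Revolver": False},
    {"Abbey Road": True},  # this person stopped after one card
]
for card, percent in rank_cards(responses):
    print(f"{card}: {percent:.0f}%")
```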
Enjoy your sleep.. you have a long journey ahead of you!
As I understand it, VCs don't acquire companies, they invest in them. The difference being that an acquirer buys and runs the company, while a VC only buys a small portion of it and gains exposure to potential upside.
In my case, I'd like to go from an owner to something like a VC -- that is, I'd like to liquidate my company but still maintain a small potential upside. What I don't understand is how this would work, as the acquisition itself would be the upside.
> You could negotiate retaining x% of the stock (along with rules regarding stock type, splits, etc.).
Is this at all common? I would imagine that upon acquisition the buyer would prefer to simply roll the company into their own, rather than have to run it as a separate entity. Though, like you said, I imagine this can be negotiated.
I am unsure how common it is, but I do know that for many entrepreneurs it can sometimes be painful to sell your company/product/culture/idea that you have worked on for so long and for so hard, so I think it is a way for creators to remain part of their creation. :-)
I use honey, but I only enable it on the page before I'm about to check out, then disable it after it's found a discount.
Granted, during the time it is activated they could do whatever they want with my data (including hijacking my session), but at least this way they don't track everything.
I would like to be able to delimit strings with any character(s) of my choosing. The default could be single or double or triple quotes, but don't force me. I am sick of escaping characters in literals; I'd rather define a delimiter that I know isn't in the string. Same for templating.
let name = "Jane";
let str = s{}:John said "Don't do that, {name}!":s
I would also like a JSON-like shorthand for declaring nested structures, without having to define each individual component separately. Eg:
struct Person {
user: {string name, int age},
address: {string street, ...}
}
let p = new Person(<json>);
print(p.user.age);
Convenience with arrays (python slicing is nice), maps, and primitive types pays huge dividends.
arr.first();
arr.last();
arr[5,-5]; //5th to 5th to last
Built-in arbitrary length/precision numbers would be neat.
Besides that, tons and tons of useful libraries: IO, Date, JSON, DB Bindings, etc.
What if the data I have is relational? Must it all be squashed down? Imagine a family tree: can a "node" simply contain "parents", and the model will figure out "virtual columns", or would I generate this input manually (for thousands of columns), example columns: "parent(0).parent(0).child(0).blood_type", "parent(0).parent(0).child(1).blood_type", etc.
Consequentially... I will definitely have more questions like this. What is a good resource to go off on my own, or with a more suitable audience?
In most cases I've seen, the model lives on its own, with only very surface-level connections with your system. It can't "look up" stuff it needs, such as the query you mention.
There are graph-based systems but I think they're more interested in understanding relationships in the graph itself - grouping like things together, predicting relationships and distances, etc., rather than attributes of the nodes.
I assume you want the system to learn something like "if parent A is type O, and grandparent of parent B is B+, child is more likely to be tall" or similar. I don't think the network can learn things like that in terms of understanding the linkages to predict numbers. It might be able to predict it simply by being given enough examples after "flattening," though, so the functional result is similar.
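For what it's worth, the "flattening" step does not have to be done by hand. A rough sketch, assuming each person is a nested dict where list-valued keys hold relatives and everything else is an attribute (that layout, the function name, and the sample data are invented for illustration):

```python
from typing import Any, Dict


def flatten(node: Dict[str, Any], prefix: str = "", depth: int = 2) -> Dict[str, Any]:
    """Turn a nested family-tree node into flat "virtual column" names.

    `depth` caps how many hops away from the starting person are followed,
    so the number of generated columns stays finite.
    """
    row: Dict[str, Any] = {}
    for key, value in node.items():
        if isinstance(value, list):      # a list of relatives
            if depth == 0:
                continue                 # too far away: drop it
            for i, relative in enumerate(value):
                row.update(flatten(relative, f"{prefix}{key}({i}).", depth - 1))
        else:                            # a plain attribute such as blood_type
            row[f"{prefix}{key}"] = value
    return row


person = {
    "blood_type": "A",
    "parent": [
        {"blood_type": "O", "parent": [{"blood_type": "B"}]},
        {"blood_type": "AB"},
    ],
}
for column, value in flatten(person).items():
    print(column, "=", value)
```

Different people will produce different sets of columns (not everyone has a known grandparent), so the rows would still need to be aligned and the gaps filled before training.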
I've found Reddit's MLQuestions [0] group to be interesting and sometimes accessible for a non-academic ML enthusiast. Some Youtube content can be useful too, but most are just repeating the content of papers. I'm still seeking a real commoner-level message board to discuss this kind of stuff without dumbshaming.
What I've learned most from is downloading example code and actually using it, reading parts of it, and trying to apply it. It's easy to dip your toes into stuff on Google Colab notebooks.
Uber Ludwig [2] is an interesting all-in-one low-code system that lets you try out different ideas quickly. At least in theory. They give example cases for all the different types of networks they support, along with matching YAML to specify the model details. So you can sorta just throw some data in there and build a command line to try ideas, rather than learning a lot about Pytorch, Keras, etc., and potentially introducing subtle bugs.
Email me (addr in profile) if you want to chat more.
Perhaps you can help me with some follow-up questions:
- Let's imagine a standard excel sheet (a 2d array), with columns "A", "B", "C", "D" ... "Z".
1) Let's say I want to create a model, that takes this input: A single row where all columns have values, except for some (random) columns. I want it to autopopulate those columns with values that are most probable according to the data I trained it with. That is, _every_ column is both a "feature" and a "target". Is this possible?
2) Can I train the model by telling it this: "This input should DEFINITELY NOT confuse you" -- that is, can I "weigh" the inputs (or do I just put in more of them?)
1. You can predict multiple values, but it still has to be trained on target values for those features. So, you could predict "D", "P", and "Z", but not any at random - you'd have to design it that way. Look into "multidimensional regression".
2. That seems logical -- supplying a "confidence level" with the training data itself -- but I haven't heard of it, and can't seem to find anything on the search engines.
From a UI perspective, as well as the perspective of somebody trying to use this without possibly burning an e-mail address, please have a "preview" option before "audience".
You can look at the "legal" aspect of it by Googling "is it legal to scrape". My understanding (IANAL) is that it is OK as long as the user agreement is "opt-out" as opposed to "opt-in" (eg: clicking a consent box before viewing the content). You'll have to read up on this yourself and weigh the risk/reward of doing your project. I (NAFL) would assume the risk is quite small. The reward -- that's your call.
As for the other part, getting the data: it's called scraping. Depending on your experience with scraping, you may need to pay for certain aspects of it (eg, getting a large list of proxies so Reddit does not block you, or using a scraping API). Or maybe your project is small enough (or time constraints large enough) that you can slowly siphon the data via your own means.
As per Reddit allowing it: Refer to the legality of scraping, and apply it to Reddit.