
Obviously, but the question is, why were there no Black women in the data set, and what care can be taken to prevent racialized bias when selecting the data set in the future?


I would assume these data sets are not manually curated but imported through some automated mechanism.

Other issues which are sure to arise are that the AI will have trouble with people who aren't smiling, that the data set probably contains people who look better than average, and that it almost certainly excludes people with injuries or deformities in their real-world proportions.
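A quick way to make skews like these visible is simply to tally attribute labels before training and compare them against whatever target distribution you have in mind. A minimal sketch in Python, assuming a hypothetical metadata CSV with one row per image; the file name and column names are invented for illustration, not taken from any real dataset:

    import csv
    from collections import Counter

    def audit_attribute(metadata_path, attribute):
        """Tally how often each value of `attribute` appears in the corpus metadata."""
        counts = Counter()
        with open(metadata_path, newline="") as f:
            for row in csv.DictReader(f):
                counts[row[attribute]] += 1
        return counts

    # Hypothetical usage: a "faces_metadata.csv" with columns such as
    # "expression" or "skin_tone" is assumed for the sake of the example.
    counts = audit_attribute("faces_metadata.csv", "expression")
    total = sum(counts.values())
    for value, n in counts.most_common():
        print(f"{value}: {n} ({n / total:.1%})")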

Perhaps an interesting project is simply the compilation of a vast dataset of "world-proportional pictures of people"; realizing such a dataset would be quite an undertaking.


World-proportional is not good enough for this type of task. If we are to rely on AI for things like identifying people in pictures at a trial, we would need equal representation in the data set so the AI doesn't have any kind of systematic bias; otherwise, the AI's bias will compound errors in the real world. So you would need as many pictures of Australian Aborigines in the data set as of Han Chinese people if you wanted to be sure there is no risk that a random person would be confused for someone of an over- or under-represented group.
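To make "equal representation" concrete: one common mitigation is to downsample every group to the size of the smallest one when assembling the training set, so no group dominates the error statistics. A minimal sketch, assuming each record already carries a group label; the field name is illustrative, and this is one possible approach rather than a prescribed method:

    import random
    from collections import defaultdict

    def balance_by_group(records, group_key, seed=0):
        """Downsample so every group contributes the same number of records."""
        by_group = defaultdict(list)
        for rec in records:
            by_group[rec[group_key]].append(rec)
        n = min(len(group) for group in by_group.values())  # smallest group size
        rng = random.Random(seed)
        balanced = []
        for group in by_group.values():
            balanced.extend(rng.sample(group, n))
        rng.shuffle(balanced)
        return balanced

    # Hypothetical usage with an invented "group" label on each record:
    # train_set = balance_by_group(all_records, "group")

The obvious cost is discarding data from the larger groups; alternatives such as per-group loss weighting keep all the data at the price of a more complicated training setup.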


Certainly you can ask these questions, but these are business-process issues, not technical ones. They're unrelated to AI.

My personal take is that you won't see any real movement on this until Black women (or whatever group you choose) comprise a tangible proportion of revenue-generating users. Corporations operate for money and nothing else.


Of course they are related to what we call AI, because what we call AI is primarily dependent on the quality of the business processes behind data selection and testing. If business processes strongly tend to create systematic errors in the results the technology generates, that's an underlying weakness of the technology (an AI trained in China that's bad at recognising white people wouldn't be a counterexample of this phenomenon; it would be the same issue). The utility of the technology then needs to be viewed in the context that it's likely compromised by biases in the business processes of its developers.

Black women or other groups not viewed as the mainstream target for an AI solution aren't going to form a tangible proportion of revenue-generating users if the software doesn't function properly for them. And a lot of the use cases for AI analysis don't involve the unrepresented-in-corpus minority group being the consumer anyway; they involve the tool being used to screen them by a third party who has been sold it on the false premise that it's free from human bias.



