Artificial intelligence has bias and discrimination too, and fixing it has become an industry-wide challenge
Fortune China (財富中文網). Translator: 樸成奎

Bias will continue to be a fundamental concern for businesses hoping to adopt artificial intelligence software, according to senior executives from IBM and Salesforce, two of the leading companies selling A.I.-enabled tools.

Companies have become increasingly wary that hidden biases in the data used to train A.I. systems may result in outcomes that unfairly, and in some cases illegally, discriminate against protected groups, such as women and minorities.

For instance, some facial recognition systems have been found to be less accurate at telling dark-skinned faces apart than lighter-skinned ones, because the data used to train such systems contained far fewer examples of dark-skinned people. In one of the most notorious examples, a system used by some state judicial systems to help decide whether to grant bail or parole was more likely to rate black prisoners as having a higher risk of re-offending than white prisoners with similar criminal records.

“Bias is going to be one of the fundamental issues of A.I. in the future,” Richard Socher, the chief scientist at software company Salesforce, said. Socher was speaking at Fortune’s Brainstorm Tech conference in Aspen, Colo.

Dario Gil, director of research at IBM, also speaking at Brainstorm Tech, echoed Socher’s concerns. “We need robust A.I. engineering to protect against unwarranted A.I. bias,” he said.

Gil said IBM was increasingly looking at techniques to provide businesses with a “data lineage” that would record what data a system used to make a decision, how that data was generated, and how and when it was used to make a recommendation or prediction.

Gil said this kind of A.I. audit trail was essential for ensuring accountability, something he said must always reside with human beings. “We have to put responsibility back to who is creating this software and what is their purpose and what is their intent,” he said.
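
To make the “data lineage” idea concrete, here is a minimal Python sketch of an audit trail that records which data sources fed a decision, how each was generated, when it was used, and who is accountable. It is an illustration only, not IBM's implementation: the names `DataSource`, `LineageRecord`, `AuditTrail` and the loan example are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List


@dataclass(frozen=True)
class DataSource:
    """One input to a decision: what the data is and how it was generated."""
    name: str
    generated_by: str


@dataclass(frozen=True)
class LineageRecord:
    """Audit-trail entry linking one prediction to the data behind it."""
    decision_id: str
    prediction: str
    sources: List[DataSource]
    used_at: datetime
    accountable_owner: str  # the human or institution answerable for the decision


class AuditTrail:
    """Append-only, in-memory log of lineage records (hypothetical design)."""

    def __init__(self) -> None:
        self._records: List[LineageRecord] = []

    def record(self, decision_id: str, prediction: str,
               sources: List[DataSource], owner: str) -> LineageRecord:
        # Timestamp the moment the data was used to make the prediction.
        entry = LineageRecord(decision_id, prediction, list(sources),
                              datetime.now(timezone.utc), owner)
        self._records.append(entry)
        return entry

    def sources_for(self, decision_id: str) -> List[str]:
        """Return the names of every data source behind a given decision."""
        return [s.name
                for r in self._records if r.decision_id == decision_id
                for s in r.sources]


# Made-up example: trace the inputs behind one credit decision.
trail = AuditTrail()
trail.record("loan-001", "approve",
             [DataSource("credit_history", "bureau export"),
              DataSource("income", "applicant form")],
             owner="lending-team")
print(trail.sources_for("loan-001"))  # ['credit_history', 'income']
```

A production system would persist these records and protect them from tampering. The sketch only shows the shape of the information Gil describes.
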
“The accountability has to rest with the institutions creating and using this software,” Gil added.

Both Gil and Socher said that eliminating A.I. bias was not an easy problem to solve, especially because machine learning systems were so good at finding correlations between variables in data sets. So, while it was possible to tell such software to disregard race when making, for example, credit recommendations, the system might still use a person’s address or zip code. In the U.S., at least, that information can also be highly correlated with race, Socher said.

Gil said that IBM has been developing software, such as its AI Fairness 360 toolkit, that can help businesses automatically discover such hidden correlations in their data.

But, Socher said, discovering such correlations is one thing. Knowing exactly what to do about them is, in many ways, a much harder problem.

Socher said that in some cases, such as marketing breast pumps, it might be all right to recommend a product only to women. In other contexts, the same sort of gender discrimination in recommendations would be illegal. For a company like Salesforce, which is trying to build A.I. tools general enough for companies in any industry to use for almost any purpose, this presents a particular dilemma, he said.

This is one reason, both Gil and Socher said, many businesses are choosing to train A.I. systems on their own data rather than using pre-trained software packages for tasks such as chatbots or automated image-tagging. Building their own A.I., Gil said, gave businesses more control and more chances to detect hidden biases.

Both Socher and Gil said that one of the great things about A.I. is that it can help companies uncover existing bias in their business practices. For instance, it can identify managers who don’t promote women or financial institutions that don’t extend credit equally to minorities.

“A.I. sometimes puts a mirror in front of our faces and says this is what you have been doing all the time,” Socher said.

He also said that certain types of bias were unlikely to be resolved until the people building A.I. systems were themselves more diverse. At the moment, he said, too many of the computer scientists creating A.I. software are white men. He also said too many of the A.I. applications developed so far reflect the concerns of affluent urbanites. He said this is one reason Salesforce has been supporting projects like the Deep Learning Indaba, a conference designed to bring together A.I. researchers from across Africa.
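
Socher's point that an address or zip code can stand in for race can be shown with a rough Python sketch. It compares how well one can guess a person's group with and without a given feature. This is not the AI Fairness 360 API. The function name `proxy_strength` and the toy data are assumptions made purely for illustration.

```python
from collections import Counter, defaultdict
from typing import Iterable, Tuple


def proxy_strength(pairs: Iterable[Tuple[str, str]]) -> Tuple[float, float]:
    """Return (baseline, with_feature) accuracy of guessing the protected group.

    baseline: always guess the most common group overall.
    with_feature: guess the most common group within each feature value.
    A large gap suggests the feature acts as a proxy for the group.
    """
    pairs = list(pairs)
    if not pairs:
        raise ValueError("need at least one (feature, group) pair")

    overall = Counter(group for _, group in pairs)
    by_value = defaultdict(Counter)
    for value, group in pairs:
        by_value[value][group] += 1

    baseline = overall.most_common(1)[0][1] / len(pairs)
    with_feature = sum(c.most_common(1)[0][1]
                       for c in by_value.values()) / len(pairs)
    return baseline, with_feature


# Made-up toy data: (zip_code, group) pairs, not real statistics.
records = (
    [("zip_A", "group_x")] * 4 + [("zip_A", "group_y")] * 1 +
    [("zip_B", "group_x")] * 1 + [("zip_B", "group_y")] * 4
)
print(proxy_strength(records))  # (0.5, 0.8)
```

In this toy data, knowing the zip code raises the guessing accuracy from 50% to 80%, so dropping the race column alone would not stop a model from inferring it. As Socher notes, detecting such a link is the easier part. Deciding what to do about it depends on the use case.
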