Big Data: A High-Tech "Crystal Ball" for Predicting the Future
Early last year, you might recall, Target found itself at the center of a storm of outrage. The retailer's number crunchers had come up with a statistical method for predicting which of its customers were most likely to become pregnant in the near future, giving Target's marketers a head start on pitching them baby products.

The model worked: Target expanded its customer base for pregnancy and infant-care products by about 30%. But the media brouhaha, with everyone from The New York Times to Fox News accusing the company of "spying" on shoppers, took weeks to die down.

If Target's success at setting its sights on potential moms-to-be gives you the creeps, Eric Siegel's new book could ruin your whole day. Siegel is a former Columbia University professor whose company, Predictive Impact, builds mathematical models that cull valuable nuggets of data from floods of raw information. Companies use the tools to forecast everything from what we'll shop for, to which movies we'll watch, to how likely we are to be in a car accident or default on our credit cards.

In Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die, Siegel explains how these models work and where the pitfalls are, in clear, colorful terms. Simply put, predictive analytics, or PA, is the science of learning from experience. Starting with data about the past and current behavior of a given group of people -- whether customers, patients, prison inmates up for parole, voters, or employees -- analysts can predict what they'll probably do next.

This kind of high-tech crystal ball is behind "the growing trend to make decisions more 'data driven,'" Siegel writes. "In fact, an organization that doesn't leverage its data in this way is like a person with a photographic memory who never bothers to think."

Predictive Analytics is packed with examples of how Citi, Facebook, Ford, IBM, Google, Netflix, PayPal, and many other businesses and government agencies have put PA to work. Pfizer, for instance, has a predictive model to foretell the likelihood that a patient will respond to a given new drug within three weeks. LinkedIn uses PA to pinpoint the fellow members you might want as connections. At the IRS, a mathematical ranking system applied to past tax returns "empowered IRS analysts to find 25 times more tax evasion, without increasing the number of investigations."

And then there's Hewlett-Packard. A couple of years ago, alarmed by annual turnover rates in some divisions as high as 20%, HP decided to try anticipating which of its 330,000 employees worldwide were most likely to quit. Beginning with reams of data on things like salaries, raises, promotions, and job rotations, a team of analysts correlated that information with detailed employment records of people who had already left. Based on the similarities they found, the researchers assigned each current employee a Flight Risk score.
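For readers who want a feel for what a similarity-based score might look like, here is a minimal sketch in Python. It is not HP's model, which the article does not describe in detail. It swaps in a simple nearest-neighbor approach: a current employee's score is the share of the most similar past employees who left. The feature choice, the toy records, and the names `PAST_EMPLOYEES`, `distance`, and `flight_risk` are all assumptions made for illustration.

```python
import math

# Hypothetical toy records, invented for illustration (not HP data):
# (raises_last_3y, promotions_last_3y, job_rotations, left_company)
PAST_EMPLOYEES = [
    (0, 0, 3, True),
    (1, 0, 2, True),
    (0, 0, 2, True),
    (2, 1, 0, False),
    (3, 1, 1, False),
    (2, 0, 0, False),
]


def distance(a, b):
    """Euclidean distance between two equal-length feature tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def flight_risk(employee, past=PAST_EMPLOYEES, k=3):
    """Return the share (0.0 to 1.0) of the k most similar past
    employees who left. `employee` is a 3-tuple of the same features
    stored in the first three fields of each past record."""
    ranked = sorted(past, key=lambda rec: distance(employee, rec[:3]))
    nearest = ranked[:k]
    return sum(1 for rec in nearest if rec[3]) / len(nearest)


if __name__ == "__main__":
    # An employee with no raises or promotions and many rotations
    # resembles the past leavers in this toy data set.
    print(flight_risk((0, 0, 3)))
    # An employee with regular raises and a promotion resembles the stayers.
    print(flight_risk((3, 1, 0)))
```

A production model would need far more data, features on comparable scales (salary is left out here for that reason), and validation against employees whose outcome is already known. The sketch only shows the basic idea of scoring current staff by their resemblance to people who have already quit.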