A trio of viral videos allegedly depicting the actor Tom Cruise performing a magic trick, telling a not-so-funny joke, and practicing his golf swing are some of the most sophisticated examples yet seen of deepfakes, highly convincing fake videos created using A.I. technology, according to experts in the forensic analysis of digital images.
The three videos, which were posted last week on the social media platform TikTok from an account called @deeptomcruise, have collectively been viewed about 11 million times. The account has garnered more than 342,000 followers and 1 million likes from other users of the platform.
The person or people behind @deeptomcruise have not yet been definitively identified, but Cruise impersonator Evan Ferrante told the website Mic over the weekend that he believed the videos were the work of an actor named Miles Fisher, who resembles Cruise and has done impressions of him in the past. Several people on social media sites also said they believed Fisher is depicting Cruise in the videos, with his face modified using deepfake technology.
Hany Farid, a professor at the University of California at Berkeley who specializes in the analysis of digital images, says he is convinced that the videos are deepfakes but that they are “incredibly well done.”
According to an analysis by Farid and one of his graduate students, Shruti Agarwal, there are a few tiny pieces of evidence that give away the fact that the videos are A.I.-generated fakes. In one video, in which Cruise seems to perform a magic trick with a coin, Cruise’s eye color and eye shape change slightly at the end of the video. There are also two unusual small white dots seen in Cruise’s iris—ostensibly reflected light—that Farid says change more than would be expected in an authentic video.
The A.I. methods used to create deepfakes often leave subtle visual oddities in imagery and videos they create—inconsistencies in eye color or shape, or strange ear contours or anomalies around the hairline.
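As a rough illustration of the kind of frame-to-frame inconsistency being described, and not the procedure Farid and Agarwal actually used, the hypothetical sketch below tracks how the average color of a fixed eye-region crop drifts across a video. In an authentic clip that value should stay fairly stable, while a partially synthesized eye region may show unexplained jumps. The eye_box coordinates are assumed to come from a separate face-landmark step.

```python
import numpy as np

def eye_color_drift(frames: list[np.ndarray], eye_box: tuple[int, int, int, int]) -> np.ndarray:
    """Frame-to-frame change in the mean RGB color of a fixed eye-region crop.

    frames: list of H x W x 3 image arrays; eye_box: (x0, y0, x1, y1) pixel coordinates.
    Returns one value per consecutive pair of frames; unusually large values are a
    possible (not conclusive) hint of manipulation in that region.
    """
    x0, y0, x1, y1 = eye_box
    # Mean color of the eye crop in each frame, as an array of shape (num_frames, 3).
    means = np.array([f[y0:y1, x0:x1].reshape(-1, 3).mean(axis=0) for f in frames])
    # Euclidean distance between the mean colors of consecutive frames.
    return np.linalg.norm(np.diff(means, axis=0), axis=1)
```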
Deepfakes are most often used to swap one person’s head or face for another’s as opposed to generating the entire body, and Farid notes that the hands performing the coin trick don’t look like the real Cruise’s hands. Presumably they belong to an actor, who was filmed performing the coin trick and then had Cruise’s face substituted for his.
Farid also says that while a true deepfake often involves a full-face swap, a more convincing result can sometimes be obtained by using the A.I. technique to generate only a portion of the face. He and Agarwal suspect that this is the case with the three Cruise videos. They think that the mouth is probably real, but that the eye region has been created with deepfake technology.
“This would make sense if the actual person in the video resembles Cruise, did some good work with makeup perhaps, and the swapping of the distinct eyes is enough to finalize a compelling likeness,” Farid says. “It is also possible that there was some postproduction video editing.”
Deepfakes are created using a machine-learning technique called a GAN (generative adversarial network), in which two deep neural networks—a type of machine learning loosely based on the way the human brain works—are trained in tandem. One network is trained from pictures or videos of the real Cruise to generate new images of Cruise in different settings or poses that are realistic enough to fool the other network, which is trained to pick out images of Tom Cruise from those of other people.
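To make that training dynamic concrete, here is a minimal, hypothetical sketch of a GAN training loop in PyTorch. It is not the code behind the Cruise videos: the two tiny fully connected networks stand in for the much larger face-synthesis models used in practice, and real_faces stands in for a dataset of photos of the target person.

```python
# Minimal GAN training loop (illustrative only; real face-swap models are far larger
# and operate on video frames rather than flattened vectors).
import torch
import torch.nn as nn

latent_dim, image_dim = 100, 64 * 64 * 3  # hypothetical sizes for a tiny example

# Generator: maps random noise to a (flattened) synthetic face image.
generator = nn.Sequential(
    nn.Linear(latent_dim, 512), nn.ReLU(),
    nn.Linear(512, image_dim), nn.Tanh(),
)

# Discriminator: outputs the probability that an image is a real photo of the target.
discriminator = nn.Sequential(
    nn.Linear(image_dim, 512), nn.LeakyReLU(0.2),
    nn.Linear(512, 1), nn.Sigmoid(),
)

loss_fn = nn.BCELoss()
g_opt = torch.optim.Adam(generator.parameters(), lr=2e-4)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=2e-4)

def train_step(real_faces: torch.Tensor) -> None:
    """One round of the two-network contest: real_faces is a batch of genuine
    target-person photos, flattened to image_dim and scaled to [-1, 1]."""
    batch = real_faces.size(0)
    real_labels = torch.ones(batch, 1)
    fake_labels = torch.zeros(batch, 1)

    # 1) Train the discriminator to tell real photos from generated ones.
    fakes = generator(torch.randn(batch, latent_dim)).detach()
    d_loss = (loss_fn(discriminator(real_faces), real_labels)
              + loss_fn(discriminator(fakes), fake_labels))
    d_opt.zero_grad()
    d_loss.backward()
    d_opt.step()

    # 2) Train the generator to produce images the discriminator accepts as real.
    g_loss = loss_fn(discriminator(generator(torch.randn(batch, latent_dim))), real_labels)
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()
```

The two loss terms pull in opposite directions, which is the "adversarial" part: as the discriminator gets better at spotting generated images, the generator is pushed to produce output that looks more and more like genuine photos of the target.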
As with most A.I. methods, the amount and quality of the data help determine how good the system is. That helps explain why Cruise has been a frequent target for deepfakes: He is one of the most photographed celebrities on the planet. All that data makes it easier to train a very good Tom Cruise image generator.
Farid says it also doesn’t hurt that Cruise has a distinctive voice and mannerisms that add to the entertainment value and social media virality of deepfakes involving him.
Prior to the current trio of Cruise videos, one of the most widely circulated and uncanny examples of a deepfake also involved Cruise. Released last year by a person who goes by the Internet handle Ctrl Shift Face, who has created a number of highly realistic deepfakes, it involves a video of the comedian Bill Hader doing an impersonation of Cruise on the David Letterman show in 2008. Deepfake technology is used to modify the video so that Hader's face seamlessly morphs into Cruise's as he does the impression.
Deepfakes first surfaced in 2017, about three years after GANs were invented. Some of the earliest examples were videos in which the head of a celebrity was swapped onto the body of an actress in a pornographic film. But since then the technique has been used to create fake videos of many different celebrities in different settings. There is now off-the-shelf software that enables users to create fairly convincing deepfakes, and security researchers have become increasingly alarmed that deepfakes could be used for sophisticated political disinformation campaigns. But so far, despite a couple of possible examples that are still being debated by experts, deepfakes have not become a major factor in disinformation efforts.
While today's deepfakes are usually identifiable with careful digital forensic analysis, this process is time-consuming and requires a certain amount of expertise. Researchers are working to create A.I. systems that can automatically identify deepfakes, and Facebook in 2019 launched an annual competition to find the best of these. But in its inaugural run, the top-performing system was accurate only 65% of the time.
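Automated detectors of this kind commonly score individual frames with a trained classifier and then aggregate those scores over the whole clip. The hypothetical sketch below shows only that aggregation step; it is not the Facebook challenge code, and score_frame stands in for whatever trained frame-level model a real system would use (assumed to return a probability that the frame is fake).

```python
# Illustrative aggregation of per-frame "fake" scores into a video-level verdict.
from typing import Callable

import cv2  # OpenCV, used here only to read frames from the video file
import numpy as np

def score_video(path: str,
                score_frame: Callable[[np.ndarray], float],
                threshold: float = 0.5) -> bool:
    """Return True if the average per-frame fake score exceeds the threshold."""
    cap = cv2.VideoCapture(path)
    scores = []
    while True:
        ok, frame = cap.read()
        if not ok:  # no more frames to read
            break
        scores.append(score_frame(frame))
    cap.release()
    return bool(np.mean(scores) > threshold) if scores else False
```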
Agarwal says it is possible to create deepfakes of the quality seen in the three Cruise videos using commercial software for deepfake generation. But doing so requires some skill, as well as a significant amount of data and training time for the A.I. system involved—and that training time can be expensive. So whether it would have been worth that sort of effort and cost for a viral TikTok video remains uncertain.