正文
4. 在诸如谷歌的“大数据公司”里头,数据科学家天天进行一系列实验,监控几千上万个指标,可以随意把一条广告里面的蓝色字体改为红色,把两个版本分发到千万个人面前,然后记录哪一条广告的点击量更大,
这样的有效反馈,是用来调整算法、精细优化产品体验。
我对谷歌的数据操作持有许多保留意见,但是至少,这样的实验相当有效,符合统计学规律。
At Big Data companies like Google, by contrast, researchers run constant tests and monitor thousands of variables. They can change the font on a single advertisement from blue to red, serve each version to ten million people, and keep track of which one gets more clicks. They use this feedback to hone their algorithms and fine- tune their operation. While I have plenty of issues with Google, which we’ll get to, this type of testing is an effective use of statistics.
5. 若亚马逊的推荐模型出现相关性偏差,比如,开始给00后女孩子推的书都是草坪修剪的工具书,网站的点击数必定骤降,而工程师也会以此为反馈,持续调整模型,直到匹配妥当为止。若没有这样的反馈,统计引擎不断输出的分析结果,或有偏差乃至害处,更重要的是,
模型从不知错,无从改进
。
If Amazon. com, through a faulty correlation, started recommending lawn care books to teenage girls, the clicks would plummet, and the algorithm would be tweaked until it got it right. Without feedback, however, a statistical engine can continue spinning out faulty and damaging analysis while never learning from its mistakes.