
Amazon's Move Away from Oracle Database Led to Prime Day Outage

云技术实践 (Cloud Technology Practice)  · WeChat official account  · Architecture  · 2018-10-24 19:55



The report shows that Amazon struggled to identify the root cause of the Prime Day issue because of a feature it lost after the database was moved over. It also failed to come up with a contingency plan in case of an error in its newly installed database, called Aurora PostgreSQL, the documents show.

In one question, engineers were asked why Amazon's warehouse database didn't face the same problem "during the previous peak when it was on Oracle." They responded by saying that "Oracle and Aurora PostgreSQL are two different [database] technologies" that handle "savepoints" differently.

Savepoints are an important database tool for tracking and recovering individual transactions. On Prime Day, an excessive number of savepoints was created, and Amazon's Aurora software wasn't able to handle the pressure, slowing down the overall database performance, the report said.
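To make the term concrete, here is a minimal sketch of how a savepoint works inside a single transaction. It uses SQLite through Python's standard sqlite3 module purely for illustration; it is not Oracle or Aurora PostgreSQL, and it does not reproduce the failure described in the report. The table name, the function name and the bad_ids parameter are hypothetical.

```python
import sqlite3


def ship_packages(package_ids, bad_ids=frozenset()):
    """Record packages in ONE transaction, using one savepoint per package.

    Hypothetical illustration: the schema and names are assumptions, not
    anything taken from Amazon's systems.
    """
    # isolation_level=None lets us issue BEGIN/COMMIT ourselves.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE shipments (package_id INTEGER PRIMARY KEY)")

    conn.execute("BEGIN")
    for pid in package_ids:
        # Mark a point we can return to without aborting the transaction.
        conn.execute("SAVEPOINT one_package")
        try:
            conn.execute("INSERT INTO shipments VALUES (?)", (pid,))
            if pid in bad_ids:
                raise ValueError(f"package {pid} failed validation")
            # Success: discard the marker, keep the work.
            conn.execute("RELEASE SAVEPOINT one_package")
        except (ValueError, sqlite3.Error):
            # Failure: undo only this package's work, then drop the marker.
            conn.execute("ROLLBACK TO SAVEPOINT one_package")
            conn.execute("RELEASE SAVEPOINT one_package")
    conn.execute("COMMIT")

    rows = [r[0] for r in conn.execute(
        "SELECT package_id FROM shipments ORDER BY package_id")]
    conn.close()
    return rows


if __name__ == "__main__":
    print(ship_packages([1, 2, 3], bad_ids={2}))
```

Calling ship_packages([1, 2, 3], bad_ids={2}) returns [1, 3]: the failed package is rolled back on its own while the rest of the transaction commits. Every loop iteration creates a savepoint, so a busy workload written this way can generate a very large number of them, which is the kind of pressure the report describes.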


Could have happened anyway

"It's quite possible the outage would not have occurred if Amazon had stuck with Oracle," said Matt Caesar, a computer science professor at the University of Illinois at Urbana-Champaign, after CNBC shared the details of the document. "Also, it appears they would have been able to diagnose the problem sooner if they were using Oracle's database, which could possibly have reduced the outage duration."

An Amazon spokesperson played down the issue in an emailed statement and said there was no outage, even though the internal document states that the database "degradation resulted in lags and complete outages."

"It is important to point out that there was never an outage at the facility, and the issue only resulted in delaying shipping of about one percent of packages for a short period of time," the spokesperson said. "This issue was quickly diagnosed and resolved."

The Ohio warehouse is the largest of the 13 warehouses that moved their databases off Oracle prior to Prime Day. During the Prime Day period, it handled over 1.1 million packages per day, the documents say. All services and software that handle inventory and shipping data had been migrated to Aurora in those warehouses.






