I. Google Papers Series
1. Translator's preface to the Google papers series
2. The anatomy of a large-scale hypertextual Web search engine
5. MapReduce: Simplified Data Processing on Large Clusters
6. Bigtable: A Distributed Storage System for Structured Data
7. Chubby: The Chubby lock service for loosely-coupled distributed systems
8. Sawzall: Interpreting the Data: Parallel Analysis with Sawzall
9. Pregel: A System for Large-Scale Graph Processing
10. Dremel: Interactive Analysis of Web-Scale Datasets
11. Percolator: Large-scale Incremental Processing Using Distributed Transactions and Notifications
12. Megastore: Providing Scalable, Highly Available Storage for Interactive Services
13. Case Study GFS: Evolution on Fast-forward
14. Google File System II: Dawn of the Multiplying Master Nodes
15. Tenzing - A SQL Implementation on the MapReduce Framework

II. Distributed Systems and SQL Theory Series
00. Appraising Two Decades of Distributed Computing Theory Research
0. How to Build a Highly Available System Using Consensus
1. Translator's preface to the distributed theory series
2. A brief history of Consensus, 2PC and Transaction Commit
3. The Byzantine Generals Problem --Leslie Lamport
4. Impossibility of distributed consensus with one faulty process
5. Leases: the lease mechanism
7. The Part-Time Parliament --Leslie Lamport
8. Fast Paxos --Leslie Lamport
9. Paxos Made Live - An Engineering Perspective
10. Uniform consensus is harder than consensus
11. The Transaction Concept: Virtues and Limitations --Jim Gray
12. 2PC (two-phase commit): Notes on Data Base Operating Systems --Jim Gray
13. 3PC (three-phase commit): Nonblocking Commit Protocols
14. Life beyond Distributed Transactions: an Apostate's Opinion
15. A Comparison of the Byzantine Agreement Problem and the Transaction Commit Problem --Jim Gray
16. Consensus on Transaction Commit --Jim Gray & Leslie Lamport
21. Time, Clocks, and the Ordering of Events in a Distributed System --Leslie Lamport
22. Distributed Snapshots: Determining Global States of a Distributed System --Leslie Lamport
23. Virtual Time and Global States of Distributed Systems
24. Timestamps in Message-Passing Systems That Preserve the Partial Ordering
25. Fundamentals of Distributed Computing: A Practical Tour of Vector Clock Systems

III. NoSQL Theory Series
0. Towards Robust Distributed Systems: Brewer's 2000 PODC keynote
1. CAP theory
2. Harvest, Yield, and Scalable Tolerant Systems
3. Brewer's conjecture and the feasibility of consistent, available, partition-tolerant web services
4. The BASE model: BASE: An ACID Alternative
5. Eventual consistency
6. Scalability design patterns
7. Scalability principles
8. The NoSQL ecosystem
9. scalability-availability-stability-patterns
10. The 5 Minute Rule and the 5 Byte Rule
11. The Five-Minute Rule 20 Years Later (and How Flash Memory Changes the Rules)
12. The debate over MapReduce
15. MapReduce and parallel databases: friends or foes? (reposted)
16. MapReduce and Parallel DBMSs: Friends or Foes? (translation)
17. MapReduce: A Flexible Data Processing Tool (translation)
18. A Comparison of Approaches to Large-Scale Data Analysis (translation)

IV. Basic Algorithms and Data Structures
2. Summary of methods for processing large and massive data sets (continued)
3. Consistent Hashing And Random Trees
4. Merkle Trees
5. Scalable Bloom Filters
6. Introduction to Distributed Hash Tables
7. B-Trees and Relational Database Systems
8. The log-structured merge-tree
10. Data Structures for Spatial Database
11. Gossip
13. The Graph Traversal Pattern

V. Basic Systems and Practical Experience
2. Dynamo: Amazon's Highly Available Key-value Store
3. Cassandra - A Decentralized Structured Storage System
4. PNUTS: Yahoo!'s Hosted Data Serving Platform
5. An introduction to and reflections on PNUTS, Yahoo!'s distributed data platform (reposted)
6. LevelDB: a fast, lightweight key-value storage library (translation)
8. Megastore: Providing Scalable, Highly Available Storage for Interactive Services
9. Designs, Lessons and Advice from Building Large Distributed Systems --Jeff Dean
10. Challenges in Building Large-Scale Information Retrieval Systems --Jeff Dean

VI. Other Supporting Systems
1. The ganglia distributed monitoring system: design, implementation, and experience
2. Chukwa: A large-scale monitoring system
3. Scribe: a way to aggregate data and, why not, to directly fill the HDFS?
4. Benchmarking Cloud Serving Systems with YCSB

VII. Hadoop-Related
1. The Hadoop Distributed File System (translation)
2. HDFS scalability: the limits to growth (translation)
3. Name-node memory size estimates and optimization proposal
5. HFile: A Block-Indexed File Format to Store Sorted Key-Value Pairs
6. HFile V2
7. Hive - A Warehousing Solution Over a Map-Reduce Framework
8. Hive - A Petabyte Scale Data Warehouse Using Hadoop
10. ZooKeeper: Wait-free coordination for Internet-scale systems
11. The life and times of a zookeeper
13. Apache Hadoop Goes Realtime at Facebook
14. A survey of Hadoop platform optimization
15. The Anatomy of Hadoop I/O Pipeline
16. Hadoop Fair Scheduler guide
17. The next generation of Apache Hadoop MapReduce

VIII. Other
Reflections on Trusting Trust --Ken Thompson
Who Needs an Architect?
Go To Statement Considered Harmful --Edsger W. Dijkstra
No Silver Bullet: Essence and Accidents of Software Engineering --Frederick P. Brooks