EECS 247 LEC A: INFORMATION STORAGE (17420)

News:

  • Please read the papers for the next class and prepare for discussion.
  • If you would like to take the course as a concentration course, please contact Amy Pham of EECS.

Instructor:

Zhiying Wang

zhiying@uci.edu

Time & Location:

Lectures:     MW 11:00am-12:20pm on Zoom

Topics:

  • Storage systems including Hadoop and Ceph
  • Computing on distributed platforms
  • Definitions and algorithms for data consistency
  • Non-volatile memory
  • DNA storage

References:

Hadoop: The Definitive Guide, Tom White, 2015.
Hadoop Beginner's Guide, Matthew Rathbone, 2013.

Hadoop setup from 2020

Ubuntu tutorial.pdf

Materials:

Date | Material | Student, Slides
1/4 | Introduction | Introduction.pdf
1/6 | MapReduce: simplified data processing on large clusters | MapReduce_Q.pdf
1/11 | Ceph: A scalable, high-performance distributed file system | Ceph.pdf
1/13 | Everything you always wanted to know about multicore graph processing but were afraid to ask | Everything you always wanted to know about_Q.pptx
1/20 | Memory-Augmented Monte Carlo Tree Search | Memory-Augmented Monte Carlo Tree Search_Q.pdf
1/25 | Lagrange Coded Computing: Optimal Design for Resiliency, Security, and Privacy | Lagrange Coded Computing_Q.pdf
1/27 | Compressed linear algebra for large-scale machine learning | Compressed linear algebra for large-scale machine learning_Q.pptx
2/1 | A Degeneracy Framework for Graph Similarity | A Degeneracy Framework for Graph Similarity_Q.pptx
2/3 | An Algorithm for the Principal Component Analysis of Large Data Sets | An Algorithm for the Principal Component Analysis of Large Data Sets_Q.pdf
2/8 | Barrier-Enabled IO Stack for Flash Storage | did not cover
2/10 | Paxos Made Simple | paxos-simple_Q.pdf
2/17 | Project presentations | ---
2/22 | Giza: Erasure Coding Objects across Global Data Centers | Giza
2/24 | Protocol-Aware Recovery for Consensus-Based Storage | Yuanke Zhang
3/1 | Data retention in MLC NAND flash memory: Characterization, optimization, and recovery | nand flash retention.pdf
3/3 | A DNA-of-things storage architecture to create materials with embedded memory | Rohit Vasu
3/8 | Data storage in DNA with fewer synthesis cycles using composite DNA letters | Data storage in DNA with fewer synthesis cycles using composite DNA letters_Q.pdf
3/10 | Project presentation | ---

HW policy:

  • You are encouraged to discuss problems with other students and to refer to course materials, but you must write up your own solutions.
  • No late homework is accepted.
  • HW will contain both analytical and programming problems. General programming problems may be solved in any programming language. Hadoop problems must use the Hadoop framework.
  • Upload a PDF of your solutions and all source code.

Grading:

HW | 35%
Projects | 35%
Lead discussion | 10%
Participation | 20%