ANNA UNIVERSITY, CHENNAI
REGULATIONS - 2013
CP7019 MANAGING BIG DATA SYLLABUS
ME 3RD SEM COMPUTER SCIENCE AND ENGINEERING SYLLABUS
OBJECTIVES:
Understand big data for business intelligence
Learn business case studies for big data analytics
Understand NoSQL big data management
Perform map-reduce analytics using Hadoop and related tools
UNIT I UNDERSTANDING BIG DATA
What is big data – why big data – convergence of key trends – unstructured data – industry examples of big data – web analytics – big data and marketing – fraud and big data – risk and big data – credit risk management – big data and algorithmic trading – big data and healthcare – big data in medicine – advertising and big data – big data technologies – introduction to Hadoop – open source technologies – cloud and big data – mobile business intelligence – crowdsourcing analytics – inter and trans firewall analytics
UNIT II NOSQL DATA MANAGEMENT
Introduction to NoSQL – aggregate data models – aggregates – key-value and document data models – relationships – graph databases – schemaless databases – materialized views – distribution models – sharding – master-slave replication – peer-to-peer replication – sharding and replication – consistency – relaxing consistency – version stamps – map-reduce – partitioning and combining – composing map-reduce calculations
UNIT III BASICS OF HADOOP
Data format – analyzing data with Hadoop – scaling out – Hadoop streaming – Hadoop pipes – design of Hadoop distributed file system (HDFS) – HDFS concepts – Java interface – data flow – Hadoop I/O – data integrity – compression – serialization – Avro – file-based data structures
UNIT IV MAPREDUCE APPLICATIONS
MapReduce workflows – unit tests with MRUnit – test data and local tests – anatomy of MapReduce job run – classic MapReduce – YARN – failures in classic MapReduce and YARN – job scheduling – shuffle and sort – task execution – MapReduce types – input formats – output formats
UNIT V HADOOP RELATED TOOLS
HBase – data model and implementations – HBase clients – HBase examples – praxis. Cassandra – Cassandra data model – Cassandra examples – Cassandra clients – Hadoop integration. Pig – Grunt – Pig data model – Pig Latin – developing and testing Pig Latin scripts. Hive – data types and file formats – HiveQL data definition – HiveQL data manipulation – HiveQL queries.
TOTAL: 45 PERIODS
OUTCOMES:
Upon completion of the course, the students will be able to:
Describe big data and use cases from selected business domains
Explain NoSQL big data management
Install, configure, and run Hadoop and HDFS
Perform map-reduce analytics using Hadoop
Use Hadoop related tools such as HBase, Cassandra, Pig, and Hive for big data analytics
REFERENCES:
1. Michael Minelli, Michelle Chambers, and Ambiga Dhiraj, "Big Data, Big Analytics: Emerging Business Intelligence and Analytic Trends for Today's Businesses", Wiley, 2013.
2. P. J. Sadalage and M. Fowler, "NoSQL Distilled: A Brief Guide to the Emerging World of Polyglot Persistence", Addison-Wesley Professional, 2012.
3. Tom White, "Hadoop: The Definitive Guide", Third Edition, O'Reilly, 2012.
4. Eric Sammer, "Hadoop Operations", O'Reilly, 2012.
5. E. Capriolo, D. Wampler, and J. Rutherglen, "Programming Hive", O'Reilly, 2012.
6. Lars George, "HBase: The Definitive Guide", O'Reilly, 2011.
7. Eben Hewitt, "Cassandra: The Definitive Guide", O'Reilly, 2010.
8. Alan Gates, "Programming Pig", O'Reilly, 2011.