(1) Course Description (Chinese Version)
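As the introductory course to the big data field, Fundamentals of Big Data (大数据基础) aims to give students a solid foundation for entering the big data body of knowledge. The course is not only a first exploration of this frontier field but also a starting point for deeper study and understanding. To that end, the course is organized around four core principles.
First, building a knowledge system: systematic teaching content helps students gradually form a complete conceptual framework for big data, so that they clearly grasp its basic concepts, development history, and future trends. Second, clarifying the basic principles of big data: the technical principles and core algorithms behind big data are explained in accessible terms, so that students understand how data is collected, stored, processed, analyzed, and applied. Third, guiding introductory practice: hands-on exercises and case studies encourage students to experiment and develop the ability to solve practical problems. Finally, surveying related applications: cases of big data use across many industries broaden students' horizons, stimulate innovative thinking, and lay the groundwork for deeper exploration of the field.
The course systematically covers the basic concepts of big data, the big data processing architecture Hadoop, the distributed file system HDFS, the distributed database HBase, NoSQL databases, cloud databases, the distributed parallel programming model MapReduce, the data warehouse Hive, the in-memory big data processing architecture Spark, and applications of big data in fields such as the internet, biomedicine, and logistics. The key chapters on Hadoop, HDFS, HBase, MapReduce, and Spark include introductory hands-on exercises to help students learn and master key big data technologies.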
(2) Course Description (English Version)
As an introductory course in the field of big data, Fundamentals of Big Data Technology aims to give students a solid foundation of knowledge. The course serves not only as an initial exploration of the big data domain but also as a stepping stone to more advanced study. To achieve this, the course design follows four principles.
First, the course focuses on building a knowledge system, guiding students through an overview of big data that includes its basic concepts, development history, and future trends. Second, the course clarifies the fundamental principles of big data by explaining the underlying technical principles and core algorithms, enabling students to understand how data is collected, stored, processed, analyzed, and applied. Third, the course incorporates introductory practice through case studies and hands-on exercises, in which students are expected to complete practical tasks and develop problem-solving skills. Finally, the course covers relevant applications by introducing a wide range of real-world big data cases across various industries, which is expected to broaden students’ horizons, stimulate their innovative thinking, and lay the foundation for future in-depth exploration of the field.
The course covers essential topics in big data, including basic concepts, the Hadoop processing framework, the distributed file system HDFS, the distributed database HBase, NoSQL databases, cloud databases, the distributed parallel programming model MapReduce, the data warehouse Hive, and the in-memory big data processing framework Spark. It also introduces big data applications in fields such as the internet, biomedical sciences, and logistics. The key chapters on Hadoop, HDFS, HBase, MapReduce, and Spark include introductory hands-on exercises designed to help students learn and master these critical big data technologies.
…