The architecture diagram below was drawn in EdrawMax, which provides free architecture symbols. In the diagram, the Hadoop Distributed File System (HDFS) exposes a file system namespace and allows user data to be stored in files. Several data repositories feed the system, such as EDW, ERP, CRM, and RDBMS. Here, a data repository is an enterprise data storage entity into which data has been explicitly partitioned for analytical or reporting purposes. Hive, in turn, is a data warehouse infrastructure tool for processing structured data in Hadoop. It sits on top of Hadoop to summarize Big Data and to make querying and analysis easier.
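To make the two roles above concrete, here is a minimal Python sketch. It is a toy model, not real HDFS or Hive: `ToyNamespace`, `total_by_region`, the `/warehouse/sales` paths, and the `region`/`amount` columns are all hypothetical names chosen for illustration. The first class mimics a file system namespace that stores user data in files, and the function mimics the kind of summary a Hive query might produce over structured data in those files.

```python
import csv
import io
from collections import defaultdict


class ToyNamespace:
    """In-memory stand-in for a hierarchical file system namespace.

    Hypothetical illustration only; this is not the HDFS API.
    """

    def __init__(self):
        self._files = {}  # absolute path -> file contents

    def write(self, path, text):
        if not path.startswith("/"):
            raise ValueError("paths must be absolute")
        self._files[path] = text

    def read(self, path):
        return self._files[path]

    def list_dir(self, directory):
        # Return every stored path under the given directory, sorted.
        prefix = directory.rstrip("/") + "/"
        return sorted(p for p in self._files if p.startswith(prefix))


def total_by_region(ns, directory):
    """Sum the 'amount' column per 'region' across all files in a directory.

    Assumed to be roughly analogous to a query of the form
    SELECT region, SUM(amount) FROM sales GROUP BY region.
    """
    totals = defaultdict(int)
    for path in ns.list_dir(directory):
        for row in csv.DictReader(io.StringIO(ns.read(path))):
            totals[row["region"]] += int(row["amount"])
    return dict(totals)


# Two files, as if exported from two different data repositories.
ns = ToyNamespace()
ns.write("/warehouse/sales/crm.csv", "region,amount\nnorth,10\nsouth,5\n")
ns.write("/warehouse/sales/erp.csv", "region,amount\nnorth,7\neast,3\n")

summary = total_by_region(ns, "/warehouse/sales")
print(summary)
```

In this sketch the storage layer only knows about paths and file contents, while the summarizing layer interprets those contents as structured rows. That separation is the point being illustrated; a real deployment would distribute both the files and the computation across a cluster.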