English-language paper references on big data

4 answers


～*诗情画意*～

Accepted answer

Big data refers to a volume of data so large that it cannot be stored and processed within a given time frame by a traditional file system.

The next question that comes to mind is how big the data needs to be in order to be classified as big data. There is a lot of misconception around the term. We usually call data big if its size is in the gigabytes, terabytes, petabytes, exabytes, or anything larger. This does not define big data completely. Even a small file can be referred to as big data, depending on the context in which it is used.

Let's take an example to make this clear. If we try to attach a 100 MB file to an email, we will not be able to, because email does not support attachments of this size. With respect to email, therefore, this 100 MB file can be referred to as big data. Similarly, if we want to process 1 TB of data within a given time frame, we cannot do so with a traditional system, since its resources are not sufficient for the task.

As you are aware, social sites such as Facebook, Twitter, Google+, LinkedIn, and YouTube contain huge amounts of data. As the number of users on these sites grows, storing and processing this enormous data is becoming a challenging task. Storing this data is important for various firms to generate huge revenue, which is not possible with a traditional file system. This is where Hadoop comes in.

Big data simply means a huge amount of structured, unstructured, and semi-structured data that can be processed for information. Nowadays massive amounts of data are produced because of growth in technology and digitalization, and by a variety of sources, including business application transactions, videos, pictures, electronic mail, social media, and so on. The big data concept was introduced to process this data.

Structured data: data that has a proper format associated with it. For example, data stored in database files or in Excel sheets.
Semi-structured data: data that does not have a proper (fixed) format associated with it. For example, data stored in mail files or .docx files.
Unstructured data: data that does not have any format associated with it. For example, image, audio, and video files.

Big data is characterized by the 3 V's associated with it, which are as follows: [1]
Volume: the amount of data generated, i.e. a huge quantity.
Velocity: the speed at which the data is generated.
Variety: the different kinds of data generated.

A. Challenges Faced by Big Data

There are two main challenges faced by big data: [2]
i. How to store and manage a huge volume of data efficiently.
ii. How to process and extract valuable information from a huge volume of data within a given time frame.

These main challenges led to the development of the Hadoop framework.

Hadoop is an open source framework developed by Doug Cutting in 2006 and managed by the Apache Software Foundation. Hadoop was named after a yellow toy elephant.

Hadoop was designed to store and process data efficiently. The Hadoop framework comprises two main components:
i. HDFS: stands for Hadoop Distributed File System, which takes care of storing data within a Hadoop cluster.
ii. MapReduce: takes care of processing the data present in HDFS.

Now let's look at a Hadoop cluster. It has two kinds of node: the master node and the slave node.

The master node is responsible for running the name node and job tracker daemons. Here, "node" is the technical term for a machine in the cluster, and "daemon" is the technical term for a background process running on a Linux machine.

The slave node, on the other hand, is responsible for running the data node and task tracker daemons.

The name node and data node are responsible for storing and managing the data and are commonly referred to as storage nodes, whereas the job tracker and task tracker are responsible for processing and computing the data and are commonly known as compute nodes.

Normally the name node and job tracker run on a single machine, whereas the data node and task tracker run on different machines.

B. Features of Hadoop: [3]

i. Cost-effective system: it does not require any special hardware. It can be implemented on ordinary machines, technically known as commodity hardware.
ii. Large cluster of nodes: a Hadoop system can support a large number of nodes, which provides a huge storage and processing system.
iii. Parallel processing: a Hadoop cluster can access and manage data in parallel, which saves a lot of time.
iv. Distributed data: it takes care of splitting and distributing data across all nodes within a cluster. It also replicates the data over the cluster.
v. Automatic failover management: once AFM is configured on a cluster, the admin need not worry about a failed machine. Hadoop replicates the configuration; one copy of each piece of data is replicated to a node in the same rack, and Hadoop takes care of the networking between racks.
vi. Data locality optimization: this is the most powerful feature of Hadoop and the one that makes it most efficient. If a request involves a large amount of data that resides elsewhere, Hadoop sends the code to the machine holding that data and runs it there, rather than moving the data, which saves a lot of bandwidth.
vii. Heterogeneous cluster: nodes or machines can come from different vendors and can run different flavors of operating system.
viii. Scalability: adding or removing a machine does not affect the cluster. Neither does adding or removing a component of a machine.

C. Hadoop Architecture

Hadoop comprises two components:
i. HDFS
ii. MapReduce

Hadoop splits big data into several chunks and stores them on several nodes within a cluster, which significantly reduces processing time.

Hadoop replicates each part of the data onto several machines within the cluster. The number of copies depends on the replication factor. By default the replication factor is 3, so in this case there are 3 copies of each piece of data on 3 different machines.

Reference: Mahajan, P., Gaba, G., & Chauhan, N. S. (2016). Big Data Security. IITM Journal of Management and IT, 7(1), 89-94.

Run it through a translation site yourself; feel free to ask if anything is unclear.
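
The chunking and replication described under "Hadoop Architecture" in the answer above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Hadoop's actual implementation: the function names (split_into_blocks, place_replicas, surviving_blocks) and node names are hypothetical, the block size is deliberately tiny, and replicas are placed round-robin, whereas real HDFS uses a rack-aware placement policy.

```python
REPLICATION_FACTOR = 3  # the default replication factor mentioned in the answer


def split_into_blocks(data, block_size):
    """Cut the input into fixed-size chunks; the last chunk may be shorter."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_replicas(num_blocks, nodes, replication=REPLICATION_FACTOR):
    """Assign each block to `replication` distinct nodes.

    Assumption: simple round-robin placement for illustration only;
    real HDFS placement also takes racks into account.
    """
    if replication > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    placement = {}
    for block_id in range(num_blocks):
        start = block_id % len(nodes)
        placement[block_id] = [
            nodes[(start + k) % len(nodes)] for k in range(replication)
        ]
    return placement


def surviving_blocks(placement, failed_node):
    """Return the ids of blocks that still have a copy after one node fails."""
    return {
        block_id
        for block_id, holders in placement.items()
        if any(node != failed_node for node in holders)
    }


if __name__ == "__main__":
    data = b"x" * 1000                       # stand-in for a large file
    blocks = split_into_blocks(data, block_size=256)
    nodes = ["node1", "node2", "node3", "node4"]  # hypothetical data nodes
    placement = place_replicas(len(blocks), nodes)
    for block_id, holders in placement.items():
        print(block_id, len(blocks[block_id]), holders)
    print("readable after node2 fails:", sorted(surviving_blocks(placement, "node2")))
```

With a replication factor of 3, every block lives on three different machines, so losing any single node leaves at least two copies of each block. That is why the answer says the admin does not need to worry about one failed machine.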

326 comments, posted 2 hours ago

柔情似水9999

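Big data paper references (answered on 2018-09-14)

Big data now brings convenience to every corner of our lives, so what references are there for big data papers? For everyone's convenience, the editor has gathered some, in the hope that they help.

Big data paper references, part 1:
[1] 陈杰. Research on data update modes in local file systems [D]. Huazhong University of Science and Technology, 2014
[2] 刘洋. Research on caching and prefetching techniques in hierarchical hybrid storage systems [D]. Huazhong University of Science and Technology, 2013
[3] 李怀阳. Research on data organization modes in evolvable storage systems [D]. Huazhong University of Science and Technology, 2006
[4] 邓勇强, 朱光喜, 刘文明. Research on low-complexity decoding algorithms for LDPC codes [J]. Computer Science, 2006(07)
[5] 陆承涛. Research on performance management in storage systems [D]. Huazhong University of Science and Technology, 2010
[6] 罗东健. Research on key technologies for high reliability in large-scale storage systems [D]. Huazhong University of Science and Technology, 2011
[7] 王健宗. Research on several key issues in cloud storage quality of service [D]. Huazhong University of Science and Technology, 2012
[8] 余雪里. The effect of metal oxide pn heterojunctions on photoelectric response and gas-sensing properties [D]. Huazhong University of Science and Technology, 2014
[9] 王玮. Research on video copyright protection based on content-associated keys [D]. Huazhong University of Science and Technology, 2014
[10] 韩林. Research on solid-state cache systems for cloud storage mobile terminals [D]. Huazhong University of Science and Technology, 2014
[11] 田宽. Research on surface modification of Cu/LDPE composites for intrauterine devices [D]. Huazhong University of Science and Technology, 2013
[12] 聂雪军. Research on key technologies for information lifecycle management in content-aware storage systems [D]. Huazhong University of Science and Technology, 2010
[13] 王鹏. Research on key technologies for applying low-density parity-check codes to storage systems [D]. Huazhong University of Science and Technology, 2013
[14] 刁莹. Evaluating storage system performance using mathematical modeling [D]. Harbin Engineering University, 2013
[15] 符青云. Research on high-performance storage systems for large-scale streaming media services [D]. University of Electronic Science and Technology of China, 2009
[16] 王玉林. Research on data and cache organization in multi-node fault-tolerant storage systems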

102 comments, posted 10 hours ago

ShangHaiWendy

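The content is as follows:

1. The impact of big data on business models

2. Internal control risks for geological project funds in the big data context

3. Improving hospital statistical work models in the big data era

4. The transformation of online catering in the big data era

5. Small and micro finance based on big data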

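Big data, also called massive data, refers to data sets so large that mainstream software tools cannot capture, manage, process, and organize them within a reasonable time into information that helps enterprises make more proactive business decisions.

In "Big Data: A Revolution That Will Transform How We Live, Work, and Think" by Viktor Mayer-Schönberger and Kenneth Cukier, big data means analyzing all of the data rather than taking the shortcut of random analysis (sample surveys). The 5 V's of big data (proposed by IBM) are Volume, Velocity, Variety, Value (low value density), and Veracity.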

141 comments, posted 10 hours ago

护手霜adb

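Xueshutang (学术堂) has compiled fifteen big-data-related thesis topics for your reference:
1. The impact of big data on business models
2. Internal control risks for geological project funds in the big data context
3. Improving hospital statistical work models in the big data era
4. The transformation of online catering in the big data era
5. Small and micro finance based on big data
6. Opportunities and challenges for financial management in the big data era
7. Analysis of bank foreign exchange business management in the big data context
8. Applications of big data in internet finance
9. Problems facing enterprise financial management in the big data context and measures to solve them
10. Issues in building internal controls at big data companies
11. Regulating the operating models of big data credit reporting agencies
12. Analysis of hospital financial management in China from a big data perspective
13. The impact of the macroeconomy on micro-level enterprise behavior in the big data context
14. Performance appraisal and evaluation systems for construction enterprises in the big data era
15. Big data in support of inclusive finance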

222 comments, posted 12 hours ago
