NLP Graduation Thesis
What direction are you planning to take your thesis in? Has your topic passed your advisor's review, and have you drawn up an outline so the advisor can check the direction of your writing?
Has your advisor suggested which direction would be best? Before you start writing, always prepare an outline; that way the advisor can settle the framework early and you avoid major rewrites during later revisions.
Also mind your school's formatting requirements and writing conventions, or the thesis may well be sent back for rework. If anything is still unclear, feel free to ask me. I hope you graduate smoothly and move on to a new chapter of life.
一、Why Topic Selection Matters for the Graduation Thesis
第一、Topic selection is the first step in writing a graduation thesis: it settles the question of what to write, that is, the direction the thesis will argue. If "what to write" is not clear, "how to write it" cannot even be discussed, so a suitable topic is a precondition for completing the thesis smoothly.
第二、Writing the thesis is both a comprehensive test of the knowledge acquired over these years of study and a full examination of the breadth and depth of students' thinking. Topic selection therefore matters greatly: it must weigh both the scope the thesis will cover and its social value.
二、Principles of Topic Selection
(一)Relevance to the major
The topic must be tied closely to your own major and chosen from academic questions worth studying within the courses you have taken; going beyond that range defeats the teaching goal of applying learned theory to practical problems. If the major is business administration, for example, the topic naturally cannot leave that broad field, and even within a narrower sub-area it should not stray from business administration and enterprise management into public administration or finance. Academic research is open-ended: every existing theory has room for refinement, and that is precisely the opening for a topic; starting there, it is not hard to find and pose questions.
(二)Value
The thesis must have scholarly value. Rehashed plagiarism, patchworks of copied material, and lofty talk divorced from reality obviously have none. Since it is a thesis, the topic should carry real academic significance: it should be forward-looking, practical, and of some theoretical interest. Business administration students can choose questions in enterprise management that have theoretical and practical significance, issues of general relevance to raising the management level of Chinese enterprises, or the application of new management methods. The value of a thesis hinges on whether it contains your own original insight; that is, not merely organizing and summarizing the views in textbooks or of predecessors, but enriching parts of the discipline with new facts or new theory to some degree, or applying specialized knowledge to solve problems that actually need solving.
(三)Feasibility
Weigh the breadth and depth of the topic against the material you can actually obtain. Have the courage to take on hard problems, but also act within your means. A topic too large or too hard to finish in the time available will not do; one too small or too easy to exercise your abilities will not do either. Proceed from reality: consider chiefly whether the topic suits your strengths and interests, whether enough material and information can be gathered, and whether it is close to your own work. Always weigh subjective and objective conditions and the time limit, and choose a topic that fits your situation and has a foreseeable chance of success. In general, the size of the topic depends on the author's actual circumstances and is hard to prescribe rigidly. A student with real ability who writes a large piece that breaks new theoretical ground is of course welcome, but for adult-education students as a whole, a smaller topic is advisable. A small topic arguing one or two points has a narrow opening yet can be developed fully and analyzed from many levels and angles; your theoretical ability gets full play, and the piece itself comes out rich and substantial. A well-chosen small topic, especially one closely related to your own work or life, makes material easier to collect, the problem easier to see accurately, the argument more thorough, and the conclusions more likely to be sound.
三、Methods of Topic Selection
第一、Browse and capture. This method settles on a topic by reading quickly and extensively through the material you have collected and comparing as you go. Browsing is usually done in a concentrated stretch once a fair amount of material is on hand, which makes focused comparison and screening easier. The aim is to raise questions and find your own topic while digesting the material. That requires reading everything thoroughly: the central and the peripheral, different angles and different views, without preconceptions and without letting the opinions already in your head decide what to keep. Analyze all the material calmly and objectively, draw on what it offers, and after repeated reflection, discoveries will come; then fix your topic according to your own situation.
第二、Trace and verify. Here you start from a tentative idea and then settle the topic by reading the literature to test it. Begin with your own working hypothesis, that is, a preliminary direction, title, or topic range based on what you have accumulated. Two cautions apply: check whether your idea duplicates others' work or merely supplements their views; and if no one has addressed it yet but you lack sufficient grounds to argue it, stop and rethink. Learn to seize a flash of insight and pursue it in depth. While reading the literature or doing fieldwork, sparks of thought sometimes appear suddenly; however simple, hazy, and unformed, never discard them lightly.
第三、Knowledge transfer. Four years of study give you a systematic, fresh grasp of theory in some area (economics, law, or another field). That is an extension and an effective renewal of prior knowledge. On that basis, you come to perceive problems through the new knowledge when recognizing and solving them, and new viewpoints take shape. The organic combination of theory and reality often sparks creative, pioneering thinking and supplies a solid practical and theoretical foundation for the thesis topic.
第四、Follow hot topics. Hot topics are issues in contemporary society that attract broad public attention: they bear on national welfare and people's livelihood or on the currents of the age, and they reliably draw notice, reflection, and debate. Most students already follow international affairs, current news, and economic reform in their daily study and work. Choosing a social hot topic as the thesis subject is well worthwhile: it catches the supervisor's attention, stimulates readers' interest and thought, and matters for understanding and solving real problems. It also makes collecting and organizing material, and finishing the thesis, considerably easier.
第五、Survey-based selection. This resembles the hot-topic method, except that some of the issues surveyed are social hot topics and some are not. Social surveys help us understand the history, current state, and trends of the issues investigated, sharpen our grasp of present realities, and allow targeted suggestions on real problems. Taking a survey project as the thesis topic has real practical significance: it can supply valuable material and data for local economic and social development, and it opens a good path toward solving real social problems.
Seven Must-Read Papers in NLP
Here are the most important papers in the NLP field, a list derived from 学术范's standard evaluation system:
一、Deep contextualized word representations
Abstract: We introduce a new type of deep contextualized word representation that models both (1) complex characteristics of word use (e.g., syntax and semantics), and (2) how these uses vary across linguistic contexts (i.e., to model polysemy). Our word vectors are learned functions of the internal states of a deep bidirectional language model (biLM), which is pre-trained on a large text corpus. We show that these representations can be easily added to existing models and significantly improve the state of the art across six challenging NLP problems, including question answering, textual entailment and sentiment analysis. We also present an analysis showing that exposing the deep internals of the pre-trained network is crucial, allowing downstream models to mix different types of semi-supervision signals.
Full text: Deep contextualized word representations (学术范)
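As a quick reference, the abstract's point about exposing all internal layers of the biLM can be summarized by the paper's task-specific combination formula (the standard ELMo formulation, with symbols as in the paper):

```latex
\mathrm{ELMo}_k^{task} = \gamma^{task} \sum_{j=0}^{L} s_j^{task}\, \mathbf{h}_{k,j}^{LM}
```

Here h_{k,j}^{LM} is the biLM's layer-j representation for token k, the s^{task} are softmax-normalized layer weights, and γ^{task} is a learned scalar, so each downstream task chooses its own mix of the pre-trained layers.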
二、GloVe: Global Vectors for Word Representation
Abstract: Recent methods for learning vector space representations of words have succeeded in capturing fine-grained semantic and syntactic regularities using vector arithmetic, but the origin of these regularities has remained opaque. We analyze and make explicit the model properties needed for such regularities to emerge in word vectors. The result is a new global log-bilinear regression model that combines the advantages of the two major model families in the literature: global matrix factorization and local context window methods. Our model efficiently leverages statistical information by training only on the nonzero elements in a word-word co-occurrence matrix, rather than on the entire sparse matrix or on individual context windows in a large corpus. The model produces a vector space with meaningful substructure, as evidenced by its performance of 75% on a recent word analogy task. It also outperforms related models on similarity tasks and named entity recognition.
Full text: GloVe: Global Vectors for Word Representation (学术范)
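The weighted least-squares objective the abstract alludes to can be stated concretely (the standard GloVe loss, written from the paper's definitions):

```latex
J = \sum_{i,j=1}^{V} f(X_{ij})\left(w_i^{\top}\tilde{w}_j + b_i + \tilde{b}_j - \log X_{ij}\right)^2,
\qquad
f(x) = \begin{cases} (x/x_{\max})^{\alpha} & \text{if } x < x_{\max} \\ 1 & \text{otherwise} \end{cases}
```

Since f(0) = 0, only the nonzero entries of the co-occurrence matrix X contribute to the sum, which is exactly the efficiency point the abstract makes; the paper uses α = 3/4 and x_max = 100.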
三、SQuAD: 100,000+ Questions for Machine Comprehension of Text
Abstract: We present the Stanford Question Answering Dataset (SQuAD), a new reading comprehension dataset consisting of 100,000+ questions posed by crowdworkers on a set of Wikipedia articles, where the answer to each question is a segment of text from the corresponding reading passage. We analyze the dataset to understand the types of reasoning required to answer the questions, leaning heavily on dependency and constituency trees. We build a strong logistic regression model, which achieves an F1 score of 51.0%, a significant improvement over a simple baseline (20%). However, human performance (86.8%) is much higher, indicating that the dataset presents a good challenge problem for future research. The dataset is freely available at this https URL
Full text: SQuAD: 100,000+ Questions for Machine Comprehension of Text (学术范)
四、Sequence to Sequence Learning with Neural Networks
Abstract: Deep Neural Networks (DNNs) are powerful models that have achieved excellent performance on difficult learning tasks. Although DNNs work well whenever large labeled training sets are available, they cannot be used to map sequences to sequences. In this paper, we present a general end-to-end approach to sequence learning that makes minimal assumptions on the sequence structure. Our method uses a multilayered Long Short-Term Memory (LSTM) to map the input sequence to a vector of a fixed dimensionality, and then another deep LSTM to decode the target sequence from the vector. Our main result is that on an English to French translation task from the WMT-14 dataset, the translations produced by the LSTM achieve a BLEU score of 34.8 on the entire test set, where the LSTM's BLEU score was penalized on out-of-vocabulary words. Additionally, the LSTM did not have difficulty on long sentences. For comparison, a phrase-based SMT system achieves a BLEU score of 33.3 on the same dataset. When we used the LSTM to rerank the 1000 hypotheses produced by the aforementioned SMT system, its BLEU score increases to 36.5, which is close to the previous state of the art. The LSTM also learned sensible phrase and sentence representations that are sensitive to word order and are relatively invariant to the active and the passive voice. Finally, we found that reversing the order of the words in all source sentences (but not target sentences) improved the LSTM's performance markedly, because doing so introduced many short term dependencies between the source and the target sentence which made the optimization problem easier.
Full text: Sequence to Sequence Learning with Neural Networks (学术范)
五、The Stanford CoreNLP Natural Language Processing Toolkit
Abstract: We describe the design and use of the Stanford CoreNLP toolkit, an extensible pipeline that provides core natural language analysis. This toolkit is quite widely used, both in the research NLP community and also among commercial and government users of open source NLP technology. We suggest that this follows from a simple, approachable design, straightforward interfaces, the inclusion of robust and good quality analysis components, and not requiring use of a large amount of associated baggage.
Full text: The Stanford CoreNLP Natural Language Processing Toolkit (学术范)
六、Distributed Representations of Words and Phrases and their Compositionality
Abstract: The recently introduced continuous Skip-gram model is an efficient method for learning high-quality distributed vector representations that capture a large number of precise syntactic and semantic word relationships. In this paper we present several extensions that improve both the quality of the vectors and the training speed. By subsampling of the frequent words we obtain significant speedup and also learn more regular word representations. We also describe a simple alternative to the hierarchical softmax called negative sampling. An inherent limitation of word representations is their indifference to word order and their inability to represent idiomatic phrases. For example, the meanings of "Canada" and "Air" cannot be easily combined to obtain "Air Canada". Motivated by this example, we present a simple method for finding phrases in text, and show that learning good vector representations for millions of phrases is possible.
Full text: Distributed Representations of Words and Phrases and their Compositionality (学术范)
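The "simple alternative to the hierarchical softmax" the abstract mentions is negative sampling; its per-example objective, as stated in the paper, is:

```latex
\log \sigma\!\left(v'^{\top}_{w_O} v_{w_I}\right)
+ \sum_{i=1}^{k} \mathbb{E}_{w_i \sim P_n(w)}\left[\log \sigma\!\left(-v'^{\top}_{w_i} v_{w_I}\right)\right]
```

Here w_I is the input word, w_O the observed context word, and the k "negative" words are drawn from a noise distribution P_n(w); the paper found the unigram distribution raised to the 3/4 power to work best.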
七、Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank
Abstract: Semantic word spaces have been very useful but cannot express the meaning of longer phrases in a principled way. Further progress towards understanding compositionality in tasks such as sentiment detection requires richer supervised training and evaluation resources and more powerful models of composition. To remedy this, we introduce a Sentiment Treebank. It includes fine grained sentiment labels for 215,154 phrases in the parse trees of 11,855 sentences and presents new challenges for sentiment compositionality. To address them, we introduce the Recursive Neural Tensor Network. When trained on the new treebank, this model outperforms all previous methods on several metrics. It pushes the state of the art in single sentence positive/negative classification from 80% up to 85.4%. The accuracy of predicting fine-grained sentiment labels for all phrases reaches 80.7%, an improvement of 9.7% over bag of features baselines. Lastly, it is the only model that can accurately capture the effects of negation and its scope at various tree levels for both positive and negative phrases.
Full text: Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank (学术范)
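For orientation, the composition the Recursive Neural Tensor Network computes at each parse-tree node is (standard formulation from the paper):

```latex
p = f\!\left(\begin{bmatrix} a \\ b \end{bmatrix}^{\top} V^{[1:d]} \begin{bmatrix} a \\ b \end{bmatrix} + W \begin{bmatrix} a \\ b \end{bmatrix}\right)
```

where a and b are the child vectors, W is the ordinary recursive-network matrix, each slice of the tensor V^{[1:d]} adds a multiplicative interaction between the children, and f is tanh.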
I hope this list is helpful. 学术范 is a newly launched one-stop academic discussion community, offering a wealth of computer-science literature, the latest news from each research field, handy tools for reading and managing papers, and countless like-minded students and researchers ready to join you in lively, high-quality academic discussion. Come join us!
[NLP Paper Notes] Sequence to Sequence Learning with Neural Networks
These notes cover the landmark 2014 Google paper (cited thousands of times) that introduced the now widely used Sequence to Sequence model. They are meant to help beginners get started quickly, and to serve as my own refresher.
Paper link:
The basic outline is as follows:
------------------Part 1 - Abstract------------------
By 2014, deep neural networks (DNNs) had already proven themselves on a variety of difficult learning tasks, with high accuracy. But they had one limitation: they require enough labeled data, and they cannot be used to map sequences to sequences. The paper's main contribution is an end-to-end neural network model that learns exactly this kind of mapping. The authors use a multilayer LSTM to map (encode) the input sequence into a vector of fixed dimensionality, then a second multilayer LSTM to decode that vector into the output sequence. To demonstrate feasibility, they apply the model to English-to-French translation and find that its results come close to the best of the time (likely some SMT, i.e., statistical machine translation, system). After analyzing the model and the results, the authors also report some interesting findings: 1) the model is insensitive to active versus passive voice, but sensitive to the order of the input words; 2) feeding the source sentence in reverse improves performance.
------------------Part 2 - Core Ideas------------------
Straight to the model. The LSTM, NLP's trump card for sequence modeling, is the natural building block here. The core idea: one LSTM (already shown to handle long-range temporal dependencies well) encodes the input sequence, consuming it one step at a time (with no limit imposed on its length!), yielding a vector representation of fixed dimensionality; a second LSTM (essentially a language model, except that its initial state is the vector produced by encoding the input sequence) then decodes that vector into the output sequence. The Seq2Seq framework proposed here laid a solid foundation for the quality gains that followed on sequence mapping tasks such as machine translation. Readers who have not fully absorbed the idea should take another look at Figure 1 of the paper; in brief, the input is the sequence ABC followed by the end-of-sequence symbol <EOS>, and starting from <EOS> the model decodes WXYZ followed by a closing <EOS>, at which point decoding stops. Worth adding: the architecture performed well on translation tasks and still had large headroom (for example, by introducing an attention mechanism); see Section 1 of the paper for details.
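A minimal sketch of this encoder-decoder idea, assuming PyTorch; the class name and hyperparameters are illustrative, not the paper's exact configuration (the paper used 4-layer LSTMs with 1000 units):

```python
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    def __init__(self, src_vocab, tgt_vocab, emb=256, hid=512, layers=2):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb)
        # Two separate LSTMs: one encodes the source, one decodes the target.
        self.encoder = nn.LSTM(emb, hid, layers, batch_first=True)
        self.decoder = nn.LSTM(emb, hid, layers, batch_first=True)
        self.out = nn.Linear(hid, tgt_vocab)

    def forward(self, src, tgt):
        # Encode the whole source sequence into the final (h, c) state:
        # this is the fixed-dimensional vector the paper describes.
        _, state = self.encoder(self.src_emb(src))
        # The decoder is a conditional language model whose initial
        # state is the encoder's final state.
        dec_out, _ = self.decoder(self.tgt_emb(tgt), state)
        return self.out(dec_out)  # logits over the target vocabulary
```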
In Section 2, which presents the model, the authors also emphasize how they adapted the plain LSTM setup, which boils down to the following three points:
1) They use two different LSTMs (one to encode, one to decode). This increases the number of model parameters while adding almost no computation, yet improves the model's generalization.
2) Experiments show that deep LSTMs outperform shallow ones, which matches common intuition.
3) They reverse the input sentence: ABC is fed in as CBA, and experiments show this works better. The authors candidly admit they have no complete theoretical account, but offer an explanation, which I read as follows: reversing does not change the average source-to-target distance (distance here meaning the time-step difference), but it brings the first few words much closer to their translations, while the extra distance added to words at the end of the sentence seems to cost little, so reversed input ends up performing better. A small preprocessing sketch follows this list.
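A tiny sketch of the reversal trick (hypothetical token lists for illustration; only the source side is reversed, never the target):

```python
def make_pair(src_tokens, tgt_tokens, eos="<EOS>"):
    # Reverse only the source: "A B C" becomes "C B A".
    # The target keeps its natural order.
    return src_tokens[::-1] + [eos], tgt_tokens + [eos]

src, tgt = make_pair(["A", "B", "C"], ["W", "X", "Y", "Z"])
# src == ["C", "B", "A", "<EOS>"], tgt == ["W", "X", "Y", "Z", "<EOS>"]
```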
One more model detail worth mentioning: at decoding time, beam search is used, and it significantly improves accuracy. In essence, the method keeps several candidate sentences at each decoding step and finally selects the sequence with the highest probability. With a beam size of 1, it reduces to greedy decoding, picking the most probable token at every step. The paper's conclusion is that a beam size of 2 already provides most of the benefit. A sketch of the procedure appears below; for more discussion of the method, see the blog post linked in the original article.
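A compact sketch of beam search over an abstract scoring function (the `step` callback and its log-probability interface are assumptions for illustration, not the paper's implementation):

```python
def beam_search(step, start, eos, beam_size=2, max_len=50):
    """step(prefix) -> list of (token, logprob) for the next position."""
    # Each hypothesis is (accumulated logprob, token list).
    beams = [(0.0, [start])]
    for _ in range(max_len):
        candidates = []
        for score, seq in beams:
            if seq[-1] == eos:
                # Already finished: carry over unchanged.
                candidates.append((score, seq))
                continue
            for tok, lp in step(seq):
                candidates.append((score + lp, seq + [tok]))
        # Keep only the beam_size highest-scoring hypotheses.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = candidates[:beam_size]
        if all(seq[-1] == eos for _, seq in beams):
            break
    return max(beams, key=lambda c: c[0])[1]
```

With beam_size=1 this degenerates to the greedy strategy described above.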
For more training details, read Section 3 of the paper carefully; I will not expand on them here. The one point worth recording: they tried to keep the sentences within a batch similar in length, which reportedly roughly doubled training speed. A simple bucketing sketch follows.
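One simple way to approximate that batching trick (a hedged sketch, not the paper's exact pipeline):

```python
def length_bucketed_batches(pairs, batch_size):
    # Sort sentence pairs by source length so that each batch holds
    # similarly sized sentences and wastes little computation on padding.
    pairs = sorted(pairs, key=lambda p: len(p[0]))
    for i in range(0, len(pairs), batch_size):
        yield pairs[i:i + batch_size]
```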
I will not reproduce the concrete experimental results either; the paper states them very clearly, and the framework's results are strong. Two observations are interesting and valuable enough to discuss here.
1) The model is sensitive to the order of words within a sentence, but not to voice (see the paper's figure projecting phrase representations into two dimensions).
In that figure, once John and Mary swap positions, the two sentences no longer cluster together, whereas "admires" and "is in love with" do cluster. Evidently the model is quite sensitive to word order within a sentence.
2) The framework translates long sentences surprisingly well (see the corresponding two-panel figure in the paper).
The left plot shows BLEU as sentence length grows: the model shows no obvious decline (scores even rise at first) and dips only slightly beyond roughly 35 words. So the framework is fully up to handling long sentences; in practice, one may not even need to treat them specially.
The right plot shows generalization on sentences containing progressively rarer words, and here the decline is clear. That is not hard to understand, since rare constructions make up a small share of the training data. If a particular application scenario matters, the usual remedy is to add data from that scenario and fine-tune, which should also yield good results.
------------------Part 3 - Summary------------------
That covers the paper's core ideas and innovations. This seq2seq framework laid the groundwork for later sequence mapping tasks. The authors themselves were especially struck by how much reversing the source sentence helped, and by the model's ability to translate long sentences!
To sum up, this post walked through the paper's abstract, then the model details, and finally several of its innovations. I hope reading it deepens your understanding of the paper. If I got anything wrong, please point it out; let's discuss and improve together~