
dc.contributor.author: Barrón-Cedeño, Alberto
dc.contributor.author: Vila, Marta
dc.contributor.author: Martí, Maria Antonia
dc.contributor.author: Rosso, Paolo
dc.contributor.other: Universitat Politècnica de Catalunya. Departament de Llenguatges i Sistemes Informàtics
dc.date.accessioned: 2013-10-04T08:32:27Z
dc.date.available: 2013-10-05T09:02:55Z
dc.date.created: 2013
dc.date.issued: 2013
dc.identifier.citation: Barron-Cedeño, A. [et al.]. Plagiarism meets paraphrasing: insights for the next generation in automatic plagiarism detection. "Computational linguistics", 2013, vol. 39, no. 4, p. 1-32.
dc.identifier.issn: 0891-2017
dc.identifier.uri: http://hdl.handle.net/2117/20297
dc.description.abstract: Although paraphrasing is the linguistic mechanism underlying many plagiarism cases, little attention has been paid to its analysis in the framework of automatic plagiarism detection. Therefore, state-of-the-art plagiarism detectors find it difficult to detect cases of paraphrase plagiarism. In this article, we analyze the relationship between paraphrasing and plagiarism, paying special attention to which paraphrase phenomena underlie acts of plagiarism and which of them are detected by plagiarism detection systems. With this aim in mind, we created the P4P corpus, a new resource that uses a paraphrase typology to annotate a subset of the PAN-PC-10 corpus for automatic plagiarism detection. The results of the Second International Competition on Plagiarism Detection were analyzed in the light of this annotation. The presented experiments show that (i) more complex paraphrase phenomena and a high density of paraphrase mechanisms make plagiarism detection more difficult, (ii) lexical substitutions are the paraphrase mechanisms used the most when plagiarizing, and (iii) paraphrase mechanisms tend to shorten the plagiarized text. For the first time, the paraphrase mechanisms behind plagiarism have been analyzed, providing critical insights for the improvement of automatic plagiarism detection systems.
dc.format.extent: 32 p.
dc.language.iso: eng
dc.subject: UPC subject areas::Computer science::Artificial intelligence::Natural language
dc.subject.lcsh: Plagiarism detection systems
dc.title: Plagiarism meets paraphrasing: insights for the next generation in automatic plagiarism detection
dc.type: Article
dc.subject.lemac: Plagiarism
dc.subject.lemac: Computational linguistics
dc.contributor.group: Universitat Politècnica de Catalunya. GPLN - Grup de Processament del Llenguatge Natural
dc.identifier.doi: 10.1162/COLI_a_00153
dc.description.peerreviewed: Peer Reviewed
dc.relation.publisherversion: http://www.mitpressjournals.org/doi/abs/10.1162/COLI_a_00153?prevSearch=allfield%253A%2528Barr%25C3%25B3n%2529&searchHistoryKey=
dc.rights.access: Open Access
local.identifier.drac: 12478493
dc.description.version: Postprint (published version)
dc.relation.projectid: info:eu-repo/grantAgreement/EC/FP7/269180/EU/Web Information Quality Evaluation Initiative/WIQ-EI
dc.relation.projectid: info:eu-repo/grantAgreement/EC/FP7/246016/EU/Alain Bensoussan Career Development Enhancer/ABCDE
local.citation.author: Barron-Cedeño, A.; Vila, M.; Martí, M.; Rosso, P.
local.citation.publicationName: Computational linguistics
local.citation.volume: 39
local.citation.number: 4
local.citation.startingPage: 1
local.citation.endingPage: 32

