Cristin result ID: 1925399

Last modified: 15 February 2022, 15:19

NVI reporting year: 2021

Result

Academic chapter/article/conference paper

2021

Assessing the Quality of Human-Generated Summaries with Weakly Supervised Learning

Joakim Olsen, Arild Brandrud Næss and Pierre Lison

Book

Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa)

ISBN:

978-91-7929-614-8

Publisher

Linköping University Electronic Press

NVI level 1

Series

Linköping Electronic Conference Proceedings

ISSN 1650-3686
e-ISSN 1650-3740

NVI level 1

About the result

Academic chapter/article/conference paper

Year of publication: 2021

Issue: 178

Pages: 112-123

ISBN:

978-91-7929-614-8

Links

ORIA

Search ORIA for 978-91-7929-614-8

Classification

Field (NPI)

Field: Interdisciplinary technology

- Subject area: Natural sciences and technology

Description

English

Title

Assessing the Quality of Human-Generated Summaries with Weakly Supervised Learning

Abstract

This paper explores how to automatically measure the quality of human-generated summaries, based on a Norwegian corpus of real estate condition reports and their corresponding summaries. The proposed approach proceeds in two steps. First, the real estate reports and their associated summaries are automatically labelled using a set of heuristic rules gathered from human experts and aggregated using weak supervision. The aggregated labels are then employed to learn a neural model that takes a document and its summary as inputs and outputs a score reflecting the predicted quality of the summary. The neural model maps the document and its summary to a shared “summary content space” and computes the cosine similarity between the two document embeddings to predict the final summary quality score. The best performance is achieved by a CNN-based model with an accuracy (measured against the aggregated labels obtained via weak supervision) of 89.5%, compared to 72.6% for the best unsupervised model. Manual inspection of examples indicates that the weak supervision labels do capture important indicators of summary quality, but the correlation of those labels with human judgements remains to be validated. Our models of summary quality predict that approximately 30% of the real estate reports in the corpus have a summary of poor quality.
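
The scoring step described in the abstract, cosine similarity between a document embedding and a summary embedding, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `cosine_similarity`, the three-dimensional vectors, and the zero-vector convention are all assumptions made for illustration, and in the paper the embeddings come from a trained neural model rather than hand-written numbers.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Convention chosen for this sketch: a zero vector scores 0.
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical embeddings with made-up values (not taken from the paper).
doc_embedding = [0.2, 0.7, 0.1]
good_summary = [0.4, 1.4, 0.2]   # points in the same direction as the document
poor_summary = [0.7, -0.2, 0.0]  # orthogonal to the document embedding

score_good = cosine_similarity(doc_embedding, good_summary)
score_poor = cosine_similarity(doc_embedding, poor_summary)
```

Under these assumptions a summary whose embedding points in the same direction as the document's receives a score close to 1, while an orthogonal one receives a score close to 0. How such a score is turned into a good/poor decision is not specified in the abstract.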

Contributors

Joakim Olsen

Author
at Institutt for matematiske fag, Norges teknisk-naturvitenskapelige universitet
Author
at NTNU Handelshøyskolen, Norges teknisk-naturvitenskapelige universitet

Arild Brandrud Næss

Author
at NTNU Handelshøyskolen, Norges teknisk-naturvitenskapelige universitet

Pierre Lison

Author
at Avdeling for statistisk analyse og maskinlæring for brukermotiverte anvendelser SAMBA, Norsk Regnesentral

Associated projects

Graph-based Neural Models for Dialogue Management

Pierre Lison + 5 participants

Norsk Regnesentral

35 results

Active project

The result is part of

Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa).

Dobnik, Simon; Øvrelid, Lilja. 2021, Linköping University Electronic Press. UIO. Academic anthology/conference series
