Automated database design for document stores with multicriteria optimization

dc.contributor.author: Hewasinghage, Moditha Lakshan Dharmasir
dc.contributor.author: Nadal Francesch, Sergi
dc.contributor.author: Abelló Gamazo, Alberto
dc.contributor.author: Zimányi, Esteban
dc.contributor.group: Universitat Politècnica de Catalunya. inSSIDE - integrated Software, Services, Information and Data Engineering
dc.contributor.other: Universitat Politècnica de Catalunya. Departament d'Enginyeria de Serveis i Sistemes d'Informació
dc.date.accessioned: 2023-05-18T08:57:04Z
dc.date.available: 2023-05-18T08:57:04Z
dc.date.issued: 2023-03-11
dc.description.abstract: Document stores have gained popularity among NoSQL systems mainly due to the semi-structured data storage structure and the enhanced query capabilities. The database design in document stores expands beyond the first normal form by encouraging de-normalization through nesting. This hinders the process, as the number of alternatives grows exponentially with multiple choices in nesting (including different levels) and referencing (including the direction of the reference). Due to this complexity, document store data design is mostly carried out in trial-and-error or ad-hoc rule-based approaches. However, the choices affect multiple, often conflicting, aspects such as query performance, storage space, and complexity of the documents. To overcome these issues, in this paper, we apply multicriteria optimization. Our approach is driven by a query workload and a set of optimization objectives. First, we formalize a canonical model to represent alternative designs and introduce an algebra of transformations that can systematically modify a design. Then, using these transformations, we implement a local search algorithm driven by a loss function that can propose near-optimal designs with high probability. Finally, we compare our prototype against an existing document store data design solution purely driven by query cost, where our proposed designs have better performance and are more compact with less redundancy.
dc.description.peerreviewed: Peer Reviewed
dc.description.sponsorship: Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature. This research has been funded by the European Commission through the Erasmus Mundus Joint Doctorate "Information Technologies for Business Intelligence—Doctoral College" (IT4BI-DC). Sergi Nadal is partly supported by the Spanish Ministerio de Ciencia e Innovación, as well as the European Union—NextGenerationEU, under project FJC2020-045809-I / AEI/10.13039/501100011033.
dc.description.version: Postprint (published version)
dc.format.extent: 33 p.
dc.identifier.citation: Hewasinghage, M. [et al.]. Automated database design for document stores with multicriteria optimization. "Knowledge and information systems", 11 March 2023, vol. 65, p. 3046-3078.
dc.identifier.doi: 10.1007/s10115-023-01828-3
dc.identifier.issn: 0219-3116
dc.identifier.uri: https://hdl.handle.net/2117/387548
dc.language.iso: eng
dc.publisher: Springer
dc.relation.publisherversion: https://link.springer.com/article/10.1007/s10115-023-01828-3
dc.rights.access: Open Access
dc.rights.licensename: Attribution 4.0 International
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/
dc.subject: UPC subject areas::Computer science::Information systems::Databases
dc.subject.lcsh: Database design
dc.subject.lemac: Bases de dades -- Disseny
dc.subject.other: Document store
dc.subject.other: Optimization
dc.title: Automated database design for document stores with multicriteria optimization
dc.type: Article
dspace.entity.type: Publication
local.citation.author: Hewasinghage, M.; Nadal, S.; Abello, A.; Zimányi, E.
local.citation.endingPage: 3078
local.citation.publicationName: Knowledge and information systems
local.citation.startingPage: 3046
local.citation.volume: 65
local.identifier.drac: 35617730

Files

Original bundle

Name: s10115-023-01828-3.pdf
Size: 2.64 MB
Format: Adobe Portable Document Format