Journal articles

http://hdl.handle.net/2117/3778
2024-04-24T21:28:39Z

Information retrieval from scientific abstract and citation databases: A query-by-documents approach based on Monte-Carlo sampling
http://hdl.handle.net/2117/370214
Lechtenberg, Fabian; Farreres de la Morena, Xavier; Galvan Cara, Aldwin Lois; Somoza Tornos, Ana; Espuña Camarasa, Antonio; Graells Sobré, Moisès
2022-07-14T13:13:08Z

The rapidly increasing number of entries in abstract and citation databases steadily complicates the information retrieval task. In this study, a novel query-by-document approach using Monte-Carlo sampling of relevant keywords is presented. From a set of input documents (the seed), keywords are extracted using TF-IDF and subsequently sampled to repeatedly construct queries to the database. The number of times each document is returned is counted and serves as a proxy relevance metric. Two case studies based on the Scopus® database are used to demonstrate the method and its key advantages. No expert knowledge or human intervention is needed to construct the final search strings, which reduces human bias. The method's practicality is supported by the re-retrieval of 7 of 8 and 26 of 31 seed documents at high ranks in the two case studies.

Zipf’s laws of meaning in Catalan
http://hdl.handle.net/2117/358980
Catala Roig, Neus; Baixeries i Juvillà, Jaume; Ferrer Cancho, Ramon; Padró, Lluís; Hernández Fernández, Antonio
2021-12-21T13:27:15Z

In his pioneering research, G. K. Zipf formulated a couple of statistical laws on the relationship between the frequency of a word and its number of meanings: the law of meaning distribution, relating the number of meanings of a word and its frequency rank, and the meaning-frequency law, relating the frequency of a word with its number of meanings. Although these laws were formulated more than half a century ago, they have only been investigated in a few languages. Here we present the first study of these laws in Catalan. We verify these laws in Catalan via the relationship among their exponents and that of the rank-frequency law. We present a new protocol for the analysis of these Zipfian laws that can be extended to other languages. We report the first evidence of two marked regimes for these laws in written language and speech, paralleling the two regimes in Zipf’s rank-frequency law in large multi-author corpora discovered in the early 2000s. Finally, we discuss the implications of these two regimes.

Unleashing textual descriptions of business processes
http://hdl.handle.net/2117/354549
Sànchez-Ferreres, Josep; Burattin, Andrea; Carmona Vargas, Josep; Montali, Marco; Padró, Lluís; Quishpi Betún, Luis Hernán
2021-10-26T07:33:48Z

Textual descriptions of processes are ubiquitous in organizations, so that documentation of the important processes is accessible to anyone involved. Unfortunately, the value of this rich data source is hampered by the challenge of analyzing unstructured information. In this paper we propose a framework to overcome the current limitations in dealing with textual descriptions of processes. This framework covers extraction and analysis, and connects to process mining via simulation. The framework is grounded in the notion of annotated textual descriptions of processes, which represents a middle ground between formalization and accessibility, and which accounts for different modeling styles, ranging from purely imperative to purely declarative. The contributions of this paper are implemented in several tools, and case studies are highlighted.

Computation of alignments of business processes through relaxation labelling and local optimal search
http://hdl.handle.net/2117/336365
Padró, Lluís; Carmona Vargas, Josep
2021-02-02T07:42:39Z

A fundamental problem in conformance checking is aligning event data with process models. Unfortunately, existing techniques for this task are either complex or applicable only to restricted classes of models. In practice, this means that for large inputs, current techniques often fail to produce a result. In this paper we propose a method to compute alignments for unconstrained process models, which relies on the use of relaxation labelling techniques on top of a partial order representation of the process model. The proposed technique precomputes information used in the search for alignments, and is able to produce real alignments that may be close to optimal ones by combining the aforementioned techniques with a locally applied A* strategy. Remarkably, the implementation of the proposed technique achieves a speed-up of several orders of magnitude with respect to the approaches in the literature (either optimal, sub-optimal or approximate), often with a reasonable trade-off in the cost of the obtained alignment.

Flexible process model mapping using relaxation labeling
http://hdl.handle.net/2117/330656
Delicado Alcántara, Luis; Carmona Vargas, Josep; Padró, Lluís
2020-10-22T14:55:08Z

Computing a mapping between two process models is a crucial technique, since it enables reasoning and operating across processes, such as providing a similarity score between two processes, or merging different process variants to generate a consolidated process model. In this paper we present a new flexible technique for process model mapping, based on the relaxation labeling constraint satisfaction algorithm. The technique can be instantiated in different modes, depending on the context. For instance, it can be adapted to the case where one of the mapped process models is incomplete, or it can be used to ground an adaptable similarity measure between process models. The approach has been implemented inside the open platform NLP4BPM, providing a visualization of the performed mappings and computed similarity scores. The experimental results demonstrate the flexibility and usefulness of the proposed technique.

Filtrado de especificaciones de software escritas en lenguaje natural [Filtering software specifications written in natural language]
http://hdl.handle.net/2117/193050
Castell Ariño, Núria; Hernández Gómez, M. Angeles
2020-07-16T15:10:02Z

The specification phase is one of the most important and least controlled phases of the software development process. We have designed SAREL (Sistema de Ayuda para la Redacción de Especificaciones de Software escritas en Lenguaje Natural; Assistance System for Writing Software Specifications in Natural Language) as a tool to improve the specification base. The original version of this article, in English, was published by Springer Verlag in the Lecture Notes on AI series [11]. SAREL is a continuation of the research and development program called LESD (Ingeniería Lingüística para el Diseño de Software; Linguistic Engineering for Software Design). The purpose of SAREL is to help engineers create software specifications written in natural language. This work was partially funded by CICYT (TIC93-420). SAREL is divided into three modules: the first checks the requirements against the writing standards, the second obtains a conceptual representation using the knowledge base, and the third carries out a series of analyses that take into account the following quality properties: consistency, completeness, traceability, verifiability, and modifiability. Once a requirement has been labeled as correct, its conceptual representation is added to the requirements base.

Analitzant DESIGN/1 de FOUNDATION com a eina per especificar sistemes d'informació [Analyzing DESIGN/1 from FOUNDATION as a tool for specifying information systems]
http://hdl.handle.net/2117/193048
Martín Escofet, Carme; Oliva Solé, Marta; Sesé Muniátegui, Feliciano; Slavkova Hernández, Ólga
2020-07-16T14:29:08Z

Supporting the process of learning and teaching process models
http://hdl.handle.net/2117/193012
Sànchez-Ferreres, Josep; Delicado Alcántara, Luis; Andaloussi, Amine Abbad; Burattin, Andrea; Calderón Ruiz, Guillermo; Weber, Barbara; Carmona Vargas, Josep; Padró, Lluís
2020-07-16T09:07:07Z

The creation of a process model faces the challenge of constructing a syntactically correct entity that accurately reflects the semantics of reality and is understandable. This paper proposes a framework called ModelJudge, aimed at the two main actors in the process of learning process model creation: novice modellers and instructors. For modellers, the platform enables the automatic validation of process models created from a textual description, providing explanations about quality issues in the model. ModelJudge can provide diagnostics regarding model structure, writing style, and semantics by aligning annotated textual descriptions to models. For instructors, the platform facilitates the creation of modelling exercises by providing an editor to annotate the main parts of a textual description; the editor is supported by Natural Language Processing (NLP) capabilities so that the annotation effort is minimized. So far, around 300 students in process modelling courses at five different universities around the world have used the platform. The feedback gathered from some of these courses shows good potential in helping students to improve their learning experience, which might, in turn, impact process model quality and understandability. Moreover, our results show that instructors can benefit from insights into the evolution of modelling processes, including quality issues arising for individual students, and can also discover tendencies in groups of students. Although the framework has been applied to process model creation, it could be extrapolated to other contexts where the creation of models based on a textual description plays an important role.

Del texto a la información [From text to information]
http://hdl.handle.net/2117/192997
Atserias Batalla, Jordi; Castell Ariño, Núria; Catala Roig, Neus; Rodríguez Hontoria, Horacio; Turmo Borras, Jorge
2020-07-15T18:03:27Z

Computer applications centered on Language Processing (Tratamiento de la Lengua, TL) have seen a notable boom in recent years, above all in the area of access to unrestricted (and uncoded) textual information. In this context, systems for extracting information from unrestricted texts are becoming increasingly important. This boom has given rise to a new discipline, linguistic engineering, which addresses all the aspects (techniques, methods, tools, resources) involved in building TL-based applications. Our proposal falls within both of these currents: on the one hand, we present an information extraction environment, that is, a concrete TL application. On the other hand, we describe how this environment integrates different modules that address different TL problems in Spanish and that could potentially be used in other applications.

Un entorno para la extracción de información semántica del diccionario VOX [An environment for extracting semantic information from the VOX dictionary]
http://hdl.handle.net/2117/192943
Ageno Pulido, Alicia; Castellón Masalles, Irene; Ribas Framis, Francesc; Rigau Claramunt, German; Rodríguez Hontoria, Horacio; Martí Antonin, Maria Antònia; Taulé, Mariona; Verdejo Maillo, Maria Felisa
2020-07-14T17:47:28Z