<?xml version="1.0" encoding="utf-8" ?><rss version="2.0" xml:base="https://www.talp.upc.edu/academy/term/43/all" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:foaf="http://xmlns.com/foaf/0.1/" xmlns:og="http://ogp.me/ns#" xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#" xmlns:sioc="http://rdfs.org/sioc/ns#" xmlns:sioct="http://rdfs.org/sioc/types#" xmlns:skos="http://www.w3.org/2004/02/skos/core#" xmlns:xsd="http://www.w3.org/2001/XMLSchema#">
  <channel>
    <title>Relevant publications</title>
    <link>https://www.talp.upc.edu/academy/term/43/all</link>
    <description></description>
    <language>en</language>
     <atom:link href="https://www.talp.upc.edu/taxonomy/term/43/all/feed" rel="self" type="application/rss+xml" />
      <item>
    <title>Iconic Translation Machines - Incremental Interlingua-based Neural Machine Translation</title>
    <link>https://www.talp.upc.edu/content/iconictranslation-machines-incremental-interlingua-based-neural-machine-translation</link>
    <description>&lt;div class=&quot;field-item even&quot;&gt;&lt;img typeof=&quot;foaf:Image&quot; src=&quot;https://www.talp.upc.edu/sites/default/files/nmt-38-banner.jpg&quot; width=&quot;1024&quot; height=&quot;521&quot; /&gt;&lt;/div&gt;&lt;p&gt;&lt;strong&gt;Author: Dr. Marta R. Costa-jussà, a Ramón y Cajal Researcher, TALP Research Center, Universitat Politècnica de Catalunya, Barcelona&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;This week, we have a guest post from Marta R. Costa-jussà, a Ramón y Cajal Researcher from the TALP Research Center at the Universitat Politècnica de Catalunya, in Barcelona. In Issue #37 we saw that, for zero-shot translation to work well, we must be able to encode the source text into a language-independent representation, and to decode from this common representation into the target language. In this week’s issue, Marta gives us more insight into this topic, and explains how to build such a system incrementally.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Introduction&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Multilingual Neural Machine Translation is standard practice nowadays. A typical architecture feeds multiple languages during training into a single universal encoder and decoder, which allows zero-shot translation at inference. The decoder is told which language to translate into by a tag in the source sentence carrying this information. An alternative to this architecture uses one encoder and one decoder per language and shares an attention layer, which becomes the interlingua component. In both cases, all components are trained at the same time, and adding a new language requires retraining the entire system.&lt;/p&gt;
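As a minimal sketch of the tag mechanism described above (the `<2xx>` tag format and function name are illustrative assumptions, not the authors' code; conventions vary between systems):

```python
def add_target_tag(source_tokens, target_lang):
    """Prepend a language tag such as '<2es>' that tells the universal
    decoder which language to translate into."""
    return ["<2{}>".format(target_lang)] + source_tokens

print(add_target_tag("how are you".split(), "es"))
# ['<2es>', 'how', 'are', 'you']
```

The rest of the model treats the tag as an ordinary vocabulary token; the decoder learns to associate it with the target language during training.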
&lt;p&gt;&lt;strong&gt;Joint training and Incremental Language Addition&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;We propose an architecture that allows new languages to be added incrementally, without retraining the languages already in the system. To this end, we use independent encoders and decoders, one encoder and one decoder per language, all sharing the same intermediate representation.&lt;/p&gt;
&lt;p&gt;Let’s assume we initially train our system with two languages (X and Y). To train a multilingual system with these two languages, we combine the tasks of auto-encoding in both languages (XX and YY) and translation from X to Y (XY) and from Y to X (YX). This is done by optimising the auto-encoder losses for both languages together with the two translation losses. In addition, we compute yet another loss which minimises the distance between the intermediate representations produced by encoder X and encoder Y. We refer to this term as the interlingua loss.&lt;/p&gt;
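A minimal numeric sketch of this objective (function names, the squared-distance choice, and all values are illustrative assumptions, not the authors' implementation):

```python
import numpy as np

def interlingua_loss(h_x, h_y):
    # Squared Euclidean distance between the two encoders'
    # intermediate representations of the same sentence.
    return float(np.sum((h_x - h_y) ** 2))

def joint_loss(l_xx, l_yy, l_xy, l_yx, h_x, h_y, weight=1.0):
    # Two auto-encoder losses (XX, YY), two translation losses (XY, YX),
    # plus the weighted interlingua term.
    return l_xx + l_yy + l_xy + l_yx + weight * interlingua_loss(h_x, h_y)

h_x = np.array([0.2, 0.5, 0.1])   # toy representation from encoder X
h_y = np.array([0.2, 0.4, 0.1])   # toy representation from encoder Y
print(joint_loss(0.9, 1.1, 1.4, 1.3, h_x, h_y))
```

Minimising the last term pushes both encoders towards a shared intermediate space, which is what later enables swapping encoders and decoders freely.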
&lt;p&gt;Given the jointly trained model, the next step is to add a language Z without retraining any of the languages already in the system. Given parallel data between Z and either X or Y (let’s assume Z-X parallel data, for illustration), we train a new bilingual system. We reuse the previously trained decoder X and train a new encoder Z. Note that decoder X is frozen: we only train the new module, encoder Z. By doing this we force encoder Z to produce representations similar to those of the already trained languages. As a consequence, our system can now translate from language Z to X and, in addition, we obtain zero-shot translation between Z and Y, because our architecture builds on compatible encoders and decoders.&lt;/p&gt;
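The freezing step can be sketched as follows (parameter names and the toy one-step SGD update are assumptions for illustration, not the authors' code; the point is only that frozen parameters are excluded from the update):

```python
params = {
    "encoder_Z.w": 0.5,   # new module: trainable
    "decoder_X.w": 1.0,   # previously trained module: frozen
}
trainable = {name for name in params if name.startswith("encoder_Z")}

def sgd_step(params, grads, lr=0.1):
    # Apply the gradient update only to trainable (encoder Z) parameters;
    # frozen decoder X parameters pass through unchanged.
    return {
        name: (value - lr * grads[name] if name in trainable else value)
        for name, value in params.items()
    }

grads = {"encoder_Z.w": 0.2, "decoder_X.w": 0.7}
params = sgd_step(params, grads)
print(params)  # decoder_X.w stays at 1.0
```

Because the decoder's parameters never move, the only way for encoder Z to reduce the translation loss is to produce representations the existing decoder already understands.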
&lt;p&gt;&lt;strong&gt;Preliminary results&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;We have analysed how the model behaves for different low-resource languages (English, Turkish and Kazakh). Our model outperforms current bilingual systems by 5% and surpasses pivoting approaches by 14%. These results confirm that we are able to train independent encoders and decoders that share intermediate representations.&lt;/p&gt;
&lt;p&gt;However, visualising the intermediate representations for different languages shows that similar sentences still tend to be placed at different points in the intermediate representation space. This contrasts with the good translation results above, which showed that the encoders and decoders are compatible. Contrary to our expectations, this suggests that the system may not require common representations in order to learn compatible modules.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;In summary&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;We have shown first steps towards achieving competitive translations with a flexible architecture for multilingual and zero-shot translation, one that scales to new languages without retraining the languages already in the system.&lt;/p&gt;
&lt;p&gt;One of the next steps will be to further investigate learning compatible representations versus forcing exactly the same representation, since our focus is to benefit from the advantages of the interlingua translation approach (i.e. reducing the quadratic dependency on the number of languages to a linear one, and incremental training), which may not require creating such a universal representation.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Acknowledgements&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;This work is in cooperation with Carlos Escolano and José A. R. Fonollosa and the technical paper can be found here. This work is supported in part by a Google Faculty Research Award and the Spanish Ministerio de Economía y Competitividad, the European Regional Development Fund and the Agencia Estatal de Investigación, through the post-doctoral senior grant Ramón y Cajal.&lt;/p&gt;
</description>
     <pubDate>Mon, 27 May 2019 10:57:19 +0000</pubDate>
 <dc:creator>Gemma Thomas</dc:creator>
 <guid isPermaLink="false">848 at https://www.talp.upc.edu</guid>
  </item>
  <item>
    <title>LA VANGUARDIA - The &quot;senses&quot; of Artificial Intelligence bear a Spanish signature</title>
    <link>https://www.talp.upc.edu/content/la-vanguardia-los-sentidos-de-la-inteligencia-artificial-tienen-firma-espanola</link>
    <description>&lt;p&gt;Redacción: Álvaro Celorio&lt;/p&gt;
&lt;p&gt;Nueva York, 20 may (EFE).- Los sentidos de la Inteligencia Artificial tienen firma española, en concreto de Verbio, empresa afincada en Cataluña que aporta &lt;strong&gt;&quot;el sentido del oído y del habla&quot;&lt;/strong&gt; a los nuevos empleados digitales para que sean capaces de escuchar y responder a los clientes.&lt;/p&gt;
&lt;p&gt;Así lo expresa el director general de &lt;strong&gt;Verbio&lt;/strong&gt;, Jordi Torres, que en una entrevista con Efe explica cómo su empresa ha conseguido que Amelia, el software de Inteligencia Artificial de la compañía IPsoft que Telefónica utiliza en sus centros de llamadas de Perú, comprenda y sea útil para sus clientes.&lt;/p&gt;
&lt;p&gt;&quot;Para que Amelia pueda entender lo que el usuario está diciendo, lo que necesitamos es pasar la voz a texto. El ordenador no entiende la voz, entiende ceros y unos. Y al revés: para que Amelia pueda responder al usuario necesitamos transformar esos ceros y unos en voz sintetizada&quot;, aclara el catalán en un encuentro en Nueva York sobre Inteligencia Artificial.&lt;/p&gt;
&lt;p&gt;Verbio ha sido la encargada de que los centros de llamadas de Telefónica en Perú, que recientemente han integrado un sistema de atención al cliente con &quot;empleados digitales&quot; programados con Inteligencia Artificial, hablen con acento peruano y con un tono agradable para sus usuarios.&lt;/p&gt;
&lt;p&gt;Para explicar cómo lo hizo, Torres dio una charla en Nueva York en el marco del III Encuentro de Empleados Digitales, organizado por la empresa IPsoft, y donde Telefónica también presentó su iniciativa para mejorar la experiencia del usuario a través de la IA.&lt;/p&gt;
&lt;p&gt;&quot;Con Telefónica en Perú nos pasó una cosa sorprendente: tuvimos que dar marcha atrás para que el saludo de Amelia al usuario incluyera &#039;tu asistente virtual&#039;, porque llegaba un punto en que los usuarios no eran conscientes de que hablaban con una máquina&quot;, explica Torres, que lleva en Verbio desde su fundación en 1999, auspiciada por la Universidad Politécnica de Cataluña.&lt;/p&gt;
&lt;p&gt;&quot;El cliente empezaba a hablarle sin fin, pensando que quien estaba detrás era un humano, cuando necesitábamos que la interacción fuera más concisa y corta. Esto es algo bueno, porque la interacción resulta en que al final al usuario le resulta difícil darse cuenta de que es una máquina&quot;, asevera.&lt;/p&gt;
&lt;p&gt;El 85 % de los empleados de la empresa -que cuenta con sedes en Estados Unidos, México y Brasil- son técnicos, aunque Torres desgrana que esto no incluye solo ingenieros, sino también al perfil más demandado y difícil de encontrar por su compañía: los lingüistas.&lt;/p&gt;
&lt;p&gt;&quot;No solo necesitas a gente que aporte la parte tecnológica, sino también la parte lingüística. Cuando necesitas que la máquina entienda al usuario, para que lo haga tiene que haber un proceso de comprensión semántica, contextual, de los diferentes tipos de acentos...&quot;, detalla Torres.&lt;/p&gt;
&lt;p&gt;Este perfil de lingüista computacional es el difícil de encontrar, &quot;no solo en España, sino también en Estados Unidos&quot;, y a veces provienen de la Universidad y, otras, de hacer tareas similares en otras compañías.&lt;/p&gt;
&lt;p&gt;Estos expertos en lengua y gramática resultan primordiales para otra parte fundamental: construir la voz que habla a los usuarios. Estos se encargan de crear un corpus de palabras, con diferentes entonaciones, para después grabarlas y con un sistema informático &quot;dividirlos e miles de trocitos para luego pegarlos y generar una voz&quot;.&lt;/p&gt;
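The concatenative idea described above can be caricatured in a few lines (entirely illustrative: real unit-selection systems store and join thousands of recorded audio fragments, not toy number lists):

```python
# Toy inventory of pre-recorded "fragments" (fake audio samples).
unit_inventory = {
    "hel": [0.1, 0.3], "lo": [0.2, 0.1],
    "wor": [0.4, 0.2], "ld": [0.1, 0.0],
}

def synthesise(units):
    # Concatenate the stored fragments for the requested unit sequence,
    # in order, to produce the output "voice".
    audio = []
    for u in units:
        audio.extend(unit_inventory[u])
    return audio

print(synthesise(["hel", "lo"]))  # [0.1, 0.3, 0.2, 0.1]
```

The linguists' role in the article maps to building and annotating the inventory: choosing which units to record and with which intonations.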
&lt;p&gt;Alongside the engineers and linguists, 20% of the team hold doctorates and work on the &quot;state of the art&quot; of the technology: &quot;That profile means you do not focus only on the company&#039;s day-to-day, on what brings in business today; they look for new products&quot;.&lt;/p&gt;
&lt;p&gt;Because the company&#039;s greatest difficulty lies not only in finding the linguists, but in charting the path ahead, since it is something that does not exist today: &quot;The greatest complexity is in trying to evolve constantly. It is defining where we want to go and what we want to achieve, without knowing whether that new technology will work for users,&quot; he stresses. EFE&lt;/p&gt;
</description>
     <pubDate>Wed, 22 May 2019 10:54:26 +0000</pubDate>
 <dc:creator>Gemma Thomas</dc:creator>
 <guid isPermaLink="false">830 at https://www.talp.upc.edu</guid>
  </item>
  <item>
    <title>European Commission: Ethics guidelines for trustworthy AI</title>
    <link>https://www.talp.upc.edu/content/european-commission-ethics-guidelines-trustworthy-ai</link>
    <description>&lt;div class=&quot;field-item even&quot;&gt;&lt;img typeof=&quot;foaf:Image&quot; src=&quot;https://www.talp.upc.edu/sites/default/files/ai_cropped.jpg&quot; width=&quot;697&quot; height=&quot;321&quot; /&gt;&lt;/div&gt;&lt;p&gt;REPORT / STUDY8 April 2019&lt;/p&gt;
&lt;p&gt;Today, the High-Level Expert Group on AI presents their ethics guidelines for trustworthy artificial intelligence. This follows the publication of the guidelines&#039; first draft in December 2018 on which more than 500 comments were received through an open consultation.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Download the guidelines &lt;/strong&gt; &lt;a href=&quot;https://ec.europa.eu/digital-single-market/en/news/ethics-guidelines-trustworthy-ai&quot;&gt;https://ec.europa.eu/digital-single-market/en/news/ethics-guidelines-trustworthy-ai&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Visit the website&lt;/strong&gt;  &lt;a href=&quot;https://ec.europa.eu/futurium/en/ai-alliance-consultation/guidelines#Top&quot;&gt;https://ec.europa.eu/futurium/en/ai-alliance-consultation/guidelines#Top&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;According to the guidelines, trustworthy AI should be:&lt;/p&gt;
&lt;p&gt;(1) lawful -  respecting all applicable laws and regulations&lt;/p&gt;
&lt;p&gt;(2) ethical - respecting ethical principles and values&lt;/p&gt;
&lt;p&gt;(3) robust - both from a technical perspective and taking into account its social environment&lt;/p&gt;
&lt;p&gt;The guidelines put forward a set of 7 key requirements that AI systems should meet in order to be deemed trustworthy. A specific assessment list aims to help verify the application of each of the key requirements:&lt;/p&gt;
&lt;p&gt;Human agency and oversight: AI systems should empower human beings, allowing them to make informed decisions and fostering their fundamental rights. At the same time, proper oversight mechanisms need to be ensured, which can be achieved through human-in-the-loop, human-on-the-loop, and human-in-command approaches&lt;/p&gt;
&lt;p&gt;Technical robustness and safety: AI systems need to be resilient and secure. They need to be safe, ensuring a fallback plan in case something goes wrong, as well as being accurate, reliable and reproducible. That is the only way to ensure that unintentional harm can also be minimized and prevented.&lt;/p&gt;
&lt;p&gt;Privacy and data governance: besides ensuring full respect for privacy and data protection, adequate data governance mechanisms must also be ensured, taking into account the quality and integrity of the data, and ensuring legitimised access to data.&lt;/p&gt;
&lt;p&gt;Transparency: the data, system and AI business models should be transparent. Traceability mechanisms can help achieve this. Moreover, AI systems and their decisions should be explained in a manner adapted to the stakeholder concerned. Humans need to be aware that they are interacting with an AI system, and must be informed of the system’s capabilities and limitations.&lt;/p&gt;
&lt;p&gt;Diversity, non-discrimination and fairness: unfair bias must be avoided, as it could have multiple negative implications, from the marginalization of vulnerable groups to the exacerbation of prejudice and discrimination. Fostering diversity, AI systems should be accessible to all, regardless of any disability, and involve relevant stakeholders throughout their entire life cycle.&lt;/p&gt;
&lt;p&gt;Societal and environmental well-being: AI systems should benefit all human beings, including future generations. It must hence be ensured that they are sustainable and environmentally friendly. Moreover, they should take into account the environment, including other living beings, and their social and societal impact should be carefully considered. &lt;/p&gt;
&lt;p&gt;Accountability: mechanisms should be put in place to ensure responsibility and accountability for AI systems and their outcomes. Auditability, which enables the assessment of algorithms, data and design processes, plays a key role therein, especially in critical applications. Moreover, adequate and accessible redress should be ensured.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Next Steps&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;A piloting process will be set up as a means of gathering practical feedback on how the assessment list, which operationalises the key requirements, can be improved. All interested stakeholders can already register their interest to participate in the piloting process, which will be kicked off in summer 2019.&lt;/p&gt;
&lt;p&gt;Moreover, a forum discussion was set up to foster the exchange of best practices on the implementation of Trustworthy AI.&lt;/p&gt;
&lt;p&gt;Following the piloting phase and building on the feedback received, the High-Level Expert Group on AI will review the assessment lists for the key requirements in early 2020. Based on this review, the Commission will evaluate the outcome and propose any next steps.&lt;/p&gt;
&lt;p&gt;All relevant information on the document, as well as the next steps towards the review of the assessment list, can be found on the AI Alliance page dedicated to the guidelines.&lt;/p&gt;
</description>
     <pubDate>Fri, 12 Apr 2019 10:06:40 +0000</pubDate>
 <dc:creator>Gemma Thomas</dc:creator>
 <guid isPermaLink="false">785 at https://www.talp.upc.edu</guid>
  </item>
  <item>
    <title>Deep learning backend for single and multisession i-vector speaker recognition</title>
    <link>https://www.talp.upc.edu/content/deep-learning-backend-single-and-multisession-i-vector-speaker-recognition</link>
    <description>&lt;div class=&quot;field-item even&quot;&gt;&lt;img typeof=&quot;foaf:Image&quot; src=&quot;https://www.talp.upc.edu/sites/default/files/omid.jpg&quot; width=&quot;225&quot; height=&quot;300&quot; /&gt;&lt;/div&gt;&lt;p&gt;&lt;strong&gt;Author:&lt;/strong&gt; Ghahabi, Omid and Hernando Pericás, Francisco Javier&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Date:&lt;/strong&gt; 2017-04-01&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract: &lt;/strong&gt;The lack of labeled background data creates a large performance gap between cosine and Probabilistic Linear Discriminant Analysis (PLDA) scoring baseline techniques for i-vectors in speaker recognition. Although there are unsupervised clustering techniques to estimate the labels, they cannot accurately predict the true labels, and they also assume that there are several samples from the same speaker in the background data, which may not be true in reality. In this paper, the authors make use of Deep Learning (DL) to fill this performance gap given unlabeled background data. To this end, the authors propose an impostor selection algorithm and a universal model adaptation process in a hybrid system based on deep belief networks and deep neural networks to discriminatively model each target speaker. In order to gain more insight into the behavior of DL techniques in both single- and multisession speaker enrollment tasks, experiments are carried out in both scenarios. Experiments on the National Institute of Standards and Technology 2014 i-vector challenge show that 46% of this performance gap, in terms of the minimum of the decision cost function, is filled by the proposed DL-based system. Furthermore, the score combination of the proposed DL-based system and PLDA with estimated labels covers 79% of this gap.&lt;/p&gt;
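As background for the cosine baseline the abstract refers to, a minimal sketch (made-up 3-dimensional vectors for illustration; real i-vectors typically have a few hundred dimensions, and this is not the paper's code):

```python
import numpy as np

def cosine_score(enroll, test):
    # Length-normalise both i-vectors, then take their dot product.
    enroll = enroll / np.linalg.norm(enroll)
    test = test / np.linalg.norm(test)
    return float(np.dot(enroll, test))

def multisession_model(session_ivectors):
    # A common choice for multisession enrollment: average the
    # per-session i-vectors into one speaker model before scoring.
    return np.mean(session_ivectors, axis=0)

sessions = [np.array([1.0, 0.5, 0.1]), np.array([0.9, 0.6, 0.0])]
model = multisession_model(sessions)
print(cosine_score(model, np.array([0.95, 0.55, 0.05])))
```

Unlike PLDA, this scoring needs no labeled background data, which is why the gap between the two is the quantity the paper tries to close.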
&lt;p&gt;&lt;strong&gt;Reference: &lt;/strong&gt;Ghahabi, O., Hernando, J. Deep learning backend for single and multisession i-vector speaker recognition. &quot;IEEE-ACM Transactions on Audio Speech and Language Processing&quot;, 1 April 2017, vol. 25, no. 4, p. 807-817.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Link: &lt;/strong&gt;&lt;a href=&quot;http://ieeexplore.ieee.org/document/7847321/?reload=true&quot;&gt;http://ieeexplore.ieee.org/document/7847321/?reload=true&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;UPCommons:&lt;/strong&gt; &lt;a href=&quot;http://upcommons.upc.edu/handle/2117/104282&quot;&gt;http://upcommons.upc.edu/handle/2117/104282&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Best Paper Award RTTH 2017&lt;/strong&gt;&lt;/p&gt;
&lt;a href=&quot;/tags/biometrics&quot; typeof=&quot;skos:Concept&quot; property=&quot;rdfs:label skos:prefLabel&quot; datatype=&quot;&quot;&gt;Biometrics&lt;/a&gt;, &lt;a href=&quot;/tags/asr&quot; typeof=&quot;skos:Concept&quot; property=&quot;rdfs:label skos:prefLabel&quot; datatype=&quot;&quot;&gt;ASR&lt;/a&gt;, &lt;a href=&quot;/tags/speaker-recognition&quot; typeof=&quot;skos:Concept&quot; property=&quot;rdfs:label skos:prefLabel&quot; datatype=&quot;&quot;&gt;Speaker Recognition&lt;/a&gt;, &lt;a href=&quot;/tags/deep-learning&quot; typeof=&quot;skos:Concept&quot; property=&quot;rdfs:label skos:prefLabel&quot; datatype=&quot;&quot;&gt;Deep Learning&lt;/a&gt;</description>
     <pubDate>Fri, 01 Dec 2017 09:43:10 +0000</pubDate>
 <dc:creator>admin</dc:creator>
 <guid isPermaLink="false">530 at https://www.talp.upc.edu</guid>
  </item>
  <item>
    <title>Introduction to the special issue on cross-language algorithms and applications</title>
    <link>https://www.talp.upc.edu/content/introduction-special-issue-cross-language-algorithms-and-applications</link>
    <description>&lt;div class=&quot;field-item even&quot;&gt;&lt;img typeof=&quot;foaf:Image&quot; src=&quot;https://www.talp.upc.edu/sites/default/files/jair.pdf_.jpg&quot; width=&quot;232&quot; height=&quot;300&quot; alt=&quot;Introduction to the special issue on cross-language algorithms and applications&quot; title=&quot;Introduction to the special issue on cross-language algorithms and applications&quot; /&gt;&lt;/div&gt;&lt;p&gt;Authors (in signing order): Ruiz, M.; Bangalore, S.; Lambert, P.; Màrquez, L.; Montiel-Ponsoda, E.&lt;/p&gt;
&lt;p&gt;				Title: Introduction to the special issue on cross-language algorithms and applications&lt;/p&gt;
&lt;p&gt;				Journal (title, volume, start and end page): Journal of artificial intelligence research. Vol.55 pag.1-15&lt;/p&gt;
&lt;p&gt;				Year: 2016&lt;/p&gt;
&lt;p&gt;				Impact factor (SCI/SSCI/AHCI):  1.257 (JCR-Science)&lt;/p&gt;
&lt;p&gt;				Citations received: &lt;/p&gt;
&lt;p&gt;				Other quality indices (state database and impact factor):&lt;/p&gt;
&lt;p&gt; &lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract&lt;/strong&gt;: With the increasingly global nature of our everyday interactions, the need for multilingual technologies to support efficient and effective information access and communication cannot be overemphasized. Computational modeling of language has been the focus of Natural Language Processing, a subdiscipline of Artificial Intelligence. One of the current challenges for this discipline is to design methodologies and algorithms that are cross-language in order to create multilingual technologies rapidly. The goal of this JAIR special issue on Cross-Language Algorithms and Applications (CLAA) is to present leading research in this area, with emphasis on developing unifying themes that could lead to the development of the science of multi- and cross-lingualism. In this introduction, we provide the reader with the motivation for this special issue and summarize the contributions of the papers that have been included. The selected papers cover a broad range of cross-lingual technologies including machine translation, domain and language adaptation for sentiment analysis, cross-language lexical resources, dependency parsing, information retrieval and knowledge representation. We anticipate that this special issue will serve as an invaluable resource for researchers interested in topics of cross-lingual natural language processing.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Citation&lt;/strong&gt;: Ruiz, M., Bangalore, S., Lambert, P., Màrquez, L., Montiel-Ponsoda, E. Introduction to the special issue on cross-language algorithms and applications. &quot;Journal of artificial intelligence research&quot;, 1 December 2016, vol. 55, p. 1-15.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Link&lt;/strong&gt;: &lt;a href=&quot;http://upcommons.upc.edu/handle/2117/102165&quot;&gt;http://upcommons.upc.edu/handle/2117/102165&lt;/a&gt;&lt;/p&gt;
&lt;a href=&quot;/tags/mt&quot; typeof=&quot;skos:Concept&quot; property=&quot;rdfs:label skos:prefLabel&quot; datatype=&quot;&quot;&gt;MT&lt;/a&gt;, &lt;a href=&quot;/tags/deep-learning&quot; typeof=&quot;skos:Concept&quot; property=&quot;rdfs:label skos:prefLabel&quot; datatype=&quot;&quot;&gt;Deep Learning&lt;/a&gt;</description>
     <pubDate>Thu, 01 Dec 2016 08:49:46 +0000</pubDate>
 <dc:creator>magdalena.biesialska</dc:creator>
 <guid isPermaLink="false">537 at https://www.talp.upc.edu</guid>
  </item>
  <item>
    <title>Domain adaptation strategies in statistical machine translation: a brief overview</title>
    <link>https://www.talp.upc.edu/content/domain-adaptation-strategies-statistical-machine-translation-brief-overview</link>
    <description>&lt;div class=&quot;field-item even&quot;&gt;&lt;img typeof=&quot;foaf:Image&quot; src=&quot;https://www.talp.upc.edu/sites/default/files/domain_adaptation_strategies_in_statistical_machine_translation.jpg&quot; width=&quot;212&quot; height=&quot;300&quot; alt=&quot;Domain adaptation strategies in statistical machine translation&quot; title=&quot;Domain adaptation strategies in statistical machine translation&quot; /&gt;&lt;/div&gt;&lt;p&gt;Authors (in signing order): Ruiz, M.&lt;/p&gt;
&lt;p&gt;				Title: Domain adaptation strategies in statistical machine translation: a brief overview&lt;/p&gt;
&lt;p&gt;				Journal (title, volume, start and end page): Knowledge engineering review. Vol.30. Num.5. Pag 514-520&lt;/p&gt;
&lt;p&gt;				Year: 2015&lt;/p&gt;
&lt;p&gt;				Impact factor (SCI/SSCI/AHCI):  1.039 (JCR-Science)&lt;/p&gt;
&lt;p&gt;				Citations received: 1 (Scholar)&lt;/p&gt;
&lt;p&gt;				Other quality indices (state database and impact factor):&lt;/p&gt;
&lt;p&gt; &lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract&lt;/strong&gt;: Statistical machine translation (SMT) is gaining interest given that it can easily be adapted to any pair of languages. One of the main challenges in SMT is domain adaptation, because translation performance drops when testing conditions deviate from training conditions. Much research work is emerging to face this challenge, focused on trying to exploit all kinds of material when available. This paper provides an overview of the research that copes with the domain adaptation challenge in SMT.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Citation&lt;/strong&gt;: Ruiz, M. Domain adaptation strategies in statistical machine translation: a brief overview. &quot;Knowledge engineering review&quot;, 1 November 2015, vol. 30, no. 5, p. 514-520.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Link&lt;/strong&gt;: &lt;a href=&quot;http://upcommons.upc.edu/handle/2117/104733&quot;&gt;http://upcommons.upc.edu/handle/2117/104733&lt;/a&gt;&lt;/p&gt;
&lt;a href=&quot;/tags/mt&quot; typeof=&quot;skos:Concept&quot; property=&quot;rdfs:label skos:prefLabel&quot; datatype=&quot;&quot;&gt;MT&lt;/a&gt;, &lt;a href=&quot;/tags/deep-learning&quot; typeof=&quot;skos:Concept&quot; property=&quot;rdfs:label skos:prefLabel&quot; datatype=&quot;&quot;&gt;Deep Learning&lt;/a&gt;</description>
     <pubDate>Sat, 31 Oct 2015 23:00:00 +0000</pubDate>
 <dc:creator>magdalena.biesialska</dc:creator>
 <guid isPermaLink="false">538 at https://www.talp.upc.edu</guid>
  </item>
  <item>
    <title>How much hybridisation does machine translation need?</title>
    <link>https://www.talp.upc.edu/content/how-much-hybridisation-does-machine-translation-need</link>
    <description>&lt;div class=&quot;field-item even&quot;&gt;&lt;img typeof=&quot;foaf:Image&quot; src=&quot;https://www.talp.upc.edu/sites/default/files/hybridity-8pages.pdf_.jpg&quot; width=&quot;212&quot; height=&quot;300&quot; alt=&quot;How much hybridisation does machine translation need?&quot; title=&quot;How much hybridisation does machine translation need?&quot; /&gt;&lt;/div&gt;&lt;p&gt;Authors (in signing order): Ruiz, M.&lt;/p&gt;
&lt;p&gt;				Title: How much hybridisation does machine translation need?&lt;/p&gt;
&lt;p&gt;				Journal (title, volume, start and end page): Journal of the Association for Information Science and Technology. Vol.66. Num.10. Pag 2160-2165&lt;/p&gt;
&lt;p&gt;				Year: 2015&lt;/p&gt;
&lt;p&gt;				Impact factor (SCI/SSCI/AHCI):  1.864 (JCR-Social Sciences)&lt;/p&gt;
&lt;p&gt;				Citations received: 2 (Scopus) / 5 (Scholar)&lt;/p&gt;
&lt;p&gt;				Other quality indices (state database and impact factor):&lt;/p&gt;
&lt;p&gt; &lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract&lt;/strong&gt;: Rule-based and corpus-based Machine Translation (MT) have coexisted for more than 20 years. Recently, the boundaries between the two paradigms have narrowed and hybrid approaches are gaining interest from both the academic and the business point of view. However, since hybrid approaches involve the multidisciplinary interaction of linguists, computer scientists and engineers, a variety of questions arise. While statistical methods currently dominate research work in MT, most commercial MT systems are technically hybrid systems. The research community should more actively investigate the benefits and questions surrounding the hybridization of MT systems. This squib discusses different issues related to hybrid MT, including its origins, architectures, achievements and frustrations. Understanding hybridization in the wide sense, both rule-based and corpus-based MT systems have benefited from hybridization when correctly integrated. In addition, and to some extent, most current rule/corpus-based MT approaches are already hybrid, since they tend to include some statistics/rules at some stage.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Citation&lt;/strong&gt;: Ruiz, M. How much hybridisation does machine translation need? &quot;Journal of the Association for Information Science and Technology&quot;, 1 October 2015, vol. 66, no. 10, p. 2160-2165.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Link&lt;/strong&gt;: &lt;a href=&quot;http://upcommons.upc.edu/handle/2117/104737&quot;&gt;http://upcommons.upc.edu/handle/2117/104737&lt;/a&gt;&lt;/p&gt;
&lt;a href=&quot;/tags/mt&quot; typeof=&quot;skos:Concept&quot; property=&quot;rdfs:label skos:prefLabel&quot; datatype=&quot;&quot;&gt;MT&lt;/a&gt;, &lt;a href=&quot;/tags/deep-learning&quot; typeof=&quot;skos:Concept&quot; property=&quot;rdfs:label skos:prefLabel&quot; datatype=&quot;&quot;&gt;Deep Learning&lt;/a&gt;</description>
     <pubDate>Wed, 30 Sep 2015 22:00:00 +0000</pubDate>
 <dc:creator>magdalena.biesialska</dc:creator>
 <guid isPermaLink="false">539 at https://www.talp.upc.edu</guid>
  </item>
  <item>
    <title>Latest trends in hybrid machine translation and its applications</title>
    <link>https://www.talp.upc.edu/content/latest-trends-hybrid-machine-translation-and-its-applications</link>
    <description>&lt;div class=&quot;field-item even&quot;&gt;&lt;img typeof=&quot;foaf:Image&quot; src=&quot;https://www.talp.upc.edu/sites/default/files/hyrbid_machine_translation_trends.pdf_.jpg&quot; width=&quot;221&quot; height=&quot;300&quot; alt=&quot;hybrid MT&quot; title=&quot;hybrid MT&quot; /&gt;&lt;/div&gt;&lt;p&gt;Authors (in signing order): Ruiz, M.; Fonollosa, José A. R.&lt;/p&gt;
&lt;p&gt;Title: Latest trends in hybrid machine translation and its applications&lt;/p&gt;
&lt;p&gt;Journal (title, volume, start and end page): Computer speech and language. Vol. 32, no. 1, p. 3-10&lt;/p&gt;
&lt;p&gt;Year: 2015&lt;/p&gt;
&lt;p&gt;Impact factor (SCI/SSCI/AHCI): 1.324 (JCR-Science)&lt;/p&gt;
&lt;p&gt;Citations received: 4 (Scopus) / 23 (Scholar)&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract&lt;/strong&gt;: This survey on hybrid machine translation (MT) is motivated by the fact that hybridization techniques have become popular as they attempt to combine the best characteristics of highly advanced pure rule- or corpus-based MT approaches. Existing research typically covers either simple or more complex architectures guided by either rule- or corpus-based approaches. The goal is to combine the best properties of each type. This survey provides a detailed overview of the modification of the standard rule-based architecture to include statistical knowledge, the introduction of rules in corpus-based approaches, and the hybridization of approaches within this last single category. The principal aim here is to cover the leading research and progress in this field of MT and in several related applications.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Citation&lt;/strong&gt;: Fonollosa, José A. R.; Costa-jussà, Marta R. Latest trends in hybrid machine translation and its applications. &quot;Computer speech and language&quot;, July 2015, vol. 32, no. 1, p. 3-10.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Link&lt;/strong&gt;: &lt;a href=&quot;http://upcommons.upc.edu/handle/2117/27122&quot;&gt;http://upcommons.upc.edu/handle/2117/27122&lt;/a&gt;&lt;/p&gt;
&lt;a href=&quot;/tags/mt&quot; typeof=&quot;skos:Concept&quot; property=&quot;rdfs:label skos:prefLabel&quot; datatype=&quot;&quot;&gt;MT&lt;/a&gt;, &lt;a href=&quot;/tags/deep-learning&quot; typeof=&quot;skos:Concept&quot; property=&quot;rdfs:label skos:prefLabel&quot; datatype=&quot;&quot;&gt;Deep Learning&lt;/a&gt;</description>
     <pubDate>Wed, 04 Feb 2015 19:11:32 +0000</pubDate>
 <dc:creator>admin</dc:creator>
 <guid isPermaLink="false">533 at https://www.talp.upc.edu</guid>
  </item>
  <item>
    <title>Using annotations on Mechanical Turk to perform supervised polarity classification of Spanish customer comments</title>
    <link>https://www.talp.upc.edu/content/using-annotations-mechanical-turk-perform-supervised-polarity-classification-spanish</link>
    <description>&lt;p&gt;Authors (in signing order): Ruiz, M.; Grivolla, J.; Mellebeek, B.; Benavent, F.; Codina, J.; Banchs, R.&lt;/p&gt;
&lt;p&gt;Title: Using annotations on Mechanical Turk to perform supervised polarity classification of Spanish customer comments&lt;/p&gt;
&lt;p&gt;Journal (title, volume, start and end page): Information sciences. Vol. 275, p. 400-412&lt;/p&gt;
&lt;p&gt;Year: 2014&lt;/p&gt;
&lt;p&gt;Impact factor (SCI/SSCI/AHCI): 4.038 (JCR-Science)&lt;/p&gt;
&lt;p&gt;Citations received: 3 (Scholar)&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract&lt;/strong&gt;: One of the major bottlenecks in the development of data-driven AI systems is the cost of reliable human annotations. The recent advent of several crowdsourcing platforms, such as Amazon’s Mechanical Turk, which allow requesters access to affordable and rapid results from a global workforce, greatly facilitates the creation of massive training data. Most of the available studies on the effectiveness of crowdsourcing report on English data. We use Mechanical Turk annotations to train an Opinion Mining System to classify Spanish consumer comments. We design three different Human Intelligence Task (HIT) strategies and report high inter-annotator agreement between non-expert and expert annotators. We evaluate the advantages and drawbacks of each HIT design and show that, in our case, the use of non-expert annotations is a viable and cost-effective alternative to expert annotations.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Citation&lt;/strong&gt;: Costa-jussà, M. R., Grivolla, J., Mellebeek, B., Benavent, F., Codina, J., Banchs, R. Using annotations on Mechanical Turk to perform supervised polarity classification of Spanish customer comments. &quot;Information sciences&quot;, 1 December 2014, vol. 275, p. 400-412.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Link&lt;/strong&gt;: &lt;a href=&quot;http://upcommons.upc.edu/handle/2117/82570&quot;&gt;http://upcommons.upc.edu/handle/2117/82570&lt;/a&gt;&lt;/p&gt;
&lt;a href=&quot;/tags/mt&quot; typeof=&quot;skos:Concept&quot; property=&quot;rdfs:label skos:prefLabel&quot; datatype=&quot;&quot;&gt;MT&lt;/a&gt;, &lt;a href=&quot;/tags/deep-learning&quot; typeof=&quot;skos:Concept&quot; property=&quot;rdfs:label skos:prefLabel&quot; datatype=&quot;&quot;&gt;Deep Learning&lt;/a&gt;</description>
     <pubDate>Sun, 30 Nov 2014 23:00:00 +0000</pubDate>
 <dc:creator>magdalena.biesialska</dc:creator>
 <guid isPermaLink="false">540 at https://www.talp.upc.edu</guid>
  </item>
  <item>
    <title>META-NET Strategic Research Agenda for Multilingual Europe 2020</title>
    <link>https://www.talp.upc.edu/content/meta-net-strategic-research-agenda-multilingual-europe-2020</link>
    <description>&lt;div class=&quot;field-item even&quot;&gt;&lt;img typeof=&quot;foaf:Image&quot; src=&quot;https://www.talp.upc.edu/sites/default/files/metanet-header.jpg&quot; width=&quot;1024&quot; height=&quot;423&quot; /&gt;&lt;/div&gt;&lt;p&gt;&lt;a href=&quot;http://www.meta-net.eu/vision/reports/meta-net-sra-version_1.0.pdf&quot;&gt;&lt;img alt=&quot;null&quot; border=&quot;0&quot; class=&quot;image-inline image-inline&quot; src=&quot;http://www.meta-net.eu/sra/sra-cover-new.jpg&quot; style=&quot;box-shadow: #997878 5px 5px 2px;&quot; width=&quot;240&quot; /&gt;&lt;/a&gt; More than two years in the making, the final version of the META-NET &lt;strong&gt;&lt;a class=&quot;external-link&quot; href=&quot;http://www.meta-net.eu/vision/reports/meta-net-sra-version_1.0.pdf&quot;&gt;Strategic Research Agenda for Multilingual Europe 2020 (SRA)&lt;/a&gt;&lt;/strong&gt; was published on December 01, 2012. This document is the result of a discussion between hundreds of experts from research and industry. The main purpose of the SRA is to raise awareness for the field of Language Technology in Europe and attract the attention of and inform politicians and policy makers on the regional, national and international level in their decisions, especially with regard to the upcoming European funding opportunities &lt;a class=&quot;external-link&quot; href=&quot;http://ec.europa.eu/research/horizon2020/&quot;&gt;Horizon 2020&lt;/a&gt; and &lt;a class=&quot;external-link&quot; href=&quot;http://europa.eu/rapid/pressReleasesAction.do?reference=IP/11/1200&quot;&gt;Connecting Europe Facility (CEF)&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;http://www.meta-net.eu/vision/reports/meta-net-sra-version_1.0.pdf&quot;&gt;&lt;img alt=&quot;Priority Research Themes&quot; border=&quot;0&quot; class=&quot;image-inline image-inline&quot; src=&quot;http://www.meta-net.eu/sra/pt-rings-new.jpg&quot; width=&quot;300&quot; /&gt;&lt;/a&gt;&lt;/p&gt;
</description>
     <pubDate>Thu, 06 Dec 2012 09:58:13 +0000</pubDate>
 <dc:creator>Super User-2</dc:creator>
 <guid isPermaLink="false">357 at https://www.talp.upc.edu</guid>
  </item>
  <item>
    <title>META-NET Language White Papers</title>
    <link>https://www.talp.upc.edu/content/meta-net-language-white-papers</link>
    <description>&lt;div class=&quot;field-item even&quot;&gt;&lt;img typeof=&quot;foaf:Image&quot; src=&quot;https://www.talp.upc.edu/sites/default/files/metanet-header2.jpg&quot; width=&quot;660&quot; height=&quot;545&quot; /&gt;&lt;/div&gt;&lt;p&gt;&lt;a href=&quot;http://www.meta-net.eu/whitepapers/overview&quot;&gt;The META-NET Language White Paper series&lt;/a&gt; “Languages in the European Information Society” reports on the state of each European language with respect to Language Technology and explains the most urgent risks and chances.&lt;/p&gt;
&lt;p&gt;The series will cover all official European languages and several other languages spoken in geographical Europe. While there have been a number of valuable and comprehensive scientific studies on certain aspects of languages and technology, no generally understandable compendium exists that takes a stand by presenting the main findings and challenges for each language. The META-NET white paper series will fill this gap.&lt;/p&gt;
</description>
     <pubDate>Wed, 05 Dec 2012 09:58:13 +0000</pubDate>
 <dc:creator>Super User-2</dc:creator>
 <guid isPermaLink="false">356 at https://www.talp.upc.edu</guid>
  </item>
  </channel>
</rss>
