In recent years it has been pointed out that, in a number of applications involving classification, the final goal is not determining which class (or classes) individual unlabelled data items belong to, but determining the prevalence (or “relative frequency”) of each class in the unlabelled data. The latter task is known as quantification.
Assume a market research agency wants to find out how many among the tweets mentioning product X express a favourable view of it. On the surface, this seems a standard instance of query-biased tweet sentiment classification. However, the agency is likely not interested in whether a specific individual has a positive view of X, but in knowing how many among the people who tweet about X have a positive view of it; that is, the agency is actually interested in knowing the relative frequency of the positive class, i.e., it is interested not in classification but in quantification. Essentially, quantification is classification evaluated at the aggregate (rather than at the individual) level.
The research community has recently shown a growing interest in tackling quantification as a task in its own right. One of the reasons is that, since the goal of quantification is different from that of classification, quantification requires evaluation measures different from those used for classification. A second, related reason is that, as has been shown, using a method optimized for classification accuracy is suboptimal when quantification accuracy is the real goal. A third reason is the growing awareness that quantification is going to be more and more important; with the advent of big data, more and more application contexts are going to spring up in which we will simply be happy with analyzing data at the aggregate level rather than at the individual level.
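As a rough illustration of why counting the output of a classifier can misestimate prevalence, the following minimal sketch contrasts plain "classify and count" with a commonly cited correction, often called adjusted classify and count. It is not taken from the tutorial itself: the function names, the toy data, and the assumed true and false positive rates (0.8 and 0.1) are all hypothetical.

```python
def classify_and_count(predictions):
    """Estimate positive-class prevalence as the fraction of items predicted positive."""
    return sum(predictions) / len(predictions)


def adjusted_count(predictions, tpr, fpr):
    """Correct the classify-and-count estimate using the classifier's
    true positive rate (tpr) and false positive rate (fpr), which are
    assumed to have been estimated beforehand on held-out labelled data."""
    cc = classify_and_count(predictions)
    if tpr == fpr:
        return cc  # the correction is undefined; fall back to the raw estimate
    corrected = (cc - fpr) / (tpr - fpr)
    return min(1.0, max(0.0, corrected))  # clip to a valid prevalence


# Hypothetical toy data: 1000 tweets, of which 200 are truly positive (prevalence 0.2).
# The assumed classifier finds 160 of the 200 positives (tpr = 0.8)
# and wrongly flags 80 of the 800 negatives (fpr = 0.1).
predictions = [1] * 160 + [0] * 40 + [1] * 80 + [0] * 720

cc_estimate = classify_and_count(predictions)
acc_estimate = adjusted_count(predictions, tpr=0.8, fpr=0.1)

print(f"classify and count: {cc_estimate:.2f}")  # 0.24, overestimates the true 0.20
print(f"adjusted count:     {acc_estimate:.2f}")  # 0.20
```

In this toy setting the classifier is 88% accurate on individual tweets, yet its raw count overestimates the prevalence of the positive class; the correction recovers it only because the assumed error rates match the data exactly, which real data rarely guarantees.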
The goal of this tutorial is to introduce the audience to the problem of quantification, to the techniques that have been proposed for solving it, to the metrics used to evaluate them, and to the problems that are still open in the area. Applications in fields as diverse as machine learning, natural language processing, data mining, epidemiology, social sciences, and still others, will be discussed.
Fabrizio Sebastiani has been a Principal Scientist at QCRI-QF since June 2014; from 2006 until then he was a Senior Researcher at Istituto di Scienza e Tecnologie dell'Informazione, Consiglio Nazionale delle Ricerche, Italy, from which he is currently on leave; before 2006 he was an Associate Professor at the Department of Pure and Applied Mathematics of the University of Padova, Italy. His main current research interests are at the intersection of information retrieval, machine learning, and human language technologies, with particular emphasis on text classification, information extraction, opinion mining, and their applications. He is an Associate Editor for ACM Transactions on Information Systems (ACM Press) and AI Communications (IOS Press), and a member of the Editorial Boards of Information Retrieval (Kluwer) and Foundations and Trends in Information Retrieval (Now Publishers); of the latter he is also a past co-Editor-in-Chief. He is also a past member of the Editorial Boards of the Journal of the American Society for Information Science and Technology (Wiley), Information Processing and Management (Elsevier), and ACM Computing Reviews (ACM Press). He is the Editor for Europe, Middle East, and Africa of Springer's “Information Retrieval” book series. He was the General Chair of ECIR 2003 and SPIRE 2011, and a Program co-Chair of SIGIR 2008 and ECDL 2010; he is the appointed General co-Chair of SIGIR 2016. From 2003 to 2007 he was the Vice-Chair of ACM SIGIR. He has given several tutorials at international conferences (ECDL 1997, ECDL 1998, ER 1998, WWW 1999, ECDL 2000, COLING 2000, IJCAI 2001, ECDL 2001, ECIR 2014) and summer schools (among which ESSLLI 2003 and ESSIR 2005) on themes at the intersection of machine learning and information retrieval.
Home page: Fabrizio Sebastiani@ISTI-CNR
Artificial Intelligence planning and verification have a strong record of contributions in industrial applications, and the research advances in these areas are mainly motivated by the growing need for efficiency, reliability and robustness of new solutions.
AI Planning is moving towards ever more demanding applications that present temporal, spatial and continuous constraints, and require planning techniques capable of reasoning with suitably rich models. This motivates both increases in the modelling power to capture these constraints and extensions to classical approaches to planning to enable efficient management of such complex problems.
Many real-world scenarios involve hybrid systems, which are described by both continuous control variables and discrete logical modes. Example applications include autonomous underwater vehicles, smart power management, autonomous drones, and satellite observation planning. Such scenarios motivate the need to reason with mixed discrete-continuous domains from a verification as well as a planning perspective.
In this tutorial, I will provide an introduction to AI Planning, with a particular focus on modelling planning problems in hybrid domains. I will consider some working examples based on real-world scenarios, showing how even complex interactions can be easily modelled using the planning formalism. In the second part of the tutorial, I will present some tools for solving planning problems and discuss the challenges that arise when one wants to use plans in practice: plan validation, plan execution and replanning, and planning-control-robotics integration. The tutorial will end with a discussion on current challenges and open problems.
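To give a flavour of what a mixed discrete-continuous domain looks like, here is a minimal sketch of validating a plan by simulation. It is purely illustrative and does not use any planning formalism or tool covered in the tutorial: the vehicle, its modes, the battery rates, and the function `validate_plan` are all hypothetical assumptions.

```python
from dataclasses import dataclass

# Hypothetical rate of change of the battery level (units per second)
# in each discrete mode of an imaginary autonomous vehicle.
RATES = {"move": -2.0, "recharge": 5.0, "idle": 0.0}


@dataclass
class Step:
    mode: str        # discrete logical mode active during this step
    duration: float  # time spent in that mode, in seconds


def validate_plan(plan, battery=50.0, capacity=100.0):
    """Simulate the plan and return (is_valid, final_battery).

    The battery level is the continuous variable. Because it changes
    linearly within a mode, checking its bounds at the end of each step
    is enough to detect any violation during that step."""
    for step in plan:
        battery += RATES[step.mode] * step.duration
        if battery < 0.0 or battery > capacity:
            return False, battery
    return True, battery


# A plan that stays within the battery bounds: 50 -> 10 -> 60 -> 10.
plan = [Step("move", 20.0), Step("recharge", 10.0), Step("move", 25.0)]
print(validate_plan(plan))

# A plan that drains the battery below zero and is therefore rejected.
print(validate_plan([Step("move", 30.0)]))
```

Real hybrid domains typically involve non-linear dynamics, concurrent processes, and temporal constraints, which is where dedicated planning and validation tools become necessary.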
The target is the broad audience of AI*IA, as the tutorial covers multidisciplinary topics, relevant to planning, machine learning, robotics, verification etc., with a proper mix of theory and applications. No prior knowledge of planning is required to follow the tutorial.
Daniele Magazzeni is a lecturer in Artificial Intelligence in the Department of Informatics at King's College London.
Dr. Magazzeni completed a Ph.D. in Computer Science at the University of L'Aquila in 2009.
His research explores the links between artificial intelligence planning and model-checking, with a particular focus on temporal continuous planning, planning and robotics, and hybrid systems control and verification.
His recent contributions include the co-development of new techniques for planning in hybrid domains, a formal translation between planning and model checking formalisms, and a planning-based approach for policy-learning.
He is co-investigator on a number of funded projects involving applications of planning and verification in various real-world domains, including underwater robotics, electric power system management, and AUV mission control.
He is Co-Editor-in-Chief of the AI Communications journal. He serves on the organizing and programme committees of the AAAI, IJCAI, ICAPS, and ECAI conferences.
He will be conference chair of ICAPS 2016.
Home page: Daniele Magazzeni@King's College
Nowadays the textual information available online is provided in an increasingly wide range of languages. This language explosion clearly forces researchers to focus on the challenging problem of being able to analyze and understand text written in any language. At the core of this problem lies the lexical ambiguity of language, an issue which has to be addressed at two levels: the availability of multilingual knowledge resources and ontologies which connect concepts and named entities to their lexicalizations in dozens of languages and the exploitation of high performance algorithms for performing two key intertwined tasks: multilingual Word Sense Disambiguation (WSD) and Entity Linking (EL).
In this tutorial, I will showcase for the first time BabelNet 3.0 (http://babelnet.org), a new release of the largest multilingual encyclopedic dictionary and semantic network, which now covers around 260 languages and provides both lexicographic and encyclopedic knowledge for all the open-class parts of speech (nouns, verbs, adjectives and adverbs), thanks to the seamless integration of WordNet, Wikipedia, Wiktionary, OmegaWiki, Wikidata and the Open Multilingual WordNet. BabelNet 3.0 features a full-fledged taxonomy both for concepts (car is-a motor vehicle) and named entities (Donald Duck is-a character), thanks to the integration of the Wikipedia Bitaxonomy (http://wibitaxonomy.org).
In the second part of the tutorial, I will present Babelfy (http://babelfy.org), a unified approach that leverages BabelNet to perform word sense disambiguation and entity linking in arbitrary languages, with performance on both tasks on a par with, or surpassing, that of task-specific state-of-the-art supervised and knowledge-based systems. I will conclude with ideas and future directions on this beautiful multilingual world.
Both parts of the tutorial will feature hands-on sessions for Java developers and SPARQL fans. The tutorial does not require any specific knowledge of Natural Language Processing and targets anyone interested in multilinguality, lexical knowledge representation, language ambiguity and disambiguation.
Roberto Navigli is an Associate Professor in the Department of Computer Science of the Sapienza University of Rome. He was awarded the Marco Cadoli 2007 AI*IA Prize for the best doctoral thesis in Artificial Intelligence and the Marco Somalvico 2013 AI*IA Prize for the best young researcher in AI. He is the recipient of an ERC Starting Grant in computer science and informatics on multilingual word sense disambiguation (2011-2016) and a co-PI of a Google Focused Research Award on Natural Language Understanding. His research lies in the field of Natural Language Processing (including word sense disambiguation and induction, ontology learning from scratch, large-scale knowledge acquisition, open information extraction and relation extraction). He has served as an area chair of ACL, WWW, and *SEM, and a senior program committee member of IJCAI. Currently he is an Associate Editor of the Artificial Intelligence Journal, a member of the editorial board of the Journal of Natural Language Engineering, a guest editor of the Journal of Web Semantics, and a former editorial board member of Computational Linguistics.
Home page: Roberto Navigli@Sapienza