Similarity measures over refinement graphs

Bibliographic Details
Authors: Ontañón, Santiago; Plaza, Enric
Format: article
Status: Accepted version for publication
Publication Date: 2012
Country: Spain
Institution: Consejo Superior de Investigaciones Científicas (CSIC)
Repository: DIGITAL.CSIC. Repositorio Institucional del CSIC
OAI Identifier: oai:digital.csic.es:10261/138171
Online Access: http://hdl.handle.net/10261/138171
Access Level: Open access
Keywords: Lazy learning; Refinement graphs; Similarity measures; Feature terms; Case-based reasoning
Description
Summary: Similarity assessment plays a key role in lazy learning methods such as k-nearest neighbor or case-based reasoning, and similarity also plays a crucial role in support vector machines. In this paper we show how refinement graphs, which were originally introduced for inductive learning, can be employed to assess and reason about similarity. We define and analyze two similarity measures, Sλ and Sπ, based on refinement graphs. The anti-unification-based similarity, Sλ, assesses similarity by finding the anti-unification of two instances, which is a description capturing all the information common to these two instances. The property-based similarity, Sπ, is based on a process of disintegrating the instances into sets of properties, and then analyzing these property sets. Moreover, these similarity measures are applicable to any representation language for which a refinement graph that satisfies the requirements we identify can be defined. Specifically, we present a refinement graph for feature terms, in which several languages of increasing expressiveness can be defined. The similarity measures are empirically evaluated on relational data sets belonging to languages of different expressiveness. © 2011 The Author(s).
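
Illustrative sketch (not taken from the paper): the summary gives no formulas, so the following is only a toy reading of the two measures. It assumes a deliberately simple representation language in which an instance is a set of atomic properties, one refinement step adds a single property, and the anti-unification of two instances is therefore their intersection. The function names, the example instances, and the normalization (shared information divided by shared plus unshared information) are all assumptions made for illustration.

```python
# Toy language (an assumption): an instance is a frozenset of atomic
# properties, and each refinement step adds exactly one property.

def anti_unification(a, b):
    """Most specific description that is more general than both a and b.
    In this toy language that is simply the set of shared properties."""
    return a & b

def s_lambda(a, b):
    """Anti-unification-based similarity, sketched by counting refinement
    steps: steps needed to reach the anti-unification from the empty
    description, versus steps needed to go from it to each instance."""
    au = anti_unification(a, b)
    shared = len(au)           # steps from the empty description to au
    only_a = len(a - au)       # steps from au to a
    only_b = len(b - au)       # steps from au to b
    total = shared + only_a + only_b
    return shared / total if total else 1.0

def s_pi(a, b):
    """Property-based similarity, sketched as the fraction of properties
    of either instance that both instances satisfy."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

# Hypothetical instances.
car = frozenset({"vehicle", "wheels=4", "engine"})
bike = frozenset({"vehicle", "wheels=2"})

print(s_lambda(car, bike))
print(s_pi(car, bike))
```

In this flat toy language the two sketches return the same value, because every refinement step corresponds to exactly one property. The languages discussed in the summary, such as feature terms, are richer, so the two measures need not coincide there.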