Similarity of samples and trimming
We say that two probabilities are similar at level α if they are contaminated versions (up to an α fraction) of the same common probability. We show how this model is related to minimal distances between sets of trimmed probabilities. Empirical versions turn out to present an overfitting effect in t...
| Authors: | Álvarez Esteban, Pedro César; Barrio Tellado, Eustasio del; Cuesta Albertos, Juan Antonio; Matran Bea, Carlos |
|---|---|
| Resource type: | article |
| Publication date: | 2012 |
| Country: | Spain |
| Institution: | Universidad de Cantabria (UC) |
| Repository: | UCrea Repositorio Abierto de la Universidad de Cantabria |
| Language: | English |
| OAI Identifier: | oai:repositorio.unican.es:10902/29685 |
| Online access: | https://hdl.handle.net/10902/29685 |
| Access Level: | open access |
| Keywords: | Asymptotics; Bootstrap; Consistency; Mass transportation problem; Over-fitting; Robustness; Similarity of distributions; Trimmed probability; Wasserstein distance |
| Acknowledgements: | Research partially supported by the Spanish Ministerio de Ciencia e Innovación, Grant MTM2008-06067-C02-01, and 02 and by the Consejería de Educación y Cultura de la Junta de Castilla y León, GR150. The authors would like to thank two anonymous referees for their careful reading of the manuscript, their suggestions and the pointers to relevant references that helped us to greatly improve our original version. |
| Authors (with ORCID): | Álvarez Esteban, Pedro César (ORCID 0000-0002-8818-0194); Barrio Tellado, Eustasio del; Cuesta Albertos, Juan Antonio (ORCID 0000-0001-8228-5924); Matran Bea, Carlos |
| Contributor: | Universidad de Cantabria |
| Abstract: | We say that two probabilities are similar at level α if they are contaminated versions (up to an α fraction) of the same common probability. We show how this model is related to minimal distances between sets of trimmed probabilities. Empirical versions turn out to present an overfitting effect in the sense that trimming beyond the similarity level results in trimmed samples that are closer than expected to each other. We show how this can be combined with a bootstrap approach to assess similarity from two data samples. |
| Date: | 2012-05-01 |
| Type: | journal article (http://purl.org/coar/resource_type/c_6501); version NA (http://purl.org/coar/version/c_be7fb7dd8ff6fe43); info:eu-repo/semantics/article |
| Rights: | open access (http://purl.org/coar/access_right/c_abf2); info:eu-repo/semantics/openAccess |
| Publisher: | International Statistical Institute; Chapman and Hall |
| Source: | Bernoulli, 2012, 18(2), 606-634 |