Harald Hammarström

ORCID: 0000-0003-0120-6396
Research Areas
  • Natural Language Processing Techniques
  • Language and cultural evolution
  • Linguistic Variation and Morphology
  • Linguistics and Cultural Studies
  • Lexicography and Language Studies
  • Syntax, Semantics, Linguistic Variation
  • Historical Linguistics and Language Studies
  • Authorship Attribution and Profiling
  • Multilingual Education and Policy
  • Linguistics, Language Diversity, and Identity
  • Topic Modeling
  • Pacific and Southeast Asian Studies
  • Spanish Linguistics and Language Studies
  • Physics and Engineering Research Articles
  • Linguistic Studies and Language Acquisition
  • Language, Linguistics, Cultural Analysis
  • Linguistics and language evolution
  • Language, Discourse, Communication Strategies
  • Algorithms and Data Compression
  • Phonetics and Phonology Research
  • Image Retrieval and Classification Techniques
  • Radiomics and Machine Learning in Medical Imaging
  • African history and culture analysis
  • Speech and dialogue systems
  • Australian Indigenous Culture and History

Uppsala University
2017-2024

Max Planck Institute for Psycholinguistics
2013-2023

University of Zurich
2022

Max Planck Society
2011-2020

Humboldt-Universität zu Berlin
2020

Australian National University
2018-2020

Max Planck Institute for the Science of Human History
2015-2018

Université Claude Bernard Lyon 1
2018

Centre National de la Recherche Scientifique
2018

Max Planck Institute for Evolutionary Anthropology
2010-2014

It is widely assumed that one of the fundamental properties of spoken language is the arbitrary relation between sound and meaning. Some exceptions in the form of nonarbitrary associations have been documented in linguistics, cognitive science, and anthropology, but these studies only involved small subsets of the 6,000+ languages spoken in the world today. By analyzing word lists covering nearly two-thirds of the world's languages, we demonstrate that a considerable proportion of 100 basic vocabulary items carry strong associations with specific kinds of human...

10.1073/pnas.1605782113 article EN Proceedings of the National Academy of Sciences 2016-09-12
Hedvig Skirgård, Hannah J. Haynie, Damián E. Blasí, Harald Hammarström, Jeremy Collins, Jay J. Latarche, Jakob Lesage, Tobias Weber, Alena Witzlack-Makarevich, Sam Passmore, Angela M. Chira, Luke Maurits, Russell Dinnage, Michael Dunn, Ger P. Reesink, Ruth Singer, Claire Bowern, Patience Epps, Jane H. Hill, Outi Vesakoski, Martine Robbeets, Noor Karolin Abbas, Daniel Auer, Nancy A. Bakker, Giulia Barbos, Robert Borges, Swintha Danielsen, Luise Dorenbusch, Ella Dorn, John P. Elliott, Giada Falcone, Jana Fischer, Yustinus Ghanggo Ate, Hannah Gibson, Hans-Philipp Göbel, Jemima A. Goodall, Victoria Gruner, Andrew Harvey, Rebekah Hayes, Leonard Heer, Roberto E. Herrera Miranda, Nataliia Hübler, Biu Huntington-Rainey, Jessica K. Ivani, Marilen Johns, Erika Just, Eri Kashima, Carolina Kipf, Janina V. Klingenberg, Nikita König, Aikaterina Koti, Richard Kowalik, Olga Krasnoukhova, Nora L. M. Lindvall, Mandy Lorenzen, Hannah Lutzenberger, Tânia R. A. Martins, Celia Mata German, Suzanne Van Der Meer, Jaime Montoya Samamé, Michael Müller, Saliha Muradoğlu, Kelsey Neely, Johanna Nickel, Miina Norvik, Cheryl Akinyi Oluoch, Jesse Peacock, India O.C. Pearey, Naomi Peck, Stéphanie Petit, Sören Pieper, Mariana Poblete, Daniel Prestipino, Linda Raabe, Amna Raja, Janis Reimringer, Sydney C. Rey, Julia Rizaew, Eloisa Ruppert, Kim K. Salmon, Jill Sammet, Rhiannon Schembri, Lars Schlabbach, Frederick W. P. Schmidt, Amalia Skilton, Wikaliler Daniel Smith, Hilário de Sousa, Kristin Sverredal, Daniel Valle, Javier Vera, Judith Voß, Tim Witte, Henry Wu, Stephanie Yam, Jingting Ye, Maisie Yong, Tessa Yuditha, Roberto Zariquiey, Robert Forkel, Nicholas Evans

While global patterns of human genetic diversity are increasingly well characterized, the diversity of human languages remains less systematically described. Here, we outline the Grambank database. With over 400,000 data points and 2400 languages, Grambank is the largest comparative grammatical database available. The comprehensiveness of Grambank allows us to quantify the relative effects of genealogical inheritance and geographic proximity on the structural diversity of the world's languages, evaluate constraints on linguistic diversity, and identify the world's most unusual languages. An analysis...

10.1126/sciadv.adg6175 article EN cc-by-nc Science Advances 2023-04-19

This paper describes a computerized alternative to glottochronology for estimating elapsed time since parent languages diverged into daughter languages. The method, developed by the Automated Similarity Judgment Program (ASJP) consortium, is different from glottochronology in four major respects: (1) it is automated and thus more objective, (2) it applies a uniform analytical approach to a single database of worldwide languages, (3) it is based on lexical similarity as determined from Levenshtein (edit) distances rather than...

10.1086/662127 article EN Current Anthropology 2011-11-30
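
The lexical-similarity step named in the abstract above rests on Levenshtein (edit) distance. A minimal Python sketch of that distance follows. The function names `levenshtein` and `normalized_distance`, and the normalization by the longer word's length, are illustrative assumptions and not the ASJP consortium's published procedure.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn string a into string b."""
    # previous[j] holds the distance between the first i-1 characters
    # of a and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def normalized_distance(a: str, b: str) -> float:
    """Edit distance scaled to [0, 1] by the longer word's length.
    This scaling is an assumption made for illustration only."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0
```

Scaling by word length keeps long and short word pairs comparable, which matters when distances are averaged over a whole word list.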

This article surveys work on Unsupervised Learning of Morphology. We define Unsupervised Learning of Morphology as the problem of inducing a description (of some kind, even if only morpheme-segmentation) of how orthographic words are built up given only raw text data of a language. We briefly go through the history and motivation of this problem. Next, over 200 items of work are listed with a brief characterization, and the most important ideas in the field are critically discussed. We summarize the achievements so far and give pointers for future developments.

10.1162/coli_a_00050 article EN cc-by-nc-nd Computational Linguistics 2011-04-05

What would your ideas about language evolution be if there was only one language left on earth? Fortunately, our investigation need not be that impoverished. In the present article, we survey the state of knowledge regarding the kinds of language found among humans, the language inventory, population sizes, time depth, grammatical variation, and other relevant issues that a theory of language evolution should minimally take into account.

10.1093/jole/lzw002 article EN Journal of Language Evolution 2016-01-01

The amount of available digital data for the languages of the world is constantly increasing. Unfortunately, most of the digital data are provided in a large variety of formats and therefore not amenable for comparison and re-use. The Cross-Linguistic Data Formats initiative proposes new standards for two basic types of data in historical and typological language comparison (word lists, structural datasets) and a framework to incorporate more data types (e.g. parallel texts, and dictionaries). The new specification for cross-linguistic data formats comes along with a software package for validation...

10.1038/sdata.2018.205 article EN cc-by Scientific Data 2018-10-16

This discussion note reviews responses of the linguistics profession to the grave issues of language endangerment identified a quarter of a century ago in the journal Language by Krauss, Hale, England, Craig, and others (Hale et al. 1992). Two and a half decades of worldwide research not only have given us a much more accurate picture of the number, phylogeny, and typological variety of the world's languages, but they have also seen the development of a wide range of new approaches, conceptual and technological, to the problem of documenting them. We review these...

10.1353/lan.2018.0070 article EN Language 2018-01-01

This paper presents a precise definition of numeral classifiers, steps to identify a numeral classifier language, and a database of 3,338 languages, of which 723 languages have been identified as having a numeral classifier system. The database, named World Atlas of Classifier Languages (WACL), has been systematically constructed over the last 10 years via a manual survey of relevant literature and also an automatic scan of digitized grammars followed by manual checking. The open-access release of WACL is thus a significant contribution to linguistic research...

10.1515/lingvan-2022-0006 article EN cc-by Linguistics Vanguard 2022-11-01

In this paper, we seek to draw attention to Malayo-Polynesian languages outside of the Oceanic subgroup with innovative bases and complex numerals involving various additive, subtractive, and multiplicative procedures. We highlight the fact that the number of languages showing such innovations is more than previously recognized in the literature. Finally, we observe that the concentration of numeral innovations in the region of eastern Indonesia suggests Papuan influence, either through contact or substrate. However, we also note that sociocultural factors, in the form...

10.1353/ol.2013.0023 article EN Oceanic Linguistics 2013-01-01

Human history is written in both our genes and our languages. The extent to which our biological and linguistic histories are congruent has been the subject of considerable debate, with clear examples of both matches and mismatches. To disentangle the patterns of demographic and cultural transmission, we need a global systematic assessment of matches and mismatches. Here, we assemble a genomic database (GeLaTo, or Genes and Languages Together) specifically curated to investigate genetic and linguistic diversity worldwide. We find that most populations in GeLaTo that speak languages of the same...

10.1073/pnas.2122084119 article EN cc-by Proceedings of the National Academy of Sciences 2022-11-18

One attempt at explaining why some language families are large (while others are small) is the hypothesis that the families that are now large became large because their ancestral speakers had a technological advantage, most often agriculture. Variants of this idea are referred to as the Language Farming Dispersal Hypothesis. Previously, detailed language family studies have uncovered various supporting examples and counterexamples to this idea. In the present paper I weigh the evidence from ALL attested language families. For each family, I use the number of member languages...

10.1075/dia.27.2.02ham article EN Diachronica 2010-10-11

Problems with, and alternatives to, the tree model in historical linguistics

10.1075/jhl.00005.kal article EN Journal of Historical Linguistics 2019-07-02

Glottocodes constitute the backbone identification system for the language, dialect and family inventory Glottolog (https://glottolog.org). In this paper, we summarize the motivation and history behind the system of glottocodes and describe the principles and practices of data curation, technical infrastructure and update/version-tracking systematics. Since our understanding of the target domain – the dialects, languages and language families of the entire world – is continually evolving, changes and updates are relatively common. The resulting data is assessed in...

10.3233/sw-212843 article EN other-oa Semantic Web 2022-01-14

While the notion of the ‘area’ or ‘Sprachbund’ has a long history in linguistics, with geographically-defined regions frequently cited as a useful means to explain typological distributions, the problem of delimiting areas has not been well addressed. Lists of general-purpose, largely independent ‘macro-areas’ (typically continent size) have been proposed as a step to rule out contact as an explanation for various large-scale linguistic phenomena. This squib points to some problems in some of the currently widely-used predetermined areas, those...

10.1163/22105832-00401001 article EN Language Dynamics and Change 2014-01-01

This paper shows how it is possible to count languages vs. dialects if, for every pair of varieties, we are given whether they are mutually intelligible or not. The method is to divide the varieties into a minimum number of internally mutually intelligible groups, where each group counts as one language. Expressed in terms of graphs (as in discrete mathematics), the method is even easier understood as: applying graph-colouring to the graph over varieties with intelligibility interrelationships as edges. Graph colouring is already a mathematically well-understood...

10.1080/09296170701794278 article EN Journal of Quantitative Linguistics 2008-01-17
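
The grouping idea in the abstract above can be sketched in Python as an exact graph colouring. This is a minimal illustration under stated assumptions, not the paper's own implementation. The function name `count_languages` and its input format are hypothetical. The sketch colours the graph whose edges join varieties that are NOT mutually intelligible, because only then does each colour class form an internally mutually intelligible group. The brute-force search is suitable only for small inputs.

```python
from itertools import combinations


def count_languages(varieties, intelligible_pairs):
    """Split varieties into the fewest groups in which every pair is
    mutually intelligible. Returns (number_of_groups, groups)."""
    intelligible = {frozenset(pair) for pair in intelligible_pairs}

    # Conflict graph: an edge joins two varieties that are NOT mutually
    # intelligible, so they may never share a group (colour).
    conflicts = {v: set() for v in varieties}
    for a, b in combinations(varieties, 2):
        if frozenset((a, b)) not in intelligible:
            conflicts[a].add(b)
            conflicts[b].add(a)

    def colour(k, assignment, remaining):
        """Backtracking search for a proper colouring with k colours."""
        if not remaining:
            return assignment
        v = remaining[0]
        for c in range(k):
            if all(assignment.get(u) != c for u in conflicts[v]):
                result = colour(k, {**assignment, v: c}, remaining[1:])
                if result is not None:
                    return result
        return None

    order = list(varieties)
    # Try 1, 2, 3, ... colours; the first success is the minimum.
    for k in range(1, len(order) + 1):
        result = colour(k, {}, order)
        if result is not None:
            groups = {}
            for v, c in result.items():
                groups.setdefault(c, []).append(v)
            return k, list(groups.values())
    return 0, []  # no varieties given
```

For a hypothetical dialect chain in which A and B, and B and C, are mutually intelligible but A and C are not, `count_languages(["A", "B", "C"], [("A", "B"), ("B", "C")])` yields a count of 2, since A and C cannot share a group.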