- Text Readability and Simplification
- Neurobiology of Language and Bilingualism
- Natural Language Processing Techniques
- Reading and Literacy Development
- Second Language Acquisition and Learning
- Speech and dialogue systems
- Categorization, perception, and language
- Topic Modeling
- Lexicography and Language Studies
- Digital Communication and Language
- Child and Animal Learning Development
- Language and cultural evolution
- Authorship Attribution and Profiling
- Cognitive Abilities and Testing
- Technology Adoption and User Behaviour
- Language Development and Disorders
- Aging and Gerontology Research
- Open Source Software Innovations
- Advanced Text Analysis Techniques
- Linguistics, Language Diversity, and Identity
- Knowledge Management and Sharing
- Migration, Policy, and Dickens Studies
- Language and Culture
- Computational and Text Analysis Methods
- Multisensory perception and integration
McMaster University
2021
Brock University
2021
Tilburg University
2021
Ghent University Hospital
2013-2021
Ghent University
2014-2020
Institute of Psychology
2018-2020
Jagiellonian University
2018-2020
We present word frequencies based on subtitles of British television programmes. We show that the SUBTLEX-UK frequencies explain more of the variance in the lexical decision times of the British Lexicon Project than the British National Corpus and SUBTLEX-US frequencies. In addition to the word form frequencies, we also present measures of contextual diversity, part-of-speech-specific frequencies, frequencies in children's programmes, and bigram frequencies, giving researchers of British English access to the full range of norms recently made available for other languages. Finally, we introduce a new measure of word frequency, the Zipf scale,...
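The Zipf scale mentioned above expresses a word's frequency as the base-10 logarithm of its occurrences per billion words, which places most words on an intuitive scale from about 1 (very rare) to about 7 (the most frequent function words). A minimal sketch, assuming raw counts and a known corpus size (the function name and example numbers are illustrative, not taken from the SUBTLEX-UK release):

```python
from math import log10

def zipf(count: int, corpus_size: int) -> float:
    """Zipf value = log10 of the word's frequency per billion words.

    Equivalent to log10(frequency per million) + 3, so a word at
    1 per million scores 3, and 100 per million scores 5.
    """
    return log10(count / corpus_size * 1e9)

# A word occurring 1,000 times in a 100-million-word corpus has a
# frequency of 10 per million, i.e. a Zipf value of 4.0.
print(zipf(1_000, 100_000_000))  # 4.0
```

Published versions of the scale sometimes add smoothing for words with zero or very low counts; the bare formula above is the core idea.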
Based on an analysis of the literature and a large-scale crowdsourcing experiment, we estimate that an average 20-year-old native speaker of American English knows 42,000 lemmas and 4,200 non-transparent multiword expressions, derived from 11,100 word families. The numbers range from 27,000 lemmas for the lowest 5% to 52,000 for the highest 5%. Between the ages of 20 and 60, a person learns 6,000 extra lemmas, or about one new lemma every 2 days. The knowledge of the words can be as shallow as knowing that they exist. In addition, people learn tens of thousands of inflected...
The word frequency effect refers to the observation that high-frequency words are processed more efficiently than low-frequency words. Although the effect was first described over 80 years ago, only in recent years has it been investigated in detail. It has become clear that considerable quality differences exist between frequency estimates and that we need a new standardized measure that does not mislead users. Research also points to a consistent individual difference in the effect, meaning that it will be present at different frequency ranges for people with different degrees of language...
We use the results of a large online experiment on word knowledge in Dutch to investigate the variables influencing vocabulary size in the population and to examine the effect of word prevalence (the percentage of a population knowing a word) as a measure of word occurrence. Nearly 300,000 participants were presented with about 70 word stimuli (selected from a list of 53,000 words) in an adapted lexical decision task. We identify age, education, and multilingualism as the most important factors influencing vocabulary size. The results suggest that the accumulation of vocabulary throughout life and across multiple languages...
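Word prevalence starts from the raw proportion of respondents who report knowing a word; in published norms this proportion is typically probit-transformed so that words known by nearly everyone are still spread apart rather than compressed near 100%. A minimal sketch under that assumption (the function name and example counts are illustrative):

```python
from statistics import NormalDist

def prevalence(n_known: int, n_tested: int) -> float:
    """Probit-transformed proportion of respondents who know the word.

    The raw percentage is bounded at 0-100 and saturates near the
    extremes; the inverse normal CDF spreads it onto an open scale.
    """
    p = n_known / n_tested
    return NormalDist().inv_cdf(p)

# A word known by half the sample sits at 0; one known by 97.5%
# of 1,000 respondents scores about 1.96.
print(prevalence(500, 1000))            # 0.0
print(round(prevalence(975, 1000), 2))  # 1.96
```

In practice a correction is needed for words known by all or none of the respondents, since the probit of 0 or 1 is undefined.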
Keuleers, Stevens, Mandera, and Brysbaert (2015) presented a new variable, word prevalence, defined as word knowledge in the population. Some words are known to more people than others. This is particularly true for low-frequency words (e.g., screenshot vs. scourage). In the present study, we examined the impact of the measure by collecting lexical decision times for 30,000 Dutch lemmas of various lengths (the Dutch Lexicon Project 2). Word prevalence had the second highest correlation with lexical decision times (after word frequency): Words known by everyone...
Subjective ratings for age of acquisition, concreteness, affective valence, and many other variables are an important element of psycholinguistic research. However, even for well-studied languages, such ratings usually cover just a small part of the vocabulary. A possible solution involves using corpora to build a semantic similarity space and applying machine learning techniques to extrapolate existing ratings to previously unrated words. We conduct a systematic comparison of two extrapolation techniques: k-nearest neighbours and random...
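The k-nearest-neighbours approach mentioned above predicts a rating for an unrated word from the ratings of the words closest to it in the semantic similarity space. A minimal sketch with toy 2-dimensional vectors and made-up concreteness-style ratings (everything here is illustrative; real spaces have hundreds of dimensions, and the papers weigh various choices of k and similarity metric):

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

def knn_extrapolate(target_vec, rated, k=3):
    """Predict a rating for an unrated word as the mean rating of its
    k most cosine-similar rated neighbours."""
    neighbours = sorted(rated.items(),
                        key=lambda item: cosine(target_vec, item[1][0]),
                        reverse=True)[:k]
    return sum(rating for _, (_vec, rating) in neighbours) / k

# Toy semantic space: two concrete words, two abstract ones.
rated = {
    "dog":   ([0.9, 0.1], 2.1),
    "cat":   ([0.8, 0.2], 2.3),
    "idea":  ([0.1, 0.9], 6.0),
    "dream": ([0.2, 0.8], 5.5),
}
# A vector near "dog" and "cat" inherits their low (concrete) ratings.
print(knn_extrapolate([0.85, 0.15], rated, k=2))  # mean of 2.1 and 2.3
```

The random-forest alternative compared in the study instead treats the vector dimensions as features and learns the mapping from vectors to ratings directly.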
To have more information about the English words known by second language (L2) speakers, we ran a large-scale crowdsourcing vocabulary test, which yielded 17 million useful responses. It provided us with a list of 445 words known to nearly all participants. The list was compared with various existing lists advised for inclusion in the first stages of L2 teaching. The data also yielded a ranking of 61,000 terms by degree and speed of word recognition, which correlated r = .85 with a similar ranking based on native speakers. The L2 speakers in our study were relatively better at...
We present a new dataset of English word recognition times for a total of 62 thousand words, called the English Crowdsourcing Project. The data were collected via an internet vocabulary test in which more than one million people participated. The dataset is limited to native speakers. Participants were asked to indicate which words they knew. Their response times were registered, although at no point were participants asked to respond as fast as possible. Still, the times correlate around .75 with those of the English Lexicon Project for the shared words. Also, the results of virtual experiments show that...
We present a new database of Dutch word recognition times for a total of 54 thousand words, called the Dutch Crowdsourcing Project. The data were collected with an internet vocabulary test. The database is limited to native speakers. Participants were asked to indicate which words they knew. Their response times were registered, even though participants were not asked to respond as fast as possible. Still, the times correlate around .7 with those of the Dutch Lexicon Projects for the shared words. Also, the results of virtual experiments show that the new times are a valid addition to the Lexicon Projects. This not only means we have...
This study presents a Polish semantic priming dataset and similarity ratings for word pairs obtained with native speakers, as well as a range of semantic spaces. The stimuli include strongly related, weakly related, and semantically unrelated pairs. A rating study (Experiment 1) confirmed that the three conditions differed in relatedness. A lexical decision task with a carefully matched subset of the stimuli (Experiment 2) revealed strong priming effects for strongly related pairs, whereas weakly related pairs showed a smaller but still significant effect relative to unrelated pairs. The datasets from both experiments and those...