Tanja Säily

ORCID: 0000-0003-4407-8929
Research Areas
  • Natural Language Processing Techniques
  • Linguistic Variation and Morphology
  • Lexicography and Language Studies
  • Linguistics and language evolution
  • Authorship Attribution and Profiling
  • Gender Studies in Language
  • Syntax, Semantics, Linguistic Variation
  • Linguistics, Language Diversity, and Identity
  • Topic Modeling
  • Second Language Acquisition and Learning
  • Language, Discourse, Communication Strategies
  • Digital Humanities and Scholarship
  • Data Visualization and Analytics
  • Phonetics and Phonology Research
  • Multilingual Education and Policy
  • Mathematics, Computing, and Information Processing
  • EFL/ESL Teaching and Learning
  • Speech and dialogue systems
  • Spanish Linguistics and Language Studies
  • Speech Recognition and Synthesis
  • Linguistic research and analysis
  • Language and cultural evolution
  • Islamic Finance and Banking Studies
  • Digital Communication and Language
  • Organizational Management and Leadership

University of Helsinki
2015-2024

Linnaeus University
2021

Clinical Research Center Kiel
2020

Finding out whether a word occurs significantly more often in one text or corpus than in another is an important question in analysing corpora. As noted by Kilgarriff (Language is never, ever, ever, random, Corpus Linguistics and Linguistic Theory, 2005; 1(2): 263–76), the use of the χ² and log-likelihood ratio tests is problematic in this context, as they are based on the assumption that all samples are statistically independent of each other. However, words within a text are not independent. As pointed out in Kilgarriff (Comparing corpora, International...

10.1093/llc/fqu064 article EN Digital Scholarship in the Humanities 2014-12-08
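
As a hedged illustration of the kind of test the abstract above describes as problematic, the following Python sketch computes a log-likelihood ratio (G²) for one word from raw counts in two corpora. The function name and the counts in the usage note are assumptions made for illustration and are not taken from the article. The computation pools all tokens of each corpus, so it treats every token as an independent sample, which is the assumption the abstract questions.

```python
import math


def log_likelihood_ratio(count_a, total_a, count_b, total_b):
    """G2 statistic for one word in two corpora (2x2 contingency table).

    count_a, count_b: occurrences of the word in corpus A and corpus B.
    total_a, total_b: total number of tokens in corpus A and corpus B.
    Illustrative sketch only: every token is treated as an independent sample.
    """
    observed = [
        [count_a, total_a - count_a],
        [count_b, total_b - count_b],
    ]
    row_sums = [total_a, total_b]
    col_sums = [count_a + count_b, (total_a - count_a) + (total_b - count_b)]
    grand_total = total_a + total_b

    g2 = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            expected = row_sums[i] * col_sums[j] / grand_total
            if o > 0:  # a zero cell contributes nothing to the sum
                g2 += o * math.log(o / expected)
    return 2.0 * g2
```

For example, `log_likelihood_ratio(50, 1000, 10, 1000)` compares a word seen 50 times in one 1,000-token corpus and 10 times in another. With one degree of freedom, values above 3.84 are conventionally read as significant at the 0.05 level, but only if the independence assumption holds.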

The first aim of this work is to examine gender-based variation in the productivity of the nominal suffixes -ness and -ity in present-day British English. Possible interpretations are presented for the findings that -ity is used less productively by women, while with -ness there is no gender difference. The second aim is to analyse the validity of hapax-based measures in sociolinguistic research. It is discovered that they require a significantly larger corpus than type-based ones, and that the category-conditioned degree of productivity P is unusable when comparing subcorpora based...

10.1515/cllt.2011.006 article EN Corpus Linguistics and Linguistic Theory 2011-01-01
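
The type-based and hapax-based measures named in the abstract above can be sketched in a few lines of Python. This is a minimal sketch that assumes the common definition of the category-conditioned degree of productivity, P = (number of hapax legomena with the suffix) / (number of tokens with the suffix); the function name and the input format are hypothetical and are not taken from the article.

```python
from collections import Counter


def productivity_measures(suffixed_tokens):
    """Type count, hapax count, and P for the tokens of one suffix.

    suffixed_tokens: list of word tokens that all carry the suffix of interest,
    e.g. every -ness word found in one subcorpus (assumed input format).
    """
    freqs = Counter(suffixed_tokens)
    n_tokens = sum(freqs.values())
    n_types = len(freqs)  # type-based measure
    n_hapaxes = sum(1 for c in freqs.values() if c == 1)  # words seen once
    # Assumed definition: P = hapaxes / tokens of the category.
    p = n_hapaxes / n_tokens if n_tokens else 0.0
    return {"tokens": n_tokens, "types": n_types, "hapaxes": n_hapaxes, "P": p}
```

Because P divides by the token count, its value depends on the size of the subcorpus it is computed from, which is one reason to be careful when comparing subcorpora of different sizes.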

This paper tracks stylistic variation in the use of two roughly synonymous suffixes, the Romance -ity and the native -ness, during the Early Modern English period. We seek to verify from a statistical viewpoint the claims of Rodríguez-Puente (2020), who reports on a decrease of -ness in favour of -ity in registers representative of the speech-written and formal-informal continua at that time. To this end, we develop new methods of visual analysis that enable diachronic comparisons of competing processes across subcorpora, building upon an earlier...

10.1075/ijcl.22014.rod article EN cc-by International Journal of Corpus Linguistics 2022-08-19

Variation in noun and pronoun frequencies in a sociohistorical corpus of English. Tanja Säily (Department of Modern Languages, University of Helsinki, Finland), Terttu Nevalainen, and Harri Siirtola (Computer Sciences, Tampere). Literary and Linguistic Computing, Volume 26, Issue 2, June 2011, Pages 167–188, https://doi.org/10.1093/llc/fqr004. Published: 06 May 2011.

10.1093/llc/fqr004 article EN Literary and Linguistic Computing 2011-05-06

This paper presents ongoing work on Säily and Suomela’s (

10.1515/cllt-2015-0064 article EN Corpus Linguistics and Linguistic Theory 2015-12-08

Research in the digital humanities and computational social sciences requires overcoming complexity in research data, methodology, and questions. In this article, we show through case studies of three different digital humanities and computational social science projects that these problems are prevalent, multiform, as well as laborious to counter. Yet, without facilities for acknowledging, detecting, handling and correcting such bias, any results based on the material will be faulty. Therefore, we argue for the need for a wider recognition and acknowledgement...

10.5617/dhnbpub.11180 article EN Digital Humanities in the Nordic and Baltic Countries Publications 2020-06-01

This paper reviews the gap between current methods of text visualization and the needs of corpus-linguistic research, and introduces a tool that takes a step towards bridging the gap. Current methods tend to treat the problem as a data-encoding issue only, and do not strive for interactive, tightly coupled representations that would foster discovery. The paper argues that such visualizations should always be linked to the text, allow effortless movement between the text and its visualization, and offer controls that provide continuous, immediate feedback to facilitate exploration. We introduce a tool,...

10.1075/ijcl.19.3.05sii article EN International Journal of Corpus Linguistics 2014-09-01

Mika Hämäläinen, Tanja Säily, Jack Rueter, Jörg Tiedemann, Eetu Mäkelä. Proceedings of the 3rd Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature. 2019.

10.18653/v1/w19-2509 article EN 2019-01-01

We introduce the Language Change Database (LCD), which provides access to the results of previous corpus-based research dealing with change in the English language. The LCD will be published on an open-access linked data platform that will allow users to enter information about their own publications into the database and to conduct searches based on linguistic and extralinguistic parameters. Both metadata and numerical data from the original publications will be available for download, enabling systematic reviews, meta-analyses, replication...

10.1515/icame-2016-0006 article EN ICAME journal 2016-03-01

Endeavors to computationally model language variation and change are ever increasing. While analyses of recent diachronic trends are frequently conducted, long-term trends accounting for sociolinguistic variation are less well-studied. Our work sheds light on the temporal dynamics of language use of British 18th century women as a group in transition across two situational contexts. Our findings reveal that in formal contexts women adapt to register conventions, while in informal contexts they act as innovators influencing others. While adopted from other...

10.3389/frai.2021.609970 article EN cc-by Frontiers in Artificial Intelligence 2021-06-01

Digitalization is changing how research is carried out in all areas of science. The humanities are no exception: materials that used to be hand-written or printed on paper are increasingly available in digital form. This development is changing how scholars are interacting with their material. We are addressing the problem of interactive text visualization in the context of sociolinguistic language study. When a scholar is reading and analyzing text from a computer screen instead of paper, we can support this by providing a dashboard for reading, creating...

10.1109/iv.2016.57 article EN 2016 20th International Conference Information Visualisation (IV) 2016-07-01

This paper describes ongoing work towards a rich analysis of the social contexts of neologism use in historical corpora, in particular the Corpora of Early English Correspondence, with research questions concerning the innovators, meanings and diffusion of neologisms. To enable this kind of study, we are developing new processes, tools and ways of combining data from different sources, including the Oxford English Dictionary, the Historical Thesaurus and contemporary published texts. Comparing candidates across these sources is...

10.1075/pc.18001.sai article EN Pragmatics & Cognition 2018-12-31