NFDI4DS | UHH-SEMS - Publication Details

Piotr Stańczyk

ORCID: 0000-0003-0124-2936

Contact & Profiles

A5013012032

Research Areas

Reinforcement Learning in Robotics
Natural Language Processing Techniques
Topic Modeling
Sports Analytics and Performance
Machine Learning and Data Classification
Economic and Fiscal Studies
Data Stream Mining Techniques
Nutrition and Health Studies
Educational Games and Gamification
Management and Organizational Practices
Labour Market and Migration
Adversarial Robustness in Machine Learning
Polish socio-economic development
Evolutionary Algorithms and Applications
Finance, Markets, and Regulation
Machine Learning and Algorithms
Social Issues in Poland
Economic and Business Development Strategies
Consumer Attitudes and Food Labeling
Speech Recognition and Synthesis
Advanced Text Analysis Techniques
Consumer Behavior in Brand Consumption and Identification
Banking, Crisis Management, COVID-19 Impact
Complex Systems and Time Series Analysis
Image Retrieval and Classification Techniques

Wroclaw University of Economics and Business
2003-2024

Google (United States)
2019-2023

Bar-Ilan University
2023

University of Białystok
2018

University of Wrocław
2018

University of the West of England
1999

Google Research Football: A Novel Reinforcement Learning Environment

OPENALEX - Publications

Karol Kurach Anton Raichuk Piotr Stańczyk Michał Zając Olivier Bachem and 6 more

Recent progress in the field of reinforcement learning has been accelerated by virtual learning environments such as video games, where novel algorithms and ideas can be quickly tested in a safe and reproducible manner. We introduce the Google Research Football Environment, a new reinforcement learning environment where agents are trained to play football in an advanced, physics-based 3D simulator. The resulting environment is challenging, easy to use and customize, and it is available under a permissive open-source license. In addition, it provides support for multiplayer...

10.1609/aaai.v34i04.5878 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2020-04-03

Gemma 2: Improving Open Language Models at a Practical Size

Gemma Team Morgane Rivière Shreya Pathak Pier Giuseppe Sessa Cassidy Hardin and 95 more

In this work, we introduce Gemma 2, a new addition to the Gemma family of lightweight, state-of-the-art open models, ranging in scale from 2 billion to 27 billion parameters. In this new version, we apply several known technical modifications to the Transformer architecture, such as interleaving local-global attentions (Beltagy et al., 2020a) and group-query attention (Ainslie et al., 2023). We also train the 2B and 9B models with knowledge distillation (Hinton et al., 2015) instead of next token prediction. The resulting models deliver the best performance for their...

10.48550/arxiv.2408.00118 preprint EN arXiv (Cornell University) 2024-07-31

What Matters In On-Policy Reinforcement Learning? A Large-Scale Empirical Study

Marcin Andrychowicz Anton Raichuk Piotr Stańczyk Manu Orsini Sertan Girgin and 7 more

In recent years, on-policy reinforcement learning (RL) has been successfully applied to many different continuous control tasks. While RL algorithms are often conceptually simple, their state-of-the-art implementations take numerous low- and high-level design decisions that strongly affect the performance of the resulting agents. Those choices are usually not extensively discussed in the literature, leading to a discrepancy between published descriptions of algorithms and their implementations. This makes it hard to attribute progress...

10.48550/arxiv.2006.05990 preprint EN other-oa arXiv (Cornell University) 2020-01-01

SEED RL: Scalable and Efficient Deep-RL with Accelerated Central Inference

Lasse Espeholt Raphaël Marinier Piotr Stańczyk Ke Wang Marcin Michalski

We present a modern scalable reinforcement learning agent called SEED (Scalable, Efficient Deep-RL). By effectively utilizing modern accelerators, we show that it is not only possible to train on millions of frames per second but also to lower the cost of experiments compared to current methods. We achieve this with a simple architecture that features centralized inference and an optimized communication layer. SEED adopts two state-of-the-art distributed algorithms, IMPALA/V-trace (policy gradients) and R2D2 (Q-learning), and is evaluated...

10.48550/arxiv.1910.06591 preprint EN cc-by-nc-sa arXiv (Cornell University) 2019-01-01

Acme: A Research Framework for Distributed Reinforcement Learning

Matthew W. Hoffman Bobak Shahriari John Aslanides Gabriel Barth-Maron Nikola Momchev and 34 more

Deep reinforcement learning (RL) has led to many recent and groundbreaking advances. However, these advances have often come at the cost of both increased scale in the underlying architectures being trained as well as increased complexity of the RL algorithms used to train them. These increases have in turn made it more difficult for researchers to rapidly prototype new ideas or reproduce published RL algorithms. To address these concerns this work describes Acme, a framework for constructing novel RL algorithms that is specifically designed to enable agents...

10.48550/arxiv.2006.00979 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Factually Consistent Summarization via Reinforcement Learning with Textual Entailment Feedback

Paul Roit Johan Ferret Lior Shani Roee Aharoni Geoffrey Cideron and 14 more

Paul Roit, Johan Ferret, Lior Shani, Roee Aharoni, Geoffrey Cideron, Robert Dadashi, Matthieu Geist, Sertan Girgin, Leonard Hussenot, Orgad Keller, Nikola Momchev, Sabela Ramos Garea, Piotr Stanczyk, Nino Vieillard, Olivier Bachem, Gal Elidan, Avinatan Hassidim, Olivier Pietquin, Idan Szpektor. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2023.

10.18653/v1/2023.acl-long.344 article EN cc-by 2023-01-01

Generalized Knowledge Distillation for Auto-regressive Language Models

Rishabh Agarwal Nino Vieillard Piotr Stańczyk Sabela Ramos Matthieu Geist and 1 more

Knowledge distillation (KD) is widely used for compressing a teacher model to reduce its inference cost and memory footprint, by training a smaller student model. However, current KD methods for auto-regressive sequence models suffer from a distribution mismatch between output sequences seen during training and those generated by the student during inference. To address this issue, we introduce Generalized Knowledge Distillation (GKD). Instead of solely relying on a fixed set of output sequences, GKD trains the student on its self-generated output sequences by leveraging feedback from the teacher on such...

10.48550/arxiv.2306.13649 preprint EN cc-by arXiv (Cornell University) 2023-01-01

Wykształcenie ludności II Rzeczypospolitej w świetle badań GUS / Education of population of the Second Polish Republic in the light of research by CSO

Piotr Stańczyk

The article presents the variation in the structure of Poland's population in the interwar period by level of education (including the ability to read and write) with respect to sex, religious denomination and place of residence. The main source of statistical data was the results of the censuses of 1921 and 1931. On the basis of the empirical material, it was found that primary education held the dominant position in the structure, accounting in total for 37.5% of the population aged 15 and over; persons with higher education accounted for...

10.15611/sie.2016.1.01 article PL cc-by-nc-nd Społeczeństwo i Ekonomia/Nauki Społeczne 2016-01-01

Rationalization of Energy Expenditure: Household Behavior in Poland

Elżbieta Stańczyk Katarzyna Szalonka Małgorzata Niklewicz-Pijaczyńska Wioletta Nowak Piotr Stańczyk and 2 more

Background: The implementation of the EU climate and energy policy, along with changes in the legal environment, has led to a significant increase in energy prices in Poland. Consequently, energy expenditures are now a larger part of household budgets. These rising costs and the evolving legal landscape are compelling households to invest in energy-saving solutions and modify their consumption habits. This article aims to identify the activities of households in Poland regarding the rationalization of energy expenditures. It formulates the following research hypothesis: appliances...

10.3390/en17215329 article EN cc-by Energies 2024-10-26

BOND: Aligning LLMs with Best-of-N Distillation

Pier Giuseppe Sessa Robert Dadashi Léonard Hussenot Johan Ferret Nino Vieillard and 15 more

Reinforcement learning from human feedback (RLHF) is a key driver of quality and safety in state-of-the-art large language models. Yet, a surprisingly simple and strong inference-time strategy is Best-of-N sampling that selects the best generation among N candidates. In this paper, we propose Best-of-N Distillation (BOND), a novel RLHF algorithm that seeks to emulate Best-of-N but without its significant computational overhead at inference time. Specifically, BOND is a distribution matching algorithm that forces the distribution of generations from the policy to get closer...

10.48550/arxiv.2407.14622 preprint EN arXiv (Cornell University) 2024-07-19

Launchpad: A Programming Model for Distributed Machine Learning Research

Fan Yang Gabriel Barth-Maron Piotr Stańczyk Matthew W. Hoffman Siqi Liu and 3 more

A major driver behind the success of modern machine learning algorithms has been their ability to process ever-larger amounts of data. As a result, the use of distributed systems in both research and production has become increasingly prevalent as a means to scale to this growing data. At the same time, however, distributing the learning process can drastically complicate the implementation of even simple algorithms. This is especially problematic as many machine learning practitioners are not well-versed in the design of distributed systems, let alone those that have complicated...

10.48550/arxiv.2106.04516 preprint EN other-oa arXiv (Cornell University) 2021-01-01

The Impact of Personality and Competence of Leaders on Business Success

Elżbieta Stańczyk Piotr Stańczyk Katarzyna Szalonka

Purpose: This article aims to identify leaders’ personality and competence traits that determine success for Polish small and medium-sized enterprises. Design/Methodology/Approach: Empirical data are selected from an experimental survey conducted by Statistics Poland in December 2017 and January 2018 as part of the Determinants of Entrepreneurship Developments in the SMEs Sector project. We used 20959 surveys of enterprises in which the leader (an owner or a manager) played a dominant role. To test dependence measures...

10.35808/ersj/1612 article EN EUROPEAN RESEARCH STUDIES JOURNAL 2020-04-01

RLDS: an Ecosystem to Generate, Share and Use Datasets in Reinforcement Learning

Sabela Ramos Sertan Girgin Léonard Hussenot Damien Vincent Hanna Yakubovich and 7 more

We introduce RLDS (Reinforcement Learning Datasets), an ecosystem for recording, replaying, manipulating, annotating and sharing data in the context of Sequential Decision Making (SDM) including Reinforcement Learning (RL), Learning from Demonstrations, Offline RL or Imitation Learning. RLDS enables not only reproducibility of existing research and easy generation of new datasets, but also accelerates novel research. By providing a standard and lossless format of datasets it enables researchers to quickly test new algorithms on a wider range of tasks. The...

10.48550/arxiv.2111.02767 preprint EN cc-by-nc-sa arXiv (Cornell University) 2021-01-01

The determinants of shopping place selection in Poland – the survey results

Anna Gardocka-Jałowiec Katarzyna Szalonka Piotr Stańczyk

Goal: The aim of the empirical research, which used an electronic survey questionnaire, was to identify the determinants of consumers' behavior on the market on the basis of the type, place and frequency of purchasing goods that satisfy basic needs. Research methodology: The results of a survey conducted online (n=482) in Poland are presented. The sample was random and representative. Score: 97% of respondents do shopping at least once a week, whereas 33% of surveyed individuals go shopping every day. There was observed a statistically significant...

10.15290/oes.2018.04.94.20 article EN Optimum Economic Studies 2018-01-01

Factually Consistent Summarization via Reinforcement Learning with Textual Entailment Feedback

Paul Roit Johan Ferret Lior Shani Roee Aharoni Geoffrey Cideron and 14 more

Despite the seeming success of contemporary grounded text generation systems, they often tend to generate factually inconsistent text with respect to their input. This phenomenon is emphasized in tasks like summarization, in which the generated summaries should be corroborated by their source article. In this work, we leverage recent progress on textual entailment models to directly address this problem for abstractive summarization systems. We use reinforcement learning with reference-free, textual entailment rewards to optimize for factual...

10.48550/arxiv.2306.00186 preprint EN cc-by arXiv (Cornell University) 2023-01-01

Determinants of medicines consumption in Poland

Elżbieta Stańczyk Katarzyna Szalonka Piotr Stańczyk Wioletta Nowak Jolanta Blicharz and 3 more

The aim of the article is to present the most important determinants of medicine consumption in Poland. The article is based on primary research conducted on a sample of 428 respondents in March and April 2020 in Poland using an online survey. A classification tree algorithm was applied to identify categories of respondents who took prescription medicines and medicines available without a prescription (over-the-counter, OTC). In addition, logistic regression was proposed to assess the relationship between taking medicines and...

10.32383/farmpol/171531 article PL cc-by-nc Farmacja Polska 2023-08-24

Google Research Football: A Novel Reinforcement Learning Environment

Karol Kurach Anton Raichuk Piotr Stańczyk Michał Zając Olivier Bachem and 6 more

10.48550/arxiv.1907.11180 preprint EN other-oa arXiv (Cornell University) 2019-01-01

Consumption of sustainable medications and its impact on health

Piotr Stańczyk Katarzyna Szalonka

Sustainable consumption of medicines is medically justified medicine consumption, indispensable in the treatment process, which contributes to the quality of life of the patient and extends the patient’s life, resulting from the doctor’s recommendations and verified by the pharmacist for the elimination of interactions. The aim of this paper is to analyze the differences in the level of medicine consumption in Poland and the wider world, and in life expectancy and health status. There is a large variation in per capita intake of medicines in certain countries. The following...

10.19195/2658-1310.25.4.6 article EN Ekonomia/Acta Universitatis Wratislaviensis. Ekonomia 2020-01-02