Piotr Stańczyk

ORCID: 0000-0003-0124-2936
Research Areas
  • Reinforcement Learning in Robotics
  • Natural Language Processing Techniques
  • Topic Modeling
  • Sports Analytics and Performance
  • Machine Learning and Data Classification
  • Economic and Fiscal Studies
  • Data Stream Mining Techniques
  • Nutrition and Health Studies
  • Educational Games and Gamification
  • Management and Organizational Practices
  • Labour Market and Migration
  • Adversarial Robustness in Machine Learning
  • Polish socio-economic development
  • Evolutionary Algorithms and Applications
  • Finance, Markets, and Regulation
  • Machine Learning and Algorithms
  • Social Issues in Poland
  • Economic and Business Development Strategies
  • Consumer Attitudes and Food Labeling
  • Speech Recognition and Synthesis
  • Advanced Text Analysis Techniques
  • Consumer Behavior in Brand Consumption and Identification
  • Banking, Crisis Management, COVID-19 Impact
  • Complex Systems and Time Series Analysis
  • Image Retrieval and Classification Techniques

Wroclaw University of Economics and Business
2003-2024

Google (United States)
2019-2023

Bar-Ilan University
2023

University of Białystok
2018

University of Wrocław
2018

University of the West of England
1999

Recent progress in the field of reinforcement learning has been accelerated by virtual environments such as video games, where novel algorithms and ideas can be quickly tested in a safe and reproducible manner. We introduce the Google Research Football Environment, a new environment where agents are trained to play football in an advanced, physics-based 3D simulator. The resulting environment is challenging, easy to use and customize, and it is available under a permissive open-source license. In addition, it provides support for multiplayer...

10.1609/aaai.v34i04.5878 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2020-04-03

In this work, we introduce Gemma 2, a new addition to the Gemma family of lightweight, state-of-the-art open models, ranging in scale from 2 billion to 27 billion parameters. In this new version, we apply several known technical modifications to the Transformer architecture, such as interleaving local-global attentions (Beltagy et al., 2020a) and group-query attention (Ainslie et al., 2023). We also train the 2B and 9B models with knowledge distillation (Hinton et al., 2015) instead of next token prediction. The resulting models deliver the best performance for their...

10.48550/arxiv.2408.00118 preprint EN arXiv (Cornell University) 2024-07-31

In recent years, on-policy reinforcement learning (RL) has been successfully applied to many different continuous control tasks. While RL algorithms are often conceptually simple, their state-of-the-art implementations take numerous low- and high-level design decisions that strongly affect the performance of the resulting agents. Those choices are usually not extensively discussed in the literature, leading to a discrepancy between published descriptions of algorithms and their implementations. This makes it hard to attribute progress...

10.48550/arxiv.2006.05990 preprint EN other-oa arXiv (Cornell University) 2020-01-01

We present a modern scalable reinforcement learning agent called SEED (Scalable, Efficient Deep-RL). By effectively utilizing accelerators, we show that it is not only possible to train on millions of frames per second but also to lower the cost of experiments compared to current methods. We achieve this with a simple architecture that features centralized inference and an optimized communication layer. SEED adopts two state of the art distributed algorithms, IMPALA/V-trace (policy gradients) and R2D2 (Q-learning), and is evaluated...

10.48550/arxiv.1910.06591 preprint EN cc-by-nc-sa arXiv (Cornell University) 2019-01-01

Deep reinforcement learning (RL) has led to many recent and groundbreaking advances. However, these advances have often come at the cost of both increased scale in the underlying architectures being trained as well as increased complexity of the RL algorithms used to train them. These increases have in turn made it more difficult for researchers to rapidly prototype new ideas or reproduce published RL algorithms. To address these concerns this work describes Acme, a framework for constructing novel RL algorithms that is specifically designed to enable agents...

10.48550/arxiv.2006.00979 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Paul Roit, Johan Ferret, Lior Shani, Roee Aharoni, Geoffrey Cideron, Robert Dadashi, Matthieu Geist, Sertan Girgin, Leonard Hussenot, Orgad Keller, Nikola Momchev, Sabela Ramos Garea, Piotr Stanczyk, Nino Vieillard, Olivier Bachem, Gal Elidan, Avinatan Hassidim, Olivier Pietquin, Idan Szpektor. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2023.

10.18653/v1/2023.acl-long.344 article EN cc-by 2023-01-01

Knowledge distillation (KD) is widely used for compressing a teacher model to reduce its inference cost and memory footprint, by training a smaller student model. However, current KD methods for auto-regressive sequence models suffer from distribution mismatch between output sequences seen during training and those generated by the student during inference. To address this issue, we introduce Generalized Knowledge Distillation (GKD). Instead of solely relying on a fixed set of output sequences, GKD trains the student on its self-generated output sequences by leveraging feedback from the teacher on such...

10.48550/arxiv.2306.13649 preprint EN cc-by arXiv (Cornell University) 2023-01-01

The article presents the differentiation of the structure of Poland's population in the interwar period by level of education (including the ability to read and write) with respect to sex, religious denomination, and place of residence. The main source of statistical data was the results of the censuses of 1921 and 1931. On the basis of the empirical material, it was found that primary education held the dominant position in the structure, a total of 37.5% of the population aged 15 and over; persons with higher education accounted for...

10.15611/sie.2016.1.01 article PL cc-by-nc-nd Społeczeństwo i Ekonomia/Nauki Społeczne 2016-01-01

Background: The implementation of the EU climate and energy policy, along with changes in the legal environment, has led to a significant increase in energy prices in Poland. Consequently, energy expenditures are now a larger part of household budgets. These rising costs and the evolving legal landscape are compelling households to invest in energy-saving solutions and modify their consumption habits. This article aims to identify the activities of households in Poland regarding the rationalization of energy expenditures. It formulates the following research hypothesis: appliances...

10.3390/en17215329 article EN cc-by Energies 2024-10-26

Reinforcement learning from human feedback (RLHF) is a key driver of quality and safety in state-of-the-art large language models. Yet, a surprisingly simple and strong inference-time strategy is Best-of-N sampling, which selects the best generation among N candidates. In this paper, we propose Best-of-N Distillation (BOND), a novel RLHF algorithm that seeks to emulate Best-of-N but without its significant computational overhead at inference time. Specifically, BOND is a distribution matching algorithm that forces the distribution of generations from the policy to get closer...

10.48550/arxiv.2407.14622 preprint EN arXiv (Cornell University) 2024-07-19
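
The Best-of-N strategy mentioned in the abstract above (draw N candidates, keep the one a reward function scores highest) can be illustrated with a minimal sketch. This is not the BOND algorithm and not the authors' code: `best_of_n`, `toy_generate`, and `toy_reward` are hypothetical names, and the toy reward (text length) only stands in for a learned reward model.

```python
import random
from typing import Callable, List, Tuple


def best_of_n(
    prompt: str,
    generate: Callable[[str], str],
    reward: Callable[[str, str], float],
    n: int,
) -> Tuple[str, List[str]]:
    """Sample n candidates and return the highest-reward one plus all candidates."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # Each call to generate() is one independent sample from the policy.
    candidates = [generate(prompt) for _ in range(n)]
    # Keep the candidate that the reward function scores highest.
    best = max(candidates, key=lambda text: reward(prompt, text))
    return best, candidates


# Hypothetical stand-ins for a language model and a reward model.
_rng = random.Random(0)
_VOCAB = ["ok", "fine", "great answer", "a much longer reply"]


def toy_generate(prompt: str) -> str:
    """Pretend policy: returns a random canned reply."""
    return _rng.choice(_VOCAB)


def toy_reward(prompt: str, text: str) -> float:
    """Pretend reward model: longer replies score higher."""
    return float(len(text))


best, candidates = best_of_n("hello", toy_generate, toy_reward, n=4)
print(candidates, "->", best)
```

The sketch makes the inference-time cost visible: every answer requires N calls to `generate` and N calls to `reward`, which is the overhead the abstract says BOND aims to avoid.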

A major driver behind the success of modern machine learning algorithms has been their ability to process ever-larger amounts of data. As a result, the use of distributed systems in both research and production has become increasingly prevalent as a means to scale to this growing data. At the same time, however, distributing the learning process can drastically complicate the implementation of even simple algorithms. This is especially problematic as many machine learning practitioners are not well-versed in the design of distributed systems, let alone those that have complicated...

10.48550/arxiv.2106.04516 preprint EN other-oa arXiv (Cornell University) 2021-01-01

Purpose: This article aims to identify leaders' personality and competence traits that determine success for Polish small and medium-sized enterprises. Design/Methodology/Approach: Empirical data are selected from an experimental survey conducted by Statistics Poland in December 2017 and January 2018 as part of the Determinants of Entrepreneurship Developments in the SMEs Sector project. We used 20959 surveys of enterprises in which the leader (an owner or a manager) played the dominant role. To test dependence measures...

10.35808/ersj/1612 article EN EUROPEAN RESEARCH STUDIES JOURNAL 2020-04-01

We introduce RLDS (Reinforcement Learning Datasets), an ecosystem for recording, replaying, manipulating, annotating and sharing data in the context of Sequential Decision Making (SDM) including Reinforcement Learning (RL), Learning from Demonstrations, Offline RL or Imitation Learning. RLDS enables not only reproducibility of existing research and easy generation of new datasets, but also accelerates novel research. By providing a standard and lossless format of datasets it enables researchers to quickly test new algorithms on a wider range of tasks. The...

10.48550/arxiv.2111.02767 preprint EN cc-by-nc-sa arXiv (Cornell University) 2021-01-01

Goal - The aim of the empirical research using an electronic survey questionnaire was to identify the determinants of consumers' behavior on the market on the basis of the type, place and frequency of purchasing goods that satisfy basic needs. Research methodology - The results of research conducted online (n=482) in Poland were presented. The character of the sample was random and representative. Score - The survey showed that 97% of respondents do their shopping at least once a week, whereas 33% of surveyed individuals go shopping every day. There were observed statistically essential...

10.15290/oes.2018.04.94.20 article EN Optimum Economic Studies 2018-01-01

Despite the seeming success of contemporary grounded text generation systems, they often tend to generate text that is factually inconsistent with respect to their input. This phenomenon is emphasized in tasks like summarization, in which the generated summaries should be corroborated by their source article. In this work, we leverage recent progress on textual entailment models to directly address this problem for abstractive summarization systems. We use reinforcement learning with reference-free, textual entailment rewards to optimize for factual...

10.48550/arxiv.2306.00186 preprint EN cc-by arXiv (Cornell University) 2023-01-01

The aim of the article is to present the most important factors of medication consumption in Poland. The article is based on primary research conducted on a sample of 428 respondents in March and April 2020 in Poland using an online survey. A classification tree algorithm was used to identify categories of respondents who took prescription medications and over-the-counter (OTC) medications. In addition, logistic regression was proposed to assess the relationship between taking medications and...

10.32383/farmpol/171531 article PL cc-by-nc Farmacja Polska 2023-08-24

Recent progress in the field of reinforcement learning has been accelerated by virtual environments such as video games, where novel algorithms and ideas can be quickly tested in a safe and reproducible manner. We introduce the Google Research Football Environment, a new environment where agents are trained to play football in an advanced, physics-based 3D simulator. The resulting environment is challenging, easy to use and customize, and it is available under a permissive open-source license. In addition, it provides support for multiplayer...

10.48550/arxiv.1907.11180 preprint EN other-oa arXiv (Cornell University) 2019-01-01

Consumption of sustainable medications and its impact on health. Sustainable consumption of medicines is medically justified medicine consumption, indispensable in the treatment process, which contributes to the quality of life of the patient and extends the patient's life, resulting from the doctor's recommendations and verified by a pharmacist for the elimination of interactions. The aim of this paper is to analyze the differences in the level of medicine consumption in Poland and the wider world, and its impact on life expectancy and health status. There is a large variation in per capita intake of medicines in certain countries. The following...

10.19195/2658-1310.25.4.6 article EN Ekonomia/Acta Universitatis Wratislaviensis. Ekonomia 2020-01-02