Jan Michal Dubinski

ORCID: 0000-0002-2568-0132
Research Areas
  • High-Energy Particle Collisions Research
  • Particle physics theoretical and experimental studies
  • Quantum Chromodynamics and Particle Interactions
  • Particle Detector Development and Performance
  • Nuclear reactor physics and engineering
  • Generative Adversarial Networks and Image Synthesis
  • Computational Physics and Python Applications
  • Model Reduction and Neural Networks
  • Dark Matter and Cosmic Phenomena
  • Statistical Methods and Bayesian Inference
  • Adversarial Robustness in Machine Learning
  • Superconducting Materials and Applications
  • Nuclear physics research studies
  • Advanced Data Storage Technologies
  • Advanced Graph Neural Networks
  • Stochastic processes and statistical mechanics
  • Anomaly Detection Techniques and Applications
  • Cosmology and Gravitation Theories
  • Domain Adaptation and Few-Shot Learning
  • Pediatric Urology and Nephrology Studies
  • Muon and positron interactions and applications
  • Art History and Market Analysis
  • Music and Audio Processing
  • Security and Verification in Computing
  • Privacy-Preserving Technologies in Data

Warsaw University of Technology
2020-2025

Center for Theoretical Physics
2024

A. Alikhanyan National Laboratory
2022-2024

Innovative Designs in Environments for an Aging Society
2024

Integrated Detector Electronics AS (Norway)
2024

Corporación Universitaria de Colombia Ideas
2024

Currently, over 50% of the computing power at CERN's GRID is used to run High Energy Physics simulations. The recent updates to the Large Hadron Collider (LHC) create a need for developing more efficient simulation methods. In particular, there exists a demand for a fast simulation of the neutron Zero Degree Calorimeter, where existing Monte Carlo-based methods impose a significant computational burden. We propose an alternative approach to the problem that leverages machine learning. Our solution utilises neural network classifiers...

10.1063/5.0203567 article EN AIP Conference Proceedings 2024-01-01

Generative diffusion models, including Stable Diffusion and Midjourney, can generate visually appealing, diverse, high-resolution images for various applications. These models are trained on billions of internet-sourced images, raising significant concerns about the potential unauthorized use of copyright-protected images. In this paper, we examine whether it is possible to determine if a specific image was used in the training set, a problem known in the cybersecurity community as membership inference...

10.1109/wacv57701.2024.00479 article EN 2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2024-01-03

Image autoregressive (IAR) models have surpassed diffusion models (DMs) in both image quality (FID: 1.48 vs. 1.58) and generation speed. However, their privacy risks remain largely unexplored. To address this, we conduct a comprehensive analysis comparing IARs to DMs. We develop a novel membership inference attack (MIA) that achieves a significantly higher success rate in detecting training images (TPR@FPR=1%: 86.38% for IARs vs. 4.91% for DMs). Using this MIA, we perform dataset inference (DI) and find that IARs require as few as six samples...

10.48550/arxiv.2502.02514 preprint EN arXiv (Cornell University) 2025-02-04

In this work, we propose a novel end-to-end Sinkhorn Autoencoder with a noise generator for efficient data collection simulation. Simulating processes that aim at collecting experimental data is crucial for multiple real-life applications, including nuclear medicine, astronomy, and high energy physics. Contemporary methods, such as Monte Carlo algorithms, provide high-fidelity results at the price of high computational cost. Multiple attempts are taken to reduce this burden, e.g. using generative approaches based on...

10.1109/access.2020.3048622 article EN cc-by IEEE Access 2020-12-31

Graph Neural Networks (GNNs) are recognized as potent tools for processing real-world data organized in graph structures. Especially inductive GNNs, which enable the processing of graph-structured data without relying on predefined graph structures, are gaining importance in an increasingly wide variety of applications. As these networks demonstrate proficiency across a range of tasks, they become lucrative targets for model-stealing attacks where an adversary seeks to replicate the functionality of the targeted network. A large effort has...

10.48550/arxiv.2405.12295 preprint EN arXiv (Cornell University) 2024-05-20

The research of innovative methods aimed at reducing costs and shortening the time needed for simulation, going beyond conventional approaches based on Monte Carlo methods, has been sparked by the development of collision simulations at the Large Hadron Collider at CERN. Deep learning generative methods, including VAE, GANs and diffusion models, have been used for this purpose. Although they are much faster and simpler than standard approaches, they do not always keep high fidelity of the simulated data. This work aims to mitigate this issue,...

10.48550/arxiv.2405.14049 preprint EN arXiv (Cornell University) 2024-05-22

In High Energy Physics, simulations play a crucial role in unraveling the complexities of particle collision experiments within CERN's Large Hadron Collider. Machine learning simulation methods have garnered attention as promising alternatives to traditional approaches. While existing methods mainly employ Variational Autoencoders (VAEs) or Generative Adversarial Networks (GANs), recent advancements highlight the efficacy of diffusion models as state-of-the-art generative machine learning methods. We present the first simulation for...

10.48550/arxiv.2406.03233 preprint EN arXiv (Cornell University) 2024-06-05

Simulating detector responses is a crucial part of understanding the inner workings of particle collisions in the Large Hadron Collider at CERN. The current reliance on statistical Monte Carlo simulations strains CERN's computational grid, underscoring the urgency for more efficient alternatives. Addressing these challenges, recent proposals advocate for generative machine learning methods. In this study, we present an innovative deep learning simulation approach tailored for the proton Zero Degree Calorimeter in the ALICE...

10.48550/arxiv.2406.03263 preprint EN arXiv (Cornell University) 2024-06-05

We introduce Mediffusion -- a new method for semi-supervised learning with explainable classification based on a joint diffusion model. The medical imaging domain faces unique challenges due to scarce data labelling, insufficient for standard training, and the critical nature of the applications, which require high performance, confidence, and explainability of the models. In this work, we propose to tackle those challenges with a single model that combines classification with a diffusion-based generative task in a shared parametrisation. By sharing...

10.48550/arxiv.2411.09434 preprint EN arXiv (Cornell University) 2024-11-12

Diffusion Models (DMs) benefit from large and diverse datasets for their training. Since this data is often scraped from the Internet without permission from the data owners, this raises concerns about copyright and intellectual property protections. While (illicit) use of data is easily detected for training samples perfectly re-created by a DM at inference time, it is much harder for data owners to verify if their data was used for training when the outputs from the suspect DM are not close replicas. Conceptually, membership inference attacks (MIAs), which detect if a given data point was used during...

10.48550/arxiv.2411.12858 preprint EN arXiv (Cornell University) 2024-11-19

Large-scale vision models have become integral in many applications due to their unprecedented performance and versatility across downstream tasks. However, the robustness of these foundation models has primarily been explored for a single task, namely image classification. The vulnerability of other common vision tasks, such as semantic segmentation and depth estimation, remains largely unknown. We present a comprehensive empirical evaluation of the adversarial robustness of self-supervised vision encoders across multiple downstream tasks. Our attacks operate...

10.48550/arxiv.2407.12588 preprint EN arXiv (Cornell University) 2024-07-17

In this work, we propose a novel end-to-end Sinkhorn autoencoder with a noise generator for efficient data collection simulation. Simulating processes that aim at collecting experimental data is crucial for multiple real-life applications, including nuclear medicine, astronomy and high energy physics. Contemporary methods, such as Monte Carlo algorithms, provide high-fidelity results at the price of high computational cost. Multiple attempts are taken to reduce this burden, e.g. using generative approaches based on...

10.48550/arxiv.2006.06704 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Generative diffusion models, including Stable Diffusion and Midjourney, can generate visually appealing, diverse, high-resolution images for various applications. These models are trained on billions of internet-sourced images, raising significant concerns about the potential unauthorized use of copyright-protected images. In this paper, we examine whether it is possible to determine if a specific image was used in the training set, a problem known in the cybersecurity community and referred to as membership...

10.48550/arxiv.2306.12983 preprint EN cc-by arXiv (Cornell University) 2023-01-01

Currently, over half of the computing power at CERN GRID is used to run High Energy Physics simulations. The recent updates to the Large Hadron Collider (LHC) create a need for developing more efficient simulation methods. In particular, there exists a demand for a fast simulation of the neutron Zero Degree Calorimeter, where existing Monte Carlo-based methods impose a significant computational burden. We propose an alternative approach to the problem that leverages machine learning. Our solution utilises neural network classifiers...

10.48550/arxiv.2306.13606 preprint EN cc-by arXiv (Cornell University) 2023-01-01

Machine Learning as a Service (MLaaS) APIs provide ready-to-use and high-utility encoders that generate vector representations for given inputs. Since these encoders are very costly to train, they become lucrative targets for model stealing attacks, during which an adversary leverages query access to the API to replicate the encoder locally at a fraction of the original training costs. We propose Bucks for Buckets (B4B), the first active defense that prevents stealing while the attack is happening without degrading representation quality...

10.48550/arxiv.2310.08571 preprint EN cc-by arXiv (Cornell University) 2023-01-01

We introduce a new method for internal replay that modulates the frequency of rehearsal based on the depth of the network. While replay strategies mitigate the effects of catastrophic forgetting in neural networks, recent works on generative replay show that performing the rehearsal only on the deeper layers of the network improves the performance in continual learning. However, the generative approach introduces additional computational overhead, limiting its applications. Motivated by the observation that earlier layers of neural networks forget less abruptly, we propose to update network layers with varying frequency using...

10.48550/arxiv.2207.01562 preprint EN cc-by arXiv (Cornell University) 2022-01-01

Generative Adversarial Networks (GANs) are powerful models able to synthesize data samples closely resembling the distribution of real data, yet the diversity of those generated samples is limited due to the so-called mode collapse phenomenon observed in GANs. Especially prone to mode collapse are conditional GANs, which tend to ignore the input noise vector and focus on the conditional information. Recent methods proposed to mitigate this limitation increase the diversity of generated samples, yet they reduce the performance of the models when similarity of samples is required. To address this shortcoming, we propose a novel...

10.48550/arxiv.2207.01561 preprint EN cc-by arXiv (Cornell University) 2022-01-01