Pooya Jannaty

ORCID: 0009-0009-8016-8156
Research Areas
  • Image Retrieval and Classification Techniques
  • Distributed and Parallel Computing Systems
  • Medical Image Segmentation Techniques
  • 3D Surveying and Cultural Heritage
  • Image Processing Techniques and Applications

Nvidia (United States)
2024

Physical AI needs to be trained digitally first. It needs a digital twin of itself, the policy model, and a digital twin of the world, the world model. In this paper, we present the Cosmos World Foundation Model Platform to help developers build customized world models for their Physical AI setups. We position a world foundation model as a general-purpose world model that can be fine-tuned into customized world models for downstream applications. Our platform covers a video curation pipeline, pre-trained world foundation models, examples of post-training of pre-trained world foundation models, and video tokenizers. To help Physical AI builders solve the most critical problems of our society,...

10.48550/arxiv.2501.03575 preprint EN arXiv (Cornell University) 2025-01-07

We introduce GenUSD, an end-to-end text-to-scene generation framework that transforms natural language queries into realistic 3D scenes, including objects and layouts. The process involves two main steps: 1) A Large Language Model (LLM) generates a scene layout hierarchically. It first proposes a high-level plan to decompose the scene into multiple functionally and spatially distinct subscenes. Then, for each subscene, the LLM proposes objects with detailed positions, poses, sizes, and descriptions. To manage complex object...

10.1145/3641520.3665306 article EN 2024-07-25

We introduce Edify Image, a family of diffusion models capable of generating photorealistic image content with pixel-perfect accuracy. Edify Image utilizes cascaded pixel-space diffusion models trained using a novel Laplacian diffusion process, in which image signals at different frequency bands are attenuated at varying rates. Edify Image supports a wide range of applications, including text-to-image synthesis, 4K upsampling, ControlNets, 360 HDR panorama generation, and finetuning for image customization.

10.48550/arxiv.2411.07126 preprint EN arXiv (Cornell University) 2024-11-11