Jason Lee

ORCID: 0000-0003-1604-1395
Research Areas
  • Parallel Computing and Optimization Techniques
  • Advanced Data Storage Technologies
  • Distributed systems and fault tolerance
  • Interconnection Networks and Systems
  • Embedded Systems Design Techniques
  • Natural Language Processing Techniques
  • Caching and Content Delivery
  • Distributed and Parallel Computing Systems
  • Peer-to-Peer Network Technologies
  • Privacy-Preserving Technologies in Data
  • Cloud Computing and Resource Management
  • Topic Modeling
  • Stochastic Gradient Optimization Techniques
  • Adversarial Robustness in Machine Learning
  • Scientific Computing and Data Management
  • Text Readability and Simplification
  • Generative Adversarial Networks and Image Synthesis
  • Machine Learning and ELM
  • Domain Adaptation and Few-Shot Learning
  • Topological and Geometric Data Analysis
  • Numerical Methods and Algorithms
  • Multimodal Machine Learning Applications
  • 3D Shape Modeling and Analysis
  • Image and Signal Denoising Methods
  • Model Reduction and Neural Networks

Los Alamos National Laboratory
2018-2024

Google (United States)
2021

Princeton Public Schools
2019-2020

New York University
2019

The University of Melbourne
2017

Lawrence Berkeley National Laboratory
1994-2011

Simon Fraser University
2008-2010

Non-volatile, byte-addressable memory (NVM) has been introduced by Intel in the form of NVDIMMs named Intel® Optane™ DC PMM. This module has the ability to persist the data stored in it without the need for power. It expands the memory hierarchy into a hybrid system due to differences in access latency and bandwidth from DRAM, which is the predominant main memory technology. The Optane modules have up to 8x the capacity of DDR4 DRAM and can expand the byte-address space up to 6 TB per node. Many applications can now scale their problem size given such a system. We evaluate...

10.1145/3357526.3357541 article EN Proceedings of the International Symposium on Memory Systems 2019-09-30

Jason Lee, Kyunghyun Cho, Douwe Kiela. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). 2019.

10.18653/v1/d19-1447 article EN cc-by 2019-01-01

We have designed, built, and analyzed a distributed parallel storage system that will supply image streams fast enough to permit multi-user, “real-time”, video-like applications in a wide-area ATM network-based Internet environment. We based the implementation on user-level code in order to secure portability; we characterized the performance bottlenecks arising from operating system and hardware issues, and with this we optimized our design to make the best use of the available performance. Although at this time we have only operated with a few classes...

10.1145/192593.192709 article EN 1994-01-01

Summary: Debugging is crucial for producing reliable software. One of the effective bug localization techniques is spectral‐based fault localization (SBFL). It helps to locate a buggy statement by applying an evaluation metric to program spectra and ranking program components on the basis of the score it computes. SBFL is an example of dynamic analysis – an analysis of a computer program that is performed by executing it with a sufficient number of test cases. Static analysis, on the other hand, is performed in a non‐runtime environment. We introduce a weighting technique by combining these two kinds...

10.1002/spe.2490 article EN Software Practice and Experience 2017-03-13

Modern FPGAs are able to implement complex systems such as Systems-on-Chips (SoCs) and Networks-on-Chips (NoCs). Appropriate NoC topology choices for ASICs have been investigated, and typically topologies that can be easily mapped to a two-dimensional fabric are used to reduce chip area and ensure electrical characteristics. However, for FPGAs, each device's size and routing fabric are fixed. Since these resources exist independent of their use, the choice of topology is only limited by the performance itself. In this work, we investigate how...

10.1145/1723112.1723118 article EN 2010-02-21

We study the problem of designing privacy-enhanced solutions for interest-based advertisement (IBA). IBA is a key component of the online ads ecosystem and provides a better ad experience to users. Indeed, IBA enables advertisers to show users impressions that are relevant to them. Nevertheless, the current way tech companies achieve this is by building detailed interest profiles for individual users. In this work we ask whether such fine-grained personalization is required, and present mechanisms that achieve competitive performance while giving...

10.1145/3447548.3467180 article EN 2021-08-13

Key–value (KV) software has proven useful to a wide variety of applications including analytics, time-series databases, and distributed file systems. To satisfy the requirements of diverse workloads, KV stores have been carefully tailored to best match the performance characteristics of the underlying solid-state block devices. The emerging KV storage device is a promising technology for both simplifying the stack and improving persistent storage-based applications. However, while providing fast, predictable put and get...

10.1145/3582013 article EN ACM Transactions on Storage 2023-01-21

Past research has identified a rich set of handcrafted linguistic features that can potentially assist various tasks. However, their extensive number makes it difficult to effectively select and utilize existing features. Coupled with the problem of inconsistent implementation across works, there has been no categorization scheme or generally-accepted feature names. This creates unwanted confusion. Also, there is no actively-maintained open-source library that extracts a wide variety of features. The current extraction practices...

10.18653/v1/2023.bea-1.1 article EN cc-by 2023-01-01

Popular software key-value stores such as LevelDB and RocksDB are often tailored for efficient writing. Yet, they tend to also perform well on read operations. This is because while data is initially stored in a format that favors writes, it is later transformed by the DB in the background into a format that better accommodates reads. Write-optimized stores can still block writes. This happens when those background workers cannot keep up with the foreground insertion workload. This paper advocates a hardware-accelerated store, enabling...

10.1109/cluster52292.2023.00019 article EN 2023-10-31

Modern FPGAs are used to implement complex Systems-on-Chip (SoCs) and, more recently, Networks-on-Chip (NoCs). NoCs consist of computing nodes that are connected via switches or routers to a network of point-to-point links that define the topology. Previous work has investigated appropriate topology choices for ASICs as dictated by their electrical characteristics. However, since FPGAs have prefabricated interconnect, NoC implementations are not restricted by these concerns. Preliminary work looked at homogeneous...

10.1109/fpt.2009.5377628 article EN 2009-12-01

Driven by the growing data transfer needs of the scientific community and the standardization of the 100 Gbps Ethernet Specification, 100 Gbps is now becoming a reality for many HPC sites. This tenfold increase in bandwidth creates a number of significant technical challenges. We show that by using the heavy tail flow effect as a filter, it should be possible to perform active IDS analysis at this traffic rate using a cluster of commodity systems driven by a dedicated load balancing mechanism. Additionally, we examine the nature of current network...

10.1145/2063348.2063367 article EN 2011-11-12

Generative adversarial networks (GANs) are a widely used framework for learning generative models. Wasserstein GANs (WGANs), one of the most successful variants of GANs, require solving a minmax optimization problem to global optimality, but are in practice successfully trained using stochastic gradient descent-ascent. In this paper, we show that, when the generator is a one-layer network, stochastic gradient descent-ascent converges to a global solution with polynomial time and sample complexity.

10.48550/arxiv.1910.07030 preprint EN other-oa arXiv (Cornell University) 2019-01-01

We describe the design and implementation of a distributed parallel storage system that uses high-speed ATM networks as a key element of the architecture. Other elements include a collection of network-based disk block servers, and an associated name server that provides some file system functionality. The implementation is based on user level software that runs on UNIX workstations. Both the architecture and the implementation are intended to provide for easy and economical scalability. This approach has yielded a data source that scales economically to very high speed. Target...

10.5555/602770.602872 article EN Conference on High Performance Computing (Supercomputing) 1994-11-14

Nonuniform Memory Access (NUMA) will likely continue to be the chief abstraction used to expose heterogeneous memory. One major problem with using NUMA in this way is that the assignment of memory to devices, mediated by the hardware and the Linux OS, is only resolved to page granularity. That is, pages, not allocations, are explicitly assigned to memory devices. This is particularly troublesome if one wants to migrate data between devices: since only pages can be migrated, other data allocated on the same pages will be migrated as well, and it isn't easy to tell what...

10.1145/3286475.3286568 article EN 2018-11-11

The Exascale Computing Project (ECP)’s Simplified Interface to Complex Memories (SICM) effort focuses on developing universal interfaces for discovering, managing, and sharing data across complex memory hierarchies. These interfaces facilitate the exploitation of emerging memory technologies and support precise control over their various trade-offs such as high-bandwidth versus low-latency, persistent versus ephemeral, high-capacity versus low-capacity, and near-CPU versus near-GPU. SICM comprises three interrelated components: a...

10.1177/10943420241288243 article EN The International Journal of High Performance Computing Applications 2024-11-03

Many complex systems require the use of floating point arithmetic that is exceedingly time consuming to perform on personal computers. However, floating point operators are also hardware resource intensive and require longer latencies than fixed point operators to complete. Due to the reduced logic density of FPGAs relative to ASICs, it is often only possible to accelerate a portion of an application in hardware. This paper presents an application-specific architecture for the acceleration of a complete Fourier Integral Operator (FIO) kernel used in seismic imaging...

10.1109/asap.2008.4580178 article EN 2008-07-01

PCCE Project background: Our experience in building distributed collaboratories has shown us that there is a growing need for simple, non-intrusive, and flexible ways to stay in touch and work together. Towards this goal we are developing a Pervasive Collaborative Computing Environment (PCCE) within which participants can rendezvous and interact with each other. The PCCE aims to support continuous or ad hoc collaboration, target daily tasks and base connectivity, be easy to use and install across multiple platforms,...

10.11578/dc.20220414.4 article EN OSTI OAI (U.S. Department of Energy Office of Scientific and Technical Information) 2004-05-15

Emergent multi-agent communication protocols are very different from natural language and not easily interpretable by humans. We find that agents that were initially pretrained to produce natural language can also experience detrimental language drift: when a non-linguistic reward is used in a goal-based task, e.g. some scalar success metric, the communication protocol may radically diverge from natural language. We recast translation as a game and examine auxiliary training constraints for their effectiveness in mitigating drift. We show that a combination of syntactic...

10.48550/arxiv.1909.04499 preprint EN other-oa arXiv (Cornell University) 2019-01-01