- Parallel Computing and Optimization Techniques
- Embedded Systems Design Techniques
- Interconnection Networks and Systems
- Distributed and Parallel Computing Systems
- Robotics and Sensor-Based Localization
- Real-Time Systems Scheduling
- Low-power high-performance VLSI design
- Advanced Data Storage Technologies
- VLSI and FPGA Design Techniques
- Computational Geometry and Mesh Generation
- Cloud Computing and Resource Management
- Cryptography and Residue Arithmetic
- Data Management and Algorithms
- Remote Sensing and LiDAR Applications
- Advanced Image and Video Retrieval Techniques
- Security and Verification in Computing
- VLSI and Analog Circuit Testing
- Cryptography and Data Security
- Graph Theory and Algorithms
- Robotic Path Planning Algorithms
- Autonomous Vehicle Technology and Safety
- Real-time simulation and control systems
- Advanced Neural Network Applications
- Advanced Manufacturing and Logistics Optimization
- Digital Filter Design and Implementation
Nagoya University
2016-2025
Computing Research Association
2018
Waseda University
2013
NEC (Japan)
1993-2010
The University of Tokyo
1985-2009
Princeton University
1993-2005
Article: A clustering-based optimization algorithm in zero-skew routings. Author: Masato Edahiro. DAC '93: Proceedings of the 30th international Design Automation Conference, July 1993, Pages 612–616. https://doi.org/10.1145/157485.165066. Online: 01 July 1993.
Graphics processing units (GPUs) provide an order-of-magnitude improvement in peak performance and performance-per-watt compared to traditional multicore CPUs. However, GPU-accelerated systems currently lack a generalized method of power prediction, which prevents system designers from reaching the ultimate goal of dynamic optimization. This is due to the fact that their characteristics are not well captured across architectures; as a result, existing modeling approaches are only available for a limited range...
In this paper, we aim to explore path following. We implement a path following component by referring to the existing Pure Pursuit algorithm. Using simulation and field operational tests, we identified problems in the component. The main problems were with respect to vehicles meandering off the path, turning corners, and instability of steering control. Therefore, we apply some modifications and have also conducted the tests again to evaluate these modifications.
This study addresses the task-scheduling optimization challenges in parallel computing systems using a novel meta-heuristic framework. We analyze differential evolution task scheduling and propose an advanced shift-chain methodology to improve cooperation between components. The proposed framework introduces a waiting time-based neighborhood exploration strategy for handling complex dependencies, along with two implementation approaches: basic matching vector (MV) parallelization and event-driven...
Graphics processing units (GPUs) embrace many-core compute devices where massively parallel threads are offloaded from CPUs. This heterogeneous nature of GPU computing raises non-trivial data transfer problems, especially for latency-critical real-time systems. However, even the basic characteristics of the data transfers associated with GPU computing are not well studied in the literature. In this paper, we investigate and characterize currently achievable data transfer methods of cutting-edge GPU technology. We implement these methods using...
Figure: Monocular Visual Localization in Tsukuba Challenge 2015. Left: result of localization inside the map created by ORB-SLAM. Right: position tracking at starting point.
For the Tsukuba Challenge 2015, we realized an implementation of vision-based localization based on ORB-SLAM. Our method combined mapping based on ORB-SLAM and Velodyne LIDAR SLAM, and utilized these maps in a localization process using only a monocular camera. We also apply sensor fusion with the odometer across all maps. This delivered better accuracy...
A bucket algorithm is proposed for zero-skew routing with linear time complexity on average. Our algorithm is much simpler and more efficient than the best known algorithm, which uses Delaunay triangulations for segments on Manhattan distance. Experimental results show that the linearity of our algorithm is accomplished. Our algorithm generates a zero-skew routing for 3000-pin benchmark data within 5 seconds on a 90 MIPS RISC workstation.
Discusses delay minimization for zero skew routing technologies and applications.
We propose a secure platform on a chip multiprocessor, FIDES, in order to enable next-generation mobile terminals to execute downloaded native applications for Linux. Its most important feature is higher security based on multigrained separation mechanisms. Four new technologies support the FIDES platform: bus filter logic, XIP kernels, policy separation, and dynamic access control. With these technologies, FIDES can tolerate both application-level and kernel-level bugs on an actual download subsystem. Thus,...
A triple-CPU mobile application processor is developed on an 8.95 mm × 8.95 mm die in a 0.13 µm CMOS process. The IC integrates 3 × ARM926 cores, a DSP, and several accelerators, as well as strong bus and memory interfaces. It consumes 120 mW for digital TV, Web browser, and 3D graphics, and 250 mW at 200 MHz and 600 MIPS with full processing.
Vision-based object detection using camera sensors is an essential piece of perception for autonomous vehicles. Various combinations of features and models can be applied to increase the quality and speed of detection. A well-known approach uses histograms of oriented gradients (HOG) with deformable models to detect a car in an image [15]. A major challenge of this approach is found in its computational cost, which introduces a real-time constraint relevant to the real world. In this paper, we present an implementation technique for graphics processing units (GPUs)...
The problem of scaling out relational join performance for large data sets in the database management system (DBMS) has been studied for years. Although in-memory DBMS engines can reduce load times by storing data in main memory, join queries still remain computationally expensive. Modern graphics processing units (GPUs) provide massively parallel computing and may enhance the performance of such queries; however, it is not yet clear under what conditions joins perform well on GPUs. In this paper, we identify the characteristics of GPU...
Preemption techniques for hardware (HW) tasks have been studied in order to improve system responsiveness at the task level and utilization of the FPGA area. This letter presents a fair comparison of existing state-of-the-art preemption approaches from the point of view of their capabilities and limitations, as well as their impact on the static and dynamic properties of the task. In the comparison, we use a set of cryptographic, image, and audio processing HW tasks and perform tests on a common platform based on a Virtex-4 from Xilinx. Furthermore, we propose a method which can...
A 125 MHz 1 GIPS at 1.3 V W microprocessor with a single-chip tightly-coupled multiprocessor architecture and low-voltage circuits is targeted to high-performance and low-power embedded systems, especially smart information terminals. This paper shows an entire chip diagram integrating four processors. Each processing element (PE) is an in-order two-way issue superscalar with two ALU pipelines. A power-management unit (PMU) cuts off the leakage current of each power-control domain independently using dedicated...
Dynamic Partial Reconfiguration technology coupled with an Operating System for Reconfigurable Systems (OS4RS) allows implementation of a hardware task concept, that is, an active computing object which can contend for reconfigurable resources and request OS services in the way a software task does in a conventional OS. In this work, we show a complete model of a lightweight OS4RS supporting preemptable and clock-scalable hardware tasks. We also propose a novel scheduling mechanism allowing timely and priority-based reservation of resources,...
Clustering is the task of dividing an input dataset into groups of objects based on their similarity. This process is frequently required in many applications. However, it is computationally expensive when running on traditional CPUs due to the large number of connections and objects the system needs to inspect. In this paper, we investigate the use of NVIDIA graphics processing units and their programming platform CUDA for the acceleration of Euclidean clustering (EC) in autonomous driving systems. We propose GPU-accelerated algorithms for the EC problem...
In this paper, we propose a novel methodology for scheming an interconnect strategy, such as what structure should be taken, how repeaters should be inserted, and when new metal or dielectric materials should be adopted. In the methodology, a strategic system performance analysis model is newly developed as a calculation model that predicts LSI operation frequency and chip size from the electrical parameters of transistors and interconnects as well as the circuit configuration. The model indicates that interconnect delay overcomes block cycle time at a specific length;...
What will information terminals look like in the 21st century? We believe they require three important features: mobility, intelligence, and diversity. With the Internet, a large number of our daily activities take place on terminals. Mobility increases their convenience; for instance, wherever we are, we bring global currency with electronic commerce, and whenever we want to go, there are public or business offices and government virtual offices. Intelligence should be another key issue in the new millennium. Predictions...
In this paper, we have presented a characterization of power and performance for GPU-accelerated systems. We selected four different NVIDIA GPUs from three generations of the GPU architecture in order to demonstrate the generality of our contribution. One of our findings is that the efficiency characteristics differ, such that the best configuration is not identical between GPUs. This evidence encourages future work on the management of GPU-accelerated systems to benefit from dynamic voltage and frequency scaling. In future work, we plan to develop a scaling algorithm