Linhai Song

ORCID: 0000-0002-3185-9278
Research Areas
  • Software Testing and Debugging Techniques
  • Software System Performance and Reliability
  • Parallel Computing and Optimization Techniques
  • Software Engineering Research
  • Advanced Malware Detection Techniques
  • Web Data Mining and Analysis
  • Software Reliability and Analysis Research
  • Spam and Phishing Detection
  • Network Security and Intrusion Detection
  • FinTech, Crowdfunding, Digital Finance
  • Digital and Cyber Forensics
  • ICT Impact and Policies
  • Influenza Virus Research Studies
  • Text and Document Classification Technologies
  • Advanced Data Storage Technologies
  • Real-Time Systems Scheduling
  • Open Source Software Innovations
  • Advanced Database Systems and Queries
  • Digital Platforms and Economics
  • Internet Traffic Analysis and Secure E-voting
  • Algorithms and Data Compression
  • Security and Verification in Computing
  • Mobile Crowdsensing and Crowdsourcing
  • Global Energy and Sustainability Research
  • Model-Driven Software Engineering Techniques

Pennsylvania State University
2019-2024

University of Wisconsin–Madison
2011-2016

Institute of Computing Technology
2009-2010

Chinese Academy of Sciences
2009

Developers frequently use inefficient code sequences that could be fixed by simple patches. These inefficient code sequences can cause significant performance degradation and resource waste, referred to as performance bugs. Meager increases in single threaded performance in the multi-core era and increasing emphasis on energy efficiency call for more effort in tackling performance bugs.

10.1145/2254064.2254075 article EN 2012-06-11

Fixing software bugs has always been an important and time-consuming process in software development. Fixing concurrency bugs has become especially critical in the multicore era. However, fixing concurrency bugs is challenging, in part due to non-deterministic failures and tricky parallel reasoning. Beyond correctly fixing the original problem in the software, a good patch should also avoid introducing new bugs, degrading performance unnecessarily, or damaging software readability. Existing tools cannot automate the whole fixing process and provide good-quality patches. We present AFix, a tool...

10.1145/1993316.1993544 article EN ACM SIGPLAN Notices 2011-06-04

Fixing software bugs has always been an important and time-consuming process in software development. Fixing concurrency bugs has become especially critical in the multicore era. However, fixing concurrency bugs is challenging, in part due to non-deterministic failures and tricky parallel reasoning. Beyond correctly fixing the original problem in the software, a good patch should also avoid introducing new bugs, degrading performance unnecessarily, or damaging software readability. Existing tools cannot automate the whole fixing process and provide good-quality patches.

10.1145/1993498.1993544 article EN 2011-06-04

Performance bugs are programming errors that create significant performance degradation. While developers often use automated oracles for detecting functional bugs, detecting performance bugs usually requires time-consuming, manual analysis of execution profiles. The human effort limits the number of tests analyzed and enables performance bugs to easily escape to production. Unfortunately, while profilers can successfully localize slow executing code, profilers cannot be effectively used as oracles. This paper presents Toddler, a novel oracle which...

10.5555/2486788.2486862 article EN International Conference on Software Engineering 2013-05-18

Online scan engines such as VirusTotal are heavily used by researchers to label malicious URLs and files. Unfortunately, it is not well understood how the labels are generated and how reliable the scanning results are. In this paper, we focus on VirusTotal and its 68 third-party vendors to examine their labeling process on phishing URLs. We perform a series of measurements by setting up our own phishing websites (mimicking PayPal and IRS) and submitting the URLs for scanning. By analyzing the incoming network traffic and the dynamic label changes at VirusTotal, we reveal new...

10.1145/3355369.3355585 article EN 2019-10-18

Developers frequently use inefficient code sequences that could be fixed by simple patches. These inefficient code sequences can cause significant performance degradation and resource waste, referred to as performance bugs. Meager increases in single threaded performance in the multi-core era and increasing emphasis on energy efficiency call for more effort in tackling performance bugs. This paper conducts a comprehensive study of 110 real-world performance bugs that are randomly sampled from five representative software suites (Apache, Chrome, GCC, Mozilla, and MySQL). The findings of this...

10.1145/2345156.2254075 article EN ACM SIGPLAN Notices 2012-06-11

Design and implementation defects that lead to inefficient computation widely exist in software. These defects are difficult to avoid and discover. They lead to severe performance degradation and energy waste during production runs, and are becoming increasingly critical with the meager increase of single-core hardware performance and the increasing concerns about energy constraints. Effective tools that diagnose performance problems and point out the inefficiency root cause are sorely needed.

10.1145/2660193.2660234 article EN 2014-10-15

Rust is a young programming language designed for systems software development. It aims to provide safety guarantees like high-level languages and performance efficiency like low-level languages. The core design of Rust is a set of strict safety rules enforced by compile-time checking. To support more low-level controls, Rust allows programmers to bypass these compiler checks to write unsafe code.

10.1145/3385412.3386036 article EN 2020-06-07

Go is a statically-typed programming language that aims to provide a simple, efficient, and safe way to build multi-threaded software. Since its creation in 2009, Go has matured and gained significant adoption in production and open-source software. Go advocates for the usage of message passing as the means of inter-thread communication and provides several new concurrency mechanisms and libraries to ease multi-threading programming. It is important to understand the implication of these new proposals and the comparison of message passing and shared memory synchronization in terms of program...

10.1145/3297858.3304069 article EN 2019-04-04

Performance bugs are programming errors that create significant performance degradation. While developers often use automated oracles for detecting functional bugs, detecting performance bugs usually requires time-consuming, manual analysis of execution profiles. The human effort limits the number of tests analyzed and enables performance bugs to easily escape to production. Unfortunately, while profilers can successfully localize slow executing code, profilers cannot be effectively used as oracles. This paper presents Toddler, a novel oracle which...

10.1109/icse.2013.6606602 article EN 2013 35th International Conference on Software Engineering (ICSE) 2013-05-01

Writing efficient software is difficult. Design and implementation defects can cause severe performance degradation. Unfortunately, existing performance diagnosis techniques like profilers are still preliminary. They can locate code regions that consume resources, but not the ones that waste resources. In this paper, we first design a root-cause and fix-strategy taxonomy for inefficient loops, one of the most common performance problems in the field. We then design a static-dynamic hybrid analysis tool, LDoctor, to provide accurate performance diagnosis for loops....

10.1109/icse.2017.41 article EN 2017-05-01

To govern smart contracts running on Ethereum, multiple Ethereum Request for Comment (ERC) standards have been developed, each having a set of rules to guide the behaviors of smart contracts. Violating the ERC rules could cause serious security issues and financial loss, signifying the importance of verifying smart contracts follow ERCs. Today's practices of such verification are to manually audit each single contract, use expert-developed program-analysis tools, or use large language models (LLMs), all of which are far from effective in identifying ERC rule...

10.48550/arxiv.2502.07644 preprint EN arXiv (Cornell University) 2025-02-11

Different techniques have been recommended to detect fraudulent responses in online surveys, but little research has been taken to systematically test the extent to which they actually work in practice. In this paper, we conduct an empirical evaluation of 22 anti-fraud tests in two complementary online surveys. The first survey recruits Rust programmers on public online forums and social media networks. We find that fraudulent respondents involve both bot and human characteristics. Among different anti-fraud tests, those designed based on domain...

10.1145/3485447.3512230 article EN Proceedings of the ACM Web Conference 2022 2022-04-25

Multi-threaded programs are pervasive, yet difficult to write. Missing proper synchronization leads to correctness bugs and over synchronization leads to performance problems. To improve the correctness and efficiency of multi-threaded software, we need a better understanding of the synchronization challenges faced by real-world developers. This paper studies the code repositories of open-source multi-threaded software projects to obtain a broad and in-depth view of how developers handle synchronizations. We first examine how critical sections are changed when software evolves by checking 250,000 revisions...

10.1145/2786805.2786815 article EN 2015-08-26

Go is a statically typed programming language designed for efficient and reliable concurrent programming. For this purpose, Go provides lightweight goroutines and recommends passing messages using channels as a less error-prone means of thread communication. Go has become increasingly popular in recent years and has been adopted to build many important infrastructure software systems. However, a recent empirical study shows that concurrency bugs, especially those due to misuse of channels, exist widely in Go. These bugs severely...

10.1145/3445814.3446756 article EN 2021-04-11

Rust is a young systems programming language designed to provide both the safety guarantees of high-level languages and the execution performance of low-level languages. To achieve this design goal, Rust provides a suite of safety rules and checks against those rules at compile time to eliminate many memory-safety and thread-safety issues. Due to its safety and performance, Rust's popularity has increased significantly in recent years, and it has already been adopted to build many safety-critical software systems.

10.1145/3510003.3510164 article EN Proceedings of the 44th International Conference on Software Engineering 2022-05-21

Rust is a relatively new programming language designed for systems software development. Its objective is to combine the safety guarantees typically associated with high-level languages with the performance efficiency often found in executable programs implemented in low-level languages. The core design of Rust is a set of strict safety rules enforced through compile-time checks. However, to support more low-level controls, Rust also allows programmers to bypass its compiler checks by writing...

10.1109/tse.2024.3380393 article EN IEEE Transactions on Software Engineering 2024-03-25

This paper provides a simple but effective approach, named ECON, to fully-automatically extract content from Web news pages. ECON uses a DOM tree to represent the Web news page and leverages the substantial features of the DOM tree. ECON first finds a snippet-node by which a part of the content of the news is wrapped, then backtracks from the snippet-node until a summary-node is found; the entire content of the news is wrapped by the summary-node. During the process of backtracking, ECON removes noise. Experimental results showed that ECON can achieve high accuracy and fully satisfy the requirements for scalable extraction. Moreover, ECON can be...

10.1109/apweb.2010.11 article EN 2010-04-01

Go is a young programming language invented to build safe and efficient concurrent programs. It provides goroutines as lightweight threads and channels for inter-goroutine communication. Programmers are encouraged to explicitly pass messages through channels to connect goroutines, with the purpose of reducing the chance of making programming mistakes and introducing concurrency bugs. Go is one of the most beloved languages and has already been used to build many critical infrastructure software systems in the data-center environment. However, a recent study shows...

10.1145/3503222.3507753 article EN 2022-02-22

Design and implementation defects that lead to inefficient computation widely exist in software. These defects are difficult to avoid and discover. They lead to severe performance degradation and energy waste during production runs, and are becoming increasingly critical with the meager increase of single-core hardware performance and the increasing concerns about energy constraints. Effective tools that diagnose performance problems and point out the inefficiency root cause are sorely needed. The state of the art of performance diagnosis is preliminary. Profiling can identify the functions that consume the most...

10.1145/2714064.2660234 article EN ACM SIGPLAN Notices 2014-10-15

VirusTotal is the largest online anti-malware scanning service. It is widely used by security researchers for labeling malware data or serving as a comparison baseline. However, several important challenges of using VirusTotal are left unaddressed (e.g., whether the labels are already stable, when the labels can be trusted), severely harming the correctness of research projects depending on VirusTotal.

10.1145/3372297.3420013 article EN Proceedings of the 2020 ACM SIGSAC Conference on Computer and Communications Security 2020-10-30

The execution of smart contracts on Ethereum, a public blockchain system, incurs a fee called gas fee for its computation and data-store consumption. When programmers develop smart contracts (e.g., in the Solidity programming language), they could unknowingly write code snippets that unnecessarily cause more gas fees. These issues, or what we call gas wastes, could lead to significant monetary waste for users. Yet, there has been no systematic examination of them or effective tools for detecting them. This paper takes the initiative in helping...

10.48550/arxiv.2403.02661 preprint EN arXiv (Cornell University) 2024-03-05

As a relatively new programming language, Rust is designed to provide both memory safety and runtime performance. To achieve this goal, Rust conducts rigorous static checks against its safety rules during compilation, effectively eliminating memory safety issues that plague C/C++ programs. Although useful, the safety rules pose programming challenges to Rust programmers, since programmers can easily violate safety rules when coding in Rust, leading their code to be rejected by the Rust compiler, a fact underscored by a recent user study. There exists a desire to automate the process of...

10.1145/3597503.3639103 article EN 2024-04-12

Fixing software bugs has always been an important and time-consuming process in software development. Fixing concurrency bugs has become especially critical in the multicore era. However, fixing concurrency bugs is challenging, in part due to non-deterministic failures and tricky parallel reasoning. Beyond correctly fixing the original problem in the software, a good patch should also avoid introducing new bugs, degrading performance unnecessarily, or damaging software readability. Existing tools cannot automate the whole fixing process and provide good-quality patches.

10.1145/2345156.1993544 article EN ACM SIGPLAN Notices 2012-08-06