NFDI4DS | UHH-SEMS - Publication Details

Ryan R. Curtin

ORCID: 0000-0002-9903-8214

OpenAlex author ID: A5026833192

Research Areas

Data Management and Algorithms
Parallel Computing and Optimization Techniques
Algorithms and Data Compression
Machine Learning and Data Classification
Advanced Image and Video Retrieval Techniques
Machine Learning and Algorithms
Numerical Methods and Algorithms
Adversarial Robustness in Machine Learning
Face and Expression Recognition
Distributed and Parallel Computing Systems
Embedded Systems Design Techniques
Advanced Clustering Algorithms Research
Neural Networks and Applications
Sparse and Compressive Sensing Techniques
Complexity and Algorithms in Graphs
Data Mining Algorithms and Applications
Network Security and Intrusion Detection
Stochastic Gradient Optimization Techniques
Music and Audio Processing
Advanced Malware Detection Techniques
Anomaly Detection Techniques and Applications
Automated Road and Building Extraction
Advanced Multi-Objective Optimization Algorithms
Advanced Database Systems and Queries
Semantic Web and Ontologies

Booz Allen Hamilton (United States)
2024

NortonLifeLock (United States)
2016-2021

Freie Universität Berlin
2018

Czech Academy of Sciences, Institute of Computer Science
2018

Data61
2017

Commonwealth Scientific and Industrial Research Organisation
2017

Georgia Institute of Technology
2011-2015

Georgia Tech Research Institute
2014

Armadillo: a template-based C++ library for linear algebra

OPENALEX - Publications

Conrad Sanderson, Ryan R. Curtin

The C++ language is often used for implementing functionality that is performance and/or resource sensitive. While the standard C++ library provides many useful algorithms (such as sorting), in its current form it does not provide direct handling of linear algebra (matrix maths). Armadillo is an open source linear algebra library for the C++ language, aiming towards a good balance between speed and ease of use. Its high-level Application Programming Interface (API) is deliberately similar to the widely used Matlab and Octave languages...

10.21105/joss.00026 article EN cc-by The Journal of Open Source Software 2016-06-10

Detecting Adversarial Samples from Artifacts

Reuben Feinman, Ryan R. Curtin, Saurabh Shintre, Andrew B. Gardner

Deep neural networks (DNNs) are powerful nonlinear architectures that are known to be robust to random perturbations of the input. However, these models are vulnerable to adversarial perturbations--small input changes crafted explicitly to fool the model. In this paper, we ask whether a DNN can distinguish adversarial samples from their normal and noisy counterparts. We investigate model confidence on adversarial samples by looking at Bayesian uncertainty estimates, available in dropout neural networks, and by performing density estimation in the subspace of deep...

10.48550/arxiv.1703.00410 preprint EN other-oa arXiv (Cornell University) 2017-01-01

mlpack 3: a fast, flexible machine learning library

Ryan R. Curtin, Marcus Edel, Mikhail Lozhnikov, Yannis Mentekidis, Sumedh Ghaisas, and 1 more

10.21105/joss.00726 article EN cc-by The Journal of Open Source Software 2018-06-18

Detecting DGA domains with recurrent neural networks and side information

Ryan R. Curtin, Andrew B. Gardner, Sławomir Grzonkowski, Alexey Kleymenov, Alejandro Mosquera

Modern malware typically makes use of a domain generation algorithm (DGA) to avoid command and control domains or IPs being seized or sinkholed. This means that an infected system may attempt to access many domains in an attempt to contact the command and control server. Therefore, the automatic detection of DGA domains is an important task, both for the sake of blocking malicious domains and identifying compromised hosts. However, many DGAs use English wordlists to generate plausibly clean-looking domain names; this makes automatic detection difficult. In this work, we devise a notion of difficulty for DGA families called the smashword...

10.1145/3339252.3339258 article EN Proceedings of the 14th International Conference on Availability, Reliability and Security 2019-08-09

MLPACK: A Scalable C++ Machine Learning Library

Ryan R. Curtin, James R. Cline, N. P. Slagle, William B. March, Parikshit Ram, and 2 more

MLPACK is a state-of-the-art, scalable, multi-platform C++ machine learning library released in late 2011, offering both a simple, consistent API accessible to novice users and high performance and flexibility to expert users by leveraging modern features of C++. MLPACK provides cutting-edge algorithms whose benchmarks exhibit far better performance than other leading machine learning libraries. MLPACK version 1.0.3, licensed under the LGPL, is available at http://www.mlpack.org.

10.48550/arxiv.1210.6293 preprint EN other-oa arXiv (Cornell University) 2012-01-01

mlpack 4: a fast, header-only C++ machine learning library

Ryan R. Curtin, Marcus Edel, Omar Shrit, Shubham Agrawal, Suryoday Basak, and 14 more

For over 15 years, the mlpack machine learning library has served as a "swiss army knife" for C++-based machine learning (Curtin et al., 2013). Its efficient implementations of common and cutting-edge machine learning algorithms have been used in a wide variety of scientific and industrial applications. This paper overviews mlpack 4, a significant upgrade over its predecessor (Curtin et al., 2018). The library has been significantly refactored and redesigned to facilitate an easier prototyping-to-deployment pipeline, including bindings to other languages (Python, Julia, R, Go, and the command...

10.21105/joss.05026 article EN cc-by The Journal of Open Source Software 2023-02-01

Armadillo: An Efficient Framework for Numerical Linear Algebra

Conrad Sanderson, Ryan R. Curtin

A major challenge in the deployment of scientific software solutions is the adaptation of research prototypes to production-grade code. While high-level languages like MATLAB are useful for rapid prototyping, they lack the resource efficiency required for scalable production applications, necessitating translation into lower level languages like C++. Further, for machine learning and signal processing applications, the underlying linear algebra primitives, generally provided by the standard BLAS and LAPACK libraries, are unwieldy and difficult to use, requiring...

10.48550/arxiv.2502.03000 preprint EN arXiv (Cornell University) 2025-02-05

Fast Exact Max-kernel Search

Ryan R. Curtin, Parikshit Ram, Alexander Gray

The wide applicability of kernels makes the problem of max-kernel search ubiquitous and more general than the usual similarity search in metric spaces. We focus on solving this problem efficiently. We begin by characterizing the inherent hardness of the max-kernel search problem with a novel notion of directional concentration. Following that, we present a method to use an O(n log n) algorithm to index any set of objects (points in R^D or abstract objects) directly in the Hilbert space without any explicit feature representations of the objects in this space. We present the first provably O(log n) algorithm for exact max-kernel search using...

10.1137/1.9781611972832.1 article EN 2013-05-02

Armadillo: an Efficient Framework for Numerical Linear Algebra

Conrad Sanderson, Ryan R. Curtin

10.1109/iccae64891.2025.10980539 article EN 2025-03-20

Dual‐tree fast exact max‐kernel search

Ryan R. Curtin, Parikshit Ram

The problem of max‐kernel search arises everywhere: given a query point $p_q$, a set of reference objects $S_r$ and some kernel...

10.1002/sam.11218 article EN Statistical Analysis and Data Mining: The ASA Data Science Journal 2014-05-13
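
To make the max-kernel search problem named in the two abstracts above concrete, here is a minimal sketch of the naive linear-scan baseline: evaluate the kernel between the query and every reference object and keep the largest value. This is not the tree-based method of either paper; the function names, the two example kernels, and the sample points are all assumptions chosen for illustration.

```python
import math


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two equal-length points."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def polynomial_kernel(x, y, degree=2, offset=1.0):
    """Inhomogeneous polynomial kernel (<x, y> + offset) ** degree."""
    return (sum(a * b for a, b in zip(x, y)) + offset) ** degree


def max_kernel_search(query, references, kernel):
    """Linear scan: return (index, value) of the reference maximizing the kernel."""
    best_index, best_value = -1, -math.inf
    for i, ref in enumerate(references):
        value = kernel(query, ref)
        if value > best_value:
            best_index, best_value = i, value
    return best_index, best_value


# Hypothetical data: the best match depends on the kernel, not only on distance.
references = [(0.0, 0.0), (1.0, 1.0), (3.0, 1.0)]
query = (0.9, 1.2)
gauss_best = max_kernel_search(query, references, gaussian_kernel)
poly_best = max_kernel_search(query, references, polynomial_kernel)
```

With the Gaussian kernel the winner is the nearest point (index 1), while the polynomial kernel prefers the farther point with the larger inner product (index 2), which illustrates why max-kernel search is more general than metric nearest-neighbor search. The scan costs one kernel evaluation per reference object for every query.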

On Functional Aggregate Queries with Additive Inequalities

Mahmoud Abo Khamis, Ryan R. Curtin, Benjamin Moseley, Hung Q. Ngo, XuanLong Nguyen, and 2 more

Motivated by fundamental applications in databases and relational machine learning, we formulate and study the problem of answering functional aggregate queries (FAQ) in which some of the input factors are defined by a collection of additive inequalities between variables. We refer to these queries as FAQ-AI for short. To answer FAQ-AI in the Boolean semiring, we define relaxed tree decompositions and relaxed submodular and fractional hypertree width parameters. We show that an extension of the InsideOut algorithm using Chazelle's geometric data structure...

10.1145/3294052.3319694 article EN 2019-06-17

Optimizing the Optimal Weighted Average: Efficient Distributed Sparse Classification

Fred Lu, Ryan R. Curtin, Edward Raff, Francis Ferraro, James Holt

While distributed training is often viewed as a solution to optimizing linear models on increasingly large datasets, inter-machine communication costs of popular distributed approaches can dominate as data dimensionality increases. Recent work on non-interactive algorithms shows that approximate solutions for linear models can be obtained efficiently with only a single round of communication among machines. However, this approximation often degenerates as the number of machines increases. In this paper, building on the recent optimal weighted average method, we introduce a new...

10.48550/arxiv.2406.01753 preprint EN arXiv (Cornell University) 2024-06-03

Functional Aggregate Queries with Additive Inequalities

Mahmoud Abo Khamis, Ryan R. Curtin, Benjamin Moseley, Hung Q. Ngo, XuanLong Nguyen, and 2 more

10.1145/3426865 article EN ACM Transactions on Database Systems 2020-12-06

Classifying broiler chicken condition using audio data

Ryan R. Curtin, Wayne Daley, David V. Anderson

This paper is an effort to help prevent broiler chicken mortality caused by stressful conditions. We assume a relation between vocalizations and stress; therefore, microphones were used to monitor a flock of birds over the course of their lifetime (approximately 65 days). A noise removal method based on spectral oversubtraction was developed to filter out the significant fan and heater noise and was shown to be very effective. Then, a radar processing technique was employed to count the number of vocalizations. It was found that this was effective for...

10.1109/globalsip.2014.7032300 article EN 2014-12-01
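
The abstract above mentions noise removal based on spectral oversubtraction. As a rough illustration of the general idea only, the sketch below applies the textbook per-bin rule: subtract a scaled noise power estimate from the noisy power spectrum and clamp the result to a small spectral floor. The function name, the `alpha` and `beta` values, and the sample spectra are assumptions; the paper's actual filter design and parameters are not given in this listing.

```python
def oversubtract(noisy_power, noise_power, alpha=2.0, beta=0.01):
    """Generic spectral oversubtraction on per-bin power values.

    alpha > 1 subtracts more than the estimated noise (the "over" part);
    beta sets a floor, relative to the noise estimate, so that no bin
    becomes negative. Both defaults are illustrative assumptions.
    """
    if len(noisy_power) != len(noise_power):
        raise ValueError("spectra must have the same number of bins")
    cleaned = []
    for y, n in zip(noisy_power, noise_power):
        cleaned.append(max(y - alpha * n, beta * n))
    return cleaned


# Hypothetical power spectra: two bins carry signal, two are noise-dominated.
noisy = [10.0, 4.0, 1.0, 0.5]
noise = [1.0, 1.0, 1.0, 1.0]
cleaned = oversubtract(noisy, noise)
```

Bins dominated by noise fall to the floor value, while bins carrying signal keep most of their power. In a full pipeline this rule would be applied frame by frame to a short-time Fourier transform, with the noise estimate taken from vocalization-free frames.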

Practical Sparse Matrices in C++ with Hybrid Storage and Template-Based Expression Optimisation

Conrad Sanderson, Ryan R. Curtin

Despite the importance of sparse matrices in numerous fields of science, software implementations remain difficult to use for non-expert users, generally requiring the understanding of the underlying details of the chosen sparse matrix storage format. In addition, to achieve good performance, several formats may need to be used in one program, requiring explicit selection and conversion between the formats. This can be both tedious and error-prone, especially for non-expert users. Motivated by these issues, we present a user-friendly and open-source sparse matrix class for the C++ language...

10.3390/mca24030070 article EN cc-by Mathematical and Computational Applications 2019-07-19

High-Dimensional Distributed Sparse Classification with Scalable Communication-Efficient Global Updates

Fred Lu, Ryan R. Curtin, Edward Raff, Francis Ferraro, James Holt

As the size of datasets used in statistical learning continues to grow, distributed training of models has attracted increasing attention. These methods partition the data and exploit parallelism to reduce memory and runtime, but suffer increasingly from communication costs as the data size or the number of iterations grows. Recent work on linear models has shown that a surrogate likelihood can be optimized locally to iteratively improve on an initial solution in a communication-efficient manner. However, existing versions of these methods experience...

10.1145/3637528.3672038 article EN Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining 2024-08-24

Tree-Independent Dual-Tree Algorithms

Ryan R. Curtin, William B. March, Parikshit Ram, David V. Anderson, Alexander Gray, and 1 more

Dual-tree algorithms are a widely used class of branch-and-bound algorithms. Unfortunately, developing dual-tree algorithms for use with different trees and problems is often complex and burdensome. We introduce a four-part logical split: the tree, the traversal, the point-to-point base case, and the pruning rule. We provide a meta-algorithm which allows development of dual-tree algorithms in a tree-independent manner and easy extension to entirely new types of trees. Representations are provided for five common algorithms; for k-nearest neighbor search, this leads to a novel,...

10.48550/arxiv.1304.4327 preprint EN other-oa arXiv (Cornell University) 2013-01-01
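
The four-part split described in the abstract above (tree, traversal, base case, pruning rule) can be sketched for 1-nearest-neighbor search as follows. This is a simplified illustration and not the paper's meta-algorithm or mlpack's implementation: the bounding-box tree, the depth-first traversal, and every name here (`Node`, `min_box_dist`, `dual_tree_nn`, `base_case`, `score`) are assumptions, and the pruning bound is recomputed naively on each call rather than cached.

```python
import math


class Node:
    """Part 1, the tree: a binary space tree whose nodes store bounding boxes."""

    def __init__(self, points, indices, leaf_size=2):
        self.indices = indices
        dims = range(len(points[0]))
        self.lo = [min(points[i][d] for i in indices) for d in dims]
        self.hi = [max(points[i][d] for i in indices) for d in dims]
        self.children = []
        if len(indices) > leaf_size:
            # Split at the median of the widest dimension.
            d = max(dims, key=lambda k: self.hi[k] - self.lo[k])
            order = sorted(indices, key=lambda i: points[i][d])
            mid = len(order) // 2
            self.children = [Node(points, order[:mid], leaf_size),
                             Node(points, order[mid:], leaf_size)]


def min_box_dist(a, b):
    """Lower bound on the distance between any point in box a and any in box b."""
    total = 0.0
    for alo, ahi, blo, bhi in zip(a.lo, a.hi, b.lo, b.hi):
        gap = max(blo - ahi, alo - bhi, 0.0)
        total += gap * gap
    return math.sqrt(total)


def dual_tree_nn(queries, references, leaf_size=2):
    best_dist = [math.inf] * len(queries)
    best_idx = [-1] * len(queries)

    def base_case(qi, ri):
        # Part 3, the point-to-point base case: update one query's candidate.
        d = math.dist(queries[qi], references[ri])
        if d < best_dist[qi]:
            best_dist[qi], best_idx[qi] = d, ri

    def score(qn, rn):
        # Part 4, the pruning rule: skip the pair when no reference point in rn
        # can beat the current worst candidate among the queries in qn.
        bound = max(best_dist[qi] for qi in qn.indices)
        return min_box_dist(qn, rn) > bound

    def traverse(qn, rn):
        # Part 2, the traversal: depth-first over pairs of nodes.
        if score(qn, rn):
            return
        if not qn.children and not rn.children:
            for qi in qn.indices:
                for ri in rn.indices:
                    base_case(qi, ri)
            return
        for qc in (qn.children or [qn]):
            for rc in (rn.children or [rn]):
                traverse(qc, rc)

    traverse(Node(queries, list(range(len(queries))), leaf_size),
             Node(references, list(range(len(references))), leaf_size))
    return best_idx, best_dist


# Hypothetical data: each query sits next to the reference with the same index.
queries = [(0.1, 0.2), (5.0, 5.1), (9.7, 0.3), (2.5, 7.5), (6.1, 2.2)]
references = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0), (2.0, 8.0),
              (6.0, 2.0), (4.0, 4.0), (9.0, 9.0)]
idx, dist = dual_tree_nn(queries, references)
```

Because `base_case` and `score` only touch points and node bounds, the `Node` class or the `traverse` function could be swapped for another tree type or traversal order without changing the other parts, which is the separation the abstract describes.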

An open source C++ implementation of multi-threaded Gaussian mixture models, k-means and expectation maximisation

Conrad Sanderson, Ryan R. Curtin

Modelling of multivariate densities is a core component in many signal processing, pattern recognition and machine learning applications. The modelling is often done via Gaussian mixture models (GMMs), which use computationally expensive and potentially unstable training algorithms. We provide an overview of a fast and robust implementation of GMMs in the C++ language, employing multi-threaded versions of the Expectation Maximisation (EM) and k-means training algorithms. Multi-threading is achieved through reformulation of the EM and k-means algorithms into...

10.1109/icspcs.2017.8270510 preprint EN 2017-12-01