Christian Sohler

ORCID: 0000-0001-8990-3326
Publications
Citations
Views
---
Saved
---
About
Contact & Profiles
Research Areas
  • Complexity and Algorithms in Graphs
  • Data Management and Algorithms
  • Computational Geometry and Mesh Generation
  • Machine Learning and Algorithms
  • Advanced Graph Theory Research
  • Advanced Clustering Algorithms Research
  • Optimization and Search Problems
  • Algorithms and Data Compression
  • Stochastic Gradient Optimization Techniques
  • Facility Location and Emergency Management
  • Sparse and Compressive Sensing Techniques
  • Privacy-Preserving Technologies in Data
  • Complex Network Analysis Techniques
  • Face and Expression Recognition
  • Markov Chains and Monte Carlo Methods
  • Automated Road and Building Extraction
  • Limits and Structures in Graph Theory
  • Data Stream Mining Techniques
  • Time Series Analysis and Forecasting
  • Statistical Methods and Inference
  • Cryptography and Data Security
  • Caching and Content Delivery
  • Advanced Optimization Algorithms Research
  • Optimization and Packing Problems
  • Data Mining Algorithms and Applications

University of Cologne
2019-2024

TU Dortmund University
2011-2020

University of Warwick
2019

Paderborn University
2001-2017

University of Bonn
2008

Heinz Nixdorf Stiftung
2007-2008

@d can be approximated up to (1 + e)-factor, for an arbitrary small e > 0, using the O(k/e2)-rank approximation of A and a constant. This implies, example, that optimal k-means clustering rows is e)-approximated by their projection on O(k/e2) first right singular vectors (principle components) A.A (j, k)-coreset projective set points yields e)-approximation sum squared distances from n any k affine subspaces, each dimension at most j. Our embedding (0, k)-coresets size O(k) handling queries,...

10.5555/2627817.2627920 article EN 2013-01-06

We develop a new <it>k</it>-means clustering algorithm for data streams of points from Euclidean space. call this StreamKM++. Our computes small weighted sample the stream and solves problem on using <it>k</it>-means++ Arthur Vassilvitskii (SODA '07). To compute sample, we propose two techniques. First, use an adaptive, nonuniform sampling approach similar to seeding procedure obtain coresets stream. This construction is rather easy implement and, unlike other coreset...

10.1145/2133803.2184450 article EN ACM Journal of Experimental Algorithmics 2012-05-22

We present two space bounded random sampling algorithms that compute an approximation of the number triangles in undirected graph given as a stream edges. Our first algorithm does not make any assumptions on order edges stream. It uses is inversely related to ratio between and triples with at least one edge induced subgraph, constant expected update time per edge. second designed for incidence streams (all incident same vertex appear consecutively). length 2 paths O(log|V|⋅(1+s⋅|V|/|E|)),...

10.1145/1142351.1142388 article EN 2006-06-26

Given a point set P ⊆ Rd the k-means clustering problem is to find C=(c1,...,ck) of k points and partition into clusters C1,...,Ck such that sum squared errors ∑i=1k ∑p ∈ Ci |p -ci |22 minimized. For given centers this cost function minimized byassigning nearest center.The probably most widely used in area clustering.In paper we show every unweighted has weak (ε, k)-coreset size Poly(k,1/ε) for problem, i.e. its independent cardinality |P| dimension d Euclidean space Rd. A coreset weighted S...

10.1145/1247069.1247072 article EN 2007-01-01

We prove that the sum of squared Euclidean distances from n rows an × d matrix A to any compact set is spanned by k vectors in ℝd can be approximated up (1+ε)-factor, for arbitrary small ε > 0, using O(k/ε2)-rank approximation and a constant. This implies, example, optimal k-means clustering (1 + ε)-approximated their projection on O(k/ε2) first right singular (principle components) A.A (j, k)-coreset projective points yields ε)-approximation affine subspaces, each dimension at most j. Our...

10.1137/1.9781611973105.103 preprint EN 2013-01-06

We develop and analyze a method to reduce the size of very large set data points in high-dimensional Euclidean space $\mathbb{R}^d$ small weighted such that result predetermined analysis task on reduced is approximately same as for original point set. For example, computing first $k$ principal components will return or centers $k$-means clustering an approximation Such also known coreset. The main new feature our construction cardinality independent dimension $d$ input sets are mergeable [P....

10.1137/18m1209854 article EN SIAM Journal on Computing 2020-01-01

A dynamic geometric data stream consists of a sequence m insert/delete operations points from the discrete space 1,…,Δd [26]. We develop streaming (1 + e)-approximation algorithms for k-median, k-means, MaxCut, maximum weighted matching (MaxWM), travelling salesperson (MaxTSP), spanning tree (MaxST), and average distance over streams. Our maintain small set points(a coreset) that approximates with probability 2/3 current point respect to considered problem during stream. They use poly (e-1,...

10.1145/1060590.1060622 article EN 2005-05-22

We study a generalization of the k -median problem with respect to an arbitrary dissimilarity measure D. Given finite set P size n , our goal is find C such that sum errors D( P,C ) = ∑ p ∈ min c {D( p,c )} minimized. The main result in this article can be stated as follows: There exists (1+ϵ)-approximation algorithm for D, if 1-median approximated within factor (1+ϵ) by taking random sample constant and solving on exactly. This requires time 2 O ( mk log( /ϵ)), where m depends only ϵ Using...

10.1145/1824777.1824779 article EN ACM Transactions on Algorithms 2010-08-01

In this paper we develop an efficient implementation for a k-means clustering algorithm. Our algorithm is variant of KMHybrid [28, 20], i.e. it uses combination Lloyd-steps and random swaps, but as novel feature coresets to speed up the A coreset small weighted set points that approximates original point with respect considered problem. The main strength can quickly determine clusterings same many values k. This necessary in applications, since, typically, one does not know good value k...

10.1145/1137856.1137879 article EN 2006-06-05

We consider the problem of approximating a set P η points in Rd by j-dimensional subspace under lp measure, which we wish to minimize sum distances from each point this subspace. More generally, Fq (lp)-subspace approximation asks for j-subspace that minimizes qth powers lp-distances subspace, up multiplicative factor (1 + e).We develop techniques approximation, regression, and matrix can be used deal with massive data sets high dimensional spaces. In particular, coresets sketches, i.e....

10.5555/1873601.1873654 article EN Symposium on Discrete Algorithms 2010-01-17

We study graph properties that are testable for bounded-degree graphs in time independent of the input size. Our goal is to distinguish between having a predetermined property and far from every property. It well known model (where two considered “far” if they differ $\varepsilon n$ edges positive constant $\varepsilon$), many cannot be tested even with or polylogarithmic number queries. Therefore this paper we focus our attention on testing special classes graphs. Specifically, show...

10.1137/070681831 article EN SIAM Journal on Computing 2009-01-01

We show there is a distribution over linear mappings R:l1n -> l1O(d log d), such that with arbitrarily large constant probability, for any fixed d-dimensional subspace L, all x ∈ L we have |x|1 ≤ |Rx|1 = O(d d)|x|1. This provides the first analogue of ubiquitous Johnson-Lindenstrauss embedding l1-norm. Importantly, target dimension and distortion are independent ambient n. give several applications this result. First, faster algorithm computing well-conditioned bases. Our simple, avoiding...

10.1145/1993636.1993736 article EN 2011-06-06

A $k$-disc around a vertex $v$ of graph $G=(V,E)$ is the subgraph induced by all vertices distance at most $k$ from $v$. We show that structure planar on $n$ vertices, and with constant maximum degree $d$, determined, up to modification (insertion or deletion) $\epsilon d n$ edges, frequency $k$-discs for certain $k=k(\epsilon,d)$ independent size graph. can replace graphs any hyperfinite class graphs, which includes, example, every does not contain set forbidden minors. pure combinatorial...

10.1137/120890946 article EN SIAM Journal on Computing 2013-01-01

We obtain the first strong coresets for k-median and subspace approximation problems with sum of distances objective function, on n points in d dimensions, a number weighted that is independent both d; namely, our have size poly(k/ε). A coreset (1+ε)-approximates cost function all possible sets centers simultaneously. also give efficient nnz(A) + (n+d) poly(k/ε) exp(poly(k/ε)) time algorithms computing these coresets. result by introducing new dimensionality reduction technique significantly...

10.1109/focs.2018.00081 article EN 2018-10-01

We analyse a randomized pursuit-evasion game played by two players on graph, hunter and rabbit. Let is trivial lower bound the escape length in both models. Furthermore, we prove that our upper optimal up to constant factors against unrestricted rabbits.

10.1017/s0963548303005625 article EN Combinatorics Probability Computing 2003-05-01

A dynamic geometric data stream is a sequence of m Add/Remove operations points from discrete space (1,...,Δ)d [21]. Add(p) inserts point p into the current set, Remove(p) deletes P. We develop low-storage structures to (i) maintain ε-approximations range spaces P with constant VC-dimension and (ii) an ε-approximation weight Euclidean minimum spanning tree Our use O(log3ε • log3(1/ε) log(1/ε)/ε2) O(log (1/δ) (log Δ/ε)O(d)) bits memory, respectively (we assume that dimension d constant), they...

10.1145/1064092.1064116 article EN 2005-06-06

ABSTRACT We present sublinear‐time (randomized) algorithms for finding simple cycles of length at least and tree‐minors in bounded‐degree graphs. The complexity these is related to the distance graph from being C k ‐minor free (resp., having corresponding tree‐minor). In particular, if ‐far cycle‐free (i.e., a constant fraction edges must be deleted make cycle‐free), then algorithm finds cycle polylogarithmic time , where N denotes number vertices. This optimal up factors. foregoing results...

10.1002/rsa.20462 article EN Random Structures and Algorithms 2012-08-22

We study fair clustering problems as proposed by Chierichetti et al. (NIPS 2017). Here, points have a sensitive attribute and all clusters in the solution are required to be balanced with respect it (to counteract any form of data-inherent bias). Previous algorithms for do not scale well. show how model compute so-called coresets problems, which can used significantly reduce input data size. prove that composable them streaming setting. Furthermore, we propose variant Lloyd's algorithm...

10.48550/arxiv.1812.10854 preprint EN other-oa arXiv (Cornell University) 2018-01-01

The spectrum of a network or graph $G=(V,E)$ with adjacency matrix A , consists the eigenvalues normalized Laplacian $L= I - D^-1/2 $. This set encapsulates many aspects structure graph, including extent to which posses community structures at multiple scales. We study problem approximating spectrum, $łambda = (łambda_1,\dots,łambda_|V| )$, G in regime where is too large explicitly calculate spectrum. present sublinear time algorithm that, given ability query random node and select neighbor...

10.1145/3219819.3220119 preprint EN 2018-07-19

We consider the problem of computing weight a Euclidean minimum spanning tree for set n points in $\mathbb R^d$. focus on setting where input point is supported by certain basic (and commonly used) geometric data structures that can provide efficient access to structured way. present an algorithm estimates with high probability within $1 + \eps$ using only $\widetilde{\O}(\sqrt{n} \, \text{poly} (1/\eps))$ queries constant d. The assumes minimal bounding cube enclosing it, orthogonal range...

10.1137/s0097539703435297 article EN SIAM Journal on Computing 2005-01-01
Coming Soon ...