Academic Work

2025; 2024; 2023; 2022; 2021; 2020; 2019; 2018; 2017; 2016; 2015; 2014; 2013; 2012; 2011; 2010; 2009; 2008; 2007; 2006; 2005; 2004; 2003; 2002; 2001; 2000; 1999 and before.

Publications:
2025:

(2025 Conf) EMO: Epigraph Based Multilevel Optimization for Enhancing Chain of Thought Reasoning Capabilities Songtao Lu, Yanna Ding, Lior Horesh, Jianxi Gao, M. Magdon-Ismail, ICASSP 2025

2024:

(2024 Invited Talk) Malik Magdon-Ismail, "AI in the Modern Era: Handle With Care", Brief tour of modern AI, especially generative AI (LLM) at the New York City Council Chambers. (pdf)
(2024 Jrnl) Yanna Ding, Jianxi Gao, Malik Magdon-Ismail, "Efficient parameter inference in networked dynamical systems via steady states: A surrogate objective function approach integrating mean-field and nonlinear least squares", Physical Review E, 2024. (print) (pdf).
Summary: We give an efficient method to reverse engineer parameters in complex networked dynamics based only on observations of the steady state. The method uses a nonlinear least squares approach by minimizing a surrogate objective obtained by efficiently estimating the steady states using a mean-field approach.
(2024 Conf) Eureka: A General Framework for Black-box Differential Privacy Estimators, Yun Lu, M. Magdon-Ismail, Yu Wei, Vassilis Zikas, C&P 2024.
(2024 Conf) Natural Language Processing for Extracting Rich Disease Data Aligned To Satellite Meteorological Data, Pasarkar, Mahi and Kim, Junseob and O’Gara, Eoin and Zhang, Alan and Magdon-Ismail, Malik and Munasinghe, Thilanka and Weng, Jiaqi and Qiu, David and Cruz, Ethan and Wei, Jennifer C. and Pathirana, Ashan, IEEE BigData 2024.
(2024 Conf) Graph Representation Learning for Dengue Forecasting, Weng, Jiaqi and Qiu, David and Cruz, Ethan and Magdon-Ismail, Malik and Munasinghe, Thilanka and Wei, Jennifer. C. and Pathirana, Ashan and Pasarkar, Mahi, IEEE BigData 2024.

2023:

(2023 Conf) Learning Network Dynamics from Noisy Steady States, Yanna Ding, Jianxi Gao, Malik Magdon-Ismail, ASONAM 2023.

2022:

(2022 Last Lecture, Invited Talk) Malik Magdon-Ismail, "Data Data Everywhere, Nor Any Drop of Wisdom", I was honored to touch our students one last time before they went off to the big stage. Check out my last lecture to them. (video)
(2022 Conf) FairMM: A Fast and Frontrunning-Resistant Crypto Market-Maker Ciampi, M., Ishaq, M., Magdon-Ismail, M., Ostrovsky, R., Zikas, V., CSCML 2022.
(2022 Conf) Subpopulation Analysis in Causal Inference: A Healthcare Case Study, G. Mavroudeas, N. Neehal, J. K. Bennett, M. Magdon-Ismail, BIBM 2022.
(2022 Conf) HMM-Boost: Improved Time Series State Prediction Via Supervised Hidden Markov Models: Case Studies in Epileptic Seizure and Complex Care Management, G. Mavroudeas, M. Magdon-Ismail, K. Bennett, X. Shou, ICKG 2022.

2021:

(2021 Wkgpap) An Algorithm for Reconstructing the Orphan Stream Progenitor with MilkyWay@ home Volunteer Computing, arXiv preprint arXiv:2102.07257.
(2021 Wkgpap) FairMM: A fast and frontrunning-resistant crypto market-maker, Michele Ciampi, Muhammad Ishaq, Malik Magdon-Ismail, Rafail Ostrovsky, Vassilis Zikas, Cryptology ePrint Archive.
(2021 Conf) Predictive Modeling for Complex Care Management, G Mavroudeas, N Neehal, X Shou, M Magdon-Ismail, JN Kuruzovich, 2021 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)
(2021 Conf) Learning GraphQL query cost, G Mavroudeas, G Baudart, A Cha, M Hirzel, JA Laredo, M Magdon-Ismail. 2021 36th IEEE/ACM International Conference on Automated Software Engineering (ASE)

2020:

(2020 Keynote-Talk) Malik Magdon-Ismail, "AI and ML for Predicting COVID-19", AAAI Symposium on AI for Social Good, 2020. (pdf)
(2020 Demo) Malik Magdon-Ismail, "COVID-19 War-Room: Situation Analysis"

(2020 WkgPap) Malik Magdon-Ismail, "Machine Learning the Phenomenology of COVID-19 From Early Infection Dynamics" (arXiv).

Summary: We use an data-driven machine learning with an aggregated model to learn properties of COVID-19 pandemic from its early dynamics up to March 14 2020 (54 days of coarse data). The predicted number of new infections from March 14 are shown below. Starting on or before March 12, the USA instituted aggressive social distancing protocols, so we should observe new infections being significantly lower than model predictions starting around March 22, if the predicted lag of 10 is correct. Blue indicates our prediction for when social distancing kicks in. There was a mild drop in infection growth. Lock down occurred on march 21, and those effects will begin to show on March 31.

Live Test Prediction of Total Confirmed Infections

Date:	03/15	03/16	03/17	03/18	03/19	03/20	03/21	03/22	03/23	03/24	03/25	03/26	03/27	03/28	03/29	03/30	03/31	04/1	04/2	04/3
Prediction:	2845	3712	4842	6316	8241	10750	14025	18297	23870	31142	40628	53004	69151	90216	117000	154000	200000	261000	341000	445000
Range (x1000):	[2.8, 5.0]	[3.7, 8.8]	[4.8, 13.7]	[6.3, 20.0]	[8.1, 28.3]	[10.6, 39.1]	[13.8, 53.2]	[17.9, 71.6]	[23.4, 95.5]	[30.4, 127]	[39.7, 168]	[51.6, 221]	[67.2, 290]	[87.6, 381]	[114, 499]	[148, 653]	[193, 854]	[252, 1117]	[328, 1459]	[427, 1906]
Observed:	2951	3774	4651	6417	9405	14240	19614	26737	35196	46432	55221	69184	85981	104680	124660	143020	164610	189610	216710	245530

How do we know the predictions are real predictions and do not include forward looking, data snooping or overfitting? Because the predictions were time-stamped in version 1 of the archive article, and the test data only came later. I thought of this as a pretty cute use of arXiv as a time-stamping mechanism to convince a third party who does not trust you that your predictions are bona-fide. In general, machine learning is in need of a trusted 3rd party validation setup to ensure that predictions are non-forward looking, especially in high-impact time-series prediction. Perhaps cryptography's multi-party computation or a transparency facilitated by blockchain can do the trick.

(2020 Conf) Brissette, C., Jianxi Gao, Malik Magdon-Ismail, Slota, G. "Parallel computation of fixed points on networks of nonlinear ODE", SIAM Workshop on Network Science, NS20 2020. ( pdf )
(2020 Conf) Chunheng Jiang, Jianxi Gao, Malik Magdon-Ismail, "Inferring Degrees from Incomplete Networks and Nonlinear Dynamics", IJCAI, 2020. (full oral) ( pdf )
(2020 Conf) Dong Hu, Alex Gittens, Malik Magdon-Ismail, "NoisyCUR: An Algorithm for Two-Cost Budgeted Matrix Completion", ECML, 2020. ( pdf )
(2020 Conf) Chunheng Jiang, Jianxi Gao, Malik Magdon-Ismail, "True Nonlinear Dynamics from Incomplete Networks", AAAI, 2020. (full oral) ( pdf )

2019:

(2019 WkgPap) Malik Magdon-Ismail, Alex Gittens, "Fast Fixed Dimension L2-Subspace Embeddings of Arbitrary Accuracy, With Application to L1 and L2 Tasks", (arXiv).
Summary: We give a fast oblivious fixed dimension L2 embedding which is nonlinear, decoupling the accuracy of the embedding from the dimension. This allows downstream machine learning applications to benefit from both a low dimension and high accuracy (in prior embeddings higher accuracy means higher dimension). We also give pplications to L1 and fast approximation of Lewis weights.
(2019 WkgPap) A Chowdhury, Malik Magdon-Ismail, B Yener, "Quantifying contribution and propagation of error from computational steps, algorithms and hyperparameter choices in image classification pipelines", (arXiv).
Summary: We develop methods to quantify the contributions of various steps in a learning process to the ultimate error of the predictor.
(2019 WkgPap) Kshiteesh Hegde, Malik Magdon-Ismail, "Network Lens: Node Classification in Topologically Heterogeneous Networks", (arXiv).
Summary: We study the problem of identifying different behaviors occurring in different parts of a large heterogenous network. We zoom in to the network using lenses of different sizes to capture the local structure of the network.
(2019 Conf) Xiao Shou, Georgios Mavroudeas, Kofi Arhin, Jason N. Kuruzovich, Malik Magdon-Ismail, Kristin P. Bennett, "Supervised Mixture Models for Population Health", IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 2019. (pdf)
(2019 Conf) Maksim Tsikhanovich, Malik Magdon-Ismail, Muhammad Ishaq, Vassilis Zikas, "PD-ML-Lite: Private Distributed Machine Learning from Lightweight Cryptography", International Security Conference (ISC), 2019. (pdf)
(2019 Conf) Heidi Jo Newberg, Siddhartha Shelton, Eric Mendelsohn, Jake Weiss, Matthew Arsenault, Jacob S. Bauer, Travis Desell, Roland Judd, Malik Magdon-Ismail, Lee A. Newberg, Matthew Newby, Clayton Rayment, Colin Rice, Boleslaw K. Szymanski, Jeffery M. Thompson, Steve Ulin, Carlos Varela, Lawrence M. Widrow, and Benjamin A. Willett, "Streams and the Milky Way Dark Matter Halo", Galactic Dynamics in the Era of Large Surveys, Proc. IAU Symposium No. 353, 2019 (pdf) ( ppt)
(2019 Conf) Malik Magdon-Ismail, Kshiteesh Hegde, "The Intrinsic Scale of Networks is Small", IEEE ASONAM, 2019. (pdf)
(2019 Conf) Siddhartha Shelton, Jake Weiss, Jacob Bauer, Eric Mendelsohn, Heidi Jo Newberg, Travis Desell, Malik Magdon-Ismail, Larry Widrow, "Reconstructing the Orphan Stream Progenitor with MilkyWay@ home Volunteer Computing", American Astronomical Society Meeting Abstracts# 233, 2019 (abstract)

2018:

(2018 Jrnl) A. F. Atiya, Malik Magdon-Ismail, "The maximum drawdown of discrete time processes", Advanced Mathematical Models and Applications, Vol. 3, No. 2, pages 95-105, 2018. (print) (pdf).
Summary: We derive the MDD-distribution for a discrete time process using integral equation recursions.
(2018 Jrnl) Petros Drineas, Ilse Ipsen, Eugenia-Maria Kontopoulou, Malik Magdon-Ismail, "Structural convergence results for approximation of dominant subspaces from block Krylov spaces", SIAM Journal on Matrix Analysis and Applications, Vol. 39, No. 2, pp 567-586, 2018. (print) (pre-print).
Summary: This paper is concerned with approximating the dominant left singular vector space of a real matrix A of arbitrary dimension, from block Krylov spaces generated by the matrix AAT and the block vector AX. Two classes of results are presented. First are bounds on the distance, in the two and Frobenius norms, between the Krylov space and the target space. The distance is expressed in terms of principal angles. Second are quality of approximation bounds, relative to the best approximation in the Frobenius norm. For starting guesses X of full column-rank, the bounds depend on the tangent of the principal angles between X and the dominant right singular vector space of A. The results presented here form the structural foundation for the analysis of randomized Krylov space methods. The innovative feature is a combination of traditional Lanczos convergence analysis with optimal approximations via least squares problems.
(2018 Conf) Malik Magdon-Ismail, Lirong Xia "A Mathematical Model For Optimal Decisions In A Representative Democracy", NIPS, Dec. 2018. (pdf) ( ppt)
(2018 Conf) Kshiteesh Hegde, Malik Magdon-Ismail, "Separating Terrorist-Like Topological Signatures Embedded in Benign Networks", MILCOM, Oct. 2018. (pdf) ( ppt)
(2018 Conf) Prasanna Date, Christopher Carothers, Malik Magdon-Ismail, James Hendler, "Efficient Classification of Supercomputer Failures Using Neuromorphic Computing", IEEE Symposium Series on Computational Intelligence (SSCI), Nov 2018. (pdf) ( ppt)
(2018 Ligntning Talk, Not In Proceedings) Prasanna Date, Christopher Carothers, Malik Magdon-Ismail, James Hendler, "Efficient Classification of Supercomputer Failures Using Neuromorphic Computing", Int. Conf. Neuromorphic Computing (ICONS), Knoxville, Tennessee, July 2018. (pdf) ( ppt)

2017:

(2017 Jrnl) Abhisek Kundu, Petros Drineas, Malik Magdon-Ismail, "Recovering PCA and Sparse PCA via Hybrid-(l1,l2) Sparse Sampling of Data Elements", Journal of Machine Learning Research (JMLR): 18(75):1-34, 2017. (print) (pre-print).
Summary: We show that a sparse sampling of the elements of a matrix using sampling probabilitis based on a hybrid of L1 and L2 element-norms give better spectral reconstruction that using either element-norm independently. We show that this allows us to reconstruct PCA and sparse-PCA from incomplete data. We also consider extensions to the streaming setting.
(2017 Jrnl) Malik Magdon-Ismail "NP-hardness and inapproximability of sparse PCA", Information Processing Letters, Volume 126, October 2017, Pages 35-38. (print) (pre-print).
Summary: Using a reduction from CLIQUE we show NP-hardness and inapproximability of Sparse PCA.

2016:

(2016 WkgPap) P. Drineas, I. Ipsen, E. Kontopoulou, Malik Magdon-Ismail " Structural Convergence Results for Low-Rank Approximations from Block Krylov Spaces", (arXiv).
Summary: This paper is concerned with approximating the dominant left singular vector space of a real matrix A of arbitrary dimension, from block Krylov spaces q(AAT,AX). Two classes of results are presented. First are bounds on the distance, in the two and Frobenius norms, between the Krylov space and the target space. The distance is expressed in terms of principal angles. Second are quality of approximation bounds, relative to the best low-rank approximation in the Frobenius norm. For starting guesses X of full column-rank, the bounds depend on the tangent of the principal angles between X and the dominant right singular vector space of A. The results presented here form the structural foundation for the analysis of randomized Krylov space methods. The innovative feature is a combination of traditional Lanczos convergence analysis with optimal low-rank approximations via least squares problems.
(2016 WkgPap) A. Kundu, P. Drineas, Malik Magdon-Ismail "Recovering PCA from Hybrid-(ℓ1,ℓ2) Sparse Sampling of Data Elements", (arXiv).
Summary: This paper addresses how well we can recover a data matrix when only given a few of its elements. We present a randomized algorithm that element-wise sparsifies the data, retaining only a few its elements. Our new algorithm independently samples the data using sampling probabilities that depend on both the squares (ℓ2 sampling) and absolute values (ℓ1 sampling) of the entries. We prove that the hybrid algorithm recovers a near-PCA reconstruction of the data from a sublinear sample-size: hybrid-(ℓ1,ℓ2) inherits the ℓ2-ability to sample the important elements as well as the regularization properties of ℓ1 sampling, and gives strictly better performance than either ℓ1 or ℓ2 on their own. We also give a one-pass version of our algorithm and show experiments to corroborate the theory.
(2016 WkgPap) K. Wu, Malik Magdon-Ismail "Node-By-Node Greedy Deep Learning for Interpretable Features", (arXiv).
Summary: Multilayer networks have seen a resurgence under the umbrella of deep learning. Current deep learning algorithms train the layers of the network sequentially, improving algorithmic performance as well as providing some regularization. We present a new training algorithm for deep networks which trains \emph{each node in the network} sequentially. Our algorithm is orders of magnitude faster, creates more interpretable internal representations at the node level, while not sacrificing on the ultimate out-of-sample performance.
(2016 Jrnl) S. Das, A. Lavoie, Malik Magdon-Ismail, "Manipulation among the arbiters of collective intelligence: How Wikipedia administrators mold public opinion", ACM Transactions on the Web (pdf preprint)
Summary: We document that a surprisingly large number of editors change their behavior and begin focusing more on a particular controversial topic once they are promoted to administrator status. The conscious and unconscious biases of these few, but powerful, administrators may be shaping the information on many of the most sensitive topics on Wikipedia; some may even be explicitly infiltrating the ranks of administrators in order to promote their own points of view. In addition, we ask whether administrators who change their behavior in this suspicious manner can be identified in advance. Neither prior history nor vote counts during an administrator’s election are useful in doing so, but we find that an alternative mea- sure, which gives more weight to influential voters, can successfully reject these suspicious candidates. This second result has important implications for how we harness collective intelligence: even if wisdom exists in a collective opinion (like a vote), that signal can be lost unless we carefully distinguish the true expert voter from the noisy or manipulative voter.
(2016 Jrnl) R. Korolov, J. Peabody, A. Lavoie, S. Das, Malik Magdon-Ismail, W. Wallace, "Predicting Charitable Donations Using Social Media", Social Network Analysis and Mining, 6:31. (pdf preprint)
Summary: We study the relationship between chatter on social media and observed actions concerning charita- ble donation. One hypothesis is that a fraction of those who act will also tweet about it, implying a linear relation. However, if the contagion is present, we expect a super- linear scaling. We consider two scenarios: donations in response to a natural disaster, and regular donations. We empirically validate the model using two location-paired sets of social media and donation data, corresponding to the two scenarios. Results show a quadratic relation between chatter and action in emergency response case. In case of regular donations, we observe a near-linear relation. Additionally, regular donations can be explained by demographic factors, while for a disaster response social media is a much better predictor of action. A contagion model is used to predict the near-quadratic scaling for the disaster response case. This suggests that diffusion is pre- sent in emergency response case, while regular charity does not spread via social network.
(2016 Jrnl) S. Paul, Malik Magdon-Ismail, P. Drineas, "Feature selection for linear SVM with provable guarantees", Pattern Recognition, 60, pages 205--214. (pdf preprint)
Summary: We give two provably accurate feature-selection techniques for the linear SVM. The algorithms run in deterministic and randomized time respectively. Our algorithms can be used in an unsupervised or supervised setting. The supervised approach is based on sampling features from support vectors. We prove that the margin in the feature space is preserved to within ε-relative error of the margin in the full feature space in the worst-case. In the unsupervised setting, we also provide worst-case guarantees of the radius of the minimum enclosing ball, thereby ensuring comparable generalization as in the full feature space and resolving an open problem posed in Dasgupta et al. (2007).
(2016 Jrnl) K. Clarkson, P. Drineas, Malik Magdon-Ismail, M. Mahoney, X. Meng, D. Woodruff "The Fast Cauchy Transform and Faster Robust Regression", SIAM J. on Computing. (pdf preprint)
Summary: We provide fast algorithms for robust (L_1) and general L_p regression using a fast projection of the design matrix with a Hardmard scaled by Cauchy random variables (hence the name Fast Cauchy Transform, FCT). Our projection algorithm runs in O(ndlogn) and provides O(d^(2+)) L_1 distortion obliviously for an arbitrary subspace of dimension d. We give applications of our fast embedding algorithm to robust regression, L_p regression and L_1 subspace approximation.
(2016 Conf) Malik Magdon-Ismail, C. Boutsidis, "Optimal Sparse Linear Encoders and Sparse PCA", NIPS 2016, Barcelona, Spain, Dec 2016. (pdf) ( ppt)
(2016 Conf) K. Hegde, Malik Magdon-Ismail, B. Szymanski, K. Kuzmin "Clustering, Prominence and Social Network Analysis on Incomplete Networks", Complex Networks 2016, Milan, Italy, Nov-Dec 2016. (pdf) ( ppt)
(2016 Conf) K. Wu, P. Waters, Malik Magdon-Ismail, "Network Classification Using Adjacency Matrix Embeddings and Deep Learning", ASONAM 2016, San Francisco, Aug 2016. (pdf) ( ppt)

2015:

(2015 WkgPap) Malik Magdon-Ismail, Christos Boutsidis "Optimal Sparse Linear Auto-Encoders and Sparse PCA", (arXiv).
Summary: Principal components analysis (PCA) is the optimal linear auto-encoder of data, and it is often used to construct features. Enforcing sparsity on the principal components can promote better generalization, while improving the interpretability of the features. We study the problem of constructing optimal sparse linear auto-encoders. Two natural questions in such a setting are: i) Given a level of sparsity, what is the best approximation to PCA that can be achieved? ii) Are there low-order polynomial-time algorithms which can asymptotically achieve this optimal tradeoff between the sparsity and the approximation quality? In this work, we answer both questions by giving efficient low-order polynomial-time algorithms for constructing asymptotically \emph{optimal} linear auto-encoders (in particular, sparse features with near-PCA reconstruction error) and demonstrate the performance of our algorithms on real data.
(2015 WkgPap) Malik Magdon-Ismail "NP-Hardness and Inapproximability of Sparse PCA", (arXiv).
Summary: We give a reduction from {\sc clique} to establish that sparse PCA is NP-hard. The reduction has a gap which we use to exclude an FPTAS for sparse PCA (unless P=NP). Under weaker complexity assumptions, we also exclude polynomial constant-factor approximation algorithms.
(2015 WkgPap) Mark Goldberg, Mykola Hayvanovych, Malik Magdon-Ismail, William Wallace "Extracting Hidden Groups and their Structure from Streaming Interaction Data", (arXiv).
Summary: When actors in a social network interact, it usually means they have some general goal towards which they are collaborating. This could be a research collaboration in a company or a foursome planning a golf game. We call such groups \emph{planning groups}. Our particular focus is hidden planning groups who have not "declared" their membership explicitly. We formulate the problem of hidden group discovery from streaming interaction data, and we propose efficient algorithms for identifying the hidden group structures by isolating the hidden group's non-random, planning-related, communications from the random background communications. We validate our algorithms on real data (the Enron email corpus and Blog communication data).
(2015 Conf) S. Paul, Malik Magdon-Ismail, P. Drineas "Column Selection via Adaptive Sampling ", NIPS 2015, Montreal, Canada, Dec 2015. (pdf) ( ppt)
(2015 Conf) A. Kundu, P. Drineas, Malik Magdon-Ismail, "Approximatinig Sparse PCA from Incomplete Data", NIPS 2015, Montreal, Canada, Dec 2015. (pdf) ( ppt)
(2015 Conf) R. Korolov, J. Peabody, A. Lavoie, S. Das, Malik Magdon-Ismail, W. Wallace "Actions Are Louder than Words in Social Media", Proceedings of 2015 International Conference on Advences in Social Networks Analysis and Mining (ASONAM), Paris, France, August 2015. (pdf) ( ppt)
(2015 Conf) S. Paul, Malik Magdon-Ismail, P. Drineas "Feature Selection for Linear SVM with Provable Guarantees", Proceedings of the 17th International Conference on Artificial Intelligence and Statistics (AISTATS), San Diego, CA, May 2015. (pdf) (supplementary material) ( ppt)
(2015 Conf) E. Gertle, E. Mackin, , Malik Magdon-Ismail, L. Xia, Y. Yi, "Computing Manipulations of Ranking Systems", Proceedings of the International Conference on Autonomous Agents and Multiagent Systems (AAMAS), Istanbul, Turkey, May 2015. (pdf) ( ppt)

2014:

(2014 Talk) Malik Magdon-Ismail, "Machines that Learn From Data", Rensselaer - TALKS, December 6, 2014. (pdf)
(2014 Talk) Malik Magdon-Ismail, "Using Social Media to Predict Material Convergence", RPI Workshop on Disaster Response Logistics, December 4, 2014. (pdf)
(2014 Talk) Malik Magdon-Ismail, "Learning, Sparsity and Big Data", Bloomberg ML Group (pdf)
Summary: PCA, Clustering and Regression with sparse features: compute them efficiently and not sacrifice in in-sample performance too much. Mostly show and tell. A little SVM.
(2014 Jrnl) S. Adali, X. Lu, Malik Magdon-Ismail "Local, community and global centrality methods for analyzing networks", Social Network Analysis and Mining (SNAM) (Vol. 4, No 1). (pdf preprint)
Summary: We introduce different notions of prominence in a network based on its community structure: a local rank, which is the prominence of an actor within its communities (the communities are computed using a clustering algorithm) and its community rank, which is the rank of an actor's community within the "graph of communities". We show in a variety of real networks that these two measures of prominence capture different effects. On these same real networks, we show that external measures of prominence (like an authors citation prominence, or an actors ability to feature in big-budget movies) can be deconstructed into a contribution from local rank and community rank. Local and community ranks also play differing roles depending on the nature of the network and the type of external prominence one wishes to capture.
(2014 Jrnl) C. Boutsidis, P. Drineas, Malik Magdon-Ismail "Near-Optimal Column-Based Matrix Reconstruction", SIAM Journal on Computing (Vol. 43, Issue 2). (pdf)
Summary: We give o(SVD) algorithms that construct O(k/e) columns to reconstruct a matrix to within O(1+e)| |A-Ak||_F, where Ak is the best rank k reconstruction from the SVD. The number of columns is near optimal, asymptotically matching a known lower bound. We introduce two new tools in our algorithm - fast approximate matrix factorizations, and dual set sparsification results which may be of independent interest.
(2014 Jrnl) C. Boutsidis, Malik Magdon-Ismail "A Note on Sparse Least-squares Regression", Information Processing Letters, accepted. (arXiv version)
Summary: We give additive error bounds on sparse regression (where the regression vector is required to have a small number of non-zeros) in terms of the top-k PCA regression.
(2014 Jrnl) S. Paul, C. Boutsidis, Malik Magdon-Ismail, P. Drineas "Random Projections for Linear Support Vector Machines", ACM Transactions on Knowledge Discovery from Data, accepted. (arXiv version)
Summary: We show that using random projections for dimensionality reduction in SVM preserves (to within relative error) both the margin and the radius of the data. Since the margin and data radius are the two quantities parametrizing the statistical generalization error, these statistical bounds are preserved during the feature selection.
(2014 WkgPap) Christos Boutsidis Malik Magdon-Ismail, "Faster SVD-truncated Regularized Least-squares", (arXiv).
Summary: We develop a new algorithm that computes the SVD-truncated solution to a least squares problem (regression onto the top-k principal components) in time faster than SVD, while guaranteeing an additive error. The algorithm first quickly constructs an approximation to the top-k PCA and uses that approximation in the regression. We give a lower bound showing that our quality of approximation is the best one can expect of such algorithms which first approximate PCA.
(2014 Conf) C. Boutsidis, Malik Magdon-Ismail "Faster SVD-truncated Regularized Least-squares", IEEE International Symposium on Information Theory (ISIT), July 2014, Honolulu, HI, USA. (pdf)
Summary: We give additive error bounds on sparse regression (where the regression vector is required to have a small number of non-zeros) in terms of the top-k PCA regression.

2013:

(2013 Talk) Malik Magdon-Ismail, "Efficiently Implementing Sparsity in Learning", NIPS Workshop on Large Scale Matrix Analysis and Inference (pdf)
Summary: This talk describes ways to do PCA, Clustering and Regression with sparse features: compute them efficiently and not sacrifice in in-sample performance too much.
(2013 Talk) Malik Magdon-Ismail, "Learning, Sparsity and Big Data", RPI Dean's Seminar Series (pdf)
Summary: This talk describes ways to do PCA, Clustering and Regression with sparse features: compute them efficiently and not sacrifice in in-sample performance too much. Mostly show and tell.
(2013 Talk) Malik Magdon-Ismail, "Sparsity in Machine Learning", Talk at ICML Workshop on Numerical Linear Algebra in Machine Learning (pdf)
Summary: This talk describes ways to do PCA, Clustering and Regression with sparse features: compute them efficiently and not sacrifice in in-sample performance too much.
(2013 Jrnl) C. Boutsidis, Malik Magdon-Ismail, "Near-Optimal Coresets for Least-Squares Regression", IEEE Transactions on Infomation Theory, Vol 59, No. 10, 2013. (pdf, coming soon)
Summary: We give linear coresets for linear regression in both the spectral and Frobenius norms. Our algorithms are deterministic. We also provide lower bounds on coreset size for both deterministic and randomized algorithms.
(2013 Jrnl) C. Boutsidis, Malik Magdon-Ismail, "Deterministic Feature Selection for k-means Clustering", IEEE Transactions on Infomation Theory, Vol 59, No. 9, 2013. (pdf, coming soon)
Summary: We give new deterministic algorithms for selecting O(k) features to perform clustering. The resulting features give a relative error approximation to the optimal objective when considering the clustering error in the full space.
(2013 Jrnl) Matthew Newby, Nathan Cole, Heidi Jo Newberg, Travis Desell, Malik Magdon-Ismail, Boleslaw Szymanski, Carlos Varela, Benjamin Willett, and Brian Yanny, "A Spatial Characterization of the Sagittarius Dwarf Galaxy Tidal Tails", The Astronomical Journal Volume 145 Number 6, 2013. (pdf,print version)
Summary: Using a maximum likelihood algorithm, we simultaneously estimate the dynamical parameters of the Saggitarius Dwarf galaxy stream as well as the parameters of the Milky Way galaxy. As a by-product, we present the first catalog of stars having the same density profile as the Saggitarius stream. These results were achieved by leveraginig the MilkyWay@home volunteer computing platform.
(2013 Jrnl) Sibel Adali, Malik Magdon-Ismail, Xiaohui Lu, "iHypR: Prominence Ranking in Networks of Collaborations with Hyperedges", ACM Transactions on Knowledge Discovery from Data (TKDD), 2013. (accepted) (pdf, coming soon)
Summary: We give new algorithms for ranking in collaborative networks based on [i]clustering[/i] the collaborative artifacts into hyperedges and using an iterative pagerank type algorithm to obtain the rankings from these hyperedges.
(2013 WkgPap) Elliot Anshelevich, Ameya Hate, Malik Magdon-Ismail, "Seeding Influential Nodes in Non-Submodular Models of Information Diffusion", (arXiv).
Summary: We develop new algorithms to seed an information diffusions for a non-submodlar model. The model is built to mimic social proceeses that are likely to occur in information diffusion. Our approach is to project the non-submodular model to a class of submodular models for which an efficient solution can be constructed. We demonstrate significantly better performance as compared to using random or high degree seeding strategies on real and synthetic networks.
(2013 WkgPap) Kenneth L. Clarkson, Petros Drineas, Malik Magdon-Ismail, Michael W. Mahoney, Xiangrui Meng, David P. Woodruff, "The Fast Cauchy Transform and Faster Robust Linear Regression", submitted, 2013. (arXiv).
Summary: We develop the Fast Cauchy Transform (FCT), the L1 equivalent of the Fast Johnson Lindenstrauss Transform, which allows us to develop a fast algorithm for robust (L1) regression. This can be extended to Lp regression for p in the range [1,2]. We also give experimental simulations which show that the practice follows the theory closely.
(2013 Conf) Sanmay Das, Allen Lavoie, Malik Magdon-Ismail, "Manipulation Among the Arbiters of Collective Intelligence: How Wikipedia Administrators Mold Public Opinion", ACM Conference on Information and Knowledge Management (CIKM), 2013. Pages 1097-1106. (pdf) (slides)
(2013 Conf) Sibel Adali, Xiaohui Lu, Malik Magdon-Ismail, "Deconstructing centrality: thinking locally and ranking globally in networks", Proceedings 5th IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), Niagara Falls, Canada, August 2013. (pdf) (slides)
(2013 Conf) S. Paul, C. Boutsidis, Malik Magdon-Ismail, P. Drineas "Random Projections for Support Vector Machines", Proceedings of the 16th International Conference on Artificial Intelligence and Statistics (AISTATS) Scottsdale, AZ, April 2013. (pdf, coming soon) ( ppt)
(2013 Conf) Elliot Anshelevich, Ameya Hate, Malik Magdon-Ismail, "Seeding Influential Nodes in Non-Submodular Models of Information Diffusion", Proceedings of the 12th International Conference on Autonomous Agents and Multiagent Systems (AAMAS), St. Paul, Minnesota, May 2013. (pdf) (arXiv) ( ppt)
(2013 Conf) Kenneth Clarkson, Petros Drineas, Malik Magdon-Ismail, Michael Mahoney, Xiangrui Meng, David Woodruff, "The Fast Cauchy Transform and Faster Robust Linear Regression", Proceedings of ACM-SIAM Symposium on Discrete Algorithms (SODA), New Orleans, Louisiana, January 2013. (pdf) ( ppt)

2012:

(2012 Book) Yaser Abu-Mostafa, Malik Magdon-Ismail, Hsuan-Tien Lin "Learning from Data: A Short Course", amlbook.com, ISBN-13: 978-1600490064, March 2012.
This book is Part I of our text book on Machine Learning that is on the foundations and basic principles of learning from data. Roughly speaking, the book addresses the following topics: What is learning? Is learning feasible? Can we do it efficiently (linear models)? Can we do it well (overfitting and regularization)? Lesons learned - what are the take home messages?
(2012 Jrnl) Petros Drineas, Malik Magdon-Ismail, Michael Mahoney, David Woodruff, "Fast Approximation of Matrix Coherence and Statistical Leverage", Journal of Machine Learning Research (JMLR), Volume 13, pages 3441-3472, Dec 2012. (pdf)
Summary: We give the first algorithms to compute statistical leverage and coherence in sub-SVD time using random projections. The algorithms are accurate to within relative error. Datapoints with high statistical leverage are important in that they are likely to belong to good coresets that can reproduce a linear regression to within relative error.
(2012 Jrnl) Sibel Adali, Tina Liu, Malik Magdon-Ismail, "An Analysis of Optimal Link Bombs", Theoretical Computer Science, volume 437, pages 1-20, 2012. (preprint pdf) (print version online).
Summary: We consider the optimal attack pattern to spam the page rank algorithm. It turns out to be particularly simple: all attackers point to the victim. This optimally increases not only the page-rank of the victim but also the rank (based on the page-rank) of the victim. We also consider optimal 'disguised' attacks, i.e. attacks where the attackers do not want to be directly associated to the victim. A similar result holds in that setting.
(2012 Jrnl) Ali Civril, Malik Magdon-Ismail, "Column subset selection via sparse approximation of SVD", Theoretical Computer Science, Volume 421, pages 1-14, 2012. (preprint pdf) (print version).
Summary: We give one of the first deterministic algorithms for provable reconstruction of a matrix using a small subset of its columns. The method is based on a greedy approximation of the SVD. In order to greedily approximate the SVD from a dictionary, we extend Natarajan's result for greedy approximation of a vector from a dictionary.
(2012 WkgPap) Saurabh Paul, Christos Boutsidis, Malik Magdon-Ismail, Petros Drineas, "Random Projections for Support Vector Machines", submitted, 2012. (arXiv).
Summary: We give an oblivious random projection algorithm which samples a number features proportional to the rank of the data matrix such that the margin and radius of the data in the dimensionally reduced space are comparable to those in the original full-dimension space. Thus the generalization properties of the SVM are preserved. We give experimental simulations which bear out the theory.
(2012 WkgPap) Christos Boutsidis, Petros Drineas, Malik Magdon-Ismail, "Rich Coresets For Constrained Linear Regression", submitted, 2012. (arXiv).
Summary: We give the first polynomial algorithms which find a coreset of size order the number of variables, such that the regression on this coreset gives a relative error approximation to the regression on the full data set.
(2012 WkgPap) P. Horn, Malik Magdon-Ismail, "Spreading Processes and Large Components in Ordered, Directed Random Graphs", submitted, 2011. (arXiv).
Summary: We consider a spreading process that can be mapped onto a random graph model: order the vertices and randomply place directed edges from lower ranked to higher ranked vertices; each such edge exists independently with probability \math{p}. If the spread starts at the first vertex, the size of the infection is the component reachable from this vertex. We prove existence of a sharp threshold \math{p^*=\log n/n} at which this reachable component transitions from \math{o(n)} to \math{\Omega(n)}.
(2012 WkgPap) Malik Magdon-Ismail, "A Note On Estimating the Spectral Norm of A Matrix Efficiently". (arXiv).
Summary: A random projection based power iteration for estimating spectral norms to within relative error.
(2012 Conf) Sibel Adali, Xiaohui Lu, Malik Magdon-Ismail, "Attentive Betweenness Centrality (ABC): Considering Options and Bandwidth when Measuring Criticality", Proceedings IEEE Conference on Social Computing (SocialCom), Amsterdam, The Netherlands, September 2012. (pdf) (slides)
(2012 Conf) Pranay Anchuri, Malik Magdon-Ismail, "Communities and Balance in Signed Networks: A Spectral Approach", Proceedings of 4th International on Advances in Social Networks Analysis and Mining (ASONAM), Istanbul, Turkey, August 2012. (pdf) ( ppt)
(2012 Conf) Mark Goldberg, Malik Magdon-Ismail, James Thompson "Identifying Long Lived Social Communities Using Structural Properties", Proceedings of 4th International on Advances in Social Networks Analysis and Mining (ASONAM), Istanbul, Turkey, August 2012. (pdf) ( ppt)
(2012 Conf) Petros Drineas, Malik Magdon-Ismail, Michael W. Mahoney, David P. Woodruff "Fast Approximation of Matrix Coherence and Statistical Leverage", Proceedings of 25th International Conference on Machine Learning (ICML), Edinburgh, Scotland, June 2012. (pdf) ( ppt)
(2012 Wkshp) Hsuan-Tien Lin, Malik Magdon-Ismail, Yaser S. Abu-Mostafa "Teaching Machine Learning to a Diverse Audience: the Foundation-based Approach", Teaching Machine Learning Workshop at the 25th International Conference on Machine Learning (ICML), Edinburgh, Scotland, June 2012. (pdf) ( ppt)
(2012 Conf) Mark Goldberg, Malik Magdon-Ismail, William Wallace, John Schwartz and Bridget Gutting "Graph Search Beyond Text: Relational Searches in Semantic Hyperlinked Data", Proceedings of 10th International Conference on Intelligence and Security Informatics (ISI), Washington DC, June 2012. (pdf) ( ppt)
(2012 Conf) Malik Magdon-Ismail, Brian Orecchio "Guard Your Connections: Infiltration of a Trust/Reputation Based Network", Proceedings of 4th International Conference on Web Science (WebSci) in conjunction with NetSci 2012, Evanston IL, June 2012. ( pdf) ( ppt)
(2012 Wkshp) Sanmay Das, Allen Lavoie, Malik Magdon-Ismail, "Pushing Your Point of View: Behavioral Measures of Manipulation in Wikipedia", Workshop on Social Computing and User Generated Content at ACM Conference on Electronic Commerce (ACM-EC), Valencia, Spain, June 2012. (pdf) ( ppt)
(2012 Conf) Aseem Brahma, Mithun Chakraborty, Sanmay Das, Allen Lavoie, Malik Magdon-Ismail, "A Bayesian Market Maker", Proceedings of 13th International Conference on Electronic Commerce (EC), Valencia, Spain, June 2012. ( pdf) ( ppt)
(2012 Conf) Cindy Hui, William Wallace, Malik Magdon-Ismail, Mark Goldberg "Information Cascades in Social Media in Response to a Crisis: a Preliminary Model and a Case Study", Proceedings of Workshop on Social Web for Disaster Management (SWDM) at World Wide Web Conference (WWW), April 2012. pdf. Poster: ppt.
(2012 Conf) Sibel Adali, Fred Sisenda, Malik Magdon-Ismail, "Actions Speak as Loud as Words: Predicting Relationships from Social Behavior Data", Proceedings of World Wide Web Conference (WWW), April 2012. pdf. Poster: ppt.

2011:

(2011 BkChap) S. Kelley, M. Goldberg, Malik Magdon-Ismail, K. Mertsalov, W. Wallace, "Defining and Discovering Communities in Social Networks", Book Chapter in Handbook of Optimization in Complex Networks, eds. M. Thai and P. Pardalos, Chapter 6, pages, 2011. (pdf).
Summary: Overlapping communities play a large role in the functioning of social networks. We give a definition based on minimal properties that such communities should have, and develop efficient algorithms to find such comunities. We give some methods for validating that the communities found are "real" and compare this minimal axiomatic approach to several other benchmark algorithms which take a specific view as to the structure of the communities.
(2011 BkChap) C. Busch, Malik Magdon-Ismail, J. Xi, "Oblivious Routing for Sensor Network Topologies", Book Chapter in Theoretical Aspects of Distributed Computing in Sensor Networks, eds. S. Nikoletseas and J. D. P. Rolim, Chapter 13, pages 381-400, Springer, 2011. (pdf).
Summary: We collect some of our previous results on oblivious routing algorithms with near-optimal congestion and dilation (as compared with optimal offline routing) for sensor network topologies which include the d-dimensional mesh and uniform geometric networks.
(2011 Jrnl) Ali Civril, Malik Magdon-Ismail "Exponential Inapproximability of Selecting a Maximum Volume Sub-matrix", Algorithmica, pages 1-18 Journal Version: pdf. (arXiv).
Summary: We show that selecting columns of a data matrix to have maximum "numerical rank" (as measured by the volume of the data vectors selected) is exponentially hard to approximate.
(2011 Jrnl) S. Das, Malik Magdon-Ismail, "A Model for Information Growth in Collective Wisdom Processes", ACM Transactions on Knowledge Discovery from Data (TKDD), volume 6, issue 2, article 6, July 2011. preprint (pdf). (pdf).
Summary: We give a succint model for edit dynamics in collective wisdom processes in which users (the collective) arrive sequentially and contribute to the wisdom on a particular topic. The model captures the arrival of new information, together with the saturation of information on a particular topic. As the quality of a particular topic increases, there is a tradeoff between more uses arriving and those users having less to contribute. We demonstrate the model on Wiki and Blog edit behavior, in particular capturing the main observed phenomenological dynamics within just this simple model.
(2011 Jrnl) M. Goldberg, Malik Magdon-Ismail, "Embedding a Forest in a Graph", Electronic Journal of Combinatorics (EJC), Volume 18(1), P99, Apr 29, 2011. (pdf).
Summary: For $p\ge 1$ , we prove that every forest with $p$ trees whose sizes are $a_1, . . . , a_p$ can be embedded in any graph containing at least $\sum_{i=1}^p (a_i + 1)$ vertices and having minimum degree at least $\sum_{i=1}^p a_i$ .
(2011 Jrnl) Malik Magdon-Ismail, Jonathan Purnell, "Approximating the Covariance Matrix of GMMs with Low-rank Perturbations", International Journal of Data Mining, Modelling and Management, Special issue on the best papers of IDEAL'10 (invited), pp 300--307. preprint (pdf).
Summary: We present a method for using Gaussian Mixture Models (GMMs) efficiently using low rank perturbations to a diagonal covariance matrix. The full covariance matrix has O(d^2) free parameters and could be prone to overfitting as well as inefficient. The diagonal covariance matrix has O(d) parameters and is efficient but can be restrictive. The low rank perturbation to a diagonal covariance matrix has O(d) parameters, is efficient and offers a middle ground between full and diagonal covariance which can capture correlations but using linear (in d) parameters.
(2011 WkgPap) C. Boutsidis Malik Magdon-Ismail, "Deterministic Feature Selection for k-means Clustering", submitted, 2011. (arXiv).
Summary: We show that it is possible to reduce the feature dimension and provably obtain clusterings which are comparable to the good clusterings in the original dimension.
(2011 WkgPap) Sanmay Das, Allen Lavoie, Malik Magdon-Ismail, "Pushing Your Point of View: Behavioral Measures of Manipulation in Wikipedia", 2011. (arXiv).
Summary: We develop edit behavioral indices that capture change in behavior and show correlation between blocked (potentially manipulative) editors and adverse changes in this index.
(2011 Conf) , M. Goldberg, J. Greenman, B. Gutting, Malik Magdon-Ismail, J. Schwartz, W. Wallace, "Toward Efficient Search for a Fragment Network in a Large Semantic Database", Proc.6th Annual Network Science Workshop, West Point, Oct 23-24 2011. pdf Talk: pdf.
(2011 Conf) T. Desel, Malik Magdon-Ismail, H. Newberg, L. Newberg, B. Szymanski, and C. Varela, "A Robust Asynchronous Newton Method for Massive Scale Computing Systems", In the 2011 IEEE International Conference on Computational Intelligence and Software Engineering (CiSE 2011). Wuhan, China. December 9-11, 2011. pdf Talk: pdf.
(2011 Conf) T. Desel, Malik Magdon-Ismail, L. Newberg, B. Szymanski, "Finding Protein Binding Sites Using Volunteer Computing Grids", Proc. 2nd International Congress on Computer Applications and Computational Science (CACS 2011), Bali, Indonesia, Nov 15-17 2011. pdf Talk: pdf.
(2011 Conf) C. Boutsidis, P. Drineas, Malik Magdon-Ismail, "Sparse Features for PCA-like Regression", Proc. 25th Conference on Neural Information Processing Systems (NIPS 2011), Granada, Spain, Dec 13-15 2011. pdf (coming soon) Talk: pdf.
(2011 Conf) C. Boutsidis, P. Drineas, Malik Magdon-Ismail, "Near-Optimal Column Based Matrix Reconstruction", Proc. 52nd IEEE Symposium on Foundations of Computer Science (FOCS 2011), Palm Springs, CA, Oct 23-25 2011. pdf (full version: arXiv). Talk: pdf.
(2011 Conf) M. Goldberg, Malik Magdon-Ismail, S. Nambirajan, J. Thompson "Tracking and Predicting Evolution of Social Communities", Proc. 3rd IEEE International Conference on Social Computing (SocialCom2011), MIT, Boston, MA, Oct 9-11 2011. pdf. Talk: pdf.
(2011 Conf) Malik Magdon-Ismail, J. Purnell "SSDE-CLuster: Fast Overlapping Clustering of Networks Using Sampled Spectral Distance Embedding and GMMs", Proc. 3rd IEEE International Conference on Social Computing (SocialCom2011), MIT, Boston, MA, Oct 9-11 2011. pdf. Talk: pdf.
(2011 Conf) C. Hui, Malik Magdon-Ismail, W. Wallace, M. Goldberg "Aborting a Message Flowing Through Social Communities", Proc. 3rd IEEE International Conference on Social Computing (SocialCom2011), MIT, Boston, MA, Oct 9-11 2011. pdf. Talk: pdf.
(2011 Conf) Mithun Chakraborty, Sanmay Das, Allen Lavoie, Malik Magdon-Ismail, Yonatan Naamad, "Instructor Rating Markets", (abstract only appeared) at Workshop on Social Computing and User Generated Content at the ACM Conference on Electronic Commerce, June 2011 and Proc. 2nd Conference on Auctions, Market Mechanisms, and Their Applications (AMMA), August 2011. pdf. Poster: pdf.
(2011 Conf) Mithun Chakraborty, Sanmay Das, Malik Magdon-Ismail, "Near-Optimal Target Learning With Stochastic Binary Signals", Proc. 27th Conference on Uncertainty in Artificial Intelligence (UAI), Barcelona, Spain, July 14-17, 2011. pdf. Poster: pdf.
(2011 Conf) Cindy Hui, Malik Magdon-Ismail, Mark Goldberg, William Wallace, "Effectiveness of Information Retraction", 1st IEEE International Workshop on Network Science (NSW 2011), West Point, NY, June 22-24, 2010. pdf. Poster: pdf.
(2011 Conf) Sibel Adali, Xiaohui Lu, Malik Magdon-Ismail, Jonathan Purnell, "Prominence Ranking in Graphs with Community Structure", Proceedings of the Fifth International AAAI Conference on Weblogs and Social Media (ICWSM), July 2011. pdf. Poster: ppt.

2010:

(2010 BkChap) Nathan Cole, Travis Desell, Daniel Lombranaa Gonzalez, Francisco Fernandez de Vega, Malik Magdon-Ismail, Heidi Newberg, Boleslaw Szymanski and Carlos A. Varela, "Evolutionary Algorithms on Volunteer Computing Platforms: The MilkyWay@Home Project", Bookchapter In F. Fernandez de Vega, E. Cantu-Paz (Eds.): Parallel and Distributed Computational Intelligence, SCI 269, pp. 63-90. Springer-Verlag Berlin Heidelberg 2010. preprint (pdf). Available at springer.com.
(2010 Jrnl) Malik Magdon-Ismail, Konstantin Mertsalov, "A Permutation Approach to Validation", Statistical Analysis and Data Mining Special issue on the best papers of SDM'10 (invited), Vol. 3, No. 6, pp 361-380. preprint (pdf).
Summary: We present a method for "estimating" the out-sample error (validation) using a permutation estimate. For classification, one can bound the out-sample error using a permutation complexity which is related to the permutation estimate. The permutation estimate, however, applies to classification, regression, multiclass, ... and can be extended to different error measures (other than squared error) -- the paper primarily focusses on classification and L2-regression. Our experiments show that the permutation estimate performs well for model selection, outperforming a variety of other model methods including LOO-CV and 10-fold CV. The permutation estimate general, easy to compute and efficient, relying only on being able to run the learning algorithm on data.

Erratum: Rudiger Hewer, in his bachelor thesis graciously pointed out to me that the bound in Lemma 8 should be corrected by a factor of 2 which propagates to the error term in Theorem 5, which should be 5*sqrt(8/n...) instead of 3*sqrt(8/n...).
(2010 Jrnl) Cindy Hui, Mark Goldberg, Malik Magdon-Ismail, William Wallace, "Simulating the Diffusion of Information: An Agent-based Modeling Approach", Special Issue on Agent-Directed Simulation, International Journal of Agent Technologies and Systems (invited paper, in press) pdf.
Summary: We develop an agent based model for information diffusion in social networks based on information fusion, propagation and querying. We validate the model on real data from the San Diego wild fires evacuation in 2007.
(2010 WkgPap) Aseem Brahma, Sanmay Das Malik Magdon-Ismail, "Comparing Prediction Market Structures, With an Application to Market Making", 2010. (arXiv).
Summary: We develop a novel 2-d ball mechanism for comparing prediction market microstructures, and use it to compare two market makers LMSR and BMM.
(2010 Conf) Costas Busch, Malik Magdon-Ismail, "Optimal Oblivious Routing in Hole-Free Networks", Proceedings of the 7th International Conference on Heterogeneous Networking for Quality, Reliability, Security, and Robustness (QShine), Houston, Texas, November 2010. (invited) pdf. Slides: ppt.
(2010 Conf) Cindy Hui, Malik Magdon-Ismail, Mark Goldberg, William Wallace, "Importance of Ties in Information Diffusion", Extended Abstract, Workshop on Information In Networks (WIN), NY, NY, Sept 24-25, 2010. pdf. Poster: pdf.
(2010 Conf) Cindy Hui, Malik Magdon-Ismail, Mark Goldberg, William Wallace, "Weak Ties and the Diffusion of Information in Dynamic Networks", NetSci 2010, poster presentation, Boston, MA. pdf. Poster: pdf.
(2010 Conf) Malik Magdon-Ismail, "Permutation Complexity Bound on Out-Sample Error", Proc. 24th Annual Conference on Neural Information Processing Systems (NIPS), pages, Dec. 6-9, 2010. pdf. Poster: pdf.
(2010 Conf) Travis Desell, Anthony Waters, Malik Magdon-Ismail, Boleslaw Szymanski, Carlos A. Varela, Matthew Newby, Heidi Newberg, Andreas Przystawik, David Anderson, "Accelerating the MilkyWay@Home Volunteer Computing Project with GPUs", 8th International Conference on Parallel Processing and Applied Mathematics, Part I (PPAM), pages 276-288 pdf. Slides: pdf.
(2010 Conf) Goldberg, M. K., Kelley, S., Magdon-Ismail, M., Wallace, W., "Overlapping Communities in Social Networks", Proc. 2nd Confernence on Social Computation (SocialCom), pages 104-113, Aug 20-22, Minneapolis, Minnesota, 2010; also appeared in Proc. Workshop on Social Network Analysis (SNAKDD) at KDD, pages 43-52, July 25-28, Washington, DC, 2010 pdf. Slides: pdf.
(2010 Conf) Jonathan Purnell, Malik Magdon-Ismail, "Approximating the Covariance Matrix with Low Rank Perturbations", Proc. 11th Int. Conf. on Intelligent Data Engineering and Automated Learning (IDEAL), 1-3 Sept., Paisley, Scotland, 2010. pdf. Slides: pdf.
(2010 Conf) Goldberg, M. K., Hayvanovych, M., Magdon-Ismail, M., "Measuring Similarity between Sets of Overlapping Clusters", Proc. Workshop on Social Network Analysis (SNAKDD) at KDD, pages 62-68, July 25-28, Washington, DC, 2010 ; also appeared in Proc. Workshop on Social Intelligence (SIN) at SocialCom, pages 303-308, Aug 20-22, Minneapolis, Minnesota, 2010 pdf. Slides: pdf.
(2010 Conf) Travis Desell, David P. Anderson, Malik Magdon-Ismail, Heidi Newberg, Boleslaw Szymanski, Carlos A. Varela, "An Analysis of Massively Distributed Evolutionary Algorithms", Proc. 2010 IEEE Congress on Evolutionary Computation (CEC), pages (to appear), Barcelona, Spain, July 18-23 2010. pdf. Slides: ppt.
(2010 Conf) Travis Desell, Malik Magdon-Ismail, Bolek Szymanski, Carlos Varela, Heidi Newberg, David P. Anderson, "Validating Evolutionary Algorithms on Volunteer Computing Grids", Proc. 10th IFIP International Conference on Distributed Applications and Interoperable Systems (DAIS 2010), pages 29-41, Amsterdam, Netherlands, June 7-10 2010. pdf. Slides: ppt.
(2010 Conf) Adali, S., Escriva, R., Goldberg, M. K., Hayvanovych, M., Magdon-Ismail, M., Szymanski, B. K., Wallace, W. A., Williams, G.M, "Measuring Behavioral Trust in Social Networks", Proc. International Conference on Intelligence and Security Informatics (ISI), pages 150-152, Vancouver, BC, May 23-26, 2010. pdf. Technical Report: pdf. Slides: pdf.
(2010 Conf) Sanmay Das, Malik Magdon-Ismail, "Collective Wisdom: Information Growth in Wikis and Blogs", ACM Conference on E-Commerce (EC 2010), pages 231-240, June 7-8 , Cambridge Massachusetts, 2010. pdf. Slides: pdf.
(2010 Conf) Cindy Hui, Mark Goldberg, Malik Magdon-Ismail, William Wallace, "Agent Based Simulation of Diffusion Warnings", Proc. Agent-Directed Simulation Symposium (ADS), Orlando FL, April 12-May 14, 2010. pdf. Slides: pdf. [Best Paper]
(2010 Conf) Malik Magdon-Ismail, Konstantin Mertsalov, "A Permutation Approach to Validation", Proc. 10th SIAM International Conference on Data Mining (SDM), pages 882-983, Columbus Ohio, April 29-May 1, 2010. pdf. Slides: pdf. [Best paper, top 3]

2009:

(2009 Jrnl) Ali Civril, Malik Magdon-Ismail "On Selecting a Maximum Volume Sub-Matrix of a Matrix and Related Problems", Theoretical Computer Science, Journal Version: postscript, pdf.
Summary: We study the problem of selecting a set of columns of a matrix with maximum volume. This is useful in trying to construct representative columns of high "numerical rank", which may be of use in matrix reconstruction. We study the algorithmic problem, showing that it is NP-hard and inapproximable. We then study a natural greedy algorithm for this problem, giving a worst case approximation guarantee, also demonstrating a problem which almost achieves this worst case. The performance of greedy is considerably worse than the inapproximability result we prove, suggesting avenues for further work in strengthening the inapproximability result or finding better algorithms.
(2009 Jrnl) Costas Busch, Malik Magdon-Ismail "Atomic Routing Games on Maximum Congestion", Theoretical Computer Science, Volume 410, Issue 36, Pages 3337-3347, 2009. Journal Version: postscript, pdf. (link to journal web version) Conference Version: AAIM 2006, ps, pdf.
Summary: We study atomic routing games on networks in which players choose a path with the objective of minimizing the maximum congestion along the edges of their path. The social cost is the global maximum congestion over all edges in the network. We show that the price of stability is 1. The price of anarchy, PoA, is determined by topological properties of the network. In particular, PoA=O(L+logn), where L is the length of the longest path in the player strategy sets, and n is the size of the network. Further, K-1<=PoA<=c(K^2+log^2 n), where K is the length of the longest cycle in the network, and c is a constant.
(2009 Conf) Travis Desell, Malik Magdon-Ismail, Bolek Szymanski, Carlos Varela, Heidi Newberg, Nathan Cole "Robust Asynchronous Optimization for Volunteer Computing Grids", Proc. 5th International Conference e-Science, pages 263-270, Oxford UK, Dec 7-9, 2009. pdf. Slides: ppt.
(2009 Conf) Jonathan Purnell, Malik Magdon-Ismail "Learning American English Accents Using Ensemble Learning with GMMs", Proc. 8th International Conference on Machine Learning and Applications (ICMLA), pages 47-52, 2009. pdf. Slides: pdf.
(2009 Conf) Christopher Wynnyk, Malik Magdon-Ismail "Pricing the American Option using Reconfigurable Hardware", Proc. 7th IEEE/IFIP Conference on Embedded and Ubiquituous Computing (EUC-2009) pages 532-536, 2009. pdf. Slides: pdf.
(2009 Conf) Mark Goldberg, Malik Magdon-Ismail, Konstantin Mertsalov, "Models of Communication Dynamics for Simulation of Information Diffusion", Proceedings of Advances in Social Networks Analysis and Mining (ASONAM 2009), pages 44-49, 2009. pdf. Slides: ppt.
(2009 Conf) Stephen Kelley, Mark Goldberg, Malik Magdon-Ismail, Konstantin Mertsalov,William Wallace, Mohammed Zaki "graphOnt: An Ontology Based Library for Conversion from Semantic Graphs to JUNG", Proceedings 2009 IEEE International Conference on Intelligence and Security Informatics, pages 170-172, 2009. pdf. Slides: pdf.
(2009 Conf) Stephen Kelley, Mark Goldberg, Malik Magdon-Ismail, Konstantin Mertsalov "Stability of Individual and Group Behavior in a Blog Network", Proceedings 2009 IEEE International Conference on Intelligence and Security Informatics, pages 7-12, 2009. pdf. Slides: pdf.
(2009 Conf) C. Hui, Malik Magdon-Ismail, Mark Goldberg, William A. Wallace "The Impact of Changes in Network Structure on Diffusion of Warnings", Proc. Workshop on Analysis of Dynamic Networks (SIAM International Conference on Data Mining), pages, 2009. pdf. Slides: pdf.

2008:

(2008 Jrnl) Hung-Ching (Justin) Chen, Mark Goldberg, Malik Magdon-Ismail, William Wallace "Reverse Engineering an Agent-Based Hidden Markov Model for Complex Social Systems", International Journal of Neural Systems, pdf.
Summary: We give heuristics for learning the parameters of a HMM describing dynamics of social groups in social networks. The input data are communications for the social network. The HMM is an agent based micro-law model which is highly interdependent. We apply our methodology to real data from blogs, newsgroups.
(2008 Jrnl) Jeffery Baumes, Hung-Ching (Justin) Chen, Matthew Francisco, Mark Goldberg, Malik Magdon-Ismail, William Wallace "ViSAGE: A Virtual Laboratory for Simulation and Analysis of Social Group Evolution", ACM Transactions on Autonomous and Adaptive Systems (TAAS), postscript, pdf.
Summary: We give a parameterised generative HMM model for social group evolution based on social capital theory, and demonstrate its potential for modeling of social groups in social networks. In particular, one may observe certain abrupt changes in the behavior of macroscopic properties such as the group size distribution as one continuously changes some of the parameters. Within this model we can reproduce many observed phenemona within the social science literature.
(2008 Jrnl) Nathan Cole, Heidi Joe Newberg, Malik Magdon-Ismail, Travis Desell, Kristopher Dawsey, Warren Hayashi, Xinyang (Fred) Liu, Jonathan Purnell, Boleslaw Szymanski, Carlos Varela, James Wisniewski, "Maximum Likelihood Fitting of Tidal Streams with application to the Sagittarius Dwarf Tidal Tails", the Astrophysical Journal, Vol 683, pages 750-766 (2008). Journal Version: postscript, pdf. Conference Versions: eScience 2007 (pdf) . ISMIS 2005 (pdf).
Summary: We give a maximum likelihood algorithm to identify streams in from SDSS like data. In particular, we use a cylindrical profile for the stream over small length scales and a Hernquist profile for the Milky Way halo. The algorithm finds the stream and separates it from the background halo, thus producing more accurate parameters of the stream and giving the first catalogue of stars extracted from the data to have the stream density profile. In general, we are studying the problem of separating geometric objects in spatial data bases.
(2008 Conf) Nathan Cole, Heidi Joe Newberg, Malik Magdon-Ismail, Travis Desell, Boleslaw Szymanski, Carlos Varela, "Tracing the Sagittarius Tidal Stream with Maximum Likelihood", Proc. AIP Conference, Volume 1082, pages 216--220, December 5th, 2008. pdf. Slides: pdf.
(2008 Conf) Ali Civril, Malik Magdon-Ismail "Deterministic Sparse Column Based Matrix Reconstruction via Greedy Approximation of SVD", Proc. 19th International Symposium on Algorithms and Computation (ISAAC), pages, December 2008. pdf. Slides: pdf.
(2008 Conf) Sanmay Das, Malik Magdon-Ismail "Collective Wisdom: Information growth in Wikis and Blogs", NIPS Workshop "Beyond Search: Computational Intelligence for the Web", December 2008. ps, pdf. Poster: pdf.
(2008 Conf) Sanmay Das, Malik Magdon-Ismail "Adapting to a Market Shock: Optimal Sequential Market-Making", Proc. Advances in Neural Information Processing Systems (NIPS), pages, December 2008. pdf (paper). pdf (supplementary material). Poster: pdf.
(2008 Conf) Mark Goldberg, Stephen Kelley, Malik Magdon-Ismail, Konstantin Mertsalov, William Wallace "Communication Dynamics of Blog Networks", Proc. SIGKDD Workshop on Social Network Mining and Analysis, pages August, 2008. pdf. Slides: pdf.
(2008 Conf) Mark Goldberg, Stephen Kelley, Malik Magdon-Ismail, Konstantin Mertsalov, "Stable Statistics of the Blogograph", Proc.Interdisciplinary Studies in Information Privacy and Security (ISIPS), pages, 2008. pdf. Slides: pdf.
(2008 Conf) C. Hui, Malik Magdon-Ismail, Mark Goldberg, William A. Wallace "Micro-Simulation of Diffusion on Warnings", Proc. 5th Int. Conf. on Information Systems for Crisis Response and Management ISCRAM, pages 424-430, 2008. pdf. Slides: pdf.
(2008 Conf) M. Hayvanovych, A. Hoonlor, M. Goldberg, S. Kelley, Malik Magdon-Ismail, K. Mertsalov, B. Szymanski, W. Wallace, "Discovery, analysis and monitoring of hidden social networks and their evolution", Proc. IEEE Conference on Technologies for Homeland Security, pages 1--6 May, Boston, 2008. pdf. Slides: pdf.
(2008 Conf) Yingjie Zhou, Malik Magdon-Ismail, William A. Wallace, Mark Goldberg, "A Generative Model for Statistical Determination of Information Content from Conversation Threads", Proc. International Conference on Intelligencs and Security Informatics (ISI), pages 331--342 June 17-20, Taipei, Taiwan, 2008. pdf. Slides: pdf.
(2008 Conf) Mark Goldberg, Stephen Kelley, Malik Magdon-Ismail, Konstantin Mertsalov, "A Locality Model of the Evolution of Blog Networks", Proc. International Conference on Intelligencs and Security Informatics (ISI), pages 191--193 June 17-20, Taipei, Taiwan, 2008. pdf. Slides: ppt.
(2008 Conf) Baumes, Jeffery, Goldberg, Mark K., Malik Magdon-Ismail, Wallace, William "Discovering Hidden Groups in Communication Networks", Invited Book Chapter, in "Security Informatics and Terrorism: Patrolling the Web",NATO Science for Peace and Security Series, Sub-series D: Information and Communication Security, eds.Cecilia S. Gal and Paul B. Kantor and Bracha Shapira, NATO, pages 82--108. ps (preprint) , pdf (preprint).

2007:

(2007 Jrnl) Costas Busch, Malik Magdon-Ismail, Marios Mavronicolas, "Efficient Bufferless Packet Switching on Trees and Leveled Networks", Journal of Parallel and Distributed Computing, Volume 67 (2007), pages 1168-1186. Journal Version: postscript, pdf. Conference Versions: EUROPAR 2005 (Leveled), EUROPAR 2004 (Trees), ps, pdf. (Leveled) ps, pdf. (Trees)
Summary: We give bufferless hot-potato style scheduling algorithms for trees and leveled networks. For leveled networks, we give centralized and distributed algorithms. The centralized algorithm has routing time within a logarithmic factor of optimal, and the distributed algorithm is a logarithmic factor worse than the centralized. For trees, we give deterministic and randomized algorithms. The deterministic algorithm applies to bounded degree networks and has routing time within a logaritmic factor of optimal. The randomized algorithm applies to arbitrary networks, and has routing time within a log-squared factor from optimal.
(2007 Jrnl) Costas Busch, Malik Magdon-Ismail, Jing Xi, "Optimal Oblivious Path Selection on the Mesh", IEEE Transactions on Computers, Vol. 57, No. 5, pages 660-671 Journal Version: postscript, pdf. Conference Version: International Parallel and Distributed Processing Symposium (IPDPS), 2005, postscript, pdf.
Summary: We give an oblivious algorithm for the d-dimensional mesh in which packets select their paths independently of each other. The stretch and congestion are both within O(d^2) from the optimal stretch and congestion attainable by oblivious algorithms, and for oblivious algorithms, the congestion is within a logarithmic factor from the optimal congestion attainable by non-oblivious algorithms. Our algorithm is randomized, and we show that significant randomization is required for any algorithm which attains near optimal congestion. In particular, we show that our algorithm uses at most a factor O(d) more bits per packet than any algorithm which attains a comparable congestion.
(2007 Jrnl) Costas Busch, Malik Magdon-Ismail Fikret Sivrikaya, Bulent Yener "Contention-free MAC protocols for asynchronous wireless sensor networks", Distributed Computing, Vol. 21, No. 1, pages 23-42, 2008 Journal Version: ps, pdf. Conference Version: 18th Annual Conference on Distributed Computing (DISC), 2004, postscript, pdf.
Summary: We study TDMA-based MAC protocols for asynchronous wireless sensor networks in very harsh environments. Specifically, the protocols are contention-free (avoid collisions), distributed and self-stabilize to topological changes in the network; any topological changes are contained, namely, affect only the nodes in the vicinity of the change; our protocols do not assume that nodes have a global time reference, that is, nodes may not be time-synchronized, and nodes may wake up in an arbitrary order. We also discuss how to accomodate clock skew, slot misalignment and inability for collision detection at a node.
(2007 Jrnl) Volkan Isler, Malik Magdon-Ismail "Sensor Selection in Arbitrary Dimension", IEEE Transactions on Automation Science and Engineering (TASE), Vol. 5, No. 4, pages 651-660, 2008 ps, pdf.
Summary: We address the problem of localizing an object in many dimensions by selecting a small number of sensors from which to obtain a measurement. We abstract a sensor measurement as a convex set in R^n and a localization as the intersection of all measurments available to the object. We show that a constant number of sensors is all that is required to obtain a constant factor approximation to the localization error obtained from using all the sensors. Instrumental in our proof is a new construction of an enclosing simplex of a convex polygon with bounded volume.
(2007 Jrnl) Costas Busch, Malik Magdon-Ismail, Marios Mavronicolas, "Universal Bufferless Packet Switching", Siam Journal on Computing, Volume 37, Issue 4, pages 1139-1162, 2007. Journal Version: ps, pdf. Conference Version: Workshop in Approximate and On-line Algorithms (WAOA), 2004, postscript, pdf.
Summary: We give universal packet switching algorithms for bufferless networks, settling an open question regarding the possibility of optimal (to within poly-log factors) bufferless packet switching algorithms. Our argument is constructive and the algorithm is polynomial time. The main idea is to convert a bufferless scheduling problem with prespecified paths in a graph G to a new routing problem in a related graph G'. The problem in G' is solved using buffers, and the solution in G' is emulated in a deterministic bufferless manner in G to give the final bufferless schedule. Buffering in G' is emulated by packet circulation in regions of G.
(2007 Jrnl) Malik Magdon-Ismail, and Joseph Sill "A Linear Fit Gets the Correct Monotonicity Directions", Machine Learning, Volume 70, Number 1 / January, 2008, pages 21-43. Journal Version: postscript , pdf. Conference Version: Conference on Learning Theory (COLT), 2003, postscript, pdf.
Summary: It is often the case that learning with a monotonicity constraint is appropriate, however the directions of the monotonicity constraints may not be available in a general d-dimensional setting. One approach is to use a simple learning model to learn the monotonicity constraints, which can then be applied to constrain a more complex learning model. We show that the optimal linear fit can extract the correct monotonicity directions in one dimension, which remains true in multi-dimensions provided that the input distribution is of a Mahalanobis type. The same results remain true asymptotically when the the OLS estimator is used instead of the optimal linear fit.
(2007 Jrnl) Sibel Adali, Brandeis Hill, Malik Magdon-Ismail, "Information vs. Robustness in Rank Aggregation: Models, Algorithms and a Statistical Framework for Evaluation", (accepted to appear in) Journal of Digital Information Management (JDIM), special issue on Web Information Retrieval pdf.
Summary: We develop a statistical framework for studying ranker aggregation and a new algorithm for ranker aggregation based on a Kernigan-Lin style iterative best flip approach. We give experimental results for varying amounts of noise in the rankers as well as dis-similarity in the rankers (mis-information). In this space the optimal ranker can vary significantly, illustrating an information vs. robustness tradeoff for ranker aggregation.
(2007 Conf) Hung-Ching (Justin) Chen, Mark Goldberg, Malik Magdon-Ismail and Al Wallace, "Reverse Engineering an Agent-Based Hidden Markov Model for Complex Social Systems", Proc. International Conference on Intelligent Data Engineering and Automated Learning (IDEAL), pages 940-949, December 16-19, Birmingham, UK, 2007. postscript, pdf. Slides: postscript, pdf.
(2007 Conf) Hung-Ching (Justin) Chen, Mark Goldberg, Malik Magdon-Ismail and Al Wallace, "Discover The Power of Social and Hidden Curriculum to Decision Making: Experiments with Enron Email and Movie Newsgroups", Proc. International Conference on Machine Learning and Applications (ICMLA), pages 154--159, December 13-15, 2007, Cincinnati, Ohio, 2007. postscript, pdf. Slides: postscript, pdf.
(2007 Conf) Travis Desell, Nathan Cole, Malik Magdon-Ismail, Heidi Newberg, Bolek Szymanski, Carlos Varela "Distributed and Generic Maximum Likelihood Evaluation", [Best paper Award, 2nd place] Proc. 3rd International Conference on e-Science and Grid Computing, pp , Bangalore, India, Dec 10-13, 2007. pdf. Slides: pdf.
(2007 Conf) Hung-Ching (Justin) Chen, Mark Goldberg, Malik Magdon-Ismail and Al Wallace, "Learning What Makes a Society Tick", Proc. Worksop on Optimization-based Data Mining Techniques with Applications at the Seventh IEEE International Conference on Data Mining (ICDM), pages 195--200, October 28-31, Omaha, Nebraska, 2007. pdf. Slides: postscript, pdf.
(2007 Conf) Fikret Sivrikaya, Costas Busch, Malik Magdon-Ismail, Bulent Yener "ASAND: Asynchronous Slot Assignment and Neighbor Discovery Protocol for Wireless Networks", OPNETWORK 2007, August 27-31,Washington DC. postscript, pdf. Slides: powerpoint,
(2007 Conf) Hung-Ching (Justin) Chen, Mark Goldberg, Malik Magdon-Ismail and Al Wallace, "Inferring Agent Dynamics from Social Communication Network", Proc. Joint 9th Web Mining (WebKDD) and 1st Social Network Analysis (SNA-KDD) Workshop and KDD 2007, August 12-15, San Jose, CA, 2007. postscript, pdf. Slides: ppt,
(2007 Conf) Hung-Ching (Justin) Chen, Mark Goldberg, Malik Magdon-Ismail and Al Wallace, "Extracting Agent Based Models from a Social Communication Network", Conference of Agent-based Modelers and Agent-based Modeling Platform Users (SwarmFest 2007), July 12-14, Chicago, IL USA, 2007. abstract(pdf), Slides: ppt,
(2007 Conf) Yingjie Zhou, Mark Goldberg, Malik Magdon-Ismail and Al Wallace, "Strategies for Cleaning Organizational Emails with an Application to Enron Email Dataset", 5th Conf. of North American Association for Computational Social and Organizational Science (NAACSOS 07), Emory - Atlanta, Georgia , USA. June 7-9, 2007. postscript, pdf. Slides: ppt,
(2007 Conf) Jeffery Baumes, Mark Goldberg, Mykola Hayvanovych, Stephen Kelley, Malik Magdon-Ismail Konstantin Mertsalov and Al Wallace, "SIGHTS: A Software System for Finding Coalitions and Leaders in a Social Network", Proc. 5th Conference on Intelligence and Security Informatics (ISI), May 23-24, Rutgers, NJ, 2007. postscript, pdf. Slides: postscript, pdf.
(2008 Conf) Baumes, Jeffery, Goldberg, Mark K., Malik Magdon-Ismail, Wallace, William "Identification of Hidden Groups in Communications", Invited Book Chapter, in "Handbooks in Information Systems 2 (National Security)", Elsevier, eds. , pages 209--242. ps (preprint) , pdf (preprint). Technical Report

2006:

(2006 Jrnl) Victor Boyarshinov, Malik Magdon-Ismail, "Efficient Optimal Linear Boosting of a Pair of Classifiers", IEEE Transactions on Neural Networks, Vol. 18, No. 2, pages 317-328, 2007. postscript, pdf.
Summary: We relate the problem of boosting a pair of classifiers to an optimal weighted linear separation problem in two dimensions. We give efficient algorithms to perform this separation and compute the leave one out error. The main idea is to carefully enumerate the possible separators to finally obtain an O(n^2logn) algorithm, where n is the number of data points.
(2006 Jrnl) Costas Busch, Malik Magdon-Ismail, Marios Mavronicolas, Paul Spirakis "Direct Routing: Algorithms and Complexity", Algorithmica, Volume 45, Number 1, Pages 45--68, June 2006. (Invited submission from ESA 2004). Journal Version: postscript , pdf. Conference Version: European Symposium on Algorithms (ESA), 2004, postscript , pdf.
Summary: We consider the direct routing problem, in which packets must follow pre-specified paths in a bufferless network. In this routing model, the only parameter to be determined for a packet is the injection time. We give a general greedy algorithm which has routing time O(C D), and we show that there are routing problems where this is the best achievable by a direct algorithm. We show that optimal direct routing is NP-hard to approximate by gap preserving reduction from graph coloring. We give versions of the greedy algorithm which have optimal or near optimal routing time for trees, multi-dimensional meshes, hypercubes and the butterfly. Finally, we use hard direct routing problems to obtain lower bounds on the buffering requirements of any algorithm that requires packets to follow pre-specified paths. If such algorithms obtain near optimal routing time, then packets must be buffered Omega(N^{1/3}) times (on average), where N is the number of packets.
(2006 Conf) Hung-Ching (Justin) Chen, Malik Magdon-Ismail, "Learning Martingale Measures From High Frequency Finiancial Data to Help Option Pricing", 5th International Conference on Computational Intelligence in Economics and Finance (CIEF 2006) in conjunction with 9th Joint Conference on Information Sciences (JCIS 2006) October 8 - 11, Kaohsiung, Taiwan. postscript, pdf. Slides: postscript, pdf.
(2006 Conf) Hung-Ching (Justin) Chen, Malik Magdon-Ismail, "NN-OPT: Neural Networks for Option Pricing Using Multinomial Trees", The 13th International Conference on Neural Information Processing (ICONIP2006), October 3-6, Hong Kong. postscript, pdf. Slides: postscript, pdf.
(2006 Conf) Matthew Francisco, Jeffery Baumes, Hung-Ching Chen, Mark Goldberg, Malik Magdon-Ismail and Al Wallace, "Using Agent-Based Modeling to Traverse Frameworks in Theories of the Social", International Conference on Complex Systems (ICCS06), Boston, MA, June 25-30, 2006. postscript, pdf. Slides: postscript, pdf.
(2006 Conf) Jeffery Baumes, Hung-Ching Chen, Matthew Francisco, Mark Goldberg, Malik Magdon-Ismail and Al Wallace, "Modeling the Cultural Subjectivity: Towards Computational Critique", 4th Conf. of North American Association for Computational Social and Organizational Science (NAACSOS 06), Notre-Dame, Indiana, June 22-23, 2006. postscript, pdf. Slides: postscript, pdf.
(2006 Conf) Petros Drineas, Asif Javed, Malik Magdon-Ismail, Gopal Pandurangan, Reino Virrankoski, Andreas Savvides "Distance Matrix Reconstruction from Incomplete Distance Information for Sensor Network Localization", Third Annual IEEE Communications Society Conference on Sensor, Mesh and Ad Hoc Communications and Networks (IEEE SECON06), Sep. 25-28, Reston, VA, USA. postscript, pdf.
(2006 Conf) Ali Civril, Malik Magdon-Ismail and Eli Bocek-Rivele, "SSDE: Fast Graph Drawing Using Sampled Spectral Distance Embedding", 14th International Symposium on Graph Drawing (GD2006), Karlsruhe, Germany, Sept 18-20, 2006. postscript, pdf. Slides: powerpoint.
(2006 Conf) Jeffery Baumes, Hung-Ching Chen, Matthew Francisco, Mark Goldberg, Malik Magdon-Ismail, and Al Wallace "Social Capital Experiments", 4th Conf. of North American Association for Computational Social and Organizational Science (NAACSOS 06), Notre-Dame, Indiana, June 22-23, 2006. postscript, pdf. Slides: ps, pdf, powerpoint,
(2006 Conf) Jeffery Baumes, Mark Goldberg, Mykola Hayvonovych, Malik Magdon-Ismail, William Wallace, Mohammed Zaki, "Finding Hidden Group Structure in a Stream of Communications", [Top 3 Paper Award] Proceedings of the 4th Symposium on Intelligence and Security Informatics (ISI 06), San Diego, CA, May 23-24 2006. postscript, pdf. Slides: ps, pdf,
(2006 Conf) Costas Busch, Malik Magdon-Ismail, "Atomic Routing Games on Maximum Congestion", Proceedings of the 2nd International Conference on Algorithmic Aspects in Information and Management (AAIM), Hong Kong, June 22-23, 2006. postscript, pdf. Slides: postscript, pdf.
(2006 Conf) Sibel Adali, Brandeis Hill and Malik Magdon-Ismail, "The Impact of Ranker Quality on Ranker Aggregation Algorithms: Information vs. Robustness", Proc. ICDE Workshop on Challenges in Web Information Retrieval and Integration (WIRI), pp 10-19, Atlanta, Georgia, April 3, 2006. postscript, pdf. Slides: postscript, pdf.

2005:

(2005 Jrnl) Malik Magdon-Ismail, Fikret Sivrikaya, Bulent Yener "Joint Problem of Power Optimal Connectivity and Coverage in Wireless Sensor Networks", Wireless Networks, Vol. 13, No. 4, pages 537-550, 2006. postscript , pdf.
Summary: We model a node using a Markovian automaton which transitions between various modes which offer different capabilities and use power at different states. We develop a mean field theoretic analysis for the behavior of the system, which we optimize with respect to power consumption using connectivity and coverage as constraints. We apply the methodology to a 3 state automaton with off, listen and transmit states. We give results comparing the mean field theory analysis with a simulation, and we compare our approach with existing power saving approaches, which do not necessarily offer connectivity and coverage.
(2005 Jrnl) Victor Boyarshinov, Malik Magdon-Ismail "Linear Time Isotonic and Unimodal Regression in the L1 and Linfinity Norms", Journal of Discrete Algorithms, Vol. 4, No. 4, pages 676-691, 2006. postscript , pdf.
Summary: We give linear time algorithms for (unweighted) isotonic and unimodal regression in the Linfinity norm. For the L1 norm, we give linear time algorithms when the output values are in a bounded, finite set, and an approximation algorithm when the output values are in a bounded range. An open question that remains is to construct linear time isotonic regression in the L1 norm for the arbitrary case.
(2005 Jrnl) Costas Busch, Malik Magdon-Ismail and Mukkai Krishnamoorthy "Hardness Results for Cake-Cutting", Bulletin of the European Association for Theoretical Computer Science, Number 86, pp. 85-106, June 2005. Journal Version: postscript , pdf. Conference Version: Symposium on the Theoretical Aspects of Computing (STACS), 2003, postscript, pdf.
Summary: We consider the algorithm aspects of fair division of a resource, in which limited input from the users may be obtained regarding how much they value a certain part of the resource. If there are n users, it is known that a fair division of the resource (one in which no user thinks they have a less than 1/n of the resource) can be obtained using n logn cuts of the resource (which is modeled as the unit interval). No matching lower bound is available. We take a first step in this direction by showing (through reduction from sorting) that Omega(n logn) comparisons are made by any fair division algorithm. We also consider strong envy-free division (in which no user thinks she has a lesser share than any other user), and give lower bounds of Omega(n^2) on the number of cuts that must be made.
(2005 Conf) Hung-Ching (Justin) Chen, Malik Magdon-Ismail, "Learning Martingale Measures to Price Options", 1st Workshop on Machine Learning in Finance at NIPS 2005 Vancouver/Whistler, Dec 9, 2005. postscript, pdf. Slides: postscript, pdf.
(2005 Conf) Ali Civril, Malik Magdon-Ismail and Eli Bocek-Rivele, "SDE: Graph Drawing Using Spectral Distance Embedding", 13th International Symposium on Graph Drawing (GD05), Limerick, Ireland, Sept 12-14, 2005. postscript, pdf. Slides: postscript, pdf. powerpoint. Technical Report: postscript, pdf.
(2005 Conf) Sibel Adali, Tina Liu and Malik Magdon-Ismail, "Optimal Link Bombs are Uncoordinated", First International Workshop on Adversarial Information Retrieval on the Web (AIRWeb 05) at the 14th International World Wide Web Conference (WWW2005), Chiba, Japan, 10-14 May, 2005. postscript, pdf. Slides: ps, pdf, powerpoint,
(2005 Conf) Jeffery Baumes, Hung-Ching Chen, Matthew Francisco, Mark Goldberg, Malik Magdon-Ismail, and Al Wallace "Dynamics of Bridging and Bonding in Social Groups, a Multi-Agent Model", 3rd Conf. of North American Association for Computational Social and Organizational Science (NAACSOS 05), Notre-Dame, Indiana, June 26-28, 2005. postscript, pdf. Slides: ps, pdf, powerpoint,
(2005 Conf) Jeffery Baumes, Mark Goldberg, Malik Magdon-Ismail, "Efficient Identification of Overlapping Communities", IEEE International Conference on Intelligence and Security Informatics (ISI 05), Atlanta, Georgia, May 19-20 2005. postscript, pdf. Slides: ps, pdf, powerpoint,
(2005 Conf) Costas Busch, Malik Magdon-Ismail, Jing Xi "Oblivious Routing on Geometric Networks", 17th ACM Symp. on Parallelism in Alg. and Arch. (SPAA) 2005, July 17-20, Las Vegas, Nevada, USA. postscript, pdf. Slides: ps, pdf.
(2005 Conf) Costas Busch, Shailesh Kelkar, Malik Magdon-Ismail "Efficient Bufferless Routing on Leveled Networks", EURO-PAR 2005, Aug 30-Sep 2, Lisboa, Portugal. postscript, pdf. Slides: postscript, pdf.
(2005 Conf) Jonathan Purnell, Malik Magdon-Ismail, Heidi Newberg "A Probabilistic Approach to Finding Geometric Objects in Spatial Datasets of the Milky Way", 15th International Symposium on Methodoligies for Intelligent Systems (ISMIS 2005), May 25-28, Saratoga Springs, NY, USA. postscript, pdf. Slides: ps, pdf, powerpoint,
(2005 Conf) Mark Goldberg, David Hollinger, Malik Magdon-Ismail "Experimental Evaluation of the Greedy and Random Algorithms for Finding Independent Sets in Random Graphs", 4th International Workshop on Efficient and Experimental Algorithms (WEA 2005), May 10-13, Santorini, Greece. postscript, pdf. Slides: postscript, pdf.
(2005 Conf) Costas Busch, Malik Magdon-Ismail, Jing Xi "Optimal Oblivious Path Selection on the Mesh", International Parallel and Distributed Processing Symposium (IPDPS 2005), Apr 3-8, Denver, Colorado, USA. postscript, pdf. Slides: postscript, pdf.
(2005 Conf) Jeffrey Baumes, Mark Goldberg, Mukkai Krishnamoorthy, Malik Magdon-Ismail, Nathan Preston "Finding Communities by Clustering a Graph into Overlapping Subgraphs", International Conference on Applied Computing (IADIS 2005), Feb 22-25, Algarve, Portugal. postscript, pdf, doc. Slides: ps, pdf, powerpoint,
(2005 Conf) Seyit Ahmet Camtepe, Mark Goldberg, Malik Magdon-Ismail, Mukkai Krishnamoorthy "Detecting Conversing Groups of Chatters: A Model, Algorithms and Tests", International Conference on Applied Computing (IADIS 2005), Feb 22-25, Algarve, Portugal. postscript, pdf, doc. Slides: ps, pdf, powerpoint,

2004:

(2004 Jrnl) Malik Magdon-Ismail, Amir F. Atiya, "Maximum Drawdown", Risk Magazine, Volume 17, Number 10, pages 99-102, October, 2004. postscript (preprint) , pdf (preprint). Print Version.
Summary: We use some recent results on the behavior of the Maximum Drawdown of a Brownian motion to develop a scaling law for the Maximum Drawdown which allows one to compare the Sterling/Calmar ratios of trading strategies when the historical data available for the different trading strategies extends over different lengths of time.
(2004 Jrnl) Malik Magdon-Ismail, Amir F. Atiya, Amrit Pratap, and Yaser S. Abu-Mostafa "On the Maximum Drawdown of a Brownian Motion", Journal of Applied Probability, Volume 41, Number 1, pages 147-161, March, 2004. Journal Version: postscript (preprint) , pdf (preprint). Print Version. Conference Version: Conference on Computational Intelligence for Financial Engineering (CIFEr), 2003, postscript, pdf.
Summary: We develop the distribution for the Maximum Drawdown (largest drop from a peak to a bottom) of a Brownian motion. We compute the expected value as an infinite integral series. We reduce this function to a single "universal" function, which can be computed numerically once, and used to compute the expected drawdown for any Brownian motion. We compute the asymptotic behavior of this infinite series to reveal three regimes for the behavior of the MDD -- logarithmic, square-root and linear, depending on the sign of the drift parameter in the Brownian motion.
(2004 Conf) Costas Busch, Malik Magdon-Ismail, Fikret Sivrikaya, Bulent Yener "Contention-Free MAC Protocols for Wireless Sensor Networks", 18th Annual Conference on Distributed Computing (DISC 2004), Oct 4-7, Amsterdam, the Netherlands. postscript, pdf. Slides: ps, pdf, powerpoint,
(2004 Conf) Costas Busch, Malik Magdon-Ismail, Marios Mavronicolas, "Universal Bufferless Routing", Workshop in On-line Algorithms (with ESA 2004 in conjunction with ALGO 2004), Sept 14-17, Bergen, Norway. postscript, pdf. Slides: ps, pdf, powerpoint,
(2004 Conf) Costas Busch, Malik Magdon-Ismail, Marios Mavronicolas, Paul Spirakis "Direct Routing: Algorithms and Complexity", 12th European Symposium on Algorithms (ESA 2004 in conjunction with ALGO 2004), Sept 14-17, Bergen, Norway. postscript, pdf. Slides: postscript, pdf.
(2004 Conf) Costas Busch, Malik Magdon-Ismail, Marios Mavronicolas, Roger Wattenhofer "Near-Optimal Hot Potato Routing on Trees", EURO-PAR 2004, 31 Aug-3 Sept, Pisa, Italy. postscript, pdf. Slides: ps, pdf, powerpoint,
(2004 Conf) Hung-Ching Chen, Mark Goldberg, Malik Magdon-Ismail, "Identifying Multi-ID users in Open Forums", 2nd NSF/NIJ Symposium on Intelligence and Security Informatics (ISI 04), Tucson, AZ, June 11-12 2004. postscript, pdf. Slides: ps , pdf , powerpoint ,
(2004 Conf) Jeffery Baumes, Mark Goldberg, Malik Magdon-Ismail, William Wallace "Discovering Hidden Groups in Communication Networks", 2nd NSF/NIJ Symposium on Intelligence and Security Informatics (ISI 04), Tucson, AZ, June 11-12 2004. postscript, pdf. Slides: ps, pdf, powerpoint,

2003:

(2003 Jrnl) Malik Magdon-Ismail and Amir F. Atiya "A Maximum Likelihood Approach to Variance Estimation of a Brownian Motion Using the High, Low and Close", Quantitative Finance, volume 3, issue 5, pages 376 - 384, August 2003. Pre-print versions: postscript , pdf. Print version
Summary: Since joint distribution of the high, low and close (conditioned on the open, drift and variance parameters) can be computed, we use this distribution to develop a likelihood for observing the high, low and close conditioned on the drift and variance parameters. By maximizing this likelihood, we construct an estimate of the variance parameter, which we compare to other well known estimators of variance parameter. Maximizing the likelihood yields a systematic gain in performance.
(2003 Conf) Malik Magdon-Ismail and Joseph Sill "Using a Linear Fit to Determine Monotonicity Directions", The 16th Annual Conference on Learning Theory (COLT 2003), Washington, DC, USA, August 24-27 2003. ps pdf Slides: postscript, pdf.
(2003 Conf) Mark Goldberg, Paul Horn, Malik Magdon-Ismail, Jessie Riposo, David Siebecker and William Wallace Bulent Yener "Statistical Modeling of Social Groups on Communication Networks", First conference of the North American Association for Computational Social and Organizational Science (CASOS 03), Pittsburgh PA, June 22-25, 2003. postscript, pdf. Slides: postscript, pdf.
(2003 Conf) Malik Magdon-Ismail, Mark Goldberg, David Siebecker and William Wallace "Locating Hidden Groups in Communication Networks Using Hidden Markov Models ", NSF/NIJ Symposium on Intelligence and Security Informatics (ISI 03), Tucson, AZ, June 2-3 2003. postscript, pdf. Slides: postscript, pdf.
(2003 Conf) Malik Magdon-Ismail, Amir Atiya, Amrit Pratap, and Yaser S. Abu-Mostafa "The Maximum Drawdown of the Brownian Motion", International Conference on Computational Intelligence for Financial Engineering (CIFEr 03), Hong Kong, March 2003. postscript, pdf. Slides: postscript, pdf.
(2003 Conf) Malik Magdon-Ismail "Pricing the American Put Using a New Class of Tight Lower Bounds", International Conference on Computational Intelligence for Financial Engineering (CIFEr 03), Hong Kong, March 2003. postscript, pdf. Slides: postscript, pdf.
(2003 Conf) Malik Magdon-Ismail, Costas Busch and Mukkai Krishnamoorthy "Cake-Cutting is Not A Piece of Cake", Symposium on the Theoretical Aspects of Computer Science (STACS 03), Berlin, February 27 - March 1, 2003. postscript, pdf. Slides: ps, pdf, powerpoint,

2002:

(2002 Jrnl) Malik Magdon-Ismail and Amir F. Atiya "Density Estimation and Random Variate Generation Using Multi-Layer Networks", IEEE Transactions on Neural Networks, Volume 13, Number 3, pp 497--520, May 2002. Journal Version: postscript , pdf. Conference Versions: Advances in Neural Information Processing Systems (NIPS), 1998, postscript, pdf. (Density Estimation) Neural Networks for Signal Processing (NNSP99), 1999 postscript, pdf. (Control Theory Formulation of Random Variate Generation)
Summary: We consider density estimation and random variate generation using neural networks. For density estimation, we give two methods, both based on learning the sample cumulative distribution function, one is randomized (SLC) and one is a deterministic interpolation using a smoothness constraint. We prove that the randomized algorithm converges to the deterministic version, asymptotically, however practically, randomization offers an additional smoothness property. We prove convergence of the deterministic algorithm to the true density given bounded derivative constraints on the true density.
(2002 Conf) Malik Magdon-Ismail, Hung-Ching (Justin) Chen and Yaser S. Abu-Mostafa "The Multilevel Classification Problem and a Monotonicity Hint", Intelligent Data Engineering and Learning, Third International Conference, Manchester, UK, August, 2002. Springer. postscript, pdf. Slides: postscript, pdf.
(2002 Conf) Malik Magdon-Ismail, Yu Shao, Daniel Freedman and Chris Bystroff "Compressing Protein Conformational Space", ISMB02: Tenth International Conference on Intelligent Systems for Molecular Biology, August 2002. postscript, pdf. (submitted, rejected)
(2002 Conf) Yu Shao, Malik Magdon-Ismail, Daniel Freedman, Srinivas Akella, Mohammed Zaki and Chris Bystroff "Compressing Protein Conformational Space", RECOMB02: Sixth International Conference on Research in Computational Molecular Biology, April 2002 (poster). postscript, pdf. (abstract)
(2002 BkChap) Malik Magdon-Ismail "Maximum Likelihood Estimation (MLE)", Book Chapter, to appear in "Encyclopedia of Financial Engineering and Risk Management", commissioning editor Gillian Lindsey, Fitzroy Dearborn Publishers, 2002. postscript , pdf.
(2002 BkChap) Malik Magdon-Ismail "Ordinary Least Squares (OLS)", Book Chapter, to appear in "Encyclopedia of Financial Engineering and Risk Management", commissioning editor Gillian Lindsey, Fitzroy Dearborn Publishers, 2002. postscript , pdf.
(2002 BkChap) Malik Magdon-Ismail "Expected Value / Mathematical Expectation", Book Chapter, to appear in "Encyclopedia of Financial Engineering and Risk Management", commissioning editor Gillian Lindsey, Fitzroy Dearborn Publishers, 2002. postscript , pdf.

2001:

(2001 Jrnl) Malik Magdon-Ismail "The Equivalent Martingale Measure: An Introduction to Pricing Using Expectations", IEEE Transactions on Neural Networks, Volume 12, Number 4, pp 684-693, July 2001. postscript , pdf.
Summary: We give an introduction to the Martingale measure, a tool for pricing options, which forml a link between pricing of financial derivatives and Monte Carlo simulation. We use this approach together with a limiting argument to give an elementary derivation of the price of the European call option, and discuss American options.
(2001 Jrnl) Eric Breimer, Mark K. Goldberg, Brian Kolstad and Malik Magdon-Ismail "On the Height of a Random Set of Points in a d-Dimensional Unit Cube", Journal of Experimental Mathematics, Volume 10, Number 4, pp 583-597, 2001. Journal Version: postscript , pdf, source code. Conference Version: Workshop on Algorithm Engineering and Experiments (ALENEX), 2001, postscript, pdf.
Summary: We give give efficient algorithms for estimating the height (maximum antichain) of a set of random points in a d-dimensional unit cube. The algorithms are based on a result we show which states that it suffices to consider only points near the diagonal of the cube. We develop efficient algorithms for generating these points directly, together with an efficient algorithm for constructing the height of this set. We use the co-convergence of estimates from different sized regions around the diagonal to the same value too develop an improved estimate. As a result, we give the first accurate estimates of c_d, the coefficients governing the convergence rate.
(2001 Conf) Mark K. Goldberg, Darren T. Lim and Malik Magdon-Ismail "A Learning Algorithm for String Assembly", BIOKDD01: Workshop on Data Mining in Bioinformatics (with SIGKDD01 Conference), August 2001. postscript, pdf. Slides: postscript, pdf.
(2001 Conf) Eric Breimer, Mark K. Goldberg, Brian Kolstad and Malik Magdon-Ismail "Experimental Evaluation of the Height of a Random Set of Points in a d-Dimensional Cube", 3rd Workshop on Algorithm Engineering and Experiments (ALENEX), January 2001. postscript, pdf. Slides: postscript, pdf.

2000:

(2000 Jrnl) Malik Magdon-Ismail and Amir F. Atiya "The Early Restart Algorithm", Neural Computation, Volume 12, Number 6, pp 1303-1313, June 2000. postscript , pdf.
Summary: We give an analysis of the early restart phenomenon which can be used to speed up an algorithm whose running time is a function of the initial conditions. For some initial conditions, the runtime may be excessively long or unbounded (for example local minima in an optimization). Given some probability distribution over which the initial conditions are sampled, there is some distribution for the stopping time of the process. We give a analysis of how using restart can improve the expected runtime, and develop a condition for selecting the optimal restart time. Essentially, if the algorithm has run for long enough, it may be favorable to restart the algorithm.
(2000 Jrnl) Malik Magdon-Ismail "No Free Lunch for Noise Prediction", Neural Computation, Volume 12, Number 3, pp547-565, March 2000. postscript , pdf.
Summary: We show that if the noise is additive, then the prior and posterior distributions of the noise are equal when the prior over target functions is uniform. There is no free lunch in the sense that if no useful assumptions can be made about the target function, then it is not possible gain any extra information regarding the nature of the noise.
(2000 Conf) Malik Magdon-Ismail and Amir F. Atiya "Volatility Estimation Using High, Low and Close Data - A Maximum Likelihood Approach", Computational Finance (CF2000), Proceedings, June 2000. postscript, pdf.
(2000 Conf) Malik Magdon-Ismail Amir F. Atiya, Yaser Abu-Mostafa "Pricing the quality option for the bond futures contract in a multifactor Vasicek framework", Proceedings of the 16th IMACS World Congress, Lausanne, Switzerland, August 2000. postscript, pdf.

1999 and before:

(1998 Jrnl) Malik Magdon-Ismail, Alexander Nicholson and Yaser Abu-Mostafa, "Financial Markets, Very Noisy Information Processing", Proceedings of the IEEE, Special Issue on Intelligent Signal Processing, Volume 86, Number 11, pp 2184-2195, November 1998. postscript , pdf. addendum to definition A.1, postcript , pdf.
Summary: We give a general characterization of the performance of a general class of learning systems. Specifically, when the learning system is stable (satisfies certain regularity conditions) and the noise is additive with bounded variance and the input distribution has compact support, the test performance approaches the optimal test performance at a universal rate of O(1/N), where N is the number of data points.
(1999 Jrnl) Zehra Cataltepe, Yaser Abu-Mostafa and Malik Magdon-Ismail "No Free Lunch for Early Stopping", Neural Computation, Volume 11, Number 4, pp 995-1010, May 1998. postscript , pdf.
Summary: We show that the success of early stopping is an artifact of training dynamics which tends to initialize the weights at smaller values and so the resulting function tends to have smaller magnitude weights, and hence tends to be smoother, which corresponds to having a smoothness prior on the target functions. When no such hidden prior assumption is made, i.e., when early stopping chooses uniformily among all hypotheses with the same (higher than optimal) error, we show that the resulting classifier has performance which is monotonically decreasing in the value of the early stopping error. Thus there is No Free Lunch for early stopping, in that it can only be succesfull if some bias is given toward the target function prior -- which means that some prior on the target function is necessary.
(1998 Jrnl) Malik Magdon-Ismail and Yaser Abu-Mostafa, "Validation of Volatility Models", Journal of Forecasting, Volume 17, pp 349-368, 1998. Journal Version: postscript , pdf. Conference Version: Conference on Neural Networks in the Capital Markets (NNCM), 1996. postscript, pdf.
Summary: We analyse some systematic errors in the prediction of volatility using Maximum Likelihood methods. In particular, there is a systematic underprediction even when the mean is predicted well. Further, Maximum Likelihood methods can systematically select the wrong model over the correct model, when faced with a choice of two models. We develop an expression for a correction factor for systematically correcting for this systematic underprediction.
(1999 Conf) Malik Magdon-Ismail and Amir F. Atiya "A Control Formulation for Random Variate Generation", Neural Networks for Signal Processing (NNSP99) IX, Proceedings of the 1999 IEEE Workshop, eds. Yu-Hen Hu, Jan Larsen, Elizabeth Wilson, Scott Douglas, August 1999. postscript, pdf.
(1999 Conf) Malik Magdon-Ismail and Amir Atiya, "A Bayesian Approach to Estimating Mutual Fund Returns", Computational Finance (CF99), eds Yaser S. Abu-Mostafa, Blake LeBaron, Andrew W. Lo and Andreas S. Weigend, MIT Press, 1999. postscript, pdf.
(1998 Conf) Malik Magdon-Ismail and Amir Atiya, "Neural Networks for Density Estimation", Advances in Neural Information Processing Systems (NIPS) 11, Proceedings of the 1998 Conference, pp522-528, eds. Michael S. Kearns, Sara A. Solla and David A. Cohen, MIT Press, 1998. postscript, pdf.
(1998 Conf) Malik Magdon-Ismail and Amir Atiya "Neural Networks for Density Estimation in Financial Markets", Intelligent Data Engineering and Learning, First International Symposium, pp 171-178, eds. L. Xu, L. W. Chan, I. King and A. Fu, Springer, October, 1998. postscript, pdf.
(1998 Conf) Malik Magdon-Ismail, Alexander Nicholson and Yaser Abu-Mostafa, "Estimating Model Limitation in Financial Markets", Intelligent Data Engineering and Learning (IDEAL), First International Symposium, pp 19-26, eds. L. Xu, L. W. Chan, I. King and A. Fu, Springer, October, 1998. postscript, pdf.
(1997 Conf) Zehra Cataltepe and Malik Magdon-Ismail "Incorporating Test Inputs into Learning", Advances in Neural Information Processing Systems (NIPS) 10, Proceedings of the 1997 Conference, pp 437-443, eds. Michael I. Jordan, Michael J. Kearns and Sara A. Solla, MIT Press, 1997. postscript, pdf.
(1996 Conf) Malik Magdon-Ismail and Yaser Abu-Mostafa, "Systematic Underprediction of Volatility in Maximum Likelihood Methods", Decision Technologies for Financial Engineering, Proceedings of the Fourth International Conference on Neural Networks in the Capital Markets, NNCM 96 (currently Computational Finance), pp 125-137, eds. Andreas S. Weigend, Yaser S. Abu-Mostafa and A.-Paul N. Refenes, World Scientific, 1996. postscript, pdf.
(1998 BkChap) Malik Magdon-Ismail, Alexander Nicholson and Yaser Abu-Mostafa, "Learning in the Presence of Noise", Book Chapter, in "Intelligent Signal Processing", eds. Simon Haykin and Bart Kosko, IEEE Press, 2001. postscript , pdf.
(1998 Thesis) Malik Magdon-Ismail, "Supervised Learning in Probabilistic Environments", 1998. (pdf)

The IEEE provisional copyright policies require the following message be posted:

This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.