DAI Group | Publications

Publications


2024

Sarah Alnegheimish, Linh Nguyen, Laure Berti-Equille, Kalyan Veeramachaneni
Large language models can be zero-shot anomaly detectors for time series?
Preprint. May 2024.
Alexandra Zytek, Sara Pido, Kalyan Veeramachaneni
LLMs for XAI: Future Directions for Explaining Explanations
ACM CHI 2024, HCXAI. Workshop on Human-Centered Explainable AI, May 2024.
Lei Xu, Sarah Alnegheimish, Laure Berti-Equille, Alfredo Cuesta-Infante, Kalyan Veeramachaneni
Single Word Change is All You Need: Designing Attacks and Defenses for Text Classifiers
Preprint. January 2024.

Theses

Grace Y. Song
Modeling Control Signals for Reconstruction-based Time Series Anomaly Detection
M.Eng Thesis. Dept. of EECS, MIT. May 2024.
Wei-En Warren Wang
Why Did the Prediction Change? Explaining Changes in Predictions as Time Progresses
M.Eng Thesis. Dept. of EECS, MIT. Feb 2024.
Guanpeng Andy Xu
SigPro: Enabling Subject Matter Expert Guidance in Feature Engineering
M.Eng Thesis. Dept. of EECS, MIT. Feb 2024.

2023

Alexandra Zytek, Wei-En Wang, Dongyu Liu, Laure Berti-Equille, Kalyan Veeramachaneni
Pyreal: A Framework for Interpretable ML Explanations
Preprint. December 2023.
Alexandra Zytek, Wei-En Wang, Sofia Koukoura, Kalyan Veeramachaneni
Lessons from Usable ML Deployments and Application to Wind Turbine Monitoring
NeurIPS XAI in Action. Workshop on XAI in Action: Past, Present, and Future Applications, December 2023.
Sarah Alnegheimish, Laure Berti-Equille, Kalyan Veeramachaneni
Making the End-User a Priority in Benchmarking: OrionBench for Unsupervised Time Series Anomaly Detection
Preprint. 26 Oct 2023.

Theses

Nassim Oufattole
Towards Creating Synthetic Data Testbeds for Research
S.M Thesis. Dept. of EECS, MIT. June 2023.
Frances R. Hartwell
Zephyr: a Data-Centric Framework for Predictive Maintenance of Wind Turbines
M.Eng Thesis. Dept. of EECS, MIT. February 2023.

2022

Lawrence Wong, Dongyu Liu, Laure Berti-Equille, Sarah Alnegheimish, Kalyan Veeramachaneni
AER: Auto-Encoder with Regression for Time Series Anomaly Detection
BigData-2022. In Proceedings of IEEE International Conference on Big Data, December 2022.
Lei Xu, Alfredo Cuesta-Infante, Laure Berti-Equille, Kalyan Veeramachaneni
R&R: Metric-guided Adversarial Sentence Generation
AACL-2022. In Findings of the Association for Computational Linguistics: AACL-IJCNLP, November 2022.
Shubhra Kanti Karmaker Santu, Md. Mahadi Hassan, Micah J. Smith, Lei Xu, ChengXiang Zhai, Kalyan Veeramachaneni
AutoML to Date and Beyond: Challenges and Opportunities
CSUR-2022. In ACM Computing Surveys, November 2022.
Dongyu Liu, Sarah Alnegheimish, Alexandra Zytek, Kalyan Veeramachaneni
MTV: Visual Analytics for Detecting, Investigating, and Annotating Anomalies in Multivariate Time Series
CSCW-2022. In Proceedings of the ACM Conference On Computer-Supported Cooperative Work And Social Computing, October 2022.
Lei Xu, Laure Berti-Equille, Alfredo Cuesta-Infante, Kalyan Veeramachaneni
In Situ Augmentation for Defending Against Adversarial Attacks on Text Classifiers
ICONIP-2022. In Proceedings of the International Conference on Neural Information Processing, November 2022.
AdvML-2022. In KDD Workshop on Adversarial Learning Methods for Machine Learning and Data Mining, August 2022.
Alexandra Zytek, Ignacio Arnaldo, Dongyu Liu, Laure Berti-Equille, Kalyan Veeramachaneni
The Need for Interpretable Features: Motivation and Taxonomy
KDD Explorations. June 2022.
Sarah Alnegheimish, Dongyu Liu, Carles Sala, Laure Berti-Equille, Kalyan Veeramachaneni
Sintel: A Machine Learning Framework to Extract Insights from Signals
SIGMOD-2022. In Proceedings of International Conference on Management of Data, June 2022.

Theses

Alicia (Yi) Sun
Algorithmic Fairness in Sequential Decision Making
Ph.D. Thesis. Dept. of IDSS, MIT. October 2022.
Lei Xu
Towards Deployable Robust Text Classifiers
Ph.D. Thesis. Dept. of EECS, MIT. September 2022.
Romain Palazzo
Synthetic Data Assessment based on Model Improvement
M.S Thesis. Dept. of Math, EPFL. July 2022.
Lawrence C. Wong
Time Series Anomaly Detection using Prediction-Reconstruction Mixture Errors
M.Eng Thesis. Dept. of EECS, MIT. May 2022.
Sarah Alnegheimish
Orion – A Machine Learning Framework for Unsupervised Time Series Anomaly Detection
S.M Thesis. Dept. of EECS, MIT. May 2022.

2021

Alexandra Zytek, Dongyu Liu, Rhema Vaithianathan, Kalyan Veeramachaneni
Sibyl: Understanding and Addressing the Usability Challenges of Machine Learning In High-Stakes Decision Making
VIS-2021. In IEEE Transactions on Visualization and Computer Graphics (TVCG), January 2022.
Furui Cheng, Dongyu Liu, Fan Du, Yanna Lin, Alexandra Zytek, Haomin Li, Huamin Qu, Kalyan Veeramachaneni
VBridge: Connecting the Dots Between Features and Data to Explain Healthcare Models **Honorable Mention Award**
VIS-2021. In IEEE Transactions on Visualization and Computer Graphics (TVCG), January 2022.
Micah J. Smith, Jürgen Cito, Kelvin Lu, Kalyan Veeramachaneni
Enabling Collaborative Data Science Development with the Ballet Framework
CSCW-2021. In Proceedings of the ACM Conference on Computer-Supported Cooperative Work and Social Computing, October 2021.
Yi Sun, Ivan Ramirez, Alfredo Cuesta-Infante, Kalyan Veeramachaneni
Towards Reducing Biases in Combining Multiple Experts Online
IJCAI-2021. In Proceedings of the International Joint Conference on Artificial Intelligence, August 2021.
Alexandra Zytek, Dongyu Liu, Rhema Vaithianathan, Kalyan Veeramachaneni
Sibyl: Explaining Machine Learning Models for High-Stakes Decision Making
CHI-2021. In Extended Abstracts of ACM CHI Conference on Human Factors in Computing Systems, May 2021.
Micah J. Smith, Jürgen Cito, Kalyan Veeramachaneni
Meeting in the notebook: a notebook-based environment for micro-submissions in data science collaborations
Preprint. March 2021.

Theses

Zhuofan Xie
Tracer: A Machine Learning Based Data Lineage Solver with Visualized Metadata Management
M.Eng Thesis. Dept. of EECS, MIT. December 2021.
Micah J. Smith
Collaborative, Open, and Automated Data Science
Ph.D. Thesis. Dept. of EECS, MIT. September 2021.
Alexandra Zytek
Towards Usable Machine Learning
S.M. Thesis. Dept. of EECS, MIT. February 2021.

2020

Alexander Geiger, Dongyu Liu, Sarah Alnegheimish, Alfredo Cuesta-Infante, Kalyan Veeramachaneni
TadGAN: Time Series Anomaly Detection Using Generative Adversarial Networks
BigData-2020. In Proceedings of IEEE International Conference on Big Data, December 2020.
Sarah Alnegheimish, Najat Alrashed, Faisal Aleissa, Shahad Althobaiti, Dongyu Liu, Mansour Alsaleh, Kalyan Veeramachaneni
Cardea: An Open Automated Machine Learning Framework for Electronic Health Records
DSAA-2020. In Proceedings of IEEE 7th International Conference on Data Science and Advanced Analytics, October 2020.
Micah J. Smith, Carles Sala, James Max Kanter, Kalyan Veeramachaneni
The Machine Learning Bazaar: Harnessing the ML Ecosystem for Effective System Development
SIGMOD-2020. In Proceedings of International Conference on Management of Data, June 2020.
Dongyu Liu, Micah J. Smith, Kalyan Veeramachaneni
Understanding User-Bot Interactions for Small-Scale Automation in Open-Source Development
CHI-2020. In Extended Abstracts of ACM CHI Conference on Human Factors in Computing Systems, April 2020.
Micah J. Smith, Kelvin Lu, Kalyan Veeramachaneni
Demonstration of Ballet: A Framework for Open-Source Collaborative Feature Engineering
MLSys-2020. Proc. Third Conference on Machine Learning and Systems, March 2020.

Theses

Felipe Alex Hofmann
Tracer: A Machine Learning Approach to Data Lineage
M.Eng Thesis. Dept. of EECS, MIT. May 2020.
Ajinkya Kishore Nene
Deep Learning Approaches to Universal and Practical Steganalysis
M.Eng Thesis. Dept. of EECS, MIT. May 2020.
Kevin Zhang
Tiresias: A Peer-to-Peer Platform for Privacy Preserving Machine Learning
M.Eng Thesis. Dept. of EECS, MIT. February 2020.
Katherine Wang
A Machine Learning Framework for Predictive Maintenance of Wind Turbines
M.Eng Thesis. Dept. of EECS, MIT. February 2020.
Lei Xu
Synthesizing Tabular Data using Conditional GAN
S.M Thesis. Dept. of EECS, MIT. February 2020.

2019

Kevin Alex Zhang, Kalyan Veeramachaneni
Enhancing Image Steganalysis with Adversarially Generated Examples
CSCML-19. In Proc. 3rd International Symposium, CSCML 2019.
Kevin Alex Zhang, Lei Xu, Alfredo Cuesta-Infante, Kalyan Veeramachaneni
Robust Invisible Video Watermarking with Attention
Preprint. 4 Sep 2019.
Yi Sun, Ivan Ramirez, Alfredo Cuesta-Infante, Kalyan Veeramachaneni
Learning Fair Classifiers in Online Stochastic Settings
Preprint. 19 Aug 2019.
Lei Xu, Maria Skoularidou, Alfredo Cuesta-Infante, Kalyan Veeramachaneni
Modeling Tabular data using Conditional GAN
NeurIPS 2019. Proc. of Advances in Neural Information Processing Systems, 2019.
Lei Xu, Shubhra Kanti Karmaker Santu, Kalyan Veeramachaneni
MLFriend: Interactive Prediction Task Recommendation for Event-Driven Time-Series Data
Preprint. 28 Jun 2019.
Kevin Zhang, Alfredo Cuesta-Infante, Kalyan Veeramachaneni
SteganoGAN: High Capacity Image Steganography with GANs
Preprint. 12 Jan 2019.
Qianwen Wang, Yao Ming, Zhihua Jin, Qiaomu Shen, Dongyu Liu, Micah J. Smith, Kalyan Veeramachaneni, Huamin Qu
ATMSeer: Increasing Transparency and Controllability in Automated Machine Learning
CHI-19. In Proc. ACM Conference on Human Factors in Computing Systems, 2019.
Yi Sun, Alfredo Cuesta-Infante, Kalyan Veeramachaneni
Learning Vine Copula Models For Synthetic Data Generation
AAAI-19. In Proc. 33rd AAAI Conference on Artificial Intelligence, 2019.

Theses

Ihssan Tinawi
Machine Learning for Time Series Anomaly Detection
M.Eng Thesis. Dept. of EECS, MIT. June 2019.
Kelvin Liu
Feature Engineering and Evaluation in Lightweight Systems
M.Eng Thesis. Dept. of EECS, MIT. June 2019.

2018

Micah Smith, Kelvin Lu, Kalyan Veeramachaneni
Ballet: A lightweight framework for open-source, collaborative feature engineering
NeurIPS SysML. Workshop on Systems for ML and Open Source Software, December 2018.
Gaurav Sheni, Benjamin Schreck, Roy Wedge, Max Kanter, Kalyan Veeramachaneni
Prediction Factory: automated development and collaborative evaluation of predictive models
Preprint. 29 Nov 2018.
Lei Xu, Kalyan Veeramachaneni
Synthesizing Tabular Data using Generative Adversarial Networks
Preprint. 27 Nov 2018.
Dennis Wilson, Silvio Rodrigues, Carlos Segura, Ilya Loshchilov, Frank Hutter, Guillermo López Buenfil, Ahmed Kheiri, Ed Keedwell, Mario Ocampo-Pineda, Ender Özcan, Sergio Ivvan Valdez Peña, Brian Goldman, Salvador Botello Rionda, Arturo Hernández-Aguirre, Kalyan Veeramachaneni, Sylvain Cussat-Blanc
Evolutionary computation for wind farm layout optimization
Renewable Energy. Volume 126, October 2018, Pages 681-691.
Max Kanter, Benjamin Schreck, Kalyan Veeramachaneni
Machine Learning 2.0: Engineering Data Driven AI Products
Preprint. 1 Jul 2018.
Zara Perumal, Kalyan Veeramachaneni
Towards building active defense systems for software applications
CSCML-18. In Proceedings of International Symposium on Cyber Security Cryptography and Machine Learning, Be'er Sheva, Israel, June 2018.
Ignacio Arnaldo, Ankit Arun, Sumeet Kyathanahalli, Kalyan Veeramachaneni
Acquire, adapt, and anticipate: continuous learning to block malicious domains
IEEE Big Data 2018. In Proceedings of IEEE International Conference on Big Data, December 2018.
Benjamin Schreck, Nitin John James, Shankar Mallapur, Rajendra Prasad, Sanjeev Vohra, Kalyan Veeramachaneni
Augmenting Software Project Managers with Predictions from Machine Learning
IEEE Big Data 2018. In Proceedings of IEEE International Conference on Big Data, December 2018.

Theses

Andrew Montanez
SDV: An Open Source Library for Synthetic Data Generation
M.Eng Thesis. Dept. of EECS, MIT. August 2018.
William Xue
A Flexible Framework for Composing End to End Machine Learning Pipelines
M.Eng Thesis. Dept. of EECS, MIT. May 2018.
Laura Gustafson
Bayesian Tuning and Bandits: An Extensible, Open Source Library for AutoML
M.Eng Thesis. Dept. of EECS, MIT. May 2018.
Akshay Ravikumar
A Framework to Search for Machine Learning Pipelines
M.Eng Thesis. Dept. of EECS, MIT. May 2018.
BingFei Cao
Augmenting the Software Testing Workflow with Machine Learning
M.Eng Thesis. Dept. of EECS, MIT. May 2018.
Micah J. Smith
Scaling Collaborative Open Data Science
S.M Thesis. Dept. of EECS, MIT. May 2018.
Alexander Friedrich Nordin
End to End Machine Learning Workflow Using Automation Tools
M.Eng Thesis. Dept. of EECS, MIT. May 2018.
Zara Perumal
Towards Building Active Defense for Software Applications
M.Eng Thesis. Dept. of EECS, MIT. February 2018.
Caroline Morganti
Applying Natural Language Models and Causal Models to Project Management Systems
M.Eng Thesis. Dept. of EECS, MIT. February 2018.

2017

Thomas Swearingen, Will Drevo, Bennett Cyphers, Alfredo Cuesta-Infante, Arun Ross and Kalyan Veeramachaneni
ATM: A Distributed, Collaborative, Scalable System for Automated Machine Learning
IEEE Big Data - 17. Proc. of 2017 IEEE International Conference on Big Data (Big Data 2017), Boston, MA, USA, December 2017.
Roy Wedge, James Max Kanter, Santiago Moral Rubio, Sergio Iglesias Perez, Kalyan Veeramachaneni
Solving the "false positives" problem in fraud prediction
Preprint. 20 Oct 2017.
Alec Anderson, Sebastien Dubois, Alfredo Cuesta-Infante and Kalyan Veeramachaneni
Sample, Estimate, Tune: Scaling Bayesian Auto-Tuning of Data Science Pipelines
IEEE DSAA - 17. IEEE International Conference on Data Science and Advanced Analytics, Tokyo, Japan. October 2017.
Bennett Cyphers and Kalyan Veeramachaneni
AnonML: Locally Private Machine Learning over a Network of Data Holders
IEEE DSAA - 17. IEEE International Conference on Data Science and Advanced Analytics, Tokyo, Japan. October 2017.
Micah Smith and Kalyan Veeramachaneni
FeatureHub: Towards Collaborative Data Science
IEEE DSAA - 17. IEEE International Conference on Data Science and Advanced Analytics, Tokyo, Japan. October 2017.
Ignacio Arnaldo, Alfredo Cuesta-Infante, Ankit Arun, Mei Lam, Costas Bassias, and Kalyan Veeramachaneni
Learning Representations for Log Data in Cybersecurity
CSCML-17. International Symposium on Cyber Security Cryptography and Machine Learning, June 2017.

Theses

Alec W. Anderson
Deep Mining: Scaling Bayesian Auto-tuning of Data Science Pipelines
M.E. Thesis, MIT Dept of EECS, August 2017.
Bennett James Cyphers
A System for Privacy-Preserving Machine Learning on Personal Data
M.Eng. Thesis, MIT Dept of EECS, August 2017.
Donghyun Michael Choi
SenseML: A Platform for Constructing IOT Data Pipelines
M.E. Thesis, MIT Dept of EECS, August 2017.
Jonathan Johannemann
COAL: A Continuous Active Learning System
M.Fin. Thesis, MIT Sloan School of Management, June 2017.
David Wong
Build your own deep learner
M.E. Thesis, MIT Dept of EECS, June 2017.
Katharine Xiao
Towards Automatically Linking Data Elements
M.E. Thesis, MIT Dept of EECS, June 2017.
John J.D. O’Sullivan
Teach2Learn: Gamifying Education to Gather Training Data for Natural Language Processing
M.E. Thesis, MIT Dept of EECS, February 2017.

2016

Bennett Cyphers, Kalyan Veeramachaneni
AnonML: Anonymous machine learning over a network of data holders
NIPS. NIPS Workshop on Private Multiparty Communication, Barcelona, Spain. December 2016.
Alfredo Cuesta-Infante, Kalyan Veeramachaneni
Markov Switching Copula Models for Longitudinal Data
ICDM W-16. 11th International Workshop on Spatial and Spatiotemporal Data Mining, Barcelona, Spain. December 2016.
James Max Kanter, Owen Gillespie, Kalyan Veeramachaneni
Label, Segment, Featurize: a cross domain framework for prediction engineering
IEEE DSAA - 16. IEEE International Conference on Data Science and Advanced Analytics, Montreal, CA. October 2016.
Benjamin Schreck, Kalyan Veeramachaneni
What would a data scientist ask? Automatically formulating and solving prediction problems
IEEE DSAA - 16. IEEE International Conference on Data Science and Advanced Analytics, Montreal, CA. October 2016.
Neha Patki, Roy Wedge, Kalyan Veeramachaneni
The synthetic data vault
IEEE DSAA - 16. IEEE International Conference on Data Science and Advanced Analytics, Montreal, CA. October 2016.
Ben Gelman, Matt Revelle, Carlotta Domeniconi, Aditya Johri, Kalyan Veeramachaneni
Acting the Same Differently: A Cross-Course Comparison of User Behavior in MOOCs
EDM-16. International Conference on Educational Data Mining, Raleigh, NC. July 2016.
Kalyan Veeramachaneni, Ignacio Arnaldo, Alfredo Cuesta-Infante, Costas Bassias, Vamsi Korrapati, Ke Li
AI2: Training a big data machine to defend
IEEE BDS - 16. IEEE International Conference on Big Data Security on Cloud, New York, NY. April 2016.

Theses

Benjamin J. Schreck
Towards An Automatic Predictive Question Formulation
M.Eng. Thesis, MIT Dept of EECS, June 2016.
Yonglin Wu
Model Factory: A New Way to Look at Data Through Models
M.Eng. Thesis, MIT Dept of EECS, June 2016.
Sebastien Boyer
Transfer Learning for Predictive Models in MOOCs
S.M. Thesis, MIT Dept of EECS, IDSS, June 2016.
Neha Patki
The Synthetic Data Vault: Generative Modeling for Relational Databases
M.Eng. Thesis, MIT Dept of EECS, June 2016.
Mario Orozco Gabriel
Artificial Intelligence Opportunities and an End-To-End Data-Driven Solution for Predicting Hardware Failures
S.M., MBA thesis, MIT Dept of Mechanical Engineering, Sloan School of Management, June 2016.

2015

Sebastien Boyer, Ben U. Gelman, Benjamin Schreck, Kalyan Veeramachaneni
Data Science Foundry for MOOCs
IEEE DSAA - 15. IEEE/ACM Data Science and Advanced Analytics Conference, October 2015.
James Max Kanter, Kalyan Veeramachaneni
Deep Feature Synthesis: Towards Automating Data Science Endeavors
IEEE DSAA - 15. IEEE/ACM Data Science and Advanced Analytics Conference (10% acceptance rate), October 2015.
Ignacio Arnaldo, Una-May O’Reilly, Kalyan Veeramachaneni
Building Predictive Models via Feature Synthesis
GECCO-15. ACM conference on Genetic and Evolutionary Computation, July 2015.
Kalyan Veeramachaneni, Alfredo Cuesta-Infante, Una-May O’Reilly
Copula Graphical Models for Wind Resource Estimation
IJCAI-15. International joint conference on Artificial Intelligence, July 2015.
Franck Dernoncourt*, Kalyan Veeramachaneni, Una-May O’Reilly
Gaussian Process-based Feature Selection for Wavelet Parameters: Predicting Acute Hypotensive Episodes from Physiological Signals
CBMS-15. 28th IEEE International Symposium on Computer-Based Medical Systems, June 2015.
Sebastien Boyer, Kalyan Veeramachaneni
Transfer Learning for Predictive Models in Massive Open Online Courses
AIED-15. 17th International Conference on Artificial Intelligence in Education, June 2015.
Yufei Ding, Jason Ansel, Kalyan Veeramachaneni, Xipeng Shen, Una-May O’Reilly, Saman Amarasinghe
Autotuning Algorithmic Choice for Input Sensitivity
36th annual ACM SIGPLAN conference on Programming Language Design and Implementation, June 2015.
Kalyan Veeramachaneni, Una-May O’Reilly, Kiarash Adl
Feature Factory: Crowd Sourcing Feature Discovery
L@S-2015. WIP session at ACM Learning @Scale, March 2015.
Ignacio Arnaldo, Kalyan Veeramachaneni, Andrew Song, Una-May O’Reilly
Bring Your Own Learner! A cloud-based, data-parallel commons for machine learning
IEEE Computational Intelligence Magazine. Special Issue on Computational Intelligence for Cloud Computing (February 2015).

Theses

Kevin Wu
Deep Tuner: A System for Search Technique Recommendation in Program Autotuning. Prof. Saman Amarasinghe
M.Eng. thesis, MIT Dept of EECS, August 2015.
Max Kanter, 2015
The Data Science Machine: Emulating Human Intelligence in Data Science Endeavors
Bryan Collazo
Machine Learning Blocks
M.Eng. Thesis, MIT Dept of EECS, June 2015.
Michael Wu, 2015.
The Synthetic Student: A Machine Learning Model to Simulate MOOC Data
Alex Wang, 2015
Feature Factory: A collaborative, Crowd-Sourced Machine Learning System
Edwin Zhang, 2015
Image Miner: An Architecture to Support Deep Mining of Images

2014

Jason Ansel, Shoaib Kamil, Kalyan Veeramachaneni, Jonathan Ragan-Kelley, Jeffrey Bosboom, Una-May O’Reilly and Saman Amarasinghe
OpenTuner: an extensible framework for program autotuning
ACM 23rd International Conference on Parallel Architectures and Compilation, August 2014.
Colin Taylor*, Kalyan Veeramachaneni, Una-May O’Reilly
Likely to stop? Predicting Stopout in Massive Open Online Courses
Preprint (arXiv report). 14 Aug 2014.
Dennis Wilson, Sylvain Cussat-Blanc, Kalyan Veeramachaneni, Hervé Luga, Una-May O’Reilly
A continuous developmental model for wind farm layout optimization
ACM conference on Genetic and Evolutionary Computation, July 2014.
Kalyan Veeramachaneni, Una-May O’Reilly, Colin Taylor
Towards Feature Engineering at Scale for Data from Massive Open Online Courses
Preprint (arXiv report). 20 July 2014.
Ignacio Arnaldo, Kalyan Veeramachaneni and Una-May O’Reilly
Flash: A GP-GPU Ensemble Learning System for handling Large Datasets
17th European Conference on Genetic Programming, April 2014.
Una-May O’Reilly, Kalyan Veeramachaneni
Technology for Mining the Big Data of MOOCs
Research and Practice in Assessment, Winter 2014.
Kalyan Veeramachaneni, Ignacio Arnaldo, Owen Derby, Una-May O’Reilly
FlexGP: Cloud-Based Ensemble Learning with Genetic Programming for Large Regression Problems
Journal of Grid Computing.

Theses

Will Drevo 2014
Delphi: A Distributed Multi-algorithm, Multi-user, Self Optimizing Machine Learning System
(This thesis was filed as a patent and is pending release)
Franck Dernoncourt (S. M) 2014
BeatDB: An end-to-end approach to unveil saliencies from massive signal data sets.
Quentin Agren (Visiting student) 2014
From Click Stream to Learning Trajectories, Bridging OpenEdx and MOOCdb
Vineet Gopal 2014
PhysioMiner: A Scalable Cloud Based Framework for Physiological Waveform Mining
Colin Taylor 2014
Stopout Prediction in Massive Open Online Courses
Elaine Han 2014
Modeling Problem Solving in Massive Open Online Courses

2013

Monica Vitali, Una-May O’Reilly, and Kalyan Veeramachaneni
Modeling Service Execution on Data Centers for Energy Efficiency and Quality of Service Monitoring
IEEE International Conference on Systems, Man and Cybernetics, October 2013.
Ignacio Arnaldo, Kalyan Veeramachaneni and Una-May O’Reilly
Analyzing Millions of Submissions to Help MOOC instructors Understand Problem Solving
NIPS Workshop on Big Learning, August 2013.
Franck Dernoncourt*, Kalyan Veeramachaneni and Una-May O’Reilly
beatDB : A Large Scale Waveform Feature Repository
NIPS Workshop on Machine Learning for Clinical Data Analysis and Healthcare, August 2013.
Ignacio Arnaldo, Kalyan Veeramachaneni and Una-May O’Reilly
Building MultiClass Nonlinear Classifiers with GPUs
NIPS Workshop on Big Learning, August 2013.
Kalyan Veeramachaneni, Teasha Feldman-Fitzthum, Una-May O’Reilly, Alfredo Cuesta-Infante
Copula-Based Wind Resource Assessment
NIPS Workshop on Machine Learning for Sustainability, August 2013.
Franck Dernoncourt*, Choung Do, Sherif Halawa, Una-May O’Reilly, Colin Taylor, Kalyan Veeramachaneni and Sherwin Wu
MoocViz: A Large Scale, Open Access, Collaborative Data Analytics Framework for MOOCs
NIPS workshop on Data Directed Education, August 2013.
Kalyan Veeramachaneni, Zachary A. Pardos, Una-May O’Reilly
MOOCdb: Developing Data Standards for MOOC Datascience
MOOCShop at Artificial Intelligence in Education, July 2013.
Kalyan Veeramachaneni, Owen Derby, Dylan Sherry, Una-May O’Reilly
Learning regression ensembles with genetic programming at scale
Proceedings of the Fifteenth Annual ACM Conference on Genetic and Evolutionary Computation, July 2013.
Dennis Wilson*, Emmanuel Awa, Sylvain Cussat-Blanc, Kalyan Veeramachaneni, Una-May O’Reilly
On Learning to Generate Wind Farm Layouts
Proceedings of the Fifteenth Annual ACM Conference on Genetic and Evolutionary Computation, July 2013.
Alexander Waldin*, Kalyan Veeramachaneni, Una-May O’Reilly
Learning Blood Pressure Behavior from Large Physiological Waveform Repositories
ICML Workshop on Healthcare, June 2013.
Dennis Wilson*, Kalyan Veeramachaneni, and Una-May O’Reilly
Cloud Scale Distributed Evolutionary Strategies for High Dimensional Problems
Applications of Evolutionary Computation, Lecture Notes in Computer Science.
Owen Derby*, Kalyan Veeramachaneni, and Una-May O’Reilly
Cloud Driven Design of a Distributed Genetic Programming Platform
Applications of Evolutionary Computation, Lecture Notes in Computer Science.
Erik Hemberg, Constantin Berzan*, Kalyan Veeramachaneni, Una-May O’Reilly
Introducing Graphical Models to Analyze Genetic Programming Dynamics
Proceedings of the twelfth workshop on Foundations of genetic algorithms, January 2013.

Theses

Dylan Sherry
Exploiting multiple levels of parallelisms in FlexGP for big data sets
Owen Derby, 2013
FlexGP: a Scalable System for Factored Learning in the Cloud
Alex Waldin, 2013
Learning Blood Pressure Behavior From Large Blood Pressure Waveform Repositories and Building Predictive Models
Chidube Ezeozue, 2013
Large-scale Consensus Clustering and Data Ownership Considerations for Medical Applications
Josh Ingram, 2012
[a]sorted Selection: Improving Building Performance and Diversity Using a New Form of Interactive Evolutionary Algorithm
Danielle Ramazotti
An Observational Study: The Affect of Diuretics Administration on Outcomes of Mortality and Mean Duration of I.C.U. Stay