Hello, I'm Kapil!

I'm a first-year PhD student at Northwestern University in the Technology and Social Behavior (TSB) program--a joint PhD program in Computer Science and Communication Studies--advised by Professor Haoqi Zhang and Professor Darren Gergle. My research interests broadly fall at the intersection of human-computer interaction (HCI), social and crowd computing, and artificial intelligence. I aim to build human-machine systems that deeply intertwine people's ability to complete intricate tasks with machines' ability to scale that work far beyond what any individual could do alone.

My current work explores how we can scale authentic research training by developing systems that efficiently match students' needs to the people in the community best able to help them, without overburdening those helpers.

When not building stuff, I spend my time reading, trying new places to eat, learning how to become a better cook, and making the perfect cup of coffee.

Experience

Delta Lab | Evanston, IL

Design, Technology, and Research (DTR) Tech Fellow

I work on implementing extended, scalable versions of research prototypes in Meteor.js and iOS for future large-scale deployment. Additionally, I continue my research on physical crowdsourcing, with a focus on developing intelligent and responsive coordination algorithms.

Nov 2017 - Sept 2018

Data Mining & Supply, LLC | Chicago, IL

Lead Data Scientist, Partner

Data Mining & Supply offers a tech platform and consulting services to help teams design, build and run data-smart services. My role involved (1) assisting in developing the technology; (2) conducting data analysis for client work; and (3) consulting on how companies can use data science practices and techniques in their strategy.

July 2016 - Nov 2017

Design, Technology, and Research (DTR) | Evanston, IL

Undergraduate Researcher

I conducted independent research on physical crowdsourcing, advised by Professor Haoqi Zhang and Yongsung Kim, where I sought to address the challenge of collecting high-fidelity, high-coverage data in physical space without needing to pay people to use our system. We built multiple prototypes for the iPhone and Apple Watch, and ran large-scale studies to assess how well they performed. A paper on this work is currently under review for CHI 2019.

Jan 2016 - Jun 2017

Northwestern University | Evanston, IL

Teaching Assistant

I led discussion sections, acted as a design mentor (for EECS 330 and EECS 395), answered questions on Piazza, and graded assignments and exams. I was a teaching assistant for:

  • EECS 395: Tangible Interaction Design and Learning (Spring 2017)
  • EECS 349: Machine Learning (Fall 2016)
  • EECS 330: Human-Computer Interaction (Winter 2016, Winter 2017)

Jan 2016 - Jun 2017

Anjoui Technologies | Evanston, IL

Chief Technology Officer, Co-Founder

Anjoui's goal was to create a marketplace where cooks could prepare full meals, fit for 4-6 people, and sell the portions they would not eat to community members who wanted affordable, high-quality food. I led the technical team in developing the product's iOS application, website, and analytics framework. Additionally, I worked with others to develop and test our business model.

May 2015 - Jan 2016

SapientNitro | Chicago, IL

Associate, Marketing Strategy and Analysis
(Data Scientist and Software Developer)

I worked with an interdisciplinary research team of software engineers, data scientists, architects, ethnographers, and social scientists to study human behavior, and worked with clients on how best to improve or expand existing products. My work included data science, full-stack software development, and designing new proof-of-concept analytic approaches for digital log and sensor data.

Some specific examples include: (1) developing statistical and predictive models for customer segmentation, shopping habits, room occupancy, and space utilization; (2) modeling kitchen activities using third party sensors; (3) assisting in building a web-based community management platform for 700+ community members; and (4) researching methods and designing algorithms to relate linear digital activity log data with non-linear episodes of focus.

Jun 2014 - Nov 2016

SapientNitro | Chicago, IL

Junior Associate, Technology
(Data Analyst Intern)

I designed and built algorithms to differentiate between speakers and to model surgeons' posture in an operating room environment, applying hierarchical clustering to Microsoft Kinect skeleton data. Additionally, I wrote a simple text classification tool to categorize unlabeled URLs.

Jun 2013 - Sept 2013

Selected Projects

4X + LES

Design, Technology, and Research (DTR)
Research Development

Collecting data from physical crowds is difficult. To encourage enough participation, you must make the interaction easy, low-effort, and opportunistic (i.e., people contribute when they want to), but this leads to gaps in data fidelity and coverage that opportunistic contributions alone cannot close. To fill these gaps, we need a data collection process that first gathers opportunistic contributions and then directs crowds to the remaining data gaps.

For this, we designed the 4X framework: a process that continually improves data fidelity and coverage by scaffolding low-effort contributions and using existing data to steer users toward data gaps when they are motivated to go (i.e., when their interests align with the data being collected). As a demonstration of 4X, we built LES, an iPhone and Apple Watch application that allowed users to contribute data and receive information of interest to them. 4X + LES was advised by Professor Haoqi Zhang, Professor Darren Gergle, and Yongsung Kim.
Technologies: Swift, MongoDB, Node.js, Express.js
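
To make the steering step concrete, here is a minimal Python sketch of how a coordinator might choose which data gap to suggest to a user: prefer the most stale gap that matches the user's interests and requires only a small detour. The fields, categories, and thresholds are hypothetical, not the actual LES logic.

    from dataclasses import dataclass

    @dataclass
    class DataGap:
        place: str          # e.g., a cafe whose latest report is out of date
        category: str       # e.g., "coffee", "study space"
        staleness: float    # 0 (just updated) to 1 (no recent data)
        detour_min: float   # extra walking time in minutes

    def pick_gap(gaps, user_interests, max_detour_min=5.0):
        """Return the most valuable gap matching the user's interests,
        or None if every candidate requires too large a detour."""
        candidates = [g for g in gaps
                      if g.category in user_interests and g.detour_min <= max_detour_min]
        if not candidates:
            return None
        # Prefer the most stale data; break ties by the shorter detour.
        return max(candidates, key=lambda g: (g.staleness, -g.detour_min))

    gaps = [DataGap("Coffee Lab", "coffee", staleness=0.9, detour_min=3.0),
            DataGap("Main Library", "study space", staleness=0.4, detour_min=1.0)]
    print(pick_gap(gaps, {"coffee"}))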

Posture Detection in the Operating Theater

SapientNitro
Research Data Analysis

During surgery, there are many times when a surgeon's posture is improper because they must hunch over obstacles to operate. This posture can lead to chronic back pain as the surgeon ages. Though individuals in an operating theater could be mindful of the surgeon's posture and remind the surgeon to correct it when possible, a machine solution would remove that burden, allowing them to focus on other tasks.

To address this, we used a Microsoft Kinect camera to capture skeletal data from surgeons in the operating theater during real surgeries. Visual analysis of the data showed that the skeletons for proper and improper posture differed. To model posture, we used hierarchical clustering to group postures and worked with doctors to label the resulting clusters. This project was done in collaboration with Pasindu Wewegama, Megan Silas, and The University of Chicago Pritzker School of Medicine.
Technologies: R, Processing, Microsoft Kinect
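
The original analysis was done in R; the following is only a minimal Python sketch of the same idea, hierarchical clustering of flattened skeleton-joint coordinates, using synthetic data in place of real Kinect captures.

    import numpy as np
    from scipy.cluster.hierarchy import linkage, fcluster

    # Synthetic stand-in for Kinect skeleton frames: each row is one frame of
    # flattened (x, y, z) coordinates for 20 joints -> 60 values per frame.
    rng = np.random.default_rng(0)
    upright = rng.normal(loc=0.0, scale=0.05, size=(50, 60))
    hunched = rng.normal(loc=0.3, scale=0.05, size=(50, 60))
    frames = np.vstack([upright, hunched])

    # Agglomerative clustering with Ward linkage, then cut the dendrogram into
    # two groups; in practice, clinicians label the resulting clusters
    # (e.g., "proper" vs. "improper" posture) after the fact.
    tree = linkage(frames, method="ward")
    labels = fcluster(tree, t=2, criterion="maxclust")
    print(labels)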

Puppet Master

Tangible Interaction Design and Learning
Research

Museums, while great places for learning, often have difficulty engaging children beyond surface-level information about artifacts. At the China Hall in the Field Museum of Chicago, we found that children often did not read the text describing the Chinese art and culture behind most artifacts. One notable exception was the Chinese Shadow Puppets, where children were engaged but did not learn the material the exhibit aimed to present (interactive storytelling, culture, and history).

We built a system that allows children to learn (1) the art of puppetry, (2) the art of storytelling, and (3) important aspects of Chinese culture or history by controlling a physical puppet through whole-body gesture recognition. We believe that through this interaction, children visiting the exhibit will take away more about the art and culture behind puppetry than they would from observation alone. Puppet Master was done in collaboration with Norah Altuwaim, Aishwarya Vaikuntam, and Sydney Zink, and advised by Professor Michael Horn.
Technologies: C#, Python, Microsoft Kinect, Arduino
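
As a rough illustration of the gesture-to-puppet mapping (not the system's actual code), a tracked arm angle could be converted into a servo position and sent to the Arduino over serial. The serial port, wire protocol, and angle mapping in this Python sketch are all assumptions.

    import math
    import serial  # pyserial

    def arm_angle(shoulder, elbow):
        """Angle of the upper arm in the camera's x-y plane, in degrees."""
        dx, dy = elbow[0] - shoulder[0], elbow[1] - shoulder[1]
        return math.degrees(math.atan2(dy, dx))

    # Hypothetical wire protocol: one byte per frame carrying a servo angle 0-180.
    port = serial.Serial("/dev/ttyUSB0", 9600)  # port name is an assumption

    def update_puppet(shoulder, elbow):
        angle = max(0, min(180, int(arm_angle(shoulder, elbow)) + 90))
        port.write(bytes([angle]))

    update_puppet(shoulder=(0.10, 0.50), elbow=(0.25, 0.35))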

Chris Jones Bot

EECS 338: Practicum in Intelligent Information Systems
Coursework Development Data Analysis

Chris Jones is a theater critic at the Chicago Tribune and has published reviews for over 30 years, resulting in an extremely large set of opinions and critiques of performances, venues, playwrights, directors, and actors/actresses. Given this, is it possible to create a machine that "speaks" like Chris, and enable a user to ask it questions about what Chris thinks about different aspects of theater?

Using natural language processing (NLP) methods, we built an API and Slack chat bot interface that allows users to ask queries such as, "How is Chicago different than New York?". This was done by scraping all his previously written reviews, annotating them with sentiment and other metadata, and designing query templates that can be used against the data to answer queries. The Chris Jones Bot was done in collaboration with Nick Paras and Helen Foster, and advised by Professor Kristian Hammond.
Technologies: Python, Google Natural Language API, Elasticsearch, Slack Integrations
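
To illustrate the query-template idea, here is a toy Python sketch that answers a comparison query from an in-memory list of sentiment-annotated snippets; the real system stored and searched the annotated reviews in Elasticsearch, and the fields and template below are made up for illustration.

    # Each snippet carries metadata attached during scraping and annotation;
    # the entities, scores, and text here are invented for this example.
    snippets = [
        {"entity": "Chicago", "sentiment": 0.6, "text": "a vibrant storefront scene"},
        {"entity": "Chicago", "sentiment": 0.4, "text": "bold new work in the Loop"},
        {"entity": "New York", "sentiment": 0.1, "text": "a safe, star-driven revival"},
    ]

    def compare_template(entity_a, entity_b, docs):
        """Answer a 'How is A different than B?' query by contrasting average
        sentiment and pulling one supporting snippet for each entity."""
        def side(entity):
            hits = [d for d in docs if d["entity"] == entity]
            avg = sum(d["sentiment"] for d in hits) / len(hits) if hits else 0.0
            return avg, (hits[0]["text"] if hits else "")
        a_score, a_quote = side(entity_a)
        b_score, b_quote = side(entity_b)
        warmer = entity_a if a_score > b_score else entity_b
        return f"Chris writes more warmly about {warmer}: {a_quote!r} vs. {b_quote!r}"

    print(compare_template("Chicago", "New York", snippets))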

Map Matching

EECS 495: Geospatial Vision and Visualization
Coursework Data Analysis

GPS probe data from mapping vehicles is often misaligned with the road driven on due to variability in sensor hardware, and must be remapped to the road to be of use.

We designed an approach that remaps each probe point to the road segment, chosen from the five segments nearest the point, that minimizes the sum of the normalized distance to the segment and the difference in angle between the vehicle's heading and the segment's bearing. Further, we used XGBoost (chosen because we needed a non-linear regression model) to predict the elevation change between two probe data points. Map Matching was done in collaboration with Nick Paras.
Technologies: Python, XGBoost
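
A minimal Python sketch of the segment-selection step follows; it assumes locally planar coordinates and a toy normalization (dividing by the largest candidate value), which may differ from the project's actual choices.

    import math

    def point_segment_distance(p, a, b):
        """Distance from probe point p to segment ab (planar coordinates)."""
        (px, py), (ax, ay), (bx, by) = p, a, b
        dx, dy = bx - ax, by - ay
        t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)))
        return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

    def heading_difference(heading_deg, a, b):
        """Absolute angle between the vehicle heading and the segment bearing."""
        bearing = math.degrees(math.atan2(b[1] - a[1], b[0] - a[0])) % 360
        diff = abs(heading_deg - bearing) % 360
        return min(diff, 360 - diff)

    def match_segment(probe, heading_deg, segments):
        """Pick the segment minimizing normalized distance + normalized angle."""
        dists = [point_segment_distance(probe, a, b) for a, b in segments]
        angles = [heading_difference(heading_deg, a, b) for a, b in segments]
        max_d, max_a = max(dists) or 1.0, max(angles) or 1.0
        scores = [d / max_d + ang / max_a for d, ang in zip(dists, angles)]
        return segments[scores.index(min(scores))]

    roads = [((0, 0), (100, 0)), ((0, 5), (0, 105))]  # two toy segments
    print(match_segment(probe=(3, 4), heading_deg=90, segments=roads))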

Steam Game Recommender

EECS 349: Machine Learning
Coursework Data Analysis

Unlike the recommendation services provided by Netflix and Amazon, video game recommendations are still, for the most part, made peer to peer or by a salesperson at a store. For big AAA games this is not much of a problem, since few of these games are released each year and it is easy for users to find what they may or may not like. With an online store like Steam, however, there are many great games built by indie developers that are more difficult to find. We proposed building a recommendation system that takes the games owned by a Steam user and provides a list of games they may enjoy.

To collect data, we scraped the Steam libraries (games owned by users) of roughly 10,000 users. Each instance has 2,283 binary attributes, each representing a user's ownership of a specific game. We used a random 80/20 split to generate our training and test sets, on which we trained and evaluated our classifiers, finding that the decision tree algorithm yielded the highest F1 score of 0.474. The Steam Game Recommender was done in collaboration with Aaron Karp and William Xiao.
Technologies: Python, Weka
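
The classifiers were trained in Weka; as a rough equivalent, here is a scikit-learn sketch of the same setup, a decision tree over binary ownership vectors with an 80/20 split, run on synthetic data rather than the scraped libraries.

    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.metrics import f1_score

    # Synthetic stand-in for the scraped libraries: rows are users, columns are
    # games, entries are 1 if the user owns the game (the real data had
    # ~10,000 users and 2,283 games).
    rng = np.random.default_rng(42)
    ownership = (rng.random((1000, 50)) < 0.15).astype(int)
    # Make the target game loosely depend on games 0-4 so there is some signal.
    target = (ownership[:, :5].sum(axis=1) >= 2).astype(int)
    features = ownership[:, 5:]

    X_train, X_test, y_train, y_test = train_test_split(
        features, target, test_size=0.2, random_state=0)

    clf = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
    print("F1:", f1_score(y_test, clf.predict(X_test)))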

Skills

Skill levels: Proficient, Intermediate, Familiar

Data Science

R, Jupyter Notebooks, Pandas, NumPy, scikit-learn, Weka, Natural Language Processing (NLP) Techniques, TensorFlow (familiar)

Hardware

Arduino, Raspberry Pi, Breadboarding, Printed Circuit Board (PCB) Design (familiar)