Experience
- Amphora — A lightweight digital repository for corpus analysis GitHub
- Added RESTful API endpoints.
- Fixed bugs associated with resource authorizations. Identified and fixed SQL performance bottlenecks.
- Wrote Celery tasks handling the processing of resources.
- Wrote Ansible scripts for managing the deployed instance.
- Technologies: Python, Django, Celery, Ansible, Supervisor, Postgres
- JSTOR DfR
- Indexed JSTOR Data for Research corpora in Elasticsearch.
- Performed Named Entity Recognition using NLTK and the Stanford NLP package.
- Technologies: Python, Elasticsearch, NLTK
- SirIsaac — Automated dynamical systems inference GitHub
- Restructured package to be better suited for installation and distribution through setuptools.
- Technologies: Python, Setuptools
- Tethne — Python project for bibliographic network analysis. GitHub
- Added support for exporting corpus to Pandas DataFrames.
- Added documentation for all classes and modules.
- Technologies: Python, Pandas.
- Designed Machine Learning system estimating OS build times based on build config and time of day and day of week.
- Established streaming data pipeline connecting Elasticsearch and the ML system.
- Experimented with deep neural networks, decision trees, and regression methods to achieve best results.
- Established streaming data pipeline connecting NVIDIA build systems and the ML system.
- Wrote RESTful APIs supporting queries to the system.
- Technologies: Python, xgboost, sklearn, Pandas, TensorFlow, Docker, Elasticsearch.
- Project Orb
- Designed and developed Python SDK facilitating communication between a Linux/Windows Host and Android/Embedded-Linux/QNX/Chrome OS based target board.
- Abstracted SSH/ADB/UART/Telnet/FTDI communication interfaces supporting command execution, file transfer, filesystem management, log monitoring, package installation, and power control.
- Developed Python libraries serving as wrappers over Perforce and other internal tools/systems.
- Wrote scripts supporting automated Debian and setuptools package generation.
- Established infrastructure supporting pip- and Debian-based installations of the system.
- Technologies: Python, Bash.
- Tintin
- Designed system supporting free-text, schema-independent search on multiple large MySQL databases.
- Reduced typical SQL-based search times from tens of seconds per search query to milliseconds.
- Technologies: Java, JSP, SQL, Bash.
Projects
Person Re-Identification GitHub
- Used Convolutional Neural Networks to match photos of the same individual taken from two vantage points.
- Devised new image similarity identification technique to improve on existing accuracy.
- Achieved state-of-the-art accuracy of 86.2% on the VIPeR dataset.
- Technologies: Python, TensorFlow.
KDD Cup 2016
- Predicted the relevance of an academic institution based on Microsoft Academic Graph data.
- Generated statistical features by parsing and processing 120 GB of textual data.
- Used Gradient Boosted Decision Trees and Ranking SVM to make predictions.
- Technologies: Python, Google BigQuery, Bash.
Swallow GitHub
Wallpaper utility overlaying quote of the day on a picture of the day fetched from the web. Python.
Pricewatch
Python tool that notifies the user when the price of an Amazon.com wishlist item changes.
Indexy GitHub
Lightweight library for pretty web directory listings. PHP, HTML, CSS.