About Me
Hello! I am Chhaya Choudhary!
I currently work as a Data Scientist at Infoblox, where I solve challenging data problems in malware detection and classification using Machine Learning and Deep Learning techniques. My work covers data exploration, data cleaning, feature engineering, model preparation, evaluation, and deployment of models in Docker containers. I have multiple accepted publications in the field of cybersecurity using AI/ML.
I graduated from the University of Washington Tacoma with a Master's in Computer Science, specializing in Data Science. My fields of interest are Machine Learning, Deep Learning, Natural Language Processing, Distributed Computing and Algorithms.
I worked on research projects at the University of Washington under the supervision of Prof. Martine De Cock on the detection of Domain Generation Algorithms (DGAs) using Deep Learning. My capstone (thesis) focused on protecting DGA classifiers against adversarial examples using Generative Adversarial Networks (GANs). Please download my resume here.
I am an enthusiastic person, always curious to learn, and I strongly believe in collaboration. I love attending conferences to learn and grow in my career. I have attended conferences and events such as USENIX’18 (as a grant recipient), Society of Women Engineers, Women in Data Science, the Azure University event, the ACT-W conference, GHC 2018 (as a scholar), and Hopper X 1 Seattle.
Publications
An Evaluation of DGA Classifiers PDF
R. Sivaguru, C. Choudhary, B. Yu, V. Tymchenko, A. Nascimento, M. De Cock in: Proceedings of IEEE BigData2018 (2018 IEEE International Conference on Big Data), p. 5051-5060, 2018
Algorithmically Generated Domain Detection and Malware Family Classification PDF
C. Choudhary, R. Sivaguru, M. Pereira, B. Yu, A. Nascimento, M. De Cock in: Proceedings of the Sixth International Symposium on Security in Computing and Communications (SSCC’18), Communications in Computer and Information Science Series 969, p. 640-655, 2019
First place for two out of the four datasets in the Detecting Malicious Domain Names competition DMD 2018
Weakly Supervised Deep Learning for the Detection of Domain Generation Algorithms PDF
B. Yu, J. Pan, D. Gray, J. Hu, C. Choudhary, A. Nascimento, M. De Cock in: IEEE Access 7, p. 51542-51556, 2019
Projects
Age, Gender and Personality Trait Prediction of Social Media Users
- Designed and developed Machine Learning and Deep Learning models to predict age, gender and personality traits of social media users using their status updates, profile pictures and page likes with 87% accuracy.
- Technical Proficiency Demonstrated: Python, Convolutional Neural Networks, Keras, TensorFlow, scikit-learn, Transfer Learning, NumPy, pandas, Jupyter, Matplotlib
Machine Learning and Deep Learning Notebooks
These are my Jupyter notebooks, where I experiment with various machine learning and deep learning problems and frameworks.
- Iris Flower Classification using Keras, TensorFlow and scikit-learn (IPython Notebook)
- Recognizing CIFAR-10 images using Convolutional Neural Networks
  - Part I - Simple model (IPython Notebook)
  - Part II - Improved model (IPython Notebook)
  - Part III - Data Augmentation (IPython Notebook)
- Traffic Sign Recognition using Convolutional Neural Networks (IPython Notebook)
- Movie Recommendation Engine from Scratch (IPython Notebook)
- Linear Regression (IPython Notebook)
- Multivariate Linear Regression (IPython Notebook)
Empirical Study of Network Flow Algorithms
- Implemented the Ford-Fulkerson, Scaling Ford-Fulkerson, and Preflow-Push algorithms in Java to find the maximum flow in a network, and performed an empirical study on random, bipartite and mesh graphs.
- Technical Proficiency Demonstrated: Java, Git, Algorithms and Data Structures
Simulation and Visualization of 5 Sorting Algorithms
- Developed a simulation experiment to compare and contrast various sorting algorithms by drawing real-time graphs and interpreting their behavior on inputs with varying degrees of sortedness.
- Technical Proficiency Demonstrated: Python, Git, Matplotlib
Serverless Computing Microservices Composition
- Analyzing and evaluating performance and cost metrics for different compositions of microservices on AWS Lambda (work in progress).
Data Structure and Algorithm Problems
This is my repository where I solve interesting algorithm and data structure problems.