I’m a full-stack data scientist with experience in media and entertainment, consumer electronics, and health and life sciences. I have built data pipelines (PySpark, BigQuery, Snowflake, dbt, and Python), analytical tools (Databricks, Tableau), and machine learning models (Databricks, Python, AWS SageMaker) to support teams across the company. I have partnered with marketing, strategy, and product teams to deliver consumer insights and drive data-driven decisions.
Before becoming a data scientist, I was a postdoctoral fellow at NYU, where I built computational tools and published numerous papers in bioimage informatics and neuroscience.
April 2012 - January 2015
Neuroscience
Completed a Ph.D. in Neuroscience at NYU.
August 2007 - April 2012
Neuroscience
Completed a master's in Neuroscience at NYU.
August 2003 - June 2007
Biology
Graduated with honors from the University of Puerto Rico, Rio Piedras, PR, with a major in Biology.
October 2022 - June 2023
January 2021 - October 2022
April 2020 - November 2020
October 2019 - February 2020
November 2017 - October 2019
May 2017 - August 2017
May 2015 - November 2017
April 2010 - April 2015
August 2008 - April 2010
September 2005 - May 2007
June 2006 - August 2006
June 2005 - August 2005
My interest in programming started when I took an introductory class in C and C++ in 2013. Since then, I have worked mostly in Python and PySpark to develop data pipelines, machine learning models, and analytical tools.
I am proficient in numerous software applications. Here are some examples.
Through a collaboration between Dr. David Fenyo's laboratory and members of Dr. Eli Rothenberg's laboratory, we developed an algorithm in Python and ImageJ called the "Interaction Factor" to quantify protein-protein interactions by stochastic modeling of super-resolution fluorescence microscopy images. The Interaction Factor indicates whether two proteins are found in close proximity more often than would be expected by random chance. It can be used to determine whether two proteins interact and whether experimental manipulations affect the degree of interaction between them. The manuscript was accepted by the journal Scientific Reports on October 19, 2017.
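The idea of comparing observed proximity against random chance can be illustrated with a small simulation. The sketch below is not the published Interaction Factor algorithm: it works on hypothetical point coordinates rather than segmented clusters in an image, and the function names, the `radius` parameter, and the uniform random placement are all assumptions made for illustration.

```python
import random


def overlap_fraction(points_a, points_b, radius):
    """Fraction of A points that have at least one B point within `radius`."""
    if not points_a:
        raise ValueError("points_a must not be empty")
    r2 = radius * radius
    hits = 0
    for ax, ay in points_a:
        if any((ax - bx) ** 2 + (ay - by) ** 2 <= r2 for bx, by in points_b):
            hits += 1
    return hits / len(points_a)


def proximity_vs_chance(points_a, points_b, width, height, radius,
                        n_sim=500, seed=0):
    """Compare observed A-B proximity with randomly repositioned B points.

    Returns (observed, score), where `score` is the fraction of simulations
    whose overlap fraction is lower than the observed one. A score near 1
    suggests the proximity is unlikely to arise by chance.
    """
    rng = random.Random(seed)  # fixed seed keeps the example reproducible
    observed = overlap_fraction(points_a, points_b, radius)
    below = 0
    for _ in range(n_sim):
        # Hypothetical null model: B points placed uniformly in the field.
        random_b = [(rng.uniform(0, width), rng.uniform(0, height))
                    for _ in points_b]
        if overlap_fraction(points_a, random_b, radius) < observed:
            below += 1
    return observed, below / n_sim


# Usage: 20 A points on a grid, each with a B point one unit away.
points_a = [(10 + 20 * i, 10 + 25 * j) for i in range(5) for j in range(4)]
points_b = [(x + 1.0, y) for x, y in points_a]
observed, score = proximity_vs_chance(points_a, points_b, 100, 100, radius=2.0)
```

A real analysis would also need to account for cluster size and shape and for the region of interest, which this sketch ignores.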
These are screenshots of a user-friendly ImageJ plugin package called Interaction Factor. The package contains two plugins that may be used to analyze an image. The first, simply called “Interaction Factor”, calculates the Interaction Factor of the image as well as other measures such as overlap area. The second, called “Interaction Factor Simulations”, lets you run any number of simulations against the image.
As the result of a collaboration with members of Dr. Michele Pagano's laboratory, I developed a custom Python module to track cell movement and measure mean intensity levels over time.
As the result of a collaboration with members of Dr. Michele Pagano's laboratory, I developed a custom script to quantify γH2AX foci.
The manuscript is the following: Dankert, J.F., Rona, G., Geter, P., Clijsters, L., Skaar, J.S., Bermudez-Hernandez, K., Fenyo, D., Ueberheide, B., Schneider, R., Pagano, M. (2016) Cyclin F-mediated degradation of SLBP limits H2A.X accumulation and apoptosis upon genotoxic stress in G2. Mol. Cell.
As the result of a collaboration with Dr. Eric Schafler and Dr. Susan Logan, I developed a custom algorithm in Python to classify germ cells (shown in green) that express Ki67 (shown in red). The cells marked in red are those that were classified as expressing Ki67.
The manuscript is the following: Schafler, E., Thomas, P., Bermudez-Hernandez, K., Tang, Z., Fenyo, D., Vigodner, M., Logan, S. (In review) ART-27 regulates mammalian spermatogonial stem cell survival and differentiation. Dev. Biol.
One of the challenges I encountered during my thesis work was quantifying immunoreactive cells in a defined region of the brain across hundreds of images. I also needed a way to calculate cell density. Counting the cells manually would have taken me months, so I automated the process with macros and scripts, which made the counting faster and unbiased. I wrote scripts in JavaScript to automate image processing in Adobe Photoshop CC, macros for ImageJ to count cells and measure density, and Python scripts to organize all the data. You can learn more about the scripts by visiting my project website.
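As a rough illustration of the final data-organizing step, the sketch below pools per-image cell counts and measured areas into a density per region. It is not the original thesis script: the record layout and the field names `region`, `cell_count`, and `area_mm2` are hypothetical, standing in for whatever the ImageJ macros exported.

```python
from collections import defaultdict


def density_per_region(records):
    """Return cells per mm^2 for each region, pooling all images.

    `records` is an iterable of dicts with the hypothetical keys
    'region', 'cell_count', and 'area_mm2' (one dict per image).
    """
    totals = defaultdict(lambda: [0, 0.0])  # region -> [cells, area]
    for rec in records:
        totals[rec["region"]][0] += rec["cell_count"]
        totals[rec["region"]][1] += rec["area_mm2"]
    # Skip regions with no measured area to avoid dividing by zero.
    return {region: cells / area
            for region, (cells, area) in totals.items() if area > 0}


# Usage with made-up numbers for two regions.
records = [
    {"region": "CA1", "cell_count": 10, "area_mm2": 0.5},
    {"region": "CA1", "cell_count": 30, "area_mm2": 1.5},
    {"region": "DG", "cell_count": 12, "area_mm2": 0.25},
]
densities = density_per_region(records)
```

Pooling counts and areas before dividing weights each image by its area; averaging per-image densities instead would be a different, equally defensible choice.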
As part of this project, I was in charge of planning and executing all the experiments, analyzing the data, and writing the paper. I had to troubleshoot and develop new technical skills, including a new way of sectioning the mouse hippocampus, a technique that had not been done before. In addition to developing technical skills, I built a strong partnership with the animal caretakers and the administrative staff, ensuring that we effectively handled the importation of mice, the breeding scheme, and the chemical purchases.
I contributed to this project by doing immunocytochemistry and histochemistry. The histological stains I performed included myelin, cresyl, and thioflavin-S staining.
In this project, I was in charge of purchasing, breeding, and genotyping a portion of the mice used in the study, as well as immunocytochemistry for different markers.
Contact me about collaborations or job opportunities by clicking on the LinkedIn and email links below.