Daniel George


Last updated Apr 22, 2024

# Past Projects

# PreScribe

Demo

An AI medical scribe that proactively checks off requirements for medicine in real time as the conversation goes on.

Links:

The embedded tweets take a bit to load! Sorry about that


# Whisper Notes

Demo

A simple voice-to-text clipboard app for quickly recording your thoughts. Use global commands to record and fill your copy register with the transcript. View and delete previous transcripts.

Links:


# BioConceptVecXplorer

Demo

Knowledge is continuous, but its representation in papers is discrete. We can instead represent concepts as vectors and explore a more fluid latent space. Using vector embeddings trained on 30 million PubMed abstracts, we built a tool that lets researchers construct biological analogies to uncover relationships not explicitly stated in the literature. This approach was previously used in materials science to discover an anti-ferromagnetic material that was not explicitly in the literature. We extended the idea into a tool that enables bioengineers to make similar discoveries.

We were interviewed and written about for this project in the following biotech newsletter.
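The analogy idea described above can be illustrated with plain vector arithmetic. This is only a minimal sketch: the three-dimensional vectors and the concept names (`gene_a`, `protein_a`, and so on) are made up for illustration, and the real tool's embeddings and ranking method may differ.

```python
import numpy as np

# Toy embeddings: names and values are hypothetical, not real PubMed vectors.
embeddings = {
    "gene_a": np.array([1.0, 0.0, 0.0]),
    "protein_a": np.array([1.0, 1.0, 0.0]),
    "gene_b": np.array([0.0, 0.0, 1.0]),
    "protein_b": np.array([0.0, 1.0, 1.0]),
    "pathway_x": np.array([1.0, 0.0, 1.0]),
}


def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


def analogy(a, b, c, vectors, top_n=1):
    """Rank concepts d for the analogy 'a is to b as c is to d'."""
    target = vectors[b] - vectors[a] + vectors[c]
    scores = {
        name: cosine(target, vec)
        for name, vec in vectors.items()
        if name not in (a, b, c)  # skip the query concepts themselves
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


print(analogy("gene_a", "protein_a", "gene_b", embeddings))
```

With these toy vectors the query "gene_a is to protein_a as gene_b is to ?" ranks `protein_b` first, because the offset between the first pair matches the offset between the second.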

Links:


The following are mostly from the genetics lab I work at.

# Spheroid Analysis

Example plot from one of my EDAs

Spheroids are 3D cultures that mimic tissues and micro-tumors better than the 2D cultures we see in Petri dishes for a variety of reasons. When extending biological circuits from 2D to the more realistic 3D setting, the morphology of the cells could affect things like protein production. Protein synthesis is an inherently stochastic process, so when building biological circuits it is quite useful to quantify the noise in the system. This project extends the 2D noise analysis to spheroids. You can find out more in my slides.
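As a rough illustration of what "a measure of noise" can look like, the sketch below computes two common summary statistics for expression noise, the coefficient of variation and the Fano factor. Which measure the actual analysis uses is not stated here, so treat the choice as an assumption; the per-cell values are invented for the example and are not data from the lab.

```python
import numpy as np


def noise_metrics(expression):
    """Summarise expression noise for one population of cells.

    Returns the mean, the coefficient of variation (std / mean),
    and the Fano factor (variance / mean), using the sample variance.
    """
    x = np.asarray(expression, dtype=float)
    mean = x.mean()
    var = x.var(ddof=1)
    return {"mean": mean, "cv": np.sqrt(var) / mean, "fano": var / mean}


# Hypothetical per-cell reporter intensities (arbitrary units).
culture_2d = [100, 110, 90, 105, 95]
spheroid = [100, 150, 50, 130, 70]

for label, cells in [("2D", culture_2d), ("spheroid", spheroid)]:
    m = noise_metrics(cells)
    print(f"{label}: mean={m['mean']:.1f} cv={m['cv']:.3f} fano={m['fano']:.2f}")
```

Both made-up populations have the same mean, so any difference in the two metrics reflects spread alone, which is the kind of comparison a 2D versus 3D noise analysis would care about.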

Links:


# Physically Unclonable Function (PUF) Pipeline and Analysis

PUFs are a type of fingerprint that is introduced in manufacturing to show the lineage of a device. They are mainly used in integrated circuits, but they can also be used in other domains like biology. Using CRISPR, one can make a PUF that is impossible to reproduce and can be used to determine the lineage of a cell line (CRISPR-PUF). I wrote a pipeline to process the millions of DNA sequencing reads that are used to find a ‘distance’ between cell lines. If two samples come from the same cell line the distance is small, and if they come from different cell lines it is large.
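To make the idea of a 'distance' between cell lines concrete, here is a minimal sketch that compares how often each sequence variant appears in two samples. The use of total variation distance, the function names, and the tiny read lists are all assumptions for illustration; the actual pipeline and its distance metric may be different.

```python
from collections import Counter


def allele_frequencies(reads):
    """Turn a list of read sequences into a frequency table."""
    counts = Counter(reads)
    total = sum(counts.values())
    return {seq: n / total for seq, n in counts.items()}


def total_variation_distance(p, q):
    """Half the summed absolute difference between two frequency tables (0 to 1)."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


# Hypothetical reads at the edited locus ("-" marks a deleted base).
line_a_rep1 = ["ACGT"] * 6 + ["ACGA"] * 3 + ["AC-T"] * 1
line_a_rep2 = ["ACGT"] * 5 + ["ACGA"] * 4 + ["AC-T"] * 1
line_b = ["ACGT"] * 1 + ["TTGT"] * 7 + ["AC-T"] * 2

same = total_variation_distance(
    allele_frequencies(line_a_rep1), allele_frequencies(line_a_rep2)
)
different = total_variation_distance(
    allele_frequencies(line_a_rep1), allele_frequencies(line_b)
)
print(f"same line: {same:.2f}, different lines: {different:.2f}")
```

In this toy example the two replicates of line A share nearly the same variant frequencies and score close to 0, while line A versus line B scores close to 1.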

Links:


# Finding Genomic Safe Harbors for CHO

Logic behind pipeline from slides

When you edit a cell line with techniques like CRISPR, if you add code without taking into account the function of the regions around it, there might be unintended consequences. Certain regions are considered safe harbors, where your changes won’t interfere with existing functionality. There was code from a paper that defined requirements for safe harbors in the human genome, but we needed it to work for the Chinese Hamster Ovary (CHO) genome, so I wrote a Python port that can be extended to any genome. See slides for more info.
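One way to picture a safe-harbor filter is as a distance check between candidate insertion sites and annotated features. The sketch below is an assumption-laden simplification: the single `min_distance` rule, the `Interval` type, and the coordinates are hypothetical and do not reproduce the requirements from the paper or the actual port.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    chrom: str
    start: int  # inclusive
    end: int    # exclusive


def gap(a, b):
    """Distance in bp between two intervals on the same chromosome (0 if they overlap)."""
    return max(0, max(a.start, b.start) - min(a.end, b.end))


def is_safe(site, features, min_distance):
    """A site passes if every feature on its chromosome is at least min_distance away."""
    return all(
        f.chrom != site.chrom or gap(site, f) >= min_distance for f in features
    )


# Hypothetical gene annotations and candidate sites.
genes = [
    Interval("chr1", 1_000, 2_000),
    Interval("chr1", 10_000, 12_000),
    Interval("chr2", 500, 900),
]
candidates = [
    Interval("chr1", 2_100, 2_200),  # only 100 bp from the first gene
    Interval("chr1", 5_000, 5_100),
    Interval("chr2", 5_000, 5_100),
]

safe_sites = [c for c in candidates if is_safe(c, genes, min_distance=2_000)]
print(safe_sites)
```

Because the check only needs intervals and a threshold, the same logic applies to any genome with an annotation file, which is the property that makes a port like this extensible.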

Links:

# Projects


# Small Molecule Autocomplete RNN

I trained an LSTM-based RNN to efficiently enumerate chemical space given a SMILES input. It uses BFS to take the top few possibilities in the probability distribution at each step, and I made the model more creative and incentivized it to give shorter outputs.
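The BFS expansion described above can be sketched without the neural network by swapping in a toy next-token function. Everything here is illustrative: `toy_next_token_probs` is a hypothetical stand-in for the trained LSTM, the `$` end token and the `top_k` and `max_len` parameters are assumptions, and the creativity and length incentives are not modeled.

```python
import math
from collections import deque

END = "$"  # assumed end-of-sequence token


def toy_next_token_probs(prefix):
    """Hypothetical stand-in for the LSTM: next-token probabilities for a prefix."""
    if len(prefix) >= 4:
        return {END: 1.0}
    return {"C": 0.5, "O": 0.3, END: 0.2}


def enumerate_completions(seed, next_probs, top_k=2, max_len=6):
    """Breadth-first expansion that keeps only the top_k tokens at every step."""
    queue = deque([(seed, 0.0)])  # (prefix, log-probability so far)
    finished = []
    while queue:
        prefix, logp = queue.popleft()
        probs = next_probs(prefix)
        best = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
        for token, p in best:
            if token == END:
                finished.append((prefix, logp + math.log(p)))
            elif len(prefix) + 1 <= max_len:
                queue.append((prefix + token, logp + math.log(p)))
    # Most probable completions first.
    return sorted(finished, key=lambda item: item[1], reverse=True)


for smiles, logp in enumerate_completions("CC", toy_next_token_probs):
    print(smiles, round(math.exp(logp), 3))
```

Raising `top_k` widens the search at every step, so the number of enumerated strings grows quickly; a real version would also want to check that each output is a valid SMILES string.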

Links: