Scraping Reddit, part 2

8 minute read

Published: April 09, 2021

The last post dealt with using pushshift and handling requests to access posts and comments from Reddit. This post deals with using the Python Reddit API Wrapper (PRAW) to access posts and comments from Reddit and then using some NLP tools for basic sentiment analysis.

Scraping Reddit, part 1

10 minute read

Published: February 01, 2021

In light of recent internet trends about retail investors, I’m sure many of us have questions about the kinds of content that get posted on Reddit, and whether there are home-grown, analytical ways of addressing these questions. I’ll be showing two ways of parsing Reddit submissions and comments; this one focuses on pushshift API endpoints queried with the requests library, some custom classes for processing the responses, and asyncio to handle multiple requests to pushshift asynchronously.

Accessing FoldingAtHome data on AWS

4 minute read

Published: December 29, 2020

Some F@H data is freely accessible on AWS. This will be a relatively short post on accessing and navigating the data on AWS.

Poetry and Docker

6 minute read

Published: December 23, 2020

What is Poetry and where does it fit in the Python software/DS ecosystem? And some beginner forays into Docker.

Exploring PyTorch + ANI + MD

9 minute read

Published: August 15, 2020

PyTorch + ANI + MD

Downloading and studying my message behavior

5 minute read

Published: August 07, 2020

Digital privacy is everywhere, and recent laws are pushing companies to disclose whatever personal information they may have on you. In the spirit of science, I’m going to make myself my own study subject and observe what Facebook has stored from my messenger history. Along the way, I’ll do some recursion, a little parallelization, some generators for data processing, and basic visualization to observe my messenger behavior. Notebooks can be found here, but this one you can’t reproduce because I won’t be providing my messenger data (try this notebook on your own messenger data if you’re curious).

Lessons learned from accelerating foyer with dask

32 minute read

Published: June 20, 2020

Combining Foyer + Dask

Big data tools for MD simulation analysis

18 minute read

Published: May 13, 2020

Big data tools for MD simulation analysis

Digging through some Folding@Home data

32 minute read

Published: May 06, 2020

Learning cheminformatics from some Folding@Home data

Is being “clutch” a myth?

7 minute read

Published: April 05, 2020

Are some players more “clutch” than others?

NBA defensive schemes

7 minute read

Published: March 31, 2020

Team defensive schemes

Does a team’s defensive scheme influence opponents’ shot portfolios? I’m going to be using a different NBA API to query all the games for the 2018-2019 regular season. In each game, I’m going to log each “make” against a team’s defense. For example, if Houston makes a 22-ft 3-point shot against Boston, I will log the distance at which Houston scored against Boston.

Molecular Modeling Software: OpenMM (part 2)

12 minute read

Published: October 30, 2019

Less-standard molecular modeling methods, combining rules, and OpenMM nonbonded forces

Molecular Modeling Software: OpenMM

11 minute read

Published: October 29, 2019

OpenMM

Molecular Modeling Software: OpenForceField

15 minute read

Published: October 15, 2019

Putting together open-source molecular modeling software

Molecular Modeling Software: Open Babel

6 minute read

Published: October 03, 2019

Open Babel

“Open Babel is a chemical toolbox designed to speak the many languages of chemical data. It’s an open, collaborative project allowing anyone to search, convert, analyze, or store data from molecular modeling, chemistry, solid-state materials, biochemistry, or related areas.”

Molecular Modeling Software: Foyer

33 minute read

Published: September 27, 2019

Foyer

Foyer is an open-source Python package, part of the MoSDeF suite of tools for molecular modeling. In the description: “a package for atom-typing as well as applying and disseminating forcefields.”

Molecular Modeling Software: ParmEd

17 minute read

Published: August 01, 2019

ParmEd

ParmEd is an open-source Python package for molecular modeling applications. In the description: “Cross-program parameter and topology file editor and molecular mechanical simulator engine.”

Technical Debt

6 minute read

Published: July 21, 2019

Technical debt is an ongoing, ever-pressing issue for any large, collaborative code base. If left unaddressed, technical debt can seriously cripple productivity.

Introduction

1 minute read

Published: July 13, 2019

Hello world

This is my first post. I’m Alex. I’m from the Northern Virginia area. I like chemical engineering, chemistry, computer science, and scientific computing/data science.

Accessing FoldingAtHome data on AWS

4 minute read

Published: December 29, 2020

Some F@H data is freely accessible on AWS. This will be a relatively short post on accessing and navigating the data on AWS.

Exploring PyTorch + ANI + MD

9 minute read

Published: August 15, 2020

PyTorch + ANI + MD

Lessons learned from accelerating foyer with dask

32 minute read

Published: June 20, 2020

Combining Foyer + Dask

Big data tools for MD simulation analysis

18 minute read

Published: May 13, 2020

Big data tools for MD simulation analysis

Digging through some Folding@Home data

32 minute read

Published: May 06, 2020

Learning cheminformatics from some Folding@Home data

Molecular Modeling Software: OpenMM (part 2)

12 minute read

Published: October 30, 2019

Less-standard molecular modeling methods, combining rules, and OpenMM nonbonded forces

Molecular Modeling Software: OpenMM

11 minute read

Published: October 29, 2019

OpenMM

Molecular Modeling Software: OpenForceField

15 minute read

Published: October 15, 2019

Putting together open-source molecular modeling software

Molecular Modeling Software: Open Babel

6 minute read

Published: October 03, 2019

Open Babel

“Open Babel is a chemical toolbox designed to speak the many languages of chemical data. It’s an open, collaborative project allowing anyone to search, convert, analyze, or store data from molecular modeling, chemistry, solid-state materials, biochemistry, or related areas.”

Molecular Modeling Software: Foyer

33 minute read

Published: September 27, 2019

Foyer

Foyer is an open-source Python package, part of the MoSDeF suite of tools for molecular modeling. In the description: “a package for atom-typing as well as applying and disseminating forcefields.”

Molecular Modeling Software: ParmEd

17 minute read

Published: August 01, 2019

ParmEd

ParmEd is an open-source Python package for molecular modeling applications. In the description: “Cross-program parameter and topology file editor and molecular mechanical simulator engine.”

Bayesian Methods and Molecular Modeling 1

26 minute read

Published: July 29, 2019

(Last updated: 2019-07-31). This is an ongoing post as I work through a tutorial I found.

Molecular Modeling 2

12 minute read

Published: July 25, 2019

Conducting a simulation

Running a simulation means taking a model and sampling some sort of distribution with it. Example simulation

Molecular Modeling Software: MDTraj

7 minute read

Published: July 17, 2019

Some anecdotes with analyzing simulations

Let’s say you’ve conducted a simulation. Everything up to that point (parametrization, initialization, actually running the simulation) will be assumed and probably discussed another day. What you have from a simulation is a trajectory (timeseries of coordinates), and now we have to derive some meaningful properties from this trajectory.

Molecular Modeling 1

5 minute read

Published: July 14, 2019

How do you model something?

Let’s talk about molecular modeling from both the chemistry and mathematical standpoints. When you want to model something, what do you need?

Introduction

1 minute read

Published: July 13, 2019

Hello world

This is my first post. I’m Alex. I’m from the Northern Virginia area. I like chemical engineering, chemistry, computer science, and scientific computing/data science.

Scraping Reddit, part 2

8 minute read

Published: April 09, 2021

The last post dealt with using pushshift and handling requests to access posts and comments from Reddit. This post deals with using the Python Reddit API Wrapper (PRAW) to access posts and comments from Reddit and then using some NLP tools for basic sentiment analysis.

Scraping Reddit, part 1

10 minute read

Published: February 01, 2021

In light of recent internet trends about retail investors, I’m sure many of us have questions about the kinds of content that get posted on Reddit, and whether there are home-grown, analytical ways of addressing these questions. I’ll be showing two ways of parsing Reddit submissions and comments; this one focuses on pushshift API endpoints queried with the requests library, some custom classes for processing the responses, and asyncio to handle multiple requests to pushshift asynchronously.

Accessing FoldingAtHome data on AWS

4 minute read

Published: December 29, 2020

Some F@H data is freely accessible on AWS. This will be a relatively short post on accessing and navigating the data on AWS.

Poetry and Docker

6 minute read

Published: December 23, 2020

What is Poetry and where does it fit in the Python software/DS ecosystem? And some beginner forays into Docker.

Exploring PyTorch + ANI + MD

9 minute read

Published: August 15, 2020

PyTorch + ANI + MD

Downloading and studying my message behavior

5 minute read

Published: August 07, 2020

Digital privacy is everywhere, and recent laws are pushing companies to disclose whatever personal information they may have on you. In the spirit of science, I’m going to make myself my own study subject and observe what Facebook has stored from my messenger history. Along the way, I’ll do some recursion, a little parallelization, some generators for data processing, and basic visualization to observe my messenger behavior. Notebooks can be found here, but this one you can’t reproduce because I won’t be providing my messenger data (try this notebook on your own messenger data if you’re curious).

Lessons learned from accelerating foyer with dask

32 minute read

Published: June 20, 2020

Combining Foyer + Dask

Big data tools for MD simulation analysis

18 minute read

Published: May 13, 2020

Big data tools for MD simulation analysis

Digging through some Folding@Home data

32 minute read

Published: May 06, 2020

Learning cheminformatics from some Folding@Home data

Is being “clutch” a myth?

7 minute read

Published: April 05, 2020

Are some players more “clutch” than others?

NBA defensive schemes

7 minute read

Published: March 31, 2020

Team defensive schemes

Does a team’s defensive scheme influence opponents’ shot portfolios? I’m going to be using a different NBA API to query all the games for the 2018-2019 regular season. In each game, I’m going to log each “make” against a team’s defense. For example, if Houston makes a 22-ft 3-point shot against Boston, I will log the distance at which Houston scored against Boston.

Fantasy NBA 2

14 minute read

Published: August 15, 2019

Part 2 of evaluating fantasy NBA draft picks - modeling and sampling for expected fantasy output.

Fantasy NBA 1

1 minute read

Published: August 14, 2019

Part 1 of evaluating fantasy NBA draft picks - first gathering the relevant data.

Introduction

1 minute read

Published: July 13, 2019

Hello world

This is my first post. I’m Alex. I’m from the Northern Virginia area. I like chemical engineering, chemistry, computer science, and scientific computing/data science.

Molecular Modeling Software: OpenMM (part 2)

12 minute read

Published: October 30, 2019

Less-standard molecular modeling methods, combining rules, and OpenMM nonbonded forces

Molecular Modeling Software: OpenMM

11 minute read

Published: October 29, 2019

OpenMM

Molecular Modeling Software: OpenForceField

15 minute read

Published: October 15, 2019

Putting together open-source molecular modeling software

Molecular Modeling Software: Open Babel

6 minute read

Published: October 03, 2019

Open Babel

“Open Babel is a chemical toolbox designed to speak the many languages of chemical data. It’s an open, collaborative project allowing anyone to search, convert, analyze, or store data from molecular modeling, chemistry, solid-state materials, biochemistry, or related areas.”

Molecular Modeling Software: Foyer

33 minute read

Published: September 27, 2019

Foyer

Foyer is an open-source Python package, part of the MoSDeF suite of tools for molecular modeling. In the description: “a package for atom-typing as well as applying and disseminating forcefields.”

Bayesian Methods 1

11 minute read

Published: August 08, 2019

First-attempt at using PyMC3 for Bayesian parameter estimation

Applying some principles from earlier MCMC posts/notebooks to estimate the parameters of a linear model

Molecular Modeling Software: ParmEd

17 minute read

Published: August 01, 2019

ParmEd

ParmEd is an open-source Python package for molecular modeling applications. In the description: “Cross-program parameter and topology file editor and molecular mechanical simulator engine.”

Bayesian Methods and Molecular Modeling 1

26 minute read

Published: July 29, 2019

(Last updated: 2019-07-31). This is an ongoing post as I work through a tutorial I found.

Molecular Modeling 2

12 minute read

Published: July 25, 2019

Conducting a simulation

Running a simulation means taking a model and sampling some sort of distribution with it. Example simulation

Technical Debt

6 minute read

Published: July 21, 2019

Technical debt is an ongoing, ever-pressing issue for any large, collaborative code base. If left unaddressed, technical debt can seriously cripple productivity.

Molecular Modeling Software: MDTraj

7 minute read

Published: July 17, 2019

Some anecdotes with analyzing simulations

Let’s say you’ve conducted a simulation. Everything up to that point (parametrization, initialization, actually running the simulation) will be assumed and probably discussed another day. What you have from a simulation is a trajectory (timeseries of coordinates), and now we have to derive some meaningful properties from this trajectory.

Molecular Modeling 1

5 minute read

Published: July 14, 2019

How do you model something?

Let’s talk about molecular modeling from both the chemistry and mathematical standpoints. When you want to model something, what do you need?

Introduction

1 minute read

Published: July 13, 2019

Hello world

This is my first post. I’m Alex. I’m from the Northern Virginia area. I like chemical engineering, chemistry, computer science, and scientific computing/data science.

Alex H. Yang

Posts by Tags

datascience

PyTorch + ANI + MD

Combining Foyer + Dask

Big data tools for MD simulation analysis

Learning cheminformatics from some Folding@Home data

Are some players more “clutch” than others?

Team defensive schemes

gradSchool

Less-standard molecular modeling methods, combining rules, and OpenMM nonbonded forces

OpenMM

Putting together open-source molecular modeling software

Open Babel

Foyer

ParmEd

Hello world

molecularmodeling

PyTorch + ANI + MD

Combining Foyer + Dask

Big data tools for MD simulation analysis

Learning cheminformatics from some Folding@Home data

Less-standard molecular modeling methods, combining rules, and OpenMM nonbonded forces

OpenMM

Putting together open-source molecular modeling software

Open Babel

Foyer

ParmEd

Conducting a simulation

Some anecdotes with analyzing simulations

How do you model something?

Hello world

personal

PyTorch + ANI + MD

Combining Foyer + Dask

Big data tools for MD simulation analysis

Learning cheminformatics from some Folding@Home data

Are some players more “clutch” than others?

Team defensive schemes

Hello world

scientificComputing

Less-standard molecular modeling methods, combining rules, and OpenMM nonbonded forces

OpenMM

Putting together open-source molecular modeling software

Open Babel

Foyer

First-attempt at using PyMC3 for Bayesian parameter estimation

ParmEd

Conducting a simulation

Some anecdotes with analyzing simulations

How do you model something?

Hello world