
Combining dichotomous categorical variables into one variable: An example for racial/ethnic diversity

Sometimes it is useful to group similar dichotomous variables together into a single variable. One example is when asking survey respondents to identify race/ethnicity. Given a large, heterogeneous sample, there will undoubtedly be respondents who select two, three, or more racial/ethnic categories. When using checkbox input fields, most survey data collection platforms will store each checkbox value as a distinct variable (False=0, True=1) and provide a data column for each. This becomes a problem when you want to report the summary output as mutually exclusive data (each respondent counted only once).

For the race/ethnicity example, a respondent might click checkboxes for both ‘Black’ and ‘White’ if they identify as biracial. If you tabulate these two variables separately, this person will be counted twice. To get mutually exclusive respondent counts, you can combine all of the dichotomous variables into a single composite variable and summarize that instead. What follows is a fairly easy way to do that without losing your mind or the nuance of the original variables.
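As a rough illustration of the idea, here is a minimal Python sketch that collapses 0/1 checkbox columns into one composite label per respondent. It is not the method from the full post. The function name `combine_race_ethnicity`, the category names, the label wording, and the sample rows are all assumptions made up for this example.

```python
from collections import Counter

# Hypothetical checkbox columns; a real survey export will have its own names.
CATEGORIES = ["Asian", "Black", "White"]


def combine_race_ethnicity(row, categories):
    """Collapse 0/1 checkbox values into one mutually exclusive label."""
    selected = [name for name in categories if row.get(name, 0) == 1]
    if not selected:
        return "None selected"
    if len(selected) == 1:
        return selected[0]
    # Keep the nuance: record which categories were checked together.
    return "Multiracial: " + " + ".join(selected)


# Made-up respondents, one dict per row of the export (False=0, True=1).
respondents = [
    {"Asian": 0, "Black": 1, "White": 1},
    {"Asian": 0, "Black": 0, "White": 1},
    {"Asian": 1, "Black": 0, "White": 0},
    {"Asian": 0, "Black": 0, "White": 0},
]

composite = [combine_race_ethnicity(r, CATEGORIES) for r in respondents]

# Each respondent contributes exactly one label, so counts sum to the sample size.
counts = Counter(composite)
```

Because every respondent receives exactly one label, the tabulated counts add up to the number of respondents, while the composite label still shows which boxes were checked together.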


Recently stopped drinking? Check in to Reddit.


Figure 1: Network graph

This figure represents the 100 most recent threads on an alcohol support subreddit, captured on the morning of Saturday 1/16/2016. The data account for 1,021 posts or comments made by 337 valid users (55 were excluded because they had no stop-drinking date). Each node in the graph represents a post or comment. The red nodes correspond to Saturday check-in responses, where individuals pledge to not drink today. Node size corresponds to number of days since last drink (μ=382, SD=1340, Med.=30, IQR: 12-214).
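For readers who want to see how summary figures like these can be produced, here is a minimal sketch using only the Python standard library. The stop-drinking dates are made up for illustration and are not the subreddit data. The sample standard deviation and the default quartile method of `statistics.quantiles` are assumptions, since the post does not say which were used.

```python
from datetime import date
import statistics

# Capture date taken from the post; everything else below is hypothetical.
CAPTURE_DATE = date(2016, 1, 16)

# Made-up stop-drinking dates, one per user.
stop_dates = [
    date(2016, 1, 10),
    date(2016, 1, 2),
    date(2015, 12, 17),
    date(2015, 6, 1),
    date(2012, 1, 16),
]

# Days since last drink, the quantity used for node size.
days_sober = [(CAPTURE_DATE - d).days for d in stop_dates]

mean_days = statistics.mean(days_sober)
sd_days = statistics.stdev(days_sober)      # sample SD (an assumption)
median_days = statistics.median(days_sober)
q1, _, q3 = statistics.quantiles(days_sober, n=4)  # bounds of the IQR
```

When a few long-term members pull the mean far above the median, as in the figures reported above, the median and IQR give a better sense of the typical user than the mean and SD.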

Social Network Analysis Graphs

Over winter break, I’ve been learning to mine Reddit for data that might be used to conduct social science research.  Specifically, I’m interested in the content and structure of “subreddit” communities that focus on mental and behavioral health conditions. For example, what might we be able to learn from examining the social structure and level of engagement in different types of emotional support threads?

So far, I’ve gotten the hang of retrieving data with the Python Reddit API Wrapper (PRAW). I’m now testing a few different visualization tools for social network analysis. Below are quick overviews of the three options that I’ve tested so far (NodeXL, NetworkX, and Gephi). These data come from a subreddit that focuses on a particular mental health condition. The central nodes (hubs) are the subreddit moderators, and the peripheral nodes (spokes) are the last ~50 people they responded to. What we can see from these visualizations is that the moderators sometimes communicate with each other publicly, but their communications with non-moderators don’t tend to overlap.
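The hub-and-spoke structure described above can be sketched in plain Python before handing the data to any of these tools. This is only an illustration under stated assumptions: the usernames and reply pairs are made up, the helper `shared_non_moderators` is hypothetical, and no PRAW calls are shown.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical (moderator, recipient) reply pairs, not real subreddit data.
replies = [
    ("mod_a", "user_1"), ("mod_a", "user_2"), ("mod_a", "mod_b"),
    ("mod_b", "user_3"), ("mod_b", "user_4"), ("mod_b", "mod_a"),
]
moderators = {"mod_a", "mod_b"}

# Adjacency sets: each hub (moderator) maps to the spokes it replied to.
spokes = defaultdict(set)
for mod, recipient in replies:
    spokes[mod].add(recipient)


def shared_non_moderators(mod_x, mod_y):
    """Return non-moderators that both moderators replied to."""
    return (spokes[mod_x] & spokes[mod_y]) - moderators


# Audience overlap for every pair of moderators.
overlap = {
    (x, y): shared_non_moderators(x, y)
    for x, y in combinations(sorted(moderators), 2)
}

# Public moderator-to-moderator replies.
mod_to_mod = {(m, r) for m, r in replies if r in moderators}
```

An empty set for a moderator pair in `overlap` corresponds to the pattern in the visualizations: the hubs are linked to each other, but their spokes do not overlap.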

Playback speed for instructional videos: Considerations for Vimeo users and developers

Instructional videos are a staple for people learning new skills, whether that means learning new languages (math and programming languages included), preparing new recipes for dinner (thanks America’s Test Kitchen!), figuring out home maintenance tasks, or watching educational lectures. These videos can be notoriously complex and too fast, or long-winded and too slow. As a viewer, I find it incredibly helpful to be able to change the playback speed to suit my learning needs. This is a core feature that the Vimeo player is currently lacking…

Early peer-review experience: Benefits and barriers

A few years back, I wrote a brief blog post to help guide new peer reviewers. It seemed to me that I couldn’t be the only person spending hours searching for some kind of rubric, or a quick-start guide, to help navigate the review process that I was invited and excited to participate in. Since then, I’ve seen reports of several cases where peer-review processes proved demonstrably fallible through lack of rigor, or were purposefully scammed by authors in order to secure publication. Meanwhile, there is a burgeoning group of academics, like myself, who would eagerly and meticulously serve as reviewers, if only they were given more opportunities to do so…